The Development of an Intelligent Virtual Reality Intervention Application

Friday, May 16, 2014
Meeting Room A601 & A602 (Marriott Marquis Atlanta)
E. Bekele1, J. W. Wade2, D. Bian2, L. Zhang2, A. Swanson3, M. S. Sarkar4, Z. Warren1 and N. Sarkar5, (1)Vanderbilt University, Nashville, TN, (2)Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, (3)Vanderbilt Kennedy Center, Department of Pediatrics, Department of Psychiatry, Vanderbilt University, Nashville, TN, (4)Computer Science, Middle Tennessee State University, Murfreesboro, TN, (5)Mechanical Engineering, Vanderbilt University, Nashville, TN
Background: Recent advances in human-machine interaction have enabled the use of computer technology (Goodwin, 2008), robot-mediated systems (Bekele et al., 2012; Warren et al., 2013), and virtual reality (VR) based systems (Bekele et al., 2013; Kandalaft et al., 2013; Lahiri et al., 2013) for potential use in social interaction and intervention paradigms for individuals with ASD. VR platforms have demonstrated preliminary capacity to improve social skills in individuals with ASD (Bekele et al., 2013; Kandalaft et al., 2013; Lahiri et al., 2013); however, currently utilized paradigms have often relied on confederates or modalities (e.g., menu response paradigms) that potentially limit the ability of systems to mimic, and potentially impact, naturalistic interactions beyond the paradigms themselves.

Objectives: The current work describes the development and preliminary validation of a multimodal VR interface and a dynamic fusion strategy for incorporating these modalities into within-system decision making. The system was constructed to measure and respond in real time to user speech, gaze patterns, and physiological responses.

Methods: Three interfaces were integrated into a VR interaction platform: 1) speech-based turn-taking dialog management, 2) multi-channel peripheral physiological signal detection (Liu et al., 2008), and 3) an eye gaze sensitive module (see Bekele et al., 2013). While the physiological signal detection and eye gaze modules were developed in previous work, this research focused on the development of speech-based recognition and the integration of all three systems into one interactive environment. For the speech interface, we developed domain-dependent conversation threads to support more reliable speech recognition. We also developed a dialog management engine that parses these threads and performs a lexical comparison between each response option and the user utterances captured by the speech interface module within a specified time interval; a sketch of this matching step appears below. Initial validity of these interfaces was tested individually across user studies. Further, the outputs of the physiological detection algorithm and the gaze-sensitive module were assessed during user performance via multimodal input fusion (Dumas et al., 2009; Jaimes et al., 2007) and tested against clinician ratings as ground truth for decision making.
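As an illustration of the matching step described above, the sketch below pairs a recognized utterance with the closest option in a conversation thread using a generic string-similarity measure (Python's difflib SequenceMatcher). The function names, the similarity metric, and the 0.6 threshold are illustrative assumptions; the abstract does not specify the lexical comparison actually used.

```python
# Hypothetical sketch of the lexical-matching step in the dialog manager.
# A conversation "thread" is assumed to be a list of expected response
# options for one turn; the recognizer's hypothesis is compared against
# each option and the best match above a threshold is selected.

from difflib import SequenceMatcher


def lexical_similarity(utterance: str, option: str) -> float:
    """Return a simple 0..1 string similarity between utterance and option."""
    return SequenceMatcher(None, utterance.lower(), option.lower()).ratio()


def select_option(utterance, options, threshold=0.6):
    """Pick the thread option most similar to the recognized utterance.

    Returns (index, score), or (None, best_score) if nothing clears the
    threshold -- e.g. when the user said something off-script.
    """
    scores = [lexical_similarity(utterance, opt) for opt in options]
    best = max(range(len(options)), key=scores.__getitem__)
    return (best, scores[best]) if scores[best] >= threshold else (None, scores[best])


if __name__ == "__main__":
    thread_options = [            # domain-dependent options for one turn
        "I would like to order a pizza",
        "Can you tell me about the menu",
        "No thank you, I'm just looking",
    ]
    recognized = "uh can you tell me about the menu please"
    idx, score = select_option(recognized, thread_options)
    print(idx, round(score, 2))   # expected: option 1 with a high score
```

In a sketch like this, an utterance that clears the threshold would advance the dialog along the matched option's branch, while an off-script utterance returns no selection and could trigger a clarification prompt.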

Results: Validation results for the dialog management system, in terms of speech recognition performance, lexical similarity, and overall option selection, will be presented and available for real-time demonstration. For physiological-based affect recognition, we randomly divided the data set into training, validation, and testing sets with proportions of 70%, 15%, and 15%, respectively, for all classifiers, and used 10-fold cross validation to fit the best model in each case; a sketch of this protocol follows. Classification accuracies for the four-channel physiological data across the machine learning algorithms evaluated exceeded 80%.
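The evaluation protocol in the preceding paragraph can be sketched as follows, assuming a scikit-learn workflow with synthetic stand-in data. The actual features, labels, and classifiers from the study are not specified here, so the SVC and array shapes below are placeholders rather than the study's pipeline.

```python
# Illustrative sketch of the evaluation protocol described above:
# a random 70/15/15 train/validation/test split plus 10-fold cross
# validation used to select the best model. Feature/label arrays and the
# choice of classifier are placeholders, not the study's actual data.

import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))          # stand-in for 4-channel physiological features
y = rng.integers(0, 2, size=400)       # stand-in affective-state labels

# 70% train, then split the remaining 30% evenly into validation and test
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, train_size=0.70, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=0)

# 10-fold cross validation on the training portion to compare candidate models
clf = SVC(kernel="rbf", C=1.0)
cv_scores = cross_val_score(clf, X_train, y_train, cv=10)
print("10-fold CV accuracy: %.2f +/- %.2f" % (cv_scores.mean(), cv_scores.std()))

# Final fit and held-out evaluation
clf.fit(X_train, y_train)
print("validation accuracy:", clf.score(X_val, y_val))
print("test accuracy:", clf.score(X_test, y_test))
```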

Conclusions: The current work provides a preliminary demonstration of the ability to develop VR environments and paradigms sensitive not only to performance within the system, but also to gaze patterns, physiological responses, and naturalistic speech. The ability to harness such behaviors within future intelligent systems may dramatically enhance VR environments as social intervention tools.