Connected Robot Platform for Children with Autism Spectrum Disorders

Poster Presentation
Friday, May 3, 2019: 10:00 AM-1:30 PM
Room: 710 (Palais des congrès de Montréal)
X. Jin1, Z. Tan1, W. Cao1, H. Zhu2 and J. Chen1,3,4, (1)South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou, China, (2)Child Developmental & Behavioral Center, Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China, (3)Chalmers University of Technology, Gothenburg, Sweden, (4)School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
Background: Children with autism spectrum disorders (ASD) often face barriers to language and behavior compared with typically developing children. By providing real-time human-robot interaction, social robots can be used for language and behavior training of children with ASD in daily life. However, data processing tasks such as emotion and gesture recognition are computationally intensive and are difficult to perform with high quality using only the limited computing power of the robot itself.

Objectives: We aim to build a cloud-computing-enabled social robot platform for children with ASD. Specifically, we are adding emotion and gesture recognition modules on top of the existing platform. These functions could greatly help recognize children's emotional states and gestures, based on which the robot can react to children with ASD in real time.

Methods: As shown in Figure 1, we use the social robot NAO as the front end to interact with the participants. NAO records audio and video data together with the surrounding equipment, including microphones, high-definition video cameras, motion-sensing input devices (e.g., Microsoft's Kinect), and eye-trackers. Emotion recognition can be carried out either by local processing on NAO or by a third-party cloud platform such as FACE++, a commercially available facial and image recognition service (https://www.faceplusplus.com.cn/). With local processing, NAO runs emotion recognition on board to identify the participants' emotions, which can be obtained through the NAOqi interface, the programming interface used to control NAO (e.g., from a web client). Alternatively, NAO can upload the images it collects to FACE++ over the Internet for emotion recognition. Apart from a third-party cloud platform, video and audio data can also be sent to our own data center for gesture recognition and other services.
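To make the cloud path concrete, the sketch below shows how an image captured by NAO could be sent to the FACE++ Detect API to obtain emotion scores. It is a minimal illustration only: the endpoint URL, the credentials, and the local image path are placeholders, and the image is assumed to have already been transferred from the robot.

```python
# Minimal sketch of the cloud path: send one image captured by NAO to the
# FACE++ Detect API and return the highest-scoring emotion label.
# The endpoint URL, API credentials, and image path are placeholders.
import requests

FACEPP_DETECT_URL = "https://api-cn.faceplusplus.com/facepp/v3/detect"  # assumed endpoint
API_KEY = "YOUR_API_KEY"        # placeholder credentials
API_SECRET = "YOUR_API_SECRET"

def recognize_emotion(image_path):
    """Upload one image and return the dominant emotion label, or None."""
    with open(image_path, "rb") as f:
        resp = requests.post(
            FACEPP_DETECT_URL,
            data={
                "api_key": API_KEY,
                "api_secret": API_SECRET,
                "return_attributes": "emotion",  # request only emotion scores
            },
            files={"image_file": f},
            timeout=10,
        )
    resp.raise_for_status()
    faces = resp.json().get("faces", [])
    if not faces:
        return None  # no face detected in this frame
    emotions = faces[0]["attributes"]["emotion"]  # e.g. {"happiness": 95.1, ...}
    return max(emotions, key=emotions.get)

if __name__ == "__main__":
    # The image is assumed to have been captured by NAO and copied to this path.
    print(recognize_emotion("nao_frame.jpg"))
```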

Results: We measured the accuracy and total waiting time of emotion recognition, with 10 trials for each tested emotion. The total waiting time is defined as the duration from the moment NAO takes a picture to the moment it receives the emotion result. The results for each emotion are listed in Table 1. The accuracy of emotion recognition carried out directly on NAO is not as good as that of FACE++, although the average total waiting time of local processing is much lower than that of the third-party cloud platform. This is because the total waiting time is strongly affected by network delay when a third-party cloud platform is used.
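For reference, the following sketch illustrates how the total waiting time could be timed in such an experiment: the clock starts when a recognition round trip begins and stops when the emotion label is available. The callables passed in (e.g., a wrapper around the cloud call sketched above, or a local-recognition routine on NAO) are hypothetical placeholders, not the exact code used on the platform.

```python
# Illustrative timing harness for the "total waiting time" measurement.
import time
import statistics

def measure_waiting_time(round_trip, trials=10):
    """Time one recognition path over several trials and return the mean latency in seconds.

    `round_trip` is a zero-argument callable covering the full path:
    take a picture on NAO, run recognition (locally or via the cloud),
    and return the emotion label.
    """
    durations = []
    for _ in range(trials):
        start = time.perf_counter()
        round_trip()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)

# Hypothetical usage, assuming the two paths are wrapped as callables:
# mean_cloud = measure_waiting_time(lambda: recognize_emotion("nao_frame.jpg"))
# mean_local = measure_waiting_time(run_local_recognition)  # placeholder name
```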

Conclusions: We have been adding emotion and gesture recognition functions on top of our existing cloud-computing-enabled social robot platform. The results show that cloud computing clearly improves the quality of emotion recognition, but at the cost of a longer response time. Improving the Internet connection or pushing computing facilities closer to the user end (referred to as edge/fog computing) is a potential solution to the latency issue in our connected robot platform.