Connected Robot Platform for Children with Autism Spectrum Disorders
Objectives: We aim to build a cloud-computing-enabled social robot platform for children with autism spectrum disorder (ASD). Specifically, we are adding emotion and gesture recognition modules on top of the existing platform. These functions can help recognize children's emotional states and gestures, based on which the robot can respond in real time to children with ASD.
Methods: As shown in Figure 1, we use the social robot NAO as a front end to interact with the participants. NAO records audio and video data together with the surrounding equipment, including microphones, high-definition video cameras, motion-sensing input devices (e.g., Microsoft Kinect), and eye-trackers. Emotion recognition can be carried out either by local processing on NAO or by a third-party cloud platform, e.g., FACE++, a commercially available facial recognition and image recognition service (https://www.faceplusplus.com.cn/). Local processing means that NAO runs emotion recognition on board to identify the participants' emotions; the results are obtained via NAOqi, the software framework used to program and control NAO. Alternatively, NAO can upload its captured images to FACE++ over the Internet for emotion recognition. Apart from a third-party cloud platform, video and audio data can also be sent to our own data center for gesture recognition and other services.
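To illustrate the cloud path described above, the sketch below POSTs a base64-encoded image to the Face++ Detect endpoint and extracts the highest-confidence emotion from the JSON response. The endpoint URL, form-field names, and response layout follow the public Face++ v3 API, but the credentials are placeholders and the exact schema should be verified against the official documentation before use.

```python
import base64
import json
import urllib.parse
import urllib.request

# Face++ Detect endpoint (v3 API); api_key/api_secret are placeholders.
FACEPP_DETECT_URL = "https://api-us.faceplusplus.com/facepp/v3/detect"

def detect_emotion(image_path, api_key, api_secret):
    """Upload one image to Face++ and return the parsed JSON response.
    Performs a real network round trip; requires valid credentials."""
    with open(image_path, "rb") as f:
        img_b64 = base64.b64encode(f.read()).decode("ascii")
    data = urllib.parse.urlencode({
        "api_key": api_key,
        "api_secret": api_secret,
        "image_base64": img_b64,
        "return_attributes": "emotion",  # ask only for emotion scores
    }).encode("utf-8")
    with urllib.request.urlopen(FACEPP_DETECT_URL, data=data) as resp:
        return json.load(resp)

def dominant_emotion(detect_response):
    """Pick the highest-confidence emotion for the first detected face.
    Assumes the v3 response shape: faces[0]["attributes"]["emotion"]
    maps emotion names to confidence values (0-100)."""
    faces = detect_response.get("faces", [])
    if not faces:
        return None  # no face found in the image
    emotions = faces[0]["attributes"]["emotion"]
    return max(emotions, key=emotions.get)
```

In the local-processing path, `detect_emotion` would be replaced by a call into NAOqi's on-board recognition, while `dominant_emotion` stays the same on either path.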
Results: We measured the accuracy and total waiting time of emotion recognition, with 10 trials for each tested emotion. The total waiting time is defined as the duration from the moment NAO takes a picture to the moment NAO receives the emotion result. The results for each emotion are listed in Table 1. The accuracy of emotion recognition carried out locally on NAO is lower than that of FACE++, although the average total waiting time is much lower for local processing than for the third-party cloud platform. This is because, when using a third-party cloud platform, the total waiting time is greatly affected by network delay.
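The total-waiting-time metric above can be measured by timestamping around the recognition call. In this minimal sketch, `recognize` is a stand-in for either the local path on NAO or a round trip to a cloud service such as FACE++; the function names are illustrative, not part of the actual platform code.

```python
import time

def timed_call(recognize, image):
    """Run one recognition request and return (result, elapsed_seconds).

    `recognize` stands in for either on-board processing or a cloud
    round trip; the elapsed time therefore includes any network delay.
    """
    start = time.perf_counter()
    result = recognize(image)
    elapsed = time.perf_counter() - start
    return result, elapsed

def average_wait(recognize, images):
    """Average total waiting time over repeated trials
    (the experiment above uses 10 trials per emotion)."""
    times = [timed_call(recognize, img)[1] for img in images]
    return sum(times) / len(times)
```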
Conclusions: We have been adding emotion and gesture recognition functions on top of our existing cloud-computing-enabled social robot platform. The results show that cloud computing improved the quality of emotion recognition but incurred a higher response time. Improving the Internet connection, or moving computing resources closer to the user end (edge/fog computing), is a potential way to address the latency issue in our connected robot platform.
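The accuracy-versus-latency trade-off in the conclusion can be made concrete with a simple selection rule: prefer the more accurate cloud path when its recent round-trip time stays within a latency budget, and fall back to local processing otherwise. The budget value and backend labels here are hypothetical illustrations, not parameters of the actual platform.

```python
def choose_backend(cloud_rtt_s, latency_budget_s=1.0):
    """Pick a recognition backend for a real-time interaction.

    cloud_rtt_s: recently observed round-trip time to the cloud
        service, in seconds (includes network delay).
    latency_budget_s: hypothetical response-time budget; a prompt
        robot reaction matters in interactions with children with
        ASD, so a late answer can be worse than a less accurate one.
    """
    return "cloud" if cloud_rtt_s <= latency_budget_s else "local"
```

An edge/fog deployment would, in effect, shrink `cloud_rtt_s` enough that the accurate path fits the budget most of the time.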