Towards Empathetic Human-Robot Interactions

Fung, Pascale; Bertero, Dario; Wan, Yan; Dey, Anik; Chan, Ricky Ho Yin; Bin Siddique, Farhad; Yang, Yang; Wu, Chien-Sheng; Lin, Ruixi

doi:10.1007/978-3-319-75487-1_14

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9624)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

Since the late 1990s, when speech companies began bringing customer-service software to market, people have grown used to speaking to machines. As people interact more often with voice- and gesture-controlled machines, they expect those machines to recognize different emotions and to understand other high-level communication features such as humor, sarcasm and intention. To make such communication possible, machines need an empathy module: a software system that extracts emotions from human speech and behavior and decides on an appropriate response for the robot. Although research on empathetic robots is still at an early stage, current methods use signal processing techniques, sentiment analysis and machine learning algorithms to build robots that can ‘understand’ human emotion. Other aspects of human-robot interaction include facial expression and gesture recognition, as well as robot movement to convey emotion and intent. We propose Zara the Supergirl as a prototype of an empathetic robot. Zara is a software-based virtual android that presents herself on screen as an animated cartoon character. She will become ‘smarter’ and more empathetic as her machine learning algorithms gather more data and learn from it. In this paper, we present our work so far on deep learning for emotion and sentiment recognition, as well as humor recognition. We hope to explore the future direction of android development and how it can help improve people’s lives.

Notes

1. https://catalog.ldc.upenn.edu/LDC94S13A.
2. http://liwc.wpengine.com.
3. Extension of the text8 corpus, obtained from http://mattmahoney.net/dc/textdata.
4. From bigbangtrans.wordpress.com and http://www.livesinabox.com/friends/scripts.shtml.

Author information

Authors and Affiliations

Department of Electronic and Computer Engineering, Human Language Technology Center, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
Pascale Fung, Dario Bertero, Yan Wan, Anik Dey, Ricky Ho Yin Chan, Farhad Bin Siddique, Yang Yang, Chien-Sheng Wu & Ruixi Lin

Corresponding author

Correspondence to Pascale Fung.

Editor information

Editors and Affiliations

CIC, Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Fung, P. et al. (2018). Towards Empathetic Human-Robot Interactions. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9624. Springer, Cham. https://doi.org/10.1007/978-3-319-75487-1_14

DOI: https://doi.org/10.1007/978-3-319-75487-1_14
Published: 21 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75486-4
Online ISBN: 978-3-319-75487-1
eBook Packages: Computer Science, Computer Science (R0)
