5 Review - Allouch et al. - 2021 - HCISE - Conversational Agents - goals Technologies Vision and Challenges
Review
Conversational Agents: Goals, Technologies, Vision
and Challenges
Merav Allouch 1 , Amos Azaria 1 and Rina Azoulay 2, *
1 Computer Science Department, Ariel University, Ariel 40700, Israel; [email protected] (M.A.);
[email protected] (A.A.)
2 Department of Computer Science, Jerusalem College of Technology, Jerusalem 9116001, Israel
* Correspondence: [email protected]
Abstract: In recent years, conversational agents (CAs) have become ubiquitous and are a presence
in our daily routines. It seems that the technology has finally ripened to advance the use of CAs in
various domains, including commercial, healthcare, educational, political, industrial, and personal
domains. In this study, the main areas in which CAs are successful are described along with the main
technologies that enable the creation of CAs. To conduct ongoing communication with
humans, CAs rely on natural-language processing, deep learning, and technologies that
integrate emotional aspects. The technologies used for the evaluation of CAs and publicly available
datasets are outlined. In addition, several areas for future research are identified to address moral and
security issues, given the current state of CA-related technological developments. The uniqueness of
our review is that an overview of the concepts and building blocks of CAs is provided, and CAs are
categorized according to their abilities and main application domains. In addition, the primary tools
and datasets that may be useful for the development and evaluation of CAs of different categories
are described. Finally, some thoughts and directions for future research are provided, and domains
that may benefit from conversational agents are introduced.
Keywords: smart environments; human–agent interaction; conversational agents
Citation: Allouch, M.; Azaria, A.; Azoulay, R. Conversational Agents: Goals, Technologies, Vision and
Challenges. Sensors 2021, 21, 8448. https://doi.org/10.3390/s21248448
Academic Editor: Carina Soledad González González
Received: 19 November 2021
Accepted: 10 December 2021
Published: 17 December 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Conversational agents (CAs) are agents that interact with users via written or spoken
natural language. CAs accept as input natural language as speech, text, or video; in
addition, they may receive input from several different sensors. CAs are required to
process the input and provide relevant advice or feedback in the form of text or speech or
by manipulating a physical or a virtual body. Some CAs are capable of taking specific
actions either in the real world or in the virtual world. Most CAs use natural-language
processing to understand and generate speech, and some may also have engagement and
personalization abilities. The rapidly growing abilities introduced by modern machine
learning techniques facilitate the development of CAs capable of carrying out meaningful
conversations with humans, learning to generate better and more relevant responses,
expanding their knowledge-base, and performing actions beneficial to their users.
Current technological development enables the increasing use of CAs in several
domains, such as assistance agents in the educational domain and health system, customer
support agents in the commercial domain, and influence bots in the political domain.
Commercial CAs for personal use, such as Siri [1] of Apple, Meena [2] of Google, and
Cortana [3] of Microsoft, are widely used around the world. The aim of our study was to
outline the principles behind the development of CAs and to survey the main domains in
which conversational agents are successfully used.
Several studies have been carried out in recent years on CAs and, in
particular, on text-based CAs that are called chatbots (as defined in Section 2). Some studies
concentrate on the technologies behind the development of CAs, and other studies examine
their impact on people, i.e., the way people interact with them and perceive them.
Several recent reviews survey CA development and usage, at times referring to them as
chatbots. Adamopoulou and Moussiades [4] provide a historical perspective of the chatbot
development process, present a complete chatbot-categorization system, and analyze the
two main approaches in chatbot development: pattern matching and machine learning.
They mention two limitations of the current generation chatbots in understanding and
producing natural speech, and they also point out that today’s technology aims to build
chatbots that can learn to talk but that cannot learn to think.
In another study, Adamopoulou and Moussiades [5] present an overview of the
evolution of the international community’s interest in chatbots and discuss the motivations
that drive the use of chatbots and their usefulness in a variety of areas. They clarify the
technological concepts and classify them based on various criteria, such as the area of
knowledge and the need they serve. Furthermore, they present the general architecture
of modern chatbots while also mentioning the main platforms they were created for.
Nuruzzaman et al. [6] present a survey on commonly used chatbots and the
underlying techniques. They focus on response-generating chatbots. In this category, the
various response models can be categorized into four groups: template-based, generative,
retrieval-based, and search engines. They compare the 11 most-popular chatbot application
systems and present the similarities, differences, and limitations. They conclude that
despite recent technological advances, chatbots conversing in a human-like manner are
still hard to achieve.
Another survey concentrating on the technologies used by CAs is that of Borah et al. [7].
They describe the overall architecture of CAs, concentrating on the machine learning
layer and analyze the recent development of text-based CAs. Chen et al. [8] describe the
technology behind CAs and dialogue systems in real-world applications and discuss the
effect of recent advances in deep learning on CA development. They emphasize that “big
data” available from conversations on social media can be useful in building data-driven,
open-domain CAs capable of responding to nearly any query. They further state that deep
learning technologies can be used to leverage the massive amount of data to advance
CAs from different perspectives. Gao et al. [9] concentrate on deep learning based CAs.
They group the conversational agents into three categories: question-answering agents,
task-oriented dialogue agents, and chatbots. For each category, they present a review of
state-of-the-art neural approaches, draw the connection between neural and traditional
approaches, and discuss the progress that has been made and challenges still being faced
using specific systems and models as case studies.
Diederich et al. [10] review 36 studies on CAs in information systems (IS). They
classify the literature along five dimensions. Three dimensions are related to CAs: the
mode of communication, the context, and embodiment; and the other two dimensions
are related to IS: the theory type and the research method. Wolff et al. [11] define a set
of criteria to categorize chatbot applications. They review 52 articles describing chatbots.
Most of the articles focus on customer-support chatbots, e.g., chatbots used to acquire
information on specific services or products.
In this article, we provide an overview of the concepts and building blocks of CAs and categorize
them according to their abilities as well as their main domains of application. We emphasize the
challenges and issues related
to CA development for each domain while describing the tools and datasets useful for
the development and evaluation of CAs of different categories. Finally, we provide some
thoughts and directions for future studies and introduce domains that may benefit from
conversational agents. For each of the topics in this survey, we focus on studies from
the past five years, though we also include earlier seminal studies as well as classical
evaluation methods. In addition, the datasets provided in Section 8 include any relevant
dataset that we found and are not limited to recent datasets.
The remainder of this article is organized as follows. Section 2 provides the terms and
concepts used in the domain of conversational agents and defines the terms used in this
study. Section 3 describes the design components of primary CA types. Sections 4 and 5
survey the main technologies used for conversational software development, including
machine learning (ML) methods and advanced technologies that enhance emotional abili-
ties. Section 6 surveys recent CA applications, including personal assistants, healthcare
agents, e-learning agents, and customer-support chatbots. The second part of this review
focuses on technological issues. Sections 7 and 8 review the technologies used to evaluate CAs
and the datasets commonly used for CA development and testing, respectively. Finally, Section 9
concludes by providing ideas and directions for future developments.
Figure 1. Conversational agents and chatbots: the definitions used in this article.
CAs can also be classified according to their effector capabilities and actions.
Communication-only agents merely communicate with a user and do not execute any
action, e.g., ELIZA [19], Cleverbot [21,22] or CAs used only to answer questions. Other
CAs, known as virtual or personal assistants, e.g., Alexa [23], are capable of executing
physical or virtual actions, such as turning on an AC or booking a flight (see Figure 2).
Finally, CAs can be classified according to the application: (a) Open domain/general
purpose CAs are mainly used to answer questions in various domains or in entertainment
and are mostly communication-only agents. (b) Goal-oriented CAs assist users in completing
tasks requiring multiple steps and decisions. Goal-oriented CAs are also known as task-oriented
dialogue systems [24] and are referred to as taskbots in the Alexa Prize competition [25].
These agents may be used either in the business domain or as personal assistants.
In the business domain, they operate as customer-service and sales representatives. As per-
sonal support agents, they can assist the user in particular tasks, such as driving, vacation
planning, or trip management. (c) Social-supporting agents can support patients in medical
conditions or support students in the learning process. (d) Social-network bots, also known
as influence agents, are intelligent CAs acting in social media to advertise a product or to
influence opinions (see Figure 3). The rest of the article uses the terms defined in Figure 1
while considering various CA applications, as detailed in Figure 3. A detailed survey on
CA usage in various domains is provided in Section 6.
traditional NLU stack is based on the following five components: phonology, morphology,
syntax, semantics, and reasoning [45].
In particular, morphological analysis or parsing can be viewed as resolving natural-
language ambiguity at different levels by mapping a natural language sentence to a series
of human-defined, unambiguous, symbolic representations, such as part-of-speech (POS)
tags, context-free grammar, and first-order predicate calculus. NLU includes the following
sub areas: resolution, discourse analysis, machine translation, morphological segmen-
tation, named-entity recognition, POS tagging, and more [27]. For a review on natural
language understanding, the reader is referred to the survey of Navigli [46], in which sev-
eral NLU approaches and modes are reviewed, including explicit versus implicit learning,
representation of words and semantics, and a vision on what machines are expected to
understand.
In the remainder of this section, the focus is on studies that use NLU for CA devel-
opment. Initially, CAs using classical NLU technologies are described. Next, CAs using a
parser as their NLU component are described. To conclude, recent CAs that use advanced
technologies for NLU are described.
A classical approach for designing chatbots is the pattern-matching approach, in which
the CA matches the user input with a pattern and chooses the most-suitable response stored
in its predefined text corpus. One example of a CA that is based solely on simple pattern
matching is ELIZA [19]. Over the years, several studies have developed additional rules
and corpora to develop more-adaptive and advanced CAs. Inui et al. [47] use a linguistic
corpus to design a CA interface. The dialogue corpus is based on a series of dialogues, and
NLU is achieved by adopting corpus-based methods like the stochastic model, the n-gram
model, keyword matching, and structural matching.
ALICE [48] is a chatbot based on AIML [49], an XML-based language designed to
create chatbots based on pattern matching. ALICE won the Loebner Prize as “the most
human computer” at the annual Turing Test contests of 2000, 2001, and 2004. ALICE
answers the user’s query by using its pattern-matching engine, which searches for a lexical
correspondence between the user’s query and the chatbot’s patterns.
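The pattern-matching approach described above can be illustrated with a minimal sketch. The rules,
response templates, and the function name respond are hypothetical and are not taken from ELIZA,
ALICE, or AIML; the sketch only shows the general idea of matching the user input against stored
patterns and returning the associated response.

```python
import re

# Hypothetical rules: (regex pattern, response template).
# These are illustrative only and are not taken from ELIZA, ALICE, or AIML.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bmy name is (\w+)", re.IGNORECASE), "Nice to meet you, {0}."),
    (re.compile(r"\b(hello|hi)\b", re.IGNORECASE), "Hello. How can I help you?"),
]
FALLBACK = "Please tell me more."


def respond(utterance: str) -> str:
    """Return the response of the first rule whose pattern matches the input."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Strip trailing punctuation from captured text before reusing it.
            groups = [g.rstrip(".!?") for g in match.groups()]
            return template.format(*groups)
    # No pattern matched: the agent can only give a generic reply.
    return FALLBACK
```

For example, respond("I need a vacation.") fills the first template with the captured text, whereas
an input that matches no rule receives only the generic fallback reply.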
Agostaro et al. [50] outline the limitations of the pattern-matching approach. Pattern
matching may fail to answer the user query when the query is composed of words that do
not match any pattern. Likewise, when the query is grammatically incorrect, the pattern-
matching mechanism may fail. To overcome these limitations, Agostaro et al. developed
LSA-bot [50], which is a chatbot based on latent semantic analysis (LSA). LSA applies
statistical computations to a large corpus of text to extract and represent the meaning of
words. LSA-bot uses LSA to map its knowledge base into a conceptual space. The user
input is mapped into the same conceptual space, allowing LSA-bot to find an appropriate
response.
The informal response interactive system (IRIS) chatbot, developed by Banchs and
Li [51], uses a large database of dialogues to provide candidate responses to a given user
utterance. The IRIS response-selection process chooses the candidate utterances using
two scores. The first score is determined by the cosine similarities between the current
user input vector and all single utterances stored in the database. The second score is
determined by the cosine similarity between the vector of the current dialogue and the dialogue
history of the user. The two scores are combined using a log-linear scheme. IRIS then
randomly selects one of the top-ranked utterances as its response.
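The two-score selection scheme described above can be sketched as follows. This is not the IRIS
implementation: the bag-of-words vectors, the equal weights, the smoothing constant eps, and the
layout of the database as (utterance, history, reply) triples are all assumptions made for
illustration.

```python
import math
import random
from collections import Counter


def bow(text: str) -> Counter:
    """Bag-of-words term counts (an assumed, simplified vector representation)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def log_linear(s_utt: float, s_hist: float, w_utt: float = 0.5, w_hist: float = 0.5,
               eps: float = 1e-9) -> float:
    """Weighted sum of log-scores; eps avoids log(0). The weights are illustrative."""
    return w_utt * math.log(s_utt + eps) + w_hist * math.log(s_hist + eps)


def select_response(user_input, history, database, top_k=2, rng=random):
    """Rank stored replies by the combined score and pick one of the top_k at random.

    database: list of (stored_utterance, stored_history, stored_reply) triples
    (a hypothetical layout chosen for this sketch).
    """
    u_vec, h_vec = bow(user_input), bow(history)
    scored = []
    for utt, hist, reply in database:
        score = log_linear(cosine(u_vec, bow(utt)), cosine(h_vec, bow(hist)))
        scored.append((score, reply))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = [reply for _, reply in scored[:top_k]]
    return rng.choice(top)
```

Choosing randomly among the top-ranked candidates, rather than always returning the best one,
trades a little relevance for more varied replies.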
A context-free-grammar (CFG) parser [52] is often used by CAs for NLU. A CFG
parser builds a constituency parse tree from the given user utterance based on a grammar,
which is composed of parsing rules. A more generalized CFG, which is more suitable for
solving ambiguity, is the probabilistic CFG (PCFG) [53,54]. In a PCFG parser, each rule in
the grammar is associated with some probability. A PCFG parser outputs the parse tree
with the highest probability.
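As an illustration of how a PCFG parser selects the most probable tree, the following sketch applies
a probabilistic CKY (Viterbi) procedure to a toy grammar in Chomsky normal form. The grammar, the
lexicon, and all probabilities are invented for this example, and the function name viterbi_parse is
hypothetical.

```python
from collections import defaultdict

# Toy PCFG in Chomsky normal form; rules and probabilities are invented for illustration.
BINARY = {  # (left child, right child) -> list of (parent, rule probability)
    ("NP", "VP"): [("S", 1.0)],
    ("V", "NP"): [("VP", 0.7)],
    ("VP", "PP"): [("VP", 0.3)],
    ("NP", "PP"): [("NP", 0.2)],
    ("P", "NP"): [("PP", 1.0)],
}
LEXICAL = {  # word -> list of (preterminal, rule probability)
    "she": [("NP", 0.4)],
    "saw": [("V", 1.0)],
    "stars": [("NP", 0.3)],
    "telescopes": [("NP", 0.1)],
    "with": [("P", 1.0)],
}


def viterbi_parse(words):
    """Return (probability, tree) of the most probable S parse, or None if there is none."""
    n = len(words)
    # best[(i, j)][label] = (probability, tree) of the best analysis of words[i:j]
    best = defaultdict(dict)
    for i, word in enumerate(words):
        for label, p in LEXICAL.get(word, []):
            best[(i, i + 1)][label] = (p, (label, word))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for left, (p_left, t_left) in best[(i, k)].items():
                    for right, (p_right, t_right) in best[(k, j)].items():
                        for parent, p_rule in BINARY.get((left, right), []):
                            p = p_rule * p_left * p_right
                            # Keep only the highest-probability tree per label and span.
                            if p > best[(i, j)].get(parent, (0.0, None))[0]:
                                best[(i, j)][parent] = (p, (parent, t_left, t_right))
    return best[(0, n)].get("S")
```

For the ambiguous sentence "she saw stars with telescopes", this toy grammar assigns a probability of
about 0.00252 to the reading in which the prepositional phrase attaches to the verb phrase and about
0.00168 to the reading in which it attaches to the noun phrase, so the former tree is returned.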
Azaria et al. [55] present LIA, an agent that uses a combinatory categorial grammar
(CCG) parser as its NLU component. The parser maps the commands, which are given
in natural language, to logical forms, which contain functions and concepts that can later
be executed by the dialogue manager. CCGs benefit from being more expressive than
CFGs as they can represent the long-range dependencies appearing in some sentences
(e.g., relative clauses), which cannot be expressed using CFGs.
Recent ML and word-embedding methods are widely adopted to build NLU components with higher
performance. Rasa NLU and Rasa Core [56] are open-source Python libraries for building
conversational software. Rasa NLU allows the use of a predefined pipeline for the NLU process.
Its recommended pipeline starts by tokenizing the user input, followed by the conversion of
each token to a GloVe embedding vector [57]. Then, a multiclass support vector machine
(SVM) [58] is used for deciding which action to take. Custom entities are recognized using
a conditional random field [59].
ConvLab-2 [24], which is an open-source toolkit for building goal-oriented CAs,
provides three NLU models: a semantic tuple classifier, a multi-intent language understanding
model [60], and a fine-tuned BERT-based [61] NLU model capable of
intent classification and slot tagging.
are the output of the dialogue manager and the corresponding natural-language texts.
They find that seq2seq NLG systems generally score high in terms of word-overlap metrics
and human evaluations of naturalness but often fail to correctly express a given meaning
or representation if they lack a strong semantic-control mechanism during decoding.
Moreover, they can be outperformed by hand-engineered systems in terms of the quality,
complexity, and diversity of outputs.
to integrate external systems to provide an explanation for the particular responses. They
present an end-to-end monolithic neural model that learns to follow the core steps in
the dialogue-management pipeline. The model outputs all the intermediate results in
the dialogue-management pipeline to enable integration with the external system and to
interpret why the system generates a particular response.
Kim [90] presents an end-to-end document-grounded, goal-oriented CA that utilizes a
pretrained language model with an encoder–decoder structure. The encoder solves both
the knowledge-seeking turn-detection task and the knowledge-selection task; the decoder
solves the response-generation task.
Das et al. [91] suggest using DRL to learn the policies of goal-oriented CAs to answer
visual questions. They pose a cooperative dialogue between two CAs communicating in
natural language: one CA sees the image, and the second CA asks the first one questions
about the image. DRL is used for learning
the policies of these agents during the multi-round dialogue. As a result, the two trained
CAs invent their own communication protocol without any human supervision.
supervised learning, and then their abilities are improved by allowing them to conduct
task-oriented dialogues while iteratively improving the policies using DRL.
5. Human-Related Issues
In addition to the technical issues of natural language understanding and genera-
tion, good conversational agents should be aware of human characteristics, observe user
emotions, provide empathy in their responses, and engage the user.
According to Clark et al. [97], humans perceive communication with CAs as a means
to achieve functional goals. In their study, Clark et al. present the results of semi-structured
interviews on how people view the conversation between humans and CAs. They found
that several social features reported as crucial in human–human conversation, such as
understanding and common ground, trust, active listenership, and humor, are not listed as
required for human–CA conversations. CA conversations are described almost exclusively
in transactional and utilitarian terms. However, this view of CAs is not satisfactory in
domains that require the user to engage and form an emotional bond with the CA.
Yang et al. [98] argue that understanding users’ affective experience is crucial to the
design of compelling CAs. To support this claim, they surveyed 171 users of
Google Assistant and examined the affective responses in four major usage scenarios. In
addition, they observed the factors that influence affective responses. They found that the
overall experience of the user was positive, with the most salient emotion being interest.
Both pragmatic and hedonic qualities influence affective experience. The factors
underlying the pragmatic quality are helpfulness, proactivity, fluidity, seamlessness, and
responsiveness. The factors underlying the hedonic quality are comfort in human–machine
conversation, the pride of using cutting-edge technology, fun during use, the perception of
having a human-like assistant, a concern about privacy, and the fear of causing distraction.
In the remainder of this section, several issues are discussed that can assist in establishing
a deeper connection between the user and the CA during conversations. The focus is on
the following aspects: emotional issues, CA personality, and adaptation to the taste and
needs of the user.
The challenge of listening to the user and understanding the user’s emotional feelings
is considered in Sarder’s [102] thesis work, which studies the issue of conversational-agent
development for mental-health intervention. Sarder built an embodied conversational
agent with three different levels of backchannel strategies and ran a within-subject study
with a convenience sample of 24 participants. He showed that the emotional content
recognized in the words of the user increases as the CA listening capabilities increase.
As stated above, the second challenge for a CA with emotional abilities is to provide
the appropriate response given the user’s emotional state. The ability to recognize the
emotions and feelings of others and to reply accordingly is known as empathy, which is
a crucial socio-emotional behavior for smooth interpersonal interactions. Therefore, the
second emotional challenge is to assimilate empathy into CAs.
Empathy can be verbal and non-verbal. Yalcin [103] suggests that embodied CAs
should be equipped with real-time multimodal empathic-interaction capabilities. The
empathic framework leverages three hierarchical levels of capabilities to model empathy
for CAs. Following the theoretical background on empathic behavior in humans, the
embodied CA can express empathy by using facial expressions; gaze, head, and body
gestures; as well as verbal responses.
Tellols et al. [104] propose equipping the CA with sentient capacities, using ML
technologies. They illustrate their proposal by embedding a virtual tutor in an educational
application for children. Their CA has a unique personality, emotional understanding,
and needs that the user has to meet. The CA’s needs can be expressed by Maslow’s
hierarchy of needs [105]. Tellols et al. tested the two CA versions with 10–12 year-old
students and found that the second version, equipped with ML capabilities, displays higher
understanding capacity and yields a nearly 100% user satisfaction rate. Emotional effects,
as well as properties of the speaking style, can be added to the CA to generate speech that
is closer to human dialogue.
Chen et al. [106] proposed a conditional text-generative adversarial network (CTGAN),
in which an emotion label is adopted as an input channel to specify the output text.
To match the generated text data to the real scene, they designed an automated word-
level replacement strategy such that after generating initial texts by CTGAN, they extract
keywords from the training texts and replace them in the generated texts.
XiaoIce is a popular social CA, developed in 2014 by Microsoft. Zhou et al. [107]
describe the design of XiaoIce as an AI companion with an emotional connection. The
XiaoIce design includes the intelligence quotient (IQ), the emotional quotient (EQ), and a
culturally sensitive personality. The IQ capacity is achieved by knowledge and memory
modeling. The EQ capacity includes two key components: empathy and social skills. Both
IQ and EQ are combined in a unique personality. The CA personality is defined as the
characteristic set of behaviors, cognition, and emotional patterns that form an individual’s
distinctive character. XiaoIce’s developers have designed different personas for XiaoIce to
suit the preferences and desires of users in different cultures and regions. By analyzing
the XiaoIce online logs, Zhou et al. show that XiaoIce understands user intent, recognizes
human feelings, generates appropriate responses, and is capable of establishing a long-term
relationship.
Asghar et al. [108] propose three methods to incorporate emotional aspects into
encoder–decoder neural-conversation models: affective word embeddings, augmenting
affective objectives in the loss function, and incorporating a search for affective responses
during text decoding. Affective word embedding, in 3D space, can be performed using a
cognitive-engineering affective dictionary. Affective objectives can be augmented in the
cross-entropy loss function to generate additional emotional responses. Finally, the CA
can be guided to search for affective responses during decoding. Asghar et al. show that
incorporating these emotional aspects improves the quality of the CA responses in terms
of syntactic coherence, naturalness, and emotional appropriateness.
Zhou et al. [109] explain the range of challenges that exist in addressing the emotion
factor in large-scale conversation generation. These include: (i) the difficulty of obtaining
high-quality emotion-labeled data since emotion annotation is a subjective task, (ii) the
need to balance grammar and emotion in expressions, and (iii) the challenge of embedding
emotion information. To express emotion naturally and coherently in a sentence, they de-
signed a seq2seq generation model equipped with new mechanisms for emotion-expression
generation.
To summarize, considering that the user’s emotional experience and engagement are
of great importance in various social and health domains, several studies suggest methods
to recognize the user’s emotional state and provide an appropriate empathic response. The
emotional awareness of CAs can make the user more satisfied and can yield longer and
more meaningful human–CA conversations.
more likely to select same-race agent personas when they were given an opportunity to
customize the ECA.
Go and Sundar [117] tested the distinct and combined effects of three types of cues
that potentially enhance the humanness of chat agents: human-like visual cues, the use of
human names or identities, and the use of human language. For these three factors, the
authors examined how interactions among these cues influence psychological, attitudinal,
and behavioral outcomes. Their experimental results indicate that CA interactivity is an
important factor in determining psychological, attitudinal, and behavioral outcomes, while
the identity cue turns out to be a key factor in eliciting certain expectations regarding
the CA’s performance in conversation. However, message interactivity can compensate for the
impersonal nature of the CA.
A good open-domain CA should be able to seamlessly blend all its skills, including
the ability to be engaging, knowledgeable, and empathetic into one conversational flow.
Smith et al. [118] present a method for training a CA with blended skills and testing it.
They show that existing single-skill tasks can effectively be combined to obtain a model
that blends all skills into a single CA. To preclude unwanted biases when selecting the skill,
fine-tuning was done on the blended data.
Figure 8. Human-related aspects of the CA: emotion sensitivity, personality expression, and adapta-
tion to the user’s taste and needs.
Finally, MILABOT [74] is a DRL-based CA, developed for the Amazon Alexa Prize
competition. MILABOT is capable of chatting with humans through speech or text. It was
trained on crowdsourced data and real-world user interactions.
virtual assistants, such as Alexa, include extensions that enable the learning of foreign
languages [148]. Alexa has the skills to assist in building a vocabulary and handling a
conversation in a foreign language. Pham et al. [149] developed English Practice, which is
a mobile chatbot application to assist a user in learning new vocabulary and to carry on
a conversation. Another CA dedicated to language learning is Lucy [150], an embodied
virtual agent, designed to help users to learn vocabulary and grammar and to carry on a
conversation.
CAs can also be used to support the administration in educational systems. For
example, Hien et al. [151] present FIT-EBot, a chatbot that responds to student questions
related to services provided by the education system on behalf of the academic staff.
Similarly, Ranoliya et al. [152] introduced a chatbot designed to answer visitor questions
at Manipal University. It provides an answer based on a dataset of frequently asked
questions (FAQ) using AIML. When a user asks a query, the chatbot searches for a similar
question and provides the answer to that question. Another chatbot was developed by
Keeheon et al. [153] to provide information in educational systems by answering frequently
asked questions. The chatbot was successfully used by students and department offices in
Underwood International College, Korea. The authors reported that the use of the chatbot had a
positive influence on administrative work by reducing workload.
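The FAQ-based behavior described above, in which the chatbot searches for a stored question similar
to the user query and returns its answer, can be sketched as follows. The chatbots cited above rely
on AIML or their own matching engines; this sketch substitutes a simple Jaccard token-overlap
measure, and the FAQ entries, the threshold value, and the function names are hypothetical.

```python
import re

# Hypothetical FAQ entries (question -> answer); not taken from any cited system.
FAQ = {
    "What are the library opening hours?": "The library is open from 8:00 to 20:00.",
    "How do I apply for a scholarship?": "Submit the scholarship form at the student office.",
}


def tokens(text):
    """Lowercased set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity: size of the intersection divided by size of the union."""
    return len(a & b) / len(a | b) if a | b else 0.0


def answer(query, faq=FAQ, threshold=0.3):
    """Return the answer of the most similar stored question, or None if none is close enough."""
    q = tokens(query)
    best_question = max(faq, key=lambda stored: jaccard(q, tokens(stored)))
    if jaccard(q, tokens(best_question)) < threshold:
        return None  # Assumed policy: hand the query over to a human or a fallback reply.
    return faq[best_question]
```

The threshold keeps the chatbot from answering queries that resemble no stored question; its value
would have to be tuned on real student queries.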
Discussion-bot [154], developed by Feng et al., provides answers to students’ discussion-
board questions using natural language. Given a question, it mines suitable answers from an
annotated corpus of archived discussions and course documents and chooses an appropriate
response.
Vanderborght et al. [162] developed Probo, which is a social story-telling robot capable
of expressing emotions via facial expressions and gaze. Probo uses stories to teach children
with ASD how to react in different situations, such as saying “hello” or “thank you.”
Probo also teaches children to share their toys. Vanderborght et al. showed that there are
situations where the social performance of autistic children improves when using Probo.
Another known robot developed in the same project is Nao [163], an embedded CA
that has been tested and deployed in several healthcare scenarios, including care homes
and schools.
However, most of the physicians believe that CAs cannot effectively take care of patients’
needs or provide detailed diagnosis and treatment. Nadarzynski et al. [178] studied the
acceptability of CAs in healthcare from the perspective of the general public. While the
participants in the study recognized the potential of CAs in healthcare, they stated that
their experience is not satisfactory enough and that they are concerned about security
issues. Scholten et al. [179] surveyed several CAs in the field of healthcare. They concluded
that while CAs can increase the motivation of patients and promote behavioral change,
user needs are often implicit, and such needs cannot be addressed by CAs.
DARPA as influence chatbots. The leading group detected all influence chatbots, using a
combination of machine learning techniques along with a user support system.
Lee et al. [201] deployed honeypots in the Twitter social network to identify and
analyze content polluters. They investigated the attributes of Twitter users, including user
behavior over time, user followers, and user following. They also enumerate features that
may assist in identifying content polluters automatically, and they present a classification
model. Finally, they show that their model successfully identifies content polluters.
To summarize this section, Figure 9 refers to the CA definitions (provided in Figure 1)
and, for each type of CA, details the domain of applicability.
7. Evaluation Metrics
Three main approaches are used in the literature for evaluating the quality of a
conversational agent: human-based evaluation procedures, machine evaluation metrics
based on language characteristics, and an ML approach trained on a dataset consisting
of human evaluations. The advantages of human evaluation are clear, as humans can
evaluate whether the CA responses seem appropriate and resemble human responses. However,
since human evaluation procedures are expensive, several automatic metrics have been
proposed for the evaluation process. Unfortunately, due to the linguistic richness of natural
languages and the wide variety of reasonable response options, it is still challenging to
achieve accurate and meaningful evaluation when using automatic tools. Therefore, the
ML approach tries to benefit from both approaches: on the one hand, it is based on human
evaluation, and, on the other hand, it does not require a new, costly human evaluation
for each new dialogue situation.
Radziwill and Benton [14] present a literature review of quality issues related to CA
development and implementation, focusing on two topics: quality-attributes and quality-
assessment approaches. Deriu et al. [202] surveyed the main concepts and methods of CA
evaluation. For each type of CA, task-oriented, conversational, and question-answering
dialogue systems, they defined the main technologies and the evaluation methods that
are appropriate for that type. The requirements of the evaluation methods are stated
with respect to automated or partially automated evaluation, repeatability of the results,
correlation with human judgment, ability to focus on CA features, and explainability.
Finally, Masche and Le [16] divide the different evaluation methods into four classes:
qualitative analysis, quantitative analysis, pre/post-test, and CA competition.
Sensors 2021, 21, 8448 27 of 48
In this section, the evaluation methods are divided into three classes, according to the
way they are obtained, namely, human-based evaluation, machine-based evaluation, and
the ML approach, and some popular evaluation methods are further described for each of
these three classes.
This is a strong assumption for CAs, which exhibit a significant diversity in the space of
valid responses. They show that many commonly used metrics for CA evaluation do not
correlate strongly with human judgment, and they conclude that there is a need for a new
metric that correlates more strongly with human judgment.
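The comparison between an automatic metric and human judgment can be illustrated with a small sketch. Here a clipped unigram-precision score stands in for a word-overlap metric, and its Pearson correlation with human ratings is computed. The responses, the reference, and the ratings are invented toy data, not results from any study, and the function names are hypothetical.

```python
from math import sqrt


def unigram_precision(response: str, reference: str) -> float:
    """Fraction of response tokens that also occur in the reference (clipped counts)."""
    resp = response.lower().split()
    if not resp:
        return 0.0
    remaining: dict[str, int] = {}
    for tok in reference.lower().split():
        remaining[tok] = remaining.get(tok, 0) + 1
    matched = 0
    for tok in resp:
        if remaining.get(tok, 0) > 0:
            matched += 1
            remaining[tok] -= 1
    return matched / len(resp)


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(vx * vy)


reference = "i am fine thank you"
# (candidate response, invented human rating on a 1-5 scale)
samples = [
    ("i am fine thank you", 5),            # same words, good answer
    ("doing great thanks for asking", 5),  # no shared words, still a good answer
    ("i am you fine thank", 1),            # same words, scrambled
    ("the weather is blue", 1),            # no shared words, bad answer
]
metric_scores = [unigram_precision(resp, reference) for resp, _ in samples]
human_scores = [float(rating) for _, rating in samples]
correlation = pearson(metric_scores, human_scores)
```

The toy data are deliberately constructed so that a valid paraphrase scores as low as an irrelevant reply while a scrambled copy scores as high as the reference, which is the kind of mismatch that weakens the correlation between overlap-based metrics and human ratings.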
Educational CAs

| CA | Short Description | Main Technology | Evaluation Method |
|---|---|---|---|
| Sara [125] | student’s assistant | scaffolding strategy | pretest and posttest scores of learners; pre-survey and post-survey |
| AutoTutor [139] | computer tutor | LSA, pattern-matching, speech act classification | learning gain |
| MSRbot [140] | software-related Q&A | Dialogflow | effectiveness, efficiency |
| Zhorai [145] | CA for children to explore ML concepts | NLTK package, Website visualizer | accuracy, child’s level of engagement |
| MathBot [146] | math teaching chatbot | rule based | crowd worker preferences |
| English Practice [149] | Personal Assistant for Mobile Language Learning | Dialogflow platform | statistics about real users |
| Lucy [150] | embodied on-line virtual agent for language learning | ALICE offshoot | demonstrative examples |
| FIT-EBot [151] | administrative chatbot | Dialogflow | students’ reports |
| QTrobot [161] | social robot to assist children with ASD | bodied humanoid robot | interviews with the users |
| Probo [162] | social robot for children with ASD | compliant actuation systems | children’s performance |
Healthcare CAs

| CA | Short Description | Main Technology | Evaluation Method |
|---|---|---|---|
| CoachAI [168] | patient’s support chatbot | task-oriented finite state machine (FSM) architecture | user’s engagement, system acceptance and rating |
| Woebot [174] | therapist CA | AI, NLP, empathy engine | users’ reports |
| Mandy [126] | a primary care CA | NLU, NLG, word2vec | accuracy |
| Tanya [175] | graphically embodied female agent that supports breastfeeding | | increased breastfeeding success |
| KR-DS [173] | diagnosis chatbot | Bi-LSTM, Deep Q-network | diagnosis accuracy |
Commercial CAs

| CA | Short Description | Main Technology | Evaluation Method |
|---|---|---|---|
| SuperAgent [183] | customer-service chatbot | AIML + LSA 2 | customer reviews |
| SamBot [187] | question-answering CA | AIML | Loebner Prize Competition + user interaction |
dialogues generated by conversational agents, and others are related to other domains,
such as image and video captioning, computer vision, speech recognition, and synthesis.
Deriu et al. [202] provide another list of available conversation corpora focusing on
task-related conversations in several domains, such as the restaurant domain and the tourist
information domain. They note that data for question-answering dialogue systems can be extracted
either from chat logs or from several available sources, including literature, news, scientific
resources, Wikipedia articles, FAQ sites, and even cooking domains.
In the remainder of this section, some of the most useful corpora for conversation
understanding, generation, and evaluation are described and classified according to their
applications, using the terms defined in Section 2.
Twitter-specific structures, such as hashtags. Similar to the issue with artificial datasets,
Serban et al. note that dialogues extracted from social media may be missing context. In
addition, as stated by Kourosh [225], the use of auto-correction by users of social media
may add another layer of complication.
General-Purpose Datasets

| Dataset | Source | Description | Size | Used for |
|---|---|---|---|---|
| DailyDialog [213] | hand written, manually labeled | daily interactions | 13,118 dialogs, ~7.9 turns | general purpose |
| [216] | subtitles | interaction–response pairs | | purpose |
| Movie dialogue dataset [217] | movie metadata as knowledge triples | OMDb, MovieLens, and Reddit | 3.1 M simulated QA pairs | Movies QA and recommendation |
| Cornell Movie Dialogues Corpus [218] | Short conversations from film scripts | movie metadata | 220 K conversations | understanding linguistic style |
| Ubuntu dialogue corpus [224] | Ubuntu chat stream | human–human chat | 930 K conversations | response generation |

Question-Answering Datasets

| Dataset | Source | Description | Size | Used for |
|---|---|---|---|---|
| Squad Version 1.1 [227] | questions and answers on Wikipedia articles | ~100 K questions on Wikipedia articles | 100 K q&a | machine reading comprehension |
| Squad Version 2 [228] | questions and answers and additional questions with no answers | Squad 1.1 + 50 k questions with no answers | 100 K Q&A + 50 k questions with no answers | machine reading comprehension |
| CNN/Daily Mail comprehension [229] | queries from the CNN and Daily Mail websites | cont.–query–answer triples | ~1 M stories + associated queries | machine reading training dataset |
| Natural Questions dataset [230] | Google search queries + Wikipedia answers by crowd workers | Google question + long answer + short answers | 307,372 training examples | training & evaluation of answ. systems |
| TriviaQA [231] | crowdworkers questions | question-answer-evidence triples | 95 K quest.-ans. pairs + 6 evidence doc. per quest. | reading comprehension |
influence agents, designed to impact the business and public sectors. Figure 11 summarizes
the information provided by the different illustration diagrams, which appear in this survey,
categorized according to their aims.
There are, however, various additional situations where CAs can be utilized to assist
and support people. The most advanced state-of-the-art CAs improve themselves
based on new data. There are very few CAs, however, that allow humans to teach them
additional knowledge and new capabilities or to provide them with the ability to direct
their learning process. One of the few systems that can learn directly from humans is
commonsense reasoning by instruction (CORGI) [247]. CORGI performs the commonsense
reasoning required in applying if-then rules, by initiating a conversation with the user.
Another example is Safebot [248], which is taught new responses by the user to avoid
learning inappropriate responses. Finally, the learning-by-instruction agent (LIA) [249]
asks the user to explain how to execute a new command and associates a sequence of
natural-language steps with it. Such systems enable users to fine-tune CAs to adapt them to
personal needs and preferences. To further enhance such systems, additional appropriate
protocols, algorithms, and rules should be developed and examined.
Another domain where CAs may be useful is in explanatory interactive systems [250,251],
which aim to explain to humans the reasons behind decisions made by an automated
system. Such explanations are necessary to strengthen the trust between agents and people.
CAs may be used to make machine explanations understandable to the human user.
Another area in which CAs are expected to be more prominent is related to consulting
a person during his/her conversations. Such a consulting agent would be expected to
support people in their daily interactions with other people. The agent is required to model
all participants of the conversation to identify their needs in complex social situations to
be able to advise them on how to act, talk, or respond in complex social interactions. In
our ongoing study [100,240], technology is being developed to assist children with special
needs in their daily interaction while monitoring the environment for them.
It should also be emphasized that as CAs become ubiquitous and their ability to
provide human-like responses improves, a significant moral question arises: Is there a
need to declare the identity of the service or the technical-support representative? Do CAs
acting as support or sales agents have the obligation to share their nature with the clients?
While studies have revealed that people feel more engaged when conversing with other
humans [97], it remains questionable whether maintaining the obscurity of the agent is
right, fair, or justified [252].
Another related moral issue arises when considering influential agents. Considering
the current state of the technology, any company, party, or ideological movement may
develop a CA as a representative to describe its agenda and influence public opinion to
garner support for its position. To what extent is such a practice considered moral? Situations
where the CA identity is known or hidden should be distinguished, and situations
where the company or party is represented by a single CA or by several, hundreds, or even
thousands, to create a representation of mass support should be carefully considered and
clarified. Surely, using a mass of CAs to influence public opinion seems to be dishonest
and unfair, but where is the moral limit?
In addition, given the possibility of such an unfair usage of influence agents, technology
should be developed to be able to detect such unfair influence. In Section 6.5, some
studies are described that deal with detecting malicious “influence bots”. As the technological
ability of such influence bots increases, detecting them becomes more challenging.
However, such detection may be crucial, especially when considering extreme groups that
may have incentives to utilize such agents for negative purposes.
The use of assistant agents raises several issues related to the challenge of protecting
user privacy. Mainly, assistant-agent developers must prevent other parties, such as
commercial companies and adversaries, from using information acquired by the assistant
agent. Information-security technologies should be employed to avoid such situations.
To summarize, the rise of CAs and their applications can have a significant influence
on our future life. Some of these applications are positive and even crucial, such as health
support or social support; others can be beneficial to business and companies; and others
should be monitored or even avoided for moral reasons. The limits of fair use of CAs and
the technological tools to enforce these limits should be discussed and developed in future
research.
Funding: This research was supported in part by the Ministry of Science, Technology & Space, Israel.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
References
1. Bosker, B. Siri Rising: The Inside Story of Siri’s Origins—And Why She Could Overshadow the iPhone. Huffington Post. Available
online: https://www.huffpost.com/entry/siri-do-engine-apple-iphone_n_2499165 (accessed on 9 December 2021).
2. Adiwardana, D.; Luong, M.T.; So, D.R.; Hall, J.; Fiedel, N.; Thoppilan, R.; Yang, Z.; Kulshreshtha, A.; Nemade, G.; Lu, Y.; et al.
Towards a human-like open-domain chatbot. arXiv 2020, arXiv:2001.09977.
3. Bhat, H.R.; Lone, T.A.; Paul, Z.M. Cortana-intelligent personal digital assistant: A review. Int. J. Adv. Res. Comput. Sci. 2017,
8, 55–57.
4. Adamopoulou, E.; Moussiades, L. Chatbots: History, Technology, and Applications. Mach. Learn. Appl. 2020, 2, 100006. [CrossRef]
5. Adamopoulou, E.; Moussiades, L. An overview of chatbot technology. In Proceedings of the IFIP International Conference on
Artificial Intelligence Applications and Innovations, Neos Marmaras, Greece, 5–7 June 2020; Springer Nature: Cham, Switzerland,
2020; pp. 373–383.
6. Nuruzzaman, M.; Hussain, O.K. A survey on chatbot implementation in customer service industry through deep neural networks.
In Proceedings of the 2018 IEEE 15th International Conference on e-Business Engineering (ICEBE), Xi’an, China, 2–14 October
2018; IEEE: Manhattan, NY, USA, 2018; pp. 54–61.
7. Borah, B.; Pathak, D.; Sarmah, P.; Som, B.; Nandi, S. Survey of Textbased Chatbot in Perspective of Recent Technologies. In
Proceedings of the International Conference on Computational Intelligence, Communications, and Business Analytics, Kalyani,
India, 27–28 July 2018; Springer: Cham, Switzerland, 2018; pp. 84–96.
8. Chen, H.; Liu, X.; Yin, D.; Tang, J. A survey on dialogue systems: Recent advances and new frontiers. ACM SIGKDD Explor. Newsl.
2017, 19, 25–35. [CrossRef]
9. Gao, J.; Galley, M.; Li, L. Neural Approaches to Conversational AI. arXiv 2019, arXiv:1809.08267.
10. Diederich, S.; Brendel, A.B.; Kolbe, L.M. On Conversational Agents in Information Systems Research: Analyzing the Past to
Guide Future Work. In Proceedings of the 14th International Conference on Wirtschaftsinformatik, Siegen, Germany, 24–27
February 2019.
11. Meyer von Wolff, R.; Hobert, S.; Schumann, M. How may I help you?–state of the art and open research questions for chatbots at
the digital workplace. In Proceedings of the 52nd Hawaii International Conference on System Sciences, Honolulu, HI, USA, 8–11
January 2019.
12. Vishnoi, L. Conversational Agent: A More Assertive Form of Chatbots. 2020. Available online:
https://towardsdatascience.com/conversational-agent-a-more-assertive-form-of-chatbots-de6f1c8da8dd (accessed on 9 December 2021).
13. Nuseibeh, R. What is a Chatbot? 2018. Available online:
https://medium.com/@rajai_nuseibeh/what-is-a-chatbot-402427354f44 (accessed on 9 December 2021).
14. Radziwill, N.; Benton, M. Evaluating Quality of Chatbots and Intelligent Conversational Agents. Softw. Qual. Prof. 2017, 19, 25.
15. Hussain, S.; Sianaki, O.A.; Ababneh, N. A survey on conversational agents/chatbots classification and design techniques. In
Proceedings of the Workshops of the International Conference on Advanced Information Networking and Applications, Matsue,
Japan, 27–29 March 2019; pp. 946–956.
16. Masche, J.; Le, N.T. A review of technologies for conversational systems. In Proceedings of the International conference on
Computer Science, Applied Mathematics and Applications, Berlin, Germany, 30 June–1 July 2017; pp. 212–225.
17. Nimavat, K.; Champaneria, T. Chatbots: An overview types, architecture, tools and future possibilities. Int. J. Sci. Res. Dev. 2017,
5, 1019–1024.
18. Venkatesh, A.; Khatri, C.; Ram, A.; Guo, F.; Gabriel, R.; Nagar, A.; Prasad, R.; Cheng, M.; Hedayatnia, B.; Metallinou, A.; et al. On
Evaluating and Comparing Conversational Agents. In Proceedings of the 31st Conference on Neural Information Processing
Systems (NIPS), Long Beach, CA, USA, 4–9 December 2017.
19. Weizenbaum, J. ELIZA—A computer program for the study of natural language communication between man and machine.
Commun. ACM 1966, 9, 36–45. [CrossRef]
20. Breazeal, C. Social robots: From research to commercialization. In Proceedings of the 2017 ACM/IEEE International Conference
on Human-Robot Interaction, Vienna, Austria, 6–9 March 2017; p. 1. [CrossRef]
21. Gehl, R.W. Teaching to the Turing Test with Cleverbot. J. Incl. Scholarsh. Pedagog. 2014, 24, 56–66.
22. Hill, J.; Randolph Ford, W.; Farreras, I.G. Real conversations with artificial intelligence: A comparison between human–human
online conversations and human–chatbot conversations. Comput. Hum. Behav. 2015, 49, 245–250. [CrossRef]
23. Lopatovska, I.; Rink, K.; Knight, I.; Raines, K.; Cosenza, K.; Williams, H.; Sorsche, P.; Hirsch, D.; Li, Q.; Martinez, A. Talk to me:
Exploring user interactions with the Amazon Alexa. J. Librariansh. Inf. Sci. 2019, 51, 984–997. [CrossRef]
24. Zhu, Q.; Zhang, Z.; Fang, Y.; Li, X.; Takanobu, R.; Li, J.; Peng, B.; Gao, J.; Zhu, X.; Huang, M. Convlab-2: An open-source toolkit
for building, evaluating, and diagnosing dialogue systems. arXiv 2020, arXiv:2002.04793.
25. Taskbot, A.P. Alexa Prize Taskbot. 2021. Available online: https://developer.amazon.com/alexaprize (accessed on 9 December
2021).
26. Fernandes, A. NLP, NLU, NLG and how Chatbots Work. Available online:
https://chatbotslife.com/nlp-nlu-nlg-and-how-chatbots-work-dd7861dfc9df (accessed on 9 December 2021).
27. Khurana, D.; Koli, A.; Khatter, K.; Singh, S. Natural language processing: State of the art, current trends and challenges. arXiv
2017, arXiv:1708.05148.
28. Stoner, D.J.; Ford, L.; Ricci, M. Simulating Military Radio Communications Using Speech Recognition and Chat-Bot Technology; The
Titan Corporation: Orlando, FL, USA, 2004. Available online:
https://docplayer.net/39136593-Simulating-military-radio-communications-using-speech-recognition-and-chat-bot-technology.html (accessed on 9 December 2021).
29. Abdul-Kader, S.A.; Woods, J. Survey on Chatbot Design Techniques in Speech Conversation Systems. Int. J. Adv. Comput. Sci.
Appl. 2015, 6, 72–80.
30. Ramesh, K.; Ravishankaran, S.; Joshi, A.; Chandrasekaran, K. A Survey of Design Techniques for Conversational Agents. In
Proceedings of the 2017 ICICCT Information, Communication and Computing Technology, New Delhi, India, 13 May 2017;
pp. 336–350.
31. Ahmad, N.A.; Hamid, M.H.C.; Zainal, A.; Rauf, M.F.A.; Adnan, Z. Review of Chatbots Design Techniques. Int. J. Comput. Appl.
2018, 181, 56–67.
32. Diederich, S.; Brendel, A.B.; Kolbe, L.M. Towards a Taxonomy of Platforms for Conversational Agent Design. WI 2019. 2019.
Available online: https://aisel.aisnet.org/wi2019/track10/papers/1/ (accessed on 9 December 2021).
33. Lokman, A.S.; Ameedeen, M.A. Modern Chatbot Systems: A Technical Review. In Proceedings of the Future Technologies
Conference (FTC), San Francisco, CA, USA, 25–26 October 2019; pp. 1012–1023.
34. Azaria, A.; Nivasch, K. SAIF: A Correction-Detection Deep-Learning Architecture for Personal Assistants. Sensors 2020, 20, 5577.
[CrossRef]
35. Saund, E. How Do Conversational Agents Answer Questions? Available online:
https://towardsdatascience.com/how-do-conversational-agents-answer-questions-d504d37ef1cc (accessed on 9 December 2021).
36. Benzeghiba, M.; De Mori, R.; Deroo, O.; Dupont, S.; Erbes, T.; Jouvet, D.; Fissore, L.; Laface, P.; Mertins, A.; Ris, C.; et al. Automatic
speech recognition and speech variability: A review. Speech Commun. 2007, 49, 763–786. [CrossRef]
37. Yu, D.; Deng, L. Automatic Speech Recognition; Springer Nature: Cham, Switzerland, 2016.
38. Sadeghipour, A.; Kopp, S. Embodied gesture processing: Motor-based integration of perception and action in social artificial
agents. Cogn. Comput. 2011, 3, 419–435. [CrossRef]
39. Krishnaswamy, N.; Narayana, P.; Wang, I.; Rim, K.; Bangar, R.; Patil, D.; Mulay, G.; Beveridge, R.; Ruiz, J.; Draper, B.; et al.
Communicating and acting: Understanding gesture in simulation semantics. In Proceedings of the 12th International Conference
on Computational Semantics (IWCS), Montpellier, France, 19–22 September 2017.
40. Homburg, D.; Thieme, M.S.; Völker, J.; Stock, R. RoboTalk-Prototyping a Humanoid Robot as Speech-to-Sign Language Translator.
In Proceedings of the 52nd Hawaii International Conference on System Sciences, Honolulu, HI, USA, 8–11 January 2019.
41. Singh, S.; Jain, A.; Kumar, D. Recognizing and interpreting sign language gesture for human robot interaction. Int. J. Comput.
Appl. 2012, 52. [CrossRef]
42. Beck, A.; Stevens, B.; Bard, K.A.; Cañamero, L. Emotional body language displayed by artificial agents. ACM Trans. Interact. Intell.
Syst. (TiiS) 2012, 2, 1–29. [CrossRef]
43. Zhao, T.; Eskenazi, M. Towards end-to-end learning for dialog state tracking and management using deep reinforcement learning.
arXiv 2016, arXiv:1606.02560.
44. Noroozi, V.; Zhang, Y.; Bakhturina, E.; Kornuta, T. A Fast and Robust BERT-based Dialogue State Tracker for Schema-Guided
Dialogue Dataset. arXiv 2020, arXiv:2008.12335.
45. Bird, S.; Klein, E.; Loper, E. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit; O’Reilly
Media, Inc.: Sebastopol, CA, USA, 2009.
46. Navigli, R. Natural Language Understanding: Instructions for (Present and Future) Use. In Proceedings of the Twenty-Seventh
International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018; pp. 5697–5702.
47. Inui, N.; Koiso, T.; Nakamura, J.; Kotani, Y. Fully corpus-based natural language dialogue system. In Proceedings of the Natural
Language Generation in Spoken and Written Dialogue, AAAI Spring Symposium, Palo Alto, CA, USA, 24–26 March 2003.
48. Wallace, R.S. The anatomy of ALICE. In Parsing the Turing Test; Springer Nature: Cham, Switzerland, 2009; pp. 181–210.
49. Marietto, M.d.G.B.; de Aguiar, R.V.; Barbosa, G.d.O.; Botelho, W.T.; Pimentel, E.; França, R.d.S.; da Silva, V.L. Artificial intelligence
markup language: A brief tutorial. arXiv 2013, arXiv:1307.3091.
50. Agostaro, F.; Augello, A.; Pilato, G.; Vassallo, G.; Gaglio, S. A conversational agent based on a conceptual interpretation of a data
driven semantic space. In Proceedings of the Congress of the Italian Association for Artificial Intelligence, Milan, Italy, 21–23
September 2005; pp. 381–392.
51. Banchs, R.E.; Li, H. IRIS: A chat-oriented dialogue system based on the vector space model. In Proceedings of the ACL 2012
System Demonstrations, Jeju, Korea, 8–14 July 2012; pp. 37–42.
52. Nijholt, A. Context-Free Grammars: Covers, Normal Forms, And Parsing; Lecture Notes in Computer Science; Springer Science and
Business Media: Berlin/Heidelberg, Germany, 1980; Volume 93.
53. Resnik, P. Probabilistic tree-adjoining grammar as a framework for statistical natural language processing. In Proceedings of the
14th International Conference on Computational Linguistics, Nantes, France, 23–28 August 1992.
54. Gandhe, A.; Rastrow, A.; Hoffmeister, B. Scalable language model adaptation for spoken dialogue systems. In Proceedings of the
2018 IEEE Spoken Language Technology Workshop (SLT), Athens, Greece, 18–21 December 2018; pp. 907–912.
55. Azaria, A.; Srivastava, S.; Krishnamurthy, J.; Labutov, I.; Mitchell, T.M. An agent for learning new natural language commands.
Auton. Agents-Multi-Agent Syst. 2020, 34, 1–27. [CrossRef]
56. Bocklisch, T.; Faulkner, J.; Pawlowski, N.; Nichol, A. Rasa: Open source language understanding and dialogue management.
arXiv 2017, arXiv:1712.05181.
57. Pennington, J.; Socher, R.; Manning, C.D. GloVe: Global Vectors for Word Representation. In Proceedings of the 2014 Conference
on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543.
58. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [CrossRef]
59. Lafferty, J.; McCallum, A.; Pereira, F.C. Conditional random fields: Probabilistic models for segmenting and labeling sequence
data. In Proceedings of the 18th International Conference on Machine Learning (ICML 2001), Williamstown, MA, USA, 28 June–1
July 2001; pp. 282–289.
60. Lee, S.; Zhu, Q.; Takanobu, R.; Zhang, Z.; Zhang, Y.; Li, X.; Li, J.; Peng, B.; Li, X.; Huang, M.; et al. ConvLab: Multi-Domain
End-to-End Dialog System Platform. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics:
System Demonstrations, Florence, Italy, 28 July–2 August 2019; Association for Computational Linguistics: Stroudsburg, PA,
USA, 2019; pp. 64–69. [CrossRef]
61. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding.
arXiv 2018, arXiv:1810.04805.
62. McTear, M. The Role of Spoken Dialogue in User–Environment Interaction. Human-Centric Interfaces for Ambient Intelligence;
Academic Press: Cambridge, MA, USA, 2010; pp. 225–254. [CrossRef]
63. Harms, J.G.; Kucherbaev, P.; Bozzon, A.; Houben, G.J. Approaches for dialog management in conversational agents. IEEE Internet
Comput. 2018, 23, 13–22. [CrossRef]
64. Nguyen, A.; Wobcke, W. An agent-based approach to dialogue management in personal assistants. In Proceedings of the 10th
International Conference on Intelligent User Interfaces, San Diego, CA, USA, 10–13 January 2005; pp. 137–144.
65. Moore, R.C.; Dowding, J.; Bratt, H.; Gawron, J.M.; Gorfu, Y.; Cheyer, A. CommandTalk: A spoken-language interface for
battlefield simulations. In Proceedings of the Fifth Conference on Applied Natural Language Processing, Washington, DC, USA,
31 March–3 April 1997; pp. 1–7.
66. Stent, A.; Dowding, J.; Gawron, J.M.; Bratt, E.O.; Moore, R.C. The CommandTalk spoken dialogue system. In Proceedings of the
37th Annual Meeting of the Association for Computational Linguistics, College Park, MD, USA, 20–26 June 1999; pp. 183–190.
67. MindMeld. Introducing MindMeld. Available online: https://www.mindmeld.com/docs/intro/introducing_mindmeld.html
(accessed on 9 December 2021).
68. Klopfenstein, L.C.; Delpriori, S.; Ricci, A. Adapting a conversational text generator for online chatbot messaging. In Proceedings
of the International Conference on Internet Science, St. Petersburg, Russia, 24–26 October 2018; pp. 87–99.
69. Building and deploying a chatbot by using Dialogflow (overview). Available online:
https://cloud.google.com/solutions/building-and-deploying-chatbot-dialogflow (accessed on 9 December 2021).
70. Williams, J.D.; Kamal, E.; Ashour, M.; Amr, H.; Miller, J.; Zweig, G. Fast and easy language understanding for dialog systems
with Microsoft Language Understanding Intelligent Service (LUIS). In Proceedings of the 16th Annual Meeting of the Special
Interest Group on Discourse and Dialogue, Prague, Czech Republic, 2–4 September 2015; pp. 159–161.
71. Henderson, M.; Thomson, B.; Young, S. Word-based dialog state tracking with recurrent neural networks. In Proceedings of the
15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), Philadelphia, PA, USA, 18–20 June
2014; pp. 292–299.
72. Singh, S.P.; Kearns, M.J.; Litman, D.J.; Walker, M.A. Reinforcement learning for spoken dialogue systems. Adv. Neural Inf. Process.
Syst. 1999, 12, 956–962.
73. Li, J.; Monroe, W.; Ritter, A.; Galley, M.; Gao, J.; Jurafsky, D. Deep Reinforcement Learning for Dialogue Generation. arXiv 2016,
arXiv:1606.01541.
74. Serban, I.V.; Sankar, C.; Germain, M.; Zhang, S.; Lin, Z.; Subramanian, S.; Kim, T.; Pieper, M.; Chandar, S.; Ke, N.R.; et al. A deep
reinforcement learning chatbot. arXiv 2017, arXiv:1709.02349.
75. Reiter, E.; Dale, R. Building Applied Natural Language Generation Systems. Nat. Lang. Eng. 1997, 3, 57–87. [CrossRef]
76. Gatt, A.; Krahmer, E. Survey of the state of the art in natural language generation: Core tasks, applications and evaluation. J.
Artif. Intell. Res. 2018, 61, 65–170. [CrossRef]
77. Van Deemter, K.; Krahmer, E.; Theune, M. Squibs and Discussions: Real versus Template-Based Natural Language Generation: A
False Opposition? Comput. Linguist. 2005, 31, 15–24. [CrossRef]
78. Wen, T.H.; Gašić, M.; Mrkšić, N.; Su, P.H.; Vandyke, D.; Young, S. Semantically Conditioned LSTM-based Natural Language
Generation for Spoken Dialogue Systems. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language
Processing, Lisbon, Portugal, 17–21 September 2015; Association for Computational Linguistics: Stroudsburg, PA, USA, 2015; pp.
1711–1721. [CrossRef]
79. Tran, V.K.; Nguyen, L.M.; Tojo, S. Neural-based Natural Language Generation in Dialogue using RNN Encoder-Decoder with
Semantic Aggregation. In Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, Saarbrücken, Germany,
15–17 August 2017; Association for Computational Linguistics: Saarbruecken, Germany, 2017; pp. 231–240. [CrossRef]
80. Juraska, J.; Karagiannis, P.; Bowden, K.; Walker, M. A Deep Ensemble Model with Slot Alignment for Sequence-to-Sequence
Natural Language Generation. In Proceedings of the 2018 Conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, New Orleans, LA, USA, 1–6 June 2018; pp. 152–162. [CrossRef]
81. Dušek, O.; Novikova, J.; Rieser, V. Evaluating the state-of-the-art of End-to-End Natural Language Generation: The E2E NLG
challenge. Comput. Speech Lang. 2020, 59, 123–156. [CrossRef]
82. Sordoni, A.; Galley, M.; Auli, M.; Brockett, C.; Ji, Y.; Mitchell, M.; Nie, J.Y.; Gao, J.; Dolan, B. A Neural Network Approach
to Context-Sensitive Generation of Conversational Responses. In Proceedings of the 2015 Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, CO, USA, 31 May–5 June
2015; pp. 196–205.
83. Mikolov, T.; Zweig, G. Context dependent recurrent neural network language model. In Proceedings of the 2012 IEEE Spoken
Language Technology Workshop (SLT), Miami, FL, USA, 2–5 December 2012; IEEE: Manhattan, NY, USA, 2012; pp. 234–239.
84. Li, J.; Galley, M.; Brockett, C.; Gao, J.; Dolan, B. A Diversity-Promoting Objective Function for Neural Conversation Models. arXiv
2015, arXiv:1510.03055.
85. Serban, I.; Sordoni, A.; Bengio, Y.; Courville, A.; Pineau, J. Building end-to-end dialogue systems using generative hierarchical
neural network models. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February
2016.
86. He, S.; Liu, C.; Liu, K.; Zhao, J. Generating Natural Answers by Incorporating Copying and Retrieving Mechanisms in Sequence-
to-Sequence Learning. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver,
BC, Canada, 30 July–4 August 2017; pp. 199–208.
87. Qiu, M.; Li, F.L.; Wang, S.; Gao, X.; Chen, Y.; Zhao, W.; Chen, H.; Huang, J.; Chu, W. Alime chat: A sequence to sequence and
rerank based chatbot engine. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics; Short
Papers, Vancouver, BC, Canada, 30 July–4 August 2017; Volume 2, pp. 498–503.
88. Ghazvininejad, M.; Brockett, C.; Chang, M.W.; Dolan, B.; Gao, J.; Yih, W.T.; Galley, M. A Knowledge-Grounded Neural
Conversation Model. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February
2018.
89. Ham, D.; Lee, J.G.; Jang, Y.; Kim, K.E. End-to-End Neural Pipeline for Goal-Oriented Dialogue Systems using GPT-2. In
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational
Linguistics, Online, 5–10 July 2020; pp. 583–592. [CrossRef]
90. Kim, J.; Ham, D.; Lee, J.G.; Kim, K.E. End-to-End Document-Grounded Conversation with Encoder-Decoder Pre-Trained
Language Model. In Proceedings of the DSTC9 Workshop, Online, 8–9 February 2021.
91. Das, A.; Kottur, S.; Moura, J.M.; Lee, S.; Batra, D. Learning cooperative visual dialog agents with deep reinforcement learning. In
Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2951–2960.
92. Zhang, Z.; Takanobu, R.; Huang, M.; Zhu, X. Recent Advances and Challenges in Task-oriented Dialog System. arXiv 2020,
arXiv:2003.07490.
93. Kim, A.; Song, H.J.; Park, S.B. A two-step neural dialog state tracker for task-oriented dialog processing. Comput. Intell. Neurosci.
2018, 2018, 5798684. [CrossRef]
94. Mrksic, N.; Seaghdha, D.O.; Wen, T.H.; Thomson, B.; Young, S.J. Neural Belief Tracker: Data-Driven Dialogue State Tracking. In
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics; Long Papers, Vancouver, BC, Canada,
30 July–4 August 2017; Volume 1, pp. 1777–1788. [CrossRef]
95. Su, P.H.; Vandyke, D.; Gasic, M.; Kim, D.; Mrksic, N.; Wen, T.H.; Young, S. Learning from real users: Rating dialogue success with
neural networks for reinforcement learning in spoken dialogue systems. arXiv 2015, arXiv:1508.03386.
96. Liu, B.; Lane, I. Iterative policy learning in end-to-end trainable task-oriented neural dialog models. In Proceedings of the 2017
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Okinawa, Japan, 16–20 December 2017; pp. 482–489.
97. Clark, L.M.H.; Pantidi, N.; Cooney, O.; Garaialde, P.R.D.D.; Edwards, J.; Spillane, B.; Gilmartin, E.; Murad, C.; Munteanu, C.
What Makes a Good Conversation?: Challenges in Designing Truly Conversational Agents. In Proceedings of the 2019 CHI
Conference, Glasgow, UK, 4–9 May 2019.
98. Yang, X.; Aurisicchio, M.; Baxter, W. Understanding Affective Experiences with Conversational Agents. In Proceedings of the
2019 CHI Conference, Glasgow, UK, 4–9 May 2019.
99. Acheampong, F.A.; Wenyu, C.; Nunoo-Mensah, H. Text-based emotion detection: Advances, challenges, and opportunities. Eng.
Rep. 2020, 2, e12189. [CrossRef]
100. Allouch, M.; Azaria, A.; Azoulay, R.; Ben-Izchak, E.; Zwilling, M.; Zachor, D.A. Automatic detection of insulting sentences in
conversation. In Proceedings of the 2018 IEEE International Conference on the Science of Electrical Engineering in Israel (ICSEE),
Eilat, Israel, 12–14 December 2018; pp. 1–4.
101. Schlesinger, A.; O’Hara, K.P.; Taylor, A.S. Let’s talk about race: Identity, chatbots, and AI. In Proceedings of the 2018 CHI
Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018; pp. 1–14.
102. Sarder, M.A. ECActive Embodied Conversational Agent for Mental Health Intervention. Master’s Thesis, Delft University of
Technology, Delft, The Netherlands, August 2018.
103. Yalçın, Ö.N. Empathy framework for embodied conversational agents. Cogn. Syst. Res. 2020, 59, 123–132. [CrossRef]
104. Tellols, D.; Lopez-Sanchez, M.; Rodríguez, I.; Almajano, P.; Puig, A. Enhancing sentient embodied conversational agents with
machine learning. Pattern Recognit. Lett. 2020, 129, 317–323. [CrossRef]
105. McLeod, S. Maslow’s Hierarchy of Needs. Simply Psychology. 2007. Available online: https://www.simplypsychology.org/maslow.html
(accessed on 9 December 2021).
Sensors 2021, 21, 8448 43 of 48
106. Chen, J.; Wu, Y.; Jia, C.; Zheng, H.; Huang, G. Customizable text generation via conditional text generative adversarial network.
Neurocomputing 2020, 416, 125–135. [CrossRef]
107. Zhou, L.; Gao, J.; Li, D.; Shum, H.Y. The design and implementation of xiaoice, an empathetic social chatbot. Comput. Linguist.
2020, 46, 53–93. [CrossRef]
108. Asghar, N.; Poupart, P.; Hoey, J.; Jiang, X.; Mou, L. Affective neural response generation. In Proceedings of the European
Conference on Information Retrieval, Grenoble, France, 26–29 March 2018; pp. 154–166.
109. Zhou, H.; Huang, M.; Zhang, T.; Zhu, X.; Liu, B. Emotional chatting machine: Emotional conversation generation with internal
and external memory. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February
2018.
110. Chaves, A.P.; Gerosa, M.A. How should my chatbot interact? A survey on human-chatbot interaction design. arXiv 2020,
arXiv:1904.02743.
111. Zhang, S.; Dinan, E.; Urbanek, J.; Szlam, A.; Kiela, D.; Weston, J. Personalizing Dialogue Agents: I have a dog, do you have pets
too? arXiv 2018, arXiv:1801.07243.
112. Völkel, S.T.; Schödel, R.; Buschek, D.; Stachl, C.; Winterhalter, V.; Bühner, M.; Hussmann, H. Developing a Personality Model for
Speech-based Conversational Agents Using the Psycholexical Approach. In Proceedings of the 2020
CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–14.
113. Roccas, S.; Sagiv, L.; Schwartz, S.H.; Knafo, A. The Big Five Personality Factors and Personal Values. Personal. Soc. Psychol. Bull.
2002, 28, 789–801. [CrossRef]
114. Feine, J.; Gnewuch, U.; Morana, S.; Maedche, A. A Taxonomy of Social Cues for Conversational Agents. Int. J. Hum.-Comput.
Stud. 2019, 132, 138–161. [CrossRef]
115. Burgoon, J.; Guerrero, L.; Manusov, V. Nonverbal signals. In The SAGE Handbook of Interpersonal Communication; SAGE
Publications: Thousand Oaks, CA, USA, 2011; pp. 239–282.
116. Liao, Y.; He, J. Racial mirroring effects on human-agent interaction in psychotherapeutic conversations. In Proceedings of the
25th International Conference on Intelligent User Interfaces, Cagliari, Italy, 18–20 March 2020; pp. 430–442.
117. Go, E.; Sundar, S.S. Humanizing chatbots: The effects of visual, identity and conversational cues on humanness perceptions.
Comput. Hum. Behav. 2019, 97, 304–316. [CrossRef]
118. Smith, E.M.; Williamson, M.; Shuster, K.; Weston, J.; Boureau, Y.L. Can You Put it All Together: Evaluating Conversational Agents’
Ability to Blend Skills. arXiv 2020, arXiv:2004.08449.
119. Ferland, L.; Koutstaal, W. How’s Your Day Look? The (Un)Expected Sociolinguistic Effects of User Modeling in a Conversational
Agent. In Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing
Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 482–489. [CrossRef]
120. Carfora, V.; Massimo, F.D.; Rastelli, R.; Catellani, P.; Piastra, M. Dialogue management in conversational agents through
psychology of persuasion and machine learning. Multimed. Tools Appl. 2020, 79, 35949–35971. [CrossRef]
121. Ajzen, I. The theory of planned behavior. Organ. Behav. Hum. Decis. Process. 1991, 50, 179–211. [CrossRef]
122. Azoulay, R.; David, E.; Avigal, M.; Hutzler, D. Adaptive Task Selection in Automated Educational Software: A Comparative Study.
In Intelligent Systems and Learning Data Analytics in Online Education; Elsevier: Amsterdam, The Netherlands, 2021.
123. Azevedo, R.; Landis, R.S.; Feyzi-Behnagh, R.; Duffy, M.; Trevors, G.; Harley, J.M.; Bouchet, F.; Burlison, J.; Taub, M.; Pacampara,
N.; et al. The effectiveness of pedagogical agents’ prompting and feedback in facilitating co-adapted learning with MetaTutor. In
Proceedings of the International Conference on Intelligent Tutoring Systems, Chania, Crete, Greece, 14–18 June 2012; pp. 212–221.
124. Ueno, M.; Miyazawa, Y. IRT-based adaptive hints to scaffold learning in programming. IEEE Trans. Learn. Technol. 2017,
11, 415–428. [CrossRef]
125. Winkler, R.; Hobert, S.; Salovaara, A.; Söllner, M.; Leimeister, J.M. Sara, the lecturer: Improving learning in online education with
a scaffolding-based conversational agent. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems,
Honolulu, HI, USA, 26 April 2020; pp. 1–14.
126. Ni, L.; Lu, C.; Liu, N.; Liu, J. Mandy: Towards a smart primary care chatbot application. In Proceedings of the International
Symposium on Knowledge and Systems Sciences, Bangkok, Thailand, 17–19 November 2017; pp. 38–52.
127. Schuetzler, R.M.; Grimes, G.M.; Giboney, J.S.; Nunamaker, J.F., Jr. The influence of conversational agents on socially desirable
responding. In Proceedings of the 51st Hawaii International Conference on System Sciences, Big Island, HI, USA, 3–6 January
2018; p. 283.
128. Colby, K.M. Ten criticisms of parry. ACM SIGART Bull. 1974, 48, 5–9. [CrossRef]
129. Yin, Z.; Chang, K.h.; Zhang, R. Deepprobe: Information directed sequence understanding and chatbot design via recurrent
neural networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
Halifax, NS, Canada, 13–17 August 2017; pp. 2131–2139.
130. Liu, H.; Lin, T.; Sun, H.; Lin, W.; Chang, C.W.; Zhong, T.; Rudnicky, A. Rubystar: A non-task-oriented mixture model dialog
system. arXiv 2017, arXiv:1711.02781.
131. Hoy, M.B. Human-Aided Bots. Med. Ref. Serv. Q. 2018, 37, 81–88. [CrossRef]
132. Azaria, A.; Krishnamurthy, J.; Mitchell, T. Instructable intelligent personal agent. In Proceedings of the AAAI Conference on
Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016.
133. Li, T.J.J.; Azaria, A.; Myers, B.A. SUGILITE: Creating multimodal smartphone automation by demonstration. In Proceedings of
the 2017 CHI Conference on Human Factors in Computing Systems, Denver, CO, USA, 6–11 May 2017; pp. 6038–6049.
134. Chkroun, M.; Azaria, A. Safebot: A safe collaborative chatbot. In Proceedings of the AAAI Workshops, New Orleans, LA, USA,
2–7 February 2018.
135. Ait-Mlouk, A.; Jiang, L. KBot: A Knowledge graph based chatBot for natural language understanding over linked data. IEEE
Access 2020, 8, 149220–149230. [CrossRef]
136. Paladines, J.; Ramirez, J. A systematic literature review of intelligent tutoring systems with dialogue in natural language. IEEE
Access 2020, 8, 164246–164267. [CrossRef]
137. Paschoal, L.N.; Krassmann, A.L.; Nunes, F.B.; de Oliveira, M.M.; Bercht, M.; Barbosa, E.F.; de Souza, S.d.R.S. A Systematic
Identification of Pedagogical Conversational Agents. In Proceedings of the 2020 IEEE Frontiers in Education Conference (FIE),
Uppsala, Sweden, 21–24 October 2020; pp. 1–9.
138. Paschoal, L.N.; Turci, L.F.; Conte, T.U.; Souza, S.R. Towards a conversational agent to support the software testing education. In
Proceedings of the 33rd Brazilian Symposium on Software Engineering, Salvador, Brazil, 23–27 September 2019; pp. 57–66.
139. Graesser, A.C.; Wiemer-Hastings, K.; Wiemer-Hastings, P.; Kreuz, R.; Group, T.R. AutoTutor: A simulation of a human tutor.
Cogn. Syst. Res. 1999, 1, 35–51. [CrossRef]
140. Abdellatif, A.; Badran, K.; Shihab, E. MSRBot: Using bots to answer questions from software repositories. Empir. Softw. Eng.
2020, 25, 1834–1863. [CrossRef]
141. Hobert, S. Say hello to ‘coding tutor’! design and evaluation of a chatbot-based learning system supporting students to learn
to program. In Proceedings of the 40th International Conference on Information Systems, ICIS 2019, Munich, Germany, 15–18
December 2019.
142. Kloos, C.D.; Catalán, C.; Muñoz-Merino, P.J.; Alario-Hoyos, C. Design of a conversational agent as an educational tool. In
Proceedings of the 2018 Learning With MOOCS (LWMOOCS), Madrid, Spain, 26–28 September 2018; pp. 27–30.
143. Aguirre, C.C.; Kloos, C.D.; Alario-Hoyos, C.; Muñoz-Merino, P.J. Supporting a MOOC through a conversational agent. Design of
a first prototype. In Proceedings of the 2018 International Symposium on Computers in Education (SIIE) Cadiz, Spain, 19–21
September 2018; pp. 1–6.
144. Google. Google Assistant, Your Own Personal Google. Available online: https://assistant.google.com/ (accessed on 9
December 2021).
145. Lin, P.; Van Brummelen, J.; Lukin, G.; Williams, R.; Breazeal, C. Zhorai: Designing a Conversational Agent for Children to
Explore Machine Learning Concepts. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12
February 2020; pp. 13381–13388.
146. Cai, W.; Grossman, J.; Lin, Z.J.; Sheng, H.; Wei, J.T.Z.; Williams, J.J.; Goel, S. Bandit algorithms to personalize educational chatbots.
Mach. Learn. 2021, 110, 1–30. [CrossRef]
147. Kim, N.Y.; Cha, Y.; Kim, H.S. Future English learning: Chatbots and artificial intelligence. Multimed.-Assist. Lang. Learn. 2019,
22, 32–53.
148. Maria, A. Got an Alexa? You’ve Got a Polyglot Tutor That Can Teach You a Language. Available online:
https://www.fluentu.com/blog/can-alexa-teach-languages/ (accessed on 9 December 2021).
149. Pham, X.L.; Pham, T.; Nguyen, Q.M.; Nguyen, T.H.; Cao, T.T.H. Chatbot as an intelligent personal assistant for mobile language
learning. In Proceedings of the 2018 2nd International Conference on Education and E-Learning, Bali, Indonesia, 5–7 November
2018; pp. 16–21.
150. Fei, W.Y.; Petrina, S. Using learning analytics to understand the design of an intelligent language tutor–Chatbot lucy. Editor. Pref.
2013, 4, 124–131. [CrossRef]
151. Hien, H.T.; Pham-Nguyen, C.; Nam, L.N.H.; Dinh, T.L. Intelligent Assistants in Higher-Education Environments: The FIT-EBot,
a Chatbot for Administrative and Learning Support. In Proceedings of the 9th International Symposium on Information and
Communication Technology, Danang City, Vietnam, 6–7 December 2018; pp. 69–76.
152. Ranoliya, B.R.; Raghuwanshi, N.; Singh, S. Chatbot for university related FAQs. In Proceedings of the 2017 International
Conference on Advances in Computing, Communications and Informatics (ICACCI), Manipal, India, 13–16 September 2017;
pp. 1525–1530.
153. Lee, K.; Jo, J.; Kim, J.; Kang, Y. Can Chatbots Help Reduce the Workload of Administrative Officers?—Implementing and
Deploying FAQ Chatbot Service in a University. In Proceedings of the International Conference on Human-Computer Interaction,
Orlando, FL, USA, 26–31 July 2019; pp. 348–354.
154. Feng, D.; Shaw, E.; Kim, J.; Hovy, E. An intelligent discussion-bot for answering student queries in threaded discussions. In
Proceedings of the 11th International Conference on Intelligent User Interfaces, Sydney, Australia, 29 January–1 February 2006;
pp. 171–177.
155. Li, X.; Zhong, H.; Zhang, B.; Zhang, J. A General Chinese Chatbot based on Deep Learning and Its Application for Children with
ASD. Int. J. Mach. Learn. Comput. 2020, 10, 1–10. [CrossRef]
156. Triantafyllidou, C. Assistive Technologies for Dyslexia: Punctuation and Its Interfaces with Speech. Master’s Thesis, University
of Central Florida, Orlando, FL, USA, 2020.
157. Park, D.E.; Shin, Y.J.; Park, E.; Choi, I.A.; Song, W.Y.; Kim, J. Designing a Voice-Bot to Promote Better Mental Health: UX Design
for Digital Therapeutics on ADHD Patients. In Proceedings of the 2020 CHI Conference on Human Factors in Computing
Systems; Extended Abstracts, Honolulu, HI, USA, 25–30 April 2020; pp. 1–8.
158. Valadão, C.T.; Goulart, C.; Rivera, H.; Caldeira, E.; Bastos Filho, T.F.; Frizera-Neto, A.; Carelli, R. Analysis of the use of a robot to
improve social skills in children with autism spectrum disorder. Res. Biomed. Eng. 2016, 32, 161–175. [CrossRef]
159. Boucenna, S.; Narzisi, A.; Tilmont, E.; Muratori, F.; Pioggia, G.; Cohen, D.; Chetouani, M. Interactive technologies for autistic
children: A review. Cogn. Comput. 2014, 6, 722–740. [CrossRef]
160. Scassellati, B.; Boccanfuso, L.; Huang, C.M.; Mademtzi, M.; Qin, M.; Salomons, N.; Ventola, P.; Shic, F. Improving social skills in
children with ASD using a long-term, in-home social robot. Sci. Robot. 2018, 3. [CrossRef]
161. Costa, A.P.; Charpiot, L.; Lera, F.R.; Ziafati, P.; Nazarikhorram, A.; Van Der Torre, L.; Steffgen, G. More attention and less repetitive
and stereotyped behaviors using a robot with children with autism. In Proceedings of the 27th IEEE International
Symposium on Robot and Human Interactive Communication, Nanjing, China, 27–31 August 2018; pp. 534–539.
162. Vanderborght, B.; Simut, R.; Saldien, J.; Pop, C.; Rusu, A.S.; Pintea, S.; Lefeber, D.; David, D.O. Using the social robot probo as a
social story telling agent for children with ASD. Interact. Stud. 2012, 13, 348–372. [CrossRef]
163. Peca, A.; Tapus, A.; Aly, A.; Pop, C.; Jisa, L.; Pintea, S.; Rusu, A.; David, D. Exploratory study: Children’s with autism awareness
of being imitated by NAO Robot. arXiv 2020, arXiv:2003.03528.
164. Laranjo, L.; Dunn, A.G.; Tong, H.L.; Kocaballi, A.B.; Chen, J.; Bashir, R.; Surian, D.; Gallego, B.; Magrabi, F.; Lau, A.Y.; et al.
Conversational agents in healthcare: A systematic review. J. Am. Med. Inform. Assoc. 2018, 25, 1248–1258. [CrossRef]
165. Car, L.T.; Dhinagaran, D.A.; Kyaw, B.M.; Kowatsch, T.; Rayhan, J.S.; Theng, Y.L.; Atun, R. Conversational agents in health care:
Scoping review and conceptual analysis. J. Med. Internet Res. 2020, 22, e17158.
166. Schachner, T.; Keller, R.; von Wangenheim, F. Artificial Intelligence-Based Conversational Agents for Chronic Conditions: Systematic
Literature Review. J. Med. Internet Res. 2020, 22, e20701. [CrossRef]
167. Montenegro, J.L.Z.; da Costa, C.A.; da Rosa Righi, R. Survey of conversational agents in health. Expert Syst. Appl. 2019, 129, 56–67.
[CrossRef]
168. Fadhil, A.; Wang, Y.; Reiterer, H. Assistive conversational agent for health coaching: A validation study. Methods Inf. Med. 2019,
58, 009–023. [CrossRef]
169. Neerincx, M.A.; van Vught, W.; Blanson Henkemans, O.; Oleari, E.; Broekens, J.; Peters, R.; Kaptein, F.; Demiris, Y.; Kiefer, B.;
Fumagalli, D.; et al. Socio-Cognitive Engineering of a Robotic Partner for Child’s Diabetes Self-Management. Front. Robot. 2019,
6, 118. [CrossRef]
170. High, R. The Era of Cognitive Systems: An Inside Look at IBM Watson and How It Works; IBM Redbooks: Endicott, NY, USA, 2012;
16p.
171. Strickland, E. IBM Watson, heal thyself: How IBM overpromised and underdelivered on AI health care. IEEE Spectr. 2019,
56, 24–31. [CrossRef]
172. Ross, C.; Swetlitz, I. IBM’s Watson supercomputer recommended ‘unsafe and incorrect’ cancer treatments, internal documents
show. Stat 2018, 25, 1–10.
173. Xu, L.; Zhou, Q.; Gong, K.; Liang, X.; Tang, J.; Lin, L. End-to-End Knowledge-routed relational dialogue system for automatic
diagnosis. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 7346–7353.
174. Fitzpatrick, K.K.; Darcy, A.; Vierhile, M. Delivering cognitive behavior therapy to young adults with symptoms of depression
and anxiety using a fully automated conversational agent (Woebot): A randomized controlled trial. JMIR Mental Health 2017,
4, e19. [CrossRef]
175. Edwards, R.A.; Bickmore, T.; Jenkins, L.; Foley, M.; Manjourides, J. Use of an interactive computer agent to support breastfeeding.
Matern. Child Health J. 2013, 17, 1961–1968. [CrossRef]
176. Yang, W.; Zeng, G.; Tan, B.; Ju, Z.; Chakravorty, S.; He, X.; Chen, S.; Yang, X.; Wu, Q.; Zhou, Y.; et al. On the generation of medical
dialogues for COVID-19. arXiv 2020, arXiv:2005.05442.
177. Palanica, A.; Flaschner, P.; Thommandram, A.; Li, M.; Fossat, Y. Physicians’ perceptions of chatbots in health care: Cross-sectional
web-based survey. J. Med. Internet Res. 2019, 21, e12887. [CrossRef]
178. Nadarzynski, T.; Miles, O.; Cowie, A.; Ridge, D. Acceptability of artificial intelligence (AI)-led chatbot services in healthcare: A
mixed-methods study. Digit. Health 2019, 5, 2055207619871808. [CrossRef]
179. Scholten, M.R.; Kelders, S.M.; Van Gemert-Pijnen, J.E. Self-Guided Web-Based Interventions: Scoping Review on User Needs and
the Potential of Embodied Conversational Agents to Address Them. J. Med. Internet Res. 2017, 19, e383. [CrossRef]
180. Dhanda, S. How Chatbots Will Transform the Retail Industry; Juniper Research: Hampshire, UK, 2018. Available online:
https://www.brand-news.it/wp-content/uploads/2018/07/How-Chatbots-Will-Transform-The-Retail-Industry-whitepaper.pdf (accessed on
9 December 2021).
181. Bavaresco, R.; Silveira, D.; Reis, E.; Barbosa, J.; Righi, R.; Costa, C.; Antunes, R.; Gomes, M.; Gatti, C.; Vanzin, M.; et al.
Conversational agents in business: A systematic literature review and future research directions. Comput. Sci. Rev. 2020,
36, 100239. [CrossRef]
182. Thomas, N. An e-business chatbot using AIML and LSA. In Proceedings of the 2016 International Conference on Advances in
Computing, Communications and Informatics (ICACCI), Jaipur, India, 21–24 September 2016; pp. 2740–2742.
183. Cui, L.; Huang, S.; Wei, F.; Tan, C.; Duan, C.; Zhou, M. Superagent: A customer service chatbot for e-commerce websites. In
Proceedings of the ACL 2017, System Demonstrations, Vancouver, BC, Canada, 30 July–4 August 2017; pp. 97–102.
184. Xu, A.; Liu, Z.; Guo, Y.; Sinha, V.; Akkiraju, R. A new chatbot for customer service on social media. In Proceedings of the 2017
CHI Conference on Human Factors in Computing Systems, Denver, CO, USA, 6–11 May 2017; pp. 3506–3510.
185. Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.J. Bleu: A method for automatic evaluation of machine translation. In Proceedings of
the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 6–12 July 2002; pp. 311–318.
186. Yan, Z.; Duan, N.; Chen, P.; Zhou, M.; Zhou, J.; Li, Z. Building task-oriented dialogue systems for online shopping. In Proceedings
of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017.
187. Pradana, A.; Sing, G.O.; Kumar, Y. Sambot-intelligent conversational bot for interactive marketing with consumer-centric
approach. Int. J. Comput. Inf. Syst. Ind. Manag. Appl. 2017, 6, 265–275.
188. Kaghyan, S.; Sarpal, S.; Zorilescu, A.; Akopian, D. Review of Interactive Communication Systems for Business-to-Business (B2B)
Services. Electron. Imaging 2018, 2018, 1–11. [CrossRef]
189. Lewis, M.; Yarats, D.; Dauphin, Y.N.; Parikh, D.; Batra, D. Deal or No Deal? End-to-End Learning for Negotiation Dialogues.
arXiv 2017, arXiv:1706.05125.
190. Luo, X.; Tong, S.; Fang, Z.; Qu, Z. Frontiers: Machines vs. humans: The impact of artificial intelligence chatbot disclosure on
customer purchases. Mark. Sci. 2019, 38, 937–947. [CrossRef]
191. Følstad, A.; Nordheim, C.B.; Bjørkli, C.A. What makes users trust a chatbot for customer service? An exploratory interview study.
In Proceedings of the International Conference on Internet Science, St. Petersburg, Russia, 24–26 October 2018; pp. 194–208.
192. Li, C.H.; Yeh, S.F.; Chang, T.J.; Tsai, M.H.; Chen, K.; Chang, Y.J. A Conversation Analysis of Non-Progress and Coping Strategies
with a Banking Task-Oriented Chatbot. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems,
Honolulu, HI, USA, 26 April 2020; pp. 1–12.
193. Agarwal, A. How to Write a Twitter Bot in 5 Minutes. Available online: https://www.labnol.org/internet/write-twitter-bot/27902/
(accessed on 9 December 2021).
194. Peterschmidt, D. How to Make a Twitter Bot in Under an Hour Even If You Don’t Code That Often. Available online:
https://medium.com/science-friday-footnotes/how-to-make-a-twitter-bot-in-under-an-hour-259597558acf (accessed on 9
December 2021).
195. Adams, T. AI-Powered Social Bots. arXiv 2017, arXiv:1706.05143.
196. Assenmacher, D.; Clever, L.; Frischlich, L. Demystifying Social Bots: On the Intelligence of Automated Social Media Actors. Soc.
Media Soc. 2020, 1–14. [CrossRef]
197. Kollanyi, B. Automation, Algorithms, and Politics: Where Do Bots Come From? An Analysis of Bot Codes Shared on GitHub.
Int. J. Commun. 2016, 10, 20.
198. Ferrara, E.; Varol, O.; Davis, C.; Menczer, F.; Flammini, A. The rise of social bots. Commun. ACM 2016, 59, 96–104. [CrossRef]
199. Varol, O.; Ferrara, E.; Davis, C.; Menczer, F.; Flammini, A. Online human-bot interactions: Detection, estimation, and characterization.
In Proceedings of the International AAAI Conference on Web and Social Media, Montréal, QC, Canada, 15–18 May 2017;
pp. 280–289.
200. Subrahmanian, V.S.; Azaria, A.; Durst, S.; Kagan, V.; Galstyan, A.; Lerman, K.; Zhu, L.; Ferrara, E.; Flammini, A.; Menczer, F. The
DARPA Twitter bot challenge. IEEE Comput. Mag. 2016, 49, 38–46. [CrossRef]
201. Lee, K.; Eoff, B.; Caverlee, J. Seven months with the devils: A long-term study of content polluters on twitter. In Proceedings of
the International AAAI Conference on Web and Social Media, Cambridge, MA, USA, 8–11 July 2011.
202. Deriu, J.; Rodrigo, A.; Otegi, A.; Echegoyen, G.; Rosset, S.; Agirre, E.; Cieliebak, M. Survey on evaluation methods for dialogue
systems. Artif. Intell. Rev. 2021, 54, 755–810. [CrossRef]
203. Griol, D.; Carbó, J.; Molina, J.M. An automatic dialog simulation technique to develop and evaluate interactive conversational
agents. Appl. Artif. Intell. 2013, 27, 759–780. [CrossRef]
204. Papineni, K.A.; Roukos, S.; Ward, T.; Zhu, W. BLEU: A method for automatic evaluation of machine translation.
In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 6–12
July 2002.
205. Lin, C.Y. Rouge: A Package for Automatic Evaluation of Summaries. Available online: https://aclanthology.org/W04-1013.pdf
(accessed on 9 December 2021).
206. Banerjee, S.; Lavie, A. Meteor: An automatic metric for mt evaluation with improved correlation with human judgments. In
Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization,
Ann Arbor, MI, USA, 29 June 2005; pp. 65–72.
207. Liu, C.W.; Lowe, R.; Serban, I.V.; Noseworthy, M.; Charlin, L.; Pineau, J. How NOT To Evaluate Your Dialogue System: An
Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation. arXiv 2016, arXiv:1603.08023.
208. Lowe, R.; Noseworthy, M.; Serban, I.V.; Angelard-Gontier, N.; Bengio, Y.; Pineau, J. Towards an automatic turing test: Learning to
evaluate dialogue responses. arXiv 2017, arXiv:1708.07149.
209. Tao, C.; Mou, L.; Zhao, D.; Yan, R. Ruber: An unsupervised method for automatic evaluation of open-domain dialog systems. In
Proceedings of the 32nd AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
210. Guo, F.; Metallinou, A.; Khatri, C.; Raju, A.; Venkatesh, A.; Ram, A. Topic-based evaluation for conversational bots. arXiv 2018,
arXiv:1801.03622.
211. Serban, I.V.; Lowe, R.; Henderson, P.; Charlin, L.; Pineau, J. A Survey of Available Corpora for Building Data-Driven Dialogue
Systems. arXiv 2017, arXiv:1512.05742.
212. Keneshloo, Y.; Shi, T.; Ramakrishnan, N.; Reddy, C.K. Deep Reinforcement Learning For Sequence to Sequence Models. arXiv
2018, arXiv:1805.09461.
213. Li, Y.; Su, H.; Shen, X.; Li, W.; Cao, Z.; Niu, S. DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset. In Proceedings
of the Eighth International Joint Conference on Natural Language Processing; Long Papers, Taipei, Taiwan, 27 November–1
December 2017; Volume 1.
214. Ameixa, D.; Coheur, L.; Redol, R.A. From Subtitles to Human Interactions: Introducing The Subtle Corpus. Technical Report.
Available online: https://www.inesc-id.pt/ficheiros/publicacoes/10062.pdf (accessed on 9 December 2021).
215. Lison, P.; Tiedemann, J. OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles. In Proceedings of
the International Conference on Language Resources and Evaluation, Portorož, Slovenia, 23–28 May 2016.
216. Tiedemann, J. News from OPUS-A collection of multilingual parallel corpora with tools and interfaces. In Proceedings of the
International Conference on Recent Advances in Natural Language Processing, Online, 1–3 September 2021; pp. 237–248.
217. Dodge, J.; Gane, A.; Zhang, X.; Bordes, A.; Chopra, S.; Miller, A.H.; Szlam, A.; Weston, J. Evaluating Prerequisite Qualities for
Learning End-to-End Dialog Systems. In Proceedings of the International Conference on Learning Representations, San Juan,
Puerto Rico, 2–4 May 2016.
218. Danescu-Niculescu-Mizil, C.; Lee, L. Chameleons in imagined conversations: A new approach to understanding coordination of
linguistic style in dialogs. arXiv 2011, arXiv:1106.3077.
219. Li, J.; Galley, M.; Brockett, C.; Spithourakis, G.P.; Gao, J.; Dolan, B. A Persona-Based Neural Conversation Model. arXiv 2016,
arXiv:1603.06155.
220. Ritter, A.; Cherry, C.; Dolan, B. Unsupervised Modeling of Twitter Conversations. In Proceedings of the Human Language
Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics,
Los Angeles, CA, USA, 2–4 June 2010; pp. 172–180.
221. Schrading, N.; Ovesdotter Alm, C.; Ptucha, R.; Homan, C. An Analysis of Domestic Abuse Discourse on Reddit. In Proceedings
of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 2577–2583.
222. Zhang, Y.; Sun, S.; Galley, M.; Chen, Y.C.; Brockett, C.; Gao, X.; Gao, J.; Liu, J.; Dolan, B. DIALOGPT: Large-Scale Generative
Pre-training for Conversational Response Generation. arXiv 2019, arXiv:1911.00536.
223. Bao, S.; He, H.; Wang, F.; Wu, H.; Wang, H. PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable. In
Proceedings of the Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 85–96.
224. Lowe, R.; Pow, N.; Serban, I.; Pineau, J. The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn
Dialogue Systems. arXiv 2017, arXiv:1506.08909.
225. Alizadeh, K. Limitations of Twitter Data Issues to be Aware of When Using Twitter Text Data. Available online:
https://towardsdatascience.com/limitations-of-twitter-data-94954850cacf (accessed on 9 December 2021).
226. Zeng, C.; Li, S.; Li, Q.; Hu, J.; Hu, J. A Survey on Machine Reading Comprehension—Tasks, Evaluation Metrics and Benchmark
Datasets. Appl. Sci. 2020, 10, 7640. [CrossRef]
227. Rajpurkar, P.; Zhang, J.; Lopyrev, K.; Liang, P. Squad: 100,000+ questions for machine comprehension of text. arXiv 2016,
arXiv:1606.05250.
228. Rajpurkar, P.; Jia, R.; Liang, P. Know What You Don’t Know: Unanswerable Questions for SQuAD. In Proceedings of the 56th
Annual Meeting of the Association for Computational Linguistics; Short Papers, Melbourne, Australia, 15–20 July 2018; Volume 2,
pp. 784–789. [CrossRef]
229. Hermann, K.M.; Kocisky, T.; Grefenstette, E.; Espeholt, L.; Kay, W.; Suleyman, M.; Blunsom, P. Teaching Machines to Read and
Comprehend. Adv. Neural Inf. Process. Syst. 2015, 28, 1693–1701.
230. Kwiatkowski, T.; Palomaki, J.; Redfield, O.; Collins, M.; Parikh, A.; Alberti, C.; Epstein, D.; Polosukhin, I.; Devlin, J.; Lee, K.;
et al. Natural questions: A benchmark for question answering research. Trans. Assoc. Comput. Linguist. 2019, 7, 453–466.
[CrossRef]
231. Joshi, M.; Choi, E.; Weld, D.; Zettlemoyer, L. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading
Comprehension. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics; Long Papers,
Vancouver, BC, Canada, 30 July–4 August 2017; Volume 1, pp. 1601–1611. [CrossRef]
232. Rastogi, A.; Zang, X.; Sunkara, S.; Gupta, R.; Khaitan, P. Towards Scalable Multidomain Conversational Agents: The Schema-
Guided Dialogue Dataset. arXiv 2020, arXiv:1909.05855.
233. Budzianowski, P.; Wen, T.H.; Tseng, B.H.; Casanueva, I.; Ultes, S.; Ramadan, O.; Gašić, M. MultiWOZ—A Large-Scale Multi-
Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling. In Proceedings of the 2018 Conference on Empirical
Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018.
234. Byrne, B.; Krishnamoorthi, K.; Sankar, C.; Neelakantan, A.; Duckworth, D.; Yavuz, S.; Goodrich, B.; Dubey, A.; Cedilnik, A.; Kim,
K.Y. Taskmaster-1: Toward a Realistic and Diverse Dialog Dataset. In Proceedings of the 2019 Conference on Empirical Methods
in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP),
Hong Kong, China, 3–7 November 2019.
235. Peskov, D.; Clarke, N.; Krone, J.; Fodor, B. Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating
and Annotating Large Scale Dialogue Data. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language
Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China,
3–7 November 2019; pp. 4526–4536.
236. Zeng, G.; Yang, W.; Ju, Z.; Yang, Y.; Wang, S.; Zhang, R.; Zhou, M.; Zeng, J.; Dong, X.; Zhang, R.; et al. MedDialog: Large-scale
Medical Dialogue Datasets. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing
(EMNLP), Online, 16–20 November 2020; pp. 9241–9250. [CrossRef]
237. Sharma, A.; Lin, I.W.; Miner, A.S.; Atkins, D.C.; Althoff, T. Towards facilitating empathic conversations in online mental health
support: A reinforcement learning approach. arXiv 2021, arXiv:2101.07714.
238. Rashkin, H.; Smith, E.M.; Li, M.; Boureau, Y.L. Towards Empathetic Open-domain Conversation Models: A New Benchmark and
Dataset. arXiv 2019, arXiv:1811.00207.
239. McKeown, G.; Valstar, M.F.; Cowie, R.; Pantic, M. The SEMAINE corpus of emotionally coloured character interactions. In
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, ICME, Singapore, 19–23 July 2010; pp. 1–4.
[CrossRef]
240. Allouch, M.; Azaria, A.; Azoulay, R. Detecting sentences that may be harmful to children with special needs. In Proceedings of
the 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), Portland, OR, USA, 4–6 November 2019;
pp. 1209–1213.
241. Chai, Y.; Liu, G.; Jin, Z.; Sun, D. How to Keep an Online Learning Chatbot From Being Corrupted. In Proceedings of the 2020
International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–8.
242. Yu, Y.; Eshghi, A.; Mills, G.; Lemon, O. The BURCHAK corpus: A challenge data set for interactive learning of visually grounded
word meanings. In Proceedings of the 6th Workshop on Vision and Language, Valencia, Spain, 4 April 2017; pp. 1–10.
243. Wolska, M.; Vo, Q.B.; Tsovaltzi, D.; Kruijff-Korbayová, I.; Karagjosova, E.; Horacek, H.; Fiedler, A.; Benzmüller, C. An Annotated
Corpus of Tutorial Dialogs on Mathematical Theorem Proving. In Proceedings of the International Conference on Language
Resources and Evaluation (LREC), Lisbon, Portugal, 26–28 May 2004; pp. 1007–1010.
244. Hutzler, D.; David, E.; Avigal, M.; Azoulay, R. Learning methods for rating the difficulty of reading comprehension questions. In
Proceedings of the 2014 IEEE International Conference on Software Science, Technology and Engineering, Ramat Gan, Israel,
11–12 June 2014; pp. 54–62.
245. Bloom, B.S.; Engelhart, M.D.; Furst, E.J.; Hill, W.H.; Krathwohl, D.R. Taxonomy of Educational Objectives: The Classification of
Educational Goals: Handbook I: Cognitive Domain; Technical Report; Longmans, Green and Company: New York, NY, USA, 1956.
246. Stasaski, K.; Kao, K.; Hearst, M.A. CIMA: A Large Open Access Dialogue Dataset for Tutoring. In Proceedings of the 15th
Workshop on Innovative Use of NLP for Building Educational Applications, Seattle, WA, USA, 10 July 2020; pp. 52–64. [CrossRef]
247. Arabshahi, F.; Lee, J.; Gawarecki, M.; Mazaitis, K.; Azaria, A.; Mitchell, T. Conversational neuro-symbolic commonsense reasoning.
arXiv 2021, arXiv:2006.10022.
248. Chkroun, M.; Azaria, A. A Safe Collaborative Chatbot for Smart Home Assistants. Sensors 2021, 21, 6641. [CrossRef]
249. Chkroun, M.; Azaria, A. Lia: A virtual assistant that can be taught new commands by speech. Int. J. Hum.-Comput. Interact.
2019, 35, 1596–1607. [CrossRef]
250. Došilović, F.K.; Brčić, M.; Hlupić, N. Explainable artificial intelligence: A survey. In Proceedings of the 2018 41st International
Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 21–25
May 2018; pp. 0210–0215.
251. Rosenfeld, A.; Richardson, A. Explainability in human–agent systems. Auton. Agents Multi-Agent Syst. 2019, 33, 673–705.
[CrossRef]
252. Bird, E.; Fox-Skelly, J.; Jenner, N.; Larbey, R.; Weitkamp, E.; Winfield, A. The Ethics of Artificial Intelligence: Issues and Initiatives;
Technical Report; European Parliamentary Research Service: Strasbourg, France, 2020.