Conversational Interfaces in IoT Ecosystems: Where We Are, What Is Still Missing

ABSTRACT
In the last few years, text and voice-based conversational agents have become more and more popular all over the world as virtual assistants for a variety of tasks. In addition, the deployment on the market of many smart objects connected with these agents has introduced the possibility of controlling and personalising the behaviour of several connected objects using natural language. This has the potential to allow people, including those without a technical background, to effectively control and use the wide variety of connected objects and services. In this paper, we present an analysis of how conversational agents have been used to interact with smart environments (such as smart homes). For this purpose, we have carried out a systematic literature review considering publications selected from the ACM and IEEE digital libraries to investigate the technologies used to design and develop conversational agents for IoT settings, including Artificial Intelligence techniques, the purposes that they have been used for, and the level of user involvement in such studies. The resulting analysis is useful to better understand how this field is evolving and indicate the challenges still open in this area that should be addressed in future research work to allow people to completely benefit from this type of solution.

CCS CONCEPTS
• Human-centered computing → Human-computer interaction (HCI); Interaction paradigms; Natural language interfaces; • General and reference → Document types; Surveys and overviews; • Computer systems organization → Embedded and cyber-physical systems; Sensors and actuators.

KEYWORDS
Conversational Agents, Internet of Things, User Experience

ACM Reference Format:
Simone Gallo, Alessio Malizia, and Fabio Paternò. 2023. Conversational Interfaces in IoT Ecosystems: Where We Are, What Is Still Missing. In International Conference on Mobile and Ubiquitous Multimedia (MUM '23), December 03–06, 2023, Vienna, Austria. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/3626705.3627775

1 INTRODUCTION
The use of smart objects made possible by the advent of the Internet of Things (IoT) is undergoing rapid and steady growth that is leading to their spread and use in the most common daily environments. Recent reports show that the number of objects connected to the Internet is already substantial and will continue to increase in the coming years: in 2021, there were about 11 billion objects, and this figure is projected to increase to 30 billion in 2030 [69]. The everyday use of these technologies finds application in different areas, supporting end users and organizations in performing tasks of different nature and complexity. Five macro-areas can be identified in which IoT devices find use [4]: healthcare, through the use of smart wearables and personal monitoring; environmental, which includes smart farming, smart agriculture, wildlife monitoring and climate change monitoring; smart cities, composed of smart homes and buildings, traffic and security monitoring; and the commercial and industrial sectors. Thus, the spread of these devices opens new possibilities for improving people's quality of life, from comfort and sustainability through smart home automation (e.g., [3]) and energy consumption control (e.g., [36]) to assistance for older adults and impaired users (e.g., [44]) or remote monitoring and control of farming systems (e.g., [8]). However, there is a need to increase user confidence in the use of these technologies [25]. Providing users (especially non-experts) with useful tools for better understanding and controlling these environments is crucial for implementing the smart living vision.

Using tools such as chatbots or, more in general, virtual assistants accessible through natural language can be a promising approach to breaking down barriers between the user and technology, given their potential ease of use, as demonstrated by the recent success of ChatGPT or by the diffusion of widely adopted commercial products such as Alexa or Google Assistant. Thus, it is important to understand how the development of conversational systems can empower non-technical individuals to interact intuitively with smart environments (e.g., homes), how such tools are developed and tested, and to what extent they meet user needs while keeping the interaction simple and clear even for complex tasks.

Despite several research papers providing comprehensive reviews of the literature on conversational agents from various perspectives and application fields, the field concerning conversational systems applied to the Internet of Things has received limited attention. In previous studies, a systematic literature review proposed by Rapp et al. [63] focuses on the interaction between text-based chatbots and users, considering general aspects such as user satisfaction, trust, and acceptance when engaging with chatbots. In another study, Suhaili et al. [53] conducted a comprehensive review of task-oriented chatbots, emphasizing the technical implementation
aspects rather than user perception. Their analysis aims to identify the techniques employed in developing the chatbots' capabilities to understand user requests and generate appropriate responses. Regarding specific application domains, a study [58] reviews the state of the art of chatbots for education by exploring in which education sub-fields these solutions are employed, how the educational system can benefit from the use of chatbots, what challenges are faced during the implementation (such as ethical and evaluation issues), and which areas of education could potentially benefit from the use of chatbots. The applications of chatbots in healthcare have been considered in [72], with a particular focus on oncology applications. In this case, the authors identify six main task categories (diagnosis, treatment, monitoring, support, workflow, and health promotion) in which healthcare chatbots are employed. Based on these categories, the authors indicate how chatbots act in assisting oncology care patients, defining the current limitations and proposing various aspects that could be improved to enhance their efficiency. Overall, there is a lack of contributions that provide a review of the literature on chatbots applied in the Internet of Things domain.

For this purpose, we have carried out this systematic literature review, which aims to analyse the various contributions in the field of conversational agents in the context of IoT ecosystems, and then identify areas that need further investigation and aspects that require additional research efforts. In the paper, we first introduce the method followed in the systematic review to investigate the use of conversational agents to control IoT ecosystems. We review the relevant papers according to the identified research questions. Then, we discuss the analysis carried out and identify emerging trends in the considered field and areas that require more work. Lastly, we draw some conclusions and provide indications for future work.

2 METHODOLOGY
Following the guidelines introduced by Kitchenham and Charters [46], this review begins with a planning phase that involves conducting an initial analysis of the chatbot literature. This analysis revealed a lack of reviews concerning conversational systems applied to intelligent environments and connected objects. Subsequently, we defined and agreed upon the research questions and the review protocol, which included the search process and the definition of inclusion and exclusion criteria. Once the articles were retrieved from the chosen databases, we started the conduction phase. This stage comprised a two-step screening process: the first screening was based on the title and abstract of the articles, and the second screening applied the exclusion criteria. During the second screening, we discussed the relevance of the papers for our study. Lastly, the obtained articles were analysed to address the research questions and report the results obtained.

2.1 Research questions
The research questions have been defined to investigate the evolution of chatbots in IoT ecosystems and identify areas that require further investigation. The goal is to analyse how intelligent techniques have been exploited, which application domains have stimulated more interest, what interaction modalities have been considered, how conversational breakdowns have been managed, and how user experience has been assessed. By addressing such aspects, we can provide a clear picture of the considered area, since they cover both the technological aspects and those related to the users and the applications considered. In particular, the research questions for the literature review are:

RQ1: What intelligent technologies have been used to build IoT conversational agents? Understanding these technologies offers insights into the technical foundations and capabilities of chatbots in the IoT context, enabling a more profound comprehension of their potential and limitations.

RQ2: In which IoT application domains have conversational agents been employed? Identifying these domains aids researchers in understanding explored areas and potential avenues for further investigation and improvement.

RQ3: What devices and modalities have been considered for accessing conversational agents, and how have they been deployed? Analysing the interaction methods applied can be useful to understand whether there are areas not yet sufficiently explored and possible limitations in the approaches adopted.

RQ4: How have conversational breakdowns been addressed? Conversational agents must effectively handle challenges such as language ambiguity, user intent misinterpretation, and technical limitations. Resolving breakdowns can enhance user experience and agent reliability in IoT ecosystems.

RQ5: What methodologies have been used to measure the usability and user experience of the proposed conversational agents? Analysing whether and how usability and user experience evaluation methods have been applied is useful to understand whether such aspects have been sufficiently considered and how they can be further investigated.

2.2 Search process
In order to identify the relevant articles for conducting this systematic review, we selected a set of papers from two digital libraries, those of the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE), in May 2023. The articles were obtained by running queries with a string of keywords aimed at finding contributions that addressed both the conversational aspect and the smart IoT context. The following keywords and phrases were used in our search:

("conversational agent" OR "conversational AI" OR "intelligent assistant" OR "conversational assistant" OR "virtual assistant" OR "intelligent agent" OR "chatbot" OR "chatterbot" OR "chatterbox" OR "socialbot" OR "digital assistant" OR "conversational UI" OR "conversational interface" OR "conversation system" OR "conversational system" OR "dialogue system" OR "dialog system" OR "vocal interaction" OR "natural language interaction" OR "natural language processing") AND ("ambient computing" OR "smart spaces" OR "IoT" OR "Internet Of Things" OR "IoT environments" OR "automations" OR "smart environment" OR "smart home" OR "IoT service mashup" OR "intelligent environment" OR "intelligent spaces" OR "intelligent ambient")
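For reproducibility, the following minimal sketch (in Python; the exact query syntax accepted by each digital library may differ, so this is only an illustration of how the two keyword groups were combined) assembles the boolean search string reported above:

```python
# Builds the boolean query used for the search: two OR-groups joined by AND.
conversational_terms = [
    "conversational agent", "conversational AI", "intelligent assistant",
    "conversational assistant", "virtual assistant", "intelligent agent",
    "chatbot", "chatterbot", "chatterbox", "socialbot", "digital assistant",
    "conversational UI", "conversational interface", "conversation system",
    "conversational system", "dialogue system", "dialog system",
    "vocal interaction", "natural language interaction", "natural language processing",
]
iot_terms = [
    "ambient computing", "smart spaces", "IoT", "Internet Of Things",
    "IoT environments", "automations", "smart environment", "smart home",
    "IoT service mashup", "intelligent environment", "intelligent spaces",
    "intelligent ambient",
]

def or_group(terms):
    # Wraps each term in quotes and joins them with OR inside parentheses.
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

query = or_group(conversational_terms) + " AND " + or_group(iot_terms)
print(query)
```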
2.3 Inclusion and Exclusion criteria
In order to obtain relevant articles for the review, we established a set of selection criteria. These were used to determine whether a paper should be included in the analysis. Specifically, the following selection criteria were applied: 1) the papers must be written in English; 2) the papers must address the application of conversational systems to IoT environments, excluding studies that focus solely on interaction with a single object; 3) the interactive system must utilize natural language within a conversational approach, thus we did not consider applications that exclusively use predefined commands (for instance, certain Telegram chatbots that employ commands such as '/start', '/createItem', or '/stop' to communicate); and 4) the papers must be at least four pages in length, in order to have sufficient content to analyse.

2.4 Papers selection
With the use of the keywords outlined in the previous section, we identified a total of 3,177 articles within the selected digital libraries. The ACM Digital Library contributed 2,337 of these articles, while the remaining 840 articles were obtained from the Institute of Electrical and Electronics Engineers (IEEE) library. References to the articles were collected in BibTeX format and processed using an open-source reference manager called JabRef. The information regarding the selection process is summarized below, using the PRISMA flow diagram (Figure 1).

Figure 1: Process followed for paper selection.

An initial screening phase (title + abstract) on the 3,177 articles from the two digital libraries led to the exclusion of 25 duplicated papers (through the JabRef filter), and of 3,071 papers that did not cover the topic of our interest (for example, including only NLP or IoT methods and algorithms, systematic reviews about only chatbots or IoT, chatbots not related to IoT, or IoT not related to chatbots). We then applied the exclusion criteria detailed in the preceding section on the resulting 81 papers, determining the final relevance and inclusion of each paper in our analysis. As a result, a further 31 articles were excluded, largely due to their lack of conversational aspects, consideration of insufficient or no Internet of Things (IoT) devices, and inadequate length. After applying these exclusion criteria, we arrived at a refined list of articles. The search phase concluded with the set of the remaining 50 articles (see the Appendix for the papers list).

3 LITERATURE ANALYSIS
From a temporal viewpoint (Table 1), an overview of the articles retrieved shows some early tentative attempts in the first decade of the 2000s, starting from 2005, when the concept of IoT was put forward and Natural Language Processing technologies were not as developed as today. It seems that interest in this topic began to be more relevant starting in 2017 (the year when Alexa and Google Home were released worldwide), peaking in 2018 and 2021.

Table 1: Number of articles per year.
Year            2005  2015  2016  2017  2018  2019  2020  2021  2022
N. of articles     1     1     1     7    13     7     4    10     6

A preview of the key aspects related to the research questions is summarised in Figure 2. The boxes refer to the research questions. They include a list representing the categories derived from the analysis of the papers' content, and each element is associated with the number of contributions for the corresponding category. In the box on UX evaluation methods, the papers have been assigned to multiple categories, since usually the studies adopted more than one evaluation metric. The full list of evaluation metrics can be found in Table 2.

Figure 2: Summary of the key aspects of the research questions and the corresponding literature review contributions.

3.1 Methods and Tools for conversational agents
Conversational agents can be categorized into Task-Oriented and Non-Task-Oriented types [17, 39], with the former focused on specific goals and the latter engaging in open-ended conversations with users, and they can be further subcategorized based on their architecture, such as rule-based, corpus-based, frame-based and dialogue state-based, each employing different methods for generating responses and managing interactions [39].

Given such a premise, and since most works do not explain in depth (and in some cases at all) the architecture exploited, we choose to label as "rule-based" all the implementations that do not use machine learning (ML) or deep learning (DL) in the process of intent classification and entity extraction. Often the available solutions can be defined as architectural hybrids, applying a frame-based architecture augmented with some dialogue-state components, as they use machine learning and deep learning techniques to identify intents and entities but base the dialogue management and the response generation on predefined rules and patterns. On the other hand, the solutions implemented in Rasa (an open-source framework to build conversational systems) adopt a dialogue policy that uses machine learning to predict the next most accurate dialogue action (send a response, wait for other messages, ask for clarifications, ...).

Among the 50 conversational agents considered in the survey, 18 have been considered rule-based, such as [1, 6, 7, 10, 14]. These
chatbots use pre-defined rules (e.g., using regular expressions) to classify the intent and respond to user inputs. Such works do not rely on machine learning algorithms, but in most cases (see the following sections) use third-party services such as the Google Speech API, which exploit machine learning to perform speech-to-text and text-to-speech.
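As a simple illustration of this rule-based style, the following sketch shows regular-expression intent matching for smart-home commands; the intents, patterns and device names are illustrative assumptions and do not reproduce any specific surveyed system:

```python
import re

# Minimal rule-based intent classification: each intent is a hand-written pattern,
# and the named group "device" acts as a crude entity extractor.
RULES = [
    ("turn_on",     re.compile(r"\b(turn|switch)\s+on\s+(the\s+)?(?P<device>\w+)", re.I)),
    ("turn_off",    re.compile(r"\b(turn|switch)\s+off\s+(the\s+)?(?P<device>\w+)", re.I)),
    ("query_state", re.compile(r"\b(is|are)\s+(the\s+)?(?P<device>\w+)\s+(on|off)\b", re.I)),
]

def classify(utterance):
    for intent, pattern in RULES:
        match = pattern.search(utterance)
        if match:
            return intent, match.group("device")   # intent label + extracted entity
    return "fallback", None                        # no rule matched

print(classify("Please turn on the lights"))        # ('turn_on', 'lights')
```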
Other analysed solutions use ML-based approaches for entity extraction and intent classification, implementing their custom architecture or using frameworks. For example, [15, 38, 49, 62] use ML-based algorithms for entity extraction and intent classification, while others [59, 71, 73] use Deep Learning approaches (such as RNNs, LSTMs and Transformers) to reach the same goal. One solution [22] adopts a reinforcement learning algorithm (using the Q-learning method) to allow the chatbot to make new associations between unseen commands and the actions to be performed. Several papers, on the other hand, report the use of frameworks such as Rasa [67], Dialogflow [19, 27, 48], IBM Watson [43], and Amazon AVS [28] to manage not only the recognition of intents and entities, but also the conversational flow and the integration with instant messaging platforms or virtual assistants (e.g., Telegram, Facebook Messenger, Line, Alexa, Google Assistant). Among the ML-based agents (32 of 50 articles), the most popular framework is Dialogflow, used in nine systems, followed by Rasa and Amazon AVS in four systems each. Other frameworks such as IBM Watson, Microsoft LUIS, WIT.ai, Google Assistant SDK, Amazon Skill Kit and Mycroft are rarely used (six times in total). Finally, nine systems are based on custom implementations using techniques such as Support Vector Machines (SVMs), Recurrent Neural Networks (RNNs), and Transformers such as BERT. Overall, such data suggest that the use of ML-based frameworks and techniques is the most common choice in the development of conversational systems. This is likely due to the ability of ML to handle complex and varied inputs, as well as the availability of pre-trained models (e.g., BERT) that can be fine-tuned for specific use cases. Rule-based systems, on the other hand, are less commonly used due to their limited flexibility and the difficulty of maintaining and updating the large number of rules required for natural language processing.

In terms of specific frameworks and techniques, Dialogflow and Rasa are the most popular among the systems surveyed, likely due to their – relative – ease of use and the availability of pre-built components and integrations. Dialogflow is proprietary, while Rasa is open source. Custom implementations using techniques such as SVMs, RNNs, and Transformers are also relatively popular, indicating that several developers are willing to invest the time and effort required to build a tailored solution for their specific needs. In many of the papers analysed, the authors pay little attention to the description of the implementation of conversational agents and, in some cases, this aspect is completely ignored.

Among the considered articles, only three share the implementation code [15, 45, 59], while 23 present an implementation description that can be used to reproduce the work to some extent. Thus, in addition to stating, for instance, which framework or algorithm was used for input classification, there is also a description of how the various intents and entities were organised (e.g., by giving examples of the training phrases and the NLP analysis pipeline), the chatbot functionalities, the management of the conversational flow and, in specific cases, the management of breakdowns (analysed in Section 3.4). For instance, in some contributions [19, 27, 28] the authors dedicate a significant part of the work to describing the implementation of intents and the related entities, providing examples of training phrases, chatbot functionalities, how the conversation is handled and how the conversational system is integrated into third-party or customised instant messaging applications. The remaining 26 articles do not present enough information to reproduce the work partially or entirely.

The support for creating automations
in a trigger-action format, also defined as "customization rules" or "routines," has been discussed in several works, such as [15, 19, 20, 27, 45, 67]. These papers propose different approaches for creating automations with varying types of triggers and actions. For instance, in [15] users can create automations that perform one or more actions when a time-related trigger occurs (e.g., "every day at 6 am get the latest weather forecast and send it via email to Bob"). In other cases [19, 20] the system suggests existing IFTTT rules to the user based on an abstract description of the desired behaviour. The authors divide a rule into two components, the "what" component indicating the desired automated action and the "when" component specifying the context for execution (e.g., "I would like to secure my places when I leave them"). The system can also identify rules that cannot be implemented with the user's connected devices. In one case [27] the chatbot allows the creation of rules consisting of several triggers and actions (e.g., "If the lights are on when leaving the house, turn them off and send me a notification"), where triggers can refer to events and conditions related to sensors and smart objects and are concatenated using the logical AND/OR operators, while actions are executed sequentially. Kim et al. [45] propose a system that can identify and allow users to modify existing rules in their context according to specific user's goals. This is done in a two-step process: localizing the relevant automations currently deployed in the smart home and then modifying them (by appending a new component, replacing existing components, or updating parameters) according to the user input. The chatbot in [67] can compose rules consisting of a trigger and an action, considering several users called "actors." An example of such a rule is "I want the bedroom lights to turn off when me and my husband get in bed at night", where "me" and "my husband" are identified as actors. Finally, Barricelli et al. [9] developed an Alexa skill that allows the creation of Alexa routines through a multimodal approach, combining the use of voice and touch (on Amazon devices with a screen) to guide the user in selecting and configuring triggers and actions.
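The following sketch illustrates how such trigger-action rules can be represented, with several triggers combined through an AND/OR operator and a sequence of actions; the names and fields are illustrative assumptions rather than the data model of any specific surveyed system:

```python
from dataclasses import dataclass
from typing import List, Literal

@dataclass
class Trigger:
    entity: str          # e.g., "lights", "user presence"
    condition: str       # e.g., "is on", "leaves home"

@dataclass
class Action:
    device: str          # e.g., "lights"
    command: str         # e.g., "turn off"

@dataclass
class Rule:
    triggers: List[Trigger]
    operator: Literal["AND", "OR"]   # how the triggers are combined
    actions: List[Action]            # actions are executed sequentially

# "If the lights are on when leaving the house, turn them off and send me a notification"
rule = Rule(
    triggers=[Trigger("lights", "is on"), Trigger("user", "leaves home")],
    operator="AND",
    actions=[Action("lights", "turn off"), Action("phone", "send notification")],
)
```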
3.1.1 Other AI Technologies Involved. In addition to the technologies used for the realisation of conversational systems, various AI technologies were employed in the reviewed studies to enhance and/or add functionality to such systems, including speech recognition, conflict resolution reasoning, and various approaches for face and emotion recognition. Regarding better support for user interaction, speech recognition technologies, such as the Google API [6, 18, 23, 57, 59], Microsoft Bing Speech API [2], Web Speech API [27] and Android Speech Recognition [13] for speech-to-text and text-to-speech, were used in several studies. These technologies allow users to communicate with the system using voice commands, enabling a more convenient and intuitive user experience, and enhancing the accessibility of the application.

Deep Learning technologies, such as Vokaturi for emotion classification using voice [13] and convolutional neural networks for emotion analysis [66], were utilized to enable a home automation system to understand and respond to the emotional states of users. This can be useful in ambient assisted living environments, where the system can provide appropriate assistance or support based on the user's emotional state. Machine learning techniques, including algorithms for device classification [16] and user pattern recognition [54, 71], were applied in several studies to improve the functionality and security of home automation systems. For example, a machine learning-based algorithm took a stream of packets sent by a device and classified the device based on the contents of the packets to enhance IoT cybersecurity [16]. Moreover, one contribution [38] used the Frequent Pattern Growth Algorithm to mine user activities starting from sensor data to optimize the users' commands based on previous interactions. Case-based reasoning, a problem-solving approach that utilizes solutions from previously solved problems, was used in [59] to resolve conflicts between commands given by different people in a multi-user smart home environment. Finally, various approaches for face recognition, such as the Local Binary Pattern Histogram (LBPH) algorithm [52] and Deep Learning models, were employed in several studies to enable a home automation system to identify and respond to individual users. For example, [38] and [68] use OpenCV to authorize user access and control. The use of such technologies can be useful for further personalizing the users' experiences through learning their preferences, and for security purposes such as authenticating users before allowing them to execute certain commands. Overall, the use of these technologies has the potential to enhance the functionality, usability, and security of home automation systems, and to provide improved efficiency and personalized experiences for users.
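As an illustration of this kind of face-based authorization, the following sketch uses OpenCV's LBPH recognizer (available in opencv-contrib-python); the model file name, the distance threshold and the overall flow are illustrative assumptions rather than the pipeline of any specific surveyed work:

```python
import cv2

# LBPH face recognizer requires the opencv-contrib-python package (cv2.face module).
recognizer = cv2.face.LBPHFaceRecognizer_create()
recognizer.read("trained_faces.yml")   # hypothetical model trained on household members
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def is_authorized(frame_bgr, max_distance=70.0):
    """Returns (authorized, label) for the first recognized face in the frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        label, distance = recognizer.predict(gray[y:y + h, x:x + w])
        if distance < max_distance:      # lower distance = closer match
            return True, label
    return False, None
```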
3.2 Application Fields
The combination of IoT and AI has the potential to transform many human activities in several domains (retail, industry, home, ...) by improving safety, productivity and user experience. Despite the wide range of possible applications in several fields, the papers identified in the systematic review mainly focus on the development of solutions oriented to smart homes, for example, to improve comfort, provide assistance to older adults or impaired users, or monitor energy consumption. Fewer applications address aspects concerning healthcare, smart agriculture or, more in general, smart environments, such as offices or buildings. Of the 50 articles analysed, 37 propose written or spoken language to control and monitor home appliances and sensors. For example, one system [1] uses a chatbot accessed through Facebook Messenger to control a variety of smart home devices, such as sensors for noise and gas, a door sensor, and a relay to control lights. Moreover, some contributions consider the interaction with a physical robot. Among the papers in the smart home domain, five of them [2, 13, 28, 56, 60] are specifically designed to provide support for older adults or people with disabilities. For example, one paper [13] discusses the development of a human-robot-smart environment interaction interface for ambient assisted living, which was able to control lights, curtains, a radio, an air conditioner, and temperature. In another study [56], the interaction with a robot is exploited, focusing on helping older adults or movement-impaired people in everyday tasks, including the monitoring of vital parameters. Moreover, in [66] the authors propose a smart mirror that can act as a virtual assistant, making it possible to query web services (e.g., weather, news) or to execute actions on home appliances. Only two papers discuss healthcare applications. One study [11] discusses the development of an IoT-AI powered healthcare kit, which was able to measure blood pressure, temperature, oxygenation, and heart rate. In such a solution, this information can only be retrieved from a classical web interface,
while a chatbot is used to make diagnoses based on the description of the user's symptoms. The other one [35] describes a chatbot to obtain information about the user's heart rate and provides the possibility to book a doctor's appointment or set reminders for taking medications. Five papers have been classified as "smart environments" since they do not focus on particular domains but propose conversational interfaces to control, in general, IoT devices. For example, one contribution [34] proposes a virtual assistant for student laboratories that can answer general questions and, in addition, can control the status of laboratory instruments, while [73] proposes a system to perform multiple operations contained in one complex natural language command (in Chinese) for three main domains (agriculture, industry and smart home). The remaining articles present applications in different fields, such as a chatbot application for supporting smart urban agriculture through the measurement of soil moisture, overall humidity and temperature, and programming or remotely controlling irrigation [30], or a chatbot to command a 3D printer [50], including the upload of the 3D model to print, the status and progress of the printing, and guidance for the user through the whole process.

3.3 Interaction Device and Modality
The interaction modality and the devices with which conversational agents can be accessed play an important role from the point of view of both accessibility and usability. Text-only interaction modes favour privacy and the possibility of keeping track of the conversation (and possibly carrying it on at different times), and make it possible to use graphic elements (e.g., choice buttons) to speed up and simplify the interaction. Voice modes emphasize the possibility of "hands-free" interaction, which is primarily useful for impaired users with limited movement possibilities, and more generally for natural and more immediate interaction. Among the articles analysed, the distribution of the interaction modality is almost equally divided between vocal (16 articles) and textual (17 articles) interaction. Most of the solutions using only voice rely on devices such as Alexa [29, 35], Google Assistant [23, 35, 45, 56] or custom hardware implementations [34, 38, 60, 62, 66] (e.g., using a Raspberry Pi with a microphone). In one work [13] the authors use a humanoid robot (a Pepper one) as the interface for receiving commands and estimating the user's emotions through voice or facial expressions. It is worth noting that solutions that use Google Assistant can be used either from stationary devices such as Google Home or from any Android-enabled device.

Regarding text-based agents, custom web platforms [6, 11, 20, 67] and commercial messaging applications, like Facebook Messenger [1, 33], Slack [40], Telegram [50, 55], Line [30, 41, 68] and WeChat [57], allow an interaction independent from the device (smartphone or desktop), since all they require is an internet connection to access the application or the web site. Moreover, integration with these commercial applications requires less effort than developing customised interfaces. Thus, the integration with well-known applications, perhaps already used for everyday messaging, makes access and use of the chatbot straightforward. Finally, custom smartphone [73] and desktop applications [32, 37, 61] seem to be related to early prototypes (e.g., [61]) or to integrate the chatbot into solutions with other functionalities (e.g., visualisations of energy consumption [73]). The remaining papers present multimodal solutions, such as text and voice (8), voice and gestures (3), and voice and touch (3). Text and voice solutions maintain the positive aspects of the text-based ones, while the vocal interaction is constrained to the click of a button to activate and deactivate the microphone, thus requiring the use of the hands. Among the contributions analysed, most are accessible through custom web interfaces [7, 19, 27, 43, 59] or smartphone applications [15, 54]. Salvi et al. [65] preferred to split the two modalities, making the conversational agent accessible both from Google Assistant and the Telegram application. Concerning works that integrate voice and gesture, two of them combine the voice command with a hand movement to perform actions on home devices. Anbarasan et al. [2] use a Kinect, and their solution captures voice and gestures, using the combination of both to execute the command on a device; [28] also uses a Kinect, but the voice commands and the gestures are managed separately, thus it is not possible to use them in conjunction. Finally, [42] exploits a wristband to capture gestures around the environment and a Bluetooth-enabled wireless earpiece for getting voice commands, combining both to execute commands (e.g., "turn on that light"). The possibility to simultaneously use touch and voice is exploited in an Alexa skill [9], allowing the user to switch between the two modalities while interacting with an Amazon device that presents a display, while the other works [14, 22] exploit the two modalities separately, so the user must choose whether to interact using either voice or touch. One paper [18] describes interaction via three different modalities: voice, touch and BCI (Brain-Computer Interface), designed for users with mobility limitations for controlling smart home devices. This solution uses a mobile Android application with voice recognition and a dialogue system, with the additional possibility to alternatively use a NeuroSky MindWave Mobile headset as input. Finally, two papers [64, 71] do not describe the mode of interaction with the developed systems since they present a generic application of natural language to smart environment control.

3.4 Conversational Breakdowns
The effective handling of conversational breakdowns is an essential aspect to consider in conversational agents. Breakdowns occur when a conversational agent fails to understand user inputs, leading to frustration, loss of credibility, and dropping the conversation [21, 50, 51]. This is especially crucial in task-oriented chatbots where the user has a specific goal in mind. Therefore, repair strategies are necessary to recover from breakdowns and continue the conversation. There are several methodologies to repair breakdowns, including presenting alternative options, highlighting keywords, and rephrasing [5]. Out of the 50 articles reviewed, only 12 describe how breakdowns are handled. The breakdown repair in most of these articles involves rephrasing the command to make it clearer [18, 19, 20, 28, 59, 67]. In [27], since it allows the creation of complex rules using a single input (which may include more than one trigger and one action), the chatbot asks the user to rephrase only the part of the input that was not understood, while showing to the user which part has been correctly classified. In the case of simple commands (e.g., entering a single trigger or action), rephrasing the entire input is requested. When a breakdown occurs, a solution [41] shows the user a help message containing possible chatbot
commands, while [54] opens a Google Search page with the text of the input as a query. Campagna et al. [15] provide a list of possible matches with available intents and, in the case of none being valid, the system asks for rephrasing. Then, if the user selects one of the suggested intents, the chatbot will add the input text that generated the breakdown to the training phrases for that intent. Kim et al. [45] use a "disambiguation strategy" by asking the user additional questions in case of incomplete or ambiguous input, while the authors do not describe the chatbot behaviour when the input does not match any intent. Since the chatbot presented in [45] is dedicated to the modification of trigger-action rules, the disambiguation strategy is applied when it is unclear whether a trigger or action is to be added or modified, or when it is unclear which internal parameter (of the trigger or action) is to be changed. Follow-up messages are then sent to identify the user's intent uniquely. Oumard et al. [60] use the Rasa fallback policy (https://rasa.com/docs/rasa/fallback-handoff/), which consists of a two-stage breakdown resolution: when a message is classified with low confidence, the user is asked to confirm the most probable intent. If the users do not confirm the intent, they are asked to rephrase the message. Then, if the new message is classified with low confidence, the chatbot asks again for confirmation. Finally, if the user rebuts again, a breakdown message is sent, and the conversation state is reset.
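The following simplified sketch illustrates this two-stage behaviour; the confidence threshold, the messages and the state handling are illustrative assumptions and do not reproduce Rasa's actual implementation:

```python
# Minimal two-stage fallback: confirm the best guess, then ask for a rephrasing,
# and finally give up and reset the conversation state.
CONFIDENCE_THRESHOLD = 0.4

def fallback_step(intent, confidence, state):
    """Returns the bot reply and updates `state`, a dict tracking repair attempts."""
    if confidence >= CONFIDENCE_THRESHOLD:
        state["attempts"] = 0
        return f"OK, executing '{intent}'."
    state["attempts"] = state.get("attempts", 0) + 1
    if state["attempts"] == 1:
        return f"I am not sure I understood. Did you mean '{intent}'?"   # stage 1: confirmation
    if state["attempts"] == 2:
        return "Sorry, could you rephrase your request?"                 # stage 2: rephrasing
    state["attempts"] = 0                                                # give up and reset
    return "Sorry, I could not help with that. Let's start over."
```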
Both quantitative and qualitative analyses of the reviewed articles demonstrate the need for more research effort to address conversational breakdown resolution. Referring to the discussion on repair strategies for conversational breakdowns [5], the twelve articles that address breakdown issues do not use particularly efficient techniques. Simply asking to rephrase the input can be considered simple and quick, as it immediately highlights a lack of understanding on the part of the chatbot, but it does not provide particular help to the user to recover from the error. Instead, the methodology reported in [15] would appear to be more efficient, as the chance of choosing between different options makes the chatbot's possibilities immediately clear and less input is required from the user, at the expense, however, of less natural interaction. Furthermore, the resulting possibility of re-training the chatbot is useful in preventing future breakdowns. In summary, the presence of conversational breakdowns in chatbots creates significant obstacles to achieving an optimal user experience, especially in task-oriented chatbots. After the literature analysis, it becomes apparent that the implementation of effective repair strategies is crucial to allow users to overcome such breakdowns and continue with the conversation. Although most of the articles propose rephrasing as a solution, there is a need for further research in devising and applying more efficient techniques. Research suggests [5] that presenting alternative options, emphasizing keywords, and providing disambiguation strategies could potentially improve the chatbot's ability to handle breakdowns proficiently. A relevant contribution [24] has investigated learning mechanisms to minimize conversational breakdowns in human-agent interaction in the manufacturing industry. It compares three scenarios where, after a successful repair, the learning burden is assigned to the agent (it must adapt to the user's terminology), the users (they must adapt to the agent's terminology), or both (the agent adapts to the user's terminology, but the user should use the agent's standard terms to reduce the possibility of breakdown). Participants (N=26) showed a preference for distributing the learning responsibility with the conversational agent, with a likelihood of 61.3%. However, participants expressed that allowing the user to manage the learning burden, rather than sharing it with the agent, is a more efficient approach. Therefore, additional research should tackle the design and evaluation of more successful strategies for repairing conversational breakdowns in a variety of scenarios in IoT settings.

3.5 User Experience Evaluation Methods
One key point has been to analyse how users have been involved in assessing their experience with the proposed conversational interfaces for controlling IoT ecosystems. Among the articles selected for this review, only 11 present an evaluation of the usability and overall experience of the conversational systems. The remaining studies only evaluate the systems in terms of computational performance, using classic Machine Learning evaluation metrics such as accuracy, F1 score and loss. The evaluation methods used in these eleven studies include Likert scales, measures of task success and failure, and NASA-TLX and SUS evaluations, which are commonly used to assess the usability of the systems. The NASA-TLX and SUS evaluation methods were used in a study [2] that assessed the usability, accuracy and workload of the system for older adults and compared it to Google Home and Amazon Echo. One study [60] also applied the UEQ questionnaire, which measures the user's overall experience with the system.

One example study using the Likert scale [16] evaluated the use of a voice assistant for cybersecurity tasks. Participants were asked to rate the difficulty of the tasks after performing them with both traditional methods and the voice assistant. Task success and failure measures and thinking aloud were used in [67], which recorded the number of errors and help requests during task completion, while in [42] users were asked to vocalize their thoughts on the interaction after completing each of the three proposed tasks. Sometimes more specific evaluation criteria have been used, such as custom questionnaires, or specific metrics such as the number of conversational turns. One study [19] used metrics for evaluating Perceived Effectiveness and Fun (PEF), alongside the total number of messages, the number of desired automations expressed by the user and the number of satisfying automations identified by the associated recommendation system. Another example [59] used a custom questionnaire with Likert scales to evaluate the virtual assistant's speech recognition, conflict resolution, interaction with the virtual assistant, and user-friendliness. Custom questionnaires were also used in some studies, such as [34], which asked specific questions about the user's experience with the virtual assistant (e.g., "Did you enjoy the overall experience?" or "Would you enjoy my services on a daily basis?"). The number of conversational turns and task time were measured in [27, 45], where participants performed some tasks using two different approaches (they compared form-based interfaces and a conversational one in creating and modifying automations specified in terms of trigger-action rules). Finally, [9] used a between-subject protocol with two groups for comparing the proposed system with the standard solution, measuring the success and error rate, the execution time and the number of errors. Then the users were asked to fill out a SUS and a UEQ questionnaire.
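As a reference for the scoring procedure behind the SUS results reported by these studies, the following sketch computes the standard SUS score from the ten item responses (odd items contribute the score minus one, even items five minus the score, and the sum is multiplied by 2.5 to obtain a 0–100 value):

```python
def sus_score(responses):
    """Standard System Usability Scale scoring for ten 1-5 Likert responses."""
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)   # items 1,3,5,7,9 vs items 2,4,6,8,10
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5

print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))   # 85.0
```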
Table 2 (excerpt): User test details of the studies discussed in this section.
[67] – Tasks: predefined tasks; measures: completion time, n. of help requests, n. of errors made, thinking aloud; participants: 5; age: from 25 to 40; background: no info; sex: no info.
[19] – Tasks: 15-minute free task; measures: n. of messages, n. of needs expressed by the user, n. of automations identified, open questions, Perceived Effectiveness and Fun (PEF); participants: 8; age: from 24 to 30 (avg. 26); background: all students with computer science backgrounds; sex: 5 m, 3 f.
[9] – Tasks: predefined tasks (5); measures: between-subject protocol with two groups, task success rate, execution time, number of errors, SUS, UEQ; participants: 20; age: group 1 avg. 32.5, group 2 avg. 38.8; background: 19 without programming experience; sex: 10 m, 10 f.

User study sample sizes ranged from two [16] to twenty [2, 9, 45] participants, with an average of 11.81 participants. There were two studies with 15 users [34, 59] and two with ten users [27, 42]; more details are in Table 2. Still, regarding the users involved, eight out of eleven studies reported a user age range that varied from a minimum of 18 years old to 53 ([16] does not present any information about the users' age, while [59] does not report any user information at all). In one case [2] users' age ranged from 65 to 80, since the study focuses on solutions for older adults. Another specific category of users was considered in [60], where seven blind users were involved. Moreover, seven out of eleven studies specified the sex distribution among participants, with an overall percentage of 41.96% (47 users) females and 58.03% (65 users) males.

In general, the numbers suggest that little weight has been given to the evaluation of the user's experience during interaction with conversational systems in IoT settings, while more attention has been given to the computational performance of the implemented systems. Furthermore, it can be noted that there is no single shared evaluation methodology among researchers in this field, who often use general-purpose evaluation methods, such as NASA-TLX, SUS, and thinking aloud. Alongside these "classical" methods, specific metrics for evaluating information related to purely conversational aspects have been used, such as in [27] and [45], which tracked the number of conversational turns (for task-oriented chatbots, a lower number of turns may imply a higher efficiency), or in [19] the number of messages.

The variety of evaluation methods applied, ranging from custom metrics to general methods (such as SUS), highlights the lack of specific methodologies for evaluating conversational systems. In this perspective, the Subjective Assessment of Speech System Interfaces (SASSI) [31] can be used to assess user perception of some aspects of conversational interfaces using the vocal modality. The SASSI questionnaire presents generic items such as "The system is accurate" or "The system is easy to use", and specific conversational items such as "It is clear how to speak to the system" or "I sometimes wondered if I was using the right word". More specifically for conversational interfaces, the BOT Usability Scale (BUS-15) [12] has been recently put forward, emphasising aspects such as the
perceived quality of the conversation (e.g., "I find that the chatbot understands what I want and helps me achieve my goal") and the perceived quality of the chatbot functions (e.g., "The chatbot was able to keep track of context"). As its authors themselves point out, the questionnaire needs further testing and validation.

Finally, we would like to underline that none of the eleven articles carried out user tests in a real, uncontrolled environment (i.e., in-the-wild), since all tests were performed in controlled and supervised environments by the researchers themselves. Table 2 shows the user test details for the papers discussed above.

4 DISCUSSION AND CONCLUSIONS
The combination of IoT and AI technologies has the potential to transform human activities in various domains by improving comfort, assistance and productivity, enriching people's future smart living. Given the recent advances in natural language processing, in this systematic review we analysed 50 articles (selected from an initial set of 3,177) that focused on the development of chatbots and virtual assistants for controlling intelligent environments in IoT settings.

Employed technologies. We observed that the primary technologies employed to develop chatbots in these systems were Artificial Intelligence-based methods (e.g., Reinforcement Learning, Transformers, SVMs), with many works using frameworks such as Rasa and Dialogflow, while a smaller, but still relevant, number of works implemented rule-based systems. A large portion of the articles do not present a sufficiently in-depth description to reproduce the work proposed, and only three articles publicly share the implementation code. Moreover, current solutions seem limited in terms of support for flexibly creating automations that involve multiple connected smart objects in a conversational way, since they mainly focus on modifying existing automations or on creating simple ones (e.g., with only one trigger and one action). Despite AI solutions being more efficient compared to rule-based ones, they are still limited in terms of flexibility in understanding user input and managing the conversational flow in case of "unexpected" user behaviour. The potential unlocked by recent advancements in NLP with LLMs (e.g., GPT-4) opens up a range of new possibilities through prompting techniques, both by reducing developer workload (e.g., there is no longer the need to define intents and entities by creating training datasets from scratch, and there is no need to predict and define possible user conversational paths) and by enhancing the capabilities of conversational agents. These agents exhibit linguistic abilities that enable autonomous and self-sufficient dialogue management, independent of pre-defined pathways.
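As a minimal sketch of this prompting-based approach, the following example maps a free-form utterance directly to a structured device command without any intent or entity training data; the llm_complete function is a hypothetical placeholder for whatever LLM API is chosen, and the prompt wording is only illustrative:

```python
import json

PROMPT_TEMPLATE = """You control a smart home with these devices: {devices}.
Translate the user request into JSON with the fields "device", "action" and "parameters".
If the request cannot be fulfilled, return {{"device": null}}.
User request: "{utterance}"
JSON:"""

def utterance_to_command(utterance, devices, llm_complete):
    prompt = PROMPT_TEMPLATE.format(devices=", ".join(devices), utterance=utterance)
    raw = llm_complete(prompt)          # call to the chosen LLM, injected by the caller
    try:
        return json.loads(raw)
    except json.JSONDecodeError:        # LLM output is probabilistic: validate before acting
        return {"device": None}
```

Because the model's behaviour is probabilistic, the parsed output still needs to be validated against the actual device capabilities before execution, which is consistent with the open issues discussed below.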
Evaluation metrics. Overall, from our analysis, it emerges that despite the significant technological evolution in the areas of conversational interfaces and IoT, their integration is still an open issue, and several areas need more research to better exploit the possibilities of conversational interfaces in smart contexts, in particular in terms of user-centred approaches. Indeed, only a small number of studies considered user-based evaluation of the proposed solutions. The evaluation methods for these systems varied, with some studies evaluating usability and user experience (about 20% of the papers) using methods such as Likert scales, custom questionnaires, and the NASA-TLX and SUS, while the remaining works focused only on evaluating the computational performance of the systems using metrics such as accuracy, F1 score, and loss. However, some contributions have used metrics aimed at assessing, specifically, the conversational experience, for instance by considering the number of conversational turns. Thus, the lack of specific metrics is clear, as is the consequent need to research and identify evaluation methods for conversational agents. In this perspective, the SASSI [31] questionnaire was put forward for the evaluation of speech interfaces, whereas more recent studies such as [12] have started to consider the evaluation of more specific conversational aspects, but further validation is necessary to understand whether they are the optimal solution in this case. In addition, there is a clear lack of studies in real contexts that can provide more information on the actual user experience over time during daily activities (i.e., in-the-wild studies).

Application domains. The majority of studies focused on the smart home domain, which is the one with the most immediate impact on people's lives, improving comfort (e.g., using routines or remotely controlling devices), assisting older or impaired users (e.g., through vocal commands and robots), and monitoring energy consumption in smart homes. Other areas of interest relate to smart health, giving the possibility to monitor vital parameters or perform diagnostics. The use of conversational interfaces in home automation systems has the potential to enhance functionality, usability, and security, and to provide personalised experiences for users. Nevertheless, there is a need for more research on the user experience evaluation of chatbots in real-world scenarios, as studies in this area were usually conducted in controlled environments.

Conversational limitations and areas for improvement. One area for improvement lies in the enhancement of conversational agent capabilities for smart home automation (also called "routines" or "trigger-action rules"). Existing commercial solutions (e.g., Alexa and Google Assistant) do not allow users to create automations using natural language but only through classic button-based interfaces available in the smartphone applications, while research solutions (some of them presented in Section 3.1) primarily revolve around creating or modifying simple automations, but they lack the flexibility to handle complex tasks involving multiple interconnected smart objects. Future research should be dedicated to empowering conversational agents to act as personal assistants, guiding users through the configuration and personalization of smart environments, providing comprehensive support and insights into the possibilities and limitations of sensors and smart objects, and empowering users to make informed decisions and achieve their desired outcomes.

An additional area to explore is system transparency and the ability to provide explanations. This factor becomes particularly important when dealing with agents that enable the creation and execution of automations. Specifically, users should have access to information about the automations and the capability to seek explanations for system behaviour (e.g., addressing basic requests such as "Why did you turn on the thermostat?" or "Why is it so hot now?").

Another potential area for improvement lies in the development of recommendation systems based on acquired sensor data and user preferences. By considering user habits, preferences, and goals, the system can suggest relevant automations that align with each
MUM ’23, December 03–06, 2023, Vienna, Austria Simone Gallo et al.
user’s unique requirements. This personalised approach can greatly studies were gathered from digital libraries (as detailed in Section
enhance smart environments’ usability and overall user experience. 2.4), which curate the most representative conference papers and
One further relevant research direction should focus on mitigat- journal articles for our research purposes. However, these libraries,
ing errors and improving user interaction. Breakdowns or fallbacks, while valuable, are not exhaustive. This aspect could potentially
instances where the agent fails to understand user input, can easily affect the comprehensiveness of our research findings.
lead to user frustration. Innovative strategies can be employed to
address this issue. For instance, rather than merely asking users REFERENCES
to repeat their input, the agent could automatically rephrase the [1] Sakib Ahmed, Debashish Paul, Rubaiya Masnun, Minhaz Uddin Ahmed Shanto,
and Tanjila Farah. 2020. Smart Home Shield and Automation System Using
sentence (e.g., using language models or rule-based algorithms Facebook Messenger Chatbot. In 2020 IEEE Region 10 Symposium (TENSYMP),
for replacing terms and modifying the syntactic structure) and 1791–1794. DOI:https://doi.org/10.1109/TENSYMP50017.2020.9230716
seek user confirmation. Furthermore, misunderstandings leading [2] Anbarasan and Jeannie S.A. Lee. 2018. Speech and Gestures for Smart-Home
Control and Interaction for Older Adults. In Proceedings of the 3rd International
to breakdowns can be repurposed into training data. This allows Workshop on Multimedia for Personal Health and Health Care (HealthMedia’18),
the agent to learn from its mistakes and expand its vocabulary, Association for Computing Machinery, New York, NY, USA, 49–57. DOI:https:
potentially enhancing the agent’s performance and providing a //doi.org/10.1145/3264996.3265002
[3] Carmelo Ardito, Giuseppe Desolda, Rosa Lanzilotti, Alessio Malizia, and Maris-
more satisfying user experience over time. tella Matera. 2020. Analysing trade-offs in frameworks for the design of smart
Challenges and possibilities for the future. Lastly, as previously mentioned, recent advances in Natural Language Processing (NLP) have greatly improved conversational capabilities and language-related tasks. Large Language Models (LLMs), such as GPT-3 and ChatGPT, have been instrumental in achieving those improvements. These models have enabled text generation, translation across various languages, text style rewriting, question answering, and more. With the use of prompt engineering techniques, they can perform zero-shot and few-shot tasks from a small set of examples, without the need to train or fine-tune a new model [47]. The emergence of LLMs opens new possibilities for managing smart environments through their advanced capabilities of understanding and generating natural language, as well as maintaining context during the entire conversation, making the interaction closer to a human one and consequently more natural and less frustrating. However, applying these models to real-world problems requires further research, since they exhibit probabilistic rather than deterministic behaviour, so even a small change in the input may lead to an unexpected or inaccurate output. Initial explorations have started to investigate how combining prompt engineering and high-level function libraries enables ChatGPT to adapt to different robotics tasks, using natural language communication to control the movements of objects such as a robotic arm or a small drone [70]. Another possibility is to use LLMs without granting them full control, combining well-known and validated systems (e.g., Rasa) with dedicated LLMs for specific tasks [26]. In this way, it would be possible to develop efficient and reliable systems, reducing the risks associated with LLMs while simultaneously enhancing the capabilities of more "conventional" systems. Overall, LLMs are paving the way for numerous possibilities and different developments. However, this progress necessitates further research, not only to adapt these models to the Internet of Things (IoT) context but primarily to ensure their safe and practical use. As these models will have an impact on people interacting with real-world objects, it is essential to address concerns related to privacy, security, and ethical implications.
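One possible shape for such a combination, sketched here with hypothetical device names, thresholds, and a placeholder call_llm function (no specific LLM API or Rasa component is implied), is to let the validated pipeline handle confident intents and to delegate only low-confidence utterances to a few-shot-prompted LLM whose structured output is checked before any device is touched.

```python
import json
from typing import Optional

# Hypothetical few-shot prompt; the examples and device names are illustrative only.
FEW_SHOT_EXAMPLES = (
    'Translate the request into JSON with keys "device" and "command".\n'
    'Request: switch the kitchen light off -> {"device": "kitchen_light", "command": "off"}\n'
    'Request: make the bedroom warmer -> {"device": "bedroom_thermostat", "command": "increase"}\n'
)

ALLOWED_DEVICES = {"kitchen_light", "bedroom_thermostat", "living_room_blinds"}

def call_llm(prompt: str) -> str:
    """Placeholder for the LLM client used in a real deployment."""
    raise NotImplementedError

def handle(utterance: str, intent: Optional[str], confidence: float) -> dict:
    """Trust the conventional, validated NLU when it is confident; otherwise ask the LLM."""
    if intent is not None and confidence >= 0.8:
        return {"handler": "rule_based", "intent": intent}
    completion = call_llm(FEW_SHOT_EXAMPLES + f"Request: {utterance} ->")
    action = json.loads(completion)
    # Validate the probabilistic output before it reaches any real device.
    if action.get("device") not in ALLOWED_DEVICES or action.get("command") is None:
        return {"handler": "fallback", "reply": "Sorry, I cannot do that safely."}
    return {"handler": "llm", "action": action}

print(handle("turn on the kitchen light", intent="turn_on_light", confidence=0.93))
```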
Limitations. The aim of this systematic literature review is to provide a comprehensive framework for understanding how conversational agents have been employed to control IoT devices in intelligent environments, analysed through an HCI perspective. It should be noted that the validity of this study may be influenced and limited by different factors. Specifically, the relevant studies were gathered from digital libraries (as detailed in Section 2.4), which curate the most representative conference papers and journal articles for our research purposes. However, these libraries, while valuable, are not exhaustive. This aspect could potentially affect the comprehensiveness of our research findings.

REFERENCES
[1] Sakib Ahmed, Debashish Paul, Rubaiya Masnun, Minhaz Uddin Ahmed Shanto, and Tanjila Farah. 2020. Smart Home Shield and Automation System Using Facebook Messenger Chatbot. In 2020 IEEE Region 10 Symposium (TENSYMP), 1791–1794. DOI:https://doi.org/10.1109/TENSYMP50017.2020.9230716
[2] Anbarasan and Jeannie S.A. Lee. 2018. Speech and Gestures for Smart-Home Control and Interaction for Older Adults. In Proceedings of the 3rd International Workshop on Multimedia for Personal Health and Health Care (HealthMedia'18), Association for Computing Machinery, New York, NY, USA, 49–57. DOI:https://doi.org/10.1145/3264996.3265002
[3] Carmelo Ardito, Giuseppe Desolda, Rosa Lanzilotti, Alessio Malizia, and Maristella Matera. 2020. Analysing trade-offs in frameworks for the design of smart environments. Behaviour & Information Technology 39, 1 (January 2020), 47–71. DOI:https://doi.org/10.1080/0144929X.2019.1634760
[4] Parvaneh Asghari, Amir Masoud Rahmani, and Hamid Haj Seyyed Javadi. 2019. Internet of Things applications: A systematic review. Computer Networks 148 (January 2019), 241–261. DOI:https://doi.org/10.1016/j.comnet.2018.12.008
[5] Zahra Ashktorab, Mohit Jain, Q. Vera Liao, and Justin D. Weisz. 2019. Resilient Chatbots: Repair Strategy Preferences for Conversational Breakdowns. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, ACM, Glasgow, Scotland, UK, 1–12. DOI:https://doi.org/10.1145/3290605.3300484
[6] Cyril Joe Baby, Faizan Ayyub Khan, and J. N. Swathi. 2017. Home automation using IoT and a chatbot using natural language processing. In 2017 Innovations in Power and Advanced Computing Technologies (i-PACT), 1–6. DOI:https://doi.org/10.1109/IPACT.2017.8245185
[7] Cyril Joe Baby, Nalin Munshi, Ankit Malik, Kunal Dogra, and R. Rajesh. 2017. Home automation using web application and speech recognition. In 2017 International Conference on Microelectronic Devices, Circuits and Systems (ICMDCS), 1–6. DOI:https://doi.org/10.1109/ICMDCS.2017.8211543
[8] Sara Bardi and Claudio Enrico Palazzi. 2022. Smart Hydroponic Greenhouse: Internet of Things and Soilless Farming. In Conference on Information Technology for Social Good, ACM, Limassol, Cyprus, 212–217. DOI:https://doi.org/10.1145/3524458.3547221
[9] Barbara Rita Barricelli, Daniela Fogli, Letizia Iemmolo, and Angela Locoro. 2022. A Multi-Modal Approach to Creating Routines for Smart Speakers. In Proceedings of the 2022 International Conference on Advanced Visual Interfaces (AVI 2022), Association for Computing Machinery, New York, NY, USA. DOI:https://doi.org/10.1145/3531073.3531168
[10] Parabattina Bhagath, Samanvi Parisa, Sasi Dinesh Reddy, and Fareeda Banu. 2021. An Android based Mobile Spoken Dialog System for Telugu language to control Smart appliances. In 2021 IEEE XXVIII International Conference on Electronics, Electrical Engineering and Computing (INTERCON), 1–4. DOI:https://doi.org/10.1109/INTERCON52678.2021.9532783
[11] Shreya Bhutada, Akshata Singh, Kaushiki Upadhyaya, and Purvika Gaikar. 2021. Ru-Urb IoT-AI powered Healthcare Kit. In 2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS), 417–422. DOI:https://doi.org/10.1109/ICICCS51141.2021.9432257
[12] Simone Borsci, Alessio Malizia, Martin Schmettow, Frank van der Velde, Gunay Tariverdiyeva, Divyaa Balaji, and Alan Chamberlain. 2022. The Chatbot Usability Scale: the Design and Pilot of a Usability Scale for Interaction with AI-Based Conversational Agents. Pers Ubiquit Comput 26, 1 (February 2022), 95–119. DOI:https://doi.org/10.1007/s00779-021-01582-9
[13] Ha-Duong Bui and Nak Young Chong. 2018. An Integrated Approach to Human-Robot-Smart Environment Interaction Interface for Ambient Assisted Living. In 2018 IEEE Workshop on Advanced Robotics and its Social Impacts (ARSO), 32–37. DOI:https://doi.org/10.1109/ARSO.2018.8625821
[14] Julio Cabrera, María Mena, Ana Parra, and Eduardo Pinos. 2016. Intelligent assistant to control home power network. In 2016 IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC), 1–6. DOI:https://doi.org/10.1109/ROPEC.2016.7830531
[15] Giovanni Campagna, Rakesh Ramesh, Silei Xu, Michael Fischer, and Monica S. Lam. 2017. Almond: The Architecture of an Open, Crowdsourced, Privacy-Preserving, Programmable Virtual Assistant. In Proceedings of the 26th International Conference on World Wide Web (WWW '17), International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 341–350. DOI:https://doi.org/10.1145/3038912.3052562
[16] Jeffrey S. Chavis, Malcom Doster, Michelle Feng, Syeda Zeeshan, Samantha Fu, Elizabeth Aguirre, Antonio Davila, Kofi Nyarko, Aaron Kunz, Tracy Herriotts, Daniel Syed, Lanier Watkins, Anna Buczak, and Aviel Rubin. 2021. A Voice Assistant for IoT Cybersecurity. In 2021 IEEE Integrated STEM Education Conference (ISEC), 165–172. DOI:https://doi.org/10.1109/ISEC52395.2021.9764005
[17] Hongshen Chen, Xiaorui Liu, Dawei Yin, and Jiliang Tang. 2017. A Survey on Dialogue Systems: Recent Advances and New Frontiers. SIGKDD Explor. Newsl. 19, 2 (November 2017), 25–35. DOI:https://doi.org/10.1145/3166054.3166058
[18] Miguel Angel Contreras-Castañeda, Juan Antonio Holgado-Terriza, Gonzalo Pomboza-Junez, Patricia Paderewski-Rodríguez, and Francisco Luis Gutiérrez-Vela. 2019. Smart Home: Multimodal Interaction for Control of Home Devices. In Proceedings of the XX International Conference on Human Computer Interaction, ACM, Donostia Gipuzkoa Spain, 1–8. DOI:https://doi.org/10.1145/3335595.3335636
[19] Fulvio Corno, Luigi De Russis, and Alberto Monge Roffarello. 2020. HeyTAP: Bridging the Gaps Between Users' Needs and Technology in IF-THEN Rules via Conversation. In Proceedings of the International Conference on Advanced Visual Interfaces, ACM, Salerno Italy, 1–9. DOI:https://doi.org/10.1145/3399715.3399905
[20] Fulvio Corno, Luigi De Russis, and Alberto Monge Roffarello. 2021. From Users' Intentions to IF-THEN Rules in the Internet of Things. ACM Trans. Inf. Syst. 39, 4 (August 2021). DOI:https://doi.org/10.1145/3447264
[21] Benjamin R. Cowan, Nadia Pantidi, David Coyle, Kellie Morrissey, Peter Clarke, Sara Al-Shehri, David Earley, and Natasha Bandeira. 2017. "What can I help you with?": infrequent users' experiences of intelligent personal assistants. In Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services, ACM, Vienna Austria, 1–12. DOI:https://doi.org/10.1145/3098279.3098539
[22] Hanif Fakhrurroja, Ahmad Musnansyah, Muhammad Dewan Satriakamal, Bima Kusuma Wardana, Rizal Kusuma Putra, and Dita Pramesti. 2022. Dialogue System based on Reinforcement Learning in Smart Home Application. In Proceedings of the 2022 International Conference on Computer, Control, Informatics and Its Applications, ACM, Virtual Event Indonesia, 146–152. DOI:https://doi.org/10.1145/3575882.3575911
[23] Renato Ferrero, Mohammad Ghazi Vakili, Edoardo Giusto, Mauro Guerrera, and Vincenzo Randazzo. 2019. Ubiquitous fridge with natural language interaction. In 2019 IEEE International Conference on RFID Technology and Applications (RFID-TA), 404–409. DOI:https://doi.org/10.1109/RFID-TA.2019.8892025
[24] Mina Foosherian, Samuel Kernan Freire, Evangelos Niforatos, Karl A. Hribernik, and Klaus-Dieter Thoben. 2022. Break, Repair, Learn, Break Less: Investigating User Preferences for Assignment of Divergent Phrasing Learning Burden in Human-Agent Interaction to Minimize Conversational Breakdowns. In Proceedings of the 21st International Conference on Mobile and Ubiquitous Multimedia, ACM, Lisbon Portugal, 151–158. DOI:https://doi.org/10.1145/3568444.3568454
[25] Anna Förster and Julian Block. 2022. User Adoption of Smart Home Systems. In Conference on Information Technology for Social Good, ACM, Limassol Cyprus, 360–365. DOI:https://doi.org/10.1145/3524458.3547118
[26] Simone Gallo, Alessio Malizia, and Fabio Paternò. Towards a Chatbot for Creating Trigger-Action Rules based on ChatGPT and Rasa. In Workshops, Work in Progress Demos and Doctoral Consortium at IS-EUD 2023, CEUR-WS.org, Cagliari. Retrieved from https://ceur-ws.org/Vol-3408/short-s4-01.pdf
[27] Simone Gallo and Fabio Paternò. 2022. A Conversational Agent for Creating Flexible Daily Automation. In Proceedings of the 2022 International Conference on Advanced Visual Interfaces (AVI 2022), Association for Computing Machinery, New York, NY, USA. DOI:https://doi.org/10.1145/3531073.3531090
[28] Alexandru Florin Gavril, Mihai Trascau, and Irina Mocanu. 2017. Multimodal Interface for Ambient Assisted Living. In 2017 21st International Conference on Control Systems and Computer Science (CSCS), 223–230. DOI:https://doi.org/10.1109/CSCS.2017.38
[29] Steven Guamán, Adrián Calvopiña, Pamela Orta, Freddy Tapia, and Sang Guun Yoo. 2018. Device Control System for a Smart Home using Voice Commands: A Practical Case. In Proceedings of the 2018 10th International Conference on Information Management and Engineering, ACM, Salford United Kingdom, 86–89. DOI:https://doi.org/10.1145/3285957.3285977
[30] Reza Gunawan, Ichan Taufik, Edi Mulyana, Opik T Kurahman, Muhammad Ali Ramdhani, and Mahmud. 2019. Chatbot Application on Internet Of Things (IoT) to Support Smart Urban Agriculture. In 2019 IEEE 5th International Conference on Wireless and Telematics (ICWT), 1–6. DOI:https://doi.org/10.1109/ICWT47785.2019.8978223
[31] Kate S. Hone and Robert Graham. 2000. Towards a tool for the Subjective Assessment of Speech System Interfaces (SASSI). Nat. Lang. Eng. 6, 3 & 4 (September 2000), 287–303. DOI:https://doi.org/10.1017/S1351324900002497
[32] Cheng-Chi Huang, Alan Liu, and Pei-Chuan Zhou. 2015. Using Ontology Reasoning in Building a Simple and Effective Dialog System for a Smart Home System. In 2015 IEEE International Conference on Systems, Man, and Cybernetics, 1508–1513. DOI:https://doi.org/10.1109/SMC.2015.267
[33] Husain Husain, Herlinda Herlinda, Hasriani Hasriani, Ahyuna Ahyuna, Kasmawaru Kasmawaru, and Ahmad Ahmad. 2021. Increasing the Smart Home Automation by using Facebook Messenger Application. In 2021 3rd International Conference on Cybernetics and Intelligent System (ICORIS), 1–6. DOI:https://doi.org/10.1109/ICORIS52787.2021.9649442
[34] Giancarlo Iannizzotto, Lucia Lo Bello, Andrea Nucita, and Giorgio Mario Grasso. 2018. A Vision and Speech Enabled, Customizable, Virtual Assistant for Smart Environments. In 2018 11th International Conference on Human System Interaction (HSI), 50–56. DOI:https://doi.org/10.1109/HSI.2018.8431232
[35] Andrej Ilievski, Dimitri Dojchinovski, and Marjan Gusev. 2019. Interactive Voice Assisted Home Healthcare Systems. In Proceedings of the 9th Balkan Conference on Informatics (BCI'19), Association for Computing Machinery, New York, NY, USA. DOI:https://doi.org/10.1145/3351556.3351572
[36] Marco Jahn, Marc Jentsch, Christian R. Prause, Ferry Pramudianto, Amro Al-Akkad, and Rene Reiners. 2010. The Energy Aware Smart Home. In 2010 5th International Conference on Future Information Technology, IEEE, Busan, Korea (South), 1–8. DOI:https://doi.org/10.1109/FUTURETECH.2010.5482712
[37] Akhil Jain, Poonam Tanwar, and Stuti Mehra. 2019. Home Automation System using Internet of Things (IOT). In 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), 300–305. DOI:https://doi.org/10.1109/COMITCon.2019.8862201
[38] Farzeem D. Jivani, Manohar Malvankar, and Radha Shankarmani. 2018. A Voice Controlled Smart Home Solution With a Centralized Management Framework Implemented Using AI and NLP. In 2018 International Conference on Current Trends towards Converging Technologies (ICCTCT), 1–5. DOI:https://doi.org/10.1109/ICCTCT.2018.8550972
[39] Daniel Jurafsky and James H. Martin. Chatbots & Dialogue Systems. In Speech and Language Processing. An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (Third Edition draft). Stanford University. Retrieved from https://web.stanford.edu/~jurafsky/slp3/
[40] Charbel El Kaed, Andre Ponnouradjane, and Dhaval Shah. 2018. A Semantic Based Multi-Platform IoT Integration Approach from Sensors to Chatbots. In 2018 Global Internet of Things Summit (GIoTS), 1–6. DOI:https://doi.org/10.1109/GIOTS.2018.8534520
[41] Yao-Chiang Kan, Hsueh-Chun Lin, Han-Yu Wu, and Junghsi Lee. 2020. LoRa-Based Air Quality Monitoring System Using ChatBot. In 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 1561–1565. Retrieved from https://ieeexplore.ieee.org/document/9306463
[42] Runchang Kang, Anhong Guo, Gierad Laput, Yang Li, and Xiang "Anthony" Chen. 2019. Minuet: Multimodal Interaction with an Internet of Things. In Symposium on Spatial User Interaction (SUI '19), Association for Computing Machinery, New York, NY, USA. DOI:https://doi.org/10.1145/3357251.3357581
[43] Maksym Ketsmur, António Teixeira, Nuno Almeida, Samuel Silva, and Mário Rodrigues. 2018. Conversational Assistant for an Accessible Smart Home: Proof-of-Concept for Portuguese. In Proceedings of the 8th International Conference on Software Development and Technologies for Enhancing Accessibility and Fighting Info-Exclusion (DSAI 2018), Association for Computing Machinery, New York, NY, USA, 55–62. DOI:https://doi.org/10.1145/3218585.3218594
[44] Lachaux Killian, Maitre Julien, Bouchard Kevin, Lussier Maxime, Bottari Carolina, Couture Mélanie, Bier Nathalie, Giroux Sylvain, and Gaboury Sebastien. 2021. Fall Prevention and Detection in Smart Homes Using Monocular Cameras and an Interactive Social Robot. In Proceedings of the Conference on Information Technology for Social Good, ACM, Roma Italy, 7–12. DOI:https://doi.org/10.1145/3462203.3475892
[45] Sanghoon Kim and In-Young Ko. 2022. A Conversational Approach for Modifying Service Mashups in IoT Environments. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (CHI '22), Association for Computing Machinery, New York, NY, USA. DOI:https://doi.org/10.1145/3491102.3517655
[46] Barbara Kitchenham and Stuart Charters. 2007. Guidelines for performing Systematic Literature Reviews in Software Engineering. 2, (January 2007).
[47] Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, and Yusuke Iwasawa. 2023. Large Language Models are Zero-Shot Reasoners. Retrieved March 14, 2023 from http://arxiv.org/abs/2205.11916
[48] Satender Kumar, Bipin Kumar, Kamlesh Sharma, Rohan Raj, and Suresh Kumar. 2021. IoT Based Secured Home Automation System Using NLP. In 2021 International Conference on Advancements in Electrical, Electronics, Communication, Computing and Automation (ICAECA), 1–5. DOI:https://doi.org/10.1109/ICAECA52838.2021.9675570
[49] Lee Hoi Leong, Shinsuke Kobayashi, Noboru Koshizuka, and Ken Sakamura. 2005. CASIS: A Context-Aware Speech Interface System. In Proceedings of the 10th International Conference on Intelligent User Interfaces (IUI '05), Association for Computing Machinery, New York, NY, USA, 231–238. DOI:https://doi.org/10.1145/1040830.1040880
[50] Shi Liu, Shahrier Erfan Harun, Florian Jasche, and Thomas Ludwig. 2021. Supporting the Onboarding of 3D Printers through Conversational Agents. In Mensch Und Computer 2021 (MuC '21), Association for Computing Machinery, New York, NY, USA, 494–498. DOI:https://doi.org/10.1145/3473856.3474010
[51] Ewa Luger and Abigail Sellen. 2016. "Like Having a Really Bad PA": The Gulf between User Expectation and Experience of Conversational Agents. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, ACM, San Jose California USA, 5286–5297. DOI:https://doi.org/10.1145/2858036.2858288
[52] Mithil K M, Khatri Bimal Mukesh Kumar, Lalit Sharma, Mohammed Zaki Sayeed Pasha, and Kallinath H D. 2018. An Interactive Voice Controlled Humanoid Smart Home Prototype Using Concepts of Natural Language Processing and Machine Learning. In 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), 1537–1546. DOI:https://doi.org/10.1109/RTEICT42901.2018.9012359
[53] Sinarwati Mohamad Suhaili, Naomie Salim, and Mohamad Nazim Jambli. 2021. Service chatbots: A systematic review. Expert Systems with Applications 184, (December 2021), 115461. DOI:https://doi.org/10.1016/j.eswa.2021.115461
[54] Amr El Mougy, Ahmed Khalaf, Hazem El Agaty, Mariam Mazen, Noureldin Saleh, and Mina Samir. 2017. Xenia: Secure and interoperable smart home system with user pattern recognition. In 2017 International Conference on Internet of Things, Embedded Systems and Communications (IINTEC), 47–52. DOI:https://doi.org/10.1109/IINTEC.2017.8325912
[55] Muhamad Muslih, Somantri, Dedi Supardi, Elpid Multipi, Yusup Maulana Nyaman, Aditya Rismawan, and Gunawansyah. 2018. Developing Smart Workspace Based IOT with Artificial Intelligence Using Telegram Chatbot. In 2018 International Conference on Computing, Engineering, and Design (ICCED), 230–234. DOI:https://doi.org/10.1109/ICCED.2018.00052
[56] Mahmoud Nasr, Fakhri Karray, and Yuri Quintana. 2020. Human Machine Interaction Platform for Home Care Support System. In 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 4210–4215. DOI:https://doi.org/10.1109/SMC42975.2020.9283095
[57] Trung Nguyen, Barth Lakshmanan, Chengjie Lin, Weihua Sheng, Ye Gu, Meiqin Liu, and Senlin Zhang. 2017. A Miniature Smart Home Testbed for Research and Education. In 2017 IEEE 7th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), 1637–1642. DOI:https://doi.org/10.1109/CYBER.2017.8446621
[58] Chinedu Wilfred Okonkwo and Abejide Ade-Ibijola. 2021. Chatbots applications in education: A systematic review. Computers and Education: Artificial Intelligence 2, (2021), 100033. DOI:https://doi.org/10.1016/j.caeai.2021.100033
[59] Bauyrzhan Ospan, Nawaz Khan, Juan Augusto, Mario Quinde, and Kenzhegali Nurgaliyev. 2018. Context Aware Virtual Assistant with Case-Based Conflict Resolution in Multi-User Smart Home Environment. In 2018 International Conference on Computing and Network Communications (CoCoNet), 36–44. DOI:https://doi.org/10.1109/CoCoNet.2018.8476898
[60] Christina Oumard, Julian Kreimeier, and Timo Götzelmann. 2022. Implementation and Evaluation of a Voice User Interface with Offline Speech Processing for People Who Are Blind or Visually Impaired. In Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments (PETRA '22), Association for Computing Machinery, New York, NY, USA, 277–285. DOI:https://doi.org/10.1145/3529190.3529197
[61] Biswamoy Pattnaik, Sreya Dey, and B Jaganatha Pandian. 2021. A Secure and Interactive Home Automation System with Machine Learning Based Power Prediction. In 2021 Innovations in Power and Advanced Computing Technologies (i-PACT), IEEE, Kuala Lumpur, Malaysia, 1–6. DOI:https://doi.org/10.1109/i-PACT52855.2021.9696603
[62] Rohit Raj and Nandini Rai. 2018. Voice Controlled Cyber-Physical System for Smart Home. In Proceedings of the Workshop Program of the 19th International Conference on Distributed Computing and Networking (Workshops ICDCN '18), Association for Computing Machinery, New York, NY, USA. DOI:https://doi.org/10.1145/3170521.3170550
[63] Amon Rapp, Lorenzo Curti, and Arianna Boldi. 2021. The human side of human-chatbot interaction: A systematic literature review of ten years of research on text-based chatbots. International Journal of Human-Computer Studies 151, (July 2021), 102630. DOI:https://doi.org/10.1016/j.ijhcs.2021.102630
[64] Eugenio Rubio-Drosdov, Daniel Díaz-Sánchez, Florina Almenárez, Patricia Arias-Cabarcos, and Andrés Marín. 2017. Seamless human-device interaction in the internet of things. IEEE Transactions on Consumer Electronics 63, 4 (November 2017), 490–498. DOI:https://doi.org/10.1109/TCE.2017.015076
[65] Sanket Salvi, V Geetha, and S Sowmya Kamath. 2019. Jamura: A Conversational Smart Home Assistant Built on Telegram and Google Dialogflow. In TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON), 1564–1571. DOI:https://doi.org/10.1109/TENCON.2019.8929316
[66] SPSS Sirinayake, Dkak Dasanayake, Tmlu Rodrigo, Wuy Perera, MDJT Hansika Mahaadikara, and Surath Kahandawala. 2021. IOT-Based Intelligent Assistant Mirror For Smart Life & Daily Routine Using Raspberry PI. In 2021 21st International Conference on Advances in ICT for Emerging Regions (ICter), 1–6. DOI:https://doi.org/10.1109/ICter53630.2021.9774788
[67] Evropi Stefanidi, Maria Korozi, Asterios Leonidis, and Margherita Antona. 2018. Programming Intelligent Environments in Natural Language: An Extensible Interactive Approach. In Proceedings of the 11th PErvasive Technologies Related to Assistive Environments Conference (PETRA '18), Association for Computing Machinery, New York, NY, USA, 50–57. DOI:https://doi.org/10.1145/3197768.3197776
[68] Chwan-Lu Tseng, Che-Shen Cheng, Yu-Hsien Hsu, and Bing-Hung Yang. 2018. An IoT-Based Home Automation System Using Wi-Fi Wireless Sensor Networks. In 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2430–2435. DOI:https://doi.org/10.1109/SMC.2018.00417
[69] Lionel Sujay Vailshery. Number of Internet of Things (IoT) connected devices worldwide from 2019 to 2021, with forecasts from 2022 to 2030. Statista. Retrieved from https://www.statista.com/statistics/1183457/iot-connected-devices-worldwide/
[70] Sai Vemprala, Rogerio Bonatti, Arthur Bucker, and Ashish Kapoor. 2023. ChatGPT for Robotics: Design Principles and Model Abilities. Microsoft. Retrieved from https://www.microsoft.com/en-us/research/publication/chatgpt-for-robotics-design-principles-and-model-abilities/
[71] Tianwei Xing, Luis Garcia, Federico Cerutti, Lance Kaplan, Alun Preece, and Mani Srivastava. 2021. DeepSQA: Understanding Sensor Data via Question Answering. In Proceedings of the International Conference on Internet-of-Things Design and Implementation (IoTDI '21), Association for Computing Machinery, New York, NY, USA, 106–118. DOI:https://doi.org/10.1145/3450268.3453529
[72] Lu Xu, Leslie Sanders, Kay Li, and James C L Chow. 2021. Chatbot for Health Care and Oncology Applications Using Artificial Intelligence and Machine Learning: Systematic Review. JMIR Cancer 7, 4 (November 2021), e27850. DOI:https://doi.org/10.2196/27850
[73] Zhipeng Xu, Hao Wu, Xu Chen, Yongmei Wang, and Zhenyu Yue. 2022. Building a Natural Language Query and Control Interface for IoT Platforms. IEEE Access 10, (2022), 68655–68668. DOI:https://doi.org/10.1109/ACCESS.2022.3186760
A APPENDICES
A.1 List of Selected Papers
Authors | Title | Year | Venue
Gunawan et al. | Chatbot Application on Internet Of Things (IoT) to Support Smart Urban Agriculture | 2019 | 2019 IEEE 5th International Conference on Wireless and Telematics (ICWT)
Huang et al. | Using Ontology Reasoning in Building a Simple and Effective Dialog System for a Smart Home System | 2015 | 2015 IEEE International Conference on Systems, Man, and Cybernetics
Husain et al. | Increasing the Smart Home Automation by using Facebook Messenger Application | 2021 | 2021 3rd International Conference on Cybernetics and Intelligent System (ICORIS)
Iannizzotto et al. | A Vision and Speech Enabled, Customizable, Virtual Assistant for Smart Environments | 2018 | 2018 11th International Conference on Human System Interaction (HSI)
Ilievski et al. | Interactive Voice Assisted Home Healthcare Systems | 2019 | BCI'19: Proceedings of the 9th Balkan Conference on Informatics
Jain et al. | Home Automation System using Internet of Things (IOT) | 2019 | 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon)
Jivani et al. | A Voice Controlled Smart Home Solution With a Centralized Management Framework Implemented Using AI and NLP | 2018 | 2018 International Conference on Current Trends towards Converging Technologies (ICCTCT)
Kaed et al. | A Semantic Based Multi-Platform IoT Integration Approach from Sensors to Chatbots | 2018 | 2018 Global Internet of Things Summit (GIoTS)
Kan et al. | LoRa-Based Air Quality Monitoring System Using ChatBot | 2020 | 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
Kang et al. | Minuet: Multimodal Interaction with an Internet of Things | 2019 | SUI '19: Symposium on Spatial User Interaction
Ketsmur et al. | Conversational Assistant for an Accessible Smart Home | 2018 | DSAI 2018: Proceedings of the 8th International Conference on Software Development and Technologies for Enhancing Accessibility and Fighting Info-exclusion
Kim and Ko | A Conversational Approach for Modifying Service Mashups in IoT Environments | 2022 | CHI '22: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems
Kumar et al. | IoT Based Secured Home Automation System Using NLP | 2021 | 2021 International Conference on Advancements in Electrical, Electronics, Communication, Computing and Automation (ICAECA)
Leong et al. | CASIS: A Context-Aware Speech Interface System | 2005 | IUI '05: Proceedings of the 10th International Conference on Intelligent User Interfaces
Liu et al. | Supporting the Onboarding of 3D Printers through Conversational Agents | 2021 | MuC '21: Proceedings of Mensch und Computer 2021
Mithil et al. | An Interactive Voice Controlled Humanoid Smart Home Prototype Using Concepts of Natural Language Processing and Machine Learning | 2018 | 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT)
Mougy et al. | Xenia: Secure and interoperable smart home system with user pattern recognition | 2017 | 2017 International Conference on Internet of Things, Embedded Systems and Communications (IINTEC)
Muslih et al. | Developing Smart Workspace Based IOT with Artificial Intelligence Using Telegram Chatbot | 2018 | 2018 International Conference on Computing, Engineering, and Design (ICCED)
Nasr et al. | Human Machine Interaction Platform for Home Care Support System | 2020 | 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
Nguyen et al. | A Miniature Smart Home Testbed for Research and Education | 2017 | 2017 IEEE 7th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER)
Ospan et al. | Context Aware Virtual Assistant with Case-Based Conflict Resolution in Multi-User Smart Home Environment | 2018 | 2018 International Conference on Computing and Network Communications (CoCoNet)
Oumard et al. | Implementation and Evaluation of a Voice User Interface with Offline Speech Processing for People Who Are Blind or Visually Impaired | 2022 | PETRA '22: Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments
Pattnaik et al. | A Secure and Interactive Home Automation System with Machine Learning Based Power Prediction | 2021 | 2021 Innovations in Power and Advanced Computing Technologies (i-PACT)
Raj and Rai | Voice Controlled Cyber-Physical System for Smart Home | 2018 | Workshops ICDCN '18: Proceedings of the Workshop Program of the 19th International Conference on Distributed Computing and Networking
Rubio et al. | Seamless human-device interaction in the internet of things | 2017 | IEEE Transactions on Consumer Electronics (Volume 63, Issue 4, November 2017)
Salvi et al. | Jamura: A Conversational Smart Home Assistant Built on Telegram and Google Dialogflow | 2019 | TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON)
Sirinayake et al. | IOT-Based Intelligent Assistant Mirror For Smart Life & Daily Routine Using Raspberry PI | 2021 | 2021 21st International Conference on Advances in ICT for Emerging Regions (ICter)
Stefanidi et al. | Programming Intelligent Environments in Natural Language: An Extensible Interactive Approach | 2018 | PETRA '18: Proceedings of the 11th PErvasive Technologies Related to Assistive Environments Conference
Tseng et al. | An IoT-Based Home Automation System Using Wi-Fi Wireless Sensor Networks | 2018 | 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC)