Document

Session: Watches and Small Devices CHI 2014, One of a CHInd, Toronto, ON, Canada
Duet: Exploring Joint Interactions

on a Smart Phone and a Smart Watch
Xiang ‘Anthony’ Chen12, Tovi Grossman1, Daniel Wigdor3, George Fitzmaurice1
1 2 3
User Interface Group HCI Institute Department of Computer Science
Autodesk Research Carnegie Mellon University University of Toronto
{firstname.lastname}@autodesk.com [email protected] [email protected]
Figure 1. A duet of interaction between a handheld and a wrist worn device (a): the watch is used as a tool palette when annotating
text on the phone (b); a simultaneous pinch-to-close swipe gesture on both devices mute their notifications (c); the watch’s
orientation indicates which hand part causes a touch, thus enabling a seamless transition between modes: for example, writing with
the pad of the finger (d), scrolling with side of the finger (e), and text selection with the knuckle (f).
ABSTRACT INTRODUCTION
The emergence of smart devices (e.g., smart watches and Interactive computing technology is becoming increasingly
smart eyewear) is redefining mobile interaction from the ubiquitous. Advances in processing, sensing, and displays
solo performance of a smart phone, to a symphony of have enabled devices that fit into our palms and pockets
multiple devices. In this paper, we present Duet – an (e.g., [2, 15]), that are wrist-worn [27, 40] or head-mounted
interactive system that explores a design space of [20, 29], or that are embedded as smart clothing [28, 37].
interactions between a smart phone and a smart watch. Commercialization is rapidly catching up with the research
Based on the devices’ spatial configurations, Duet community’s vision of mobile and ubiquitous form factors:
coordinates their motion and touch input, and extends their smart phones, smart watches, and smart eyewear are all
visual and tactile output to one another. This transforms the available for purchase. Soon, many of us may carry not one
watch into an active element that enhances a wide range of smart device, but two, three, or even more on a daily basis.
phone-based interactive tasks, and enables a new class of
For interaction designers, this introduces a new opportunity
multi-device gestures and sensing techniques. A technical
to leverage the availability of these devices to create new
evaluation shows the accuracy of these gestures and sensing
interactions beyond the usage of a single device alone. At
techniques, and a subjective study on Duet provides
present, the space of interaction techniques making use of
insights, observations, and guidance for future work.
this opportunity is underexplored, primarily focusing on
Author Keywords using a secondary mobile device such as a smart watch as a
Duet, joint interaction, smart phone, smart watch. viewport and remote control of the smart phone [37]. To the
ACM Classification Keywords best of our knowledge, we are unaware of any existing
H.5.2 [User Interfaces]: Input devices and strategies, work that takes a different approach in designing a class of
Interaction styles. joint interactions on two smart mobile devices.
Permission to make digital or hard copies of all or part of this work for To address this limit, our research envisions a symphony of
personal or classroom use is granted without fee provided that copies are interaction between multiple smart mobile devices. To
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. Copyrights for
approximate this vision, we start by considering a scenario
components of this work owned by others than ACM must be honored. of two smart mobile devices as a joint interactive platform.
Abstracting with credit is permitted. To copy otherwise, or republish, to Specifically, our goal is to explore various ways wherein
post on servers or to redistribute to lists, requires prior specific permission these two devices can perform their individual input and
and/or a fee. Request permissions from [email protected].
CHI 2014, April 26 - May 01 2014, Toronto, ON, Canada
output techniques to create new interaction possibilities for
Copyright 2014 ACM 978-1-4503-2473-1/14/04…$15.00. the users. To begin to realize this vision, we built Duet – an
http://dx.doi.org/10.1145/2556288.2556955 interactive system that enables joint interactions between a
159
smart phone and a smart watch (Figure 1a). Inspired by Touch is more difficult on wrist-worn devices, which
research on conversational linguistics [9], and prior HCI typically have small screens, exacerbating the fat finger
work on ‘foreground-background’ interactions [5, 14], we problem [46]. Zoomboard used iterative zooming to ease
explore a design space of interaction between the phone and target acquisition [36]. Facet utilized a multi-segment wrist-
the watch. Based on their spatial configurations, Duet worn device, which allows touch to span multiple
coordinates the two devices’ motion and touch input, and connected screens, yielding a richer input vocabulary [28].
extends their visual and tactile output to one another, thus
Given the limitation of direct-touch on wrist-worn devices,
enabling the watch to enhance a wide range of phone-based
the exploration of alternate techniques becomes more
interactive tasks. For example, we can divide an interface
important. Motion and spatial awareness creates a variety
between the phone and the watch, such as reserving the
of wrist-based interaction. GestureWrist and GesturePad
phone for a canvas while hosting a tool palette on the watch
use wrist-mounted accelerometer and capacitive sensors to
(Figure 1b). We can create novel gestures, such as a cross-
recognize hand grips and pointing directions [42]. Past
device pinch-to-close gesture to simultaneously mute both
research also focuses on wrist rotation and tilting [8, 38,
devices (Figure 1c). We can also use a watch’s sensors to
46], most of which was implemented using a mobile phone
augment touch, such as using its orientation to infer which
held in the hand. In contrast to handhelds, a wrist-worn
finger part (pad, side or knuckle) is touching the phone’s
device is a potentially better solution: its wearability
screen (Figure 1d-f).
untethers the users’ hands from any sensing devices, allows
In the following sections, we first review techniques of the sensing to be always available with one’s day-to-day
handheld, wrist-worn and device-to-device interaction. activities, and provides high-fidelity data by closely
Next we present our design space that encompasses the coupling the device to one’ hand, wrist and arm movement.
interaction between the phone and the watch based on their
A watch can also enable freehand gestures beyond its
‘foreground-background’ interactional relationships. We
surface. A disappearing mobile device [35] can be mounted
then introduce a suite of gestures and sensing techniques
on the wrist and interacted with by ‘scanning’ fingers on
enabled by this joint interactive platform – and a technical
top of it. Abracadabra enables spatial input with a small
evaluation of their recognition accuracy. We then
wrist-worn device by using the magnetometer to sense
demonstrate Duet’s various application scenarios enabled
finger-mounted magnets [12]. Gesture Watch [23] and
by these gestures and techniques, and report users’
AirTouch [26] use multiple infrared sensors mounted on the
reactions and feedback to inform future work.
back of the wrist to detect freehand gestures, such as hand
RELATED WORK swiping along the forearm. Instead of using the other hand
We first review prior work that explores individual to perform gestural input, earlier work also explores using
interaction techniques developed for handheld and wrist- wrist-mounted contact microphones to detect fingertip
worn devices. We further summarize various device-to- gestures [1]. Similarly, Digits reconstruct real-time 3D hand
device interactions that demonstrate examples of using models by instrumenting sensors on the inner side of the
multiple devices to create new interaction possibilities. wrist and facing the camera towards the palm [22].
Interaction Techniques for Handheld Devices Our review shows a plethora of interaction techniques
Touch is perhaps the most common input method for developed for both handheld and wrist-worn devices
modern handheld devices. Motion and spatial awareness, individually. Yet few have considered marrying their
enabled by a device’s on-board sensors, can also be techniques to design for scenarios where a user is carrying
leveraged to enhance touch-based interaction [15]. Past and using both devices. To better understand this issue, we
research has demonstrated interaction by orienting [10], review prior work on device-to-device interaction.
positioning [51], tilting [40], or whacking [19] a device. To
go beyond the device’s physical boundaries, others also Device-to-Device Interaction
explored interacting with the device using freehand gesture Device-to-device interaction associates multiple individual
[4,21]. All these techniques, under Buxton’s framework, devices to create new interaction possibilities. We
fall into the ‘foreground interaction’ category [5]. summarize three association principles from the literature:
Meanwhile, for ‘background interaction’, context-awareness, synchrony, proxemic interactions, and distributed gestures.
such as location, has long been proven useful in various mobile Synchrony associates devices by the synchronization of
interaction scenarios [44]. Altogether, this work collects a their inputs. Pick-and-drop synchronizes pen input across
toolbox of possible techniques for handheld devices. multiple computers to enable direct content manipulation
Interaction Techniques for Wrist-worn Devices between them [41]. Smart-Its Friends senses a handshake as
The above interaction techniques for handhelds can also be a natural way to establish connections between smart
found on wrist-worn devices. However, wrist worn devices artifacts [18]. Synchronous gestures detect the bumping
have an even smaller form factor and are worn on our between two tablets, thus allowing interactions such as
bodies, which demands a reconsideration of the techniques spanning and sharing a photo across two screens [17].
as well as explorations into new interaction possibilities. Stitching applies a similar idea where a pen stroke across
160
two tablets can be used to, for instance, transfer files [16]. Buxton defines foreground interaction as “activities which
Similar techniques were also used in Lucero et al’s work are in the fore of human consciousness – intentional
where a pinching gesture between mobile devices spans a activities” [5]. Hinckley et al. develops the definition of
shared canvas across them [27]. Siftables proposes background interaction as “sensing an action that the user
synchronized interactions with multiple networked tangible would have had to perform anyway to accomplish their
interfaces, such as bumping all devices at once to swap in a task” [14]. In the past, these frameworks have been
new set of data associations [34]. focusing on the context of a single device. Our design space
extends this framework to a scenario when two mobile
Proxemic Interaction associates devices by their spatial
devices are present, guiding the design of interactions
relationship (e.g., proximity and orientation) between one
between them.
another. The Relate system built customized sensors into
USB dongles, thus allowing peer-to-peer computation of As shown in Table 1, the combination of foreground and
devices’ spatial relationship, and a set of spatial widgets to background interactions, when two devices are present,
incorporate such relationship into the user interface [24]. A creates a 2x2 design space encompassing a variety of
spatial proximity region around mobile devices can be used interactions that leverage the availability of both devices.
to mediate content access and sharing among a group of Current commercial designs have been focusing on the
users [25]. Gradual engagement applies a similar idea to lower-left quadrant, where the watch is used as a temporal
facilitate different levels of information exchange as a replacement for the phone, such as using the watch to check
function of device-to-device proximity [31]. new emails or read text messages when the phone is not
ready to hand [37]. The lower-right quadrant characterizes
Distributed Interactions divides the tasks, features, or
work that uses both devices for context and activity sensing
functions of an interface between multiple devices.
[7, 31]. Less work has been done in the two upper
Roomware envisions a room of inter-connected smart
quadrants where the phone remains in the foreground as an
artifacts that augment people’s individual or collaborative
active input and output platform, and the watch transitions
tasks [47]. ARC-Pad divides cursor positioning task into
between foreground interaction (as an input device and
absolute pointing on a mobile device and relative
extended display) and background sensing. Duet is a new
adjustment on a large display [33]. A cross-device
system that focuses on and explores these two areas of the
interaction style [45] designs interaction between a mobile
design space.
device and a large interactive surface, such as selecting
from a list of tools on the mobile and applying that tool in Watch Foreground Watch Background
an application on the surface.
Phone Foreground
Duet: Duet:
While this work shows the potential of certain device-to- • Phone as a primary input and • Phone as a primary input and
device interactions, we are unaware of any existing research output platform; output platform;
• Watch as an input device or • Watch as a sensor.
that has explored the opportunities of using the phone and extended display.
the watch together. We see an immense potential to explore
this opportunity of combining a smart phone and a smart
Phone Background
Current commercial designs: Prior research:

watch to enhance our everyday mobile interactions. • Phone as an inactivated • Both phone and watch used for
information portal [37] context and activity sensing [7,
DESIGN SPACE
• Watch as a viewport or remote 31].
The fundamental idea of our design space is to allow the control [37]
phone and the watch, in various ways, to perform their
individual input and output techniques, and together to
Table 1. A design space of interaction on a smart phone and a
create new interaction possibilities for the users. We smart watch based on Buxton’s framework [5].
construct a design space (Table 1) based on Falk’s research
on conversational linguistics [9], and Buxton’s [5] and IMPLEMENTATION DETAILS
Hinckley et al.’s [14] ‘foreground-background’ frameworks. To explore these areas of the design space, we built an
interactive platform on a smart phone and a smart watch.
In constructing this design space, our goal is for the two
devices to carry out interactive tasks for the user as a single Hardware and System Setup
unified platform. This is similar to what Falk observed and We used a Samsung Galaxy S4 smart phone and a Sony
described in her paper ‘The conversational duet’: “In SmartWatch. The phone has a 1080x1920 capacitive multi-
conversations between three or more persons, two of them touch screen, quad-core 1.6 GHz processor, and a 3-axis
may undertake jointly to carry out the communicative task accelerometer. The watch has a 128x128 pixels capacitive-
to a third in such a way that a written version of their touch color display, and is connected to and run by the
resultant in-sequence text would be indistinguishable from phone via Bluetooth. The API of the watch provides limited
that of a single speaker.” [9] touch input of seven pre-defined gestures: press, long press,
release, and four swiping directions. Its accelerometer has a
To attain this goal, the two devices should both perform maximum rate of 10Hz.
their individual foreground or background interactions.
161
Spatial Configuration Multi-device gestures: We adapted the stitching technique

To fully explore the design space, we consider two possible [16] and the pinching gestures for interaction between the
ways in which the devices can be worn or carried in relation phone and the watch, to support four novel multi-device
to one another. Figure 2 shows these two spatial gestures (Figure 3b). The first two involve the finger
configurations: with the face of the watch worn on the swiping from the phone to the watch (phone-to-watch
dorsal side (a), or on the ventral side of the wrist (Figure swipe) or from the watch to the phone (watch-to-phone
2b). These two spatial configurations afford different ways swipe). The second two gestures are performed by two
in which the watch can augment phone-based interactions. fingers swiping simultaneously, where the fingers move
In particular, wearing the watch on the ventral side provides towards each other (pinch-close) or away from each other
additional visibility and quicker access to the watch while (pinch-open) (Figure 3b).
holding and using the phone. The current spatial
Watch in the Background, Phone in the Foreground
configuration can be detected by constantly monitoring the
In the background of interaction, the watch can be used as
two devices’ relative orientation, or by using a motion-
an auxiliary sensor for the phone.
based gesture to explicitly register that information (we use
the latter approach). While not necessary, we found the use Flip and tap: To perform this gesture, a user flips her hand
of an expansion watchband can enable easy switching (that wears the watch) immediately before tapping on the
between these two configurations (Figure 2c). phone’s screen (Figure 3c).
Hold and flip: Inspired by DoubleFlip [43], this gesture
consists of flipping the phone while holding the finger
down on the screen. By detecting the synchronized motion
of both devices, we also distinguish if the user is flipping
the phone with the hand wearing the watch (Figure 3d).
Finger posture recognition: By sensing the watch’s
orientation when touching the phone, we can tell which
finger part (pad, side, or knuckle) causes that touch (Figure
3e).
Handedness recognition: By correlating the watch’s motion
Figure 2. Devices’ two spatial configurations: watch worn on with the phone’s motion at the onset of a touch event, we
the a) dorsal side or (b) ventral side of the wrist. c) An infer whether a touch is caused by a bare hand, or a hand
expansion watchband allows easy switching between the two. that wears a watch (Figure 3f).
GESTURES AND SENSING TECHNIQUES
The availability of a smart phone and a smart watch gives
rise to a suite of new gestures and sensing techniques
(Figure 3). Below, we describe the gestures and sensing
techniques we explored, in the context of our design space
(Table 1). Specifically, we developed two categories of
techniques, where the watch is either in the foreground and
background of interaction. This section provides a brief
overview of these gestures and sensing techniques and a
description of our gesture recognition methods. Later we
describe the Duet system to demonstrate interactive
scenarios and applications that utilize such techniques.
Watch in the Foreground, Phone in the Foreground
In the foreground of interaction, the watch can be used as
an additional input device to complement the phone’s
motion and touch input. We implemented two gestures that
utilize this additional input channel:
Double bump: Bumping the phone on the watch creates a
synchronous gesture [17] that provides distinct input Figure 3. The core gesture and sensing techniques on a joint
properties compared to bumping on other surfaces. To interactive platform of a phone and a watch: a) Double bump,
reduce the chance of false positives, we implemented a b) Multi-device gestures (four different swipes: pinch-open,
pinch-close, watch-to-phone swipe, and phone-to-watch swipe,
double bump gesture, where the phone bumps against the
c) Flip and tap, d) Hold and flip, e) Finger posture recognition
watch twice in succession (Figure 3a). (the pad, side or knuckle of the finger), and f) Handedness
recognition (left- vs. right- hand touch).
162
Gesture Recognition technique was used for training. In total, the evaluation
We used machine learning techniques for implementing our produced 12 participants × 15 conditions (across the 6
recognition system. For motion-related input, our general techniques) × 4 blocks × 10 trials per block = 7200 data
approach is to segment a chunk of accelerometer data (an points.
array of X/Y/Z) pertinent to a particular gesture. We then
flatten the data into a table of features – each axis value at a All techniques except for Multi-device gestures used
given time point is considered a feature. Using these machine learning based recognition. The results for Multi-
features we can train a machine learning model (we use a device gestures will be discussed after our analysis on the
Decision Tree – one of the most widely applied machine first five techniques.
learning techniques) to recognize the gesture. These Ten-Fold Cross Validation
recognizers are used only if the watch-wearing hand is We conducted a conventional ten-fold cross validation
detected during the onset of a touch (using the using all the data from each technique. As shown in Table
aforementioned handedness recognition); otherwise a 2, all techniques achieved an accuracy of over 97% except
default interaction is applied. for Double bump (93.87%). This result gives us a basic
TECHNICAL EVALUATION
assessment where the interaction data from a group of users
To understand the feasibility of our designs, we tested the is known a priori, and a model can be trained and fine-
recognition accuracy of the six described techniques tuned to a particular group of users. To challenge our
(Figure 3). In the evaluation, participants wore the watch on techniques in more realistic scenarios, we conducted two
the dorsal side of their left wrist (Figure 2a) except for the further evaluations and analyses.
Multi-device gestures, which require the watch to be moved Per User Classifiers
to the ventral side (Figure 2b). It is important to know how the features perform at a per
user level [13]. For each technique, we separated the data
The accuracy of each technique was tested independently.
between the participants, and ran a ten-fold cross validation
Each technique recognizes several conditions,
within the data of each participant. As shown in Table 2,
corresponding to the number of classes in the machine
the features are indicative for each technique for specific
learning model (e.g., finger posture recognition involves
users (accuracy > 90% for all techniques). However, the
three: pad, side and knuckle, Figure 3e). Some techniques
results also show some users were inconsistent in
(e.g., flip and tap) included a baseline condition (e.g., a
performing the techniques, especially Double bump (SD
standard tap without any hand flipping) to test false
5.34%) and Hold and flip (SD 11.24%). These two
positives. The conditions for each technique were as
techniques, by nature, are more complicated than the others,
follows, with the number of conditions in parentheses:
and demand clearer instructions and perhaps a larger set of
Double bump (2): Users either performed a hold and training data.
double bump, or a hold without the double bump. Finger
Multi-device gestures (4): Users performed the four multi- Double Flip and Hold and Handedness
posture
bump tap flip recognition
device gestures: pinch-open, pinch-close, phone-to-watch recognition
swipe, and watch-to-phone swipe. 10-fold
93.87% 97.90% 97.56% 99.06% 99.34%
Flip and tap (2): Users either performed a flip and tap, or cross val.
performed a standard tap without first flipping the hand. Per user 92.10% 95.92% 90.11% 97.33% 97.95%
Hold and flip (2): Users either performed a hold and flip, or classifiers (5.34%) (2.89%) (11.24%) (1.92%) (0.80%)
a hold without the flip. General 88.33% 94.38% 85.29% 98.23% 93.33%
Finger posture recognition (3): Users tapped the phone classifiers (9.89%) (9.91%) (10.90%) (2.64%) (9.07%)
with either the pad of the finger, side of the finger, or
knuckle. Table 2. Accuracy (SD in parentheses) of our gestures and
Handedness recognition (2): Users tapped the phone with sensing techniques: ten-fold cross validation, per user
classifiers, and general classifiers.
either the left (watch wearing) hand or the right (bare) hand.
Pinch to open Pinch to close Phone to watch Watch to phone
Participants first learned to perform each technique
condition by watching a demonstration by the experimenter. 97.69% 98.61% 95.83% 96.76%
(5.67%) (2.32%) (3.83%) (3.25%)
In the trials, participants were presented with visual cues
instructing them to perform each condition. Table 3. Accuracy (SD in parentheses) for Multi-device
Twelve participants (five male, seven female, ages 18 to 34, gestures.
two left-handed) completed our study. Each participant General Classifiers
performed five blocks of the six techniques, with the order It is also important to know how the features can be
of techniques counter-balanced using a Latin-square design. generalized to new users whose data has not been used for
In each block, participants repeated 10 trials for each training [13]. To simulate this scenario, we separated out
condition of a given technique. The first block for each one participant’s data as a test set (new user), and the
163
AUXILIARY SENSOR
others’ aggregated as a training set (existing users). For App Selection and Arrangement
each technique, we repeated this process 12 times (i.e., all Four app icons are displayed on the home screen. The user
the combinations from the 12 users). We then calculated the can touch an icon to open an app, or use a knuckle-touch to
average and the standard deviations of the accuracy. As move the icons (Figure 5ab). Contrary to existing designs,
shown in Table 2, the results indicate that for most this requires no extra steps to distinguish between opening
techniques, there was some inconsistency between and navigating the apps, and repositioning their icons.
participants (SD between 9.00% and 11.00% except for App Selection Shortcut
EXTENDED DISPLAY | INPUT DEVICE
Handedness recognition). As a result, their performance A person can also use the watch to quickly switch between
dropped compared to the previous two analyses. A solution apps. Pressing and holding the watch brings up an app
to mitigate this problem is using some online learning selection screen on the watch, which displays the app icons
mechanisms that dynamically incorporate a new user’s data in a 2x2 grid (Figure 5c). Additional app icons would be
into the existing model. organized on pages that a user would swipe between.
Multi-Device Gestures Tapping on an app loads it on the phone, and pressing and
The multi-device gestures were recognized using hard holding on the app selection screen dismisses it.
coded heuristics, based on the gesture length, duration, and
timing. The results of our evaluation (Table 3) show a
fairly high accuracy of our implementation.
DUET: AN EXPLORATION OF JOINT INTERACTIONS
We now introduce the Duet system, which demonstrates
how the novel gestures and sensing techniques we have
described could be utilized to enhance a wide range of Figure 5. On the home screen, a, b) A knuckle-touch
interactive tasks across various applications. Duet is an repositions app icons on the phone. c) The watch can be used
to switch between apps on the phone.
interactive system that explores the joint interactions
between a smart phone and smart watch. The system can be Email
thought of a smart phone shell that is enhanced by the The Email app provides techniques to support reading and
watch. The shell consists of a home screen and four organizing a list of emails.
common mobile apps. The interactions we present are List Management
AUXILIARY SENSOR
meant to explore the areas of interest within our design Tapping an email opens it; while a knuckle-touch can be
space (Table 1). In particular, we demonstrate how the used to select and apply actions to multiple emails, such as
watch can perform foreground interactions as an input ‘archive’, ‘mark as read’ or ‘delete’. This technique
device or extended display, or serve in the background as an requires no extra widgets (e.g., checkboxes) for selection,
auxiliary sensor. Meanwhile, the phone remains in the thus saving more screen space for the other interactions.
foreground, whose interaction is enhanced by these three EXTENDED DISPLAY
different roles of the watch (as labeled in the headings of Notification Management
each interaction subsection below). In social occasions like meetings and movies, a person can
use the multi-device gestures to manage which device(s)
Home Screen email notifications are received on. A pinch-to-close mutes
The Home Screen provides techniques for managing the both devices simultaneously (Figure 1c). A pinch-to-open
device and its applications (apps). resumes their notifications (Figure 6a). A stitching gesture
Hold and Flip to Unlock
AUXILIARY SENSOR from the phone to the watch directs all the notifications to
To unlock the device from an inactive mode, a user the watch (Figure 6b). The opposite direction pushes all the
performs the hold and flip gesture (Figure 4). This gesture notifications to be shown on the phone (Figure 6c). We also
requires a synchronized motion of both the phone and the use tactile feedback to inform a gesture’s direction, e.g.,
watch, thus reducing recognizer false positives. Optionally, when swiping from the phone to the watch, a user can feel
one can use it as an additional security layer that requires two vibrations – first on the phone, then on the watch, as if
the ownership of both devices in order to gain access. a single vibration was ‘transferred’ across the devices. This
technique provides a way to customize notifications on
multiple devices without resorting to extra physical buttons
or UI elements.
Figure 4. Hold and flip unlocks the phone and registers the
devices’ spatial configuration (in this case, the watch is worn
on the dorsal side of the wrist, also see Figure 2).
Figure 6. In Email, Multi-device gestures are used to manage

new email notifications on both devices.
164
Map Reader
The Map app enhances a user’s search and navigation task. The Reader app allows users to read and annotate text.
AUXILIARY SENSOR AUXILIARY SENSOR
One-Handed Zoom Menu Access
Existing map apps only partially support one-handed use – A normal tap on the page brings up the menu with basic and
a common scenario that happens when, for instance, the frequently used options (Figure 10a). Alternatively, with
user is holding a coffee. While zooming in can sometimes the watch as a sensor, one can use the flip and tap gesture to
be accomplished with a double tap, it is difficult to zoom display an advanced menu that contains additional
out with a single hand. With Duet, we use the double bump commands (Figure 10bc).
as a gestural shortcut for zooming out (Figure 7). AUXILIARY SENSOR
Implicit Tool Selection
We use the finger postures recognition to implicitly select
tools in the reader app. For example, after selecting a pen
tool from the menu, the finger pad is used to annotate the
text (Figure 1d), the side of the finger to scroll the page
(Figure 1e), and the knuckle to start text selection (Figure
1f). This allows for a seamless transition between three
Figure 7. In a one-handed scenario, double bumping the phone
on the watch creates a gestural shortcut to zoom out the map. frequent operations without having to explicitly specify any
EXTENDED DISPLAY | INPUT DEVICE
modes.
Multi-Device Target Selection
Another difficult task on a map app is selecting tiny and
cluttered location markers (Figure 8a). Inspired by Shift
[49], we design a mechanism for using the watch to
facilitate small target acquisition on the phone. To start, the
user thumbs down an area of interest (Figure 8b), which is
then zoomed in on the watch (Figure 8bc). The user can
select an enlarged target on the watch (Figure 8d), or swipe
to pan and adjust the zoomed-in area (Figure 8e). Releasing
the thumb brings the user back to the map navigation on the
phone. The watch assists users in selecting multiple small
targets without invoking widgets that take up screen space.
Figure 10. A tap accesses a basic menu (a), while a flip-and-tap

accesses an advanced menu in the Reader app (bc).
EXTENDED DISPLAY
Multi-Device Clipboard
In addition to the phone’s default copy and paste functions,
the watch can also be used as a clipboard that holds
multiple pieces of text. Upon selecting the text (Figure
11a), the text will be displayed on the watch. Users can then
add it to the clipboard by swiping right on the watch
(Figure 11b). One can also retrieve an earlier selection by
Figure 8. Using the watch to zoom in and select small map swiping down the clipboard (Figure 11cd).
location markers on the phone.
INPUT DEVICE
Toggle View Modes
The watch can also provide shortcuts for interactions on the
phone. Swiping on the watch’s screen is used as a shortcut
to switch between normal and satellite views (Figure 9). In
general, such simple interactions with the watch can replace
tedious menu selections typically carried out on the phone.
Figure 11. Using the watch as an additional way to copy and
paste from a clipboard of text. Selected text (a) is added to the
clipboard with a swipe to the right (b), which then displays all
the selected text (c). Swiping down goes through the text (c).
EXTENDED DISPLAY
Multi-Device Tool Palette
By positioning the watch to the ventral side of the wrist, the
Figure 9. Swiping on the watch’s screen (b) switches map watch can be used as a tool palette (Figure 1b). Swiping on
views – in this case, from normal (a) to satellite (c). the watch shows more menu options. Hosting the tool
palette on the watch saves screen space on the phone for
165
other interactions. It also resembles how a painter holds and First, users liked how adding the watch can create
uses a color palette while drawing on a canvas. lightweight interaction, which might otherwise be
cumbersome on the phone alone. For example, swipe to
Call
The call app shows an exemplar interaction using the watch switch map views was considered as a “handy” feature (P5),
to retrieve information while holding the phone for a call. “better compared to [the] traditional way of doing it” (P1),
and “reduce interaction steps” (P3).
EXTENDED DISPLAY
Information Retrieval
In this situation, back-of-device touch [3,50] might be a Second, people liked using the watch as an extended
useful input solution. To enable a quick exploration of this display. For example, P3 liked how using the knuckle to
idea, we flipped the phone and turned its front screen into a select emails dispenses with UI widgets and “increases
back-of-device touch area (Figure 12a). This proof-of- screen space”, P2 commented that flip and tap to bring up
concept prototype allows a person in a phone call to retrieve the advanced menu “saves screen real-estate”, and P8 liked
information that can be displayed on the watch. Users can how a tool palette on the watch “saves screen space” for the
navigate between a list of frequently used apps, by swiping text in a reader.
up and down. Once the desired app is located (Figure 12b),
details can be retrieved. For example, swiping left/right on
the Email app goes through the inbox where the watch
shows one email (sender and subject line) at a time (Figure
12c). Tapping on an email opens it; and the user can read
the email by scrolling through its text; another tap closes
the email. This technique works for cases where the caller
needs to quickly access information on the phone, such as
recent emails, missed calls, or calendar events.
Figure 13. Subjective rankings show an overall positive

reaction to Duet’s interaction techniques.
Third, users enjoyed having a set of novel gestures enabled
Figure 12. Enabling basic app access on the watch while using by using the watch as an auxiliary sensor. For example,
making a phone call. most participants liked the possibility of using the watch to
USER FEEDBACK ON DUET sense the pad, side and knuckle of the fingers. P2 and P5
We gathered feedback of people using the Duet system. Our further recommended designs that allow users to customize
intention was not to carry out a task-based quantitative what each hand part does in the applications.
study, but rather to collect and learn from some initial user Last, users found it most compelling when the phone and
reaction and comments on our techniques. the watch complemented one another and created
Participants interaction that went beyond their capabilities as
We recruited 10 participants, five male, five female, ages individuals. For example, P2 considered accessing apps on
21 to 27. All participants were smart phone users. Six the watch while on a phone call a feature he “can’t think of
participants were students from three different local any other [better] way to do it”. The watch in this scenario
universities and the others were young professionals. fit in a ‘niche’ wherein phone-based interaction fell short.
Some participants commented that a tool pallet on the
Procedure
watch resonated with their experience of using a physical
We demonstrated all the Duet interactions to the
toolbox (P5), pencil box (P6), and color pallet (P10): the
participants, and asked them to comment on their easiness
combination of both devices did not just enable a certain
(how easy is it to perform the interaction) and usefulness
function, but also created a unique and pleasant experience
(how is an interaction useful for some applications). The
when using that function.
participants could also try out Duet by themselves and think
aloud while exploring. The entire study took approximately On the other hand, participants also pointed out issues and
60 minutes. concerns with Duet. Foremost, participants often felt less
Results enthusiastic when a Duet technique did not show significant
In general, participants gave positive feedback to Duet. The improvement from what existing single-device designs
subjective rankings for each technique are shown in Figure could already achieve. For example, P7 thought hold and
13. Below we summarize some of our key observations. flip is good for some “niche applications” (e.g., banking),
but would be “overkill” if using it to replace the existing
locking/unlocking mechanisms.
166
People also gave mixed reviews for some of the watch- our future work can learn from this ‘phrasing’ concept to
enhanced gestures. For example, both P8 and P9 pointed extend our work to multi-device symphonic interaction. We
out that knuckle-touch could create screen occlusion and can rethink how we can phrase the interaction between and
felt hard when dragging for precise positioning (e.g., by multiple devices into a fluid stream of action. Our multi-
arranging apps on the home screen); however, they liked device target selection technique (Figure 8) has set foot in
using it for email selection, as this interaction required less this exploration. As shown in Figure 14, this technique
precision and felt easier when performed with the knuckle. starts from the phone with a ‘touch and hold on the map’.
Finally, many participants noted the small display and This leads to a ‘showing touched area’ on the watch, and
touch area of the watch. leaves room for ‘touch to adjust or select targets’. A touch
release on the phone brings an end to this technique,
We also received valuable suggestions for additional
dismissing map display on the watch, and leaving selected
features from the participants. Users suggested different
targets, if there is any, highlighted on the phone. All these
mappings between techniques and applications, e.g., using
‘components’ of the techniques are phrased together, ‘glued’
multi-device gestures to copy text from the phone to the
by the muscle tension of thumb that holds down on the map
watch (P6), or as another way to authenticate the ownership
[6]. By thinking in terms of phrasing in musical
of both devices (P7). A number of participants suggested
communication, we can explore more ways of designing
including a ‘fall back’ option for situations where the user
interaction that spans multiple smart devices.
misplaces either device (e.g., how to unlock the phone
when not having the watch for hold and flip). This further
suggests a design challenge: how can we allow the users to
transition to a multi-device paradigm and interface with
them as if they were a unified interactive platform? Figure 14. Phrasing between the phone and the watch in a
DISCUSSION AND FUTURE WORK target selection task on a map.
We discuss issues and questions to inform future work. From Duet to Symphony. Our paper focuses on a duet of
Recognition robustness. Despite the promising accuracy interaction between a smart phone and a smart watch. In the
levels shown in the technical evaluation, it should be noted future, it would be interesting to consider how our design
that the study was performed in a controlled lab space and the interactions could be extended to not just a
environment. As such, there will likely be conditions where duet of devices, but perhaps a trio, a quartet, and eventually
recognition rates may not perform as well. For example, our towards a symphony of devices and interaction [44].
handedness detection (Figure 3d) is based on the CONCLUSION
assumption that, when wearing a watch, a touch down event Soon mobile interaction will no longer be the solo
will cause synchronized movement between the watch and performance of the smart phone, but will rather be a
the phone. However, a touch might only incur subtle finger symphony of a growing family of smart devices. Our Duet
motion, without detectable movement of the watch or the system reveals a design space of joint interaction between
phone (false negatives); a bare hand’s touch might also two smart devices and illustrates underexplored areas where
coincide with the devices’ movement, thus resembling a the phone remains in the foreground of interaction, and the
touch caused by a watch-wearing hand (false positives). watch is used to enhance a wide range of phone-based
Our future work will explore software and hardware interactive tasks. Our technical evaluation demonstrates the
solution for mitigating this problem. accuracy of the new gestures and sensing techniques used
Watch wearing. Some of the Duet techniques require by Duet, and a subjective study on the Duet system
wearing the watch on the ventral side of the wrist to keep it provides insights, observations, and guidance for future
readily visible and accessible (e.g., Figure 1bc). Although work towards a symphony of interaction.
an elastic watchband greatly eases the switching between REFERENCES
the two configurations (Figure 2c), a user still needs to 1. Amento, B., Hill, W., and Terveen, L. The sound of one
perform the switch. Our future work will explore alternate hand. CHI '02, 724-725.
input/output modalities on the form factor of a watch, e.g., 2. Ballagas, R., Borchers, J., Rohs, M., and Sheridan, J.G.
extending the design solution of Facet [28]. The Smart Phone. IEEE Pervasive Computing 5, 1
Exploring the ‘phrasing’ of Duet. Musical communication (2006), 70–77.
research found that musicians use phrasing to structure 3. Baudisch, P. and Chu, G. Back-of-device interaction
their duet performance [11]. In particular, phrasing allows allows creating very small touch devices. CHI ‘09,
musicians to communicate with one another by delimiting 1923–1932.
their duet performance into temporal frames through which 4. Butler, A., Izadi, S., and Hodges, S. SideSight.
musicians anticipate each other’s musical actions while UIST ’08, 201–204.
finding room for their own musical expression. Similar to 5. Buxton, W. Integrating the periphery and context: A
how Buxton articulates this concept in gesture design [6], new taxonomy of telematics. GI ‘95, 239–246.
167
6. Buxton, W.A.S. Chunking and phrasing and the design 26. Lee, S.C., Li, B., and Starner, T. AirTouch. ISWC ’11,
of human-computer dialogues. IFIP ‘86, 494–499. 3–10.
7. Chen, G., and Kotz, D., A survey of context-aware 27. Lucero, A., Keränen, J., and Korhonen, H. Collaborative
mobile computing research. Technical Report TR2000- use of mobile phones for brainstorming. MobileHCI '10,
381, Dept. of Computer Science, Dartmouth College, 337.
2000. 28. Lyons, K., Nguyen, D., Ashbrook, D., and White, S.
8. Crossan, A., Williamson, J., Brewster, S., and Murray- Facet. UIST ’12, 123–130.
Smith, R. Wrist rotation for interaction in mobile 29. Mann, S. Smart clothing. CACM 39, 8 (1996), 23–24.
contexts. MobileHCI ’08, 435–438. 30. Mann, S. `WearCam’ (The Wearable Camera). ISWC
9. Falk, J. The conversational duet. Proceedings of the ’98, 124–131.
Annual Meeting of the Berkeley Linguistics Society, 31. Marquardt, N., Ballendat, T., Boring, S., Greenberg, S.,
2011. and Hinckley, K. Gradual engagement. ITS ’12, 31–40.
10. Fitzmaurice, G.W. Situated information spaces and 32. Maurer, U., Rowe, A., Smailagic, A., and Siewiorek,
spatially aware palmtop computers. CACM 36, 7 (1993), D.P. eWatch. BSN’06, 142–145.
39–49. 33. McCallum, D.C. and Irani, P. ARC-Pad. UIST ’09, 153.
11. Gratier, M. Grounding in musical interaction: Evidence 34. Merrill, D., Kalanithi, J., and Maes, P. Siftables.
from jazz performances. Musicae Scientiae 12, 1 Suppl TEI ’07, 75–78.
(2008), 71–110. 35. Ni, T. and Baudisch, P. Disappearing mobile devices.
12. Harrison, C. and Hudson, S.E. Abracadabra. UIST ’09, UIST ’09, 101–110.
121–124. 36. Oney, S., Harrison, C., Ogan, A., and Wiese, J.
13. Harrison, C., Schwarz, J., and Hudson, S.E. TapSense. ZoomBoard. CHI ’13, 2799–2803.
UIST ’11, 627–634. 37. Pebble. Pebble E-Paper Watch. http://getpebble.com/.
14. Hinckley, K., Pierce, J., Horvitz, E., and Sinclair, M. 38. Post, E.R. and Orth, M. Smart fabric, or “wearable
Foreground and background interaction with sensor- clothing.” ISWC ’97, 167–168.
enhanced mobile devices. TOCHI ’12, 1 (2005), 31–52. 39. Rahman, M., Gustafson, S., Irani, P., and Subramanian,
15. Hinckley, K., Pierce, J., Sinclair, M., and Horvitz, E. S. Tilt techniques. CHI ’09, 1943–1952.
Sensing techniques for mobile interaction. CHI ‘00, 91– 40. Rekimoto, J. Tilting operations for small screen
100. interfaces. UIST ’96, 167–168.
16. Hinckley, K., Ramos, G., Guimbretiere, F., Baudisch, 41. Rekimoto, J. Pick-and-drop. UIST ’97, 31–39.
P., and Smith, M. Stitching. AVI ’04, 23–30. 42. Rekimoto, J. GestureWrist and GesturePad. ISWC ’01,
17. Hinckley, K. Synchronous gestures for multiple persons 21–27.
and computers. UIST ’03, 149–158. 43. Ruiz, J. and Li, Y. DoubleFlip. CHI ’11, 2717–2720.
18. Holmquist, L.E., Mattern, F., Schiele, B., Alahuhta, P., 44. Santosa, S. and Wigdor. D.. A field study of multi-
Beigl, M., and Gellersen, H. Smart-Its Friends. Ubicomp device workflows in distributed workspaces. UbiComp
‘01, 116–122. ’13. 63-72.
19. Hudson, S.E., Harrison, C., Harrison, B.L., and 45. Schilit, B., Adams, N., and Want, R. Context-Aware
LaMarca, A. Whack gestures. TEI ’10, 109–103. Computing Applications. First Workshop on Mobile
20. Ishiguro, Y., Mujibiya, A., Miyaki, T., and Rekimoto, J. Computing Systems and Applications, 85–90.
Aided eyes. AH ’10, 1–7. 46. Schmidt, D., Seifert, J., Rukzio, E., and Gellersen, H. A
21. Jones, B., Sodhi, R., Forsyth, D., Bailey, B., and cross-device interaction style for mobiles and surfaces.
Maciocci, G. Around device interaction for multiscale DIS ’12, 318–327.
navigation. MobileHCI ’12, 83–92. 47. Siek, K.A., Rogers, Y., and Connelly, K.H. Fat finger
22. Kim, D., Hilliges, O., Izadi, S., et al. Digits. UIST ’12, worries INTERACT ’05, 267–280.
167–176. 48. Streitz, N.A., Konomi, S., and Burkhardt, H.-J.,
23. Kim, J., He, J., Lyons, K., and Starner, T. The Gesture Roomware for cooperative buildings. Cooperative
Watch. ISWC ’07, 1–8. Buildings: Integrating Information, Organization, and
24. Kortuem, G., Kray, C., and Gellersen, H. Sensing and Architecture, 1998. 4-21.
visualizing spatial relations of mobile devices. 49. Strohmeier, P., Vertegaal, R., and Girouard, A. With a
UIST ’05, 93. flick of the wrist. TEI ’12, 307–308.
25. Kray, C., Rohs, M., Hook, J., and Kratz, S. Group 50. Vogel, D. and Baudisch, P. Shift. CHI ’07, 657–666.
coordination and negotiation through spatial proximity 51. Wigdor, D., Forlines, C., Baudisch, P., Barnwell, J., and
regions around mobile devices on augmented tabletops. Shen, C. Lucid touch. UIST ’07, 269–278.
3rd IEEE International Workshop on Horizontal 52. Yee, K.-P. Peephole displays. CHI ’03, 1–9.
Interactive Human Computer Systems, 1–8.
168

Document

Uploaded by

Copyright:

Available Formats

Document

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Document

Uploaded by

Copyright:

Available Formats

Session: Watches and Small Devices CHI 2014, One of a CHInd, Toronto, ON, Canada

Duet: Exploring Joint Interactions

Current commercial designs: Prior research:

Spatial Configuration Multi-device gestures: We adapted the stitching technique

Figure 6. In Email, Multi-device gestures are used to manage

Figure 10. A tap accesses a basic menu (a), while a flip-and-tap

Figure 13. Subjective rankings show an overall positive

You might also like