Abstract—This paper presents the design and implementation of a multi-touch gesture interface framework for collaborative urban planning. The user gestures required for supporting collaborative urban planning have been defined by analysing a well-established urban planning environment previously developed by the authors. Although urban planning has been considered as the application context, the overall multi-touch gesture interface framework has been designed and implemented as a layered architecture to support any high-level application. This layered gesture architecture comprises a multi-touch raw data layer, a basic gesture layer and an application-specific gesture layer.

Keywords: Multi Touch; NUI; gesture design; gesture framework

I. INTRODUCTION

The term natural user interface (NUI) is used to refer to interfaces which allow the user to interact with a system based on the knowledge learnt from using other systems [1]. Typically, a NUI can be aided by technologies allowing users to carry out natural motions, movements or gestures to control the computer application or manipulate on-screen content. A multi-touch device is one of the technologies that have emerged recently and is widely used in creating NUIs for interactive computer applications.

Multi-touch devices enable users to directly interact with information on screens using their fingers as input devices. This provides users with a stronger feeling of control over their interactions, rather than being controlled by the system [2]. Multi-touch interaction is becoming common as a type of user interface for many software applications running on mobile devices such as tablets and smart phones. Furthermore, due to its popularity and ease of use, the multi-touch input method opens up possibilities for a broader range of applications where users need to participate and collaborate together to explore alternative solutions and build consensus.

This research explores how a multi-touch interface framework can be developed to support intuitive user interaction during collaborative urban planning discussions. Since many stakeholders with varying levels of computing knowledge are involved in collaborative urban regeneration projects, this research explores the development of a multi-touch interface that is based on natural gestures, requiring little training for exploring urban spaces and proposed designs.

This research builds on the authors' previous research on the COllaborative Planning Environment (COPE), which exploited the power of PowerWall technology [3] for stakeholder engagement in urban planning. The interaction and team collaboration within COPE is currently supported through a gamepad interface. This paper presents the authors' approach for enhancing the current user interaction capability by exploiting the power of natural hand gestures. Furthermore, it presents a generic multi-touch based natural user interface framework that provides a range of high-level gestures that can be used to develop interactive information exploration in applications such as urban planning.

II. REVIEW

During recent years, there have been many research and development efforts towards creating multi-touch natural user interfaces to enhance user interaction and collaboration. However, designing and implementing a multi-touch natural user interface is still a challenging task. In this research the authors' main concerns are the selection of a set of natural gestures appropriate for urban planning tasks and the development of a multi-touch gesture framework that can easily be deployed in supporting geo-spatial applications such as urban planning.

The design of multi-touch gesture sets has been investigated by several researchers, such as Ringel et al. and Wu et al. [4, 5]. Hinrichs et al. observed many different gesture instances in a field study and categorised them into a group of low-level actions: drag/move, enlarge/shrink, rotate, tap, sweep, flick, and hold [6]. Existing multi-touch based systems, such as Apple iOS and Android, also influence gesture set design. Due to their increasing popularity, certain gestures have become widely known by the general public. Other related work, such as GestureWorks [7], has also influenced gesture sets and their possible meanings. Although the selection of a gesture set mainly depends on the application requirements, all of these efforts have played a significant role in shaping the nature of multi-touch based applications and the design features of multi-touch based user interfaces.

Some multi-touch gesture frameworks have been developed to provide a development platform for various applications. Three of the most popular multi-touch gesture frameworks are: an open source framework – Tangible User Interface Objects (TUIO), the Windows 7 SDK from Microsoft [8], and the PQ Labs multi-touch SDK (http://multi-touch-screen.com/sdk.html). TUIO is an open framework that defines a common protocol and API for tangible multi-touch surfaces. The TUIO protocol allows the transmission of an abstract description of interactive
urban planning environment. The following section explains the final gestures designed for the application.

1) Navigation

a) Exo-centric navigation

In the exo-centric navigation mode, typical navigation functions include move, zoom and rotate (Table I). The gestures for these operations are defined as moving fingers in parallel, towards or apart from each other, or rotating around a point, respectively. In this design, the number of fingers to be used is not fixed, which means the user can use any number of fingers to perform these operations. However, the zoom and rotate operations cannot be performed with only one finger; they require a minimum of two fingers (Fig. 1)¹.

¹ Some of the gesture figures are taken from GestureWorks [7]. When a full hand is used in a gesture in these figures, it indicates that any number of fingers can be used. Otherwise, it indicates the exact number of fingers used to express a gesture.

Although the forwards and backwards tilting of the map is a rotational operation, a different gesture is required to allow the user to rotate the normal of the map away from the screen. For this special action, when the user pins down the map using one finger, a vertical scroll bar is prompted on the screen. Moving another finger within this bar tilts the map (Fig. 2).

Figure 3. Example of move and zoom gestures happening at the same time

b) Ego-centric navigation

In ego-centric navigation mode, only "driving/walking" functions are required. This function is mainly used to simulate someone who walks (moves and rotates) along a street. In this mode, the gestures are designed to use a single touch to perform both the navigation and the rotation. When a finger is placed on the screen, the first touch point is used as the reference point of navigation. The finger then moves around the reference point to control the navigation (Fig. 4). The distance between the current touch point and the reference point indicates the moving velocity. The angle between the "up vector" and the vector from the reference point to the touch point defines the rotation factor.

Figure 4. Ego-centric navigation control
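As an illustration of the exo-centric gestures described above, the move, zoom and rotate parameters can be recovered from two touch points by comparing consecutive frames: the centroid displacement gives the pan, the change in finger separation gives the zoom factor, and the change in orientation of the vector joining the fingers gives the rotation. The following C++ sketch shows this idea only; the `Touch` and `ExoCentricDelta` types are illustrative assumptions, not the framework's actual API.

```cpp
// Minimal sketch of two-finger exo-centric pan/zoom/rotate recognition.
// Illustrative only: types are assumptions, not taken from the paper.
#include <cmath>
#include <cstdio>

struct Touch { double x, y; };          // one touch point in screen coordinates

struct ExoCentricDelta {
    double panX, panY;                  // centroid displacement (pixels)
    double zoom;                        // scale factor, >1 means fingers split apart
    double rotate;                      // rotation about the centroid (radians)
};

// Compare the previous and current positions of two touches.
ExoCentricDelta recognise(const Touch& p0, const Touch& p1,
                          const Touch& c0, const Touch& c1)
{
    ExoCentricDelta d;
    // Pan: movement of the midpoint between the two fingers.
    d.panX = (c0.x + c1.x) / 2.0 - (p0.x + p1.x) / 2.0;
    d.panY = (c0.y + c1.y) / 2.0 - (p0.y + p1.y) / 2.0;
    // Zoom: ratio of the current to the previous finger separation.
    double prevDist = std::hypot(p1.x - p0.x, p1.y - p0.y);
    double currDist = std::hypot(c1.x - c0.x, c1.y - c0.y);
    d.zoom = (prevDist > 0.0) ? currDist / prevDist : 1.0;
    // Rotate: change in orientation of the vector joining the fingers.
    d.rotate = std::atan2(c1.y - c0.y, c1.x - c0.x) -
               std::atan2(p1.y - p0.y, p1.x - p0.x);
    return d;
}

int main() {
    // Fingers drift apart and twist slightly between two frames.
    ExoCentricDelta d = recognise({100, 100}, {200, 100},   // previous frame
                                  { 90, 100}, {210, 110});  // current frame
    std::printf("pan=(%.1f,%.1f) zoom=%.2f rotate=%.3f rad\n",
                d.panX, d.panY, d.zoom, d.rotate);
}
```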
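For the ego-centric mode (Fig. 4), the control values can be derived directly from the geometry described above: the speed is proportional to the distance between the current touch point and the reference (first) touch point, and the rotation is driven by the signed angle between the screen's up vector and the vector from the reference point to the current touch. The sketch below is one plausible reading of this mapping; the gain constants and the `EgoCentricControl` type are assumptions, since the paper does not specify them.

```cpp
// Sketch of the ego-centric "driving/walking" control described for Fig. 4.
// Gains and the assumed screen-space "up" direction are not from the paper.
#include <cmath>
#include <cstdio>

struct Vec2 { double x, y; };

struct EgoCentricControl {
    double speed;       // forward velocity, proportional to the finger offset
    double turnRate;    // signed rotation, from the angle to the "up vector"
};

EgoCentricControl egoControl(const Vec2& reference, const Vec2& touch,
                             double speedGain = 0.05, double turnGain = 1.0)
{
    EgoCentricControl c;
    Vec2 offset = { touch.x - reference.x, touch.y - reference.y };

    // Distance from the reference point -> moving velocity.
    c.speed = speedGain * std::hypot(offset.x, offset.y);

    // Screen "up" points towards negative y in window coordinates (assumed).
    // Signed angle between up and the reference->touch vector -> rotation factor.
    Vec2 up = { 0.0, -1.0 };
    double cross = up.x * offset.y - up.y * offset.x;
    double dot   = up.x * offset.x + up.y * offset.y;
    c.turnRate = turnGain * std::atan2(cross, dot);
    return c;
}

int main() {
    // Finger currently held above and to the right of where it first touched down.
    EgoCentricControl c = egoControl({400, 300}, {460, 250});
    std::printf("speed=%.2f turn=%.2f rad\n", c.speed, c.turnRate);
}
```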
2) Undo or erase

This gesture imitates the action of erasing drawings from a piece of paper. Touching the screen with a palm and moving the palm left and right carries out the undo or erase operation (Fig. 5). This operation undoes one action. To undo again, the palm must be lifted off the screen before the gesture is repeated.

Figure 5. Undo or erase gesture

3) Area selection

Figure 6. Line-drawing gesture

Like a pin attaching paper to a board, the paper can rotate about the pin but cannot move. However, if more than two pins are used to attach the paper to the board, the paper is fixed on the board. Based on this idea, the gesture for area selection also comes from the way of handling paper on a board. Users can use two or more fingers to hold the map, and then use another finger moving on the screen to draw the boundary of an area (Fig. 6). The end of the gesture is defined as the moment when the user's hand is removed from the screen. However, if the user changes his/her mind, he/she can use the undo/erase gesture to cancel the area selection operation before releasing the 'holding gesture' from the screen.

By using these four types of gestures, the multi-touch device can control the digital map like a floating paper map being handled with two hands.

IV. THREE-LAYER FRAMEWORK FOR MULTI-TOUCH UI

A. Framework design

One of the objectives of this work is to explore a general purpose multi-touch framework to support NUIs for different applications. Rather than developing all possible gestures, this framework provides the possibility of aggregating low-level gestures into appropriate high-level gestures, depending on the application and user requirements. The authors' solution is to first break gestures into a limited number of low-level gestures. Higher-level gestures are then recognised by combining a sequence of low-level gestures.

In this work, the application-level gestures are broken into several gesture "atoms". The gesture atoms are basic "pure" gestures. A series of gesture atoms can be grouped together to form an application gesture that represents a certain functional meaning.

The basic gesture information is classified into two categories: dynamic gestures and touch information. Dynamic gestures are formed by moving touches, such as finger-down, finger-up, move, zoom, rotate, etc. A touch information set provides all the touch points' information (e.g. position, moving/non-moving, moved before, touch area, etc.). The two types of information are used to define application gestures.

Therefore, the multi-touch framework is organised into three layers, as illustrated in Fig. 7. The left part of Fig. 7 shows the structure of the framework. The right part of Fig. 7 shows its relation to a COPE based application on a Windows platform.

Figure 7. The three-layer multi-touch framework (Multi-Touch Framework, Application Gesture Layer) and its relation to an OSG application (OSG Event Queue)
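To make the atom-and-composition idea concrete, the sketch below shows one way the basic layer's output (dynamic gesture atoms plus per-touch information) could be represented and combined into an application gesture such as area selection, i.e. several non-moving "holding" touches plus one moving touch drawing the boundary. The type names and the composition rule are illustrative assumptions, not the framework's actual API.

```cpp
// Sketch: gesture "atoms" plus touch information composed into an
// application-level gesture. Names and rules are illustrative only.
#include <vector>
#include <cstdio>

enum class Atom { FingerDown, FingerUp, ParallelMove, Split, Rotate, Tap, DoubleTap };

struct TouchInfo {
    int    id;
    double x, y;
    bool   moving;        // moving / non-moving at this frame
    bool   movedBefore;   // has this touch moved since finger-down?
};

// Output of the basic gesture layer for one frame.
struct BasicGestureFrame {
    std::vector<Atom>      atoms;    // dynamic gestures from moving touches
    std::vector<TouchInfo> touches;  // full touch information set
};

// Application-layer rule (assumed): area selection is active when at least
// two touches hold the map (non-moving, never moved) while exactly one
// touch moves to draw the boundary.
bool isAreaSelection(const BasicGestureFrame& f)
{
    int holding = 0, drawing = 0;
    for (const TouchInfo& t : f.touches) {
        if (!t.moving && !t.movedBefore) ++holding;
        else if (t.moving)               ++drawing;
    }
    return holding >= 2 && drawing == 1;
}

int main() {
    BasicGestureFrame frame;
    frame.atoms   = { Atom::ParallelMove };
    frame.touches = { {0, 100, 200, false, false},   // holding finger
                      {1, 120, 210, false, false},   // holding finger
                      {2, 400, 300, true,  true } }; // drawing finger
    std::printf("area selection active: %s\n", isAreaSelection(frame) ? "yes" : "no");
}
```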
• The 3rd layer is the application layer. This layer provides the final user interface based on the gestures captured by the 2nd layer. It uses the gestures to interact with the context of the application and provides the special user interface functions required by the application. Therefore, the 3rd layer is an application-related layer and cannot typically be used for general purposes.

B. Gestures' definition within each layer

1) RAW layer

The RAW layer gestures are usually the initial touch information, such as the touch id, orientation, touch size and shape, and touch image for each touch on the screen surface. The details of the data are usually determined by the touch screen's capability and provided to the application through the device driver or third-party libraries. Because the data structures of different libraries differ, the data may need to be restructured to fit into the higher layers.

2) Basic gesture layer

The basic gesture layer tries to capture gestures from raw touch information and generate basic gesture information for the application layer. This layer first generates statistical information from the raw touch information, such as the number of moving touches, the number of non-moving touches and each touch's history. Then it captures the basic gestures based only on moving touches.

The gestures that the 2nd layer provides include:

• Parallel move
• Split
• Rotate
• Finger-down, finger-up, single tap and double tap for each touch
• Begin, end and inertia begin

Because most current multi-touch devices cannot distinguish touches from different people, this layer assumes that all touches come from the same person. In general, the framework should also provide touch-grouping functions to support collaboration.

V. INITIAL NUI PROTOTYPE IMPLEMENTATION

To test the feasibility of the multi-touch NUI framework, the authors have implemented and integrated the framework with the COPE environment.

The COPE implementation is based on Open Scene Graph (OSG) 2.8 on a Microsoft Windows platform. Since OSG does not process multi-touch events, one of the key tasks of the prototype is to act as a plug-in for the OSG-based COPE architecture so that multi-touch events are accepted without modifying any code of OSG itself.

In order to implement the NUI plug-in for COPE, the multi-touch events need to be captured. At the same time, the NUI prototype also needs to distinguish the multi-touch events from any other types of events, such as mouse events, so that the original interface still works.

The implementation approach is based on two steps. The first step is to capture multi-touch events and place them into OSG's event queue. The events cannot be placed into the Windows event queue because OSG only puts very limited types of Windows events into the OSG event queue, and all other types of events are thrown away. The second step is to process those multi-touch events in an extended manipulator which is based on OSG's event manipulator. In the implementation of the COPE prototype, a new manipulator, derived from OSG's matrix-manipulator, has been implemented so that a specially designed user interface could be realised. It supports a plug-in structure by implementing an event-manipulating chain which dispatches events first to the plug-ins' manipulators. If an event is not processed there, it is then dispatched to the original event manipulator.

Figure 8. The COPE multi-touch user interface
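As a rough illustration of the first step, the fragment below shows how WM_TOUCH messages can be captured in a Win32 window procedure and converted into simple touch records before being handed over to the framework. It is a compilable fragment meant to be called from an existing WndProc, not the authors' code: the `TouchRecord` type and `forwardToFramework()` are placeholders, and the actual wrapping of the records as events for OSG's event queue is assumed rather than shown.

```cpp
// Fragment: capturing WM_TOUCH in a Win32 window procedure (Windows 7 touch API).
// forwardToFramework() stands in for pushing the data towards OSG's event queue.
#define _WIN32_WINNT 0x0601
#include <windows.h>
#include <vector>
#include <cstdio>

struct TouchRecord {
    DWORD id;        // system-assigned contact id
    int   x, y;      // client-area pixel coordinates
    bool  down, up;  // contact just started / just ended
};

// Placeholder: in the prototype this is where the records would be wrapped
// as user events for the OSG event queue. Here we only report them.
void forwardToFramework(const std::vector<TouchRecord>& touches)
{
    std::printf("forwarding %zu touch record(s)\n", touches.size());
}

// Call once after the window is created so it receives WM_TOUCH messages.
void enableTouch(HWND hwnd) { RegisterTouchWindow(hwnd, 0); }

// Call from the window procedure's WM_TOUCH case; returns true if handled.
bool handleTouchMessage(HWND hwnd, WPARAM wParam, LPARAM lParam)
{
    UINT count = LOWORD(wParam);
    std::vector<TOUCHINPUT> inputs(count);
    if (!GetTouchInputInfo(reinterpret_cast<HTOUCHINPUT>(lParam), count,
                           inputs.data(), sizeof(TOUCHINPUT)))
        return false;

    std::vector<TouchRecord> touches;
    for (const TOUCHINPUT& ti : inputs) {
        // TOUCHINPUT coordinates are in hundredths of a pixel, screen space.
        POINT pt = { TOUCH_COORD_TO_PIXEL(ti.x), TOUCH_COORD_TO_PIXEL(ti.y) };
        ScreenToClient(hwnd, &pt);
        touches.push_back({ ti.dwID,
                            static_cast<int>(pt.x), static_cast<int>(pt.y),
                            (ti.dwFlags & TOUCHEVENTF_DOWN) != 0,
                            (ti.dwFlags & TOUCHEVENTF_UP)   != 0 });
    }
    CloseTouchInputHandle(reinterpret_cast<HTOUCHINPUT>(lParam));

    forwardToFramework(touches);
    return true;
}
```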
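The second step, the event-manipulating chain, can be pictured as follows: events are offered to each plug-in manipulator in turn and fall through to the original manipulator only if no plug-in consumes them. In the actual prototype the manipulators derive from OSG's matrix-manipulator and receive osgGA events; the plain C++ classes below are a simplified stand-in intended only to show the dispatch pattern.

```cpp
// Sketch of the event-manipulating chain: plug-in manipulators see the event
// first; unhandled events fall through to the original manipulator.
// Simplified stand-in for the OSG matrix-manipulator based classes.
#include <memory>
#include <vector>
#include <cstdio>

struct Event { int type; };   // stand-in for an OSG GUI event

class Manipulator {
public:
    virtual ~Manipulator() = default;
    virtual bool handle(const Event& e) = 0;   // true = event consumed
};

class ManipulatorChain : public Manipulator {
public:
    void addPlugin(std::unique_ptr<Manipulator> m)   { plugins_.push_back(std::move(m)); }
    void setOriginal(std::unique_ptr<Manipulator> m) { original_ = std::move(m); }

    bool handle(const Event& e) override {
        for (auto& p : plugins_)                       // offer to each plug-in first
            if (p->handle(e)) return true;
        return original_ && original_->handle(e);      // fall back to the original UI
    }
private:
    std::vector<std::unique_ptr<Manipulator>> plugins_;
    std::unique_ptr<Manipulator> original_;
};

// Example plug-in: consumes only multi-touch events (type == 1, assumed).
class MultiTouchManipulator : public Manipulator {
public:
    bool handle(const Event& e) override {
        if (e.type != 1) return false;
        std::printf("multi-touch plug-in handled the event\n");
        return true;
    }
};

// Stands in for the original COPE event manipulator (e.g. mouse handling).
class OriginalManipulator : public Manipulator {
public:
    bool handle(const Event& e) override {
        std::printf("original manipulator handled event type %d\n", e.type);
        return true;
    }
};

int main() {
    ManipulatorChain chain;
    chain.addPlugin(std::make_unique<MultiTouchManipulator>());
    chain.setOriginal(std::make_unique<OriginalManipulator>());
    chain.handle({1});   // multi-touch event: consumed by the plug-in
    chain.handle({0});   // other event: falls through to the original manipulator
}
```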
transform the COPE urban planning environment into a multi-touch supported application.

The three-layer framework provides a generic structure for the development of a multi-touch user interface. The implementation of the prototype based on COPE has demonstrated the capability of the framework.

The prototype has been informally tested by a small group of end-users. During the tests, the authors intentionally asked users to use the system without offering prior knowledge of the multi-touch gestures. The initial results show that these users had little difficulty navigating (in the exo-centric mode) with multiple fingers, which includes move, zoom and rotate. Object selection with the tap gesture was also carried out without difficulty. However, the less commonly used gestures, such as the hold gesture, the tilt operation, the erase and the area selection, required prompting and training. Many users were simply unaware of the presence of such functions. After these gestures were introduced, the users could use them with ease.

The next step of the authors' research is to carry out a formal usability evaluation and to integrate the framework with other applications, in order to validate the framework as a general purpose system and enhance its usability.

REFERENCES

[1] D. A. Norman, "Natural user interfaces are not natural," Interactions, vol. 17, pp. 6-10, 2010.
[2] S. Bachl, M. Tomitsch, C. Wimmer, and T. Grechenig, "Challenges for designing the user experience of multi-touch interfaces," presented at the Engineering Patterns for Multi-Touch Interfaces Workshop (MUTI'10) of the ACM SIGCHI Symposium on Engineering Interactive Computing Systems, Berlin, Germany, 2010.
[3] J. Yao, T. Fernando, H. Tawfik, R. Armitage, and I. Billing, "A VR-centred workspace for supporting collaborative urban planning," in 9th International Conference on Computer Supported Cooperative Work in Design, Coventry, UK, 2005.
[4] M. Ringel, K. Ryall, C. Shen, C. Forlines, and F. Vernier, "Release, relocate, reorient, resize: fluid techniques for document sharing on multi-user interactive tables," presented at the CHI '04 Extended Abstracts on Human Factors in Computing Systems, Vienna, Austria, 2004.
[5] M. Wu, C. Shen, K. Ryall, C. Forlines, and R. Balakrishnan, "Gesture registration, relaxation, and reuse for multi-point direct-touch surfaces," presented at the First IEEE International Workshop on Horizontal Interactive Human-Computer Systems, 2006.
[6] U. Hinrichs and S. Carpendale, "Gestures in the wild: studying multi-touch gesture sequences on interactive tabletop exhibits," presented at the 2011 Annual Conference on Human Factors in Computing Systems, Vancouver, BC, Canada, 2011.
[7] GestureWorks – Multitouch Framework – Build Gesture-Driven Apps. http://gestureworks.com.
[8] Y. Kiriaty, "MultiTouch capabilities in Windows 7," MSDN Magazine, http://msdn.microsoft.com/en-us/magazine/ee336016.aspx, 2009.
[9] A. Bragdon, R. Zeleznik, B. Williamson, T. Miller, and J. J. LaViola Jr., "GestureBar: improving the approachability of gesture-based interfaces," presented at the 27th International Conference on Human Factors in Computing Systems, Boston, MA, USA, 2009.