Intelligent Sign Language Recognition Using Image Processing
Abstract: Computer recognition of sign language is an important research problem for enabling communication with hearing-impaired people. This project introduces an efficient and fast algorithm for identifying the number of fingers opened in a gesture representing an alphabet of the Binary Sign Language. The system does not require the hand to be perfectly aligned to the camera. The project uses an image processing system to identify the English-alphabet sign language used by deaf people to communicate. The basic objective of this project is to develop a computer-based intelligent system that will enable dumb people to communicate with all other people using their natural hand gestures. The idea consists of designing and building an intelligent system that uses image processing, machine learning, and artificial intelligence concepts to take visual input of sign-language hand gestures and generate an easily recognizable form of output. Hence the objective of this project is to develop an intelligent system which can act as a translator between sign language and spoken language dynamically, and can make communication between people with hearing impairment and normal people both effective and efficient. The system we are implementing is for Binary Sign Language, but it can detect any sign language given prior image processing.
Keywords: Artificial Intelligence, Binary Sign Language, Image Processing, Machine Learning, Template Matching.
I. INTRODUCTION
Dumb people are usually deprived of normal communication with other people in society. It has been observed that they find it really difficult at times to interact with normal people through their gestures, as only very few of those are recognized by most people. Since people with hearing impairment cannot talk like normal people, they have to depend on some form of visual communication most of the time.
Sign language is the primary means of communication in the deaf and dumb community. Like any other language, it has its own grammar and vocabulary, but it uses the visual modality for exchanging information. The problem arises when dumb or deaf people try to express themselves to other people through this sign-language grammar, because normal people are usually unaware of it. As a result, it has been seen that the communication of a dumb person is often limited to his/her family or the deaf community.
The importance of sign language is emphasized by growing public approval of, and funding for, international projects. In this age of technology, the demand from the dumb community for a computer-based system is high. Researchers have been attacking the problem for quite some time now, and the results are showing some promise. Although interesting technologies are being developed for speech recognition, no commercial product for sign recognition is currently on the market.
The idea is to make computers understand human language and to develop user-friendly human-computer interfaces (HCI). Making a computer understand speech, facial expressions, and human gestures are steps toward this goal. Gestures are non-verbally exchanged information, and a person can perform innumerable gestures at a time. Since human gestures are perceived through vision, they are a subject of great interest for computer vision researchers. The project aims to determine human gestures by creating an HCI. Coding these gestures into machine language demands a complex programming algorithm. In our project we focus on image processing and template matching for better output generation.
rate of 10 Hz. Over the years, advanced glove devices have been designed, such as the Sayre Glove, the Dexterous Hand Master, and the Power Glove [1].
The most successful commercially available glove is by far the VPL DataGlove [2], developed by Zimmerman. It is based on patented optical-fiber sensors along the backs of the fingers. Starner and Pentland developed a system capable of recognizing 40 signs of American Sign Language (ASL) at a rate of 5 Hz.
In other research, Hyeon-Kyu Lee and Jin H. Kim presented work on real-time hand-gesture recognition using HMMs (Hidden Markov Models). Kjeldsen and Kender devised a technique for skin-tone segmentation in HSV space, based on the premise that skin tone in images occupies a connected volume in HSV space. They further developed a system which used a back-propagation neural network to recognize gestures from the segmented hand images [1].
Etsuko Ueda and Yoshio Matsumoto presented a novel hand-pose estimation technique that can be used for vision-based human interfaces. In this method, hand regions are extracted from multiple images obtained by a multi-viewpoint camera system, a "voxel model" of the hand is constructed [6], and the hand pose is estimated from it. Chan Wah Ng and Surendra Ranganath presented a hand-gesture recognition system that used image Fourier descriptors as its prime feature, classified with the help of an RBF network; their system's overall performance was 90.9%. Claudia Nölker and Helge Ritter presented a hand-gesture recognition model based on the recognition of fingertips; their approach identifies all finger joint angles, from which a 3D model of the hand is prepared using a neural network.
Fig. 1 shows the overall idea of the proposed system, which consists of four modules. An image is captured through the webcam; the camera is mounted on top of the system, facing a wall with a neutral background. First, the captured color image is converted into a grayscale image, which in turn is converted into binary form. Coordinates of the captured image are calculated with respect to the X and Y axes, and the calculated coordinates are stored in the database in the form of a template. The templates of newly generated coordinates are compared with the existing ones; if the comparison succeeds, the sign is converted into audio and textual form. The system works in two different modes, i.e., a training mode and an operational mode. Training mode is the machine-learning part, in which we train the system to accomplish the task for which it is implemented, i.e., alphabet recognition; a minimal sketch of this flow is given below.
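As a concrete illustration of this flow, the following is a minimal sketch in Python, assuming OpenCV and NumPy; the function names, the 32x32 template size, and the pixel-mismatch score are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the four-module pipeline and its two modes.
# All names, sizes, and scoring choices here are illustrative assumptions.
import cv2
import numpy as np

def make_template(frame, thresh=127, size=(32, 32)):
    """Capture -> grayscale -> binary -> fixed-size coordinate template."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)
    # Downsample the binary mask to a fixed size so templates from
    # different captures can be compared element-wise.
    return cv2.resize(binary, size, interpolation=cv2.INTER_AREA)

def run(mode, frame, database, label=None):
    template = make_template(frame)
    if mode == "training":
        # Training mode: store the template for the alphabet being learned.
        database[label] = template
        return label
    # Operational mode: the stored template with the fewest
    # mismatched pixels is taken as the recognized sign.
    scores = {k: np.count_nonzero(t != template) for k, t in database.items()}
    return min(scores, key=scores.get)  # matched alphabet -> text/audio
```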
However, since the lighting was a single overhead bulb, light intensity would be highest and shadowing effects least if the camera were pointed downwards.
3.2.2 Thresholding
Thresholding is the simplest method of image segmentation [1]. In this method we convert the RGB image into a binary image; Figures 2-5 show the details of this processing. A binary image is a digital image whose pixels take only two values (0 or 1); typically the two colors used are black and white, though any two colors can be used. Here, the background pixels are converted into black pixels, and the pixels covering our area of interest are converted into white pixels. This is the preprocessing step; a sketch is given below.
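A minimal sketch of this step, assuming OpenCV; Otsu's method is used here to pick the threshold automatically, which is an added assumption, since the paper does not state how the threshold value is chosen. The input path is also assumed.

```python
# Minimal sketch of Section 3.2.2: RGB -> grayscale -> binary mask.
import cv2

frame = cv2.imread("gesture.png")               # captured image (path assumed)
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # color -> grayscale
# Otsu's method (an assumption; the paper does not specify) chooses the
# threshold. Pixels above it become white (area of interest), the rest
# become black (background), yielding the two-valued mask image.
_, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("mask.png", mask)
```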
Fig. 4: Binary image (mask image after thresholding). Fig. 5: Only the area of interest is preserved; the rest is discarded (using color filters).
3.2.3 Coordinate Mapping
In the previous step, only the areas containing the marker color bands are preserved for further processing, and the rest of the image is converted into black pixels, as shown in Figure 6. This conversion of color-band pixels into white pixels is accomplished by setting RGB color ranges in a filter. Once the marker pixels are highlighted as white, the coordinates of the area for each color are generated. The newly generated coordinates are then compared with the coordinates stored in the database for output generation, using the pattern-matching technique explained in the next section. A sketch of this filtering and coordinate generation is given below.
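A minimal sketch of this filtering and coordinate generation, assuming OpenCV/NumPy; the BGR range for the color band and the use of the centroid as the generated coordinate are illustrative assumptions, since the paper's actual filter values are not given.

```python
# Minimal sketch of Section 3.2.3: isolate one marker color band and
# generate its (X, Y) coordinate. Range values are illustrative.
import cv2
import numpy as np

frame = cv2.imread("gesture.png")      # captured image (path assumed)
# Keep only pixels whose BGR values fall inside the band's range;
# everything else becomes black, the band becomes white.
lower = np.array([0, 0, 120])          # illustrative lower BGR bound
upper = np.array([80, 80, 255])        # illustrative upper BGR bound
band = cv2.inRange(frame, lower, upper)
# Generate the coordinate of the band as the centroid of its white pixels.
m = cv2.moments(band)
if m["m00"] > 0:
    x, y = m["m10"] / m["m00"], m["m01"] / m["m00"]
    print(f"marker coordinate: ({x:.0f}, {y:.0f})")  # stored as template data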
It is likely that the detection system will be subjected to varying lighting conditions (for example, due
to time of day or position of camera relative to light sources). Therefore it is likely that an occasional
recalibration will have to be performed. The calibration technique is discussed below:
A formal description of the initial calibration method is as follows. The image is a 2D array of pixels,

I = { p(x, y) : 1 ≤ x ≤ X, 1 ≤ y ≤ Y },    (1)

where each pixel carries an RGB color value,

c(p(x, y)) = (r, g, b).    (2)

During calibration, the set S of color values observed over the marked skin region is recorded,

S = { c(p) : p in the calibration region }.    (3)

A formal description of skin detection is then as follows: a pixel p is classified as skin if and only if c(p) ∈ S. Using this method, skin pixels were detected at a rate of 15 fps on a 2.00 GHz laptop.
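The formal description above suggests an implementation along the following lines: colors sampled over a calibration region form the set S, and a pixel is classified as skin when its color lies in S. This is a hedged NumPy sketch; the color quantization step and all names are added assumptions, intended only to make the set lookup tolerant of small color variations.

```python
# Minimal sketch of calibration-based skin detection: build S from a
# calibration region, then classify pixels by set membership.
import numpy as np

def calibrate(frame, region, step=8):
    """Record quantized colors over the marked skin region into S."""
    x0, y0, x1, y1 = region
    patch = frame[y0:y1, x0:x1].reshape(-1, 3) // step  # quantize colors
    return {tuple(c) for c in patch}

def skin_mask(frame, S, step=8):
    """Classify each pixel: 1 if its quantized color is in S, else 0."""
    h, w, _ = frame.shape
    q = frame.reshape(-1, 3) // step
    mask = np.array([tuple(c) in S for c in q], dtype=np.uint8)
    return mask.reshape(h, w)
```

On recalibration under changed lighting, rebuilding S from a fresh calibration region is all that is required in this formulation.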
Each newly generated pixel value is then compared with the template values previously stored in the database. The algorithm proceeds until the comparison results in success or failure: if it returns a positive result, the sign is converted into the corresponding text and audio; if the comparison fails, an appropriate error message is displayed on the screen. A sketch of this comparison loop is given below.
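A minimal sketch of this comparison loop, assuming templates are stored as fixed-size binary arrays (as in the earlier pipeline sketch); the 10% mismatch acceptance threshold is an illustrative assumption, not the paper's value.

```python
# Minimal sketch of the template-comparison loop: success converts the
# sign to text/audio; failure triggers an error message.
import numpy as np

def match_sign(template, database, max_mismatch=0.10):
    """Compare against each stored template until success or failure."""
    best_label, best_err = None, 1.0
    for label, stored in database.items():
        err = np.count_nonzero(stored != template) / template.size
        if err < best_err:
            best_label, best_err = label, err
    if best_err <= max_mismatch:
        return best_label        # success: converted to text and audio
    return None                  # failure: caller shows an error message

# Usage: label = match_sign(t, db); print(label or "Sign not recognized")
```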
Figure 7 shows the binary finger-tapping tool with the significance values assigned to the fingers; by referring to the gesture table, Table 2 gives the code of each alphabet. An illustrative sketch of this encoding is given below.
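An illustrative sketch of such an encoding: each open finger contributes a power-of-two significance value, and the resulting sum indexes the alphabet. The specific values and the code-to-letter mapping below are placeholders; the paper's actual assignments are those of Figure 7 and Table 2.

```python
# Minimal sketch of binary finger encoding: open fingers form a bitmask
# whose sum is looked up as an alphabet code. Values are placeholders.
FINGER_VALUES = {"thumb": 1, "index": 2, "middle": 4, "ring": 8, "little": 16}

def gesture_code(open_fingers):
    """Sum the significance values of the fingers detected as open."""
    return sum(FINGER_VALUES[f] for f in open_fingers)

# Illustrative lookup only: Table 2 defines the real code of each alphabet.
CODE_TO_LETTER = {1: "A", 2: "B", 3: "C"}
print(CODE_TO_LETTER.get(gesture_code({"thumb", "index"}), "?"))  # -> "C"
```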
V. CONCLUSION
Our project aims to bridge this gap by introducing an inexpensive computer into the communication path, so that sign language can be automatically captured, recognized, and translated to speech for the benefit of blind people. In the other direction, speech must be analyzed and converted to either sign or a textual display on the screen for the benefit of the hearing impaired.
REFERENCES
[1] Christopher Lee and Yangsheng Xu, "Online, interactive learning of gestures for human/robot interfaces", The Robotics Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA, 1996.
[2] Richard Watson, "Gesture recognition techniques", Technical Report No. TCD-CS-93-11, Department of Computer Science, Trinity College, Dublin, July 1993.
[3] Ray Lockton, "Hand Gesture Recognition Using Computer Vision", 4th-year project report, Balliol College, Department of Engineering Science, Oxford University, 2000.
[4] Rashmi D. Kyatanavar and P. R. Futane, "Comparative Study of Sign Language Recognition Systems", International Journal of Scientific and Research Publications, Vol. 2, Issue 6, June 2012, ISSN 2250-3153 (Department of Computer Engineering, Sinhgad College of Engineering, Pune, India).
[5] International MultiConference of Engineers and Computer Scientists 2009 (IMECS 2009), Vol. I, Hong Kong, March 18-20, 2009.
[6] Etsuko Ueda, Yoshio Matsumoto, Masakazu Imai, and Tsukasa Ogasawara, "Hand Pose Estimation for Vision-Based Human Interface", IEEE Transactions on Industrial Electronics, Vol. 50, No. 4, pp. 676-684, 2003.
Books:
[7] Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, 2nd Edition, January 15, 2002. ISBN-10: 0201180758; ISBN-13: 978-0201180756.