Hand Written Character Recognition Using Neural Network: BACHELOR OF ENGINEERING (Computer Engineering)
by
Certificate
This is to certify that the project entitled “Hand Written Character Recognition” is a bonafide work
of
submitted to the University of Mumbai in partial fulfilment of the requirement for the award
of the degree of “Bachelor of Engineering” in “Computer Engineering”.
A PROJECT REPORT
Submitted By
1. Ankit Singh
2. Mitesh Devpura
3. Mohit Agarwal
In partial fulfilment of the Degree of B.E. in Computer Engineering is approved.
Examiners
Date:
TABLE OF CONTENTS
Abstract…................................................................................................................... 5
Acknowledgement ..................................................................................................... 6
Chapter 1. Introduction
1.1 Introduction 8
1.2 Motivation 11
ABSTRACT
The aim of this project is to develop Optical Character Recognition
(OCR) for Android based mobile devices. Scanned text documents,
pictures stored on mobile phones running the Android operating system, and
pictures taken by any Android device are the main focus of this
application. This capstone project is divided into two parts: the desktop
application that has been implemented by Mehdi Barakat, and the
Android based application that I have implemented. The purpose of this
application is to recognize text in scanned text documents, text images,
and any picture taken by an Android based device in order to reuse it
later. This application will allow its users to perform many actions in a few
minutes, such as copying text from the aforementioned documents and
modifying it, instead of wasting time retyping it.
ACKNOWLEDGEMENT
I remain immensely obliged to Prof. Jayendra Jadhav for providing me with
the idea for the topic, for her invaluable support in gathering resources, and for her
guidance and supervision, which made this work successful.
I would like to say that it has indeed been a fulfilling experience to work on
this project topic.
List of Figures
Sr. No. Name of figure
1. Architecture of system
2. Flow Chart
3. Activity Diagram
4. Sequential Diagram
5. UML Diagram
Chapter 1. Introduction
1.1 Introduction
In today's world there is a growing demand for software systems that can
recognize characters when information is scanned from paper documents, since a
large number of newspapers and books on different subjects exist only in printed
format. There is therefore great interest in storing the information available in
these paper documents on a computer storage disk and later reusing it through a
searching process. One simple way to store the information in these paper
documents on a computer system is to first scan the documents and then store
them as IMAGES. But to reuse this information it is very difficult to read the
individual contents and to search the contents of these documents line-by-line and
word-by-word. The reason for this difficulty is that the font characteristics of the
characters in paper documents differ from the fonts of the characters in the
computer system. As a result, the computer is unable to recognize the characters
while reading them. This concept of storing the contents of paper documents in
computer storage and then reading and searching the content is called DOCUMENT
PROCESSING. Sometimes this document processing must handle information in
languages other than English. For this document processing we need a software
system called a CHARACTER RECOGNITION SYSTEM. This process is also called
DOCUMENT IMAGE ANALYSIS (DIA).
Thus, our need is to develop a character recognition software system to perform
Document Image Analysis, which transforms documents from paper format to
electronic format. Various techniques exist for this process, and among them we
have chosen Optical Character Recognition as the fundamental technique to
recognize characters. The conversion of paper documents into electronic format is
an on-going task in many organizations, particularly in the Research and
Development (R&D) area, in large business enterprises, in government institutions,
and so on. From our problem statement we can also introduce the necessity of
Optical Character Recognition in mobile electronic devices such as cell phones and
digital cameras, to acquire images and recognize them as a part of face
recognition and validation.
What is OCR?
The goal of Optical Character Recognition (OCR) is to classify optical patterns (often contained in a digital
image) corresponding to alphanumeric or other characters. The process of OCR involves several steps
including segmentation, feature extraction, and classification. Each of these steps is a field unto itself, and is
described briefly here in the context of a MATLAB implementation of OCR.
One example of OCR is shown below. A portion of a scanned image of text, borrowed from the web, is
shown along with the corresponding (human recognized) characters from that text.
A few examples of OCR applications are listed here. The most common use for OCR is the first item: people
often wish to convert text documents to some sort of digital representation.
1. People wish to scan in a document and have the text of that document available in a word
processor.
2. Recognizing license plate numbers
3. Post Office needs to recognize zip-codes
EXISTING SYSTEM
Today there is a growing demand among users to convert printed documents into
electronic documents in order to maintain the security of their data. Hence the basic OCR system
was invented to convert the data available on paper into computer-processable documents, so
that the documents become editable and reusable. The existing system, i.e. the predecessor of
OCR on a grid infrastructure, is simply OCR without grid functionality. That is, the existing system
deals with homogeneous character recognition, or character recognition of a single language.
PROPOSED SYSTEM
Our proposed system is OCR on a grid infrastructure, which is a character recognition system that
supports recognition of the characters of multiple languages. This feature is what we call the grid
infrastructure; it eliminates the problem of heterogeneous character recognition and
supports multiple functionalities to be performed on the document. These functionalities
include editing as well as searching, whereas the existing system supports only editing of the
document. In this context, grid infrastructure means an infrastructure that supports a specific
set of languages. Thus, OCR on a grid infrastructure is multi-lingual.
1.2 MOTIVATION
• The primary reason for selecting handwritten digit recognition was the
unfamiliarity of the term, which sparked an interest in the subject.
• Another reason was to learn how computer vision works and how it helps in
artificial intelligence. This aspect makes it more interesting.
A neural network trained for classification is designed to take input samples and
classify them into groups. These groups may be fuzzy, without clearly defined
boundaries. This project concerns detecting free handwritten characters.
2.3 OBJECTIVE
The main purpose of an Optical Character Recognition (OCR) system based on a grid
infrastructure is to perform Document Image Analysis, i.e. the processing of electronic
documents converted from paper formats, more effectively and efficiently. This
improves the accuracy of recognizing the characters during document processing compared
to the various existing character recognition methods. Here the OCR technique derives
the meaning of the characters and their font properties from their bit-mapped images.
The primary objective is to speed up the process of character recognition in document
processing. As a result, the system can process a huge number of documents in less
time and hence saves time.
Since our character recognition is based on a grid infrastructure, it aims to recognize
multiple heterogeneous characters that belong to different universal languages with
different font properties and alignments.
2.4 Scope
The scope of our product, Optical Character Recognition on a grid infrastructure, is to provide
an efficient and enhanced software tool for users to perform Document Image Analysis and
document processing, by reading and recognizing the characters in research, academic,
governmental and business organizations that hold large pools of documented, scanned
images. Irrespective of the size of the documents and the type of characters in them, the
product recognizes, searches and processes them faster according to the needs of the
environment.
3.1 Algorithm
[Figure: Components of an OCR system: optical scanning, location and segmentation, preprocessing, feature extraction, recognition, and post-processing.]
The identity of each symbol is found by comparing the extracted features with descriptions of
the symbol classes obtained through a previous learning phase. Finally contextual information
is used to reconstruct the words and numbers of the original text. In the next sections these
steps and some of the methods involved are described in more detail.
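As an illustration only, the pipeline above can be outlined in Python as a chain of stages. The stage bodies below are simplified placeholders (a fixed threshold, a single-symbol "segmentation", a one-number density feature), not the actual methods described in the following sections.

import numpy as np

def threshold(image, t=128):
    # Optical scanning / binarization: grey-levels below t become foreground (1).
    return (image < t).astype(np.uint8)

def segment(binary):
    # Location and segmentation: placeholder that treats the whole image as one symbol.
    return [binary]

def preprocess(symbol):
    # Smoothing and normalization would go here; the placeholder returns the symbol unchanged.
    return symbol

def extract_features(symbol):
    # Placeholder feature vector: overall density of foreground pixels.
    return [float(symbol.mean())]

def classify(features):
    # Comparison with class descriptions learned in a previous phase would go here.
    return "?"

def recognize(image):
    symbols = [preprocess(s) for s in segment(threshold(image))]
    return "".join(classify(extract_features(s)) for s in symbols)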
Optical scanning.
Through the scanning process a digital image of the original document is captured. In OCR
optical scanners are used, which generally consist of a transport mechanism plus a sensing
device that converts light intensity into gray-levels. Printed documents usually consist of black
print on a white background. Hence, when performing OCR, it is common practice to convert
the multilevel image into a bilevel image of black and white. Often this process, known as
thresholding, is performed on the scanner to save memory space and computational effort.
The thresholding process is important as the results of the following recognition are totally
dependent on the quality of the bilevel image. Still, the thresholding performed on the scanner
is usually very simple. A fixed threshold is used, where gray-levels below this threshold are said
to be black and levels above are said to be white. For a high-contrast document with uniform
background, a prechosen fixed threshold can be sufficient. However, a lot of documents
encountered in practice have a rather large range in contrast. In these cases, more
sophisticated methods for thresholding are required to obtain a good result.
Figure 4: Problems in thresholding. Top: original grey-level image. Middle: image thresholded with a global method.
Bottom: image thresholded with an adaptive method.
The best methods for thresholding are usually those which are able to vary the threshold over
the document, adapting to local properties such as contrast and brightness. However, such
methods usually depend upon a multilevel scanning of the document which requires more
memory and computational capacity. Therefore, such techniques are seldom used in
connection with OCR systems, although they result in better images.
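To make the distinction concrete, the sketch below contrasts a fixed global threshold with a simple adaptive (local mean) threshold. It assumes an 8-bit grey-level image held in a NumPy array; the window size and offset are arbitrary illustrative choices, and SciPy's uniform_filter is used only as a convenient local mean.

import numpy as np
from scipy.ndimage import uniform_filter

def global_threshold(gray, t=128):
    # Fixed threshold: grey-levels below t are black (1), levels above are white (0).
    return (gray < t).astype(np.uint8)

def adaptive_threshold(gray, window=31, offset=10):
    # The threshold varies over the document, adapting to local brightness.
    local_mean = uniform_filter(gray.astype(float), size=window)
    return (gray < local_mean - offset).astype(np.uint8)

For a high-contrast page both functions give similar results; on pages with uneven illumination the adaptive version preserves print in the darker regions, at the cost of the extra grey-level processing discussed above.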
Preprocessing
The image resulting from the scanning process may contain a certain amount of noise.
Depending on the resolution on the scanner and the success of the applied technique for
thresholding, the characters may be smeared or broken. Some of these defects, which may
later cause poor recognition rates, can be eliminated by using a preprocessor to smooth the
digitized characters.
The smoothing implies both filling and thinning. Filling eliminates small breaks, gaps and holes
in the digitized characters, while thinning reduces the width of the line. The most common
techniques for smoothing move a window across the binary image of the character,
applying certain rules to the contents of the window.
In addition to smoothing, preprocessing usually includes normalization. The normalization is
applied to obtain characters of uniform size, slant and rotation. To be able to correct for
rotation, the angle of rotation must be found. For rotated pages and lines of text, variants of
Hough transform are commonly used for detecting skew. However, to find the rotation angle
of a single symbol is not possible until after the symbol has been recognized.
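The filling, thinning and size normalization steps can be sketched as follows, here using SciPy and scikit-image as one possible (assumed) toolset; a real system might apply different window rules and would also estimate and correct slant and rotation.

import numpy as np
from scipy.ndimage import binary_closing
from skimage.morphology import skeletonize
from skimage.transform import resize

def preprocess(char_img, size=(32, 32)):
    # Filling: close small breaks, gaps and holes in the digitized character.
    filled = binary_closing(char_img, structure=np.ones((3, 3)))
    # Thinning: reduce the line width to a one-pixel skeleton.
    thinned = skeletonize(filled)
    # Normalization: bring every character to a uniform size.
    return resize(thinned.astype(float), size, order=0) > 0.5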
Feature extraction
The objective of feature extraction is to capture the essential characteristics of the symbols,
and it is generally accepted that this is one of the most difficult problems of pattern
recognition. The most straightforward way of describing a character is by the actual raster
image. Another approach is to extract certain features that still characterize the symbols, but
leave out the unimportant attributes. The techniques for extraction of such features are
often divided into three main groups, where the features are found from:
• The distribution of points.
• Transformations and series expansions.
• Structural analysis.
The different groups of features may be evaluated according to their sensitivity to noise and
deformation and the ease of implementation and use. The results of such a comparison are
shown in table 1. The criteria used in this evaluation are the following:
• Robustness.
1) Noise.
Sensitivity to disconnected line segments, bumps, gaps, filled loops etc.
2) Distortions.
Sensitivity to local variations like rounded corners, improper protrusions, dilations and
shrinkage.
3) Style variation.
Sensitivity to variation in style like the use of different shapes to represent the same
character or the use of serifs, slants etc.
4) Translation.
Sensitivity to movement of the whole character or its components.
5) Rotation.
Sensitivity to change in orientation of the characters.
• Practical use.
1) Speed of recognition.
2) Complexity of implementation.
3) Independence.
The need of supplementary techniques.
In the classification, the features of an unknown symbol are compared with the class
descriptions obtained during learning, and the description that matches most closely provides
the recognition. The features are given as numbers in a feature vector, and this feature vector
is used to represent the symbol.
Distribution of points.
This category covers techniques that extract features based on the statistical distribution of
points. These features are usually tolerant to distortions and style variations. Some of the
typical techniques within this area are listed below.
Zoning.
The rectangle circumscribing the character is divided into several overlapping, or
nonoverlapping, regions and the densities of black points within these regions are computed
and used as features.
Moments.
The moments of black points about a chosen center, for example the center of gravity, or a
chosen coordinate system, are used as features.
n-tuples.
The relative joint occurrence of black and white points (foreground and background) in
certain specified orderings is used as a feature.
Characteristic loci.
For each point in the background of the character, vertical and horizontal vectors are
generated. The number of times the line segments describing the character are intersected by
these vectors is used as a feature.
Figure 7: Zoning
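As an example of a distribution-of-points feature, the zoning technique is short enough to sketch directly. The code assumes the character is a binary NumPy array (foreground pixels equal to 1) already cropped to its circumscribing rectangle, and divides it into a 4 x 4 grid of non-overlapping regions.

import numpy as np

def zoning_features(char_img, zones=(4, 4)):
    # Density of foreground points in each region of the circumscribing rectangle.
    rows = np.array_split(np.arange(char_img.shape[0]), zones[0])
    cols = np.array_split(np.arange(char_img.shape[1]), zones[1])
    return np.array([char_img[np.ix_(r, c)].mean() for r in rows for c in cols])

A 32 x 32 character image then yields a 16-dimensional feature vector that is fairly tolerant to distortion and style variation, as noted above.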
Transformations and series expansions.
Many of these transformations are based on the curve describing the contour of the
characters. This means that these features are very sensitive to noise affecting the contour of
the character like unintended gaps in the contour. In table 2 these features are therefore
characterized as having a low tolerance to noise. However, they are tolerant to noise
affecting the inside of the character and to distortions.
Structural analysis.
During structural analysis, features that describe the geometric and topological structures of a
symbol are extracted. By these features one attempts to describe the physical makeup of the
character, and some of the commonly used features are strokes, bays, end-points,
intersections between lines and loops. Compared to other techniques the structural analysis
gives features with high tolerance to noise and style variations. However, the features are
only moderately tolerant to rotation and translation. Unfortunately, the extraction of these
features is not trivial, and to some extent still an area of research.
Classification.
The classification is the process of identifying each character and assigning to it the correct
character class. In the following sections two different approaches for classification in
character recognition are discussed. First decision-theoretic recognition is treated. These
methods are used when the description of the character can be numerically represented in a
feature vector.
We may also have pattern characteristics derived from the physical structure of the character
which are not as easily quantified. In these cases, the relationship between the characteristics
may be of importance when deciding on class membership. For instance, if we know that a
character consists of one vertical and one horizontal stroke, it may be either an “L” or a “T”,
and the relationship between the two strokes is needed to distinguish the characters. A
structural approach is then needed.
Decision-theoretic methods.
The principal approaches to decision-theoretic recognition are minimum distance classifiers,
statistical classifiers and neural networks. Each of these classification techniques is briefly
described below.
Matching.
Matching covers the groups of techniques based on similarity measures where the distance
between the feature vector, describing the extracted character and the description of each
class is calculated. Different measures may be used, but the common is the Euclidean
distance. This minimum distance classifier works well when the classes are well separated,
that is when the distance between the means is large compared to the spread of each class.
When the entire character is used as input to the classification, and no features are extracted
(template-matching), a correlation approach is used. Here the distance between the
character image and prototype images representing each character class is computed.
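A minimum distance classifier of the kind described here can be sketched in a few lines; the class descriptions are assumed to be mean feature vectors computed beforehand from labelled training samples.

import numpy as np

def train_class_means(features, labels):
    # One description per class: the mean feature vector of its training samples.
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def minimum_distance_classify(x, class_means):
    # Assign x to the class whose description is closest in Euclidean distance.
    return min(class_means, key=lambda c: np.linalg.norm(x - class_means[c]))

Template matching fits the same frame if the "feature vector" is simply the flattened character image and the class descriptions are prototype images.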
Neural networks.
Recently, the use of neural networks to recognize characters (and other types of patterns)
has resurfaced. Considering a back-propagation network, this network is composed of several
layers of interconnected elements. A feature vector enters the network at the input layer.
Each element of the layer computes a weighted sum of its input and transforms it into an
output by a nonlinear function. During training the weights at each connection are adjusted
until a desired output is obtained. A problem of neural networks in OCR may be their limited
predictability and generality, while an advantage is their adaptive nature.
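A back-propagation network of this kind can be illustrated with scikit-learn's MLPClassifier (the choice of library is our assumption; the text does not prescribe one). The feature vectors could be, for instance, the zoning densities sketched earlier, and X_train / y_train are assumed to hold labelled training data.

from sklearn.neural_network import MLPClassifier

# One hidden layer of interconnected elements with a nonlinear (logistic) output
# function; the connection weights are adjusted during training by back-propagation.
net = MLPClassifier(hidden_layer_sizes=(64,), activation="logistic", max_iter=500)

# Assuming X_train, y_train and X_test exist:
# net.fit(X_train, y_train)
# predicted = net.predict(X_test)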
3.1.5.2 Structural Methods.
Within the area of structural recognition, syntactic methods are among the most prevalent
approaches. Other techniques exist, but they are less general and will not be treated here.
Syntactic methods.
Measures of similarity based on relationships between structural components may be
formulated by using grammatical concepts. The idea is that each class has its own grammar
defining the composition of the character. A grammar may be represented as strings or trees,
and the structural components extracted from an unknown character are matched against the
grammars of each class. Suppose that we have two different character classes which can be
generated by the two grammars G1 and G2, respectively. Given an unknown character, we say
that it is more similar to the first class if it may be generated by the grammar G1, but not by G2.
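A toy illustration of the grammar idea, under the assumption that structural analysis has already reduced a character to a string over hypothetical stroke primitives ('v' for a vertical stroke, 'h' for a horizontal stroke, plus a code for where they join): each class is given a small grammar, written here as a regular expression, and an unknown character is assigned to the classes whose grammar can generate its string.

import re

# Hypothetical structural descriptions: "v+h/bottom" means the strokes join at the
# bottom of the vertical stroke (as in "L"); "v+h/top" at its top (as in "T").
GRAMMARS = {
    "L": re.compile(r"^v\+h/bottom$"),
    "T": re.compile(r"^v\+h/top$"),
}

def syntactic_classify(stroke_string):
    # Return the classes whose grammar generates the observed description.
    return [c for c, g in GRAMMARS.items() if g.match(stroke_string)]

print(syntactic_classify("v+h/top"))  # ['T']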
Post processing.
Grouping.
The result of plain symbol recognition on a document is a set of individual symbols. However,
these symbols in themselves do not usually contain enough information. Instead, we would
like to associate the individual symbols that belong to the same string with each other,
making up words and numbers. The process of performing this association of symbols into
strings, is commonly referred to as grouping. The grouping of the symbols into strings is based
on the symbols’ location in the document. Symbols that are found to be sufficiently close are
grouped together.
For fonts with fixed pitch the process of grouping is fairly easy as the position of each
character is known. For typeset characters the distance between characters is variable.
However, the distance between words is usually significantly larger than the distance
between characters, and grouping is therefore still possible. The real problems occur for
handwritten characters or when the text is skewed.
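Grouping by location can be sketched as below. Symbol bounding boxes are assumed to be given as (x, y, width, height) tuples on a single, unskewed text line; symbols whose horizontal gap to the previous symbol is below a chosen limit are placed in the same string.

def group_symbols(boxes, max_gap=10):
    # Group symbol bounding boxes into strings based on the distance between them.
    if not boxes:
        return []
    boxes = sorted(boxes, key=lambda b: b[0])          # left-to-right order
    groups, current = [], [boxes[0]]
    for prev, box in zip(boxes, boxes[1:]):
        gap = box[0] - (prev[0] + prev[2])             # gap to the previous symbol
        if gap <= max_gap:
            current.append(box)
        else:
            groups.append(current)                     # gap large enough: start a new word
            current = [box]
    groups.append(current)
    return groups

For fixed-pitch fonts max_gap can be derived from the known pitch; for typeset or handwritten text it would have to be estimated, which is where the real difficulty noted above lies.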
Error detection and correction.
It is not possible to obtain a completely correct recognition of all characters, but some of
these errors may be detected or even corrected by the use of context.
There are two main approaches, where the first utilizes the possibility of sequences of
characters appearing together. This may be done by the use of rules defining the syntax of the
word, by saying for instance that after a period there should usually be a capital letter. Also,
for different languages the probabilities of two or more characters appearing together in a
sequence can be computed and may be utilized to detect errors. For instance, in the English
language the probability of a “k” appearing after an “h” in a word is zero, and if such a
combination is detected an error is assumed.
Another approach is the use of dictionaries, which has proven to be the most efficient
method for error detection and correction. Given a word, in which an error may be present,
the word is looked up in the dictionary. If the word is not in the dictionary, an error has been
detected, and may be corrected by changing the word into the most similar word.
Probabilities obtained from the classification, may help to identify the character which has
been erroneously classified. If the word is present in the dictionary, this does unfortunately
not prove that no error occurred. An error may have transformed the word from one legal
word to another, and such errors are undetectable by this procedure. The disadvantage of the
dictionary methods is that the searches and comparisons implied are time-consuming.
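A minimal sketch of the dictionary approach, using the similarity matching in Python's standard difflib module; the tiny hard-coded word list is purely illustrative and, as noted above, a misrecognized word that happens to be another legal word would still pass through undetected.

import difflib

DICTIONARY = {"character", "recognition", "optical", "neural", "network"}

def correct(word):
    # Accept the word if it is in the dictionary, otherwise replace it with
    # the most similar dictionary word (if any is close enough).
    if word.lower() in DICTIONARY:
        return word
    candidates = difflib.get_close_matches(word.lower(), sorted(DICTIONARY), n=1)
    return candidates[0] if candidates else word

print(correct("recognjtion"))  # "recognition"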
Functional Requirements
• The system should process the input given by the user only if it is an image
file (JPG, PNG etc.)
• System shall show the error message to the user when the input given is
not in the required format.
• System should detect characters present in the image.
• System should retrieve characters present in the image and display them to
the user.
Non-Functional Requirements
• Performance: Handwritten characters in the input image will be recognized
with an accuracy of about 90% or more.
OCR systems
OCR systems may be subdivided into two classes. The first class includes the special purpose
machines dedicated to specific recognition problems. The second class covers the systems
that are based on a PC and a low-cost scanner.
Fixed font.
OCR machines of this category deal with the recognition of one specific typewritten font.
Such fonts are OCR-A, OCR-B, Pica, Elite, etc. These fonts are characterized by fixed spacing
between each character. The OCR-A and OCR-B are the American and European standard
fonts specially designed for optical character recognition, where each character has a unique
shape to avoid ambiguity with other characters similar in shape. Using these character sets, it
is quite common for commercial OCR machines to achieve a recognition rate as high as
99.99% with a high reading speed.
The systems of the first OCR generation were fixed font machines, and the methods applied
were usually based on template matching and correlation.
Multifont.
Multifont OCR machines recognize more than one font, as opposed to a fixed font system,
which could only recognize symbols of one specific font. However, the fonts recognized by
these machines are usually of the same type as those recognized by a fixed font system.
These machines appeared after the fixed-font machines. They were able to read up to about
ten fonts. The limit in the number of fonts was due to the pattern recognition algorithm,
template matching, which required that a library of bit map images of each character from
each font was stored. The accuracy is quite good, even on degraded images, as long as the
fonts in the library are selected with care.
Omni font.
An omni font OCR machine can recognize most nonstylized fonts without having to maintain
huge databases of specific font information. Usually omni font technology is characterized by
the use of feature extraction. The database of an omni font system will contain a description
of each symbol class instead of the symbols themselves. This gives flexibility in automatic
recognition of a variety of fonts.
Although omni font is the common term for these OCR systems, this should not be
understood literally as the system being able to recognize all existing fonts. No OCR machine
performs equally well, or even usably well, on all the fonts used by modern typesetters. A lot
of current OCR-systems claim to be omni font.
Constrained handwriting.
Recognition of constrained handwriting deals with the problem of unconnected normal
handwritten characters. Optical readers with such capabilities are not yet very common, but
do exist. However, these systems require well-written characters, and most of them can only
recognize digits unless certain standards for the hand-printed characters are followed (see
figure 10). The characters should be printed as large as possible to retain good resolution, and
entered in specified boxes. The writer is also instructed to keep to certain models provided,
avoiding gaps and extra loops. Commercially the term ICR (Intelligent Character Recognition)
is often used for systems able to recognize handprinted characters.
Script.
All the methods for character recognition described in this document treat the problem of
recognition of isolated characters. However, to humans it might be of more interest if it were
possible to recognize entire words consisting of cursively joined characters. Script recognition
deals with this problem of recognizing unconstrained handwritten characters which may be
connected or cursive.
In signature verification and identification, the objective is to establish the identity of the
writer, irrespective of the handwritten contents. Identification establishes the identity of the
writer by comparing specific attributes of the pattern describing the signature, with those of a
list of writers stored in a reference database. When performing signature verification, the
claimed identity of the writer is known, and the signature pattern is matched against the
signature stored in the database for this person. Some systems of this kind are starting to
appear.
A more difficult problem is script recognition where the contents of the handwriting must be
recognized. This is one of the really challenging areas of optical character recognition. The
variations in shape of handwritten characters are infinite and depend on the writing habit,
style, education, mood, social environment and other conditions of the writer. Even the best
trained optical readers, the humans, make about 4% errors when reading in the absence of
context. Recognition of characters written without any constraint is still quite remote. For the
time being, recognition of handwritten script seems to belong only to online products where
writing tablets are used to extract real-time information and features to aid recognition.
Segmentation.
The majority of errors in OCR-systems are often due to problems in the scanning process and
the following segmentation, resulting in joined or broken characters. Errors in the
segmentation process may also result in confusion between text and graphics or between
text and noise.
Feature extraction.
Even if a character is printed, scanned and segmented correctly, it may be incorrectly
classified. This may happen if the character shapes are similar and the selected features are
not sufficiently efficient in separating the different classes, or if the features are difficult to
extract and have been computed incorrectly.
Classification.
Incorrect classification may also be due to poor design of the classifier. This may happen if the
classifier has not been trained on a sufficient number of test samples representing all the
possible forms of each character.
Grouping.
Finally, errors may be introduced by the postprocessing, when the isolated symbols are
associated to reconstruct the original words as characters may be incorrectly grouped. These
problems may occur if the text is skewed, in some cases of proportional spacing and for
symbols having subscripts or superscripts.
As OCR devices employ a wide range of approaches to character recognition, all systems are
not equally affected by the above types of complexities. The different systems have their
particular strengths and weaknesses. In general, however, the problems of correct
segmentation of isolated characters are the ones most difficult to overcome, and recognition
of joined and split characters are usually the weakest link of an OCR-system.
The performance of an OCR system is usually described by three rates.
• Recognition rate.
The proportion of correctly classified characters.
• Rejection rate.
The proportion of characters which the system was unable to recognize. Rejected
characters can be flagged by the OCR-system, and are therefore easily retraceable for
manual correction.
• Error rate.
The proportion of characters erroneously classified. Misclassified characters go
undetected by the system, and manual inspection of the recognized text is necessary to
detect and correct these errors.
There is usually a tradeoff between the different recognition rates. A low error rate may lead
to a higher rejection rate and a lower recognition rate. Because of the time required to detect
and correct OCR errors, the error rate is the most important when evaluating whether an OCR
system is cost-effective or not. The rejection rate is less critical. An example from barcode
reading may illustrate this. Here a rejection while reading a barcoded price tag will only lead to
rescanning of the code or manual entry, while a wrongly decoded price tag might result in the
customer being charged for the wrong amount. In the barcode industry the error rates are
therefore as low as one in a million labels, while a rejection rate of one in a hundred is
acceptable.
In view of this, it is apparent that it is not sufficient to look solely at the recognition rate of a
system. A correct recognition rate of 99% might imply an error rate of 1%. In the case of text
recognition on a printed page, which on average contains about 2000 characters, an error
rate of 1% means 20 undetected errors per page. In postal applications for mail sorting, where
an address contains about 50 characters, an error rate of 1% implies an error on every other
piece of mail.
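The three rates, and the arithmetic in the examples above, can be made concrete with a short calculation; the character counts used here are invented for illustration.

def ocr_rates(correct, rejected, errors):
    # Recognition, rejection and error rates as proportions of all characters read.
    total = correct + rejected + errors
    return correct / total, rejected / total, errors / total

# An invented 2000-character page with 20 misclassified and 10 rejected characters:
recognition, rejection, error = ocr_rates(correct=1970, rejected=10, errors=20)
print(f"recognition {recognition:.1%}, rejection {rejection:.1%}, error {error:.1%}")
# -> recognition 98.5%, rejection 0.5%, error 1.0%, i.e. 20 undetected errors per page.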
Future improvements
New methods for character recognition are still expected to appear, as computer
technology develops and decreasing computational restrictions open up new approaches.
There might for instance be a potential in performing character recognition directly on grey
level images. However, the greatest potential seems to lie within the exploitation of existing
methods, by mixing methodologies and making more use of context.
Integration of segmentation and contextual analysis can improve recognition of joined and
split characters. Also, higher level contextual analysis which looks at the semantics of entire
sentences may be useful. Generally, there is a potential in using context to a larger extent
than what is done today. In addition, combinations of multiple independent feature sets and
classifiers, where the weakness of one method is compensated by the strength of another,
may improve the recognition of individual characters.
The frontiers of research within character recognition have now moved towards the
recognition of cursive script, that is handwritten connected or calligraphic characters.
Promising techniques within this area, deal with the recognition of entire words instead of
individual characters.
Future needs
Today optical character recognition is most successful for constrained material, that is
documents produced under some control. However, in the future it seems that the need for
constrained OCR will be decreasing. The reason for this is that control of the production
process usually means that the document is produced from material already stored on a
computer. Hence, if a computer readable version is already available, this means that data
may be exchanged electronically or printed in a more computer readable form, for instance
barcodes.
The applications for future OCR-systems lie in the recognition of documents where control
over the production process is impossible. This may be material where the recipient is cut off
from an electronic version and has no control of the production process or older material
which at production time could not be generated electronically. This means that future OCR-
systems intended for reading printed text must be omni font.
Another important area for OCR is the recognition of manually produced documents. Within
postal applications for instance, OCR must focus on reading of addresses on mail produced by
people without access to computer technology. Already, it is not unusual for companies and
other organizations with access to computer technology to mark mail with barcodes. The relative importance of
handwritten text recognition is therefore expected to increase.
4.1 Conclusion
Summary
Character recognition techniques associate a symbolic identity with the image of a character.
Character recognition is commonly referred to as optical character recognition (OCR), as it
deals with the recognition of optically processed characters. The modern version of OCR
appeared in the middle of the 1940’s with the development of the digital computers. OCR
machines have been commercially available since the middle of the 1950’s. Today OCR-
systems are available both as hardware devices and software packages, and a few thousand
systems are sold every week.
In a typical OCR system input characters are digitized by an optical scanner. Each character is
then located and segmented, and the resulting character image is fed into a preprocessor for
noise reduction and normalization. Certain characteristics are then extracted from the
character for classification. The feature extraction is critical and many different techniques
exist, each having its strengths and weaknesses. After classification the identified characters
are grouped to reconstruct the original symbol strings, and context may then be applied to
detect and correct errors.
Optical character recognition has many different practical applications. The main areas where
OCR has been of importance, are text entry (office automation), data entry (banking
environment) and process automation (mail sorting).
The present state of the art in OCR has moved from primitive schemes for limited character
sets, to the application of more sophisticated techniques for omni font and handprint
recognition. The main problems in OCR usually lie in the segmentation of degraded symbols
which are joined or fragmented. Generally, the accuracy of an OCR system is directly
dependent upon the quality of the input document. Three figures are used in ratings of OCR
systems: correct classification rate, rejection rate and error rate. The performance should be
rated from the system's error rate, as these errors go undetected by the system and must
be manually located for correction.
In spite of the great number of algorithms that have been developed for character
recognition, the problem is not yet solved satisfactorily, especially not in cases where there
are no strict limitations on the handwriting or quality of print. Up to now, no recognition
algorithm can compete with humans in quality. However, as the OCR machine is able to read
much faster, it is still attractive.
In the future the area of recognition of constrained print is expected to decrease. Emphasis
will then be on the recognition of unconstrained writing, like omni font and handwriting. This
is a challenge which requires improved recognition techniques. The potential for OCR
algorithms seems to lie in the combination of different methods and the use of techniques
that are able to utilize context to a much larger extent than current methodologies.
Progress/Attendance Report

Sr. No.   Week                       Topic
1         20th July - 24th July      Introduction
2         27th July - 31st July      Literature Survey
3         3rd Aug. - 7th Aug.        Survey Existing System
4         10th Aug. - 14th Aug.      Problem Statement
5         17th Aug. - 21st Aug.      Presentation 1
6         24th Aug. - 28th Aug.      Objective
7         31st Aug. - 4th Sept.      Scope of the Work
8         7th Sept. - 12th Sept.     Algorithm
9         14th Sept. - 19th Sept.
10        21st Sept. - 26th Sept.    Design details
11        28th Sept. - 3rd Oct.      List of figures
12        5th Oct. - 9th Oct.        Analysis
13        26th Oct. - 30th Oct.      Framework
14        2nd Nov. - 6th Nov.        Methodology
15        27th Nov. - 2nd Dec.       Conclusion