Word Retrieval from Kannada Document Images Using HOG and Morphological Features

Hangarge, Mallikarjun; Veershetty, C.; Rajmohan, P.; Mukarambi, Gururaj

doi:10.1007/978-981-10-4859-3_7

Part of the book series: Communications in Computer and Information Science (CCIS, volume 709)

Included in the following conference series:

International Conference on Recent Trends in Image Processing and Pattern Recognition

Abstract

This paper presents a method for retrieving words from Kannada document images using Histogram of Oriented Gradients (HOG) features and morphological filters. A dataset of 50,000 words is created from 250 document pages belonging to different categories. Each preprocessed document image is segmented using simple morphological filters. For each word image, gradients are computed with histogram channels laid out over rectangular cells (i.e., R-HOG). In parallel, morphological erosion, opening, top-hat, and bottom-hat transformations are applied to each word, and the densities of the resulting images are estimated. The HOG and morphological features are then fused. Cosine distance is used to measure the similarity between a query word and a candidate word, and the relevance of each candidate is estimated by ranking these distances. Correctly matched words are then selected at a threshold of 98%. The experimental results confirm the effectiveness of the proposed method, with an average precision of 91.23%, an average recall of 84.78%, and an average F-measure of 89.47%.
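
The matching stage described in the abstract (feature fusion, cosine-based comparison, ranking, and thresholding) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the HOG and morphological density vectors have already been computed for each word image, that fusion is plain concatenation, and that the 98% threshold applies to cosine similarity (1 minus cosine distance). The abstract specifies none of these details, and the names fuse, cosine_similarity, and retrieve are hypothetical.

```python
import math


def fuse(hog_vec, morph_vec):
    # Assumption: "fusion" means concatenating the two feature vectors.
    return list(hog_vec) + list(morph_vec)


def cosine_similarity(a, b):
    # Cosine similarity = 1 - cosine distance; 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero feature vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve(query, candidates, threshold=0.98):
    # Score every candidate, rank by decreasing similarity, and keep
    # only those at or above the threshold (assumed to be 0.98).
    scored = [(cosine_similarity(query, vec), word_id)
              for word_id, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(word_id, sim) for sim, word_id in scored if sim >= threshold]


# Toy feature vectors (made-up numbers, for illustration only).
query = fuse([0.2, 0.5, 0.1], [0.3, 0.4])
candidates = {
    "w1": fuse([0.4, 1.0, 0.2], [0.6, 0.8]),      # scaled copy of the query
    "w2": fuse([0.5, 0.1, 0.4], [0.1, 0.1]),      # clearly different word
    "w3": fuse([0.21, 0.49, 0.1], [0.31, 0.4]),   # near-duplicate of the query
}
hits = retrieve(query, candidates)
print([word_id for word_id, _ in hits])
```

With these toy vectors, the scaled copy and the near-duplicate pass the threshold and the dissimilar word is rejected. Cosine similarity ignores vector magnitude, so in practice the two feature groups may need to be normalized before concatenation to keep the longer HOG vector from dominating the comparison.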

Author information

Authors and Affiliations

P.G. Department of Computer Science, Karnatak College, Bidar, India
Mallikarjun Hangarge, C. Veershetty & P. Rajmohan
Department of Computer Science, Gulbarga University, Kalaburagi, India
Gururaj Mukarambi

Corresponding author

Correspondence to C. Veershetty.

Editor information

Editors and Affiliations

The University of South Dakota, Vermillion, South Dakota, USA
K.C. Santosh
Karnatak Arts, Science and Commerce College, Bidar, India
Mallikarjun Hangarge
Politecnico di Bari, Bari, Italy
Vitoantonio Bevilacqua
University of Hyderabad, Hyderabad, India
Atul Negi

About this paper

Cite this paper

Hangarge, M., Veershetty, C., Rajmohan, P., Mukarambi, G. (2017). Word Retrieval from Kannada Document Images Using HOG and Morphological Features. In: Santosh, K., Hangarge, M., Bevilacqua, V., Negi, A. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2016. Communications in Computer and Information Science, vol 709. Springer, Singapore. https://doi.org/10.1007/978-981-10-4859-3_7

DOI: https://doi.org/10.1007/978-981-10-4859-3_7
Published: 29 April 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-4858-6
Online ISBN: 978-981-10-4859-3
eBook Packages: Computer Science, Computer Science (R0)
