Offline handwritten Amharic word recognition

Published: 01 June 2011

Abstract

This paper describes two approaches to Amharic word recognition in unconstrained handwritten text using hidden Markov models (HMMs). The first approach builds word models from the concatenated features of the constituent characters; the second concatenates the HMMs of the constituent characters to form the word model. In both cases, the features used for training and recognition are a set of primitive strokes and their spatial relationships. The recognition system does not require character segmentation, but it does require text line detection and extraction of structural features, which is done using the direction field tensor. The system is tested on a dataset of unconstrained handwritten documents collected from various sources, and promising results are obtained.
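The second approach, concatenating character HMMs into a word model, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the one-state character models, the two-symbol alphabet standing in for primitive strokes, and every probability below are toy assumptions, and the names `concat_hmms` and `forward_likelihood` are hypothetical.

```python
from typing import List, Tuple

# Assumed toy representation of a left-to-right character HMM:
#   A[i][j]    transition probability between the character's own states
#   B[i][k]    probability of emitting discrete symbol k in state i
#   exit_prob  probability of leaving the character's last state
CharHMM = Tuple[List[List[float]], List[List[float]], float]


def concat_hmms(models: List[CharHMM]):
    """Chain character HMMs into one word-level HMM (A, B, exit_prob)."""
    n = sum(len(A) for A, _, _ in models)
    A_word = [[0.0] * n for _ in range(n)]
    B_word: List[List[float]] = []
    offset = 0
    for idx, (A, B, exit_prob) in enumerate(models):
        m = len(A)
        # Copy the character's internal transitions into its diagonal block.
        for i in range(m):
            for j in range(m):
                A_word[offset + i][offset + j] = A[i][j]
        # Link the last state of this character to the first state of the next.
        if idx + 1 < len(models):
            A_word[offset + m - 1][offset + m] = exit_prob
        B_word.extend(list(row) for row in B)
        offset += m
    # The exit probability of the word is that of its last character.
    return A_word, B_word, models[-1][2]


def forward_likelihood(A, B, exit_prob, obs: List[int]) -> float:
    """Forward algorithm: P(obs | model), entering at state 0 and
    leaving through the last state. Assumes obs is non-empty."""
    n = len(A)
    alpha = [0.0] * n
    alpha[0] = B[0][obs[0]]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    return alpha[-1] * exit_prob


# Two toy one-state character models over symbols {0, 1}.
char_a: CharHMM = ([[0.5]], [[0.9, 0.1]], 0.5)  # mostly emits symbol 0
char_b: CharHMM = ([[0.5]], [[0.2, 0.8]], 0.5)  # mostly emits symbol 1

observed = [0, 1]  # a hypothetical stroke-symbol sequence
score_ab = forward_likelihood(*concat_hmms([char_a, char_b]), observed)
score_ba = forward_likelihood(*concat_hmms([char_b, char_a]), observed)
best_word = "ab" if score_ab > score_ba else "ba"
print(best_word, score_ab, score_ba)
```

Recognition in this sketch amounts to building one concatenated model per lexicon word and picking the word whose model gives the observation sequence the highest likelihood. No character boundaries are supplied in advance, which is consistent with the abstract's point that character segmentation is not required.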

Cited By

  • (2024) A Historical Handwritten Dataset for Ethiopic OCR with Baseline Models and Human-Level Performance. Document Analysis and Recognition - ICDAR 2024, pp. 23-38. doi:10.1007/978-3-031-70543-4_2. Online publication date: 30-Aug-2024
  • (2022) Multi-script handwritten digit recognition using multi-task learning. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology 43:1, pp. 355-364. doi:10.3233/JIFS-212233. Online publication date: 1-Jan-2022
  • (2012) Recognition of Ethiopic braille characters. Proceedings of the International Conference on Management of Emergent Digital EcoSystems, pp. 158-165. doi:10.1145/2457276.2457304. Online publication date: 28-Oct-2012

Published In

Pattern Recognition Letters, Volume 32, Issue 8
June, 2011
142 pages

Publisher

Elsevier Science Inc.

United States

Author Tags

  1. Amharic
  2. Ethiopic script
  3. HMM
  4. Handwriting recognition
  5. OCR
  6. Word recognition

Qualifiers

  • Article
