Abstract
This paper examines how varying the coarseness (or fineness) of a data representation affects the achievable learning or recognition accuracy. This accuracy is quantified by the least probability of error in recognition, that is, the Bayes error rate, for a finite-class pattern recognition problem. We examine variation in recognition accuracy as a function of resolution by modeling the granularity variation of the representation as a refinement of the underlying probability structure of the data. Specifically, refining the data representation leads to improved bounds on the probability of error. This confirms the intuitive notion that more information can lead to improved decision-making. The analysis may be extended to multiresolution methods, where coarse-to-fine and fine-to-coarse variations in representations are possible.
We also discuss a general method to examine the effects of image resolution on recognizer performance. Empirical results on an 840-class Japanese optical character recognition task are presented. Considerable improvements in performance are observed as resolution increases from 40 to 200 ppi. However, diminished performance improvements are observed at resolutions higher than 200 ppi. These results are useful in the design of optical character recognizers. We suggest that our results may be relevant to human letter recognition studies, where such an objective evaluation of the task is required.
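The relation between refinement and the Bayes error rate stated above can be illustrated with a small discrete example. The following is a minimal sketch, not the chapter's own formulation: it assumes a finite joint table of class and observation-cell probabilities, and the function names `bayes_error` and `coarsen`, as well as the probability values, are hypothetical and chosen only for illustration.

```python
def bayes_error(joint):
    """Bayes error for a discrete joint table.

    joint[c][x] is the probability that class c occurs together with
    observation cell x; all entries are assumed to sum to 1.
    """
    # The optimal rule picks the most probable class in each cell, so the
    # probability of a correct decision is the sum of the column maxima.
    return 1.0 - sum(max(column) for column in zip(*joint))


def coarsen(joint, groups):
    """Merge observation cells; each group lists the fine cells it pools."""
    return [[sum(row[i] for i in group) for group in groups] for row in joint]


# Hypothetical two-class problem observed at a fine resolution of four cells.
fine = [
    [0.30, 0.05, 0.10, 0.05],  # class 0
    [0.05, 0.20, 0.05, 0.20],  # class 1
]

# Coarser representation: cells 0 and 1 are pooled, as are cells 2 and 3.
coarse = coarsen(fine, [[0, 1], [2, 3]])

fine_error = bayes_error(fine)
coarse_error = bayes_error(coarse)
```

For these made-up numbers the fine representation gives a Bayes error of about 0.20 and the pooled representation about 0.40. In this discrete setting, pooling cells can never lower the error, because the maximum of a sum of class probabilities is at most the sum of the per-cell maxima.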
© 1996 Springer-Verlag New York, Inc.
About this chapter
Cite this chapter
Srikantan, G., Srihari, S.N. (1996). Data Representations in Learning. In: Fisher, D., Lenz, HJ. (eds) Learning from Data. Lecture Notes in Statistics, vol 112. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-2404-4_29
DOI: https://doi.org/10.1007/978-1-4612-2404-4_29
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-94736-5
Online ISBN: 978-1-4612-2404-4