Abstract
This paper clarifies the learning efficiency of non-regular parametric models, such as neural networks, whose true parameter set is an analytic variety with singular points. Using Sato's b-function, we rigorously prove that the free energy, or Bayesian stochastic complexity, is asymptotically equal to $\lambda_1 \log n - (m_1 - 1) \log\log n + \text{constant}$, where $\lambda_1$ is a rational number, $m_1$ is a natural number, and $n$ is the number of training samples. We also give an algorithm to calculate $\lambda_1$ and $m_1$ based on the resolution of singularities. In regular models, $2\lambda_1$ is equal to the number of parameters and $m_1 = 1$, whereas in non-regular models such as neural networks, $2\lambda_1$ is smaller than the number of parameters and $m_1 \geq 1$.
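As a rough illustration of the asymptotic form stated above, the following Python sketch evaluates only the two leading terms, $\lambda_1 \log n - (m_1 - 1) \log\log n$, and compares a regular model (where $2\lambda_1$ equals the number of parameters and $m_1 = 1$) with a singular one. The function names, the parameter count of 10, and the singular values $\lambda_1 = 1.5$, $m_1 = 2$ are assumptions chosen for illustration; they are not taken from the paper, and the unknown constant term is omitted.

```python
import math


def free_energy_leading_terms(n, lam1, m1):
    """Return lam1*log(n) - (m1 - 1)*log(log(n)).

    Only the two leading terms of the asymptotic expansion are computed;
    the O(1) constant is unknown here and is left out.
    """
    if n <= 1:
        raise ValueError("n must exceed 1 so that log log n is defined")
    return lam1 * math.log(n) - (m1 - 1) * math.log(math.log(n))


def regular_model_terms(n, num_params):
    """Regular case from the abstract: 2*lam1 = number of parameters, m1 = 1."""
    return free_energy_leading_terms(n, num_params / 2.0, 1)


if __name__ == "__main__":
    num_params = 10            # hypothetical parameter count
    lam1_sing, m1_sing = 1.5, 2  # hypothetical singular-model values
    for n in (100, 10_000, 1_000_000):
        reg = regular_model_terms(n, num_params)
        sing = free_energy_leading_terms(n, lam1_sing, m1_sing)
        print(f"n={n:>9}  regular={reg:8.3f}  singular={sing:8.3f}")
```

Under these assumed values the singular model's leading terms grow more slowly in $n$ than the regular model's, which is the qualitative point of the abstract's comparison.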
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Watanabe, S. (1999). Algebraic Analysis for Singular Statistical Estimation. In: Watanabe, O., Yokomori, T. (eds) Algorithmic Learning Theory. ALT 1999. Lecture Notes in Computer Science, vol 1720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46769-6_4
DOI: https://doi.org/10.1007/3-540-46769-6_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66748-3
Online ISBN: 978-3-540-46769-4
eBook Packages: Springer Book Archive