nLab singular learning theory

In the context of machine learning, singular learning theory applies results from algebraic geometry to statistical learning theory. For learning machines such as deep neural networks, many distinct parameter values can realize the same statistical distribution; the preimage of the target distribution is then typically a singular subset of parameter space rather than a smooth submanifold. Techniques from algebraic geometry may then be applied to study learning with such models.
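The non-identifiability described above can be made concrete with a minimal toy model (a hypothetical illustration, not taken from the references below): a one-hidden-unit tanh network $f(x; a, b) = a \tanh(b x)$. The zero function is realized on the whole set $\{a = 0\} \cup \{b = 0\}$, two crossing lines in parameter space that are singular at the origin, where the Fisher information also degenerates:

```python
import numpy as np

# Hypothetical toy model (not from the references): f(x; a, b) = a * tanh(b * x),
# a one-hidden-unit tanh network with no biases.
def f(x, a, b):
    return a * np.tanh(b * x)

x = np.linspace(-2.0, 2.0, 101)

# Many distinct parameters realize the same (zero) function: the preimage
# of the target f = 0 is the set {a = 0} ∪ {b = 0}, two crossing lines
# in parameter space, singular at the origin.
print(np.allclose(f(x, 0.0, 3.7), 0.0))   # True: a = 0 gives the zero map
print(np.allclose(f(x, 2.5, 0.0), 0.0))   # True: b = 0 gives the zero map

# At the singular point (a, b) = (0, 0) the gradient of f vanishes for
# every input x, so the Fisher information matrix degenerates there:
grad_a = np.tanh(0.0 * x)                  # ∂f/∂a = tanh(b x) = 0 at b = 0
grad_b = 0.0 * x / np.cosh(0.0 * x) ** 2   # ∂f/∂b = a x sech²(b x) = 0 at a = 0
print(np.allclose(grad_a, 0.0) and np.allclose(grad_b, 0.0))  # True
```

Because the set of true parameters is not a point (and not even a manifold), the usual regular-model asymptotics (e.g. BIC) fail, which is the situation singular learning theory addresses.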


That neural networks are singular statistical models has been understood at least since:

  • Shun-ichi Amari, T. Ozeki, H. Park, Learning and inference in hierarchical models with singularities, Syst. Comput. Japan 34:7 (2003) 34–42
  • Sumio Watanabe, Almost all learning machines are singular, Proc. IEEE Symp. Found. Comput. Intell., Apr. 2007, 383–388.

Textbook treatment:

  • Sumio Watanabe, Algebraic Geometry and Statistical Learning Theory, Cambridge University Press (2009)

For an informal discussion:

  • Jesse Hoogland, Neural networks generalize because of this one weird trick (blog post)

  • Jesse Hoogland, Filip Sondej, Spooky action at a distance in the loss landscape (blog post)

For a series of talks and further texts:

  • Singular Learning Theory seminar, (webpage)

  • S. Wei, Daniel Murfet, M. Gong, H. Li, J. Gell-Redman, T. Quella, Deep learning is singular, and that’s good, IEEE Transactions on Neural Networks and Learning Systems (pdf)

Last revised on April 7, 2023 at 12:33:31.