singular learning theory

Contents

Idea

In the context of machine learning, singular learning theory applies results from algebraic geometry to statistical learning theory. For learning machines such as deep neural networks, many distinct parameter values may correspond to the same statistical distribution, so that the preimage of a target distribution in parameter space may form a singular subvariety rather than a smooth manifold. Techniques from algebraic geometry may then be applied to study learning with such models.
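
As a minimal illustration (a standard toy example in this literature, not drawn from any one of the references below): consider the regression model $y = a \, \tanh(b x) + \text{noise}$ with parameters $(a, b) \in \mathbb{R}^2$. The parameters realizing the zero regression function form the set

$$ \{ (a, b) \in \mathbb{R}^2 \mid a \tanh(b x) \equiv 0 \} \;=\; \{ a = 0 \} \cup \{ b = 0 \} \,, $$

the union of the two coordinate axes, which is singular at the origin. At such points the Fisher information matrix is degenerate, so the asymptotics of regular statistical models (asymptotic normality, BIC) do not apply; Watanabe's theory instead controls the learning behavior via a resolution of singularities of this set and its real log canonical threshold.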

References

That neural networks are singular statistical models has been understood at least since:

  • Shun-ichi Amari, Tomoko Ozeki, Hyeyoung Park, Learning and inference in hierarchical models with singularities, Systems and Computers in Japan 34:7 (2003) 34–42
  • Sumio Watanabe, Almost all learning machines are singular, Proceedings of the IEEE Symposium on Foundations of Computational Intelligence (2007) 383–388

Textbook treatments:

  • Sumio Watanabe, Algebraic Geometry and Statistical Learning Theory, Cambridge University Press (2009)

  • Sumio Watanabe, Mathematical Theory of Bayesian Statistics, CRC Press (2018)

For an informal discussion:

  • Jesse Hoogland, Neural networks generalize because of this one weird trick (blog post)

  • Jesse Hoogland, Filip Sondej, Spooky action at a distance in the loss landscape (blog post)

For a series of talks and further texts:

  • Singular Learning Theory seminar (webpage)

  • Susan Wei, Daniel Murfet, Mingming Gong, Hui Li, Jesse Gell-Redman, Thomas Quella, Deep learning is singular, and that's good, IEEE Transactions on Neural Networks and Learning Systems (pdf)
