Information geometry aims to apply the techniques of differential geometry to statistics. Often it is useful to think of a family of probability distributions as a statistical manifold. For example, the normal (Gaussian) distributions form a 2-dimensional manifold, parameterised by $(\mu, \sigma)$, the mean and standard deviation. On such manifolds there are notions of Riemannian metric, connection, curvature, and so on, of statistical relevance.
Kullback-Leibler information, or relative entropy, features as a measure of divergence (not quite a metric, because it’s asymmetric), and Fisher information takes the role of curvature. One useful aspect of information geometry is that it gives a means to prove results about statistical models, simply by considering them as well-behaved geometrical objects. For instance, it’s basically a tautology to say that a manifold is not changing much in the vicinity of points of low curvature, and changing greatly near points of high curvature. Stated more precisely, and then translated back into probabilistic language, this becomes the Cramér-Rao inequality: the variance of an unbiased parameter estimator is at least the reciprocal of the Fisher information. (Shalizi)
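The two claims above can be checked numerically on the normal family. The following is a minimal sketch, not taken from the cited sources: the function names `kl_normal` and `fisher_info_mean` and all parameter values are chosen here for illustration. It uses the standard closed form for the relative entropy between two univariate normal distributions, shows that swapping the arguments changes the value, and then compares the variance of the sample mean (an unbiased estimator of the mean) with the Cramér-Rao bound in a small simulation.

```python
import math
import random
import statistics

def kl_normal(mu1, sigma1, mu2, sigma2):
    """Relative entropy KL( N(mu1, sigma1^2) || N(mu2, sigma2^2) ), closed form."""
    return (math.log(sigma2 / sigma1)
            + (sigma1**2 + (mu1 - mu2)**2) / (2.0 * sigma2**2)
            - 0.5)

def fisher_info_mean(sigma):
    """Fisher information about the mean carried by one draw from N(mu, sigma^2)."""
    return 1.0 / sigma**2

# Asymmetry: the divergence depends on the order of its arguments.
forward = kl_normal(0.0, 1.0, 1.0, 2.0)
backward = kl_normal(1.0, 2.0, 0.0, 1.0)

# Locally, the divergence is governed by the Fisher information:
# for a small shift delta of the mean, KL = (1/2) * delta^2 * I(mu).
sigma, delta = 2.0, 0.1
fisher = fisher_info_mean(sigma)
local_kl = kl_normal(0.0, sigma, delta, sigma)

# Cramér-Rao: with n independent draws the information is n * I(mu), so an
# unbiased estimator of mu has variance at least 1 / (n * I(mu)).
# The sample mean attains this bound for the normal family.
n, trials = 10, 20000
rng = random.Random(0)  # fixed seed so the run is reproducible
means = [statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
         for _ in range(trials)]
empirical_var = statistics.pvariance(means)
bound = 1.0 / (n * fisher)

print("KL forward / backward:", forward, backward)
print("local KL vs (1/2) delta^2 I:", local_kl, 0.5 * delta**2 * fisher)
print("variance of sample mean vs Cramér-Rao bound:", empirical_var, bound)
```

The simulated variance only approximates the bound, since it is itself a Monte Carlo estimate; increasing `trials` tightens the agreement.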
One of the founders of the subject is Shun-ichi Amari.
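The Fisher information metric $g$ on such a manifold is given, at a measure $s$ and for tangent vectors $v, w$, by
$$ g(v,w)_s := E_s\big( v(\log s)\, w(\log s) \big) \,, $$
where $E_s(f)$ denotes the expectation value under the measure $s$ of a function $f$ on the underlying sample space, and $v(\log s)$ is the derivative of $\log s$ along $v$.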
See for instance (Amari, (2.1)).
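To make the formula concrete, here is a rough sketch that evaluates it for the normal family in the coordinates $(\mu, \sigma)$, taking $v$ and $w$ to be the coordinate vector fields, so that $v(\log s)$ becomes a partial derivative of the log-density. The helper names (`normal_pdf`, `score`, `fisher_metric`), the midpoint integration rule and the truncation of the integral to twelve standard deviations are assumptions made for this illustration, not part of the cited references. For this family the result should be close to the known matrix with entries $1/\sigma^2$ and $2/\sigma^2$ on the diagonal and zero elsewhere.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def score(x, mu, sigma):
    """Partial derivatives of the log-density with respect to mu and sigma."""
    d_mu = (x - mu) / sigma**2
    d_sigma = ((x - mu)**2 - sigma**2) / sigma**3
    return (d_mu, d_sigma)

def fisher_metric(mu, sigma, half_width=12.0, steps=20000):
    """Approximate g_ij = E[ d_i log p * d_j log p ] by a midpoint rule.

    The integral is truncated to mu +/- half_width * sigma, outside of
    which the normal density is negligible.
    """
    lo = mu - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    g = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        weight = normal_pdf(x, mu, sigma) * dx  # probability mass of this cell
        s = score(x, mu, sigma)
        for i in range(2):
            for j in range(2):
                g[i][j] += s[i] * s[j] * weight
    return g

g = fisher_metric(mu=1.0, sigma=2.0)
for row in g:
    print(row)
```

The matrix does not depend on `mu`, which reflects the translation invariance of the family, and in these coordinates it is, up to a constant rescaling of one coordinate, the metric of the hyperbolic upper half-plane.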
A textbook providing the big picture is
Lecture notes include
Hông Vân Lê, Statistical manifolds are statistical models, Journal of Geometry 84(1-2), March 2006, pp. 83-93.
A brief introduction with more references is