The theory of hyperfunctions, created by the Japanese school of Mikio Sato, Masaki Kashiwara et al., is one of the many variants of the theory of generalized functions. Unlike the earlier theory of distributions of Schwartz et al., it is not based on duals of spaces of smooth functions but on boundary values of holomorphic functions. This allows the use of sheaf theory and includes Schwartz distributions as a special case. For example, in Schwartz's theory one has, apart from measures, distributions with discrete support, which are linear combinations of finitely many derivatives of delta functions. Here one can allow things like the exponential of a differential operator applied to a delta function, so in a sense one has distributions of infinite order.

Hyperfunctions are a very useful tool in the study of D-modules, holonomic systems of differential equations, and those aspects of symplectic geometry and harmonic analysis that are part of microlocal analysis, in particular algebraic microlocalization.

This page is about hyperfunctions of one variable only; for the case of several variables see hyperfunction of multiple variables.


We define hyperfunctions and explain some basic properties.

The theory of hyperfunctions of one variable is considerably easier than the theory for several variables. For this reason we split the exposition, handling the one-dimensional case first and then generalizing to several dimensions in hyperfunction of multiple variables.

The exposition of the one dimensional theory will try to illuminate the following points:

  1. Hyperfunctions are more general than distributions: In a certain sense, the space of test functions (for compactly supported hyperfunctions) is restricted to real analytic functions instead of smooth functions. Distributions are “functions that are meromorphic on the real line”, while hyperfunctions are allowed to have essential singularities. The meaning of this will be explained by an example.

A basic strategy of the exposition will therefore be to stress both similarities and differences of hyperfunctions and distributions (real analytic and smooth setting).

  2. Hyperfunctions require less technical machinery, at least in one dimension, since they can be defined and studied with basic complex analysis only. That will nevertheless make it possible to give rigorous definitions of otherwise only formal expressions that are often used by engineers and physicists, like $\int_a^b \delta(x) dx$.

  3. Hyperfunctions and their microfunctions both form flabby sheaves, which is the starting point of a study of singularities of hyperfunctions and the algebraic analysis of systems of differential equations.


Let $I = (a, b) \subseteq \mathbb{R}$ be an open interval. A complex neighbourhood of $I$ is an open set $U \subset \mathbb{C}$ such that $U \cap \mathbb{R} = I$. For $V \subset \mathbb{C}$ open, let $\mathcal{O}(V)$ be the space of holomorphic functions on $V$.


The set of hyperfunctions on $I$, defined using a complex neighbourhood $U$ of $I$, is the quotient

$$\mathcal{B}(I) := \frac{\mathcal{O}(U \setminus I)}{\mathcal{O}(U)}$$

The set $\mathcal{B}(I)$ does not depend on the chosen complex neighbourhood $U$, which is a consequence of Mittag-Leffler's theorem in the one-dimensional case. The definition easily generalizes to open subsets of $\mathbb{R}$.

Let $U^+ := \{u \in U : \operatorname{Im}(u) > 0\}$ and $U^- := \{u \in U : \operatorname{Im}(u) < 0\}$. A nontrivial hyperfunction can then be described by two functions, $F^+$ holomorphic on $U^+$ and $F^-$ holomorphic on $U^-$, such that there is no function $F$ holomorphic on $U$ that restricts to $F^+$ and $F^-$. In this sense hyperfunctions are “differences” of holomorphic functions on a “boundary” (we did not specify an algebraic object that contains both $F^+$ and $F^-$, so strictly speaking we cannot subtract them).

In the following we will often write $F = (F^+, F^-)$ for a hyperfunction $F$, with $(F^+, F^-)$ being a representative of the equivalence class that is $F$ in $\mathcal{B}$ on some complex neighbourhood. If the algebraic expressions of $F^+$ and $F^-$ coincide, we can further simplify the notation and simply write $[F]$.


The $\delta$ distribution can be represented by $\delta = [\frac{i}{2 \pi z}] = [-\frac{1}{2 \pi i z}]$. Once integration of compactly supported hyperfunctions is defined below, this equality turns out to be simply a restatement of Cauchy's integral formula.
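To see informally why this representative behaves like $\delta$, one can compute the difference of its values just above and just below the real axis. This is only a heuristic sketch; it assumes that a hyperfunction $[F]$ is read as the boundary value $F(x + i\epsilon) - F(x - i\epsilon)$ for $\epsilon \to 0^+$:

```latex
\frac{i}{2\pi} \left( \frac{1}{x + i\epsilon} - \frac{1}{x - i\epsilon} \right)
  = \frac{i}{2\pi} \cdot \frac{-2 i \epsilon}{x^2 + \epsilon^2}
  = \frac{1}{\pi} \, \frac{\epsilon}{x^2 + \epsilon^2}
```

The right-hand side is the Poisson kernel: it has total integral $1$ over $\mathbb{R}$ for every $\epsilon > 0$ and tends to $0$ for each fixed $x \neq 0$ as $\epsilon \to 0^+$.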

The Heaviside function $H(x)$ has the representation

$$H(x) = [- \frac{1}{2 \pi i} \log(-z)]$$

Here we use the principal branch of the logarithm, which is defined on the complex plane minus the negative real axis. Then $\log(-z)$ takes the value $\log(|z|) - \pi i$ on the upper side of the positive real axis and $\log(|z|) + \pi i$ on the lower side of the positive real axis, while being holomorphic across the negative real axis. This results in

$$H(x) = \lim_{\epsilon \to 0+} (F^+(x + i \epsilon) - F^-(x - i \epsilon)) = \begin{cases} 1, & \text{if}\; x > 0 \\ 0, & \text{if}\; x < 0 \end{cases}$$

and a singularity at $x = 0$.
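The boundary-value computation for the Heaviside function can be checked numerically. The following is a minimal sketch, not part of the theory itself: the function names `heaviside_rep` and `boundary_value` are hypothetical, and the limit $\epsilon \to 0^+$ is only approximated by a small fixed `eps`. Python's `cmath.log` uses the principal branch, matching the choice made above.

```python
import cmath
import math


def heaviside_rep(z):
    """Defining function -1/(2*pi*i) * log(-z), principal branch of log."""
    return -cmath.log(-z) / (2j * math.pi)


def boundary_value(f, x, eps=1e-9):
    """Approximate F(x + i*eps) - F(x - i*eps) for a small eps > 0."""
    return f(complex(x, eps)) - f(complex(x, -eps))


# The difference is close to 0 for x < 0 and close to 1 for x > 0.
for x in (-2.0, -0.5, 0.5, 2.0):
    print(x, boundary_value(heaviside_rep, x).real)
```

At $x = 0$ itself the defining function has a logarithmic singularity, so the sketch only probes points away from the origin.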

Basic Properties and Definitions

Let $U \subseteq \mathbb{R}$ be open.



The derivative of a hyperfunction $F = [F^+, F^-]$ is

$$F' = [\frac{dF^+}{dz}, \frac{dF^-}{dz}]$$

We immediately see that the derivative of a hyperfunction is again a hyperfunction and that all hyperfunctions are infinitely differentiable.


As the derivative of the Heaviside function we obtain:

$$H'(x) = [\frac{d}{dz} (- \frac{1}{2 \pi i} \log(-z))] = [- \frac{1}{2 \pi i z}] = \delta(x)$$

Note that we can derive this relationship without any resort to dualities of topological vector spaces.

Sheaf Structure


(hyperfunctions are a module over analytic functions) Let $\mathcal{A}(U)$ be the algebra of real analytic functions on $U$. Then $\mathcal{B}(U)$ is a module over $\mathcal{A}(U)$.

In fact every real analytic function $f$ is naturally an element of $\mathcal{B}(U)$, represented by $[f, 0]$ or $[0, -f]$ on a complex neighbourhood to which $f$ extends holomorphically. Given any hyperfunction $F = (F^+, F^-)$ we can define the product by $f F := (f F^+, f F^-)$, which can be shown to be independent of the various choices involved (representation of $F$, complex neighbourhood of $U$, domain of the holomorphic extension of $f$).
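As a small worked example of this module structure, take $f(x) = x$ and $F = \delta$. This is a sketch under the assumption that the product multiplies both components of a representative by the holomorphic extension of $f$; it uses the representative $\frac{i}{2\pi z} = -\frac{1}{2\pi i z}$ of $\delta$:

```latex
x \cdot \delta(x)
  = \left[ z \cdot \left( -\frac{1}{2 \pi i z} \right) \right]
  = \left[ -\frac{1}{2 \pi i} \right]
  = 0
```

The last step holds because a constant is holomorphic on all of the complex neighbourhood and therefore represents the zero class. This recovers the familiar identity $x \delta(x) = 0$ from distribution theory.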


(hyperfunctions form a flabby sheaf) The sets of hyperfunctions on open subsets of $U$ form a flabby sheaf of vector spaces over $U$.


(distributions are a subsheaf) There is a linear injection from the space of distributions $\mathcal{D}'(U)$ to the space of hyperfunctions $\mathcal{B}(U)$. The sheaf of germs of distributions is therefore a subsheaf of the sheaf of germs of hyperfunctions.


There is no product of hyperfunctions that reduces to the ordinary product for real analytic functions and respects the product rule (Leibniz rule) of differential calculus, just as in the case of distributions. Nevertheless certain hyperfunctions may be multiplied; more on that later.

First let us note that the naive definition of the product of two hyperfunctions $F = [F^+, F^-]$, $G = [G^+, G^-]$ via

$$FG := [F^+ G^+, F^- G^-]$$

is obviously not well defined on equivalence classes.
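To make this concrete, here is a minimal counterexample, given as a sketch that uses the representative $\frac{i}{2\pi z}$ of $\delta$. The zero hyperfunction is represented both by $(0, 0)$ and by $(1, 1)$, since the constant $1$ is holomorphic on the whole complex neighbourhood. Applying the naive product to these two representatives gives different results:

```latex
[0, 0] \cdot \delta
  = \left[ 0 \cdot \tfrac{i}{2 \pi z}, \; 0 \cdot \tfrac{i}{2 \pi z} \right] = 0,
\qquad
[1, 1] \cdot \delta
  = \left[ 1 \cdot \tfrac{i}{2 \pi z}, \; 1 \cdot \tfrac{i}{2 \pi z} \right] = \delta \neq 0
```

So the result depends on the chosen representative of the first factor.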


In general we cannot speak of the value of a hyperfunction at a certain point, much like we cannot speak of the value of a distribution at a certain point. We say that a hyperfunction $F$ vanishes on an open subset $U'$ of $U$ if its restriction to $U'$ coincides with the zero hyperfunction. This is equivalent to stating that there is a representation $F = (F^+, F^-)$ such that the boundary values of $F^+$ and $F^-$ coincide on $U'$ and therefore define a real analytic function on $U'$.


The support of a hyperfunction $F$ is the complement of the largest open subset of $U$ on which $F$ vanishes in the sense described above.

For any compact $K$ we will denote by $\mathcal{B}_K$ the set of all hyperfunctions whose support is contained in $K$.

Integration for Compactly Supported Hyperfunctions

Since we defined hyperfunctions for open subsets only, we do not yet know anything about the space $\mathcal{B}_K$ of hyperfunctions with support in a compact subset $K$, but the following theorem tells us that there is no difference:


(characterization of compactly supported hyperfunctions) Let $\mathcal{B}_K$ be the set of all hyperfunctions with support contained in a compact set $K \subset \mathbb{R}$, and let $V$ be any complex neighbourhood of $K$. Then we have the following isomorphism:

$$\mathcal{B}_K \cong \frac{\mathcal{O}(V \setminus K)}{\mathcal{O}(V)}$$

Let $a, b \in \mathbb{R}$ with $a \leq b$ and suppose that $F$ is a hyperfunction that is real analytic at $a$ and $b$. Then we can choose a complex neighbourhood $V$ of $[a, b]$, paths $\tau^+$ in $V^+$ and $\tau^-$ in $V^-$ from $a$ to $b$, and a representation $F = (F^+, F^-)$ such that we can define the integral

$$\int_a^b F(x) dx := \int_{\tau^+} F^+(z) dz - \int_{\tau^-} F^-(z) dz$$

independently of all the arbitrary choices we made, thanks to the holomorphy of the functions on the right side (and Cauchy's integral theorem for holomorphic functions, of course).
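The definition can be tried out numerically. The sketch below is an illustration under stated assumptions, not a general implementation: the paths $\tau^\pm$ are taken to be the upper and lower semicircles over $[a, b]$, the integrals are approximated by a midpoint rule, and the names `arc_integral`, `hyper_integral` and `delta_rep` are hypothetical. It uses the representative $-\frac{1}{2\pi i z}$ of $\delta$ on both half-planes.

```python
import cmath
import math


def delta_rep(z):
    """Representative of delta: -1/(2*pi*i*z), used on both half-planes."""
    return -1.0 / (2j * math.pi * z)


def arc_integral(f, a, b, upper, n=4000):
    """Midpoint-rule integral of f along a semicircle from a to b.

    The arc is z(theta) = c + r*exp(i*theta) with c the midpoint of [a, b];
    theta runs from pi to 0 (upper arc) or from -pi to 0 (lower arc).
    """
    c, r = (a + b) / 2.0, (b - a) / 2.0
    start = math.pi if upper else -math.pi
    h = (0.0 - start) / n
    total = 0j
    for k in range(n):
        theta = start + (k + 0.5) * h
        w = r * cmath.exp(1j * theta)
        total += f(c + w) * (1j * w * h)  # f(z) * z'(theta) * dtheta
    return total


def hyper_integral(f_plus, f_minus, a, b):
    """Integral over tau^+ of F^+ minus integral over tau^- of F^-."""
    return arc_integral(f_plus, a, b, True) - arc_integral(f_minus, a, b, False)


print(hyper_integral(delta_rep, delta_rep, -1.0, 2.0))  # interval contains 0
print(hyper_integral(delta_rep, delta_rep, 1.0, 2.0))   # interval avoids 0
```

The first value is close to $1$ and the second close to $0$. Semicircles are only one convenient choice; by Cauchy's theorem any other admissible pair of paths would give the same result up to discretization error.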

Example of Integration for Hyperfunctions

Using the definition of an integral of a hyperfunction above, we can easily prove

$$\int_a^b F'(x) dx = F(b) - F(a)$$

for all hyperfunctions $F$ that are real analytic at $a$ and $b$. Therefore we just made rigorous sense of the formula

$$\int_a^b \delta(x) dx = H(b) - H(a)$$

for $a, b \neq 0$ and $H$ the Heaviside function.

Example of a Hyperfunction that is not a Distribution

Warning: As of this moment this paragraph consists of some handwaving.

A well known structure theorem in the theory of distributions says that every distribution whose support consists of one single point $P$ is in fact a finite linear combination of distributional derivatives of the $\delta$ function at $P$.

As an example of a hyperfunction that is not a distribution we therefore have $F = [\exp{\frac{1}{z}}]$ with $\operatorname{supp} F = \{0\}$.

Using the representation of the $\delta$ distribution, we see that

$$\delta^{(n)}(x) = [\frac{(-1)^{n+1} n!}{2 \pi i} \frac{1}{z^{n+1}}]$$

Using this representation, for any real analytic test function $a$ we see that the following identity holds, which motivates the use of $\delta$ as in distribution theory:

$$\int a(x) \delta^{(n)}(x) dx = (-1)^n a^{(n)}(0)$$
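This pairing can be checked numerically with Cauchy's formula for derivatives. The sketch below makes two assumptions that are not spelled out in the text: since the same function represents $\delta^{(n)}$ on both half-planes, the difference of the upper and lower path integrals is rewritten as minus a counterclockwise integral over a circle around the origin, and that integral is approximated by the trapezoidal rule. The function names are hypothetical. With the usual sign convention the result is $(-1)^n a^{(n)}(0)$.

```python
import cmath
import math


def delta_derivative_rep(n, z):
    """Representative of the n-th derivative of delta:
    (-1)**(n+1) * n! / (2*pi*i) / z**(n+1)."""
    return (-1) ** (n + 1) * math.factorial(n) / (2j * math.pi) / z ** (n + 1)


def pair_with_delta_derivative(a, n, radius=1.0, m=256):
    """Approximate the integral of a(x) * delta^(n)(x).

    Upper path minus lower path equals minus the counterclockwise
    circle integral, evaluated here with m equally spaced nodes.
    """
    total = 0j
    for k in range(m):
        z = radius * cmath.exp(2j * math.pi * k / m)
        dz = 1j * z * (2.0 * math.pi / m)
        total += a(z) * delta_derivative_rep(n, z) * dz
    return -total


# For a(x) = exp(x) every derivative at 0 equals 1,
# so the pairing alternates in sign with n.
for n in range(4):
    print(n, pair_with_delta_derivative(cmath.exp, n))
```

The test function must extend holomorphically to a neighbourhood of the chosen circle, which is why `cmath.exp` is used here rather than a merely smooth function.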

Comparing the Laurent series $\exp{\frac{1}{z}} = 1 + \sum_{n=0}^{\infty} \frac{1}{(n+1)!} \frac{1}{z^{n+1}}$ of $F$ with the representation of $\delta^{(n)}$, and dropping the constant term $1$, which is holomorphic and hence represents zero, we get

$$[\exp{\frac{1}{z}}] = 2 \pi i \sum_{n=0}^{\infty} \frac{(-1)^{n+1}}{n! \, (n+1)!} \delta^{(n)}(x)$$

Given the structure theorem about distributions cited above, we see that the hyperfunction $[\exp{\frac{1}{z}}]$ cannot be a distribution, since infinitely many coefficients are nonzero. There is an analogous structure theorem for hyperfunctions, however:


(structure of hyperfunctions supported at a single point) Let $F$ be a hyperfunction supported at the origin. Then $F$ has a representation as

$$F(x) = \sum_{n=0}^{\infty} b_n \delta^{(n)}(x)$$

where the coefficients satisfy

$$\lim_{n \to \infty} \sqrt[n]{|b_n| \, n!} = 0$$
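As an illustration of the coefficient condition, read as $\sqrt[n]{|b_n| \, n!} \to 0$, one can evaluate it for the example $[\exp{\frac{1}{z}}]$. The sketch below assumes the coefficients $b_n = 2 \pi i (-1)^{n+1} / (n! \, (n+1)!)$, obtained by matching the Laurent coefficients $\frac{1}{(n+1)!}$ of $\exp{\frac{1}{z}}$ against the representation of $\delta^{(n)}$; the function names are hypothetical. Logarithms are used to avoid overflow in the factorials.

```python
import math


def log_abs_b_times_factorial(n):
    """log(|b_n| * n!) for b_n = 2*pi*i*(-1)**(n+1) / (n! * (n+1)!).

    Here |b_n| * n! = 2*pi / (n+1)!, and lgamma(n + 2) = log((n+1)!).
    """
    return math.log(2.0 * math.pi) - math.lgamma(n + 2)


def nth_root(n):
    """(|b_n| * n!)**(1/n), computed via logarithms."""
    return math.exp(log_abs_b_times_factorial(n) / n)


# The printed values decrease towards 0 as n grows.
for n in (1, 5, 20, 100, 1000):
    print(n, nth_root(n))
```

A numerical trend is of course no proof, but it is consistent with $\exp{\frac{1}{z}}$ defining a hyperfunction supported at the origin.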

We include a sketch of the proof of the first statement to illustrate the ease of the use of hyperfunctions.


Let $F$ be a hyperfunction supported at the origin. By the characterization of compactly supported hyperfunctions with $K = \{0\}$, $F$ is represented by a single function that is holomorphic in a punctured complex neighbourhood of the origin. Therefore $F$ has a representation by a Laurent series in a neighbourhood of the origin:

$$F(x) = [\sum_{n=-\infty}^{\infty} a_n z^n] = [\sum_{n=-\infty}^{-1} a_n z^n]$$

In the representation of $F$ we may remove the holomorphic part of the Laurent series, since it represents zero. Now we may insert the representation of the derivatives of the $\delta$ hyperfunction, in the form $[z^{-(n+1)}] = \frac{2 \pi i (-1)^{n+1}}{n!} \delta^{(n)}(x)$, and get

$$F(x) = [\sum_{n=0}^{\infty} a_{-(n+1)} z^{-(n+1)}] = \sum_{n=0}^{\infty} b_n \delta^{(n)}(x), \qquad b_n = \frac{2 \pi i (-1)^{n+1}}{n!} a_{-(n+1)}$$

The last sum is understood to converge in the topology of uniform convergence on compact subsets on which the representatives of the summands are holomorphic.

Topology and Duality for Compactly Supported Hyperfunctions

We can introduce a canonical topology on the space $\mathcal{B}_K$ of hyperfunctions with support in a compact set $K$ and show that the topological dual is the space $\mathcal{A}_K$ of germs of real analytic functions on $K$, that is, the inductive limit of the spaces of holomorphic functions on a cofinal set of complex neighbourhoods of $K$, and vice versa.

The algebraic statement is known as Köthe’s (duality) theorem.

This also hints at the fact that we will not be able to introduce a topology on $\mathcal{B}(\mathbb{R})$ that resembles the situation in the theory of distributions, since its dual space would have to be something like “the space of compactly supported real analytic functions”, which contains only the zero function.

Main Theoretical Results

One striking example of the use of hyperfunctions in the theory of differential equations with real analytic coefficients is this result due to Sato:

Let $U \subseteq \mathbb{R}$ be open and let $P$ be a linear differential operator of finite order with real analytic coefficients defined on $U$:

$$P(x, \frac{d}{dx}) = \sum_{i=0}^n a_i(x) (\frac{d}{dx})^{i}$$

with $a_n \neq 0$.


(solvability of differential equations) For every $f \in \mathcal{B}(U)$ there is a solution $u \in \mathcal{B}(U)$ of the equation $P u = f$. Every such solution can be extended to an open set $V \supset U$ iff the coefficients $a_i$ and $f$ can be extended to $V$.

Briefly: $P$ is surjective on hyperfunctions.

Two related results are:


$P$ is a sheaf endomorphism, both of the sheaf of real analytic functions and of the sheaf of hyperfunctions.


$P$ does not enlarge the support of hyperfunctions.

Singularities and Microfunctions

Much information about a given hyperfunction is encoded in the kind of singularities that it has. A first step into the theory of singularities is the following definition:


The singular support $\operatorname{sing\,supp}(F)$ of a hyperfunction $F$ is the complement of the largest open set on which $F$ is real analytic.

This definition does not seem particularly useful, since it only pinpoints the singular locus of a hyperfunction, without explaining the differences between various singularities. One question of crucial importance to physics is, for example, “when can two distributions be multiplied?”

The singular support does not help much: the $\delta$ distribution cannot be squared, while the distribution $\frac{1}{x + i0} := \lim_{\epsilon \to 0^+} \frac{1}{x + i \epsilon}$ can. (By the Sokhotski-Plemelj formula, the latter equals the Cauchy principal value of $\frac{1}{x}$ minus $i \pi \delta$.) The singular support of both consists of the origin. But in a certain sense the singularity of the $\delta$ distribution is worse than that of $\frac{1}{x + i0}$. Going a step further requires the notion of wavefront sets in the smooth setting. Note that the definition of a wavefront set needs the concept of a cutoff function, that is, a smooth function with compact support, which cannot be used in the real analytic setting.


A gentle introduction with examples is the booklet

  • K. Yosida, Operational calculus. A theory of hyperfunctions, Applied Mathematical Sciences, 55. Springer-Verlag, New York, 1984. x+170 pp.

  • Mikio Sato, Theory of hyperfunctions I

  • M. Kashiwara, T. Kawai, T. Kimura, Foundation of algebraic analysis, Princeton Univ. Press (1986) (translated from the Japanese)

  • Mitsuo Morimoto, An introduction to Sato’s hyperfunctions

  • A. Eida, S. Pilipović, On the microlocal decomposition of some classes of hyperfunctions, Math. Proc. Camb. Phil. Soc. 125, 455-461 (1999)

It is possible to use hyperfunctions as an introduction to generalized functions for physicists and engineers with a minimal background in complex analysis, see

Revised on April 15, 2013 21:50:33 by Zoran Škoda