Definition
A multiset $\mathcal{X} = \langle X,\mu_X\rangle$ consists of a set $X$ and a function $\mu_X\colon U\to\mathbb{N}$, where $U$ is a universal set and $e\in X$ if and only if $\mu_X(e) > 0$.
We can add two multisets $\mathcal{X}$ and $\mathcal{Y}$ to get
\mathcal{X} + \mathcal{Y} = \langle X\cup Y,\mu_X+\mu_Y\rangle.
Note that we can write
k\mathcal{X} = \langle X,k\mu_X\rangle
for any positive integer $k$.
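As a minimal sketch of the addition above, we can model a multiset by Python's `collections.Counter`, reading the keys as the underlying set and the counts as the multiplicity function. The helper names `add` and `scale` are our own and purely illustrative.

```python
from collections import Counter


def add(x: Counter, y: Counter) -> Counter:
    """Multiset sum: multiplicities add pointwise, supports are united."""
    return x + y


def scale(k: int, x: Counter) -> Counter:
    """The k-fold sum of x with itself, for a positive integer k."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return Counter({e: k * m for e, m in x.items()})


# Two small multisets over the universe of one-letter strings (an assumption).
x = Counter({"a": 2, "b": 1})
y = Counter({"b": 3, "c": 1})

total = add(x, y)        # multiplicities: a -> 2, b -> 4, c -> 1
tripled = scale(3, x)    # multiplicities: a -> 6, b -> 3
```

Here the support of `total` is the union of the supports of `x` and `y`, and `scale(3, x)` agrees with `x + x + x`.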
We can also define an inner product of multisets via
\langle \mathcal{X},\mathcal{Y}\rangle = \sum_{e\in X\cup Y} \mu_X(e) \mu_Y(e).
Note that when $\mathcal{X}$ and $\mathcal{Y}$ are simply sets, $\mu_X$ and $\mu_Y$ are their characteristic functions and
\langle \mathcal{X},\mathcal{Y}\rangle = |X\cap Y|,
where $|\cdot|$ denotes the cardinality of a set.
Using this inner product, we can define the angle between multisets as
\cos\theta_{\mathcal{X},\mathcal{Y}} = \frac{\langle \mathcal{X},\mathcal{Y}\rangle}{\sqrt{\langle \mathcal{X},\mathcal{X}\rangle \langle \mathcal{Y},\mathcal{Y}\rangle}}.
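In particular, when $\mathcal{Y} = k\mathcal{X}$ for some positive integer $k$ (for instance $\mathcal{Y} = \mathcal{X}$) we have
\cos\theta_{{\mathcal{X}},{\mathcal{Y}}} = 1
and when $X\cap Y = \emptyset$ we have
\cos\theta_{{\mathcal{X}},{\mathcal{Y}}} = 0.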
When $\mathcal{X}$ and $\mathcal{Y}$ are simply sets, the angle between them is given by
\cos\theta_{{\mathcal{X}},{\mathcal{Y}}} = \frac{|X\cap Y|}{\sqrt{|X||Y|}}.
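The inner product and the angle can be sketched numerically, again assuming multisets are represented by `collections.Counter`. The function names `inner` and `cosine` are hypothetical, and the empty multiset is rejected because the cosine is undefined for it.

```python
import math
from collections import Counter


def inner(x: Counter, y: Counter) -> int:
    """Sum of products of multiplicities; only shared elements contribute."""
    return sum(m * y[e] for e, m in x.items())


def cosine(x: Counter, y: Counter) -> float:
    """Cosine of the angle between two non-empty multisets."""
    norm_sq = inner(x, x) * inner(y, y)
    if norm_sq == 0:
        raise ValueError("the angle is undefined for an empty multiset")
    return inner(x, y) / math.sqrt(norm_sq)


# Plain sets are multisets whose multiplicities are all 1.
set_x = Counter([1, 2, 3])
set_y = Counter([2, 3, 4])
overlap = inner(set_x, set_y)     # size of the intersection {2, 3}
angle_cos = cosine(set_x, set_y)  # |X n Y| / sqrt(|X| |Y|) = 2 / 3

# A multiset and a positive multiple of it point in the same direction.
x = Counter({"a": 2, "b": 1})
parallel = cosine(x, Counter({"a": 6, "b": 3}))

# Disjoint multisets are orthogonal.
orthogonal = cosine(x, Counter({"c": 5}))
```

For plain sets this reproduces the intersection formula, and the two boundary cases give cosines of 1 and 0 respectively.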
With this notion of addition, the collection of multisets in $U$ becomes the $\mathbb{N}$-module (that is, abelian monoid) $\mathbb{N}^U$; this inner product makes it an inner product space analogous to the Banach space $\ell^2(U)$.
Machine Learning
The inner product of multisets is closely related to the "bag of words" kernel in machine learning (see n-Cafe).
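To illustrate the connection, a document can be viewed as the multiset of its words, and the kernel value of two documents is then the inner product of those multisets. The sketch below assumes a naive tokenizer (lowercasing and splitting on whitespace). The names `bag_of_words` and `bow_kernel` are our own, not taken from any particular library.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Multiset of words in a text; whitespace tokenization is an assumption."""
    return Counter(text.lower().split())


def bow_kernel(a: str, b: str) -> int:
    """Inner product of the word multisets of two texts."""
    x, y = bag_of_words(a), bag_of_words(b)
    return sum(m * y[w] for w, m in x.items())


doc_a = "the cat sat on the mat"
doc_b = "the dog sat on the log"

# Shared words: "the" (2 * 2), "sat" (1 * 1), "on" (1 * 1).
similarity = bow_kernel(doc_a, doc_b)
```

Word order is discarded by this representation, so only the multiplicities of shared words affect the result.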