nLab inner product of multisets

Contents

Definition
Machine Learning
Related concepts

Definition

A multiset $\mathcal{X} = \langle X,\mu_X\rangle$ consists of a set $X$ and a function $\mu_X:U\to\mathbb{N}$ , where $U$ is a universal set and $\mu_X(e) \gt 0$ if and only if $e\in X$ .

We can add two multisets $\mathcal{X} = \langle X,\mu_X\rangle$ and $\mathcal{Y} = \langle Y,\mu_Y\rangle$ to get

\mathcal{X} + \mathcal{Y} = \langle X\cup Y,\mu_X+\mu_Y\rangle.

Note that we can write

k\mathcal{X} = \langle X,k\mu_X\rangle

for $k\in\mathbb{N}$ .

We can also define an inner product of multisets via

\langle \mathcal{X},\mathcal{Y}\rangle = \sum_{e\in\mathcal{X}\cup \mathcal{Y}} \mu_X(e) \mu_Y(e).

Note that when $\mathcal{X}$ and $\mathcal{Y}$ are simply sets, $\mu_X$ and $\mu_Y$ are the characteristic functions and

\langle \mathcal{X},\mathcal{Y}\rangle = |X\cap Y|,

where $|\cdot|$ denotes the cardinality of the set.

Using this inner product, we can define the angle between multisets as

\cos\theta_{\mathcal{X},\mathcal{Y}} = \frac{\langle \mathcal{X},\mathcal{Y}\rangle}{\sqrt{\langle \mathcal{X},\mathcal{X}\rangle \langle \mathcal{Y},\mathcal{Y}\rangle}}.

In particular, when $\mathcal{X}=\mathcal{Y}$ we have

cos\theta_{{\mathcal{X}},{\mathcal{Y}}} = 1

and when $X\cap Y = \emptyset$ we have

cos\theta_{{\mathcal{X}},{\mathcal{Y}}} = 0.

When $\mathcal{X}$ and $\mathcal{Y}$ are simply sets, the angle between them is given by

cos\theta_{{\mathcal{X}},{\mathcal{Y}}} = \frac{|X\cap Y|}{\sqrt{|X||Y|}}.

With this notion of addition, the collection of multisets in $U$ becomes the $\mathbb{N}$ -module (that is abelian monoid) $\mathbb{N}^U$ ; this inner product makes it an inner product space analogous to the Banach space $\mathbb{R}^n$ .

Machine Learning

The inner product of multisets is closely related to the “bag of words” kernel in machine learning (see n-Cafe).

Related concepts

multiset

Last revised on April 2, 2021 at 02:19:50. See the history of this page for a list of all contributions to it.