Eric Forgy Colimit as a Linear Regression of Multisets

Warning: This page is highly speculative and the title could very well be complete nonsense. Feedback is more than welcome.

Given a set XX and a finite collection of sets X iX_i, we are interested in finding “the best possible” multiset expression

X= i=1 Nw iX i+X = \sum_{i=1}^N w_i X_i + \mathcal{E}

such that the inner product satisfies

X i *,=0,\langle X^*_i,\mathcal{E}\rangle = 0,

where X i *X^*_i is a dual multiset satisfying

X i *,X j=δ i,j\langle X^*_i,X_j\rangle = \delta_{i,j}

so that

X i *,X=w i.\langle X^*_i,X\rangle = w_i.

As a step toward this goal, define a matrix whose elements are given by

Z i,j=X i,X j=|X iX j|.Z_{i,j} = \langle X_i,X_j\rangle = |X_i\cap X_j|.

Case 1: det(Z)0det(Z)\ne 0

If the inverse matrix exists, we can define

X i *= j=1 NZ i,j 1X j.X^*_i = \sum_{j=1}^N Z^{-1}_{i,j} X_j.

Note that although the elements Z i,jZ_{i,j} are integers greater than or equal to zero, the elements are Z i,j 1Z^{-1}_{i,j} are rational numbers that can be negative. Therefore, although multisets form a rig, dual multisets form a \mathbb{Q}-algebra.

The weights are then given explicitly by

w i= jZ i,j 1X j,X= jZ i,j 1|X jX|.w_i = \sum_j Z^{-1}_{i,j} \langle X_j,X\rangle = \sum_j Z^{-1}_{i,j} |X_j\cap X|.

Case 2: det(Z)0det(Z)\ne 0 and X iX j=X_i\cap X_j = \emptyset

When the sets X iX_i are disjoint, ZZ is a diagonal matrix whose entries are given by

Z i,i=|X i|.Z_{i,i} = |X_i|.

The inverse matrix is obviously also diagonal with entries given by

Z i,i 1=1|X i|.Z^{-1}_{i,i} = \frac{1}{|X_i|}.

In this case, weights reduce to

w i=|X iX||X i|w_i = \frac{|X_i\cap X|}{|X_i|}

and we have

X= i=1 N|X iX||X i|X i+.X = \sum_{i=1}^N \frac{|X_i\cap X|}{|X_i|} X_i + \mathcal{E}.

References

Created on October 28, 2009 at 06:20:17 by Eric Forgy