Eric Forgy cardinality

Warning: This page is a collection of notes trying to understand some of Tom Leinster‘s work on the cardinality of a category.

From page 2 of Tom’s paper:

A version of the definition can be given very succinctly. Let AA be a finite category; totally order its objects as a 1,,a na_1,\dots ,a_n. Let ZZ be the matrix whose (i,j)(i,j)-entry is the number of arrows from a ia_i to a ja_j. Let M=Z 1M = Z^{-1}, assuming that this inverse exists. Then χ(A)\chi(A) is the sum of the entries of MM.

This paragraph (and I don’t think I’m the first to say this, but can’t find the post on the nCafe right now) makes me think of MM as a Gram matrix

M i,j=a i,a j A.M_{i,j} = \langle a_i,a_j\rangle_A.

Then defining

𝟙= a iA 0a i\mathbb{1} = \sum_{a_i\in A_0} a_i

we’d have

χ(A)=𝟙,𝟙 A.\chi(A) = \langle \mathbb{1},\mathbb{1}\rangle_A.

However, we haven’t specified whether we can even add objects of AA together, so what should we do next?

If we could overcome this barrier, then we could write down a set of dual elements

a i *= a jA 0Z i,ja ja_i^* = \sum_{a_j\in A_0} Z_{i,j} a_j

having the property that

a i *,a j A=δ i,j\langle a_i^*,a_j\rangle_A = \delta_{i,j}

and

a i *,a j * A=Z j,i.\langle a_i^*,a_j^*\rangle_A = Z_{j,i}.

It is interesting that the indices switched. There must be an op{}^{op} somewhere.

I think we also want to define the product of a ia_i via

a ia j=δ i,ja i.a_i a_j = \delta_{i,j} a_i.

Light Bulb

Note that if we consider a ia_i to be sets (as a subset of multisets) and if all a ia_i are disjoint then we do have

a ia j=δ i,ja i.a_i a_j = \delta_{i,j} a_i.

For a multiset XX spanned by a ia_i, then we have

𝟙X=( a iA 0a i)( a jA 0c ja j)=X\mathbb{1} X = \left(\sum_{a_i\in A_0} a_i\right)\left(\sum_{a_j\in A_0} c_j a_j\right) = X

so that

𝟙= a iA 0a i\mathbb{1} = \sum_{a_i\in A_0} a_i

really is the unit multiset on the space of multisets spanned by a ia_i.

This makes me think of directed space and the question on the bottom of the page:

Could this be related to Leinster measure?


The following material is separate from that above.

𝒳×𝒴=X×Y,μ X×μ Y,\mathcal{X}\times\mathcal{Y} = \langle X\times Y, \mu_X\times\mu_Y\rangle,

where

μ X×μ Y(x,y)=μ X(x)μ Y(y).\mu_X\times\mu_Y(x,y) = \mu_X(x) \mu_Y(y).

The following material is separate from that above.

I think I might be trying to reinvent multiset theory.

Got it!

See inner product of multisets

Projection

Given a multiset 𝒳\mathcal{X} and a reference multiset 𝒳 0\mathcal{X}_0, we’d like to decompose 𝒳\mathcal{X} into orthogonal components

𝒳=w 0𝒳 0+.\mathcal{X} = w_0 \mathcal{X}_0 + \mathcal{E}.

The coefficient w 0w_0 is given by

w 0=𝒳,𝒳 0𝒳 0,𝒳 0.w_0 = \frac{\langle\mathcal{X},\mathcal{X}_0\rangle}{\langle\mathcal{X}_0,\mathcal{X}_0\rangle}.

Attempt 5

The cardinality of a multiset satisfies

|X+Y|=|XY c|+|YX c|+2|XY|=|X|+|Y|.|X+Y| = |X\cap Y^c| + |Y\cap X^c| + 2|X\cap Y| = |X| + |Y|.

It is tempting to try to interpret the first equality as a law of cosines. To do so, I suggest that the norm should not be the cardinality, but rather the square root of the cardinality, i.e. using horrible notation, I’m suggesting

X=|X|\|X\| = \sqrt{|X|}

so that

X+Y 2=XY c 2+YX c 2+2XY 2.\|X+Y\|^2 = \|X\cap Y^c\|^2 + \|Y\cap X^c\|^2 + 2\|X\cap Y\|^2.

Motivated by this, we can define an inner product

X,Y=12(X+Y 2XY c 2YX c 2)\langle X,Y\rangle = \frac{1}{2}\left(\|X+Y\|^2 - \|X\cap Y^c\|^2 - \|Y\cap X^c\|^2\right)

which is of the desired form but reduces to

X,Y=XY 2\langle X,Y\rangle = \|X\cap Y\|^2

or

X,Y=|XY|.\langle X,Y\rangle = |X\cap Y|.

Attempt 4

X,Y=|XY| 2.\langle X,Y\rangle = |X\cap Y|^2.

Sanity check:

X,X=|X| 2.\langle X,X\rangle = |X|^2.

If XX and YY are disjoint:

X,Y=0.\langle X,Y\rangle = 0.

Cosine:

cosθ=|XY| 2|X||Y|.\cos\theta = \frac{|X\cap Y|^2}{|X||Y|}.

L 2L^2 Norm

In the discussion below Toby made some very helpful comments and I continued rambling some more, but I think a picture is starting to emerge.

Attempt 3

X,Y=12(|X+Y| 2|X| 2|Y| 2)\langle X,Y\rangle = \frac{1}{2} \left(|X+Y|^2 - |X|^2 - |Y|^2\right)

Sanity check:

X,X=12(|2X| 2|X| 2|X| 2)=|X| 2\langle X,X\rangle = \frac{1}{2} \left(|2X|^2 - |X|^2 - |X|^2\right) = |X|^2

In order to add two sets XX and YY, we can note that

X=X/Y+XYX = X/Y + X\cap Y

and

Y=Y/X+XYY = Y/X + X\cap Y

so that

X+Y=X/Y+Y/X+2XYX+Y = X/Y + Y/X + 2X\cap Y

and

|X+Y|=|X/Y|+|Y/X|+2|XY|.|X+Y| = |X/Y| + |Y/X| + 2|X\cap Y|.

More generally,

|αX+βY|=α|X/Y|+β|Y/X|+(α+β)|XY|.|\alpha X+\beta Y| = \alpha |X/Y| + \beta |Y/X| + (\alpha+\beta) |X\cap Y|.

Sanity check:

|αX+βX|=α|X/X|+β|X/X|+(α+β)|XX|=(α+β)|X|.|\alpha X+\beta X| = \alpha |X/X| + \beta |X/X| + (\alpha+\beta) |X\cap X| = (\alpha+\beta)|X|.

Scalar multiplication distributes over intersections, i.e.,

α|XY|=|αXαY|.\alpha |X\cap Y| = |\alpha X\cap \alpha Y|.

How about associativity?

|(αX+βY)+γZ|=|(αX+βY)/Z|+γ|Z/(αX+βY)|+(1+γ)|(αX+βY)Z|.|(\alpha X+\beta Y)+\gamma Z| = |(\alpha X+\beta Y)/Z| + \gamma |Z/(\alpha X+\beta Y)| + (1+\gamma) |(\alpha X+\beta Y)\cap Z|.

Attempt 2

X,Y=12(|X| 2+|Y| 2|XY| 2)\langle X,Y\rangle = \frac{1}{2} \left(|X|^2 + |Y|^2 - |X-Y|^2\right)

Sanity check:

X,X=12(|X| 2+|X| 2|XX| 2)=|X| 2\langle X,X\rangle = \frac{1}{2} \left(|X|^2 + |X|^2 - |X- X|^2\right) = |X|^2

Attempt 1

Here are some thoughts on cardinality.

Consider a vector space freely generated by finite sets and define the cardinality to be

|αX|=α|X|,|\alpha X| = \alpha |X|,

where α\alpha is a scalar and |X||X| is the cardinality of XX.

Now, motivated by

|XY|=|X|+|Y||XY||X\cup Y| = |X| + |Y| - |X\cap Y|

we would like to write down a reasonable definition of |αX+βY||\alpha X+\beta Y|.

How about

|αX+βY|=α|X|+β|Y|(α+β)|XY||\alpha X+\beta Y| = \alpha|X| + \beta|Y| - (\alpha+\beta) |X\cap Y|

which implies

|αXβY|=(α+β)|XY|.|\alpha X\cap \beta Y| = (\alpha+\beta)|X\cap Y|.

Does that make sense? This means, in particular, that

|X+Y|=|X|+|Y|2|XY|.|X+Y| = |X| + |Y| - 2|X\cap Y|.

I think we want to use this definition of “+” for finite sets.

Sanity check

|X+X|=|X|+|X|2|X|=0.|X+X| = |X| + |X| - 2|X| = 0.

Argh!

|X+(X)|=|X|+(1)|X|(11)|X|=0.|X+(-X)| = |X| + (-1)|X| - (1-1)|X| = 0.

Nice.

Is “+” associative?

|(X+Y)+Z|=|X+Y|+|Z|2|(X+Y)Z|.|(X+Y)+Z| = |X+Y| + |Z| - 2|(X+Y)\cap Z|.

Ouch! How can we define (X+Y)Z(X+Y)\cap Z? Maybe

(X+Y)Z=(XZ)+(YZ).(X+Y)\cap Z = (X\cap Z)+(Y\cap Z).

Therefore,

|(X+Y)+Z| =|X+Y|+|Z|2|XZ+YZ| =|X|+|Y|+|Z|2(|YZ|+|XZ|+|XY|)+4|XYZ|\begin{aligned} |(X+Y)+Z| &= |X+Y| + |Z| - 2|X\cap Z+Y\cap Z| \\ &= |X| + |Y| + |Z| - 2\left(|Y\cap Z| + |X\cap Z| + |X\cap Y|\right) + 4 |X\cap Y\cap Z| \end{aligned}

This is clearly symmetric in XX,YY, and ZZ so the sum is associative.

More generally, we have

|(αX+βY)+γZ| =|αX+βY|+γ|Z|2(α+γ)|XZ+(β+γ)|YZ| =α|X|+β|Y|+γ|Z| 2[(β+γ)|YZ|+(α+γ)|XZ|+(α+β)|XY|] +4(α+β+γ)|XYZ|\begin{aligned} |(\alpha X+\beta Y)+\gamma Z| &= |\alpha X+\beta Y| + \gamma|Z| - 2(\alpha+\gamma) |X\cap Z+(\beta+\gamma) |Y\cap Z| \\ &= \alpha |X| + \beta |Y| + \gamma |Z| \\ &\quad - 2\left[(\beta+\gamma) |Y\cap Z| + (\alpha+\gamma) |X\cap Z| + (\alpha+\beta) |X\cap Y|\right] \\ &\quad + 4 (\alpha+\beta+\gamma) |X\cap Y\cap Z| \end{aligned}

Let’s also check

|k(αX+βY)| =|kαX+kβY| =kα|X|+kβ|Y|2(kα+kβ)|XY| =k|αX+βY|.\begin{aligned} |k(\alpha X + \beta Y)| &= |k\alpha X + k\beta Y| \\ &= k\alpha |X| + k\beta |Y| - 2(k\alpha+k\beta) |X\cap Y|\\ &= k |\alpha X + \beta Y|. \end{aligned}

So far so good!

Armed with this definition of “+” for finite sets, we want to define an inner product

X,Y12(|X+Y| 2|X| 2|Y| 2).\langle X,Y\rangle \coloneqq \frac{1}{2}\left(|X+Y|^2 - |X|^2 - |Y|^2\right).

Discussion

Doesn't this already define ++? How is anything below a definition of ++, when ++ is already the addition operation in this free vector space? It seems to me that the only thing that you might define is to extend |||\cdot| and \cap from individual sets to linear combinations of sets. (And I guess that these sets are all finite subset of some ambient universe, so that \cap makes sense for them.) —Toby

Eric: I dunno. Is it obvious how to define |||\cdot|? If so, please help :)

I guess I was trying to find a “+” that is compatible with “|||\cdot|” so that

|X+X|=|2X|=2|X||X+X| = |2X| = 2|X|

and I wanted to define |X+Y||X+Y| in such a way that we can define a nontrivial inner product.

The “problem” (if there is one) is that

|X+Y|=|X|+|Y||X+Y| = |X|+|Y|

leads to

X,Y=|X||Y|\langle X,Y\rangle = |X||Y|

so that

cosθ=1.\cos\theta = 1.

I was hoping to find some inner product giving rise to an “angle”

cosθ=X,Y|X||Y|.\cos\theta = \frac{\langle X,Y\rangle}{|X||Y|}.

Toby: Yeah, it's obvious how to define |||\cdot|, like this: |X+Y|=|X|+|Y||X + Y| = |X| + |Y|. (^_^)

Seriously, my point is only that when you talk about defining ++, you've forgotten that you just defined it. I can see that you want a more interesting definition of |||\cdot|.

Perhaps we should start on the angle. If X=YX = Y, then we should have cosθ=1\cos \theta = 1, but perhaps cosθ=0\cos \theta = 0 should occur when XX and YY are disjoint; that's a way in which they can be ‘orthogonal’. (Sanity check: if XX = YY and also XX and YY are disjoint, then there is no consistent θ\theta; but this can happen only when XX and YY are empty, in which case there should be no consistent θ\theta. So that's working out.) So I'm thinking that cosθ\cos \theta should be proportional to |XY||X \cap Y| or |XY| 2|X \cap Y|^2. To normalise the latter properly, it should be

cosθ=|XY| 2|X||Y|. \cos \theta = \frac{|X \cap Y|^2}{|X| |Y|} .

This suggests that X,Y=|XY| 2\langle{X,Y}\rangle = |X \cap Y|^2.

Then |X+Y| 2=|X| 2+2X,Y+|Y| 2|X + Y|^2 = |X|^2 + 2\langle{X,Y}\rangle + |Y|^2 gives us

|X+Y|=|X| 2+2|XY| 2+|Y| 2. |X + Y| = \sqrt{|X|^2 + 2|X \cap Y|^2 + |Y|^2} .

Compared to your earlier

|X+Y|=|X|+2|XY|+|Y|, |X + Y| = |X| + 2|X \cap Y| + |Y| ,

I think that the problems were that you were using an L 1L^1 norm when you really need an L 2L^2 norm to get a good inner product.

How's that sound?

Eric: Thanks Toby! I’m happy to have you put some thought into this. As you could tell, I was flailing around a bit. I’ll try to absorb what you said and come back with questions/comments.

Eric: Ok! I see. I am not sure. My L 1L^1 norm was not put in by hand. In fact, I didn’t even recognize it as an L 1L^1 norm. I was simply carrying out a calculation motivated by your first comment (above).

I was thinking that if we want to add two finite sets XX and YY, then if they are disjoint we should have

|X+Y|=|X|+|Y|.|X+Y| = |X| + |Y|.

Issues only arise when XX and YY intersect. So what I did then is decompose XX and YY into three disjoint pieces.

X=XY c+XYX = X\cap Y^c + X\cap Y

and

Y=YX c+YX.Y = Y\cap X^c + Y\cap X.

There I was using “+” without being 100% clear on what “+” is, but when sets are disjoint, I think it is safe to say “+” is just union.

So then I simply added XX and YY as three disjoint pieces resulting in

X+Y=(XY c+XY)+(YX c+YX)=XY c+YX c+2XY.X+Y = (X\cap Y^c + X\cap Y) + (Y\cap X^c + Y\cap X) = X\cap Y^c + Y\cap X^c + 2X\cap Y.

Then, the norm of disjoint sets is just the sum of their norms, we have

|X+Y|=|XY c|+|YX c|+2|XY|.|X+Y| = |X\cap Y^c| + |Y\cap X^c| + 2|X\cap Y|.

The fact that this is an L 1L^1 norm is a bonus and wasn’t put in by hand.

BZZZT! I see! I made a mistake. The norm of the sum of disjoint sets should not be the sum of the norms unless I assume an L 1L^1 norm, right? For an L 2L^2 norm, it should be

|X+Y| 2=|XY c| 2+|YX c| 2+4|XY| 2.|X+Y|^2 = |X\cap Y^c|^2 + |Y\cap X^c|^2 + 4|X\cap Y|^2.

Hmmm! We’re making progress :)

Eric: Oh! Wait wait! I think we do want an L 1L^1 norm here. Let’s partition a set XX into two disjoint sets X 1X_1 and X 2X_2 so that

X=X 1+X 2.X = X_1 + X_2.

Since X 1X_1 and X 2X_2 are disjoint we have

|X 1+X 2|=|X 1|+|X 2|,|X_1+X_2| = |X_1| + |X_2|,

which is what we want for cardinality. This, to me, says that cardinality of finite sets is an L 1L^1 norm on the vector space freely generated by finite sets.

Eric: Ok ok! I’m excited. With the L 1L^1 norm, which is the natural norm for finite sets and cardinality, we can define

X,Y=12(|X+Y||X||Y|)\langle X,Y\rangle = \frac{1}{2} \left(|X+Y| - |X| - |Y|\right)

which due to profound beauty comes out to be

X,Y=|XY|\langle X,Y\rangle = |X\cap Y|

as you thought it should be. Then we have derived

cosθ=X,Y|X||Y|=|XY||X||Y|.\cos\theta = \frac{\langle X,Y\rangle}{|X||Y|} = \frac{|X\cap Y|}{|X||Y|}.

However, we derived this rather than postulating it. Very neat!

Eric: Hah! I can only imagine what it must be like trying to read this :) I’ll come back and try to write something readable once the smoke settles. For now, I am flipping back from L 1L^1 to L 2L^2 (as you suggested!). It was a valiant guess, but stuff in my previous comment does not work out as I hoped.

Eric: I did some more doodling last night. I think that when you go with the L 2L^2 norm, you end up with

|X+Y|=|X|+|Y|.|X+Y| = |X| + |Y|.

I didn’t like that because a natural way to define inner product is

X,Y=12(|X+Y| 2|X| 2|Y| 2).\langle X,Y\rangle = \frac{1}{2} \left(|X+Y|^2 - |X|^2 - |Y|^2\right).

With this, combined with |X+Y|=|X|+|Y||X+Y| = |X| + |Y| leads to

cosθ=1.\cos\theta = 1.

I couldn’t see how to get a nontrivial angle out of this. That is why I was trying to come up with something where |X+Y||X|+|Y||X+Y| \ne |X| + |Y|. The best I could come up with was

|X+Y|=|XY c|+|YX c|+2|XY|.|X+Y| = |X\cap Y^c| + |Y\cap X^c| + 2|X\cap Y|.

Only after you pointed it out, did I realize that this was an L 1L^1 norm. HOWEVER, if you take this L 1L^1 norm and squint your eyes, it almost looks like a law of cosines. By normalizing the last term, we come up with (something like)

cosθ=|XY| 2|X||Y|,\cos\theta = \frac{|X\cap Y|^2}{|X||Y|},

which is what you had suggested.

The point is, I don’t think we can postulate the form of cosθ\cos\theta. It should come from our definition of inner product.

Ok ok. Maybe that is what you were trying to tell me all along. We want to define our inner product as

X,Y=|XY| 2.\langle X,Y\rangle = |X\cap Y|^2.

Sorry. It will soak in eventually :)

Eric: Another desirable property of an inner product would be that it is bilinear so that

αX,βY=αβX,Y.\langle \alpha X,\beta Y\rangle = \alpha\beta\langle X,Y\rangle.

I’m not sure that is the case with

X,Y=|XY| 2.\langle X,Y\rangle = |X\cap Y|^2.

Toby: Isn't it?

αX,βY=|(αX)(βY)| 2=|αβ(XY)| 2=(αβ|XY|) 2=α 2β 2|XY| 2. \langle{\alpha X, \beta Y}\rangle = |(\alpha X) \cap (\beta Y)|^2 = |\alpha \beta (X \cap Y)|^2 = (\alpha \beta |X \cap Y|)^2 = \alpha^2 \beta^2 |X \cap Y|^2 .

You're right, it isn't! Somewhow I thought that it was …

X+Y,Z=|(X+Y)Z| 2=|(XZ)+(YZ)| 2=|(XZ)| 2+2|(XZ)(YZ)| 2+|YZ| 2=X,Z+2|XYZ| 2+Y,Z. \langle{X + Y, Z}\rangle = |(X + Y) \cap Z|^2 = |(X \cap Z) + (Y \cap Z)|^2 = |(X \cap Z)|^2 + 2|(X \cap Z) \cap (Y \cap Z)|^2 + |Y \cap Z|^2 = \langle{X,Z} + 2|X \cap Y \cap Z|^2 + \langle{Y,Z} .

Heck, I thought that I'd checked this stuff, but I guess not!

Fortunately your version with multisets works out fine.

Revised on October 20, 2009 at 22:48:10 by Eric Forgy