We have established that there is a homomorphism called θ from B, the PROP of diagrams, to Mat, the PROP of matrices. To do this, it was enough to say where θ takes the four stars of the story so far: the add, zero, copy and discard generators.
The task for this and the next episode is to understand why θ is an isomorphism—a perfect translation—and to show that a PROP homomorphism is an isomorphism, it is enough to show that it is full and faithful.
The intuition for why the translation works well was established all the way back in Episode 10. The idea is that a diagram with n dangling wires on the left and m on the right translates to a matrix with n columns and m rows. In that matrix, the entry in the ith row and the jth column of the matrix is the number of paths from the jth dangling point on the left to the ith dangling point on the right in the diagram. It’s not difficult to verify that this is exactly how θ works: for example we have the following:
Now, to show that θ is full, we just need to convince ourselves that every matrix has a diagram that translates to it.
In fact, this may already seem to be pretty straightforward, given what we know about paths. If I give you some matrix then—looking at the example above for guidance—you should be able to go away and draw a diagram that does the job. Still, intuition doesn’t always cut the mustard, so it’s worthwhile to spend some time doing this properly. That’s our task for today. The proof of fullness also provides us with new syntactic sugars that make the diagrammatic language even nicer to use.
In Episode 9 we saw how diagrams with one wire dangling on the left and one on the right can be identified with the natural numbers. By now we know a whole lot more about diagrams. Today we will see that the diagram for a natural number n translates via θ to the 1×1 matrix (n). The proof that θ is full is the continuation of this story, since it amounts to building up a diagram for any matrix of any size.
Before we launch into the proof, we need to introduce a new family of syntactic sugars, one for each natural number m. Their names are m-copy and m-addition, and, just like copy and add, one is the bizarro version of the other. We therefore only need to concentrate on thinking about one, since the story for the other is simply the bizarro version.
Our old friend, the copy generator, has a single dangling wire on the left, and our standard intuition says that it works by copying whatever value appears on that wire to its two wires on the right. But what if there are values arriving on two or more wires, and we want to copy all of them? For example, in the case of two wires we would need to construct a circuit that has two dangling wires on the left and four on the right that behaves as follows, for any numbers x and y:
We can wire up an such a diagram using two copy constructors and a twist: we copy the x and the y, obtaining two xs and two ys, then we swap an x and y. Like this:
It seems that it should be fairly straightforward to construct such a gadget for three wires, four wires, and so on, using only copy, identity and twist. This is the idea of m-copy, where m is a positive integer, and we will draw it like this:
The white m in the middle of the black circle refers to the number of wires being copied. The black ms on the wires are a sugar that we have already seen: they mean m stacked identity wires, or m-wires for short.
The picture of the sugar is a bit busy with all of those ms. We want to keep our diagrams looking sharp, so when we use m-copies we will typically be a bit lax with writing all of these numbers. Just keep in mind that whenever you see one, it has an m-wire dangling on the left, and two m-wires dangling on the right. One more thing: if you want to draw this sugar on paper and you don’t have access to white pens, feel free to write the number above the circle.
Let’s go ahead and define it, using recursion. The base case, m = 1, is just the copy generator.
Next, the recursive case is:
So, to copy k + 1 wires, we can copy the first k wires recursively, copy the last wire, and then we have to make sure that the dangling wires line up properly on the right, using σk,1. We have already seen the case m = 2, so here’s what the construction gives us for m = 3:If, unlike me, you have a good memory, you probably remember a different sugar for copy, which we introduced in Episode 10. There we chained several copies one after the other, obtaining a copy sugar that has one dangling wire on the left and several on the right. Both the sugars are useful, and it’s possible to use them together, but it is important not confuse them.
It’s interesting to think about the matrices that correspond to m-copy. So let’s take a look at the first two:
The pattern is becoming apparent: in general, it’s not difficult to show that for an m-copy, the corresponding matrix is formed by stacking two identity matrices of size m, as illustrated below.
Although we don’t strictly need this for our fullness proof, in the future it will be useful to know that m-copy satisfies the same kind of equations as ordinary (1-)copy. In particular, there is the m-version of (CoComm):
where the role of the twist is played by σm,m. The proof is a pretty straightforward induction, and we will skip it. Similarly, we can prove an m-version of (CoAssoc)
and an m-version of (CoUnit).
The left hand side of the last equation features a new, but very simple, sugar: the m-discard: this is simply m discard generators stacked on top of each other. For example, this last equation, when instantiated at m = 2, looks like this when de-sugared:
The bizarro version of the m-copy sugar will also come in handy for us. Just as addition is the bizarro copy, m-addition is the bizarro m-copy. Being bizarro, we do not need to define it from scratch since we can take the recursive definition of m-copy, reflect it and invert the colours. But let’s just understand what it means in terms of our intuition; for example, the behaviour of 2-addition is as follows.
Following our bizarro tradition, we will draw m-addition like this:
and, just as expected, m-addition satisfies the m-versions of the adding equations.
The last equation features m-zero, which similarly to the m-discard, is simply m zero generators direct summed together.
Finally, let’s think about the matrices that correspond to m-addition. We already know that bizarro in the matrix world means transpose matrix, so it’s not surprising that we have:
We are ready to prove that θ is full, that is, given any m×n matrix A, there exists a diagram D such that θD = A. Let’s first consider the corner cases: m = 0, n = 0 or both m = n = 0.
When m = n = 0, there is precisely one matrix with 0 columns and 0 rows, the empty matrix (): 0 → 0. Coincidently, there is a diagram with 0 dangling wires on both ends, the empty diagram. Both the matrix and the diagram are actually id0 in their corresponding PROPs, so θ by definition takes the empty diagram to the empty matrix.
When n = 0, m ≠ 0, the diagram that does the job is the m-zero. Similarly, when n ≠ 0, m = 0, the diagram that does the job is the n-discard.
The interesting part of the proof concerns m×n matrices where m,n≥1. Our proof is divided into three parts, and each part is an induction. In Part 1 we will handle 1×1 matrices, next the column vectors (m×1 matrices) in Part 2, and finally in Part 3 matrices of arbitrary size (m×n). Each part uses the results of the previous parts.
Let’s get started by showing that fullness holds for every 1×1 matrix. What we want to prove is that, for any natural number a, we have:
where the diagram in the left hand side is the our diagrammatic sugar for natural numbers, which was the topic of Episode 9.
We can prove it by induction on a, the base case is:
where, by definition of composition in the PROP Mat of matrices, we multiply the unique 1×0 matrix with the unique 0x1 matrix. The result is a 1×1 matrix, and following the formula for matrix multiplication, it is the empty sum. And the number you get after adding up zero things is zero. So far so good.
Now for the inductive step: let’s assume that the claim holds for a = k, and show it for a = k + 1.
In the first step we use the recursive definition of the natural number sugar, and in the next two steps we use the definition of θ. We use the the inductive hypothesis in the fourth step. Next we just perform the calculation in the PROP of matrices. That completes the job for 1×1 matrices.
Using what we already know, we will show that every column vector, that is, every m×1 matrix has a diagram that maps to it via θ. Again we do it using induction, but this time on the height of the vector, m. The base case is 1×1 matrices, which we have established in Part 1.
So, let’s do the inductive step: supposing that fullness holds for k×1 matrices, we want to show that it holds for (k + 1)×1 matrices. The key observation here is that every such matrix looks like this:
where v is a column vector of height k and a is some natural number. Using the inductive hypothesis, we know that there exists some diagram (that we may as well call v) with one dangling wire on the left and k on the right, such that:
Using this inductive hypothesis and our result from Part 1, we can build a diagram for the larger column vector:
and this completes the job. So we now know how to construct diagrams for column vectors of arbitrary size.
This final part concentrates on arbitrary m×n matrices. The induction, this time, is on n. The base case is m×1 matrices, and we found out how to deal with these in Part 2.
Now, assuming that we can construct diagrams for m×k matrices, we need to show that we can construct them for m × (k + 1) matrices. The observation that we can make here is that any such matrix looks like this:
where A is some m×k matrix and v is a column vector tacked on at the end. The inductive hypothesis tells us that we have a diagram that translates to A:
and we can use this to construct a diagram for the larger matrix. To do this, we make use of the m-addition sugar that we cooked up before and the fact, shown in Part 2, that we know how to get a diagram for v.
That completes the proof: θ is full, as advertised.
Back in Episode 9 we introduced a syntactic sugar for natural numbers as certain diagrams. The proof of fullness shows us how to extend this to a sugar for any m×n matrix.
This means, that given a m×n matrix A, we have a diagram
We can reinterpret the induction proof, working back from Part 3 to Part 2, to obtain a recursive definition for this sugar, extending the natural number sugar of Episode 9:
This sugar will be quite useful for us as we do more graphical linear algebra, since it will allow us to consider any matrix of natural numbers as a diagram!
Continue reading with Episode 16 – Trust the Homomorphism, for it is Fully Faithful
16 thoughts on “15. Matrices, diagrammatically”
Shouldn’t theta of m-addition (in the middle of the post) be equal to (I_m, … , I_m) – m copies of I_m?
Oh wait, never mind – that would translate to adding up m copies instead of just 2 🙂