29. Dividing by zero to invert matrices

I’ve let the blog slide. At the end of this episode I’ll write a bit about what I’ve been doing in the last few months. For now, let’s just jump straight back into graphical linear algebra.

In order to motivate myself to get back to writing, I brought forward an episode that I’ve been excited about since late last year, when I first did the research. This material is closely linked to Episode 26 on division by zero. If you haven’t read it, or you have forgotten, it will be useful to take a look before continuing.


 

Episode 26 was quite popular — maybe the clickbait title helped. It did leave some people a bit skeptical: ok, so this guy has derived addition and multiplication tables that seem to say that dividing by zero makes sense. But can this weird idea of dividing by zero actually be useful, or is it merely a curiosity?

There are other number systems that, in principle, “allow” you to divide by zero. Many of them work similarly to the IEEE standard for floating point arithmetic, which has an additional entity called NaN (not a number). Roughly speaking, it’s the result of any calculation that doesn’t make sense, like dividing by zero. Moreover, if you add or multiply a NaN with anything else, you still have a NaN. So it’s kind of a black hole number: once one turns up partway through your calculation, you can be pretty sure that something has gone wrong and that the overall result will be a NaN — no matter what you do, you will never end up with an ordinary number again.
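
To see the black-hole behaviour concretely, here is a tiny Python illustration of my own (not part of the original post), using ordinary IEEE 754 floats:

```python
import math

# inf - inf has no sensible answer, so IEEE 754 makes it NaN
x = float("inf") - float("inf")
print(x)                         # nan

# once a NaN appears, every further operation returns NaN
print(x + 1, x * 0, x / x)       # nan nan nan
print(math.isnan(x * 0 + 42))    # True: no ordinary number ever comes back
```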

This is not the case for the number system from Episode 26. In fact, the three additional elements ∞, ⊥ and ⊤ do show up in calculations. The cool thing is that they are not a sinkhole like NaN: you don’t need to panic if you get, say, an ∞ in one of your calculations. It’s possible that things will finish well. This episode is entirely devoted to one example: inverting matrices.


 

We will need a few definitions.

An n×m matrix A is said to be invertible when there exists an m×n matrix B such that composing in either direction gives the identity. In diagrams:

invdef.gif

If such a B exists, we sometimes write A⁻¹ = B. Many matrices are not invertible. One property of those that are is that m=n, that is, invertible matrices are all square; the number of rows is equal to the number of columns.
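
In conventional matrix notation (my transcription of the diagrams), the definition says that there is an m×n matrix B with AB = I_n and BA = I_m.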

Actually, it’s like this in most PROPs – you can usually only invert things where the left and the right boundaries have the same size. A useful intuition to keep in mind is that something can only be invertible if it can be undone. And since wires carry information, we can’t just throw one away or add a new one and expect to be able to undo it.

Back to definitions. We will say that an n×m matrix A is injective, or mono, when the following is true.

mono.gif

The backwards-facing A is none other than the mirror image symmetry that we discussed in Episode 27.

The above definition probably seems a bit strange. In most ordinary linear algebra textbooks, the defining property of an injective n×m matrix is that it takes different m-vectors to different n-vectors: that is, it is injective as a function. This is actually an equivalent formulation, but we don’t think about matrices as functions on this blog.
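
In symbols: A is injective as a function when Ax = Ay implies x = y for all m-vectors x and y. By linearity this is the same as requiring that Ax = 0 implies x = 0, which foreshadows the Kernel and Image Lemma below.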

One interesting property of mono matrices is that they are left-cancellable: that is, given two equal-sized matrices B, C for which AB and AC are defined, if AB=AC then B=C. In fact, this is true even when B and C are arbitrary linear relations. Diagrammatically:

injleft.gif

implies that

injright.gif

Here B and C are linear relations, i.e. arbitrary diagrams in IH. This of course includes the case where B and C are matrices.

Actually, injectivity and left-cancellability are equivalent, i.e. assuming one, you can prove the other. I’ll leave the proofs as an exercise, but I’ll give you a hint: it’s handy to use the fact, stated in Episode 27, that for any linear relation A, its mirror image is a weak inverse of A.
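
For reference, writing A° for the mirror image of A (my notation; the blog draws a reversed A), the weak inverse property from Episode 27 reads: A ; A° ; A = A.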

The bizarro of injective is surjective. We will also use the word epi. Thus, an n×m matrix is surjective when the following is true:

epi.gif

Just as mono matrices are left-cancellable, epi matrices are right-cancellable: given B and C such that BA and CA make sense, if BA=CA then B=C. In terms of ordinary linear algebra, surjective matrices are those that are surjective as functions: for every n-vector b, there exists an m-vector a such that Aa=b.

There is one final observation that we will need later: any invertible matrix is both injective and surjective. To keep things moving along, I’ll also leave the proof of this as an exercise.


 

Let’s now prove a classic result of linear algebra. It refers to kernels and images, which are linear subspaces that we discussed in the last episode.

Kernel and Image Lemma. Suppose that A is an n×m matrix. Then:

  1. A is injective exactly when it has a null kernel;
  2. A is surjective exactly when it has a full image.

Proof. Properties 1 and 2 are mutually bizarro, so it’s enough to prove just one. We will do 1.

Suppose that A is injective. Then:

kernellem1

This means that A has a null kernel.

Suppose now that A has a null kernel; we want to show that it must be injective.

kernellem2

That’s it, we’re done, QED.
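
For a quick numeric sanity check of the lemma, away from the diagrams, here is a small sympy sketch of my own; nullspace() and rank() are standard sympy methods:

```python
from sympy import Matrix

A = Matrix([[1, 0], [1, 1], [0, 1]])   # a 3x2 matrix

# null kernel <=> injective: the nullspace has an empty basis
print(A.nullspace() == [])   # True: A is mono

# full image <=> surjective: the columns would have to span all of Q^3
print(A.rank() == A.rows)    # False: rank 2 < 3, so A is not epi
```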

 


 

The following lemma is very useful because it gives us some graphical intuition about invertible matrices.

Invertible Matrix Lemma

Suppose that A is a square matrix of size m. Then the following are equivalent:

  1. A has an inverse B.
  2. There exists a matrix B such that:

inverse

Proof. Let’s first show that 1 implies 2. Assume, then, that A has inverse B. Since A is invertible, we know that A is injective. So we have the following:

lem1

This gives us 2. In the first step, we used the assumption that B is the inverse of A, and in the second step we used the fact that A is injective.

We still need to show that 2 implies 1. So, supposing that A is equal to some matrix B heading in the opposite direction, we need to show that B is the inverse of A. To do that, it’s useful to show that A must be injective. Here we can use our Kernel and Image Lemma.

lem2

So, by the Kernel and Image Lemma, A is injective. We can now use this to show that A followed by B is the identity.

lem3

To show that B followed by A is also the identity, we can do the bizarro thing, first showing that B is surjective. QED.


 

Our aim is to obtain a formula for the inverse of a matrix, if it exists. The starting point is an arbitrary 2×2 matrix, so something like this:

2x2.gif

If an inverse exists, then we know it’s also going to be a 2×2 matrix:
2x2prime.gif

The tricky part is, how can we calculate a’, b’, c’ and d’, knowing a, b, c and d? If you remember your linear algebra, maybe you’re thinking determinants and adjugate matrices. We will bypass all that and draw some pictures instead.


 

Let’s start by using our Invertible Matrix Lemma. As diagrams in IH, we have:

2x2eq.gif

Now adding a matrix to -1 times itself gives us the zero matrix. Using this trivial observation, we get the following rather complicated looking equation:

step1.gif

We can extract the formula for each of a’, b’, c’ and d’ by plugging up some of the dangling wires in the above equation. Let’s work through the procedure for a’.

The first step is to plug a zero and a discard into the right places on both sides, like so:

pattern

Doing that, we get the first equality below.

step2.gif

The second equality follows from the bone equation of bimonoids.

So far so good: but now we can drastically simplify the left hand side. Essentially, we’ve plugged a white unit into a black comultiplication, and vice versa; usually when that happens, there is mass destruction. I’ll leave the graphical derivation to you. After you’ve killed off most of the diagram, the left hand side ends up looking like this:

step3

The nice thing is that now a’ is the only entry from the inverse matrix that’s left. And we already know that the diagram above is equal to zero, i.e.

rhs

So we are in the situation where a’ plus -1 times some mess is equal to 0; thus:

step4

This is starting to look like a formula. We can untangle the still somewhat messy right hand side above. The result is:

step5.gif

In terms of a bona fide, old-fashioned syntactic formula—if you’re into that kind of thing—this translates to:

fomulaaprime

Note that I did not go further in simplifying the denominator because d might be zero, and in that case multiplication may not be commutative. Relax: to evaluate the formula for specific numbers, we can use our addition and multiplication tables from Episode 26.

There is nothing special about a’. We adapt our procedure by plugging up the other combinations of wires and obtain formulas for b’, c’ and d’. All together:

fullformula
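
In textual form (my transcription; the diagrams above are the authoritative statement), all four formulas share the same shape:

a’ = (a − b · d⁻¹ · c)⁻¹
b’ = (c − d · b⁻¹ · a)⁻¹
c’ = (b − a · c⁻¹ · d)⁻¹
d’ = (d − c · a⁻¹ · b)⁻¹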

I find it quite cute how a, b, c and d permute in each of the formulas. And no determinants or adjugates in sight!


 

Let’s do a worked example, calculating the inverse of the matrix:

example

Referring to the formula for the inverse, we have d=0, so we will divide by zero when calculating a’. For that, we can refer to the multiplication and addition tables from Episode 26. Here’s the calculation:

calc

The other calculations don’t involve any funny stuff — we end up with

res.gif
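
If you want to play with this on a computer, here is a minimal Python sketch of my own. It implements just enough of the extended arithmetic to redo a calculation like the one above: the ∞ cases are reconstructed from the relational reading of Episode 26 (a scalar is a linear subspace of Q²), and the example matrix with entries (1, 1, 1, 0) is a hypothetical choice of mine with d = 0.

```python
from fractions import Fraction

INF = "oo"   # 1/0, one of the three extra elements from Episode 26
# (bot and top also exist, but this example never produces them)

def add(x, y):
    # q + oo = oo and oo + oo = oo; ordinary fractions add as usual
    return INF if INF in (x, y) else x + y

def mul(x, y):
    # q * oo = oo for q != 0; the 0 * oo cases give top/bot depending
    # on the order, which this sketch doesn't need, so it refuses them
    if INF in (x, y):
        if x == 0 or y == 0:
            raise NotImplementedError("0 * oo lands outside this sketch")
        return INF
    return x * y

def recip(x):
    # reciprocal, i.e. the mirror image of a scalar: 1/0 = oo, 1/oo = 0
    if x == INF:
        return Fraction(0)
    return INF if x == 0 else 1 / x

def inverse2x2(a, b, c, d):
    # the four formulas above: no determinants, no adjugates
    sub = lambda x, y: add(x, mul(Fraction(-1), y))
    entry = lambda p, q, r, s: recip(sub(p, mul(q, mul(recip(s), r))))
    return (entry(a, b, c, d),   # a'
            entry(c, d, a, b),   # b'
            entry(b, a, d, c),   # c'
            entry(d, c, b, a))   # d'

# d = 0, so computing a' divides by zero along the way
print(inverse2x2(*map(Fraction, (1, 1, 1, 0))))   # a' = 0, b' = 1, c' = 1, d' = -1
```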

I promised you an example where the extended rational algebra of Episode 26 can be useful. And while inverting 2×2 matrices is not going to set the world alight, I hope that this is a hint that there is something interesting going on!

By the way, the formula that we derived for inverting 2×2 matrices generalises to arbitrary square matrices. I don’t have space for this in this episode, but if there’s interest I can come back to the topic. Let me know.


 

Not writing is like not going to the gym (which, coincidentally, I’m also guilty of). I mean that once you start not writing, it’s difficult to get back into it. “I’ll do some writing next weekend” – you tell yourself. Before you know it, half a year has passed. Same with (not) going to the gym. I’m still working on that.

Having said that, it’s been a very busy half a year. In addition to teaching, I visited Paris—and more specifically, Jean Krivine—for a month. I love Paris; I spent a six-month postdoc there in 2005 and it was fantastic to be back in the 13e arrondissement.

paris

Apart from enjoying the city, it was a really productive visit and I learned a lot. Jean is currently working on a graphical high-level programming language for biologists, which shares many features with the kind of string diagrams that this blog is obsessed with.

Amongst other things, Jean told me about a family of string-diagrammatic calculi for logic, developed by the famous 19th century American logician C.S. Peirce (pronounced “purse”; it’s Dutch, apparently). Jean uses some of Peirce’s ideas for his programming language. Peirce’s systems are called existential graphs and have a fascinating history. While Peirce considered them his most important scientific achievement, they remained unpublished in his lifetime, I guess due to a lack of typesetting technology for diagrams, but maybe also because his ideas were so revolutionary that they were about 100 years ahead of their time. In fact, Peirce’s diagrams came before the standard syntactic way of writing logic, with all the upside down As and backwards Es that logicians are so fond of. But his work was forgotten, and existential graphs were rediscovered only in the 1960s. They are still a topic of research.

Existential graphs really blew me away. I plan to understand them properly over the next few months, and maybe I will write about them more seriously one day. For now, if you’re interested in reading further, the PhD thesis of Frithjof Dau seems like a great resource.

This weekend I will be in Porto for the HDRA workshop, and the weekend after that I’m flying to New York for the LICS conference. It’s going to be a busy summer, but I will try to work more regularly on the blog from now on.

Continue reading with Episode 30 – The essence of graphical linear algebra.

26 thoughts on “29. Dividing by zero to invert matrices”

  1. Great to see you’re back! I don’t know why I never thought to describe monos and epis graphically, especially considering one of my results is (in terms of the language of this blog) X is true when a particular diagram in B is a mono, and Y is true when a related diagram in B is an epi. It would be cute to state that result purely diagrammatically: X is true when a particular diagram attached to its mirror image (which takes us into the world of IH, of course) is an identity diagram, mutatis mutandis for Y. It’s fun to see how the mirror and bizarro symmetries interact.
    As one already converted to dividebyzerophilia, I enjoyed seeing the application to matrix inversion. When I first learned about √-1 I tried to figure out a number system that involved 1/0, but I lacked the insight and/or tools needed to make it work. So I am naturally pretty excited about this stuff, even after modding out by the addiction to playing with string diagrams. (:
    By the way, minor typo just before the 2×2 matrices: “to do show”.


  2. You never say it explicitly, but if a matrix A is both injective and surjective (which you state must be true for every invertible matrix) isn’t the inverse of A just the mirror image of A? Or am I forgetting some rules related to the “orientation” of the matrix blocks?


    • I think that you’re asking two questions:

      1. is every injective and surjective matrix invertible?
      2. is the inverse of a matrix its mirror image?

      The answer to both questions is yes. For 2, this is just the Invertible Matrix Lemma in this episode. I avoided 1 in this episode, but I will come back to it soon.


  3. I don’t suppose you could do something similar for some variant of Jordan Normal Form (without forcing the off-diagonal 1s to be on the main-diagonal-but-one, perhaps)?


  4. I just binge-read this entire blog. Great work! It’s incredibly interesting and well-presented. I mean both earnestly, not as faint praise 😉

    One question jumps to mind right now. For orthonormal matrices, $U^{T} = U^{-1}$. In this diagrammatic way of thinking, it translates to the fact that taking the dual (which I’m kind of assuming is the technical term for ‘bizarro’ transformations) and reversing the graph yield the same graph afterward. Is there any graph-centric way to think of this particular symmetry?

    Thanks for all your work writing this blog!


    • Thanks a lot 🙂

      Yes, the transpose of A is its inverse if and only if “mirror image” A is equal in the diagrammatic theory to “bizarro” A. It’s a nice point, so I will try to fit an example in a future episode!


  5. On the division by zero, please look at the papers:

    Matrices and Division by Zero z/0 = 0. Advances in Linear Algebra & Matrix Theory, 6 (2016), 51–58.
    http://www.scirp.org/journal/alamt
    http://dx.doi.org/10.4236/alamt.2016.62007
    http://www.ijapm.org/show-63-504-1.html


    DOI:10.12732/ijam.v27i2.9.

    Albert Einstein:
    “Black holes are where God divided by zero. I don’t believe in mathematics.”
    George Gamow (1904–1968), Russian-born American nuclear physicist and cosmologist, remarked that “it is well known to students of high school algebra” that division by zero is not valid; and Einstein admitted it as the biggest blunder of his life:
    Gamow, G., My World Line (Viking, New York), p. 44, 1970.


    • It seems that by turning division from a partial function in two variables into a function in two variables, this z/0=0 approach sacrifices multiplication, making it a multifunction in two variables instead of a function in two variables, where 0*0 takes any value. I hesitate to judge the merits and demerits of this sacrifice – it seems like a plausible mathematical universe – but I will say that the purported proofs of z/0=0 seem less than airtight. This is nothing against z/0=0 by itself; I would be highly sceptical of a proof of the parallel postulate based on Euclid’s other postulates.


  6. in the proof of equivalence between injectivity and null kernel, are the equations in parentheses proven somewhere in the blog? it’s easy to prove them using the underlying linear relations, but i wanted to find a purely pictorial proof…


  7. For the second, first add -< A | – | A > – on the lower branch, take -<A|- past the addition, and cancel out. If it's not clear, email me — I'll send you a picture. The third is the assumption of null kernel, no?


  8. haha oh my, it happened again, i don’t know, it’s probably something with those symbols. anyways, i think the first objection is clear, the second one is just that to be able to cancel out the A’s, i need injectivity, which is the thing we’re trying to prove.


      • BTW – the “going around the black cup” in the third step of my proof can be proven by induction on the structure of A. It’s not too difficult – prove that it works for all the generators, the interesting case is addition. Then, for the inductive step show that it’s closed under ⊗ and composition, both are trivial.


      • oh, right, that seems to work! i’m a Brazilian master’s student working on my dissertation. it’s an introduction to applied category theory, and i’m writing a little about graphical linear algebra, basing it on spivak’s 7 sketches book and your work. thanks a lot 🙂


      • Thanks for this, I was also stuck here. I’m still working through the “around the black cup” (which seems straightforward and obvious), but the more important question is: did this parenthetical seem obvious, despite this rather unusual proof? Because when I try to write it down as matrix equations, it doesn’t seem obvious at all!


      • I see that in the new axiomatization[0], this lemma is taken as an _axiom_ called “Wrong Way”! That explains my previous concern about how this could be obvious: Pawel probably confers it axiomatic status in his mind!

        And it is really simple to read out, essentially saying that an operator distributes over minus, i.e. Ax − Ay = A(x − y). This also explains why the antipode showed up when it doesn’t seem a priori like it has anything to do with antipodes. It was a transformation of Ax − Ay to Ax + A(−y) to A(x + (−y)) to A(x − y).

        But leaning on equations to get the intuition for the proof is maybe a bit of a crutch, only an artefact of mathematical history. One might instead want to get a native taste of the diagrams without “translating”. It seems both more powerful and more vivid.

        [0] Joao De Paixao and Pawel Sobocinski, “Calculational Proofs in Relational Graphical Linear Algebra”, In 23rd Brazilian Symposium on Formal Methods (SBMF 2020), 2020.


  9. Does having non-commutative multiplication for the infinity symbols help in some calculations? It seems in this example it did not matter.

