Occam’s razor | Graphical Linear Algebra

The discussion over the last few episodes has now led us to the point where we have two different languages to talk about the same thing. The two languages, however, are rather different. One consists of diagrams, the other of matrices populated with natural numbers. If you already knew about matrix algebra, maybe you’re asking yourself the question “What’s the point of all these diagrams then? They don’t tell me anything that I didn’t know before. Is this guy claiming that this diagrammatic language is better in some way?”

The short answer is no. I don’t want anyone to throw away their matrices: there are extremely good reasons to know and care about them. But I have two arguments for why we should also care about diagrams.

The first argument is that a different language means thinking about the conceptual landscape in a new way. We will see this over and over again in graphical linear algebra, and sometimes it will cause us to reevaluate certain dogmas: ways of doing things that have come about because of the way that the standard notation works, rather than real underlying mathematical issues. These kind of things are not easy to spot if you only know one notation.

More generally, the idea that the languages we use influence how we think and act—linguistic relativity if you will—is well-known and appreciated by programmers. Most programming languages are Turing complete and so have equivalent power. So why not just learn one, say x86 assembly, and stick with it?

Programmers deal with real problems, which usually do not involve writing functions to compute the Fibonacci sequence or, infamously, inverting a binary tree. Real problems are multi-faceted, and for some parts it may be useful to put on a functional hat, for others an object-oriented hat, and so on. The different ways of thinking that the various programming methodologies bring to the table are appreciated because

complex problems benefit from a combination of approaches.

When I was a student in the 90s that there were many entertaining flame-wars about which programming language approach was the best. As with most flame-wars, they were pretty pointless: programming has since grown up and nowadays it’s very rare to hear somebody claiming that a particular way of doing things is intrinsically better than another. Good programmers are comfortable thinking in several different languages, and choose the one that is most suitable for the current (sub)problem at hand. Flexibility is key.

Because of this, in the last 10 years or so, the boundaries between different language families have been blurring. For example, most functional programming languages have clever ways to do imperative things, and conversely, more and more functional features are making their way into industrial strength imperative languages. It’s pointless to argue which is the best approach, or the best language: such a thing doesn’t and will never exist. And anyway, it’s fun to see how your favourite problems can be solved with some newfangled language!

It’s been really interesting to see how different communities have reacted to this blog. About a month ago somebody posted a link to this blog on Hacker News. That link made it to the front page and resulted in a massive amount of interest, which really surprised me… who knew that so many people cared about linear algebra! Last week I posted a link there myself and again the feedback has been really great. I’ve also been regularly posting links on Reddit’s /r/math, which has also resulted in a lot of really interesting and useful comments, but also quite a bit of skepticism and maybe even a touch of hostility.

Allow me to philosophise for a moment for why this is the case. I think that it’s because many mathematicians are just not used to the idea of having different languages to talk about their concepts. In many mathematical subfields the notation is very conservative and standardised. It has evolved so slowly, so imperceptibly, that mathematical culture almost brings with an an understanding that “standard notation” has been given to us by some higher power and it is not to be disturbed. Anyway, the concepts are the important things so who cares about notation. Many mathematicians simply don’t see the value in exploring different ways of expressing themselves. Programmers, it seems, are much more receptive to the idea.

The second argument for why we should care about diagrams is Occam’s razor — the idea that one should trim the fat: discard unnecessary assumptions to try to get at the core of problems. So let’s consider the amount of intellectual technology necessary to develop the concept of

matrices of natural numbers, and their algebra ①

I need you to forget for one moment that you know all of these things already and that they are totally obvious to you; let’s pretend, for one moment, that we are explaining Earth mathematics to a group of interested Martians. And let’s not be too creative about the Martians. Let’s assume that they are of the Star Trek variety: three dimensional and humanoid enough to interest Dr McCoy in extra-curricular activities, but without the experience of an Earth education. So, to make them understand what ① means, we need to:

Explain the concept of natural numbers, together with addition and multiplication
Show that the natural numbers form what is called a commutative semiring. Let’s go through these requirements:
1. addition is associative, commutative, and has 0 as it’s identity; that is, for any natural number n, we have 0 + n = n = n + 0
2. multiplication is associative, commutative and has 1 as it’s identity; that is, for any natural number n, we have 1· n = n = n · 1
3. multiplication distributes over addition, that is, for all natural numbers a,b and c, we have a·(b + c) = a·b + a·c (in a non-commutative semiring we would also require (b+c)·a = b·a + c·a, but that follows here from commutativity of multiplication)
Define the concept of an m×n matrix, a grid of numbers. If we are totally honest then we ought to probably also go through the corner cases where any combination of m and n can be 0.
Write down the formula for matrix multiplication. This relies on already knowing about addition and multiplication of natural numbers. The semiring properties are necessary for matrix multiplication to behave as expected (e.g. for it to be associative).
Write down the formula for matrix addition.

This is actually a significant amount of work. And maybe the Martians are a bit curious about where exactly did the formula for multiplying matrices came from. They know from previous Earth experience that you can’t really trust those pesky humans and their crazy ideas.

On the other hand, we have already done all of the work of showing how diagrams work in the first few episodes of this blog. If explaining to the Martians, we would need to:

Explain the idea of magic Lego; the algebra of diagrams that consists of the two operations of direct sum ⊕ and composition ;
Go through the rules of diagrammatic reasoning: stretching, tightening, moving generators along wires
Introduce the four generators together with the ten equations.

Let’s compare the two ways. In the traditional approach, one first defines the natural numbers and then “bootstraps” matrices on top. With diagrams, there is no precedence given to natural numbers as such; everything gets defined at the same time. Natural numbers can be recovered as special kinds of diagrams, those with one dangling wire on each end. We don’t need to invent any extra formulas for multiplying matrices because composition is a basic operation of the algebra of diagrams. Matrix addition is not a primitive operation at all, it can be constructed from the diagrammatic primitives.

Having said all that, this blog so far has followed a rather conventional route: first we talked of natural numbers as diagrams and then of matrices as diagrams. Today, we are going to go back and think about how the maths would have been developed without knowing about natural numbers and matrices from the outset. We will rediscover diagrams by focussing on the operations and constructions that are natural in their language and see where that takes us.

To spice things up let’s try some alternate history. This is something that historians really hate; my sister, who is a historian, always gets annoyed when I bring it up. But who doesn’t love to annoy their siblings?

We need to turn the clock back to before matrices were common, before notation settled into what appears in textbooks today. Our story thus takes place in early 13th century Pisa. Our hero is a certain young man, Fibonchi, a contemporary and friend of Fibonacci.

While a teenager, Fibonchi sailed to the Balearic Islands with a Pisan trade fleet. In Palma, he met a wise old Berber merchant who introduced him to four curious operations that spoke of putting things together and copying them, governed by collection of simple, self-evident rules. The old men of Palma enjoyed drawing the diagrams and even used them to perform some calculations.

On his way back home, Fibonchi realised that the diagrams are what is now known as a recursive data type: every diagram is put together from the four simple bricks, the identity and twist. Using this insight, he was able to prove a number of interesting things about them, using the principle of induction.

To help in his calculations Fibonchi wrote down the formula for diagrams that copy and add not just one wire, but an arbitrary number on wires. Coincidentally, this was the syntactic sugar we discussed a couple of episodes back.

Next, Fibonchi proved that these sugars satisfied a very pleasing property with respect to arbitrary diagrams. For example, given any diagram D with m dangling wires on the left and n dangling wires on the right, the following is true:

Of course, once Fibonchi had finished this proof, he immediately wrote down his second theorem.

He did not even bother writing down a proof, since he was told about about the principle of “bizarro” by the old man in Palma. Because all the equations in the system have bizarro (reflected, colour-swapped) versions, once you proved a theorem you’ve also proved its bizarro cousin.

Fibonchi also noticed the following two related facts, the second the bizarro version of the first.

These properties intrigued Fibonchi, and he went ahead and defined an operation on compatible diagrams—those that agreed in their respective numbers of dangling wires on the left and right. He called the operation sum. This operation would be rediscovered in different notation a few centuries later and given the name matrix addition.

He proved that sum was associative

and commutative:

Moreover, each compatible family of diagrams contained an identity for this operation:

Fibonchi showed that D + 0_m,n = D = 0_m,n + D.

He also noticed that composition distributes over sum: we have

D ; (E+F) = D ; E + D ; F

The proof, reproduced below, was a straightforward application of his first theorem.

He was, of course, immediately able to state the bizarro version:

(D+E) ; F = (D ; F) + (E ; F)

Fibonchi knew that composition of diagrams was not commutative, in most examples it didn’t even make sense to compose diagrams in more than one way because the dangling wires did not match up. Non-commutativity was the norm. However, he did notice some peculiar things about diagrams with exactly one dangling wire on each end.

He noted that there was exactly one such diagram for every natural number, which gives the number of paths from the left dangling point to the right one. He understood that if two such diagrams have an equal number of paths then they are equal. His arguments were close to the rewriting argument of the last episode. Another strange thing was that the bizarro of every such diagram was equal to the diagram you started with.

Most curiously of all, composition between such diagrams was commutative, which Fibonchi confirmed with a simple induction. First the base case, where the first diagram has zero paths:

and the inductive step:

Inspired by Fibonacci’s masterpiece Liber Abaci, Fibonchi wrote his own treatise entitled Liber Diagrammatis. Unfortunately it is now lost; the last remaining copy fell into the hands of a Livornese fishmonger sometime in late 16th century, who used it to wrap his wares. The Livornese never much liked the Pisans or what they had to say.

Fortunately, we can continue Fibonchi’s story on this blog.

Continue reading with Episode 18 – Introducing the Antipode.

Graphical Linear Algebra

Menu

Tag Archives: Occam’s razor

17. Maths with Diagrams