# Why string diagrams?

Sometimes I put the URL of this blog into a search engine to see what people are saying. When I first started – having heard about the fabled Internet Hate Machine – I was a little worried that I’d have to deal with tons snarky remarks and negativity. The opposite has happened. I’ve received several touching personal “thank you” emails, and a large majority – maybe even 90% – of social media feedback has been positive and extremely polite.  So, unexpectedly, it’s been a bit of an Internet Love Fest. It’s all a bit disconcerting: hey, I enjoy arguments as much as the next guy!

Of the non-positive 10% of remarks, 95% have been quite useful and constructive. They allowed me to look outside my own conceptual bubble and understand the parts of blog that people have found mysterious, unclear or even annoying. In particular, there is one recurring sentiment.

The guy has some interesting insights, but what’s that weird diagrammatic notation all about?

I realised that I spent a lot of time talking about “the rules of the game” for dealing with string diagrams, using them in proofs and explaining some technical facts about what they mean mathematically, but I never really explained the fundamental questions of why people should care about them. This made me re-evaluate the way I present this material: at the recent Midlands Graduate School I spent half of my four lectures explaining the story of string diagrams and why computer scientists (and engineers, physicists, mathematicians, economists, philosophers,…) ought to care about them.

I think that this is an important missing chapter of the story so far. It’s an interesting tale in itself, involving natural language, Universal Algebra, programming languages and higher category theory. I will make the case that string diagrams are not only important, they’re inevitable. And it’s due to something called resource-sensitivity.

## What is resource-sensitivity?

It’s a cliché that the digital revolution has changed the way we understand the world. One, somewhat under appreciated aspect is the way that the concept of data has changed the way human beings perceive reality.  Sequences of 0s and 1s encode all kinds of information and have spoiled us: millenia of experience of coping with finiteness (food, money, time, etc.) has been augmented with an immensely powerful resource that comes with an unprecedented ability to be copied and thrown away at will. Media companies have been struggling with this disturbing new reality since the days of Napster. Dealing with data is also a conceptual leap that programmers have to make when first learning to program; they learn to navigate this unfamiliar and exotic world, governed by a resource like few other in human experience.

Let’s start with a simple example of resource-insensitivity. The following is a very simple function, in C syntax, that takes an integer x, multiplies it by itself, and returns the result.

int square (int x) {
return x * x;
}

The same identifier x appears twice in the body of the function (and another time, at the start, where it’s declared as the parameter). That means that the resource that the function deals with, in this case an integer, is implicitly assumed to be copyable. This is fine: an integer is represented by a sequence of 0s and 1s in memory. It’s not a problem to use it several times, copy it somewhere else or even overwrite it.

But not every resource is as permissive as data. One of the first string diagrams on this blog was the recipe for Crema di Mascarpone. The wires were typed with ingredients and there were several generators; for example a crack-egg generator that took an egg and produced a white and a yolk.

Actually, it was subsequently criticised by a colleague on my Facebook for not being truly honest, because it implicitly threw away the shell. But here’s something less controversial: I think everyone would agree that doesn’t make sense to include an egg-copy generator.

So, while copying numbers is ok, copying eggs is not. It’s an amusing idea precisely because we are intuitively aware of the concept of resource-sensitivity. Actually, once you start thinking about it, there are examples everywhere. For example consider the following, grammatically correct sentence in a natural language:

Have your cake and eat it too.

This aphorism is used in situations where you can’t have something both ways. But the literal meaning is referring to a physical object: you can’t have a physical cake, eat it and still have it, since the action of eating it destroys it. So, perhaps, a hypothetical programming lanuage for cake handling should not allow you to write the following routine:

cake handle_cake (cake c) {
eat c;
return c;
}

Here executing the command eat c; consumes the resource c, so there is nothing to return.  It seems that type cake is quite unlike the type int of integers.  We will call the integer kind of data classical: classical data can be copied and discarded, whereas non-classical data (like eggs and cakes) cannot.

The closest thing to classical data that we had before computers is referencing things or concepts. In other words, information, which can be copied (e.g. by monks) and discarded (e.g. in a bonfire). Natural languages like English evolved to express information, and so they have several classical features built-in. In particular, the way that sentences of natural language are constructed are an example of something called classical syntax. Classical syntax is a good fit for dealing with classical data like information, but as we will see, also has some annoying limitations.

## What is classical syntax?

Trees are a popular way of organising classical data, and are closely related to the notion of classical syntax.

Unlike traditional trees, like the Hawaiian one in the photo above, abstract trees are usually drawn with the root at the top. For example, the following is the tree representation of the expression 2 × 3 + 4.

The usual convention that × is evaluated before + is captured by the fact that × appears deeper in the tree than +. So actually, trees are better than formulas because they eliminate the need for parentheses. For example, the tree for 2 × (3 + 4) is the following:

Expressions in natural language such as “you can’t have your cake and eat it too” can also be represented as trees. A lot of this is due to work by Noam Chomsky on the theory of context free grammars (CFGs), although some people date work on CFGs and their relationship with natural language back to ancient India. So “have your cake” looks something like this, as a syntax tree:

(Don’t trust me, I never actually studied English grammar. When I went to high school in Australia back in the 90s, teaching formal grammar was unfashionable and considered to “stifle creativity”. What a load of…)

Anyway, as you can see from the tree above, natural language grammars are complicated things. CFGs are also used to define the syntax of programming languages; fortunately, these are actually much simpler to describe than natural language grammars, which are full of exceptions and special cases. So here’s how the cake handling program from before might look like as a tree:

This is roughly the kind of tree that’s constructed by the compiler, or, more precisely, by one of its components, the parser. Here I’m being somewhat reductive: in the programming language context such trees have special features to deal with variables and identifiers, and these kinds of extended trees are referred to as abstract syntax.

The cake c is referred to twice in the body of the function and once in the declaration.  But since cake is a non-classical, resource-sensitive type, then perhaps the compiler ought to ensure that the variable appears only once in the function body? So maybe our cake programming language should reject this program as a syntax error?

This doesn’t really work very well. For example, some actions might preserve the resource. For example, following ought to be ok: there’s no problem with looking at your cake and having it too.

cake handle_cake (cake c) {
look_at c;
return c;
}


One popular way to go about solving this problem is to build a little type theory for cakes and catch these kinds of bugs during type checking. Although this often works, is often very cute, and is quite similar to the kind of semantic analysis that goes on in our heads when trying to make sense of expressions in natural language, I think that it ignores a deeper issue:  we are cramming non-classical data into classical syntax.

Other interesting repercussions that will become clear when we think about the mathematics that governs the behaviour of classical syntax. More about this next time. For now, I’ll highlight some things that I think are symptoms of the widespread disease of classical thinking about non-classical things.

## Limitations of classical thinking

Apart from cakes, here are some other examples of non-classical data, in order of decreasing coolness:  qubits, money and time. In fact, almost any finite resource you can think of is non-classical, and they all pose similar problems. All these examples are of things that, like eggs and unlike information, cannot be arbitrarily copied and discarded.

But even for ordinary, classical 0s and 1s kind of data, sometimes resource-sensitivity comes into play. A good example is concurrency: the idea that a computation can be split into a number of threads that are executed in parallel. The threads then maybe communicate and synchronise several times during processing in order to perform a task faster than would be possible when using a single thread.

Concurrent programming is the the task of writing programs that can do several things at the same time. It is more important and necessary then ever before, because hardware has been getting increasingly parallel in the last 15 years. Unfortunately, it is also widely feared and considered to be extremely hard to get right. Here, although the data is classical, the actual act of computation is resource-sensitive.  Two or more threads accessing and modyifying the same piece of data can have weird consequences, because of the non-determinism: we can’t predict before hand which thread will get to the data first, and it quickly becomes infeasible to examine all the possibilities.  Getting this wrong can sometimes have deadly consequences. I’ll return to the topic of concurrency later because it’s one of my favourite things to talk about. For now, I will tell you my own personal hypothesis: one of the main reasons why programming language technology has problems with concurrency is that ordinary, classical syntax has problems dealing with resource-sensitive applications such as concurrency. I think that string diagrams can help. (Incidentally, I recently watched Bartosz Milewski’s introduction to category theory lectures:  it was extremely cool to see someone introducing category theory and talking about programming in general, and concurrency in particular! I highly recommend the videos.)

Finally, one of the main points of this blog is that resource-sensitive thinking can actually help in areas that have traditionally been understood in a resource-insensitive way. For example, doing linear algebra with string diagrams reveals all kinds of beautiful symmetries that are hidden or only half visible when expressed in traditional, classical language. This is because the copying and discarding is explicitly a part of the graphical lexicon – it is the black structure of interacting Hopf monoids. Here, although the data is classical, it is nevertheless useful to be explicit about when the copying and discarding is done: only then the full picture becomes visible.

This episode has turned into one long piece of motivational text, with not too much technical content. This will change, because there is quite a bit of nice mathematics involved. So, my plans in the next few episodes are to answer the following questions in more detail:

• what are the mathematical properties of classical syntax?
• how can classical syntax be made (more) resource-sensitive?
• what is true resource-sensitive syntax? (spoiler: it’s string diagrams, just like the ones all over this blog!)
• how to convert classical syntax to resource-sensitive syntax?
• what are some examples of classical syntax, viewed through resource-sensitive glasses?
• what is the history of string diagrams?
• what are some future research and technological challenges to be solved?

The moral of the story will be that string diagrams are not just some weird, trendy notation. They are inevitable, once we accept that the world is full of non-classical things that need to be described properly. String diagrams are resource-sensitive syntax.

# …, a monoid is a category, a category is a monad, a monad is a monoid, …

In the last few months I taught three graduate mini-courses about string diagrams and graphical linear algebra – at Phd Open in Warsaw, a winter school in Estonia, and a graduate school in Leicester. That and teaching have been keeping me busy, so I haven’t invested as much time as I would have liked on the blog. There was also the need to write down all the work done in Hawaii, and Dusko, Filippo and I just finished writing a meaty technical report, which will soon appear on the arXiv. The time that I actually did spent writing blog stuff was distributed across a number of different articles; so now I have a whole bunch of half-finished stuff. I really need to start managing this whole project a little better!

Here’s something really short and off the cuff. At MGS in Leicester, during the exercise sessions, I talked about distributive laws of props. At the same graduate school, Alexander Kurz was giving a course on category theory.  I got the inspiration for this article from one of his lectures.

As part of the distributive laws of props story, you realise that categories are a special case of the concept of monad. But monads are a special case of the concept of monoidAnd monoids–of course–are a special case of the concept of category. By transitivity of the “special case of” relation, we should therefore probably just admit that they are all really the same thing: some kind of Platonic form of plugging things into each other. The story is full of pleasant categorical mental gymnastics. Think of it as the categorical equivalent of a 5km fun run.

## … , a monoid is a category, …

A category is a collection of objects and arrows. Every object has an identity arrow, and any pair arrows f: A → B, g: B → C, where f‘s codomain is the same as g‘s domain, can be composed. Composition is associative, and composing with identities does nothing.

If we restrict to categories with exactly one object – let’s give it a name, say – there is just one identity arrow  (the identity id* on *). Every arrow has * as both domain and codomain, since there are no other objects around. So every arrow can be composed with every other arrow. Composition is associative, and composing with id* on either side leaves any arrow unchanged. In other words, there is a set of arrows Ar with a total, associative, unital operation: i.e. a monoid. The fact that there is an object * adds no interesting information.

So every monoid is a kind of category, a category with one object. Categories, it would seem, are the more general concept. Indeed, some people say that categories are the “categorification” of the concept of monoid. Maybe categories should actually have been called monoidoids?

## … , a category is a monad, …

One way to define the concept of monad is as a triple (T, μ, η), where T: C → C is an endofunctor, μ: T2 → T is the multiplication natural transformation, and η: 1 → C is the unit natural transformation. Multiplication μ is associative and has η as unit, which is usually said with the aid of three commutative diagrams.

In the famous paper 1970s paper “The formal theory of monads“, Ross Street realised that this popular definition is unnecessarily specific. There’s no reason to be overly specific in mathematics: if you can easily have a more general theory/theorems, then why not? What matters is getting to the core of the issues, and not getting too hung up on extraneous details.

In this specific case, an endofunctor is a 1-cell in the 2-category Cat of categories, functors, and natural transformations. The data defining a monad is thus one 1-cell and two 2-cells, subject to some commutative diagrams of 2-cells. The fact that this data lives in Cat brings nothing important to the story: the definition clearly makes sense in any 2-category.

The 2-category Span(Set) is one of my favourites. It’s actually a bicategory–composition is not strictly associative, but we will ignore that detail for the purposes of this lightning article. It’s really just a bit of bureaucracy and adds nothing to the story. Trust me.

The objects of Span(Set) are sets. The arrows from X to Y are spans of functions

X ← Z → Y

Identities are spans formed of two identity functions. Composition of spans involves taking pullbacks. Finally, the 2-cells are span morphisms; functions Z→Z’ that make the two resulting triangles commute. I won’t bore you with the details which you can find on nlab.

The interesting question is what happens when we consider a monad in Span(Set). An endo-1-cell is span of the form

X ← Y → X                           (†).

Let’s call the left function δ0 and the right δ1. Now we need to give two 2-cells, that is, morphisms of spans. Let’s do the unit first: when you unpack the definitions it is a function

ι: X → Y such that δ0ιx = δ1ιx = x.

The multiplication μ goes from the span composed with itself back to itself. When you compute the composition, taking the pullback, the span is

X ← Y×XY → X

where XY = { (y,y’) | δ1(y) = δ0(y’) }. So μ: Y×XY → Y, satisfying δ0μ(y, y’) = δ0y and δ1μ(y, y’) = δ1y’.

So far so good. Associativity says simply that μ(μ(y, y’), y”) = μ(y, μ(y’, y”)) and the unit conditions say that μ(y, ιδ1 y) = y = μ( ιδ0y, y).

All this data has another name: a category. Let’s make it clearer with more familiar terminology: X is the set of objects, Y is the set of arrows. δ0 and δ1 give, respectively, the domain and codomain of each arrow. The function ι: X→Y picks out the identity arrows. Next, XY is the set of composable arrows, and μ: Y×XY → Y defines the composition for any composable pair. The remaining conditions say exactly that composition is associative and identities work as they should. It’s precisely the data we need.

So, a category is just a special case of the more general concept of monad, right?

## … , a monad is a monoid, …

Not so fast. We talked about monads in the generality of 2-categories. The theory of monoids can be easily generalised too: there’s no reason for the specificity of  “set” M, with unit “element” e and multiplication “function” m, satisfying associativity and the unit equations. The world would be boring if everything was just sets and functions. In fact, the concept of monoid makes sense in any monoidal category. Pick some monoidal category X; the data of a monoid consists of

an object X, and arrows m: X⊗X → X and e: I → X

where I is the monoidal unit object. Associativity and the unit laws can be expressed as three commutative diagrams. The usual monoid-as-a-set version is now a monoid, in the more general sense, instantiated in the monoidal category of sets and functions Set×. Here the cartesian product serves as monoidal product.

So now let’s consider some 2-category C, pick an object  C and focus on the set of 1-cells from C to C. Let’s call this thing [C, C]. Actually, it’s not really a set: it’s the full sub-2-category of C on C; i.e. a 2-category with one object. We saw before that to give a category with one object is to give a monoid. In the same spirit, a 2-category with one object has the same information as a (strict) monoidal category. The objects are the 1-cells, and monoidal product (on 1-cells) is composition. I’ll leave the remaining details as a simple exercise.

So what is a monoid in [C, C], considered as a monoidal category? You guessed it – it’s a monad.

So monads are really special kinds of monoids. And monoids are really special kinds of categories. And categories are really special kinds of monads. And …

Aside

# Leicester and the battle for universities

This is an off-topic post, but it concerns something very important. More graphical linear algebra is coming next week.

The short story is this: the University of Leicester is going through a top-down restructuring. The management have decided to force a large number of academics to reapply for their jobs. Of course, following this charade, there will be fewer jobs than academics, so the whole thing is a depraved game of musical chairs. In particular, in the excellent mathematics department, around six academics will be made redundant.

It’s very important that we come together and fight, so please sign this petition. In the grand scheme of things, the very concept of the university as an institution is under threat, and this is just the latest battle. Leicester is a reputable British university, but other countries are increasingly heading in the same direction: turning world-leading universities into failing businesses.

If you’re interested in my views on this topic, see the rant below.

Over the last twenty years, British universities have slowly transitioned away from being institutions that deliver learning and research as a public good. Universities were once seen as advancing human knowledge, and this was considered to be a positive contribution, a raison d’etre in its own right. Higher education meant that people could pursue their interests, fulfil their calling, have access to state-of-the-art research and knowledge in the field of their choosing, and perhaps even contribute to its further development.

At some point in the 1980s–coinciding with Thatcher in the UK and Reagan in the US–universities were infected by neoliberal thinking. Abstracts concepts like “advancing human knowledge” were considered suspicious and retrograde; it’s difficult to assign a monetary value to something abstract. Instead, universities were now seen as a engines of future economic growth. This led to investment in fields deemed as economically beneficial, e.g. applied STEM, law, economics, and savage cuts to fields deemed as less useful to the economy, typically meaning the arts, “less marketable” social science, fundamental scientific research, pure mathematics, and so on.

Stefan Collini, writing in the LRB, put it succinctly:

… gradually, what we still call universities are coming to be reshaped as centres of applied expertise and vocational training that are subordinate to a society’s ‘economic strategy’.

Moreover, universities themselves are increasingly being seen as businesses in their own right. The income comes from ever growing student fees, some government support for research and grant income. Academic career progression is now, in reality, often based on how much cash an academic generates. Non-performing (read, non profit-generating) academics are often bullied and threatened, sometimes with tragic consequences.

At the same time, management depressingly talks about “vision”, “strategy”, “stakeholders”, etc. Their emails are often peppered with meaningless middle-management business speak (think David Brent, or if you’re in the US, Michael Scott). Many British universities are expanding and investing millions to build campuses around the world. My own university has built a campus in Malaysia and the jury is still out on whether this was a sound investment. In Sold Out, another excellent LRB article by Stefan Collini, we find an indicative anecdote:

In June 2012 the University of East London in Cyprus promised to offer ‘high quality British degree programmes in one of Europe’s most popular study destinations’ at a ‘stunning new campus’, but in April 2013 it was announced that after recruiting just 17 students UEL Cyprus would be closed. A spokesman for the university, the Times Higher Education reported, ‘would not disclose how much money the university will lose’.

Senior management, of course, keep giving themselves enormous pay rises and bonuses, regardless of performance: this is justified because they think themselves the equivalent of CEOs of large companies, deserving to be paid at “market value”. For example, the vice-chancellor of the University of Leicester takes home a salary of more that £300k a year. This makes the currently proposed redundancies even more perverse.

Even if you’re not based in the UK, if you care about universities, you should still get involved. Here’s Collini again:

Other countries are looking on with a mixture of regret and apprehension: regret because the university system in this country has been widely admired for so long, apprehension because they fear similar policies may soon be coming their way. In many parts of the world English higher education is, to change the metaphor, seen less as a useful pilot experiment and more as the canary in the mine.

So please, go ahead and sign the petition. As Collini writes in the last paragraph of his essay, we must stop first-rate British universities from turning into third-rate companies.

# 31. Fibonacci and sustainable rabbit farming

Chapter 12 of Fibonacci’s Liber Abaci is full of compelling problems. One is called “On two men with fish and the customs agent”. Another is a riddle about a man who visits a pleasure garden – accessible through seven guarded doors – where it’s possible to collect apples. You can enter freely, but on your way out you must pay each of the seven guards a tax: half of the apples you’re carrying, plus one additional apple.

Artistic impression of Fibonacci’s pleasure garden. The tax at this door is 10/+ 1 = 6 apples.

So, Fibonacci asks, how may apples does the man need to collect in the pleasure garden so that he gets to take one apple home? (spoiler: 382).

Given Fibonacci’s trading roots, his obsession with tax and customs duties is understandable. Without doubt, however, the most famous part of the chapter is not about taxes. It’s a very short exposition entitled “How many pairs of rabbits are created by one pair in one year”, and it starts as follows:

A certain man had one pair of rabbits together in a certain enclosed place, and one wishes to know how many are created from the pair in one year when it is the nature of them in a single month to bear another pair, and in the second month those born to bear also.

Fibonacci doesn’t tell us the name of the certain man. It will become boring to refer to him repeatedly as “the certain man”, so from now on we will change his gender to female and refer to her as Giusy (pronounced Juicy), short for Giuseppa the Rabbit Breeder.

The following helpful table is included in the margin, with the number of rabbit pairs for each month

 beginning 1 first 2 second 3 third 5 fourth 8 fifth 13 sixth 21 seventh 34 eighth 55 ninth 89 tenth 144 eleventh 233 twelfth 377

and the procedure for calculating the successive entries is explained as follows

You can indeed see in the margin how we operated, namely that we added the first number to the second, namely the 1 to the 2, and the second to the third, and the third to the fourth, and the fourth to the fifth, and thus one after another until we added the tenth to the eleventh, namely the 144 to the 233, and we had the above written sum of rabbits, namely 377, and thus you can in order find it for an unending number of months.

This sequence of numbers in the table

1, 2, 3, 5, 8, 13, 21, 34, 55, 89                          ①

is now known as the Fibonacci numbers. As Fibonacci explains, starting with F= 1 and F= 2, we obtain successive entries by recursively calculating

Fn+2 = Fn+1 + Fn.

This famous series—and the related golden ratio–are the absolute bread and butter of popular mathematics. Many facts, factoids and myths involving pineapples, pine cones, aesthetically pleasing rectangles, renaissance paintings etc. have been assigned to them.  They spice up the otherwise vapid pages of Dan Brown novels. As well as books, there are hundreds of videos on YouTube. While many are terrible, there are some gems: for example, this lecture by Keith Devlin is very good since he manages to be engaging, while at the same time challenging some of the most egregious numerological pseudo-science.

In modern presentations, Fibonacci’s famous sequence is usually given as

0, 1, 1, 2, 3, 5, 8, 13, 21,                              ②

or

1, 1, 2, 3, 5, 8, 13, 21, 34,                            ③

We will stick with Fibonacci’s original version .  There’s a good reason for this: Giusy’s first couple of rabbits start breeding already in the first month. So if the first rabbit couple was delivered one month late, Giusy would observe the following series

0, 1, 2, 3, 5, 8, 13, 21, …                               ④

Fibonacci’s  starts with the third number of  and the second of .  But it wouldn’t really work to simply shift  or  to the left: differently from , both  and  contain the sequence 1, 1, which Giusy would never observe.

We can obtain  by adding  and . Addition works pointwise, scanning vertically, 0+1=1, 1+1=2, 1+2=3 etc, as illustrated below. As we will see, this is the way of defining addition on number sequences. We will also define multiplication on sequences, but it won’t work anything like this.

The story of graphical linear algebra so far has taken us from zero to the natural numbers, to integers and then fractions. For the next few episodes we will shift our focus of from numbers to sequences. My aim for the rest of this episode is to whet your appetite by giving you a taste of the maths that we will use. For now, it will be a high-level overview, we will not go over everything in detail.

Eventually we will discuss some applications in control theory and I will do my best to explain when we come to that, although I’m not an expert. In this episode, though, I’ll talk about computer science motivations; some of the things that I found the most interesting while playing around in this particular neck of the graphical linear algebra woods.

We start by generalising Fibonacci’s rabbit problem.

Giusy originally got a pair of rabbits. No external rabbits get added, nor any taken away at any point afterwards. So we can think of this as the following rabbit input sequence

1, 0, 0, 0, 0, 0, 0, 0, 0, 0, …

with resulting rabbit output sequence

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …

Similarly, as we have already discussed, adding the first pair one month late would mean that the shifted rabbit input sequence

0, 100000000, …

results in the shifted output sequence

0, 1235813213455, …

But there’s no reason to add just one pair. For example, what if Giusy started with two pairs of rabbits? And, moreover, let’s suppose that she goes to the rabbit market every second month to buy another pair of rabbits. This rabbit input sequence would then be

2, 0, 1, 0, 1, 0, 1, 0, 1, 0,                           ⑤

Following the Fibonacci principle of rabbit reproduction, we would get output sequence

24, 7, 12, 20, 33, 54, 88, 143, 232,        ⑥

The reasoning is as follows: the input sequence 2, 0, 0, 0, 0, produces the result 2, 4, 6, 10, 16, … obtained by doubling each entry of . Twice the rabbits, twice the breeding. Next, 0, 0, 1, 0, 0, 0, 0, … produces 0, 0, 1, 2, 3, 5, …; that is, a shifted . Similarly for 0, 0, 0, 0, 1, 0, 0, 0, 0, … and so forth. Adding up all of these input sequences results in , so adding up all of the result sequences results in .

We can use the same kind of reasoning to figure out what would happen if Giusy started an ambitious commercial rabbit breeding farm. Say her business plan started with investing in three pairs, then selling one pair in the first month, two in the second, three in the third, and so forth. In this case, the rabbit input sequence would be

3, -1, -2, -3, -4, -5, -6, -7, -8, -9, …

and the result is, unfortunately

3, 5, 5, 5, 3, -1, -9, -23, -47, -87, …

so it looks like Giusy’s rabbit business would collapse sometime in the fifth month: negative rabbits doesn’t sound good. To be a successful rabbit entrepreneur, Giusy needs to understand the principles of sustainable rabbit farming, which we will discover at the end of this episode.

What’s going on here, mathematically speaking?

It looks like we are defining a function that takes number sequences to number sequences. By introducing and taking away rabbits we are causing the total numbers to change. We could try to write down a formula to calculate subsequent entries in the output sequence based on previously outputted values and the current input value. But formulas are boring.

Moreover, I hope that I’ve managed to make you at least a little bit uncomfortable about the pseudo-scientific language of causality. In graphical linear algebra, relations take centre stage. Of course, it is useful to know when a relation describes a function – as we will see below – but functions are not part of our basic setup. Instead, we think of the Fibonacci rules of rabbit breeding as defining a relation between number sequences. For example, it relates

1000000000, … with 123581321345589, …

2010101010… with 2471220335488143232

3-1-2-3-4-5-6-7-8-9, … with 35553-1-9-23-47-87, …

and so forth.

This relation is a kind of “ratio” between sequences. We have already seen how fractions in graphical linear algebra work and that they can be thought of as ratios. A little reminder example: the fraction 3/4 is drawn

and, seen as a linear relation, contains (4, 3), (8, 6), (-2/3, –1/2) and so on; all those ordered pairs that satisfy the ratio “three times the first number is four times the second”.

Next, a sneak preview of how we will use graphical linear algebra to capture the Fibonacci sequence ratio with diagrams.

The Fibonacci number sequence can be expressed nicely by something called a generating function. We will discover some of the underlying mathematics in the future. If you can’t wait, the classic generatingfunctionology by Herbert S. Wilf is free to download. Wilf’s Chapter 1 starts with the following, well-known simile:

A generating function is a clothesline on which we hang up a sequence of numbers for display.

The kinds of generating functions that we will use will be fractions of polynomials. So, as we will see, the generating function for  is

the one for  is

and the one for  is

If you haven’t seen this kind of thing before then this will all look very mysterious. Don’t worry, with graphical linear algebra everything will become crystal clear. At this point, just trust me: there’s a tight relationship between these fractions and the sequences that we saw previously.

I can’t resist making one remark though: look at the three fractions above. Since all of the denominators are the same, following the usual rules of addition of fractions, the first is just the addition of the second and the third, just like  is + as sequences. Indeed, an extremely interesting and useful property of generating functions is that their algebra as polynomial fractions is compatible with the algebra of the sequences they represent.

Generating functions go back to the 18th century and Euler. I’ve always thought of them as a kind of “prehistoric” computer science, 2oo years before Turing. Indeed, computer scientists are often faced with coming up with finite specifications (e.g. programming languages, automata, regular expressions, Turing machines, …) for infinite behaviour. Generating functions are a beautiful example of this phenomenon: they are a compact, finite syntax for infinite sequences.

It’s time to start drawing diagrams. The following is the diagram for (1+x)/(1-x-x2), the generating function for Fibonacci’s .

You should recognise most of the components of this diagram, apart from all the white x‘s on black backgrounds. They are new generators, and they stand for the indeterminate x of a polynomial. Adding them and their associated equations to our graphical language will give us the ability to write down any polynomial fraction whatsoever. We will go through this properly in the next episode.

Now for a preview of the cool stuff. We already noticed that the Fibonacci sequence ratio can be seen as a function from input rabbit sequences to output rabbit sequences. It turns out that this can be observed just by looking at the diagram. In fact, we can use diagrammatic reasoning to rewire the above into the following

which can be understood as a signal flow graph, a concept to which we will devote quite a bit of time in the future. Here the left boundary can be thought of as input and the right as an output. The denominator of the polynomial has been transformed into a fancy feedback loop.

We can now this latest diagram and turn it into some code. There’s no intelligence involved, a simple procedure takes a signal flow diagram and produces code. Here’s the result in OCaml. We will go through the procedure in due course.

(* "forward" Fibonacci *)
let ffib x =
let rec ffibaux r1 r2 r3 x =
match x with
| [] -> []
| xh :: xt ->
xh + r1 + r2 + r3 ::
ffibaux xh r3 (xh + r1 + r2 + r3) xt
in ffibaux 0 0 0 x;;

If you don’t know OCaml, don’t worry, you can still cook along: open this in a fresh browser window. It’s an interactive OCaml shell running right in your browser. Copy and paste the above code into the prompt.

If everything worked as expected, you ought to get the following

val ffib : int list -> int list = <fun>

which says that ffib is a function that takes a list of integers and returns a list of integers. OCaml is nice like that, it figures out the type for you. Now paste the following

# ffib [1; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0];;
- : int list = [1; 2; 3; 5; 8; 13; 21; 34; 55; 89; 144; 233; 377]

which is the list of the first thirteen Fibonacci numbers, as in Liber Abaci. Say Giusy throws in one additional rabbit pair each month, then the numbers will be

# ffib [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1];;
- : int list = [1; 3; 6; 11; 19; 32; 53; 87; 142; 231; 375; 608; 985]

You get the idea; feel free to try other inputs.

Finally, we return to Giusy’s dream of becoming a successful, sustainable rabbit breeder. She’s in luck because the original diagram can be rewired in another way, again looking like a signal flow graph but this time where the input is on the right and the output in on the left. Now it’s the numerator that turns into a feedback loop.

But it’s still morally the same diagram, the same ratio, relating the same sequences. Just that now we understand it as having its input on the right and its output on the left. Nothing wrong with that: graphical linear algebra is good with symmetries. It’s also why dividing by zero wasn’t a problem.

We can also turn the new diagram into code.

(* "backward" Fibonacci *)
let rfib x =
let rec rfibaux r1 r2 r3 x =
match x with
| [] -> []
| xh :: xt ->
xh - r1 + r2 + r3 ::
rfibaux (xh - r1 + r2 + r3) r3 (-xh) xt
in rfibaux 0 0 0 x;;

and, as a function, this now computes the inverse to ffib. So, for example:

# rfib [1; 2; 3; 5; 8; 13; 21; 34; 55; 89; 144; 233; 377];;
- : int list = [1; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0]

Now, let’s say that Giusy has space for 5 rabbit pairs and wants to sell off any overflow. What should her business plan look like?

# rfib [5; 5; 5; 5; 5; 5; 5; 5; 5; 5; 5;];;
- : int list = [5; -5; 0; -5; 0; -5; 0; -5; 0; -5; 0]

And we see that she ought to start with 5 pairs, sell off 5 in the first month, and every second month after that. Giusy’s happy, with both bankruptcy and rabbit plagues avoided. Graphical linear algebra saves the day again.

# 30. The essence of graphical linear algebra

Before we get started, I have an annoucement. Recently I have been contacted by Vincent Verheyen, a Belgian polyglot mathematician/developer who generously offered to translate the blog into several languages, starting with Dutch and French, but also German, Spanish, and even Chinese (with the help of his girlfriend Chung-Yi Lan). The first few episodes are already available and more will follow in the coming weeks. Vincent has set up his website in a way that makes external contributions possible, and the translations themselves are published under the Attribution-ShareAlike 4.0 International licence. So, if you are interested, please do get involved! I’m really honoured that Vincent has invested his time and energy in this effort, and it’s wonderful that the material will find a wider readership: at the moment, about 80% of the hits on the blog are from English speaking countries.

There is a new video series on YouTube by 3Blue1Brown called “Essence of linear algebra”. It’s slick and gives visual intuitions for matrices, spaces, determinants etc. It makes the important point that intuitions are not emphasised enough: too often linear algebra is presented as a series of definitions and algorithms to memorise. But you’ll notice that the diagrams there are quite different to the kinds of diagrams on this blog. The series has inspired me to think about the “essence of graphical linear algebra”. The story on this blog has been divided into bite size chunks, and as a result, it is sometimes difficult to see the wood for the trees. So in this episode we take a step back and look at the big picture. But first, let’s start with a little history.

Early 20th century mathematicians and logicians were a nervous lot, facing an existential crisis and a severe inferiority complex. Engineers were building astonishing machines that transformed the world in the timespan of a single generation. Their work was made possible by physicists who made breakthroughs in understanding the basic laws that govern the physical world, and by chemists who charted interactions between different types of molecules and compounds. For all of these pursuits there was an external reality to point to, a sanity check. A plane that did not fly was not much use; the design was improved or thrown out. Similarly, theories in the physical sciences could be put to the test in the lab. They were falsifiable: if experimental data did not fit the theory, it was the theory that was thrown out, not the real world.

Mathematicians, instead, worked with abstract systems, applying various levels of rigour. Instead of an external reality, they could point to the beauty and intellectual depth of their intuitions and results.  An added bonus was—as the physicist Eugene Wigner put it in 1960—the unreasonable effectiveness of mathematics in the natural sciences. Therefore the party line was (and still is) that even if the applications were not immediately obvious, eventually research in mathematics pays off. Mathematical advances are thus a vital ingredient in the supply chain of civilisation and technological advancement.

It’s a nice story, but what if it’s all built on sand? The mathematician’s ultimate defence–and for many the raison d’être of the entire discipline–is rigour and logical certainty of the work. Unfortunately, this was shaken up severely by Bertrand Russell in 1901 with his famous paradox. We have already seen that Russell had interesting things to say about causality. With the paradox, he showed that (naive) set theory, a branch of mathematics developed in the 19th century as a unifying formalisation for mathematics, was inconsistent. Basically, this meant that any proposition whatsoever could be proved. We could prove 2+2=4, but we could also prove 2+2=5. Nothing could be falsified because everything was both true and false at the same time. This, understandably, led to a moral panic. There were several important consequences, including Hilbert’s programme, Zermelo-Frankel set theory, Gödel’s incompleteness and Nicolas Bourbaki.

This blog is not about philosophy of mathematics, so I will skip right ahead to the main point: my intense dislike of Bourbaki. If you have not heard of Bourbaki before, it’s a rather romantic, movie-script kind of story: the scene is Paris in 1935, where a group of young, disaffected, rebel mathematicians wanted to change the world; or, at least, their discipline. They began to discuss their ideas of how to change mathematics, which they saw as too often informal, too disconnected, old-fashioned and chaotic. It was time for a revolution.

Street scene, Paris 1935.

To emphasise the collective nature of the endeavour, they published under the nom de guerre Nicolas Bourbaki. From 1939 onwards, Bourbaki authored a series of very influential textbooks. The project continued and gained power and prestige: over subsequent decades the group’s membership changed and at various times included many of the leading lights of modern mathematics as contributors, people such as Eilenberg and Grothendieck. The following is a description by Marjorie Senechal, taken from her wonderful 1997 interview with Pierre Cartier, a Bourbaki member between 1955 and 1983:

From its beginning, Bourbaki was a fervent believer in the unity and universality of mathematics, and dedicated itself to demonstrating both by recasting all of mathematics into a unified whole. Its goals were total formalization and perfect rigor. In the post-war years, Bourbaki metamorphosed from rebel to establishment.

Bourbaki’s unifying idea was that mathematics ought to be viewed as the study of mathematical structures. A mathematical structure is a set, i.e. an abstract collection of elements, with some structure: e.g. operations, subsets, constants. These then are required to satisfy a number of axioms. A vector space is one example: it is a set, whose elements we call vectors, containing a special element 0, an action of a field on that set, and an addition operation, all together satisfying a long list of axioms. All this data is now found in the first two pages of any advanced linear algebra textbook. Indeed, linear algebra–from the point of view of a Bourbakian–is the study of the mathematical structure called vector space. End of story.

Bourbaki resulted from similar currents of thought that produced fascism and totalitarian communism: moral panics leading to revolutions, and ultimately “final solutions”, all terrible and evil in various ways and degrees. Bourbaki was an attempted “final solution” for mathematics. This is not my hyperbole: in Senechal’s interview, Cartier says:

A final solution. There are good reasons to hate that expression, but it was in the people’s minds that we could reach a final solution. … In science, in art, in literature, in politics, economics, social affairs, there was the same spirit. The stated goal of Bourbaki was to create a new mathematics. He didn’t cite any other mathematical texts. Bourbaki is self-sufficient. … It was the time of ideology: Bourbaki was to be the New Euclid, he would write a textbook for the next 2000 years.

Bourbaki’s new mathematics had no space for pictures. Diagrams were too fragile, not nearly mensch enough. Cartier says:

Bourbaki’s abstractions and disdain for visualization were part of a global fashion, as illustrated by the abstract tendencies in the music and the paintings of that period.

Bourbaki is the reason why we still have boring textbooks in mathematics in general, and linear algebra in particular. Bourbaki hated intuitions. Bourbaki is partly responsible for the insane levels of specialisation in modern mathematics. In this blog we fight Bourbakism. ¡Hasta la victoria siempre!

Graphical linear algebra is not the study of a mathematical structure in a Bourbakian sense. Instead, we use diagrams to describe linear algebraic concepts. So matrices are not homomorphisms between vector spaces, but diagrams in a theory of adding and copying. As we have seen, an m×n matrix A is a diagram with n dangling wires on the left and m on the right. In our graphical shorthand:

The cool thing is that matrices are just the beginning. For example, a Bourbakian would consider the kernel of A as a sub vector space ker A of the domain of A: that is, a set of vectors v such that Av = 0. In graphical linear algebra the kernel is not a set. As we have seen, it is also a diagram, one that is obtained in a modular way from the diagram for A.

Similarly, the image of A is not the set { w | ∃ v. Av = w }. It is the following diagram.

Our diagrams are not fragile. They are precise mathematical objects and come with a set of equations which allows us to write proofs, transforming one diagram into another. As we have seen, all of these equations have rather simple justifications and describe how an abstract operation of addition interacts with an abstract operation of copying. Hence the slogan, which we saw back in Episode 7:

Linear algebra is what happens when adding meets copying.

Proofs are done through diagrammatic reasoning, which we discussed in Episode 6 . Their style departs from Bourbakian set theoretic arguments like “take an element x in ker A”. To show, for instance, that A has a null kernel if it is injective, we start with the diagram for A, append another diagram to obtain its kernel, and use the assumption of injectivity to transform it into the diagram for the null subspace. We saw this in the last episode, but here’s that part again, diagrammatically.

What really excites me about graphical linear algebra is the fact that it we can have our cake and eat it too: the diagrams provide intuition but are also sentences of rigorous, formal language.

Moreover, the diagrammatic method feels like a high level language when compared to the low level “machine code” of Bourbakian mathematical structures. We saw that the concepts of natural numbers, integers and fractions all follow from the interactions of the basic generators. There was no need to encode these concepts as separate mathematical structures. And the story led to some surprises too: in particular, we had to get over our dividebyzerophobia.

I hope to emphasise the idea of “diagrams as a high level language” in the next ten episodes or so. My plan is to start exploring how string diagrams can help in describing and reasoning about physical systems; in particular we will look at things called signal flow graphs in systems theory. There are some surprises in store: we will look at the idea of feedback in a new way. As we have seen, feedback is a concept that deeply troubles people who believe in causality. If causalists had their way, they’d happy throw the real world away to get away from the problems that feedback brings.

Bourbakism was the effect of the foundational crisis in mathematics of the early 20th century. Mathematicians can feel a little bit more secure today: after all, computer science and the information revolution has been built on mathematics and formal logic. Mathematicians also no longer seem to care very much about foundations, and computers mean that we don’t need to taylor notation for ease and speed of calculation. Instead, we can emphasise what I believe is the true purpose of maths: making complicated, seemingly impenetrable things intuitive and simple.

My sabbatical has finally started. On Monday evening I arrived in Honolulu, and I will be working with Dusko Pavlovic at the University of Hawaii for the rest of the year. Dusko has been working on very exciting things, one of which is to try to understand the “essence of computability theory” with string diagrams, not unlike the kinds of diagrams that we’ve been using on this blog. He uses them to form a theory called monoidal computer. You can check out an introductory paper here. I’m really excited to find out more and hopefully contribute something interesting.

I will also, hopefully, have more time to write.

Continue reading with Episode 31 – Fibonacci and sustainable rabbit farming.

# 29. Dividing by zero to invert matrices

I’ve let the blog slide. At the end of this episode I’ll write a bit about what I’ve been doing in the last few months. For now, let’s just jump straight back into graphical linear algebra.

In order to motivate myself to get back to writing, I brought forward an episode that I’ve been excited about since late last year, when I first did the research. This material is closely linked to Episode 26 on division by zero. If you haven’t read it, or you have forgotten, it will be useful to take a look before continuing.

Episode 26 was quite popular — maybe the clickbait title helped. It did leave some people a bit skeptical: ok, so this guy has derived addition and multiplication tables that seem to say that dividing by zero makes sense. But can this weird idea of dividing by zero actually be useful, or is it merely a curiosity?

There are other number systems that, in principle “allow” you to divide by zero. Many of them work similarly to the IEEE standard for floating point arithmetic, which has an additional entity called NaN (not a number). Roughly speaking, it’s the result of all of the calculations that shouldn’t make sense, like dividing by zero. Next, if you add and multiply a NaN with anything else, you still have a NaN. So it’s kind of like a black hole number: once you get it part way through your calculation you can be pretty sure that something has gone wrong and that the overall result will be a NaN — no matter what you do, you will never end up with an ordinary number again.

This is not the case for the number system from Episode 26. In fact, the three additional elements , , do show up in calculations. The cool thing is that they are not a sinkhole like NaN: you don’t need to panic if you get, say, a  in one of your calculations. It’s possible that things will finish well. This episode is entirely devoted to one example: inverting matrices.

We will need a few definitions.

An n×m matrix A is said to be invertible when there exists an m×n matrix B, such that when you compose in either direction, you get the identity. In diagrams:

If such a B exists, we sometimes write A-1 = B. Many matrices are not invertible. One property of those that are is that m=n, that is, invertible matrices are all square; the number of rows is equal to the number of columns.

Actually, it’s like this in most PROPs – you can usually only invert things where the left and the right boundaries have the same size. A useful intuition to keep in mind is that something can only be invertible if it can be undone. And since wires carry information, we can’t just throw one away or add a new one and expect to be able to undo it.

Back to definitions. We will say that an n×m matrix A is injective, or mono, when the following is true.

The backwards facing A is none other than the mirror image symmetry  that we discussed in Episode 27.

The above definition probably seems a bit strange. In most ordinary linear algebra textbooks, the defining property of an injective n×m matrix is that it takes  different m-vectors to different n-vectors: that is, it is injective as a function. This is actually an equivalent formulation, but we don’t think about matrices as functions on this blog.

One interesting property of mono matrices is that they are left-cancellable: that is, given two equal-sized matrices B, C for which AB and AC is defined, if AB=AC then B=C. In fact, this is true when B and C are linear relations. Diagrammatically:

implies that

Here B and C are linear relations, i.e. arbitrary diagrams in IH. This of course includes the case where B and C are matrices.

Actually injectivity and left-cancellability are also equivalent, i.e. assuming one, you can prove the other. I’ll leave the proofs as an exercise but I’ll give you a hint: it’s handy to use the fact, stated in Episode 27, that for any linear relation A, A is its weak inverse.

The bizarro of injective is surjective. We will also use the word epi. Thus, an n×m matrix is surjective when the following is true:

Just as mono matrices are left-cancellable, epi matrices are right-cancellable; given B and C such that BA and CA make sense, if BA=CA then B=C. In terms of ordinary linear algebra, surjective matrices are those that are surjective as functions; for every n-vector b, there exists an m-vector a such that Aa=b.

There is one final observation that we will need later: any invertible matrix is both injective and surjective. To keep things moving along, I’ll also leave the proof of this as an exercise.

Let’s now prove a classic result of linear algebra. It refers to kernels and images, which are linear subspaces that we discussed in the last episode.

Kernel and Image Lemma. Suppose that A is an m×n matrix. Then:

1. A is injective exactly when it has a null kernel
2. A is surjective exactly when it has a full image.

Proof. Properties 1 and 2 are mutually bizarro, so it’s enough to prove just one. We will do 1.

Suppose that A is injective. Then:

Which means that A has a null kernel.

Suppose now that A has a null kernel; we want to show that it must be injective.

That’s it, we’re done, QED.

The following lemma is very useful because it gives us some useful graphical intuition about invertible matrices.

Invertible Matrix Lemma

Suppose that A is a square matrix of size m. Then the following are equivalent

1. A has inverse B
2. There exists matrix B such

Proof. Let’s first show that 1 implies 2. Assuming 1, Suppose that A has inverse B. Since A is invertible, we know that A is injective. So we have the following:

Which gives us 2. In the first step, we used the assumption that B is the inverse of A, and in the second step we used the fact that A is injective.

We still need to show that 2 implies 1. So, supposing that A is equal to some matrix B heading in the opposite direction, we need to show that B is the inverse of A. To do that, it’s useful to show that A must be injective. Here we can use our Kernel and Image Lemma.

So, by the kernel lemma, A is injective. We can now use this to show that A followed by B is identity.

To show that B followed by A is also identity, we can do the bizarro thing, first showing that B is surjective.

Our aim is to obtain a formula for the inverse of a matrix, if it exists. The starting point is an arbitrary 2×2 matrix, so something like this:

If an inverse exists, then we know it’s also going to be a 2×2 matrix:

The tricky part is, how can we calculate a’b’c’ and d’, knowing ab, c and d? If you remember your linear algebra, maybe you’re thinking determinants and adjugate matrices. We will bypass all that and draw some pictures instead.

Let’s start by using our Invertible Matrix Lemma. As diagrams in IH, we have:

Now adding a matrix to -1 times itself gives us the zero matrix. Using this trivial observation, we get the following rather complicated looking equation:

We can extract the formula for each of a’, b’, c’ and d’ by plugging up some of the dangling wires in the above equation. Let’s work through the procedure for a’.

The first step is to plug a zero and a discard at the right places in both sides, like so:

Doing that, we get the first equality below.

The second equality follows from the bone equation of bimonoids.

So far so good: but now we can drastically simplify the left hand side. Essentially, we’ve plugged a white unit into a black comultiplication, and vice versa; usually when that happens, there is mass destruction. I’ll leave the graphical derivation to you. After you’ve killed off most of the diagram, the left hand side ends up looking like this:

The nice thing is that now a’ is the only entry from the inverse matrix that’s left. And we already know that the diagram above is equal to zero, i.e.

So we are in the situation where a’ plus -1 times some mess is equal to 0; thus:

This is starting to look like a formula. We can untangle the still somewhat messy right hand side above. The result is:

In terms of a bona-fide, old-fashioned syntactic formula—if you’re into that kind of thing—this translates to:

Note that I did not go further in simplifying the denominator because d might be zero, and in that case, multiplication may not be commutative. Relax; to evaluate the formula for specific numbers, we can use our addition and multiplication tables from Episode 26.

There is nothing special about a’. We adapt our procedure by plugging up the other combinations of wires and obtain formulas for b’, c’ and d’. All together:

I find it quite cute how a, b, c and d permute in each of the formulas. And no determinants, nor adjugates in sight!

Let’s do a worked example, calculating the inverse of the matrix:

Referring to the formula for the inverse, we have d=0, so we will divide by zero when calculating a’. For that, we can refer to the multiplication and addition tables from Episode 26. Here’s the calculation:

The other calculations don’t involve any funny stuff — we end up with

I promised you an example where the extended rational algebra of Episode 26 can be useful. And while inverting 2×2 matrices is not going to set the world alight, I hope that this is a hint that there is something interesting going on!

By the way, the formula that we derived for inverting 2×2 matrices generalises to arbitrary square matrices. I don’t have the space for this in this episode, but if there’s interest I can come back to this topic. Let me know.

Not writing is like not going to the gym (which, coincidentally, I’m also guilty of). I mean that once you start not writing, it’s difficult to get back into it. “I’ll do some writing next weekend” – you tell yourself. Before you know it, half a year has passed. Same with (not) going to the gym. I’m still working on that.

Having said that, it’s been a very busy half a year. In addition to teaching,  I visited Paris—and more specifically, Jean Krivine—for a month. I love Paris; I spent a six-month postdoc there in 2005 and it was fantastic to be back in the 13e arrondissement.

Apart from enjoying the city, it was a really productive visit and I learned a lot. Jean is currently working on a graphical high-level programming language for biologists, which shares many features with the kind of string diagrams that this blog is obsessed with.

Amongst other things, Jean told me about a family of string-diagrammatic calculi for logic, developed by the famous 19th century American logician C.S. Peirce (pronounced “purse”; it’s Dutch, apparently). Jean uses some of Peirce’s ideas for his programming language. Peirce’s systems are called existential graphs and have a fascinating history.  While Peirce considered them his most important scientific achievement, they remained unpublished in his lifetime; I guess due to a lack of typesetting technology for diagrams, but maybe also due to the fact that his ideas were so revolutionary that they were about 100 years before their time. In fact, Peirce’s diagrams came before the standard syntactic way of writing logic, with all the upside down As and backwards Es that logicians are so fond of. But his work was forgotten and were rediscovered only in the 1960s. They are still a topic of research.

Existential graphs really blew me away. I plan to understand them properly over the next few months, and maybe I will write about them more seriously one day. For now, if you’re interested in reading further, the PhD thesis of Frithjof Dau seems like a great resource.

This weekend I will be in Porto for the HDRA workshop, and the weekend after that I’m flying to New York for the LiCS conference. It’s going to be a busy summer, but I will try to work more regularly on the blog from now on.

Continue reading with Episode 30 – The essence of graphical linear algebra.

# 28. Subspaces, diagrammatically

There are two common ways to interpret the information represented by a matrix. Roughly speaking, it boils down to whether—when looking at a grid of numbers—you choose to see a list of columns, or a list of rows. The two interpretations are symmetric, and this is nicely captured by this blog’s obsession, the graphical syntax. And as we will see, the two ways of looking at matrices will lead us to two different, symmetric ways of understanding linear subspaces.

Consider an m×n matrix A, drawn with our graphical sugar for matrices as a gadget with n wires entering on the left and m wires exiting on the right, like so:

A can be viewed as consisting of n column vectors c1, c2, …, c. Each of the c‘s is individually a m×1 matrix, so it has one wire entering on the left and m wires existing on the right. We can then wire up A, emphasising the column vectors, like so:

Alternatively, we can view the same matrix A as consisting or of m row vectors r1, r2, …, rn, with each having n wires on the left and one wire exiting on the right. This leads to drawing A like this:

Even though in both cases we are talking about the same matrix—the same raw data—it turns out that the two different points of view lead to two quite different interpretations.

Let’s work through a concrete example. Consider the matrix

which has no particular significance; I just pulled it out of a hat, so to speak. But it’s usually a good idea to have a particular, concrete object to work with, and this one does the job.

Emphasising the columns, it can be drawn like this:

Or we can focus on the rows, as shown below.

Viewing a matrix as a list of columns leads to the first interpretation of what a matrix means: a recipe for of transforming n vectors—that is, columns of of n numbers—into m vectors, i.e. columns of m numbers.

Indeed, let’s take another look at A drawn in the way that emphasises its columns.

We can imagine the n dangling wires on the left as being n different inputs. Now each ci takes an input number and produces a column of m numbers. At the end, the n columns, each of size m, are simply added up.

Let’s work through an example to see how this works, using the concrete matrix from before.

Here, our input vector consists of 51 and -2. Each input results in two outputs, according to the recipe within each column block. They are then added up, e.g. 5+(-1)+(-2)=2 in the first output. Using the standard linear algebra jargon, we have produced a linear combination of the column vectors:

In our worked example above we took one particular, quite arbitrary choice of inputs: 5, 1 and -2. This leads to a natural question: exactly which vectors can be produced by this circuit, supposing that we are free to choose any inputs whatsoever.

In the example above, it turns out that all outputs can be produced by picking the appropriate inputs. But this is not always going to be the case; for example, if all the columns consist of only zeros then clearly only the zero output is possible.

The general question is, then, the following: given column vectors c1c2, …, cn, which m vectors can be written as a linear combination

α1c1 +  α2c2 + … + αncn?

How can we express this idea in our graphical language?

It turns out that the idea of considering all possible inputs is simple to describe. It is also quite intuitive. First, let’s return to our venerable discard generator.

Recall that our intuition, backed by the functional semantics, is that this simply discards any number given to it. With matrices, we usually like to think of numbers flowing from left to right. So what are we to make of discard’s mirror image?

Well, if the discard can throw away any number, then a nice way to think about its mirror image is that it can generate any number. So, to get back to our original question, we can talk about the linear subspace of Qm generated by the matrix A by simply considering all the possible inputs to each column. Like so.

The diagram now represents the subspace that consists of all the linear combinations α1c1 +  α2c2 + … + αncn of A‘s column vectors, where each α is a fraction. More precisely—as we have discussed in the last episode—the diagram gives us a linear relation of type Q⇸ Qm, which we can (since Q0 contains only one element) then simply think of as a subspace of Qm.

This subspace is quite famous and has several names; it’s variously called the image, range, or column space of A, or the span of c1, c2, …, cn. And, using our usual syntactic sugar, we can simply write it as:

I claimed above that our example matrix can produce any pair of outputs. Let’s do the calculation to make sure: it’s a nice little exercise in diagrammatic reasoning.

The image is thus the entire space Q2, so for any pair of fractions p, q we can indeed find inputs u,v,w such that

Viewing a matrix as a collection of rows leads to another popular interpretation of the information that a matrix represents: a list of equations that constrain the inputs. Let me explain.

Drawing A in a way that emphasises the rows, we can think of each ri as encoding a linear combination of the inputs.

For example, going back to our original matrix, let’s label the inputs with variables x, y and z.

So the rows tell us how to combine the various inputs to form the outputs: for example, the first row gives us x-y+z and the second 2y+z.

Here, the natural question is to think of the rows as homogenous equations, and consider the set of solutions, those concrete choices of x, y and z that satisfy them. Homogenous, by the way, means simply that we append = 0 to each linear combination of variables.

So for our running example, we could consider the collection of all the inputs constrained so that they satisfy the two equations:

x-y+z = 0                  ①

2y+z = 0                  ②

We can also represent this graphically. When we emphasised the columns, we introduced a new intuition for the mirror image of discard. This time we are focussing on the rows, and so, not surprisingly, we will introduce a new intuition for the mirror image of the zero generator.

Recall that the zero generator, in the original left-to-right functional interpretation, simply outputs 0.

So how are we to think of it’s mirror image?

The answer is that it’s a constraint: it only accepts 0!

Using this insight, the solution set of the list of homogenous linear equations given to us by the rows of the matrix A is graphically represented as:

or using our usual sugar:

This is a linear relation of type QnQ0, so for the same reasons as before, it’s pretty much the same thing as a linear subspace of Qn. This subspace is also very important in linear algebra, and is variously called the kernel, or the nullspace of A.

Let’s do a calculation that will give us more information about the kernel of our long running example matrix. The calculation is a bit longer than before, but the details are not so important at this stage.

In the derivation we used the Spider theorem from Episode 23.

Here’s the interesting thing about the calculation: we started with a diagram in which we were thinking of the “flow” as going from left to right, where our inputs had to satisfy some equations. We ended up with a diagram which is easier to understand as going from right to left. Indeed, it’s an example of the kind of subspace that we got by focussing on the columns; it is:

where N is the matrix

In other words, the kernel of M is the same linear subspace as the image of N. This tells us that everything in the kernel is of the form

where α is some fraction. So the solutions of the system of equations  and  are x = 3α, y = α, z = -2α.

We have seen that some descriptions of a linear relation can be read “from left to right” and others “from right to left”. Indeed, we have just seen an example of a linear relation that had two descriptions:

So what is really going on here? It turns out that we can think in two ways about every linear relation.

1. Every linear relation can be seen as “being generated”, in the way we described the image subspace.

2. Every linear relation can be thought of as consisting of the solutions of homogenous equations, of inputs “being constrained”, the way we described the kernel subspace.

Let’s be more precise about this. We will start by considering the first of the two situations, the “being generated” case.

The proof of IH ≅ LinRel—which we haven’t seen yet but we will go through in due course—has one important consequence: for every linear relation R: m⇸n we can find an m×p matrix A and n×p matrix B such that

If we have such matrices A, B then we will say that they are a span form of R. Span forms allow us to think of R as being generated. In fact,

is in span form, since

is the graphical representation of the unique 0×n matrix.

The proof of IH ≅ LinRel also gives us the “being constrained” case, which is pleasingly symmetric. For every linear relation R:m⇸n, we can, respectively, find q×m and q×n matrices C and D such that

This is called a cospan form of R. Cospan forms capture the idea that a linear relation can be thought of as the solution set of a bunch of homogenous equations. An example is the kernel of a matrix A

since

is the unique m×0 matrix.

One little quirk of history is that the traditional way of doing linear algebra focusses very heavily on the span view, and almost totally ignores the cospan view. One takes a space and finds a basis, one talks about its dimension, etc. But the cospan view is just as expressive, given the symmetries of graphical linear algebra. So, it’s entirely possible that Martian linear algebra has been developed in a way that prioritised the cospan view of linear subspaces. It’s fun figuring out the details, and we will do some of this in the next episode!

I’m having a very heavy start to 2016: originally I was planning to publish this episode in early January, and it’s now early February. The highlight of the month was a trip to Oxford on Friday January 8. I was there to examine a thesis entitled “!-Logic: First-Order Reasoning for Families of Non-Commutative String Diagrams” of (the now Dr.) David Quick, supervised by Aleks Kissinger. I liked the thesis very much: it’s about a logic for string diagrams, much like the ones on this blog. David’s logic supports reasoning about whole families of diagrams, including supporting sophisticated inductive arguments; much more sophisticated and powerful than the kind of induction that we have been using on the blog. I will have to write about it at some point in the future.

I arrived in Oxford quite early, and it was a really beautiful morning. On my walk from the train station to the computer science department I got the urge to take a photo.

The next Friday, January 15, I was in Cambridge to give a talk. That trip was also a lot of fun, and I got the chance to catch up with some old pals. The evening was very pleasant, and I decided to walk from the computer science department to the train station. On the way I took this photo:

I later noticed—and I promise, as much as it looks like it was, this wasn’t planned!—that the photos themselves are quite symmetric: compare the relationship of water to land. Since we spend a lot of time talking about symmetry on this blog, I think it’s somehow appropriate.

Continue reading with Episode 29 – Dividing by zero to invert matrices.

# 27. Linear Relations

In the last two episodes we focussed on fairly simple diagrams in: those with just one dangling wire on each end. In this episode we expand our horizons and consider arbitrary diagrams. It’s time to return to linear algebra. We will also revisit division by zero from a different, but equivalent, point of view.

We’ve known since Episode 10 how to handle matrices diagrammatically, and since our expanded graphical language can handle fractions, we can put the two together and draw matrices of fractions. The algebra works as expected: if we restrict our attention to these matrices, composition is matrix multiplication, and the monoidal product of two matrices is their direct sum. Here’s a reminder of how to draw, say, the following matrix

But we can do much stranger things. Since we have the mirror image operation  in our arsenal, we can consider matrices going backwards. We don’t have any traditional notation for this: how could you write a matrix that “goes backwards”?  But here’s an example in the language of diagrams, if

Then we have can draw the diagram for B going from right to left.

Matrices and backwards matrices are not the full story because diagrams can get more complicated. In general, diagrams from m to k will be quite strange beasts: imagine a zig-zag of matrices going back and forth. It’s as if we took our Lego and got confused about what are the studs and what are the holes. Everything goes.

Let’s clear up the mystery. We know that the sub theories B and H are isomorphic to, respectively, the props Mat of matrices of natural numbers  and MatZ of integer matrices. Knowing this helped us to understand the diagrams in those theories, and what they were capable of expressing.

So, without further ado, it’s time to meet the linear algebraic prop LinRel that hides behind the equations of graphical linear algebra. Coincidently it’s my favourite category, so please forgive me if I sound a bit too much like a fanboy in the following. Before we start, we need some notation: in the following, the bold letter Q stands for the set of fractions, aka the rational numbers.

The arrows from m to n in LinRel are linear relations from Qm to Qn. Roughly speaking, LinRel is what happens when you combine matrices and relations. The result is truly delightful; it is much more than the sum of its parts. You will see what I mean.

Before I explain what linear relations are in more detail, let me first say that it’s a huge mystery that the concept of linear relation does not seem to be particularly standard or well-known: there is not even a wikipedia page at the time of writing this episode! Since there’s an informative wikipedia page about, say, Cletus from the Simpsons, this is a bit offensive. Maybe it’s because of a human preference for functional thinking, at the expense of relations? We discussed that a bit in Episode 20.

So what is a linear relation exactly?

First, a linear relation of type Qm ⇸ Qn is a relation, so a subset of the cartesian product Qm × Qn. De-jargonifying, this means that it’s a collection of pairs, with each consisting of an m-tuple and an n-tuple of fractions. For example, the linear relation for the addition generator—which we have already discussed informally in Episode 22—is the relation of type QQ that consists of all pairs

where r and s are fractions. We have thus finally demystified what the enigmatic universe of numbers Num is: for now Num = Q, our numbers are fractions.

We still need to account for the word linear in linear relations. Said succinctly, to be linear, a relation  R: Qm ⇸ Qn must be a linear subspace of Qm × Qn, considered as a Q vector space. If you don’t know what those words means, let me explain.

First, R must contain the pair of zero columns.

R must also satisfy two additional properties.The first is that it must be closed under pointwise addition. This scary sounding condition is actually very simple: if R contains pairs

and

then it also must contain the pair obtained by summing the individual numbers:

It takes a lot of space to write all those columns, so let’s introduce a common notational shorthand: we will write a for the a-column (or column vector), b for the b-column and so forth. When we write a+c we mean that a and c are columns of the same size and a+c consists of a1+c1, a2+c2 and so on until am+cm. So—using our new shorthand notation—for R to be closed under pointwise addition, whenever (ab) and (cd) are both in R then also

(a, c) + (b, d)  :=  (a+c, b+d)

must be in R.

The second property of linear relations is that they must be closed under scalar multiplication. This means that whenever

is in R then also

must be a member, where k is any fraction. We can also do say with the shorthand notation: given a vector a and fraction k, we will write ka for the vector with entries ka1, ka2, …, kam. So, being closed under scalar multiplication simply means that if a is in, then ka must also be in.

Composition in LinRel works just like it does in Rel, so composing R: Qm ⇸ Qn and S: Qn ⇸ Qp gives R;S : Qm ⇸ Qp with elements (x, z) precisely those for which we can find a y such that (x, y) is in R and (y, z) is in S.

I claim that LinRel is a prop. To be sure, we need to do some things that may not be immediately obvious:

1. verify that the identity relation is a linear relation,
2. check that composing two linear relations results in a linear relation, and
3. say what the symmetries are and make sure that they work as expected.

The first one is easy, try it!

For point 2, we need to show that if R: Qm ⇸ Qn and S: Qn ⇸ Qp are linear relations then so is R;S.

Let’s prove closure under pointwise addition: suppose that (x1z1) and (x2z2) are in R;S. We need to show that (x1+x2, z1+z2) is also in R;S. Using the definition of relational composition, there exist y1, y2 such that (x1y1) and (x2y2) are both in R and (y1, z1) and (y2z2) both in S. Since R and S are linear relations, they are both closed under pointwise addition, so we can conclude that (x1+x2, y1+y2) is in R and (y1+y2, z1+z2) is in S. It thus follows that (x1+x2z1+z2) is indeed in R;S. I’ll leave you to check that it’s closed under scalar multiplication, the argument is very similar.

For point 3, for now I’ll just tell you what the relation for the twist 2 → 2 is and hope that it gives you the general idea. The twist in LinRel is the smallest linear relation that contains the elements

Since we need to close under addition and scalar multiplication, elements in this relation will look like

where k and l are fractions.

We have already discussed the relational semantics of the generators in Episodes 22, 23 and 24 and verified that the equations make sense, relationally speaking. It is not difficult to check that each of the relations is linear. Long story short, this assignment of relations to generators defines a homomorphism of props

IHLinRel.

As we will eventually prove, this is an isomorphism; it is both full and faithful.

The proof of IH ≅ LinRel is quite a bit more involved than the proof BMat and H ≅ MatZ. We will eventually go through it in detail—apart from following our principle of not handwaving, the proof is quite informative. But for now, LinRel and its diagrammatic presentation IH are perfect playgrounds for linear algebra and that’s what we will concentrate on over the next few episodes.

In the last episode we discussed how (1, 1) diagrams allowed us to divide by zero. By looking at the corresponding situation in LinRel, we can shed some more light on this curiosity. So let’s take a look at linear relations  Q: since IHLinRel these are in 1-1 correspondence with (1, 1) diagrams.  We can plot linear relations of this type using conventional graphical notation, thinking of the domain as the horizontal x axis and the codomain as the vertical y axis.

Since every linear relation must contain the pair that consists of zero columns, every plot will contain the point (0,0). In fact, the singleton {(0,0)} is itself a linear relation: clearly it satisfies the two required properties. Here’s its plot.

The diagram for this linear relation is

which we dubbed ⊥ in the last episode. Next up, the entire set Q×Q is a linear relation. Here we colour the entire plane red. Sorry for the eye burn.

The corresponding diagram is

which we were calling  last time. These two linear relations give some justification for the names ⊥ (bottom) and ⊤ (top), since the former is—set theoretically speaking—the smallest linear relation of this type, and the latter is the largest.

The other linear relations are all lines through the origin. So, for example, the plot for the identity diagram

is the line with slope 1.

All the fractions arise like this. For example, the line for the fraction 1/2,  diagrammatically

is the line with slope 1/2

You get the idea for the other fractions.

There are two lines that represent interesting corner cases. The first is the line for 0

which is the line with zero slope, that is, the x-axis.

Finally, there’s the line with “infinite” slope, the y-axis.

which is the plot for

that we called  in the last episode.

If we think relationally, then the operations x+y, x×y and 1/x are all easy to define, and are total on the set of linear relations  Q. Multiplication is simply relational composition, and 1/x is the opposite relation x. Finally, addition can be defined set theoretically as

x + y = { (a, c) |  there exist (a, b1) in x and (a, b2) in y such that b1+b2 = c }.

The above definition simply mimics the sum operation, as we have been doing with diagrams. So we could reconstruct the addition and multiplication tables from the last episode by working directly with linear relations.

Identifying numbers with lines through the origin is an old idea. It is one way to understand projective arithmetic. What is not so well-known is that we can define the arithmetic operations directly on the underlying linear relations, and moreover, throwing ⊤ and ⊥ into the mix makes the operations defined everywhere, with no need for dividebyzerophobia. If you have seen this done somewhere else, please let me know!

Here’s one very cool fact about LinRel, and it concerns the mirror image symmetry . Remember that if R is a relation then R is the opposite relation: (ba) is in R exactly when (ab) is in R. In the last episode we saw that for any (1,1) diagram d we have

d;d;d = d.

It turns out that this works for arbitrary linear relations. Thus R is always the weak inverse of R. This is something that will be very useful for us in the future. Let’s prove it.

Suppose that D is some linear relation. Since D;D;D and D are both sets (of pairs), to show that they are equal it’s enough to show that each is a subset of the other. It is easy to show that every element of D must be an element of D;D;D, since, taking (a,b) in D, the chain of elements (a,b), (b,a), and (a,b) proves that (a,b) is in D;D;D.

We just need to show that the opposite inclusion also holds. So suppose that (a, d) is in D;D;D. By definition, there are bc such that (ab) is in D, (b, c) is in D and (cd) is in D. And since (bc) is in D, (cb) is in D.

Now, using the fact that D is a linear relation that contains (ab), (cb) and (cd), we have that

(a, b)-(c, b)+(c, d) = (ac+c, bb+d) = (a, d)

is in D.

The fact that D;D;D is not larger than D is one thing that’s special about linear relations: it is not true for ordinary relations! For example, think about what happens to this relation 2 ⇸ 2:

Today is Boxing Day, so this is a Christmas special episode of the blog. The next one will probably arrive in January, so let me take this opportunity to wish everyone a fantastic

Continue reading with Episode 28 – Subspaces, diagrammatically.

# 26. Keep Calm and Divide by Zero

The fear of dividing by zero is a common affliction. I caught it at school when trying to get my head around this well-known “proof” of 1 = 2:

1. suppose that a = b
2. then a2 = ab
3. subtracting b2 from both sides gives us a2 – b2 = ab – b2
4. factorising gives (a + b)(a-b) = b(a-b)
5. dividing both sides by a-b gives a+b = b
6. But also a+b = b+b = 2b, so 2b = b. Dividing by b gives 2 = 1.

The problem is in step 5, in which both sides of the equation are divided by zero: helpfully, this article with a similar wrong proof explains that you cannot ever do that.

Looking at the available evidence in popular culture, interest in dividebyzerophobia is prevalent: for example, this Numberphile video has over 2 million views. Even Siri has something amusing to say.

So I was understandably very nervous when I first realised that graphical linear algebra lets you divide by zero. I double checked everything: something must have gone seriously wrong.

Let me explain.

In the last episode we saw that ordinary fractions v/w (where w≠0) can be represented as the sugar:

We then discovered that the usual rules for doing arithmetic with fractions follow from the equations of our algebraic system:

The derivations made use of the fact that q and s are not zero.

So far so good, but there’s an important difference between fractions and our diagrams. Fractions are inherently asymmetric: the numerator is allowed to be zero, but the denominator is not. On the other hand—as we saw two episodes ago—the generators and equations of graphical linear algebra are symmetric.

So, since by definition we have

if we can draw the diagram for 0/1, then there’s nothing stopping us from drawing the diagram for 1/0:

But we are never allowed to do this, right?

Stay calm and take a deep breath: dividebyzerophobia is an irrational fear, caused by a deficiency of the fraction notation that we all grew up with. I no longer suffer from it and, with the right dosage of graphical linear algebra, neither will you.

As we have seen, all fractions are diagrams with one dangling wire on the left and one on the right. But—as we just discovered—not all diagrams of this type are legal fractions. In fact, there are precisely three that fail to be of the right form, and they all involve dividing by zero in some way. Let’s give them names:

It makes sense to choose ∞, the symbol for infinity, as the name for the 1/0 diagram: for example 1/x diverges to positive or negative infinity as x approaches zero. And positive and negative infinity are one and the same in projective arithmetic.

We will come back to projective arithmetic in more detail in the next episode. For now, let’s verify whether our diagrams indeed tell us that positive and negative infinities are equal.

The remaining two elements,  and ⊥, are two ways of resolving what we might mean by 0/0, which is ambiguous. This ambiguity does not exist when talking about ordinary fractions: if w is nonzero, the following two diagrams are equal, using the fact that nonzero sugars commute with sugars heading in the opposite direction. We showed that in the last episode.

So both of these diagrams represent the fraction v/w. But when v=w=0, the two diagrams are different: the first is , the second .

I’ve coloured ∞, ⊤ and ⊥ differently from our usual sugars, because, as we will see, they don’t behave quite as well as ordinary fractions. Nevertheless, we can try to do arithmetic with them as we did with fractions.

For example, what happens if we take some (legal) fraction p/q and add it to ?

We seem to get stuck at this point because we don’t have any equation that tells us what happens when we plug the (wrong way) zero generator into a copy. It turns out that:

Here’s a derivation:.

Using what we know about symmetry in graphical linear algebra, we can immediately state the bizarro version, which is

Continuing our calculation , we get

So, given any fraction p/q we have that p/q + ∞ = ∞. In fact, using (†) and (‡), we can also show that  ∞ + ∞ = ∞,  ⊤ + ∞ = ∞ and ⊥ + ∞ = ∞. So anything plus infinity is infinity.

In the fifth line of the above calculation we used a property we have not yet proved, that whenever w ≠ 0, the w can come out of a discard. This property is related to one from the last episode: nonzero sugars can go the wrong way over copy. Here’s a reminder of what this means, together with its bizarro version:

To keep a clear conscience, let’s prove the new property. Its bizarro version, which says that the zero generator can swallow nonzero sugars, will also come in handy in proofs.

Using the usual symmetry argument, it’s enough to prove one.

Also, since we didn’t do it earlier, this seems a good time to check whether 0/w is actually the same as 0. In the second step we use our new fact about the zero generator and nonzero integer sugars.

We may as well fill out the full addition table, which takes account of all possible ways that (1,1) diagrams can be added. In the table below I separated 0 from the nonzero fractions, where both the numerator and the denominator are not zero.

Remember that our diagrammatic addition is commutative, so we don’t need to fill the table below the diagonal.

 + 0 r/s ∞ ⊤ ⊥ 0 0 r/s ∞ ⊤ ⊥ p/q – sp+qr/qs ∞ ⊤ ⊥ ∞ – – ∞ ∞ ∞ ⊤ – – – ⊤ ∞ ⊥ – – – – ⊥

Perhaps the most mysterious equation in this table is ⊤ + ⊥ = ∞. It’s very simple to justify:

Multiplication is a bit more hairy, because although it is commutative on ordinary fractions—as we saw in the last episode—it is not always so when we consider the expanded system with the additional three elements. This means that now we need to fill in the entire table.

 × 0 r/s ∞ ⊤ ⊥ 0 0 0 ⊥ 0 ⊥ p/q 0 pr/qs ∞ ⊤ ⊥ ∞ ⊤ ∞ ∞ ⊤ ∞ ⊤ ⊤ ⊤ ∞ ⊤ ∞ ⊥ 0 ⊥ ⊥ 0 ⊥

There are some interesting symmetries in the table. And we can resolve the old philosophical conundrum about what happens when we multiply infinity by zero! In graphical linear algebra it turns out that 0 × ∞ = ⊥ and ∞ × 0 = ⊤.

What do we really mean by “dividing by zero”? As Siri says, it does not make much sense to divide cookies amongst zero friends; even when the Cookie Monster is happy, you are still sad.

Graphical linear algebra comes with the mirror image symmetry. We will use the dagger () superscript to mean mirror image. So if d is a diagram of type (m,n) then its mirror image d is a diagram of type (n,m). Clearly, no matter what the d, we always have d†† = d.

When d is a nonzero fraction sugar, then it turns out that d† is the multiplicative inverse of d: this means that d ; d† = d†; d = 1. It’s an easy derivation:

When d is 0 or  then this is no longer the case, as can be seen from the multiplication table. For example 0 ; 0†  =  0 ; ∞  =  ∞ × 0  =  ⊤ and 0† ; 0 = 0 × ∞ = ⊥.

Since does not always mean multiplicative inverse, what are some properties that it does satisfy for all (1,1) diagrams? Here is one: d and d are the weak inverse of each other.

d ; d ; d = d

d ; d ; d = d

I’ll leave this as a simple exercise. It will be great as the first step in your dividebyzerophobia treatment programme. For nonzero fractions, the equations follow from the fact that gives us the multiplicative inverse. To treat the other cases, you will only have to do four calculations, because  = ⊤ and  = ⊥.

The mirror image symmetry will be very useful when we talk about inverting matrices in graphical linear algebra. I’m thinking of bringing some of this material forward—maybe already to the next episode—because it is a nice example of how dividing by zero can actually be useful in calculations.

Although it was fun to fill in the addition and the multiplication tables, there are good reasons not to give  and  the same status as we give to the numbers (naturals, integers, fractions) that we have seen so far, and others (reals) that we will talk about in the future. This is why we have been colouring their sugars differently.

I don’t plan to get bogged down in a philosophical discussion about “what is a number?” (Episode 17 did not seem to go down too well) but there are a number of useful properties that “numbers” ought to satisfy in graphical linear algebra. For one, the bizarro of a number should be itself. This works for ∞, but ⊤ and ⊥ are the bizarro of each other.

It is also very useful to know that numbers, whatever they are, satisfy the following:

We know from earlier episodes that they hold for the natural numbers and for the integers. Given what we now know about nonzero sugars going “the wrong way” over copy and addition, and into discard and zero, it is follows that they hold for fractions as well. So, the a in the above equations can be any legal fraction.

These properties allow us, amongst other things, to prove that multiplication distributes over addition. But all three of  and ⊥ fail some of these four tests. For example, we have:

And these two diagrams cannot be shown equal in our theory. For now, you could think about the underlying relations, but we will get more precise about characterising diagram equality in the next episode.

So—long story short—while we will not think of them as numbers per se, we will not panic when we see and  popping up in calculations. They are all perfectly harmless.

In algebra, the notion of a set of elements with addition and multiplication, where multiplication is commutative and distributes over addition is called a commutative ring. If all nonzero elements also have a multiplicative inverse, the resulting structure is called a field. The rational numbers (i.e. fractions) and the real numbers are two examples.

Fields are a little bit peculiar from an algebraic perspective, because the operation of taking multiplicative inverse is partial0-1 is left undefined. Having partially defined operations complicates things.

One classic way of addressing this defect is projective arithmetic, which we will see in more detail in the next episode. There 1/0 makes sense, but 0/0 is still typically left undefined. Recently, inspired by projective arithmetic, Youtube legend Norman Wildberger proposed to hack fraction notation so that 0 is allowed in the denominator: you can check out some videos starting from this one. I have not yet had the time to understand the details, but it seems that you have two new elements that correspond to 1/0 and 0/0; the latter seems to be our  but there is no .

There are some other algebraic systems that go the whole hog and make all operations defined everywhere. Two that I’ve come across are wheels and meadows. If there is interest, I can write more about these; let me know in the comments.

There is one fundamental difference between the above approaches and arithmetic in graphical linear algebra. There, existing notation (e.g. addition, multiplication, fraction notation) and associated equational theories were extended in some way to make sense of dividing by zero. In graphical linear algebra, addition is not even a primitive operation, or at least it’s not primitive in the same sense. So instead of setting out to make sense of 1/0, we discovered that it was already happening, in a similar way to how we discovered the natural numbers lurking in the theory B of bimonoids.

I was in Edinburgh for a couple of days last week. I left Southampton on a cold, dark and wet morning and arrived in a Scotland bathed in glorious sunshine.

I was there as the external examiner for the PhD defence of Ricardo Honorato-Zimmer, a student of Vincent Danos. It was great to have the chance to catch up with Vincent. He is one of the few scientists I know who is a true polymath—his background is the French school of logic, and his early contributions to the proof theory of linear logic are pretty famous. Around ten years ago, Vincent became interested in stochastic formalisms and graph rewriting, and together with a number of people he developed the Kappa language for modelling protein interaction networks. Vincent’s group is now full of chemists, biologists, mathematical physicists, mathematicians and computer scientists who all manage to communicate (this is far from easy in my experience!) and moreover come up with real advances. Academic politicians like to throw around the term “multi-disciplinary research”, but often the reality behind the buzzword is somewhat underwhelming. It’s always amazing to see the real thing in action.

The day after the defence I gave a short talk about graphical linear algebra at the LFCS seminar. Afterwards I met Kevin Dunne, who is doing a PhD with Ross Duncan at the University of Strathclyde in Glasgow. Kevin told me his idea about how to develop the theory of eigenvalues and eigenvectors with graphical linear algebra: it sounded fantastic and I’m really looking forward to figuring out the details. I also met Nick Behr, who is a mathematical physicist working with Vincent. He told me about the Umbral calculus, used to reason about formal power series, and we tried to figure out the relationship with some related techniques that Filippo, Fabio and I developed using graphical linear algebra. He will visit Southampton for a few days in January; it’s going to be exciting!

I had to come back to Southampton not long after my talk. This time, I experienced some more common Edinburgh weather: I walked from the computer science department to Waverley station in rain and gale force winds. The locals seemed undisturbed.

Continue reading with Episode 27 – Linear Relations.

# 25. Fractions, diagrammatically

I’ve been looking forward to this moment for some time. Now that we’ve gone through all of the equations of graphical linear algebra, it’s time to have some fun. In this episode we look at how fractions are represented in the graphical notation, derive the usual rules of fraction arithmetic, and go through a very early example of an algorithm that produces fractions: Fibonacci’s procedure for division by composite numbers.

We will concentrate on diagrams with two dangling wires: one on the left and one on the right. In previous episodes we have seen that these diagrams have a tight relationship with various kinds of numbers. For example, in the theory of bimonoids, a sub-theory of graphical linear algebra, they were in 1-1 correspondence with the natural numbers. When we added the antipode, extending the theory, we got the integers.  Since then, we’ve added the mirror image generators and figured out the equations that relate them. So what additional “numbers” do we get in the theory IH of interacting Hopf monoids that we finished assembling in the last episode?

We get fractions, the stars of this episode. Let’s start by proving a useful little lemma. Suppose that w ≠ 0. Then the w sugar can jump over any sugar that’s pointing in the opposite direction, like so.

Here’s the proof of (†):

In the first and final steps we used the assumption w ≠ 0, since the scalar equations that we introduced in the last episode need this condition. The steps in the middle use the fact that multiplication of integers is the same as composition; we went through this in Episode 18.

We immediately get the mirror version of (†), where the nonzero w scalar is now pointing to the right:

No proof needed, because of the mirror symmetry in the equational theory that we discussed in the last episode.

Any fraction can be written vw where v and w are integers and w ≠ 0. So let’s extend the syntactic sugar for integers, and define a sugar for fractions as follows:

At this point we have to be a little bit careful, because a fraction is not merely a pair of integers, one written on top of each other. Fractions represent ratios; for example 12 = 24 since the ratio of one-part-in-two is the same as two-parts-in-four. But (1,2) and (2,4) are different pairs of numbers, and so the usual, formal textbook definition says that a fraction v/w is the equivalence class [(v,w)] where w ≠ 0. The underlying equivalence relation relates (p,q) and (r,s) when ps = qr, since this condition captures exactly what it means for p/q and r/s to represent the same ratio. For (1,2) and (2,4) we have 1 ·4 = 2 ·2, and so 1/2 = 2/4.  So, for our definition of fraction sugar to even make sense—for it to be well-defined—we ought to show that

Let’s do this now. In addition to ps=qr, we can also assume q,s ≠ 0. All three of these assumptions are used in the following proof.

Our new sugar wouldn’t be much good if we were able to show that two non-equal fractions are the same, using diagrammatic reasoning, i.e. if the translation was not faithful. For this reason, it’s useful to show that the implication that we just proved also goes the other way:

Here’s the proof.

So we have shown that any fraction can be considered, in a faithful way, as a diagram of graphical linear algebra with one dangling wire on the left and one on the right.

In fact, every diagram of this type—i.e. with one wire dangling from both ends— can be put, using diagrammatic reasoning, either in the form

where u,v are integers or in the form

This follows from some more advanced insights about our equational system, so we will not prove it immediately. You will have to trust me for now.

Next up is the algebra of fractions. First a quick reminder of how we used diagrams to talk about the algebra of the natural numbers and the integers. In both cases, addition could be expressed in the graphical language as follows:

Of course, this agrees with what we already knew about ordinary addition of integers. One advantage of defining addition in this way is that several of its properties immediately follow by diagrammatic reasoning: e.g. associativity, commutativity and the fact that 0 is the additive identity.

Similarly, multiplication can be defined as composition:

Again, associativity is immediate, but commutativity does not follow for totally generic reasons. One can prove it for the natural numbers by induction, then use this to get commutativity of multiplication for the integers, since the antipode commutes with any natural number sugar.

Finally, for any integer v, we know that its sugar can slide past copying and adding in the following way.

We proved this for natural numbers by induction in Episode 9, and for integers it follows from the fact that the antipode also slides in this way. These facts can be used to show that multiplication distributes over addition.

Before we delve into the algebra of fractions, we need one more lemma. It says that nonzero scalars can go the “wrong way” past addition and copying.

We only need to prove one of the two equations of (‡); the second is bizarro of the first: remember that bizarro is combining the mirror-image symmetry with the invert-colours symmetry and, as we saw in the last episode, both symmetries are present in our equational system.

So let’s assume that w ≠ 0 and prove that w can go the wrong way past copying:

In the proof used the scalar equations, which require the assumption w ≠ 0. In the second step we used our knowledge about integers going the “right  way” over copy.

So what happens when w = 0? Let’s take a look at the diagrams. First, if we have 0 going the wrong way to the left of copy we have

and two zeros on the right gives

The two results are different: they cannot be made equal using diagrammatic reasoning in graphical linear algebra. We will make the reasons more precise in the next few episodes; for now, we could go back to our relational intuition and figure out that the two diagrams

define two different relations. Zero can’t go over addition the wrong way for similar reasons.

Let’s mimic the algebra of the integers and see what happens when we define:

and

First, multiplication. Using the definition of our fraction syntactic sugar, the right hand side of the above definition is the following diagram

and we can use diagrammatic reasoning to put it into the form (★):

So  r/s · p/q = rp/sq, as we all learned in school. Since we know that multiplication of integers is commutative, this also gives us the commutativity of multiplication for fractions.

Next, let’s do the same thing for addition, transforming the diagram for p/q + r/s into the form (★).

Again, transforming the diagram into (★) using diagrammatic reasoning gives us the formula we all know: p/q + r/s = (sp + qr)/qs . In the fourth step of the proof, we were able to move the qs the wrong way past addition (‡) because we know that both q and s are not zero, so qs is also not zero.

So just by using diagrammatic reasoning, we “discovered” the rules for fraction arithmetic. This is different from the usual treatment: if one defines a fraction to be an equivalence class of pairs (p,q), then multiplication and addition are defined by the formulas we derived. It feels better to derive than to define.

There are a few more things left to do: e.g. show distributivity of multiplication and the fact that we have additive inverses. We will finish the job in the next episode.

This weekend I finally got around to reading Liber Abaci by Leonardo of Pisa, aka Fibonacci, translated into English by Laurence Sigler. I’m six chapters in and I’ve really enjoyed it so far. I already knew roughly what was in it, but that’s a bit like saying that I knew that Romeo and Juliet contained a teen suicide before I had to read it at school. A lot of fun, of course, comes from the actual reading, from getting into Fibonacci’s head. It’s a real shame that classic works of mathematics—and you don’t get much more classic than Liber Abaci—don’t get anywhere near as much attention in our culture as classic works of literature or philosophy. Fibonacci’s raw excitement about calculating with the new Indo-Arabic positional notation does not come across anywhere near as vividly in more modern, prescriptive texts.

As I keep reading, I will pull out the bits that I find the most interesting and talk about them on the blog. We start today with fractions, which get a lot of attention in Liber Abaci. Fibonacci used a different notation to the one we all know. Actually, there are several related notations, but the most common one is the following: what Fibonacci would write

$\frac{1\ 5\ \ 7}{5\ 6\ 10}3$

we would recognise as 3 + 7/10 + 5/6·1/10 + 1/5·1/6·1/10).  The notation is thus read from right to left. It’s a bit like a decimal expansion, in the sense that reading more gives you more information about the number. First, we know the number is larger than 3 but smaller than 4. Then, after reading the next bit of the fraction, we know that it’s a bit more than 7/10ths of the way from 3 to 4, but not quite 8/10ths. Of that final 1/10th, we know it’s a little bit more than 5/6ths. And so forth.

After the quick calculation, this is 284/75, or 359/75, using the evil mixed fraction notation. Fibonacci’s fractions follow the convention that the denominators never get smaller as one reads the fractions left to right: e.g. in the example above 5≤6≤10. Also, by using 10 in the denominators, one can write decimal expansions. For example, 3.14159 can be written $\frac{9\ \ 5\ \ 1\ \ 4\ \ 1}{10\ 10\ 10\ 10\ 10}3$.

Fibonacci’s fraction notation translates neatly into the graphical notation. For instance $\frac{1\ 5\ \ 7}{5\ 6\ 10}3$, our example above, is:

It’s easy to pass between the two notations: in the diagram, the numerators go from left to right, following successive copying. The denominators are interleaved with adding on the right.

The reason for Fibonacci’s choice of fraction notation could be that it is the output format of his algorithm—in Chapter 5—for performing division by composite (i.e. non prime) numbers. To prepare, Fibonacci first describes how to divide by prime numbers, and we can recognise this procedure as being equivalent to ordinary long division. Of course, long division works for composite numbers as well, but Fibonacci’s division algorithm means that if the composite has small factors then we can make the procedure simpler: it is usually easier to do long division when you divide by small numbers.

Fibonacci’s writing style is very entertaining: he describes a calculation technique and then solves a number of successively larger problems in full detail; some so large that it’s difficult to think of practical applications, especially since we are talking about medieval Italy. For example, in Chapter 2, he multiplies 12345678 by 87654321 to get 1082152022374638. It’s as if he’s saying: “Ok, so the abacists can probably also do the last example, but look, here’s a much bigger one. Good luck building an abacus big enough!”.

We will go through one of Fibonacci’s small examples from Chapter 5: dividing 749 by 75, illustrating his procedure for dividing by composite numbers with the graphical notation.

First we factorise 75 into primes—actually Fibonacci is not a stickler about this point and factorises into whatever is the most convenient to divide by—and write them in an increasing order.

The algorithm works by processing each of the 3, 5, 5 in turn, building up the Fibonacci fraction in the process. We start with the and divide 749 by it, obtaining 249 and 2/3.

At this point we already have a chunk of the Fibonacci fraction: the 2/3. We can pull it aside for later

and we are left with:

Next we perform the following calculation:

First the 5 goes over addition backwards (‡). Then we divide 249 by 5, obtaining 49 and 4/5. Next we use associativity of addition, and pull the 5 back over. At this point we have another chunk of a Fibonacci fraction ready to go, so we can pull it out:

and we are left with

Now we perform the same steps as in the previous calculation: this is the “loop” in the algorithm. Each time the body of the loop is performed, we get rid of the next number in the decomposition of the denominator. It’s time to process the final 5.

At this point we do not have any more components of the denominator to consume, so the algorithm terminates. And composing ①, ② and ③ gives us the Fibonacci fraction we were looking for:

that the great man himself would write $\frac{2\ 4\ 4}{3\ 5\ 5}9$, the result of dividing 749 by 75.

Fibonacci’s division algorithm is not so useful for us today, since we have computers to do the division for us, and there’s no pressing need to keep the numbers that we are dividing by small. But it is very interesting as an early example of an algorithm tout court; here Fibonacci was probably influenced by al-Khwarizmi. The presentation, using graphical notation, also gave me an opportunity to show off the compositional nature of the graphical language; we could perform the computation in steps, chopping off parts of the diagram that we were finished with, and gluing everything back together at the end. Compositionality is the strong point of the graphical notation, as we will discover time and again.

Continue reading with Episode 26 – Keep Calm and Divide by Zero.