Determinants and the Lindström-Gessel-Vienot Lemma

Solomon is currently a student at the University of Pennsylvania where he is pursuing a Ph.D in Computer Science. He has a keen interest in all things Computer Science and Maths, so he decided to specialize in Programming Languages, an area which applies ideas from diverse areas of Computer Science, Mathematics and Philosophy. One of his favourite things is when connections are made between seemingly disparate areas of inquiry, so he has enjoyed reading the Graphical Linear Algebra blog for the connections between string diagrams and linear algebra. As the Lindström-Gessel-Vienot Lemma shows us, there is also a surprising combinatorial connection. Such occurrences make Solomon very happy!


Consider the familiar diagram for the 2 \times 2 matrix \begin{pmatrix} a & b \\ c & d \end{pmatrix}:

matrix

Suppose that we wanted to push some flow from the input nodes X_1 and X_2 to the output nodes Y_1 and Y_2.  Moreover, we want to do this in such a way that every output node receives some flow without any congestion in the network. More specifically, want to push flow from the input nodes to the output nodes in such a way that:

  1. no two different paths that are part of the flow intersect at
    a vertex (not even endpoints), and
  2. for each output node Y_j, there is a path from some input
    node X_i to Y_j.

For example, in the diagram above, we can push a units of flow from X_1 to Y_1 and d units of flow from X_2 to Y_2.

flowex1

We can also push c units of flow from X_1 to Y_2 and b units of flow from X_2 to Y_1.

flowex2

We are not allowed to push a units of flow from X_1 to Y_1 and b units of flow from X_2 to Y_1, since then we would have two different paths intersecting at Y_1.

flowcex1

Also, we are not allowed to push only a units of flow from X_1 to Y_1, since then there is no path from some X_i to Y_2.

flowcex2.gif

We call a flow that satisfies conditions (1) and (2) a viable flow. Now we will measure the signed weight a viable flow. The weight of a viable flow is simply the product of all weights on the edges that are part of the flow. To get the signed weight, we multiply the result by -1 if the flow links X_1 to Y_2. Otherwise, we multiply the result by +1.
For example, in our 2 \times 2 matrix example, the first viable flow has weight +ad. The unsigned weight of this flow is ad since we push a units of flow from X_1 to Y_1 and d units of flow from X_2 to Y_2. To get the signed weight of this flow, we multiply the unsigned weight ad by +1 since X_1 is linked to Y_1. A similar computation shows that  the signed weight of the second viable flow is -bc.
Interestingly, when we add signed weights of all the viable flows in the diagram, we get the determinant of the matrix associated with the diagram i.e. ad-bc. Coincidence?


The Lindström-Gessel-Vienot Lemma (LGVL) tells us that this in fact is not a
coincidence. Specifically, LGVL associates to each finite directed acyclic
graph (dag) G a matrix whose determinant is exactly equal to the
sum of the signed weights of the viable flows in G.  To make this relationship precise we shall give the statement of LGVL, but first we need to introduce some definitions and notation that we will need in order to give the statement of the theorem.

First, we introduce permutations. For any sets A=\{a_1, \ldots, a_n\} and B=\{b_1, \ldots, b_n\} we call any bijection \sigma : A \longrightarrow B a permutation from A to B. Now assume that \sigma sends a_1, \ldots, a_n to \sigma(a_1), \ldots, \sigma(a_n) respectively. For example, let A = B = \{1,2,3,4,5\}, and let \sigma send 1,2,3,4,5 to 3,5,2,1,4 respectively. Write a_1, \ldots, a_n in a row in that order, skip a line, then write b_i below a_i for each b_i. Draw a line from a_i to \sigma(a_i), and count the number m of crossings in between the {a_i}'s and the {b_i}'s. Define the sign of \sigma be (-1)^m. For the permutation \sigma we have just described, there are six crossings, so the sign of the permutation is (-1)^6 = 1.

permex.gif

The sign of a permutation \sigma can also be computed by raising -1 to the
number inv(\sigma) of inversions of \sigma, where inv(\sigma) is the number of ordered pairs (i, j) such that i < j but \sigma(i) > \sigma(j). In the example above, the number of inversions of \sigma is 6 = |\{(1,3), (1,4), (2,3), (2,4), (2,5), (3,4)\}|, which gives the same result as before.

For the next few pieces of notation as well as the statement of LGVL, assume that G is a finite directed acyclic graph. Assume further that G has input nodes X_1, \ldots, X_n and output nodes Y_1, \ldots Y_n.

setup

As before, we are interested in viable flows, i.e. flows that have the property that:

  1. no two different paths that are part of the flow intersect at
    a vertex (not even endpoints), and
  2. for each output node Y _j, there is a path from some input
    node X_i to Y_j.

Each viable flow can described by an n-tuple P = (P_1, \ldots, P_n) of paths , where P_i is a path from the input node X_i to the output node Y_{\sigma(i)} for some permutation \sigma. Let \sigma(P) be this permutation.

For each directed path P between any two vertices in G, let \omega (P) be the product of the weights of the edges of the path. For any two vertices X_i and Y_j, write e(X_i, Y_j) for the sum

e(X_i,Y_j)=\sum\limits _{P:X_i\to Y_j} \omega (P),

where P ranges over all paths from X_i to Y_j.
We are finally in a position to give the statement of LGVL, which is that

 


\left \vert {\begin{pmatrix}e(X_{1},Y_{1})&e(X_{1},Y_{2})&\cdots &e(X_{1},Y_{n})\\e(X_{2},Y_{1})&e(X_{2},Y_{2})&\cdots &e(X_{2},Y_{n})\\\vdots &\vdots &\ddots &\vdots \\e(X_{n},Y_{1})&e(X_{n},Y_{2})&\cdots &e(X_{n},Y_{n})\end{pmatrix}} \right \vert = \sum\limits _{{P = (P_{1},\ldots ,P_{n})}}{\mathrm {sign}}(\sigma (P))\prod\limits _{{i=1}}^{n}\omega (P_{i}),

where P ranges over all viable flows in G!

LGVL holds for any finite directed acyclic graph, not just graphs that look like the 2 \times 2 example. For example consider the following graph:

secondexgraph

or equivalently, the following string diagram:

secondex
The matrix for this is

\begin{pmatrix} ag+bh & bi & 0\\ ch & ci + dj & dk\\ 0 & ej & ek + fl \end{pmatrix}.

It has determinant agciek + agcifl + agdjfe + bhdjfe, and it is easy to check that each of these terms indeed gives a viable flow from the {s_i}'s to the {t_i}'s. Note also that the sign of each term is +1 since s_i is sent to t_i in each of the viable flows.

Now on the one hand, LGVL gives us a way of computing the sum of the signed weights of viable flows in a dag as the determinant of a certain matrix. On the other hand, given a matrix M, we may compute the determinant of M by doing the following. First, we find a special dag G. Next, we find a set of input vertices \{X_i\} and output vertices \{Y_j\} in G. We check to see that M_{ij} = e(X_i, Y_j). Then, we (easily!) compute the sum of the signed weights of viable flows in G from \{X_i\} to \{Y_j\}. Finally we conclude that the result is equal to the determinant of M by LGVL. This compositional approach to computing the determinant often leads to useful and surprising results, as we shall see below.


Consider the determinant of the following matrix:

A = \begin{pmatrix} (1,1) & (1,2) & ... & (1,n) \\ (2,1) & (2,2) & ... & (2,n) \\ \vdots & \vdots & \ddots & \vdots \\(n-1,1) & (n-1,2) & ... & (n-1,n)\\(n,1) & (n,2) & ... & (n,n)\end{pmatrix},

where (k, \ell) denotes the greatest common divisor of k and \ell.

It is difficult to come up with a closed formula for this determinant by using say row-reductions or the co-factor expansion. However, using the strategy involving LGVL gives a simple and surprising result.

But first, a few more definitions! Given any positive integer k, let \phi(k) denote the number of positive integers less than or equal to k which are relatively prime to k. For example, \phi(6) = 2 since the only numbers less than 6 that are relatively prime to 6 are 1 and 5. On the other hand \phi(7)=6 since all the positive integers less than 7 are relatively prime to 1. In fact, it is easy to see that \phi(p)=p-1 for any prime p.
Also, for the purposes of the proof, we shall need to use the fact that

\sum\limits_{d|k} \phi(d) = k,

which is a property first established by Gauss. For example

6 = \phi(1) +\phi(2) + \phi(3) + \phi(6) = 1 + 1 + 2 + 2.

Now we proceed to the actual computation. Consider the graph G which has 3n vertices: \{s_1, \ldots, s_n\} are the input vertices, \{t_1, \ldots, t_n\} are the output vertices and \{v_1, \ldots, v_n\} are intermediate vertices. If d divides k, put the edge s_k \to v_d with weight \phi(d) and the edge v_d \to t_k with weight 1.

smith

What is the weight e(s_k, t_{\ell}) of all the paths from s_k to t_{\ell}? Well each path from s_k to t_{\ell} can have only two edges: an edge s_k \rightarrow v_d of weight \phi(d) and an edge v_d \rightarrow t_{\ell} of weight 1, for some v_d, where d divides k and d divides \ell. Of course the weight of this path is \phi(d) multiplied by 1, which is just \phi(d). Therefore,

e(s_k, t_{\ell}) = \sum\limits_{\substack{d|k \\ d|\ell}}\phi(d) = \sum\limits_{d|(k, \ell)}\phi(d) = (k, \ell)

The last equality is justified by Gauss’ result.

By the LGVL, it follows that 

\det(A) = \sum\limits_{P = (P_1,\ldots,P_n)} \mathrm{sign}(\sigma(P)) \prod\limits_{i=1}^n \omega(P_i),

where P = (P_1, \ldots P_n) ranges over all viable flows from the input vertices S = \{s_1, \ldots, s_n\} to the output vertices T = \{t_1, \ldots, t_n\}.
Now the only path from s_1 to T is s_1 \to v_1 \to t_1 since 1 has only one divisor, namely 1. As a result, for each prime 1 \leq p \leq n, any path from s_p to T in a viable flow must pass through v_p. This is because p has only two divisors, namely 1 and p, but the path through 1 has already been blocked by 1.

Consequently, for any product pq of two primes, any path from s_{pq} to T must pass through v_{pq}. Repeating this argument for all the other numbers less than or equal to n, we conclude that all the viable flows from S to T link s_k to v_k.

Now if there is a viable flow which links v_k to t_{\ell} for some k \neq \ell, then there must be some k', \ell' such that v_{k'} is linked to t_{\ell}' with k' > \ell' (why?). But this means that k' divides \ell', a contradiction since k' > \ell! So we must have v_k = t_k, for all k.

Thus, there is a unique viable flow from S to T i.e. the flow for which s_k is sent to t_k via v_k. Thus, we conclude that the determinant of A is \phi(1) \cdot \ldots \cdot \phi(n), since the weight of each path P_k in this flow is \phi(k) as we showed earlier:

\left \vert \begin{pmatrix}(1,1) & (1,2) & ... & (1,n) \\ (2,1) & (2,2) & ... & (2,n) \\ \vdots & \vdots & \ddots & \vdots \\(n-1,1) & (n-1,2) & ... & (n-1,n)\\(n,1) & (n,2) & ... & (n,n)\end{pmatrix} \right \vert = \phi(1) \cdot \ldots \cdot \phi(n)

5 thoughts on “Determinants and the Lindström-Gessel-Vienot Lemma

  1. Sweet. Thank you for sharing this!
    I noticed two minor typos (omitted words) and had a few questions.
    “signed weight a viable flow” –> “signed weight of a viable flow”
    “Each viable flow can described by” –> “Each viable flow be can described by”
    I think the variable P is overloaded between this and the following paragraphs (as both an arbitrary viable flow and an arbitrary path between two given endpoints). Am I missing something, or are these actually distinct objects?
    You pass lightly over a potentially interesting point: “…given a matrix M, we may compute the determinant of M by doing the following. First, we find a special dag G…” Is this inverse operation always possible?

    Like

  2. ”It has determinant agciek + agcifl + agdjfe + bhdjfe, and it is easy to check that each of these terms indeed gives a viable flow from the {s_i}’s to the {t_i}’s.“
    should be
    ”It has determinant agciek + agcifl + agdjfl + bhdjfl, and it is easy to check that each of these terms indeed gives a viable flow from the {s_i}’s to the {t_i}’s.“

    Like

  3. Hi Pawel. Thanks for this beautiful blog!! I have learned and enjoyed a lot.
    I am a newbie on this matter. I hope someday you can explain us, in your diagrammatic picture, the characteristic polynomial and trace invariants for the Hermitian case. Are your diagrams connected with the so-called trace diagrams?: https://mathoverflow.net/questions/6139/how-can-i-learn-about-doing-linear-algebra-with-trace-diagrams

    Thank you in advance !!

    Like

Leave a comment