### computation

Whatever ultimately becomes of some aspects of the Standard Model – the Higgs boson, for example – here is a report (based on an experiment described here) that some of the fundamentals hold up well to experimental test. Specifically, the Spin-Statistics Theorem – the relationship between quantum numbers of elementary particles and the representation theory of the Poincare group. It would have been very surprising if things had been otherwise, but as usual, the more you rely on an idea, the more important it is to be sure it fits the facts. The association between physics and representation theory is one of those things.

So the fact that it all seems to work correctly is a bit of a relief for me. See below.

Since the paperwork is now well on its way, I may as well now mention here that I’ve taken a job as a postdoctoral researcher at CAMGSD, a centre at IST in Lisbon, starting in September. In a week or so I will be heading off to visit there – there are quite a few people there doing things I find quite interesting, so it should be an interesting trip. After that, I’ll be heading down to the south of the country for the Oporto meeting on Geometry, Topology and Physics, which is held this year in Faro. This year the subject is “categorification”, so my talk will be mainly about my paper on ETQFT. There are a bunch of interesting speakers – two I happen to know personally are Aaron Lauda and Joel Kamnitzer, but many others look quite promising.

In particular, one of the main invited speakers is Mikhail Khovanov, whose name is famously (for some values of “famous”) attached to Khovanov Homology, which is a categorification of the Jones Polynomial. Instead of a polynomial, it associates a graded complex of vector spaces to a knot. (Dror Bar-Natan wrote an intro, with many pictures and computations). Khovanov’s more recent work, with Aaron Lauda, has been on categorifying quantum groups (starting with this).

Now, as for me, since my talk in Faro will only be about 20 minutes, I’m glad of the opportunity to give some more background during the visit at IST. In particular, a bunch of the background to the ETQFT paper really depends on this paper on 2-linearization. I’ve given some previous talks on the subject, but this time I’m going to try to get a little further into how this fits into a more general picture. To repeat a bit of what’s in this post, 2-linearization describes a (weak) 2-functor:

$\Lambda : Span(Gpd) \rightarrow 2Vect$

where $Span(Gpd)$ has groupoids as its objects, spans of groupoid homomorphisms as its arrows, and spans-of-span-maps as 2-morphisms. $2Vect$ is the 2-category of 2-vector spaces, which I’ve explained before. This 2-functor is supposed to be a sort of “linearization”, which is a very simple functor

$L : Span(FinSet) \rightarrow Vect$

It takes a set $X$ to the free vector space $L(X) = \mathbb{C}^X$, and a span $X \stackrel{s}{\leftarrow} S \stackrel{t}{\rightarrow} Y$ to a linear map $L(S) : L(X) \rightarrow L(Y)$. This can be described in two stages, starting with a vector in $L(S)$, namely, a function $\psi : X \rightarrow \mathbb{C}$. The two stages are:

• First, “pull” $\psi$ up along $s$ to $\mathbb{C}^S$ (note: I’m conflating the set $S$ with the span $(S,s,t)$), to get the function $s^*\psi = \psi \circ s : S \rightarrow \mathbb{C}$.
• Then “push” this along $t$ to get $t_*(s^*\psi)$. The “push” operation $f_*$ along any map $f : X \rightarrow Y$ is determined by the fact that it takes the basis vector $\delta_x \in \mathbb{C}^X$ to the basis vector $\delta_{f(x)} \in \mathbb{C}^Y$ (these are the delta functions which are 1 on the given element and 0 elsewhere)

It’s helpful to note that, for a given map $f : X \rightarrow Y$, are linear adjoints (using the standard inner product where the delta functions are orthonormal). Combining them together – it’s easy to see – gives a linear map which can be described in the basis of delta functions by a matrix. The $(x,y)$-entry of the matrix counts the elements of $S$ which map to $(x,y)$ under $(s,t) : S \rightarrow X \times Y$. We interpret this by saying the matrix “counts histories” connecting $x$ to $y$.

In groupoidification, a-la Baez and Dolan (see the various references beyond the link), one replaces $FinSet$ with $FinGpd$, the 2-category of (essentially) finite groupoids, but we still have a functor into $Vect$. In fact, into $FinHilb$: the vector space $D(G)$ is the free one on isomorphism classes in $G$, but the linear maps (and the inner product) are tweaked using the groupoid cardinality, which can be any positive rational number. Then we say the matrix does a “sum over histories” of certain weights. In this paper, I extend this to “$U(1)$-groupoids”, which are labelled by phases – which represent the exponentiated action in quantum mechanics – and end up with complex matrices. So far so good.

The 2-linearization process is really “just” a categorification of what happens for sets, where we treat “groupoid” as the right categorification of “set”, and “Kapranov-Voevodsky 2-vector space” as the right categorification of “vector space”. (To treat “category” as the right categorification of “set”, one would have to use Elgueta’s “generalized 2-vector space“, which is probably morally the right thing to do, but here I won’t.) To a groupoid $X$, we assign the category of functors into $Vect$ – that is, $Rep(X)$ (in smooth cases, we might want to restrict what kind of representations we mean – see below).

To pull such a functor along a groupoid homomorphism $f : X \rightarrow Y$ is again done by precomposition: $f^*F = F \circ f$. The push map in 2-linearization is the Kan extension of the functor $\Psi$ along $f$. This is the universal way to push a functor forward, and is the (categorical!) adjoint to the pull map. (Kan extensions are supposed to come equipped with some natural transformations: these are the ones associated to the adjunction). Then composing “pull” and “push”, one categorifies “sum over histories”.

So here’s one thing this process is related to: in the case where our groupoids have just one object (i.e. are groups), and the homomorphism $f : X \rightarrow Y$ is an inclusion (conventionally written $H < G$), this goes by a familiar name in representation theory: restriction and induction. So, given a representation $\rho$ of $G$ (that is, a functor from $Y$ into $Vect$), there is an induced representation $res_H^G \rho = f^*\rho$, which is just the same representation space, acted on only by elements of $H$ (that is, $X$). This is the easy one. The harder one is the induced representation of $G$ from a representation $\tau$ of $H$ (i.e. $\tau : X \rightarrow Vect$, which is to say $ind^G_H \tau = f_* \tau : Y \rightarrow Vect$. The fact that these operations are adjoints goes in representation theory by the name “Frobenius reciprocity”.

These two operations were studied by George Mackey (in particular, though I’ve been implicitly talking about discrete groups, Mackey’s better known for looking at the case of unitary representations of compact Lie groups). The notion of a Mackey functor is supposed to abstract the formal properties of these operations. (A Mackey functor is really a pair of functors, one covariant and one contravariant – giving restriction and “transfer”/induction maps for – which have formal properties similar to the functor from groups into their representation rings – which it’s helpful to think of as the categories of representations, decategorificatied. In nice cases, a Mackey functor from a category $C$ is the same as a functor out of $Span(C)$).

Anyway, by way of returning to groupoids: the induced representation for groups is found by $\mathbb{C}[G] \otimes_{\mathbb{C}[H]} V$, where $V$ is the representation space of $\tau$. (For compact Lie groups, replace the group algebra $\mathbb{C}[G]$ with $L^2(G)$, and likewise for $H$). A similar formula shows up in the groupoid case, but with a contribution from each object (see the paper on 2-linearization for more details). This is also the formula for the Kan extension.

“Now wait a minute”, the categorically aware may ask, “do you mean the left Kan extension, or the right Kan extension?” That’s a good question! For one thing, they have different formulas: one involving limits, and the other involving colimits. Instead of answering it, I’ll talk about something not entirely unrelated – and a little more context for 2-linearization.

The setup here is actually a rather special case of Grothendieck’s six-operation framework, in the algebro-geometric context, for sheaves on (algebraic) spaces (there’s an overview in this talk by Joseph Lipman, the best I’ve been able to find online). Now, , these operations as extended to derived categories of sheaves (see this intro by R.P. Thomas). The derived category $D(X)$ is described concretely in terms of chain complexes of sheaves in $Sh(X)$, taken “up to homotopy” – it is a sort of categorification of cohomology. But of course, this contains $Sh(X)$ as trivial complexes (i.e. concentrated at level zero). The fact that our sheaves come from functors into $Vect$, which form a 2-vector space, so that functors between these are exact, means that there’s no nontrivial homology – so in our special case, the machinery of derived categories is more than we need.

This framework has been extended to groupoids – so the sheaves are on the space of objects, and are equivariant – as described in a paper by Moerdijk called “Etale Groupoids, Derived Categories, and Operations” (the situation of sheaves that are equivariant under a group action is described in more detail by Bernstein and Lunts in the Springer lecture notes “Equivariant Sheaves and Functors”). Sheaves on groupoids are essentially just equivariant sheaves on the space of objects. Now, given a morphism $f : X \ra Y$, there are four induced operations:

• $f^* , f^! : D(Y) \rightarrow D(X)$
• $f_*, f^! : D(X) \rightarrow D(Y)$ (in general right adjoint to $f^*$ and $f^!$)

(The other operations of the “six” are $hom$ and $\otimes$). The basic point here is that we can “pull” and “push” sheaves along the map $f$ in various ways. For our purposes, it’s enough to consider $f^*$ and $f_*$. The sheaves we want come from functors into $Vect$ (we actually have a vector space at each point in the space of objects). These are equivariant “bundles”, albeit not necessarily locally trivial. The fact that we can think of these as sheaves – of sections – tends to stay in the background most of the time, but in particular, being functors automatically makes the resulting sheaves equivariant. In the discrete case, we can just think of these as sheaves of vector spaces: just take $F(U)$ to be the direct sum of all the vector spaces at each object in any subset $U$ – all subsets are open in the discrete topology… For the smooth situation, it’s better not to do this, and think of the space of sections as a module over the ring of suitable functions.

Now to return to your very good question about “left or right Kan extension”… the answer is both. since for $Vect$-valued functors (where $Vect$ is the category of finite dimensional vector spaces), we have natural isomorphisms $f^* \cong f^!$ and $f_* \cong f_!$: these functors are \textit{ambiadjoint} (ie. both left and right adjoint). We use this to define the effect of $\Lambda$ on 2-morphisms in $Span_2(Gpd)$.

This isomorphism is closely related to the fact that finite-dimensional vector spaces are canonically isomorphic to their double-dual: $V \cong V^{**}$. That’s because the functors $f^*$ and $f_*$ are 2-linear maps. These are naturally isomorphic to maps represented as matrices of vector spaces. Taking an adjoint – aside from transposing the matrix, naturally replaces the matrices with their duals. Doing this twice, we get the isomorphisms above. So the functors are both left and right adjoint to each other, and thus in particular we have what is both left and right Kan extension. (This is also connected with the fact that, in $Vect$, the direct sum is both product and coproduct – i.e. limit and colimit.)

It’s worth pointing out, then, that we wouldn’t generally expect this to happen for infinite-dimensional vector spaces, since these are generally not canonically isomorphic to their double-duals. Instead, for this case we would need to be looking at functors valued in $Hilb$, since Hilbert spaces do have that property. That’s why, in the case of smooth groupoids (say, Lie groupoids), we end up talking about “(measurable) equivariant Hilbert bundles”. (In particular, the ring of functions over which our sheaves are modules is: the measurable ones. Why this is the right choice would be a bit of a digression, but roughly it’s analogous to the fact that $L^2(X)$ is a space of measurable functions. This is the limitation on which representations we want that I alluded to above.).

Now, $\Lambda$ is supposed to be a 2-functor. In general, given a category $C$ with all pullbacks, $Span_2(C)$ is the universal 2-category faithfully containing $C$ such that every morphism has an ambiadjoint. So the fact that the “pull” and “push” operations are ambiadjoint lets this 2-functor respect that property. It’s the unit and counits of the adjunctions which produce the effect of $\Lambda$ on 2-morphisms: given a span of span-maps, we take the two maps in the middle, consider the adjoint pairs of functors that come from them, and get a natural transformation which is just the composite of the counit of one adjunction and the unit of the other.

Here’s where we understand how this fits into the groupoidification program – because the effect of $\Lambda$ on 2-morphisms exactly reproduces the “degroupoidification” functor of Baez and Dolan, from spans of groupoids into $Vect$, when we think of such a span as a 2-morphism in $Hom(1,1)$ – that is, a span of maps of spans from the terminal groupoid to itself. In other words, degroupoidification is an example something we can do between ANY pair of groupoids – but in the special case where the representation theory all becomes trivial. (This by no means makes it uninteresting: in fact, it’s a perfect setting to understand almost everything else about the subject).

Now, to actually get all the coefficients to work out to give the groupoid cardinality, one has to be a bit delicate – the exact isomorphism between the construction of the left and right adjoint has some flexibility when we’re working over the field of complex numbers. But there’s a general choice – the Nakayama isomorphism – which works even when we’re replace $Vect$ by $R$-modules for some ring $R$. To make sure, for general $R$, that we have a true isomorphism, the map needs some constants. These happen to be, in our case, exactly the groupoid cardinalities to make the above statement true!

To me, this last part is a rather magical aspect of the whole thing, since the motivation I learned for groupoid cardinalities is quite remote from this – it’s just a valuation on groupoids which gets along with products and coproducts, and also with group actions (so that $|X/G| = |X|/|G|$, even when the action isn’t free). So one thing I’d like to know, but currently don’t is: how is it that this is “secretly” the same thing as the Nakayama isomorphism?

Last Friday, UWO hosted a Distinguished Colloquium talk by Gregory Chaitin, who was talking about a proposal for a new field he calls “metabiology”, which he defined in the talk (and on the website above) as “a field parallel to biology, dealing with the random evolution of artificial software (computer programs) rather than natural software (DNA), and simple enough that it is possible to prove rigorous theorems or formulate heuristic arguments at the same high level of precision that is common in theoretical physics.” This field doesn’t really exist to date, but his talk was intended to argue that it should, and to suggest some ideas as to what it might look like. It was a well-attended talk with an interdisciplinary audience including (at least) people from the departments of mathematics, computer science, and biology. As you might expect for such a talk, it was also fairly nontechnical.

A lot of the ideas presented in the talk overlapped with those in this outline, but to summarize… One of the motivating ideas that he put forth was that there is currently no rigorous proof that Darwin-style biological evolution can work – i.e. that operations of mutation and natural selection can produce systems of very high complexity. This is a fundamental notion in biology, summarized by the slogan, “Nothing in biology makes sense except in light of evolution”. This phrase, funnily, was coined as the title of a defense of a “theistic evolution” – not obviously a majority position among scientists, but also not to be confused with “intelligent design” which claims that evolution can’t account for observed features of organisms. This is a touchy political issue in some countries, and it’s not obvious that a formal proof that mutation and selection CAN produce highly complex forms would resolve it. Even so, as Chaitin said, it seems likely that such a proof could exist – but if there’s a rigorous proof of the contrary, that would be good to know also!

Of course, such a formal proof doesn’t exist because formal proof doesn’t play much role in biology, or any other empirical science – since living things are very complex, and incompletely understood. Thus the proposal of a different field, “metabiology”, which would study simpler formal objects: “artificial software” in the form of Turing machines or program code, as opposed to “natural software” like DNA. This abstracts away everything about an organism except its genes (which is a lot!), with the aim of simplifying enough to prove that mutation and selection in this toy world can generate arbitrarily high levels of complexity.

Actually stating this precisely enough to prove ties in to the work that Chaitin is better known for, namely the study of algorithmic complexity and theoretical computer science. The two theorems Chaitin stated (but didn’t prove in the talk) did not – he admitted – really meet that goal, but perhaps did point in that direction. One measure of complexity is computability – that is, the size of a Turing machine (for example, though a similar definition applies to other universal ways of describing algorithms) which is needed to generate a particular pattern. A standard example is the “Busy Beaver function“, and one way to define
it is to say that $B(n)$ is the largest number printed out by an $n$-state Turing machine which then halts. Since the halting problem is uncomputable (i.e. there’s no Turing machine which, given a description of another machine, can always decide whether or not it halts), for reasons analogous to Cantor’s diagonal argument or Godel’s incompleteness theorem, generating $B(n)$, or a sequence of the same order, is a good task to measure complexity.

So the first toy model involved a single organism, being replaced in each generation by a mutant form. The “organism” is a Turing machine (or a program in some language, etc. – one key result from complexity theory is that all these different ways to specify an algorithm can simulate each other, with the addition of at most a fixed-size prefix, which is the part of the algorithm describing how to do the simulation). In each generation, it is mutated. The mutant replaces the original organism if: (a) the new code halts, and (b) outputs a number which (c) is larger than the number produced by the original. Now, this decision procedure is uncomputable since it requires solving the halting problem – so in particular, there’s no way to simulate this process. But the theorem says that, in exponential time (i.e. $t(n) \sim O(e^n)$), this process will produce a machine which produces a number of order $B(n)$. That is, as long as the “environment” (the thing doing the selection) can recognize and reward complexity, mutation is sufficient to produce it. But these are pretty big assumptions, which is one reason this theorem isn’t quite what’s wanted.

Still, within it’s limited domain, he also stated a theorem to the effect that, for any given level of complexity (in the above sense), there is a path through the space of possible programs which reaches it, such that the “mutation distance” (roughly, the negative logarithm of the probability of a mutation occurring) at each step is bounded, and the complexity (therefore fitness, in this toy model) increases at each step. He indicated that one could prove this using the bits of the halting probability Omega – he didn’t specify how, and this isn’t something I’m very familiar with, but apparently (as describeded in the linked article), there are somewhat standard ways to do this kind of thing.

So anyway, this little toy model doesn’t really do the job Chaitin is saying ought to be done, but it illustrates what the kind of theorems he’s asking for might look like. My reaction is that it would be great to have theorems like this that could tell us something meaningful about real biology (so the toy model certainly is too simple), though I’m not totally convinced there needs to be a “new field” for such study. But certainly theoretical biology seems to be much less developed than, say, theoretical physics, and even if rigorous proofs aren’t going to be as prominent there, if some can be found, it probably couldn’t hurt.

After the talk, there was some interesting discussion about other things going on in theoretical biology and “systems biology“.  Chaitin commented that a lot of the work in this field involves detailed simulations of models of real systems, made as accurate as possible – which, while important, is different from the kind of pursuit of basic theoretical principles he was talking about.  So this would include things like: modeling protein folding; studying patterns in big databases of gene frequencies in populations and how they change in time; biophysical modeling of organs and the biochemical reactions in them; simulating the dynamics of individual cells, their membranes and the molecular machinery that makes them work; and so on.  All of which has been moving rapidly in recent years,  but is only tangentially related to fundamental principles about how life works.

On the other hand, as audience members pointed out, there is another thread, exemplified by the Santa Fe Institute, which is more focused on understanding the dynamics of complex systems.  Some well-known names in this area would be Stuart Kauffman, John Holland and Per Bak, among others.  I’ve only looked into this stuff at the popular level, but there are some interesting books about their work – Holland’s “Hidden Order”, Kauffman’s “The Origins of Order” (more technical) and “At Home in the Universe” (more popular), and Solé and Goodwin’s “Signs of Life” (a popular survey, but with equations, of various
aspects of mathematical approaches to biological complexity).  Chaitin’s main comment on this stuff is that it has produced plenty of convincing heuristic arguments, simulations and models with suggestive behaviour, and so on – but not many rigorous theorems.  So: it’s good, but not exactly what he meant by “metabiology”.

Summarizing this stuff would be a big task in itself, but it does connect to Chaitin’s point that it might be nice to know (rigorously) if Darwinian evolution by itself were NOT enough to explain the complexity of living things.  Stuart Kauffman, for example, has suggested that certain kinds of complex order tend to arise through “self-organization”.  Philosopher Daniel Dennett
commented on this in “Darwin’s Dangerous Idea”, saying that although this might be true, at most it tells us more detail about what kinds of things Darwinian selection has available to act on.

This all seems to tie into the question over which appeared first as life was first coming into being: self-replicating molecules like RNA (and later DNA), or cells with metabolic reactions occurring inside.  Organisms obviously both reproduce and metabolize, but these are two quite different kinds of process, and there seems to be a “chicken-and-egg” problem with which came first.  Kauffman, among others, has looked at the emergence of “autocatalytic networks” of chemical reactions: these are collections of chemical reactions, some or all of which needing a catalyst, such that all the catalysts needed to make them run are products of some reaction in the network.  They’ve shown in simulation that such networks can arise spontaneously under certain conditions – suggesting that metabolism might have come into existence without DNA or similar molecules around (one also thinks of larger phenomena, like the nitrogen cycle).  In any case, this is the kind of thing which people sometimes point to when suggesting that Darwinian selection isn’t enough to completely explain the structure of organisms actually existing today.  Which is a different claim (mind you) than the claim that Darwinian evolution could not possibly produce complex organisms.  Chaitin’s whole motivation was to suggest that it should be provable one way or the other (and, he presumes, in the affirmative) whether mutation and selection CAN do this job.  If it could be proved that it can’t – at least there are some other ingredients to consider.

All in all, I found the talk thought-provoking, in spite (or because) of being partial and inconclusive.  Biology may be less rigorous than physics, but this could just be a sign that there’s a lot to learn and do in the field – and a lot of it is being done!

Continuing from the previous post…

I realized I accidentally omitted Klaas Lansdman’s  talk on the Kochen-Specker theorem, in light of topos theory.  This overlaps a lot with the talk by Andreas Doring, although there are some significant differences.  (Having heard only what Andreas had to say about the differences, I won’t attempt to summarize them).  Again, the point of the Kochen-Specker theorem is that there isn’t a “state space” model for a quantum system – in this talk, we heard the version saying that there are no “locally sigma-Boolean” maps, from operators on a Hilbert space, to $\{ 0, 1 \}$.  (This is referring to sigma-algebas (of measurable sets on a space), and Boolean algebras of subsets – if there were such a map, it would be representing the system in terms of a lattice equivalent to some space).  As with the Isham/Doring approach, they then try to construct something like a state space – internal to some topos.  The main difference is that the toposes are both categories of functors into sets from some locale – but here the functors are covariant, rather than contravariant.

Now, roughly speaking, the remaining talks could be grouped into two kinds:

Quantum Foundations

Many people came to this conference from a physics-oriented point of view.  So for instance Rafael Sorkin gave a talk asking “what is a quantum reality?”. He was speaking from a “histories” interpretation of quantum systems. So, by contrast, a “classical reality” would mean one worldline: out of some space of histories, one of them happens. In quantum theory, you typically use the same space of histories, but have some kind of “path integral” or “sum over histories” when you go to compute the probabilities of given events happening. In this context, “event” means “a subset of all histories” (e.g. the subset specified by a statement like “it rained today”). So his answer to the question is: a reality should be a way of answering all questions about all events.  This is called a “coevent”.  Sorkin’s answer to “what is a quantum reality?” is: “a primitive, preclusive coevent”.

In particular, it’s a measure $\mu$.  For a classical system, “answering” questions means yes/no, whether the one history is in a named event – for a quantum system, it means specifying a path integral over all events – i.e. a measure on the space of events.  This measure needs some nice properties, but it’s not, for instance, a probability measure (it’s complex valued, so there can be interference effects).  Preclusion has to do with the fact that the measure of an event being zero means that it doesn’t happen – so one can make logical inferences about which events can happen.

Other talks addressing foundational problems in physics included Lucien Hardy’s: he talked about how to base predictive theories on operational structures – and put to the audience the question of whether the structures he was talking about can be represented categorically or not.  The basic idea is an “operational structure” is some collection of operations that represents a physical experiment whose outcome we might want to predict.  They have some parameters (“knob settings”), outcomes (classical “readouts”), and inputs and outputs for the things they study and affect (e.g. a machine takes in and spit out an electron, doing something in the middle).  This sort of thing can be set up as a monoidal category – but the next idea, “object-oriented operationalism”, involved components having “connections” (given relations between their inputs) and “coincidences” (predictable correlations in output).  The result was a different kind of diagram language for describing experiments, which can be put together using a “causaloid product” (he referred us to this paper, or a similar one, on this).

Robert Spekkens gave a talk about quantum theory as a probability theory – there are many parallels, though the complex amplitudes give QM phenomena like interference.  Instead of a “random variable” $A$, one has a Hilbert space $H_A$; instead of a (positive) function of $A$, one has a positive operator on $H_A$; standard things in probability have analogs in the quantum world.  What Robert Spekkens’ talk dealt with was how to think about conditional probabilities and Bayesian inference in QM.  One of the basic points is that when calculating conditional probabilities, you generally have to divide by some probability, which encounters difficulties translating into QM.  He described how to construct a “conditional density operator” along similar lines – replacing “division” by a “distortion” operation with an analogous meaning.  The whole thing deeply uses the Choi-Jamiolkowski isomorphism, a duality between “states and channels”.  In terms of the string diagrams Bob Coecke et. al. are keen on, this isomorphism can be seen as taking a special cup which creates entangled states into an ordinary cup, with an operator on one side.  (I.e. it allows the operation to be “slid off” the cup).  The talk carried this through, and ended up defining a quantum version of the probabilistic concept of “conditional independence” (i.e. events A and C are independent, given that B occurred).

A more categorical look at foundational questions was given by Rick Blute’s talk on “Categorical Structures in AQFT”, i.e. Algebraic Quantum Field Theory.  This is a formalism for QFT which takes into account the causal structure it lives on – for example, on Minkowski space, one has a causal order for points, with $x \leq y$ if there is a future-directed null or timelike curve from $x$ to $y$.  Then there’s an “interval” (more literally, a double cone) $[x,y] = \{ z | x \leq z \leq y\}$, and these cones form a poset under inclusion (so this is a version of the poset of subspaces of a space which keeps track of the causal structure).  Then an AQFT is a functor $\mathbb{A}$ from this poset into C*-algebras (taking inclusions to inclusions): the idea is that each local region of space has its own algebra of observables relevant to what’s found there.  Of course, these algebras can all be pieced together (i.e. one can take a colimit of the diagram of inclusions coming from all regions on spacetime.  The result is $\hat{\mathbb{A}}$.  Then, one finds a category of certain representations of it on a hilbert space $H$ (namely, “DHR” representations).  It turns out that this category is always equivalent to the representations of some group $G$, the gauge group of the AQFT.  Rick talked about these results, and suggested various ways to improve it – for example, by improving how one represents spacetime.

The last talk I’d attempt to shoehorn into this category was by Daniel Lehmann.  He was making an analysis of the operation “tensor product”, that is, the monoidal operation in $Hilb$.  For such a fundamental operation – physically, it represents taking two systems and looking at the combined system containing both – it doesn’t have a very clear abstract definition.  Lehmann presented a way of characterizing it by a universal property analogous to the universal definitions for products and coproducts.  This definition makes sense whenever there is an idea of a “bimorphism” – a thing which abstracts the properties of a “bilinear map” for vector spaces.  This seems to be closely related to the link between multicategories and monoidal categories (discussed in, for example, Tom Leinster’s book).

Categories and Logic

Some less physics-oriented and more categorical talks rounded out the part of the program that I saw.  One I might note was Mike Stay‘s talk about the Rosetta Stone paper he wrote with John Baez.  The Rosetta Stone, of course, was a major archaeological find from the Ptolemaic period in Egypt – by that point, Egypt had been conquered by Alexander of Macedon and had a Greek speaking elite, but the language wasn’t widespread.  So the stone is an official pronouncement with a message in Greek, and in two written forms of Egyptian (heiroglyphic and demotic), neither of which had been readable to moderns until the stone was uncovered and correspondences could be deduced between the same message in a known language and two unknown ones.  The idea of their paper, and Mike’s talk, is to collect together analogs between four subjects: physics, topology, computation, and logic.  The idea is that each can be represented in terms of monoidal categories.  In physics, there is the category of Hilbert spaces; in topology one can look at the category of manifolds and cobordisms; in computation, there’s a monoidal category whose objects are data types, and whose morphisms are (equivalence classes) of programs taking data of one type in and returning data of another type; in logic, one has objects being propositions and morphisms being (classes) of proofs of one proposition from another.  The paper has a pretty extensive list of analogs between these domains, so go ahead and look in there for more!

Peter Selinger gave a talk about “Higher-Order Quantum Computation”.  This had to do with interesting phenomena that show up when dealing with “higher-order types” in quantum computers.  These are “data types”, as I just described – the “higher-order” types can be interpreted by blurring the distinction between a “system” and a “process”.  A data type describing a sytem we might act on might be $A$ or $B$.  A higher order type like $A \multimap B$ describes a process which takes something of type $A$ and returns something of type $B$.  One could interpret this as a black box – and performing processes on a type $A \multimap B$ is like studying that black box as a system itself.  This type is like an “internal hom” – and so one might like to say, “well, it’s dual to tensor – so it amounts to taking $A^* \otimes B$, since we’re in the category of Hilbert spaces”.  The trouble is, for physical computation, we’re not quite in the category where that works.  Because not all operators are significant: only some class of totally positive operators are physical.  So we don’t have the hom-tensor duality to use (equivalently, don’t have a well-behaved dual), and these types have to be considered in their own right.  And, because computations might not halt, operations studying a black box might not halt.  So in particular, a “co-co-qubit” isn’t the same as a qubit.  A co-qubit is a black box which eats a qubit and terminates with some halting probability.  A co-co-qubit eats a co-qubit and does the same.  If not for the halting probability, one could equally well see a qubit “eating” a co-co-qubit as the reverse.  But in fact they’re different.  A key fact in Peter’s talk is that quantum computation has new logical phenomena happening with types of every higher order.  Quantifying this (an open problem, apparently) would involve finding some equivalent of Bell inequalities that apply to every higher order of type.  It’s interesting to see how different quantum computing is, in not-so-obvious ways, from the classical kind.

Manoush Sadrzadeh gave a talk describing how “string diagrams” from monoidal categories, and representations of them, have been used in linguistics.  The idea is that the grammatical structure of a sentence can be build by “composing” structures associated to words – for example, a verb can be composed on left and right with subject and object to build a phrase.  She described some of the syntactic analysis that went into coming up with such a formalism.  But the interesting bit was to compare putting semantics on that syntax to taking a representation.  In particular, she described the notion of a semantic space in linguistics: this is a large-dimensional vector space that compares the meanings of words.  A rough but surprisingly effective way to clump words together by meaning just uses the statistics on a big sample of text, measuring how often they co-occur in the same context. Then there is a functor that “adds semantics” by mapping a category of string diagrams representing the syntax of sentences into one of vector spaces like this.  Applying the kind of categorical analysis usually used in logic to natural language seemed like a pretty neat idea – though it’s clear one has to make many more simplifying assumptions.

On the whole, it was a great conference with a great many interesting people to talk to – as you might guess from the fact that it took me three posts to comment on everything I wanted.

Last week was Wade Cherrington‘s Ph.D. defense – he is (or, rather, WAS) a student of Dan Christensen. The title was “Dual Computational Methods for Lattice Gauge Theory”. The point of which is to describe some methods for doing numerical computations of various physical systems governed by gauge theories. This would include electromagnetism, Yang-Mills theory (which covers the Standard Model and other quantum field theories), as well as gravity. In any gauge theory, the fundamental objects being studied are fields described by $G$-connections, for some (Lie) group $G$. To some degree of approximation, a connection gives a group element for any path in space: $\Gamma : Path(M) \rightarrow G$. Then the dynamics of these fields are described by a Lagrangian, where the action for a field is the integral of the curvature over the whole space $M$: $\int tr(F \wedge \star F)$ (plus possibly some other terms to couple the field to sources).

Now, the point here is to get non-perturbative ways to study these theories: rather than, say, getting differential equations for the fields and finding solutions by expanding a power series. The approach in question is to take a discrete version of this continuum theory, which is finite and can be dealt with exactly, and then take a limit.

So in lattice gauge theory, continuous space is replaced by a – well, a lattice $L$, say $L=\mathbb{Z}^3$, for definiteness (then eventually take the continuum limit as the spacing of the lattice goes to zero). The lattice also include edges joining adjacent points – say the set of edges is $E$. Paths in the lattice are built from these edges. (Furthermore, since an infinite lattice can’t be represented in the computer, the actual computations use a quotient of this – a lattice in a 3-torus, or equivalently, one considers only periodic fields.) Then it’s enough to say that a connection assigns a group element to each edge of the lattice, $\Gamma : E \rightarrow G$.

Of course, to back up, describing connections as functions $\Gamma : E \rightarrow G$, or $Path(M) : \rightarrow G$, often provokes various objections from people used to differential geometry. One is that the group elements assigned don’t have any direct physical meaning – since a physical state is only defined up to gauge equivalence. So if an edge $e$ joins lattice points $a$ and $b$, a gauge transformation $g : L \rightarrow G$ acts on $\Gamma$ to give $\Gamma' : E \rightarrow G$ with $\Gamma'(e) = g(a)\Gamma(e)g(b)^{-1}$. Clearly, for any given edge, there are gauge-equivalent connections assigning any group element you want. As Wade pointed out, one benefit of the dual models he was describing is that their states can be given a definite physical meaning – there are no gauge choices. Another, helpfully, is that they’re easier to calculate with (sometimes). A more physical motivation Wade suggested is that these methods can deal with spin-foam models of quantum gravity, and also matter fields: a realistic look at a theory of gravity should have some matter to gravitate, so this gives a way to simulate them together.

So what are these dual methods? This is described in some detail in this paper by Wade, Dan, and Igor Khavkine. The first step is to find a discrete version of curvature: instead of the action $\int tr(F \wedge \star F)$, we want a sum of face amplitudes. Curvature is described by the holonomy around a contractible loop, so the basic element is a face in the lattice (say $F$ is the set of faces). Given a square face $f \in F$ with edges whose holonomies are $g_1$ through $g_4$ (assuming all faces are oriented in a consistent direction), the holonomy around the face is $g_1 g_2 g_3^{-1} g_4^{-1} = g(f)$. From this, one defines an amplitude for the face, for some function $S(g(f))$ (there are various possibilities – Wade’s example used the heat kernel action mentioned in the paper above), and then the total action $S = \sum_{F} S(g(f))$ is the sum over all faces. Then instead of integrating over an infinite-dimensional, and generally intractable, space of smooth connections, one itegrates over $G^E$, the space of discrete connections.

The duality here is the expansion of this in terms of group characters: a function $S(g)$ can be written as a combination of irreducible characters: $S(g) = \sum_i c_i \chi_i(g)$. Then one can pull this sum over characters outside the integral over $G^E$ (so that local quantities are inside).

There are many nice images on Wade’s homepage (above) showing visualizations of the resulting calculations – one finds sums over certain labellings of the lattice, namely those which can be described by having certain surfaces. In particular, closed (boundaryless), branched (possibly self-intersecting) surfaces with face and edge labels given by representations of $G$ and intertwining operators between them… that is, spin foams. These dual spin foam configurations have the advantage of having a physical interpretation (though I confess I don’t have a good intuition about it) which doesn’t depend on gauge choices.

A variant on this comes about when the action is changed to include a term coupling the Yang-Mills field to fermions (one thinks of quarks and gluons, for example). In this case, the fermion part is described by “polymers” (closed, possibly self-intersecting paths, rather than surfaces), and the coupled system allows the surfaces used in the YM calculations to have boundaries – but only on these polymers. (Again, Wade has some nice images of this on his site. Personally, I find a lot of the details here remain obscure, though I’ve seen a few versions of this talk and related ones, but the pictures give a framework to hang the rest of it on.)

Wade identified two “key” ingredients for doing calculations with these dual spin foams:

1. Recoupling moves for the graphs (as described, for instance, by Carter, Flath, and Saito) which simplify the calculation of amplitudes, and
2. A set of local moves (changes of configuration) which are ergodic – that is, between them they can take any configuration to any other. (The point here is to allow a reasonably random sampling – the algorithm is stochastic – of the configuration space, while making only local changes, requiring a minimum of recomputation, at each step.)

Finally, Wade summed up by pointing out that the results obtained so far agree with the usual methods, and in some cases are faster. Then he told us about some future projects. Some involve optimizing code and adapting it to run on clusters. Others were more theoretical matters: doing for $SU(3)$ what has been done for $SU(2)$ (which will involve developing much of the recoupling theory for 3j- and 6j-symbols); finding and computing observables for these configurations (such as Wilson loops); and modelling supersymmetry and other notions about particle physics.