

When I made my previous two posts about ideas of “state”, one thing I was aiming at was to say something about the relationships between states and dynamics. The point here is that, although the idea of “state” is that it is intrinsically something like a snapshot capturing how things are at one instant in “time” (whatever that is), extrinsically, there’s more to the story. The “kinematics” of a physical theory consists of its collection of possible states. The “dynamics” consists of the regularities in how states change with time. Part of the point here is that these aren’t totally separate.

Just for one thing, in classical mechanics, the “state” includes time-derivatives of the quantities you know, and the dynamical laws tell you something about the second derivatives. This is true in both the Hamiltonian and Lagrangian formalisms of dynamics. The Hamiltonian, which represents the concept of “energy” for a system, is a function H(q,p), where q is a vector representing the values of some collection of variables describing the system (generalized position variables, in some configuration space X), and the p = m \dot{q} are corresponding “momentum” variables, which are the other coordinates in a phase space which in simple cases is just the cotangent bundle T*X. Here, m refers to mass, or some equivalent. The familiar case of a moving point particle has “energy = kinetic + potential”, or H = p^2 / (2m) + V(q) for some potential function V. The symplectic form on T*X can then be used to define a path through any point, which describes the evolution of the system in time – notably, it conserves the energy H. Then there’s the Lagrangian, which defines the “action” associated to a path, which comes from integrating some function L(q, \dot{q}) living on the tangent bundle TX, over the path. The physically realized paths (classically) are critical points of the action, with respect to variations of the path.
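
For reference, here is how the two formalisms look when written out. This is the standard textbook statement rather than anything specific to this post; the symbol S for the action and the choice of signs are one common convention, assumed here for illustration.

```latex
% Hamilton's equations on the phase space T^*X
\dot{q} = \frac{\partial H}{\partial p},
\qquad
\dot{p} = -\frac{\partial H}{\partial q}

% Energy is conserved along the resulting flow
\frac{dH}{dt}
  = \frac{\partial H}{\partial q}\,\dot{q} + \frac{\partial H}{\partial p}\,\dot{p}
  = \frac{\partial H}{\partial q}\frac{\partial H}{\partial p}
    - \frac{\partial H}{\partial p}\frac{\partial H}{\partial q}
  = 0

% The action of a path, and the condition for a critical point
S[q] = \int L(q, \dot{q})\, dt,
\qquad
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0
```

In both cases the “state” (q,p) or (q,\dot{q}) carries first derivatives, and the law fixes the second derivative of q.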

This is all based on the view of a “state” as an element of a set (which happens to be a symplectic manifold like T*X or just a manifold if it’s TX), and both the “energy” and the “action” are some kind of function on this set. A little extra structure (symplectic form, or measure on path space) turns these functions into a notion of dynamics. Now a function on the space of states is what an observable is: energy certainly is easy to envision this way, and action (though harder to define intuitively) counts as well.

But another view of states which I mentioned in that first post is the one that pertains to statistical mechanics, in which a state is actually a statistical distribution on the set of “pure” states. This is rather like a function – it’s slightly more general, since a distribution can have point-masses, but any function gives a distribution if there’s a fixed measure d\mu around to integrate against – then a function like H becomes the measure H d\mu. And this is where the notion of a Gibbs state comes from, though it’s slightly trickier. The idea is that the Gibbs state (in some circumstances called the Boltzmann distribution) is the state a system will end up in if it’s allowed to “thermalize” – it’s the maximum-entropy distribution for a given average energy of the system, which corresponds to a given temperature T. So, for instance, for a gas in a box, this describes how, at a given temperature, the kinetic energies of the particles are (probably) distributed. Up to a bunch of constants of proportionality, one expects that the weight given to a state (or region in state space) is just exp(-H/T), where H is the Hamiltonian (energy) for that state. That is, the likelihood of being in a state is inversely proportional to exp(H/T), the exponential of its energy scaled by the temperature – and higher temperature makes higher energy states more likely.

Now part of the point here is that, if you know the Gibbs state at temperature T, you can work out the Hamiltonian just by taking a logarithm – so specifying a Hamiltonian and specifying the corresponding Gibbs state are completely equivalent. But specifying a Hamiltonian (given some other structure) completely determines the dynamics of the system.
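
To make the “just take a logarithm” step concrete, here is a minimal sketch for a toy system with finitely many states. The function names and the three energy levels are made up for illustration, and units are chosen so that Boltzmann’s constant is 1. Note that the logarithm only recovers H up to an additive constant (the normalization of the distribution), so the sketch fixes that constant by setting the lowest energy to zero.

```python
import math


def gibbs_state(energies, temperature):
    """Return the Gibbs probabilities exp(-H/T) / Z for a finite list of energies."""
    weights = [math.exp(-e / temperature) for e in energies]
    partition = sum(weights)  # the normalizing constant Z
    return [w / partition for w in weights]


def hamiltonian_from_gibbs(probabilities, temperature):
    """Recover the energies from a Gibbs state via H = -T log(rho) + const.

    The additive constant is fixed (an arbitrary choice made for this sketch)
    by shifting the lowest energy to zero.
    """
    raw = [-temperature * math.log(p) for p in probabilities]
    ground = min(raw)
    return [e - ground for e in raw]


energies = [0.0, 1.0, 2.5]          # hypothetical energy levels
cold = gibbs_state(energies, 0.5)   # low temperature
hot = gibbs_state(energies, 5.0)    # high temperature
recovered = hamiltonian_from_gibbs(cold, 0.5)

print("cold:", cold)
print("hot: ", hot)
print("recovered energies:", recovered)
```

The highest level gets more weight in `hot` than in `cold`, and `recovered` reproduces the original energies, which is the classical equivalence between Hamiltonian and Gibbs state in miniature.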

This is the classical version of the idea Carlo Rovelli calls “Thermal Time”, which I first encountered in his book “Quantum Gravity”, but also is summarized in Rovelli’s FQXi essay “Forget Time“, and described in more detail in this paper by Rovelli and Alain Connes. Mathematically, this involves the Tomita flow on von Neumann algebras (which Connes used to great effect in his work on the classification of same). It was reading “Forget Time” which originally got me thinking about making the series of posts about different notions of state.

Physically, remember, these are von Neumann algebras of operators on a quantum system, the self-adjoint ones being observables; states are linear functionals on such algebras. The equivalent of a Gibbs state – a thermal equilibrium state – is called a KMS (Kubo-Martin-Schwinger) state (for a particular Hamiltonian). It’s important that the KMS state depends on the Hamiltonian, which is to say the dynamics and the notion of time with respect to which the system will evolve. Given a notion of time flow, there is a notion of KMS state.

One interesting place where KMS states come up is in (general) relativistic thermodynamics. In particular, the effect called the Unruh Effect is an example (here I’m referencing Robert Wald’s book, “Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics”). Physically, the Unruh effect says the following. Suppose you’re in flat spacetime (described by Minkowski space), and an inertial (unaccelerated) observer sees it in a vacuum. Then an accelerated observer will see space as full of a bath of particles at some temperature related to the acceleration. Mathematically, a change of coordinates (acceleration) implies there’s a one-parameter family of automorphisms of the von Neumann algebra which describes the quantum field for particles. There’s also a (trivial) family for the unaccelerated observer, since the coordinate system is not changing. The Unruh effect in this language is the fact that a vacuum state relative to the time-flow for an unaccelerated observer is a KMS state relative to the time-flow for the accelerated observer (at some temperature related to the acceleration).

The KMS state for a von Neumann algebra with a given Hamiltonian operator has a density matrix \omega, which is again, up to some constant factors, just the exponential exp(-H/T) of the Hamiltonian operator. (For pure states, \omega = |\Psi \rangle \langle \Psi |, and in general a matrix becomes a state by \omega(A) = Tr(A \omega) which for pure states is just the usual expectation value for A, \langle \Psi | A | \Psi \rangle).

Now, things are a bit more complicated in the von Neumann algebra picture than the classical picture, but Tomita-Takesaki theory tells us that as in the classical world, the correspondence between dynamics and KMS states goes both ways: there is a flow – the Tomita flow – associated to any given state, with respect to which the state is a KMS state. By “flow” here, I mean a one-parameter family of automorphisms of the von Neumann algebra. In the Heisenberg formalism for quantum mechanics, this is just what time is (i.e. states remain the same, but the algebra of observables is deformed with time). The way you find it is as follows (and why this is right involves some operator algebra I find a bit mysterious):

First, get the algebra \mathcal{A} acting on a Hilbert space H, with a cyclic vector \Psi (i.e. such that \mathcal{A} \Psi is dense in H – one way to get this is by the GNS representation, so that the state \omega just acts on an operator A by the expectation value at \Psi, as above, so that the vector \Psi is standing in, in the Hilbert space picture, for the state \omega). Then one can define an operator S by the fact that, for any A \in \mathcal{A}, one has

S(A\Psi) = A^{\star}\Psi

That is, S acts like the conjugation operation on operators at \Psi, which is enough to define S (on a dense subspace) since \Psi is cyclic. This S has a polar decomposition (analogous for operators to the polar form for complex numbers) of S = J \Delta^{1/2}, where J is antiunitary (this is conjugation, after all) and \Delta is positive and self-adjoint. We need the self-adjoint part, because the Tomita flow is a one-parameter family of automorphisms given by:

\alpha_t(A) = \Delta^{-it} A \Delta^{it}

An important fact for Connes’ classification of von Neumann algebras is that the Tomita flow is basically unique – that is, it’s unique up to an inner automorphism (i.e. a conjugation by some unitary operator – so in particular, if we’re talking about a relativistic physical theory, a change of coordinates giving a different t parameter would be an example). So while there are different flows, they’re all “essentially” the same. There’s a unique notion of time flow if we pass from the automorphisms of \mathcal{A} to their classes modulo inner automorphisms. Now, in some cases, the Tomita flow consists entirely of inner automorphisms, and this reduction makes it disappear entirely (this happens in the finite-dimensional case, for instance). But in the general case this doesn’t happen, and the Connes-Rovelli paper summarizes this by saying that von Neumann algebras are “intrinsically dynamic objects”. So this is one interesting thing about the quantum view of states: there is a somewhat canonical notion of dynamics present just by virtue of the way states are described. In the classical world, this isn’t the case.
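
The finite-dimensional case is easy enough to check by hand, so here is a rough sketch for a two-level system. It assumes the standard fact that, for a state given by an invertible density matrix \rho on a matrix algebra, the flow above acts on the algebra as conjugation by powers of \rho, so that \alpha_t(A) = \rho^{-it} A \rho^{it}. It also assumes the Hamiltonian is diagonal, so everything can be done entrywise. The function names, energies and observable are all made up for illustration.

```python
import cmath
import math


def gibbs_probabilities(energies, beta):
    """Diagonal entries of rho = exp(-beta H) / Z for a diagonal Hamiltonian."""
    weights = [math.exp(-beta * e) for e in energies]
    partition = sum(weights)
    return [w / partition for w in weights]


def modular_flow(a, probabilities, t):
    """alpha_t(A) = rho^{-it} A rho^{it}, computed entrywise for diagonal rho."""
    n = len(probabilities)
    logs = [math.log(p) for p in probabilities]
    return [[cmath.exp(-1j * t * logs[j]) * a[j][k] * cmath.exp(1j * t * logs[k])
             for k in range(n)] for j in range(n)]


def heisenberg_flow(a, energies, s):
    """Heisenberg-picture evolution e^{iHs} A e^{-iHs} for a diagonal H."""
    n = len(energies)
    return [[cmath.exp(1j * s * (energies[j] - energies[k])) * a[j][k]
             for k in range(n)] for j in range(n)]


energies = [0.0, 1.3]   # hypothetical two-level system
beta = 2.0              # inverse temperature
t = 0.7                 # flow parameter
observable = [[1.0, 2.0 + 1.0j],
              [2.0 - 1.0j, -1.0]]

rho = gibbs_probabilities(energies, beta)
flowed = modular_flow(observable, rho, t)
evolved = heisenberg_flow(observable, energies, beta * t)

difference = max(abs(flowed[j][k] - evolved[j][k])
                 for j in range(2) for k in range(2))
print("largest entrywise difference:", difference)
```

Under these assumptions the flow of the Gibbs state is just ordinary Heisenberg time evolution with time rescaled by the inverse temperature. It is also conjugation by the unitary \rho^{-it}, which belongs to the algebra, so it is inner, as claimed for the finite-dimensional case.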

Now, Rovelli’s “Thermal Time” hypothesis is, basically, that the notion of time is a state-dependent one: instead of an independent variable, with respect to which other variables change, quantum mechanics (per Rovelli) makes predictions about correlations between different observed variables. More precisely, the hypothesis is that, given that we observe the world in some state, the right notion of time should just be the Tomita flow for that state. Connes and Rovelli claim that, when they check this for certain cosmological models, like the Friedmann model, they get the usual notion of time flow. I have to admit, I have trouble grokking this idea as fundamental physics, because it seems like it’s implying that the universe (or any system in it we look at) is always, a priori, in thermal equilibrium, which seems wrong to me since it evidently isn’t. The Friedmann model does assume an expanding universe in thermal equilibrium, but clearly we’re not in exactly that world. On the other hand, the Tomita flow is definitely there in the von Neumann algebra view of quantum mechanics and states, so possibly I’m misinterpreting the nature of the claim. Also, as applied to quantum gravity, a “state” perhaps should be read as a state for the whole spacetime geometry of the universe – which is presumably static – and then the apparent “time change” would be a result of the Tomita flow on operators describing actual physical observables. But on this view, I’m not sure how to understand “thermal equilibrium”. So in the end, I don’t really know how to take the “Thermal Time Hypothesis” as physics.

In any case, the idea that the right notion of time should be state-dependent does make some intuitive sense. The only physically, empirically accessible referent for time is “what a clock measures”: in other words, there is some chosen system which we refer to whenever we say we’re “measuring time”. Different choices of system (that is, different clocks) will give different readings even if they happen to be moving together in an inertial frame – atomic clocks sitting side by side will still gradually drift out of sync. Even if “the system” means the whole universe, or just the gravitational field, clearly the notion of time even in General Relativity depends on the state of this system. If there is a non-state-dependent “god’s-eye view” of which variable is time, we don’t have empirical access to it. So while I can’t really assess this idea confidently, it does seem to be getting at something important.

In my post about my short talk at CQC, I mentioned that the groupoidification program in physics is based on a few simple concepts (most research programs are, I suppose). The ones I singled out are: state, symmetry, and history. But since concepts tend to seem simpler if you leave them undefined, there are bound to be subtleties here. Recently I’ve been thinking about the first one, state. What is a state? What is this supposedly simple concept?

Etymology isn’t an especially reliable indicator of what a word means, or even the history of a concept (words change meanings, and concepts shift over time), but it’s sometimes interesting to trace. The English word “state” comes from the Latin verb stare, meaning “to stand”, whose past participle is status, which is also borrowed directly into English. The Proto-Indo-European root sta- also means “stand”, and the English word “stand” in turn comes from this root, but this time via Germanic (along with “standard”). However, most of the words with this root come via various Latin intermediaries: state, stable, status, statue, stationary, station, and also substance, understand and others. The state of affairs is sometimes referred to as being “how things stand”, how they are, the current condition. Most of the words based on the sta- root imply non-motion (i.e. “stasis”). If anything, “state” (like “status”) carries this connotation less strongly than most, since the state of affairs can change – but it emphasizes how things stand now and not how they’re changing. From this sense, we also get the political meaning of “a state”, a reified version of a term originally meaning the political condition of a country (by analogy with Latin expressions like status rei publicae, the “condition of public affairs”).

So, narrowing focus now, the “state” of a physical system is the condition it’s in. In different models of physics, this is described in different ways, but in each case, by the “condition” we mean something like a complete description of all the facts about the system we can get. But this means different things in different settings. So I just want to take a look at some of them.

Think of these different settings for physics as being literally “settings” (but please excuse the pun) of the switches on a machine. Three of the switches are labelled Thermal, Quantum, and Relativistic. The “Thermal” switch varies whether or not we’re talking about thermodynamics or ordinary mechanics. The “Quantum” switch varies whether we’re talking about a quantum or classical system.

The “Relativistic” switch, which I’ll ignore for this post, specifies what kind of invariance we have: Galilean for Newton’s physics; Lorentzian for Special Relativity; general covariance for General Relativity. But this gets into dynamics, and “state” implies things are, well, static – that is, it’s about kinematics. At the very least, in Relativity, it’s not canonical what you mean by “now”, and so the definition of a state must include choosing a reference frame (in SR), or a Cauchy hypersurface (in GR). So let’s gloss over that for now.

When all these switches are in the “off” position, we have classical mechanics. Here, we think of a state as, at a first level of approximation, an element of a set. Now, for serious classical mechanics, this set will be a symplectic manifold, like the cotangent bundle T^*M of some manifold M. This is actually a bit subtle already, since a point in T^*M represents a collection of positions and momenta (or some generalization thereof): that is, we can start with a space of “static” configurations, parametrized by the values of some observable quantities, but a state (contrary to what etymology suggests) also includes momenta describing how those quantities are changing with time (which, in classical mechanics, is a fairly unproblematic notion).

The Hamiltonian picture of the dynamics of the system then tells us: given its state, what will be the acceleration, which we can then use to calculate states at future times. This requires a Hamiltonian, H, which we think of as the energy, which can be calculated from the state. So, for example, kinetic plus potential energy: in the case of a particle moving in a potential on a line, H = K + V = p^2/(2m) + V(q). The space of states can be described without much reference to the Hamiltonian, but once we have H, we get a flow on that space, transforming old states into new states with time.
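
As a rough illustration of how H turns a space of states into a flow, here is a small sketch that steps a state (q, p) forward in time for a particle on a line, using the kinetic term p^2/(2m). The leapfrog (kick-drift-kick) scheme, the harmonic potential V(q) = q^2/2, and all the names and numbers below are choices made for this example, not anything canonical.

```python
def leapfrog(q, p, grad_v, mass, dt, steps):
    """Approximate the Hamiltonian flow dq/dt = p/m, dp/dt = -V'(q).

    Uses the kick-drift-kick leapfrog scheme and returns the list of
    visited states (q, p), starting with the initial one.
    """
    trajectory = [(q, p)]
    for _ in range(steps):
        p -= 0.5 * dt * grad_v(q)   # half kick: update momentum
        q += dt * p / mass          # drift: update position
        p -= 0.5 * dt * grad_v(q)   # second half kick
        trajectory.append((q, p))
    return trajectory


def energy(q, p, mass, v):
    """H = p^2 / (2m) + V(q)."""
    return p * p / (2.0 * mass) + v(q)


mass = 1.0
v = lambda q: 0.5 * q * q   # hypothetical potential: a unit spring
grad_v = lambda q: q        # its derivative V'(q)

path = leapfrog(1.0, 0.0, grad_v, mass, dt=0.01, steps=1000)
energies = [energy(q, p, mass, v) for q, p in path]
drift = max(energies) - min(energies)

print("final state:", path[-1])
print("energy spread along the path:", drift)
```

Each entry of `path` is a state in the classical sense, and the energy stays very nearly constant along it, which is the numerical shadow of the statement that the flow conserves H.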

Now if we turn on the “Thermal” switch, we have a different notion of state. The standard image for the classical mechanical system is that we may be talking about a particle, or a few particles, or perhaps a rigid object, moving in space, maybe subject to some constraints. In thermodynamics, we are thinking of a statistical ensemble of objects – in the simplest case, N identical objects – and want to ask how energy is distributed among them. The standard image is of a box full of gas at some temperature: it’s full of molecules, each with its own trajectory, and they interact through collisions and exchange energy and momentum. Rather than tracking the exact positions of molecules, in thermodynamics a “state” is a distribution, or more precisely a probability measure, on the space of microstates. We don’t assume we know the detailed microstate of the system – the positions and momenta of all the particles in the gas – but only something about how these are distributed among them. This reflects the real fact that we can only measure things like pressure, temperature, etc. The measure is telling us the proportion of particles with positions and momenta in a given range.

This is a big difference for something described by the same word “state”. Even assuming our underlying space of “microstates” is still the same T^*M, the state is no longer a point. One way to interpret the difference is that here the state is something epistemic. It describes what we know about the system, rather than everything about it. The measure answers the question: “given what we know, what is the likelihood the system is in microstate X?” for each X. Now, of course, we could take a space of all such measures: given our previous classical system, it’s a space of functionals on C(T^*M). Then the state can again be seen as an element of a set. But it’s more natural to keep in view its nature as a measure, or, if it’s nice enough, as a positive function on the space of states. (It’s interesting that this is an object of the same type as the Hamiltonian – this is, intuitively, the basis of what Carlo Rovelli calls the “Thermal Time Hypothesis”, summarized here, which is secretly why I wanted to write on this topic. But more on that in a later post. For one thing, before I can talk about it, I have to talk about what comes next.)

Now turn off the “Thermal” switch, and think about the “Quantum” switch. Here there are a couple of points of view.

To begin with, we describe a system in terms of a Hilbert space, and a state is a vector in a Hilbert space. Again, this could be described as an element of a set, but the complex linear structure is important, so we keep thinking of it as fundamental to the type of a state. In geometric quantization, one often starts with a classical system with a state space like T^*M = X, and then takes the Hilbert space \mathcal{H}=L^2(X), so that a state is (modulo analysis issues) basically a complex-valued function on X. This is something like the (positive real-valued) measure which gives a thermodynamic state, but the interpretation is trickier. Of course, if \mathcal{H} is an L^2-space, we can recover a probability measure, since the square modulus of \phi \in \mathcal{H} has finite total measure (so we can normalize it). But this isn’t enough to describe \phi, and the extra information of phases goes missing. In any case, the probability measure no longer has the obvious interpretation of describing the statistics of a whole ensemble of identical systems – only the likelihood of measuring particular values for one system in the state \phi. (In fact, there are various no-go theorems getting in the way of a probability interpretation of \phi, though this again involves dynamics – a recurring theme is that it’s hard to reason sensibly about states without dynamics). So despite some similarity, this concept of “state” is very different, and phase is a key part of how it’s different. I’ll be jiggered if I can say why, though: most of the “huh?” factor in quantum mechanics lives right about here.

Another way to describe the state of a quantum system is related to this probability, though. The inner product of \mathcal{H} (whether we found it as an L^2-space or not) gives a way to talk about statistics of the system under repeated observations. Observables, which for the classical picture are described by functions on the state space X, are now self-adjoint operators on \mathcal{H}. The expectation value for an observable A in the state \phi is \langle \phi | A | \phi \rangle (note that the Dirac notation implicitly uses self-adjointness of A). So the state has another, intuitively easier, interpretation: it’s a real-valued functional on observables, namely the one I just described.

The observables live in the algebra \mathcal{A} = \mathcal{B}(\mathcal{H}) of bounded operators on \mathcal{H}. Setting both Thermal and Quantum switches of our notion of “state” gives quantum statistical mechanics. Here, the “C*-algebra” (or von Neumann-algebra) picture of quantum mechanics says that really it’s the algebra \mathcal{A} that’s fundamental – it corresponds to actual operations we can perform on the system. Some of them (the self-adjoint ones) represent really very intuitive things, namely observables, which are tangible, measurable quantities. In this picture, \mathcal{H} isn’t assumed to start with at all – but when it is, the kind of object we’re dealing with is a density matrix. This is (roughly) a positive operator on \mathcal{H} of unit trace. In general a state on a von Neumann algebra is a positive linear functional normalized to take the value 1 on the identity, which is the analogue of unit trace.

This is analogous to the view of a state as a probability measure (positive function with unit total integral) in the classical realm: if an observable is a function on states (giving the value of that observable in each state), then a measure is indeed a functional on the space of observables. A probability measure, in fact, is the functional giving the expectation value of the observable. (And, since variance and all the higher moments of the probability distribution for that observable are themselves defined as expectation values, it also tells us all of those.)
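
Here is a toy version of that statement, as a sketch only: a probability measure on a finite set of microstates, treated as something that eats observables and returns expectation values. The three microstates (q, p), their weights, and the function names are invented for the example.

```python
def expectation(measure, observable):
    """Apply the state (a probability measure) to an observable.

    `measure` maps each microstate to its probability; `observable` is a
    function on microstates. The result is the expectation value.
    """
    return sum(prob * observable(x) for x, prob in measure.items())


def variance(measure, observable):
    """Variance, defined purely in terms of expectation values."""
    mean = expectation(measure, observable)
    return expectation(measure, lambda x: (observable(x) - mean) ** 2)


# A hypothetical thermal state on three microstates (q, p), unit mass.
measure = {(0.0, -1.0): 0.25, (0.0, 1.0): 0.25, (1.0, 0.0): 0.5}

momentum = lambda state: state[1]
kinetic = lambda state: state[1] ** 2 / 2.0
constant_one = lambda state: 1.0

print("total mass:      ", expectation(measure, constant_one))
print("mean momentum:   ", expectation(measure, momentum))
print("mean kinetic:    ", expectation(measure, kinetic))
print("kinetic variance:", variance(measure, kinetic))
```

The state never appears as a point of the space, only as the map `observable -> number`, and the higher moments come for free because they are themselves expectation values.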

On the other hand, the Gelfand-Naimark-Segal theorem says that, given a state \phi : \mathcal{A} \rightarrow \mathbb{C} (which takes real values on the self-adjoint elements), there’s a representation of \mathcal{A} as an algebra of operators on some Hilbert space, and a vector v for which this \phi is just \phi(A) = \langle v | A | v \rangle. This is the GNS representation (and in fact it’s built by taking the regular representation of \mathcal{A} on itself by multiplication, with \mathcal{A} made into a Hilbert space by defining the inner product to make this property work, and with v = 1). So the view here is that a state is some kind of operation on observables – a much more epistemic view of things. So although the GNS theorem relates this to the vector-in-Hilbert-space view of “state”, they are quite different conceptually. (For one thing, the GNS representation is giving a different Hilbert space for each state, which undermines the sense that the space of ALL states is fundamentally “there”, but in both pictures \mathcal{A} is the same for all states.)

(This von Neumann-algebra point of view, by the way, gets along nicely with the 2-Hilbert space lens for looking at quantum mechanics, which may partly bridge the gap between it and the Hilbert-space view. The category of representations of a von Neumann algebra is a 2-Hilbert space. A “2-vector” (or “2-state”, if you like) in this category is a representation of the algebra. So the GNS representation itself is a “2-state”. This raises the question about 2-algebras of 2-operators, and John Baez’ question: “What is the categorified GNS theorem?” But let’s leave 2-states for later along with the rest.)

So where does this leave us regarding the meaning of “state”? The classical view is that a state is an element of some (structured) set. The usual quantum picture is that a state is, depending on how precise you want to be, either a vector in a Hilbert space, or a 1-d subspace of that Hilbert space – that is, a point in the projective Hilbert space. What these two views have in common is that there is some space of all “possible worlds”, i.e. of all ways things can be in the system being studied. A state is then a way of selecting one of these. The difference is in what this space of possible worlds is like – that is, which category it lives in – and how exactly one “selects” a state. How they differ is in the possibility of taking combinations of states. As for selecting states, Sets is a Cartesian category, with a terminal object 1 = {*}: an element of a set is a map from 1 into it. Hilb is a monoidal category, but not Cartesian: selecting a single vector has no obvious categorical equivalent, though selecting a 1-D subspace amounts to a map from \mathbb{C} (up to isomorphism). So the model of an “element” isn’t a singleton, it’s the complex line – and it relates to other possible spaces differently: not as a terminal object, but as a monoidal unit. This is a categorical way of saying how the idea of “state” is structurally different.

The thermal point of view is a little more epistemically subtle: for both classical and quantum pictures, it’s best thought of as, not a possible world, but a function acting on observables (that is, conditions of knowledge). In the classical picture, this is directly related to a space of possible worlds – it’s a measure on it, which we can think of as saying how a large ensemble of systems is distributed in that space. In the quantum picture, in some ways the most (epistemically) natural view, in terms of von Neumann algebras, breaks the connection to this notion of “possible worlds” altogether, since \mathcal{A} has representations on many different Hilbert spaces.

So a philosophical question is: what do these different concepts have in common that lets us use them all to represent the “same” root idea? Without actually answering this, I’ll just mention that at some point I’d like to talk a bit about “2-states” as 2-vectors, and in general how to categorify everything above.

It’s taken me a while to write this up, since I’ve been in the process of moving house – packing and unpacking and all the rest. However, a bit over a week ago, I was in Montreal, attending MakkaiFest ‘09 at the Centre de Recherches Mathématiques at the Université de Montréal (and a pre-conference workshop hosted at McGill, which I’m including in the talks I mention here). This was in honour of the 70th birthday of Mihaly (Michael) Makkai, of McGill University. Makkai has done a lot of important foundational work in logic, model theory, and category theory, and a great many of the talks were from former students who’d gone on and been inspired by him, so one got a sense of the range of things he’s worked on through his life.

The broad picture of Makkai’s work was explained to us by J.P. Marquis, from the Philosophy department at U of M. He is interested in philosophy of mathematics, and described Makkai’s project by contrast with the program of axiomatization of the early 20th century, along the lines suggested by Hilbert. This program provided a formal language for concrete structures – the problem, which category theory is part of a solution to, is to do the same for abstract structures. Contrast, for instance, the concrete description of a group G as a (particular) set with some (particular) operation, with the abstract definition of a group object in a category. Makkai’s work in categorical logic, said Marquis, is about formalizing the process of abstraction that example illustrates.

Model Theory/Logic

This matter – of the relation between abstract theories and concrete models of the theories – is really what model theory is about, and this is one of the major areas Makkai has worked on. Roughly, a theory is most basically a schema with symbols for types, members of types, and some function symbols – and a collection of sentences built using these symbols (usually generated from some axioms by rules of logical inference). A model is (intuitively) an interpretation of the terms: a way of assigning concrete data to the symbols – say, a symbol for a type is assigned the set of all entities of that type, and a function symbol is assigned an actual function between sets, and so on – making all the sentences of the theory true. A morphism of models is a map that preserves all the properties of the model that can be stated using first order logic.

This is an older way to say things – Victor Harnik gave an expository talk called “Model Theory vs. Categorical Logic” in which he compared two ways of adding an equivalence relation to a theory. The model theory way (invented by Shelah) involves taking the theory (list of sentences) T and extending it to a new theory T^{eq}. This has, for instance, some new types – if we had a type for “element of group”, for example, we might then get a new type “equivalence class of elements of group”, and so on. Now, this extension is “tight” in the sense that the categories of all models of T and of T^{eq} are equivalent (by a forgetful functor Mod(T^{eq}) \rightarrow Mod(T)) – but one can prove new theorems in the extended theory. To make this clear, he described work (due to Makkai and Reyes) about pretopos completion. Here, one has the concept of a “Boolean logical category” – Set is an example, as is, for any theory, a certain category whose objects are the formulas of the theory. This is related to Lawvere theories (see below). There are logical functors between such categories – functors into Set are models, but there are also logical functors between theories. The point is that a theory T embeds into T^{eq} (abusing notation here – these are now the Boolean logical categories), and that T^{eq} arises as a kind of completion of T – namely, it’s a Boolean pretopos (not just category). Moreover, it has some nice universal properties, making this point of view a bit more natural than the model-theoretic construction.

Bradd Hart’s talk, “Conceptual Completeness for Continuous Logic”, was a bit over my head, but made some use of this kind of extension of a theory to T^{eq}. The basic point seems to be to add some kind of continuous structure to logic. One example comes from a metric structure – defining a metric space of terms, where the metric function d(x,y) is some sum \sum_n \phi_n (x,y), where the \phi_n are formulas with two variables, either true or false – where true gives a 0, and false gives a 1 in this sum. This defines a distance from x to y associated to the given list of formulas \phi_n. A continuous logic is one with a structure like this. The business about equivalence relations arises if we say two things are equivalent when the distance between them is 0 – this leads to a concept of completion, and again there’s a notion that the categories of models are equivalent (though proving it here involves some notion of approximating terms to arbitrary epsilon, which doesn’t appear in standard logic).

Anand Pillay gave a talk which used model theory to describe some properties of the free group on n generators. This involved a “theory of the free group” which applies to any free group, regarding each such group as a model of the theory – in fact a submodel of some large model – and using model-theoretic methods to examine “stability” properties, in some sense which amounts to a notion of defining “generic” subsets of the group.

Logic and Higher Categories

A number of talks specifically addressed the ground where logic meets higher dimensional categories, since Makkai has worked with both.

In one talk, Robert Paré described a way of thinking about first-order theories as examples of “double Lawvere theories”. Lawvere’s way of formalizing “theories and models” was to say that the theory is a category itself (which has just the objects needed to describe the kind of structure it’s a theory of) – and a model is a functor into Sets (or some other category – a model of the theory of groups in topological spaces, say, is a topological group). For example, the theory of groups includes an object G and powers of it, multiplication and inverse maps, and expresses the axioms by the fact that certain diagrams commute. A model is a functor M : Th(Grp) \rightarrow Sets, assigning to the “group object” a set of elements, which then get the group structure from the maps. Paré’s version replaces the category with a double category. There are two kinds of morphisms – horizontal and vertical – and these are used to represent two kinds of symbols: function symbols, and relation symbols. (For example, one can talk about the theory of an ordered field – so one needs symbols for multiplication and addition and so forth, but also for the order relation \leq). Then a model of such a theory is a double functor into the double category whose objects are sets, and whose horizontal and vertical morphisms are respectively functions and relations.
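To make the ordinary (single-category) Lawvere picture a little more concrete, here is a rough Python sketch of what a model of the theory of groups in Sets amounts to: a choice of set and of maps for multiplication, inverse and unit, such that the axiom diagrams commute. The function `is_group_model` and the example maps are my own hypothetical names, and checking the equations by brute force on a finite set is only a stand-in for “the diagrams commute”, not anything from Paré’s talk.

```python
from itertools import product

def is_group_model(carrier, mul, inv, unit):
    """Check by brute force that a set with chosen maps satisfies the group axioms."""
    elems = list(carrier)
    # Associativity: the two ways of multiplying three elements agree.
    associative = all(
        mul(mul(a, b), c) == mul(a, mul(b, c))
        for a, b, c in product(elems, repeat=3)
    )
    # Unit laws on both sides.
    unital = all(mul(unit, a) == a == mul(a, unit) for a in elems)
    # Inverse laws on both sides.
    invertible = all(mul(a, inv(a)) == unit == mul(inv(a), a) for a in elems)
    return associative and unital and invertible

z3 = range(3)
add_mod3 = lambda a, b: (a + b) % 3
neg_mod3 = lambda a: (-a) % 3
sub_mod3 = lambda a, b: (a - b) % 3  # deliberately not a group operation

good = is_group_model(z3, add_mod3, neg_mod3, 0)
bad = is_group_model(z3, sub_mod3, neg_mod3, 0)
```

Here `good` comes out true and `bad` false: the same underlying set supports one assignment of maps that is a model and another that is not, which is the sense in which the axioms constrain the functor rather than the set.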

André Joyal gave a talk about the first order logic of higher structures. He started by commenting on some fields which began life close together, and are now gradually re-merging: logic and category theory; category theory and homotopy theory (via higher categories); homotopy theory and algebraic geometry. The higher categories Joyal was thinking of are quasicategories, or “(\infty, 1)-categories”, which are simplicial sets satisfying a weak version of a horn-filling condition (the “strict” version of this, in which inner horns have unique fillers, is satisfied by N(C), the nerve of a category C – there’s an n-simplex for each sequence of n composable morphisms, whose other edges are the various composites, and whose faces are “compositors”, “associators”, and so on – which for N(C) are identities). The point of this is that one can reproduce most of category theory for quasicategories – in particular, he mentioned limits and colimits, factorization systems, pretoposes, and model theory.

Moving to quasicategories on one side of the parallel between category theory and logic has a corresponding move on the other side – on the logic side, one aspect is that the usual notion of a language is replaced by what’s called Martin-Löf type theory. This, in fact, was the subject of Michael Warren’s talk, “Martin-Löf complexes” (I reported on a similar talk he gave at Octoberfest last year). The idea here is to start by defining a globular set, given a theory and type A – a complex whose n-cells have two faces, of dimension (n-1). The 0-cells are just terms of some type A. The 1-cells are terms of types like \underline{A}(a,b), where a and b are variables of type A – the type has an interpretation as a proposition that a=b “extensionally” (i.e. not via a proof – but as for instance when two programs with non-equivalent code happen to always produce the same output). This kind of operation can be repeated to give higher cells, like \underline{A(a,b)}(f,g), and so on. Given a globular set G, one gets a theory by an adjoint construction. Putting the two together, one has a monad on the category of globular sets – algebras for the monad are Martin-Löf complexes. Throwing in syntactic rules to truncate higher cells (I suppose by declaring all cells to be identities) gives n-truncated versions of these complexes, MLC_n. Then there is some interesting homotopy theory, in that the category of n-truncated Martin-Löf complexes is expected to be a model for homotopy n-types. For example, MLC_0 is equivalent to Sets, and there is an adjunction (in fact, a Quillen equivalence – that is, a kind of “homotopy” equivalence) between MLC_1 and Gpd.

Category Theory/Higher Categories

There were a number of talks that just dealt with categories – including higher categories – in their own right. Makkai has worked, for example, on computads, which were touched on by Marek Zawadowski in one of his two talks (one in the pre-conference workshop, the other in the conference). The first was about categories of “many-to-one shapes”, which are important to computads – these are a notion of higher-category, where every cell takes many “input” faces to one “output” face. Zawadowski described a “shape” of an n-cell as an initial object in a certain category built from the category of computads with specified faces. Then there’s a category of shapes, and an abstract description of “shape” in terms of a graded tensor theory (graded for dimension, and tensor because there’s a notion of composition, I believe). Zawadowski’s second talk, “Opetopic Sets in Lax Monoidal Fibrations”, dealt with a similar topic from a different point of view. A lax monoidal fibration (LMF) is a kind of gadget for dealing with multi-level structures (categories, multicategories, quasicategories, etc). There’s a lot of stuff here I didn’t entirely follow, but just to illustrate: categories arise as LMF, by the fibration cod : Set^{B} \rightarrow Set, where B is the category with two objects M, O, and two arrows from M to O. An object in the functor category Set^{B} consists of a “set of morphisms and set of objects” with maps – making this a category involves the monoidal structure, and how composition is defined, and the real point is that this is quite general machinery.

Joachim Lambek and Gonzalo Reyes, both longtime collaborators and friends of Makkai, also both gave talks that touched on physics and categories, though in very different ways. Lambek talked about the “Lorentz category” and its appearance in special relativity.  This involves a reformulation of SR in terms of biquaternions: like complex numbers, these are of the form u + iv, but u and v are quaternions.  They have various conjugation operations, and the geometry of SR can be described in terms of their algebra (just as, say, rotations in 3D can be described in terms of quaternions).  The Lorentz category is a way of organizing this – its two objects correspond to “unconjugated” and “conjugated” states.
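For the simpler analogy mentioned in passing, rotations in 3D via quaternions, here is a small Python sketch. It only illustrates the standard formula v \mapsto q v \bar{q} for a unit quaternion q, and does not attempt the biquaternion formulation from Lambek’s talk. The helper names (`qmul`, `qconj`, `rotate`) are my own, and the axis is assumed to be a unit vector.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions written as (w, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qconj(q):
    """Quaternion conjugate: negate the vector part."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotate(v, axis, angle):
    """Rotate the 3-vector v by `angle` radians about the unit vector `axis`."""
    half = angle / 2
    q = (math.cos(half), *(math.sin(half) * a for a in axis))
    # Embed v as a pure quaternion and conjugate it by q.
    _, x, y, z = qmul(qmul(q, (0.0, *v)), qconj(q))
    return (x, y, z)

# A quarter turn about the z axis should take the x axis to the y axis.
quarter_turn = rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2)
```

Up to floating-point rounding, `quarter_turn` is the unit vector along the y axis.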

Gonzalo Reyes gave a derivation of General Relativity in the context of synthetic differential geometry.  The substance of this derivation is not so different from the usual one, with one exception.  Einstein’s field equations can be derived in terms of the motions of small regions full of freely falling test particles – synthetic differential geometry makes it possible to do the same analysis using infinitesimals rigorously all the way through.  The basic point here is that in SDG one replaces the real line as usually conceived, with a “real line with infinitesimals” (think of the ring \mathbb{R}[\epsilon]/\langle \epsilon^2 \rangle, which is like the reals, but has the infinitesimal \epsilon, whose square is zero).
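To get a feel for the ring \mathbb{R}[\epsilon]/\langle \epsilon^2 \rangle, here is a minimal Python sketch of arithmetic with elements a + b\epsilon. This is only the algebraic toy model suggested above, not synthetic differential geometry itself (which lives in a topos with its own logic); the class name `Dual` and the sample polynomial `f` are my own illustrative choices.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """An element re + eps*epsilon of R[epsilon]/<epsilon^2>."""
    re: float
    eps: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.re + other.re, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # (a + b e)(c + d e) = ac + (ad + bc) e, because e^2 = 0.
        return Dual(self.re * other.re,
                    self.re * other.eps + self.eps * other.re)

    __rmul__ = __mul__

def _lift(x):
    """Treat ordinary numbers as dual numbers with no infinitesimal part."""
    return x if isinstance(x, Dual) else Dual(float(x))

EPSILON = Dual(0.0, 1.0)

def f(x):
    return x * x * x + 2 * x

# Evaluating at 3 + epsilon gives f(3) + f'(3) epsilon, with no limits taken.
value = f(Dual(3.0) + EPSILON)
```

Here `EPSILON * EPSILON` is zero, and `value` has real part 33 and infinitesimal part 29, which are f(3) and f'(3) for this polynomial. That is the sort of “first-order calculation with honest infinitesimals” the derivation relies on.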

Among other talks: John Power talked about the correspondence between Lawvere theories in universal algebra and finitary tree monads on sets – and asked about what happens to the left-hand side of this correspondence when we replace “sets” with other categories on the right-hand side. Jeff Egger talked about measure theory from a categorical point of view – namely, the correspondence of NCG between C*-algebras and “noncommutative” topological spaces, and between W*-algebras and “noncommutative” measure spaces, thought of in terms of locales. Hongde Hu talked about the “codensity theorem”, and a way to classify certain kinds of categories – he commented on how it was inspired by Makkai’s approach to mathematics: (1) find new proofs of old theorems, (2) standardize the concepts used in them, and (3) prove new theorems with those concepts. Fred Linton gave a talk describing Heath’s “V-space”, which is a half-plane with a funny topology whose open sets are “V” shapes, and described how the topos of locally finite sheaves over it has surprising properties having to do with nonexistence of global sections. Manoush Sadrzadeh, whom I met recently at CQC (see the bottom of the previous post) was again talking about linguistics using monoidal categories – she described some rules for “clitic movement” and changes in word order, and what these rules look like in categorical terms.

Other

A few other talks are a little harder for me to fit into the broad classification above.  There was Charles Steinhorn’s talk about ordered “o-minimal” structures, which touched on a bit of economics – essentially, a lot of economics is based on the assumption that preference orders can be made into real-valued functions, but in fact in many cases one has (variants on) “lexicographic order”, involving ranked priorities.  He talked about how typically one has a space of possibilities which can be cut up into cells, with one sort of order in each cell.  There was Julia Knight, talking about computable structures of “high Scott rank” – in particular, this is about infinite structures that can still be dealt with computably – for example, infinitary logical formulas involving an infinite number of “OR” statements where all the terms being joined are of some common form.  This ends up with an analysis of certain infinite trees.  Hal Kierstead gave a talk about Ramsey theory which I found notable because it used the kind of construction based on a game: to prove that any colouring of a graph (or hypergraph) has some property, one devises a game where one player tries to build a graph, and the other tries to colour it, and proves a winning strategy for one player.  Finally, Michael Barr gave a talk about a duality between certain categories of modules over commutative rings.

All in all, an interesting conference, with plenty of food for thought.

Barr, Kierstead, Knight, Steinhorn

So as I mentioned in my previous post, I attended 80% of the conference “Categories, Quanta, Concepts”, hosted by the Perimeter Institute.  Videos of many of the talks are online, but on the assumption that not everyone will watch them all, I’ll comment anyway… ;)

It dealt with various takes on the uses of category theory in fundamental physics, and quantum physics particularly. One basic theme is that the language of categories can organize and clarify the concepts that show up here. Since there doesn’t seem to be a really universal agreement on what “fundamental” physics is, or what the concepts involved might be, this is probably a good thing.

There were a lot of talks, so I’ll split this into a couple of posts – this first one dealing with two obvious category-related themes – monoidal categories and toposes.  The next post will cover most of the others – roughly, focused on fundamentals of quantum mechanics, and on categories for logic and language.

Monoidal Categories

So a large contingent came from Oxford’s Comlab, many of them looking at ideas that I first saw popularized by Abramsky and Coecke about describing the features of quantum mechanics that appear in any dagger-compact category. This yields a “string diagram” notation for quantum systems. (An explanation of this system is given by Abramsky and Coecke – http://arxiv.org/abs/0808.1023 – or more concisely by Coecke – http://arxiv.org/abs/quant-ph/0510032).

Samson Abramsky talked about diagonal arguments. This is a broad class of arguments including Cantor’s theorem (that the real line is uncountable), Russell’s paradox in set theory (about the “set” of non-self-membered sets), Gödel’s incompleteness theorem, and others. Abramsky’s talk was based on Bill Lawvere’s analysis of these arguments in general cartesian closed categories (CCC’s). The relevance to quantum theory has to do with “no-cloning” theorems – that quantum states can’t be duplicated. Diagonal arguments involve two capabilities: the ability to duplicate objects, and the ability to represent predicates (think of Gödel numbering, for instance) which is related to a fixed point property. Generalizing to other monoidal categories, one still has representability: linear functionals on Hilbert spaces can be represented by vectors. But diagonal arguments fail since there is no diagonal \Delta : H \rightarrow H \otimes H.
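One elementary way to see why the obvious candidate for a diagonal fails is that the set-level map v \mapsto v \otimes v is not linear, so it is not a morphism of Hilbert spaces. The Python sketch below checks this on a two-dimensional example; the helper names are mine, and this is only an illustration of the obstruction, not a proof of the no-cloning theorem.

```python
def tensor(v, w):
    """Components of v (x) w in the product basis, in row-major order."""
    return [a * b for a in v for b in w]

def add(v, w):
    """Componentwise sum of two vectors of the same length."""
    return [a + b for a, b in zip(v, w)]

def naive_copy(v):
    """The set-theoretic 'diagonal' v -> v (x) v."""
    return tensor(v, v)

e0, e1 = [1, 0], [0, 1]

copy_of_sum = naive_copy(add(e0, e1))                   # (e0 + e1) (x) (e0 + e1)
sum_of_copies = add(naive_copy(e0), naive_copy(e1))     # e0 (x) e0 + e1 (x) e1
```

The two results differ: `copy_of_sum` has the cross terms e0 \otimes e1 and e1 \otimes e0, while `sum_of_copies` does not. A linear map that copies the basis vectors is therefore forced to send e0 + e1 to something other than its own copy.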

Bob Coecke and Ross Duncan both spoke about “complementary observables”. Part of this comes from their notion of an “observable structure”, or “classical structure” for a quantum system. The intuition here is that this is some collection of observables which we can simultaneously observe, and such that, if we restrict to those observables, and states which are eigenstates for them, we can treat the whole system as if it were classical. In particular, this gives us “copy” and “destroy” operations for states – these maps and their duals actually turn out to define a Frobenius algebra. In finite-dimensional Hilbert spaces, this is equivalent to choosing an orthonormal basis.

Complementarity of observables is related to the concept of mutually unbiased bases. The bases \{v_i\} and \{w_j\} are unbiased if all the inner products \langle v_i , w_j \rangle have the same magnitude. If these bases are associated to observables (say, they form a basis of eigenvectors), then knowing a classical value of one observable gives no information about the other – all eigenstates are equally likely. For a visual image, think of two sets of bases for the plane, rotated 45 degrees relative to each other. Each basis vector in one has a projection of equal length onto both basis vectors of the other.
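The 45-degree picture is easy to check numerically. The sketch below is a hypothetical helper of my own, in plain Python. It tests whether two orthonormal bases of a d-dimensional space are mutually unbiased, using the fact that for orthonormal bases the common magnitude has to be 1/\sqrt{d}.

```python
import math

def inner(v, w):
    """Inner product <v, w>, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def are_mutually_unbiased(basis_v, basis_w, tol=1e-12):
    """True if every |<v_i, w_j>| equals 1/sqrt(d) (orthonormal bases assumed)."""
    d = len(basis_v)
    target = 1 / math.sqrt(d)
    return all(
        abs(abs(inner(v, w)) - target) < tol
        for v in basis_v
        for w in basis_w
    )

s = 1 / math.sqrt(2)
standard = [(1.0, 0.0), (0.0, 1.0)]
rotated_45 = [(s, s), (-s, s)]  # standard basis rotated by 45 degrees
rotated_30 = [
    (math.cos(math.pi / 6), math.sin(math.pi / 6)),
    (-math.sin(math.pi / 6), math.cos(math.pi / 6)),
]

unbiased = are_mutually_unbiased(standard, rotated_45)
biased = are_mutually_unbiased(standard, rotated_30)
```

Here `unbiased` is true and `biased` is false: only at 45 degrees does each vector of one basis project equally onto both vectors of the other.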

Thinking of the orthonormal bases as “observable structures”, the mutually unbiased ones correspond to “complementary” observables: a state which is classical for one observable (i.e. is an eigenstate for that operator) is unbiased (i.e. has equal probabilities of having any value) for the other observable. Labelling the different structures with colours (red and green, usually), Coecke and Duncan could diagrammatically represent states being classical or unbiased in particular systems.

This is where “phase groups” come into play. The setup is that we’re given some system – the toy model they often referred to was a spinning particle in 3D – and an observable system (say, just containing the observable “spin in the X direction”). Then there’s a group of symmetries of the system which leave that observable untouched (in that example, the symmetries are rotation about the X axis). This is the “phase group” for that observable.

Bill Edwards talked about phase groups and how they can be used to classify systems. He gave an example of a couple of toy models with six states each. One was based on spin (the six states describe spins about each axis in 3-space in each direction). The other, due to Robert Spekkens, is a “hidden variable” theory, where there are four possible “ontic” states (the “hidden” variable), but the six “epistemic” states only register whether the state lies in one of six possible PAIRS of ontic states. The two toy models resemble each other at the level of states, but the phase groups are different: the truly “quantum” one has a cyclic group \mathbb{Z}_4 (for the X-spin observable, it’s generated by a right-angled rotation about the X axis); the “hidden variable” model, which has some quantum-mechanics-like features, but not all, has phase group \mathbb{Z}_2 \times \mathbb{Z}_2. The suggestion of the talk was that this phase group distinguishes “local” from “nonlocal” systems (i.e. ones with hidden variable models and ones without).
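Both of these groups have four elements, so it is worth seeing concretely why they differ. The sketch below, with helper names of my own, computes the order of each element: \mathbb{Z}_4 has elements of order 4, while every non-identity element of \mathbb{Z}_2 \times \mathbb{Z}_2 has order 2. Nothing here models the physics of either toy theory; it only compares the two abstract groups named above.

```python
from itertools import product

def element_orders(elements, op, identity):
    """Map each group element to its order under the operation `op`."""
    orders = {}
    for g in elements:
        n, x = 1, g
        # Keep composing with g until we return to the identity.
        while x != identity:
            x = op(x, g)
            n += 1
        orders[g] = n
    return orders

# Z_4: integers mod 4 under addition.
z4_orders = element_orders(range(4), lambda a, b: (a + b) % 4, 0)

# Z_2 x Z_2: pairs of bits under componentwise addition mod 2.
klein_orders = element_orders(
    list(product(range(2), repeat=2)),
    lambda a, b: ((a[0] + b[0]) % 2, (a[1] + b[1]) % 2),
    (0, 0),
)
```

Sorting the orders gives 1, 2, 4, 4 for \mathbb{Z}_4 and 1, 2, 2, 2 for \mathbb{Z}_2 \times \mathbb{Z}_2, so no relabelling of elements can turn one group into the other.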

Marni Sheppard also gave a talk about Mutually Unbiased Bases, p-adic arithmetic, and algebraic geometry over finite fields, which I find hard to summarize because I don’t understand all those fields very well. Roughly, her talk made a link between quantum mechanics and an axiomatic version of projective geometry (Hilbert spaces in QM ought to be projective, after all, so this makes sense).  There was also a connection between mutually unbiased bases and finite fields, but again, this sort of escaped me.

Also in this group was Jamie Vicary, whom I’ve been working with on a project about the categorified harmonic oscillator.  His talk, however, was about n-Hilbert spaces, and n-categorical extended TQFT.  The basic point is that a TQFT assigns a number to a closed n-manifold, and a Hilbert space to each (n-1)-manifold (such as a boundary between two parts of a closed one), and if the TQFT is fully local (i.e. can be derived from, say, a triangulation), this can be continued to have it assign k-Hilbert spaces to (n-k)-manifolds for all k up to n.  He described the structure of 2-Hilbert spaces, and also monoidal ones (as many interesting cases are), and how they can all be realized (in finite dimensions, at least) as categories of representations of supergroupoids.  Part of the point of this talk was to suggest how not just dagger-compact categories, but general n-categories should be useful for quantum theory.

Toposes

The monoidal category setting is popular for dealing with quantum theories, since it abstracts some properties of Hilbert spaces, which they’re usually modelled in.  Toposes are usually thought of as generalizations of the category of sets, and in particular they model intuitionistic (classical, not quantum) logic.  So the talk by Andreas Döring (based on work with Christopher Isham – see many of Andreas’ recent papers) called “Why Topos Theory in the Foundations of Physics?” is surprising if you haven’t heard this idea before.  One motivation could be described in terms of the Kochen-Specker theorem, which, roughly, says that a quantum theory – involving observables which are operators on a Hilbert space of dimension at least three – can’t be modeled by a “state space”.  That is, it’s not the case that you can simultaneously give definite values to all the observables in a consistent way – in ANY state!  (That is, it’s not just the generic state: there is no state at all which corresponds to the classical picture of a “point” in some space parametrized by the observables.)

Now, part of the point is that there’s no “state space” in the category of sets – but maybe there is in some other topos!  And sure enough, the equivalent of a state space turns out to be a thing they call the “spectral presheaf” for the theory.  It’s an object in some topos.  The KS theorem becomes a statement that it has no “global points”.  To see what this means, you have to know what the spectral presheaf is.

This is based on the assumption that one has a (noncommutative) von Neumann algebra of operators on a Hilbert space – among them, the observables we might be interested in.  The structure of this algebra is supposed to describe some system.  Now you might want to look for subalgebras of it which are abelian.  Why?  Because commuting operators, should they be observables, are ones which we CAN assign values to simultaneously – there’s no issue of which order we do measurements in.  Call this a “context” – a choice of subalgebra making the system look classical.  So maybe we can describe a “state space” in a context: so what?

Well, the collection of all such contexts forms a poset – in fact, lattice – in fact, a complete Heyting algebra.  These objects are just the same (object-wise) as “locales” (a generalization from topological spaces, and their lattice of open sets).  The topos in question is the category of presheaves on this locale, which is to say, of contravariant functors to Set.  Which is to say… a way of assigning a set (the “state space” I mentioned), with a way of restricting sets along inclusion maps.  This restriction can be a bit rough (in fact, the fact that restriction can be quite approximate is just where uncertainty principles and the like come from).  The main point is that this “spectral presheaf” (the assignment of local state spaces to each context) supports a concept of logic, for reasoning about the system it describes.  It’s a lot like the logic of sets, but operations happen “context-by-context”.  A proposition has a truth value which is a “downset” in the lattice of contexts – the collection of contexts where the proposition is true.  A proposition just amounts to a subobject of the spectral presheaf by what they call “daseinization” – it’s the equivalent of a proposition being a subset of a configuration space (where the statement is true).

One could say a lot more, but this is a blog post, after all.

There are philosophical issues that this subject seems to provoke – the sign of an interesting theory is that it gets people arguing, I suppose.  One is the characterization of this as a “neo-realist interpretation” of quantum theory.  A “naive realist” interpretation would be one that says a “state” is just a way of saying what all the values of all the observable quantities are – to put it another way, of giving definite truth values to all definite “yes/no” questions.  This is just what the KS theorem says can’t happen.  The spectral presheaf is supposedly “neo-realist” because it does almost these things, but in an exotic topos (of presheaves on the locale of all classical contexts).  As you might expect, this is a bit of a head-scratcher.

So for my inaugural blog post of 2009, I thought I would step back and comment about the big picture of the motivation behind what I’ve been talking about here, and other things which I haven’t. I recently gave a talk at the University of Ottawa, which tries to give some of the mathematical/physical context. It describes both “degroupoidification” and “2-linearization” as maps from spans of groupoids into (a) vector spaces, and (b) 2-vector spaces. I will soon write a post setting out the new thing in case (b) that I was hung up on for a while until I learned some more representation theory. However, in this venue I can step even further back than that.

Over the Xmas/New Year break, I was travelling about “The Corridor” (the densely populated part of Canada – London, where I live, is toward one end, and I visited Montreal, Ottawa, Toronto, Kitchener, and some of the areas in between, to see family and friends). Between catching up with friends – who, naturally, like to know what I’m up to – and the New Year impulse to summarize, and the fact that I’m applying for jobs these days, I’ve had occasion to think through the answer to the question “What do you work on?” on a few different levels. So what I thought I’d do here is give the “Cocktail Party Version” of what it is I’m working on (a less technical version of my research statement, with some philosophical asides, I guess).

In The Middle

The first thing I usually have to tell people is that what I work on lives in the middle – somewhere between mathematics and physics. Having said that, I have to clear up the fact that I’m a mathematician, rather than a physicist. I approach questions with a mathematician’s point of view – I’m interested in making concepts precise, proving facts about them rigorously, and so on. But I do find it helps to motivate this activity to suppose that the concepts in question apply to the real world – by which I mean, the physical world.

(That’s a contentious position in itself, obviously. Platonists, Cartesian dualists, and people who believe in the supernatural generally don’t accept it, for example. For most purposes it doesn’t matter, but my choice about what to work on is definitely influenced by the view that mathematical concepts don’t exist independently of human thought, but the physical world does, and the concepts we use today have been selected – unconsciously sometimes, but for the most part, I think, on purpose – for their use in describing it. This is how I account for the supposedly unreasonable effectiveness of mathematics – not really any more surprising than the remarkable effectiveness of car engines at turning gasoline into motion, or that steel girders and concrete can miraculously hold up a building. You can be surprised that anything at all might work, but it’s less amazing that the thing selected for the job does it well.)

Physics

The physical world, however, is just full of interesting things one could study, even as a mathematician. Biology is a popular subject these days, which is being brought into mathematics departments in various ways. This involves theoretical study of non-equilibrium thermodynamics, the dynamics of networks (of chemical reactions, for example), and no doubt a lot of other things I know nothing about. It also involves a lot of detailed modelling and computer simulation. There’s a lot of profound mathematical engagement with the physical world here, and I think this stuff is great, but it’s not what I work on. My taste in research questions is a lot more foundational. These days, the physical side of the questions I’m thinking about has more to do with foundations of quantum mechanics (in the guise of 2-Hilbert spaces), and questions related to quantum gravity.

Now, recently, I’ve more or less come around to the opinion that these are related: that part of the difficulty of finding a good theory accommodating quantum mechanics and general relativity comes from not having a proper understanding of the foundations of quantum mechanics itself. It’s constantly surprising that there are still controversies, even, over whether QM should be understood as an ontological theory describing what the world is like, or an epistemological theory describing the dynamics of the information about the world known to some observer. (Incidentally – I’m assuming here that the cocktail party in question is one where you can use the word “ontological” in polite company. I’m told there are other kinds.)

Furthermore, some of the most intractable problems surrounding quantum gravity involve foundational questions. Since the language of quantum mechanics deals with the interactions between a system and an observer, applying it to the entire universe (quantum cosmology) is problematic. Then there’s the problem of time: quantum mechanics (and field theory), both old-fashioned and relativistic, assume a pre-existing notion of time (either a coordinate, or at least a fixed background geometry), when calculating how systems (including fields) evolve. But if the field in question is the gravitational field, then the right notion of time will depend on which solution you’re looking at.

Category Theory

So having said the above, I then have to account for why it is that I think category theory has anything to say to these fundamental issues. This being the cocktail party version, this has to begin with an explanation of what category theory is, which is probably the hardest part. Not so much because the concept of a category is hard, but because as a concept, it’s fairly abstract. The odd thing is, individual categories themselves are in some ways more concrete than the “decategorified” nubbins we often deal with. For example, finite sets and set maps are quite concrete: here are four sheep, and here four rocks, and here is a way of matching sheep with rocks. Contrast that with the abstract concept of the pure number “four” – an element in the set of cardinalities of finite sets, which gets addition and multiplication (abstractly defined operations) from the very concrete concepts of union and product (set of pairs) of sets. Part of the point of categorification is to restore our attention to things which are “more real” in this way, by giving them names.

One philosophical point about categories is that they treat objects and morphisms (which, for cocktail party purposes, I would describe as “relations between objects”) as equally real. Since I’ve already used the word, I’ll say this is an ontological commitment (at least in some domain – here’s an issue where computer science offers some nicely structured terminology) to the existence of relations as real. It might be surprising to hear someone say that relations between things are just as “real” as things themselves – or worse, more real, albeit less tangible.  Most of us are used to thinking of relations as some kind of derivative statement about real things. On the other hand, relations (between subject and object, system and observer) are what we have actual empirical evidence for. So maybe this shouldn’t be such a surprising stance.

Now, there are different ways category theory can enter into this discussion. Just to name one: the causal structure of a spacetime (a history) is a category – in particular, a poset (though we might want to refine that into a timelike-path category – or a double category where the morphisms are timelike and spacelike paths). Another way category theory may come in is as the setting for representation theory, which comes up in what I’ve been looking at. Here, there is some category representing a specific physical system – for example, a groupoid which represents the pure states of a system and their symmetries. Then we want to describe that system in a more universal way – for example, studying it by looking at maps (functors) from that category into one like Hilb, which isn’t tied to the specific system. The underlying point here is to represent something physical in terms of the sort of symbolic/abstract structures which we can deal with mathematically. Then there’s a category of such representations, whose morphisms (intertwiners in some suitably general sense) are ways of “changing coordinates” which get along with what’s important about the system.

The Point

So by “The Point”, I mean: how this all addresses questions in quantum mechanics and gravity, which I previously implied it did (or could). Let me summarize it by describing what happens in the 3D quantum gravity toy model developed in my thesis. There, the two levels (object and morphism) give us two concepts of “state”: a state in a 2-Hilbert space is an object in a category. Then there’s a “2-state” (which is actually more like the usual QM concept of a state): this is a vector in a Hilbert space, which happens to be a component in a 2-linear map between 2-vector spaces. In particular, a “state” specifies the geometry of space (albeit, in 3D, it does this by specifying boundary conditions only). A “2-state” describes a state of a quantum field theory which lives on that background.

Here is a Big Picture conjecture (which I can in no way back up at the moment, and reserve the right to second-guess): the division between “state and 2-state” as I just outlined it should turn out to resolve the above questions about the “problem of time”, and other philosophical puzzles of quantum gravity. This distinction is most naturally understood via categorification.

(Maybe. It appears to work that way in 3D. In the real world, gravity isn’t topological – though it has a limit that is.)

Well, a couple of weeks ago I was up in Waterloo at the Perimeter Institute with Dan Christensen and his grad student Wade Cherrington for a couple of days for the “Young Loops and Foams” conference. It actually ran all week, but we only took the time out to go for the first couple of days. The talks that we were there for dealt mainly with the loop-quantum-gravity and spin-foam approaches to quantum gravity.

These are not really what I’m working on, though I certainly have thought about these approaches, and Dan and his grad students have done significant work on them. Wade Cherrington has been applying spin-foam methods to lattice gauge theory, and Igor Khavkine has been working on the “new” spin foam models. Both of these guys are in the Applied Mathematics department here at UWO (though Igor is graduating this year), and a lot of their work has been about getting efficient algorithms for doing computations with these models. This seems like great stuff to me – certainly it’s a step in the direction of getting predictions and comparing them to experiments (i.e. “real physics”, though as a “mathematician” who’s only motivated by physics, I clearly don’t say this to be snobby).

Many of the talks were a bit over my head – for one thing, a lot of the significant new stuff involves fairly substantial calculation, which is by nature rather technical. There were also some more introductory talks: Etera Livine and Daniele Oriti both spoke about Group Field Theory and described the main concepts of the subject. Livine’s talk was fairly introductory – explaining how GFT describes a field theory on a background which consists of a product of a few copies of a Lie group, for instance on G^4. In that example, states of the theory

Oriti’s talk dealt more with issues about GFT, but also emphasized that it can be seen as a kind of “second quantization” of spin networks. That is, one can think of a spin network geometry in terms of a graph which is labelled with spins (in practice, half-integers). Given such a graph, there is a Hilbert space for such states on the graph, whereas in GFT, the graph itself emerges from the states. The total Hilbert space for the fields in GFT then includes many different graphs, with many different numbers of vertices. The analogy to second quantization, in which, for example, one takes the quantum mechanical theory of an oscillator with a given energy, and turns it

Oriti also made references to this paper, in which he proposes a way to get a continuum limit out of GFT (using methods, which I can hardly comment on, analogous to those used to describe condensates in solid-state physics). However, he didn’t have time to describe this in detail. I’ve only looked briefly at that paper, and it seems sort of impressionistic, but the impressions are interesting, anyway.

I managed to have a few conversations with Robert Oeckl about Extended TQFT’s on the one hand, and his general boundary formulation of QFT’s on the other (more here, and slides giving an overview here). These two points of view take the usual formalism of TQFT and run with it in two somewhat different directions. Since I’ve talked a lot here about Extended TQFT’s and categorification, I’ll just say a bit about what Oeckl calls the general boundary formulation. This doesn’t use categorical language, and it remains a theory at “codimension 1” (that is, it tells you about top-dimension “volumes” which connect codimension-1 “surfaces”, and that’s all). It does get outside what the functorial axiomatization of TQFT’s seems to ask, though. In particular, it doesn’t require you to be talking about a cobordism (“spacetime”) going from an input hypersurface (“space-slice”) to an output. Instead, it lets you talk about a general region with boundary, treating the whole boundary at once. Any part of it can be thought of as input or output.

One point of this way of describing a QFT is to help deal with the “problem of time”. His talk at the conference was a sort of “back to basics” discussion about the two basic approaches to quantum gravity – what he named the “covariant” (or perturbative) approach and the “canonical” (or “no-spacetime”) approach. One way to put the “problem” of time has to do with the apparently incompatible roles it plays in, respectively, general relativity and quantum mechanics, and these two approaches respect different portions of these roles.

The point is that in (non-quantum) relativity, a “state” is a whole world-history, part of which is the background geometry, which determines a causal order – a sort of minimal summary of time in that state. But in particular, it is part of the information contained in a state, which describes everything real. In QM, on the other hand, a “state” contains some information about the world in a maximal way (though IF you assume it represents all of reality, THEN you have to accept that reality isn’t local). But moreover, time plays a special role in QM outside any particular “world”.

In particular, the state vector in the Hilbert space \mathcal{H} encodes information about a system between measurements (chronologically!), an operator on \mathcal{H} changes a state \psi_1 into a new state \psi_2 (also chronologically), and composition of operators implies a temporal sequence (which gives the meaning of noncommuting operators – the result depends on the order in which you perform them). This all depends on a notion of temporal order which, in relativity, depends on the background metric, which putatively depends on the state itself! So the two approaches to quantization try to either (a) keep the temporal order using a fixed background, and treat perturbations as the field (which can only be approximate), or (b) keep the idea that the metric is part of the state and hopefully recover the usual picture in some special cases (which is hard).

So as I understand it, the general boundary approach is meant to help get around this. It works by assigning data to both regions M, and their boundaries \Sigma = \partial M, subject to a few rules which are reminiscent of those which make a TQFT in the usual formulation into a monoidal functor. In particular, the theory assigns a Hilbert space \mathcal{H}_{\Sigma} to a boundary, and a linear functional \rho_M : \mathcal{H}_{\Sigma} \rightarrow \mathbb{C} to a region. This satisfies some rules such as that \mathcal{H}_{\Sigma_1 \cup \Sigma_2} = \mathcal{H}_{\Sigma_1} \otimes \mathcal{H}_{\Sigma_2}, that reversing the orientation of a boundary amounts to taking the dual of the Hilbert space, some gluing rules, and so on.

Then there is a way to recover a generalization of the probability interpretation for quantum mechanics. But it’s not a matter of first setting up a system in a state, and then making a measurement. Instead, it’s a way of asking a question, given some knowledge about the “system” at the boundary. Both knowledge and question take the form of subspaces (denoted \mathcal{A} and \mathcal{S}) of \mathcal{H}_{\Sigma}, and the formula for probability involves both \rho_M and the projection operators onto these subspaces. The “probability of \mathcal{A} given \mathcal{S}” is:

P(\mathcal{A}|\mathcal{S}) = \frac{|\rho_M \circ P_\mathcal{S} \circ P_\mathcal{A}|^2}{|\rho_M \circ P_\mathcal{S}|^2}

Then one of the rules defining how \rho_M behaves when M is deformed gives a sort of “conservation of probability” – the equivalent of unitarity of time evolution. If \Sigma decomposes as the union of an input and an output, and the subspaces \mathcal{A} and \mathcal{S} correspond to states on the input and the output surfaces, it gives exactly unitarity of time evolution.
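To make the last remark concrete, here is a minimal numerical sketch of the special case where the boundary splits into an input and an output. The identifications are my own assumptions for illustration, not taken from Oeckl's papers in detail: the boundary space is taken to be H_in ⊗ H_out^*, the amplitude is built from a unitary U by \rho_M(e_i ⊗ f_k^*) = <f_k|U|e_i>, \mathcal{S} is "the input is in state psi", and \mathcal{A} is "the output is in state eta". All function names are hypothetical.

```python
import math

def rho(U, v):
    """Amplitude rho_M(v), where v[i][k] is the coefficient of e_i (x) f_k^*.

    Assumption: rho_M(e_i (x) f_k^*) = <f_k| U |e_i> = U[k][i].
    """
    return sum(U[k][i] * v[i][k]
               for i in range(len(v)) for k in range(len(v[0])))

def tensor(a, b):
    """Coefficient array of the product vector a (x) b."""
    return [[x * y for y in b] for x in a]

def norm_sq(U, onb):
    """Squared norm of rho_M composed with the projector onto span(onb).

    onb must be an orthonormal family of boundary vectors.
    """
    return sum(abs(rho(U, v)) ** 2 for v in onb)

def probability(U, psi, eta):
    """P(A|S) for S = psi (x) H_out^* and A = H_in (x) eta^* (unit vectors)."""
    n_out = len(U)
    out_basis = [[1.0 if j == k else 0.0 for j in range(n_out)]
                 for k in range(n_out)]
    eta_dual = [complex(x).conjugate() for x in eta]
    # Orthonormal basis of S: psi tensored with each dual basis vector.
    s_basis = [tensor(psi, f) for f in out_basis]
    # P_S and P_A act on different tensor factors, so they commute and
    # their composite projects onto the line spanned by psi (x) eta^*.
    sa_basis = [tensor(psi, eta_dual)]
    return norm_sq(U, sa_basis) / norm_sq(U, s_basis)

theta = 0.3
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]
psi = [1.0, 0.0]
p_stay = probability(U, psi, [1.0, 0.0])
p_flip = probability(U, psi, [0.0, 1.0])
```

Under these assumptions the denominator is just the squared norm of U applied to psi, which is 1 for unitary U, so the formula collapses to the usual Born rule |<eta|U|psi>|^2, and the two outcome probabilities add up to 1.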

Now, this seems like an interesting idea, assuming that it does indeed get over the shortcomings of both canonical and covariant approaches to quantum gravity. My main questions have to do with how to interpret it in category-theoretic terms, since it would be nice to see whether an extended TQFT – with 2-algebraic data for surfaces of codimension 2, and so on – could be described in the same way. The way Oeckl presents his TQFT’s is quite minimal, which is good for some purposes and avoids some complexity, but loses the organizing structure of TQFT-as-functor.

One thing that would be needed is a way of talking about some sort of n-category which has composition for morphisms with fairly arbitrary shapes – not just taking a source to a target. Instead of composition of arrows tip-to-tail, one has to glue randomly shaped regions together. Offhand, I don’t know the right way to do this.

A couple of posts ago, I mentioned Max Jammer’s book “Concepts of Space” as a nice genealogy of that concept, with one shortcoming from my point of view – namely, as the subtitle suggests, it’s a “History of Theories of Space in Physics”, and since physics tends to use concepts out of mathematics, it lags a bit – at least as regards fundamental concepts. Riemannian geometry predates Einstein’s use of it in General Relativity by fifty some years, for example. Heisenberg reinvented matrices and matrix multiplication (which eventually led to wholesale importation of group theory and representation theory into physics). More examples no doubt could be found (String Theory purports to be a counterexample, though opinions differ as to whether it is real physics, or “merely” important mathematics; until it starts interacting with experiments, I’m inclined to the latter, though of course contra Hardy, all important mathematics eventually becomes useful for something).

What I said was that it would be nice to see further investigation of concepts of space within mathematics, in particular Grothendieck’s and Connes’. Well, in a different context I was referred to this survey paper by Pierre Cartier from a few years back, “A Mad Day’s Work: From Grothendieck To Connes And Kontsevich, The Evolution Of Concepts Of Space And Symmetry”, which does at least some of that – it’s a fairly big-picture review that touches on the relationship between these new ideas of space. It follows that stream of the story of space up to the end of the 20th century or so.

There’s also a little historical/biographical note on Alexander Grothendieck – the historical context is nice to see (one of the appealing things about Jammer’s book). In this case, much of the interesting detail is more relevant if you find recent European political history interesting – but I do, so that’s okay. In fact, I think it’s helpful – maybe not mathematically, but in other ways – to understand the development of mathematical ideas in the context of history. This view seems to be better received the more ancient the history in question.

On the scientific end, Cartier tries to explain Grothendieck’s point of view of space – in particular what we now call  topos theory – and how it developed, as well as how it relates to Connes’.  Pleasantly enough, a key link between them turns out to be groupoids!  However, I’ll pass on commenting on that at the moment.

Instead, let me take a bit of a tangent and jump back to Jammer’s book. I’ll tell you something from his chapter “Emancipation from Aristotelianism” which I found intriguing. This would be an atomistic theory of space – an idea that’s now beginning to make something of a comeback, in the guise of some of the efforts toward a quantum theory of gravity (EDIT: but see comments below). Loop quantum gravity, for example, deals with space in terms of observables, which happen to take the form of holonomies of connections around loops. Some of these observables have interpretations in terms of lengths, areas, and volumes. It’s a prediction of LQG that these measurements should have “quantized”, which is to say discrete, values: states of LQG are “spin networks”, which is to say graphs with (quantized) labels on the edges, interpreted as areas (in a dual cell complex). (Notice this is yet another view of space, different from Grothendieck’s or Connes’, but it shares with Connes’ especially the idea of probing space in some empirical way. Grothendieck “probes” space mainly via cohomology – how “empirical” that is depends on your point of view.)

The atomistic theory of space Jammer talks about is very different, but it does also come from trying to reconcile a discrete “quantum” theory of matter with a theory linking matter to space.  In particular, the medieval Muslim philosophical school known as al Kalam tried to reconcile the Koran and Islamic theology with Greek philosophy (most of the “Hellenistic” world conquered by Alexander the Great, not least Egypt, is inside Dar al Islam, which is why many important Greek texts came into Europe via Arabic translations).  Though they were, as Jammer says, “Emancipating” themselves from Aristotle, they did share some of his ideas about space.

For Aristotle, space meant “place” – the answer to the questions “where is it?” and “what is its shape and size?”. In particular, it was first and foremost an attribute of some substance.  All “where?” questions are about some THING.  The answer is defined in terms of other things: my cat is on the ground, under the tree, beside the house.  The “place” of an object was literally the inner shell of the containing body that held it (which was contained by some other body, and so on – there being no vacuum in Aristotle).  So my “place” is defined by (depending how you look at it) my skin, my clothes, or the walls of the room I’m in.  This is a relational view of space, though more hard-headed than, say, Leibniz’s.

The philosophers of the Kalam had a similar relational view of space, but they didn’t accept Aristotle’s view of “substances”, where each thing has its own essential identity, on which attributes are hung like hats.  Instead, they believed in atomism, following Democritus and Leucippus: bodies were made out of little indivisible nuggets called “atoms”.  Macroscopic things were composites of atoms, and their attributes resulted from how the atoms were put together.  Here’s Jammer’s description:

The atoms of the Kalam are indivisible particles, equal to each other and devoid of all extension.  Spatial magnitude can be attributed only to a combination of atoms forming a body.  Although a definite position (hayyiz) belongs to each individual atom, it does not occupy space (makan).  It is rather the set of these positions – one is almost tempted to say, the system of relations – that constitutes spatial extension….

In the Kalam, these rather complicated and surprisingly abstract ideas were deemed necessary in order to meet Aristotle’s objections against atomism on the ground that a spatial continuum cannot be constituted by, or resolved into, indivisibles nor can two points be continuous or contiguous with one another.

So like people who prefer a “background independent” quantum theory of gravity, they wanted to believe that space (geometry) derives from matter, and that matter is discrete, but space was commonly held to be continuous. In the same way, they resolved the problem by discarding the assumption of continuous space and, by considering motion, were led to discrete time as well.

There are some differences, though.  The most obvious is that the nodes of the graph in a spin network state don’t represent units of matter, or “atoms”.  For that matter, quantum field theory doesn’t really have “atoms” in the sense of indivisible units which don’t break apart or interact.  Everything interacts in QFT.  (In some sense, interactions are more fundamental units in QFT than “particles” are – particles only (sic!) serve to connect one interaction with another.)

Another key difference is how space relates to matter.  In Aristotle, and in the Kalam, space is defined directly by matter: two bits of matter “define” the space between them.  In General Relativity (the modern theory with the “relational” view of space), there’s still room for space as an actor in its own right, like Newton’s absolute space-as-independent-variable – in other words, room for a vacuum, which Aristotle categorically denied could even conceivably exist.  In GR, what matter determines is the curvature of space (more precisely the Einstein tensor of the curvature).

(Edit: To emphasize a key difference glossed over before… It was coupling to quantum matter which suggested quantizing the picture of space. Discreteness of the spectrum of various observables is a logically separate prediction in each case. Either matter or space(time) could have had continuous spectrum for the relevant observables and still been quantized – discrete matter would have given discreteness for some observed quantities, but not area, length, and so on. So in the modern setting, the link is much less direct.)

Well, so the differences are probably more informative than the similarities, but the fact that theories of related discreteness in matter, space, and time, have been around for a thousand years or more is intriguing. The idea of empty space as an independent entity – in the modern form only about three hundred years old – appears to be the real novel part. One of the nice intuitions in Carlo Rovelli’s book on Quantum Gravity, for me at least, was to say that, rather than there being a separate “space”, we have a theory of fields defined on other fields as background – one of which, the “gravitational field”, has customarily been taken for “space”. So spatial geometry is a field, and it has some propagating (through space!) degrees of freedom – the particle associated to this field is a graviton. Nobody’s ever seen one, mind you – but supposing they exist makes many things easier.

To re-state a previous point: I think this is a nice aspect of categorification for dealing with space.  Extending the “stuff/structure/properties” trichotomy to allow space to resemble both “stuff” and relations between stuff leaves room for both points of view.

I mention this because tomorrow I leave London (Ontario) for London (England), and thence to Nottingham, for the Quantum Gravity and Quantum Geometry Conference.  It’s been a while since I worked much on quantum gravity, per se, but this conference should be interesting because it seems to be a confluence of mathematically and physically inclined people, as the name suggests.  I read on the program, for example, that Jerzy Lewandowski is speaking on QFT in Quantum Curved Spacetime, and suddenly remember that, oh yes, I did a Masters thesis (viz) on QFT in curved (classical) spacetime… but that was back in the 20th century!

It’s been a while, and I only made a small start at it before, but that whole area of physics is quite pretty.  Anyway, it should be interesting, and there are a number of people I’m looking forward to talking to.

I’d just like to post something about a conceptual clarification that came up recently. Last week I gave the first of a couple of talks in the Algebra seminar in our department, about the ideas of structure types and stuff types, more or less as outlined in this paper which I put out a couple of years ago. It summarizes and traipses a little way beyond the matter of the 2003/2004 quantum gravity seminar at UCR, which drew in turn on this paper by John Baez and Jim Dolan, and even further back on work by André Joyal, particularly in the paper “Foncteurs analytiques et espèces de structures”, which regrettably doesn’t seem to be available online. (I gave a blackboard version of the talk, but it was an expanded form of this one hour version.)

(Semantic side note: these espèces de structures are often referred to as “combinatorial species” in English. This is the more common translation than “structure type”, but unfortunately, it doesn’t capture the modifier “de structures”, instead choosing the more generic “combinatorial”, which makes it hard to distinguish “structure types” from “stuff types” in the Baez-Dolan sense. Also, “species” is probably over-specific as a translation of “espèces” in a way that “type” isn’t. The generic sense of “species” as “a kind of” in English is a bit recherché.)

In any case, what I’m interested in for this post is the sense in which stuff types give a “categorification” of a vector space. In a nutshell, a stuff type is a groupoid over FinSet_0 (the groupoid whose objects are finite sets, and whose morphisms are bijections). That is, it’s really a functor X \stackrel{\psi}{\longrightarrow} FinSet_0, which we call the “underlying set” functor. For example, consider the groupoid T of all binary trees, where the underlying set is the set of nodes (or, a different example, the set of leaves). Any isomorphism between two such trees gives a bijection between the underlying sets, so this actually is a functor. Or one could take the functor FinSet_0 \times FinSet_0 \stackrel{\pi_1}{\longrightarrow} FinSet_0, where the “underlying set” of a pair of sets (S_1,S_2) is just S_1, and likewise for morphisms. (Notice that different bijections “up above” in the bundle may give the same bijection “below” – in cases where this doesn’t happen, we have one of Joyal’s “structure types”.) In some ways, it’s better to think of it as a bundle of groupoids – one fibre over each object in FinSet_0.

The thing is, that map gives an invariant for objects in the category of groupoids, but not a complete invariant. This is unlike, say, the case of finite sets and the natural numbers: natural numbers correspond exactly to isomorphism classes of sets – not so with groupoid cardinalities. So there’s an equivalence relation, and reducing the object set modulo that equivalence relation gives a structure – but it’s not the minimal throwing-away of information about objects that taking isomorphism classes would be.

But in any case, it’s the whole category of groupoids (over FinSet_0) which gets “decategorified” down to a vector space, in that world. There is a concept of groupoid cardinality, which is given by Baez and Dolan in the paper above, and which is also linked to Tom Leinster’s definition of the Euler characteristic of a category. This adds up, over all the isomorphism classes of objects, \frac{1}{|Aut(x)|}, the reciprocals of the sizes of automorphism groups. Reasons why this is the nicest concept of cardinality are described in some of those references, but all that really matters here is that groupoid cardinality gets along with disjoint unions of groupoids (corresponding to sums of cardinalities), and products of groupoids (which get the product of the two cardinalities). That is, the categorical coproduct and product, respectively, define operations on the set of cardinalities!
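Here is a small sketch of that definition for finite pieces of groupoids. The representation is my own simplification for illustration: a groupoid is recorded only as a list holding the order of the automorphism group of one representative per isomorphism class, which is all the cardinality can see. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def cardinality(aut_orders):
    """Groupoid cardinality: sum of 1/|Aut(x)| over isomorphism classes."""
    return sum(Fraction(1, k) for k in aut_orders)

def disjoint_union(g, h):
    """Isomorphism classes of a disjoint union are those of g, then those of h."""
    return g + h

def product(g, h):
    """Classes of a product are pairs (x, y), with Aut(x, y) = Aut(x) x Aut(y)."""
    return [a * b for a in g for b in h]

# Finite sets with at most 5 elements: one class per size n, with n! automorphisms.
finset_upto_5 = [factorial(n) for n in range(6)]
# A single object whose automorphism group is Z/2.
z2 = [2]

card_finset = cardinality(finset_upto_5)            # partial sum of the series for e
card_union = cardinality(disjoint_union(finset_upto_5, z2))
card_product = cardinality(product(finset_upto_5, z2))
```

With exact fractions one can see that the union has cardinality card_finset + 1/2 and the product has card_finset * 1/2. Letting the size bound grow, card_finset approaches e, which is the cardinality of all of FinSet_0.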

In particular, taking stuff types – groupoids over FinSet_0 – we can take the cardinalities of the fibres over sets of each size n, giving the n^{th} coordinate in a vector. That is, the slice category \mathbf{Grpd}/FinSet_0 has this “cardinality” map taking objects into a set, and the structure of the category gives well-defined operations on this set, turning it into a vector space. In fact, there’s an operation (weak pullback) which makes it an inner product space. (To make this work in complex cardinalities takes some fudging with phases in U(1), but it can be done.)
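As a worked illustration of these coordinates, here are two simple stuff types written out. I am assuming the convention that the “fibre over size n” means the full subgroupoid of objects whose underlying set has n elements; the notation |\Psi|_n for the n-th coordinate is mine.

```latex
% n-th coordinate of the cardinality vector of a stuff type \psi : X \to FinSet_0
|\Psi|_n \;=\; \sum_{[x] \,:\, |\psi(x)| = n} \frac{1}{|Aut(x)|}

% Identity functor FinSet_0 \to FinSet_0 ("being a finite set"):
% one isomorphism class over each n, with n! automorphisms
|\Psi|_n \;=\; \frac{1}{n!}

% First projection \pi_1 : FinSet_0 \times FinSet_0 \to FinSet_0:
% classes over n are pairs (n, m), with n!\, m! automorphisms
|\Psi|_n \;=\; \sum_{m \ge 0} \frac{1}{n!\, m!} \;=\; \frac{e}{n!}
```

So, with this convention, the second stuff type is just e times the first as a vector.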

The details are interesting, and I’m coming back to looking at some of this again, but what I want to point out at the moment is a more fundamental point, which has to do with the offhanded use of the handy, but imprecise, term “categorify”. With the category of (U(1)-) stuff types, we have a category with a “decategorification” map that compresses it into a vector space. This sure sounds like a “categorified vector space”. In fact, this seems to be what people who hear the term “categorification” often want it to mean: I look for a categorification of mathematical object X by finding a category which, secretly, looks like X.

The problem is, there’s another concept attached to the phrase “categorified vector space”, namely that of 2-vector space in the sense of Kapranov and Voevodsky, as discussed, say, here. There’s a different level of abstraction at work here. The specific category of stuff types provides a categorification (if that indeed is the right word to use) of a specific vector space. The concept of a KV 2-vector space categorifies the concept of a regular vector space in a particular way: putting “additive” structure on objects, and “C-linear” structure on morphisms. (The Baez-Crans version does the same job in a different way.)

You don’t think of a specific KV 2-vector space “decategorifying to” a specific vector space. Indeed, just taking the “minimal” equivalence relation – isomorphism classes of objects – what we get from a KV 2-vector space is more like an \mathbb{N}-module (over a rig, not a ring). Basically, 2-vectors have components which are vector spaces, and therefore classified by their dimension. The relationship between THIS kind of 2-vector space and the non-categorified concept is that real vector spaces show up as the hom-sets in a KV 2-vector space.

Elucidating exactly what’s going on with these two forms of categorification would be nice – perhaps somebody’s done it, but if so, I don’t know who. I also don’t know any nice conditions that tell you when you have a “category that can be mistaken for a vector space”, like stuff types: a good characterization of these things would be nice. Or again: both versions of “categorification” of vector space have special relationships to groupoids – but of two very different natures (in one, the groupoids can be interpreted as 2-vectors – in the other, there are whole 2-vector spaces associated to groupoids). Just a coincidence?

Another possibility that comes to mind would be to form some kind of hybrid structure – where the “vector spaces” which show up in the hom-sets in a KV 2-v.s. are secretly this fake-vector space type of category. Since both types seem to have physics-y ambitions, such a setup that combines both approaches is appealing, rather than a muddled and confusing competition for the term “categorification”.

I don’t have a good ending to this story, which is why this is a blog, not a book.

In the past couple of weeks, Masoud Khalkhali and I have been reading and discussing this paper by Marcolli and Al-Yasry. Along the way, I’ve been explaining some things I know about bicategories, spans, cospans and cobordisms, and so on, while Masoud has been explaining to me some of the basic ideas of noncommutative geometry, and (today) K-theory and cyclic cohomology. I find the paper pretty interesting, especially with a bit of that background help to identify and understand the main points. Noncommutative geometry is fairly new to me, but a lot of the material that goes into it turns out to be familiar stuff bearing unfamiliar names, or looked at in a somewhat different way than the one I’m accustomed to. For example, as I mentioned when I went to the Groupoidfest conference, there’s a theme in NCG involving groupoids, and algebras of \mathbb{C}-linear combinations of “elements” in a groupoid. But these “elements” are actually morphisms, and this picture is commonly drawn without objects at all. I’ve mentioned before some ideas for how to deal with this (roughly: \mathbb{C} is easy to confuse with the algebra of 1 \times 1 matrices over \mathbb{C}), but anything special I have to say about that is something I’ll hide under my hat for the moment.

I must say that, though some aspects of how people talk about it, like the one I just mentioned, seem a bit off, to my mind, I like NCG in many respects. One is the way it ties in to ideas I know a bit about from the physics end of things, such as algebras of operators on Hilbert spaces. People talk about Hamiltonians, concepts of time-evolution, creation and annihilation operators, and so on in the algebras that are supposed to represent spaces. I don’t yet understand how this all fits together, but it’s definitely appealing.

Another good thing about NCG is the clever elegance of Connes’ original idea of yet another way to generalize the concept “space”. Namely, there was already a duality between spaces (in the usual sense) and commutative algebras (of functions on spaces), so generalizing to noncommutative algebras should give corresponding concepts of “spaces” which are different from all the usual ones in fairly profound ways. I’m assured, though I don’t really know how it all works, that one can do all sorts of things with these “spaces”, such as finding their volumes, defining derivatives of functions on them, and so on. They do lack some qualities traditionally associated with space – for instance, many of them don’t have many, or in some cases any, points. But then, “point” is a dubious concept to begin with, if you want a framework for physics – nobody’s ever seen one, physically, and it’s not clear to me what seeing one would consist of…

(As an aside – this is different from other versions of “pointless” topology, such as the passage from ordinary topologies to sites in the sense of Grothendieck. The notion of “space” went through some fairly serious mutations during the 20th century: from Einstein’s two theories of relativity, to these and other mathematicians’ generalizations, the concept of “space” has turned out to be either very problematic, or wonderfully flexible. A neat book is Max Jammer’s “Concepts of Space”: though it focuses on physics and stops in the 1930’s, you get to appreciate how this concept gradually came together out of folk concepts, went through several very different stages, and in the 20th century started to be warped out of all recognition. It’s as if – to adapt Dan Dennett – “their word for milk became our word for health”. I would like to see a comparable history of mathematicians’ more various concepts, covering more of the 20th century. Plus, one could probably write a less Eurocentric genealogy nowadays than Jammer did in 1954.)

Anyway, what I’d like to say about the Marcolli and Al-Yasry paper at the moment has to do with the setup, rather than the later parts, which are also interesting. This has to do with the idea of a correspondence between noncommutative spaces. Masoud explained to me that, related to the matter of not having many points, such “spaces” also tend to be short on honest-to-goodness maps between them. Instead, it seems that people often use correspondences. Using that duality to replace spaces with algebras, a recurring idea is to think of a category where a morphism from algebra A to algebra B is not a map, but a left-right (A,B)-bimodule, _AM_B. This is similar to the business of making categories of spans.

Let me describe briefly what Marcolli and Al-Yasry describe in the paper. They actually have a 2-category. It has:

Objects: An object is a copy of the 3-sphere S^3 with an embedded graph G.

Morphisms: A morphism is a span of branched covers of 3-manifolds over S^3:

G_1 \subset S^3 \stackrel{\pi_1}{\longleftarrow} M \stackrel{\pi_2}{\longrightarrow} S^3 \supset G_2

such that each of the maps \pi_i is branched over a graph containing G_i (perhaps strictly). In fact, as they point out, there’s a theorem (due to Alexander) proving that ANY 3-manifold M can be realized as a branched cover over the 3-sphere, branched at some graph (though perhaps not including a given G, and certainly not uniquely).

2-Morphisms: A 2-morphism between morphisms M_1 and M_2 (together with their \pi maps) is a cobordism M_1 \rightarrow W \leftarrow M_2, in a way that’s compatible with the structure of the M_i as branched covers of the 3-sphere. The M_i are being included as components of the boundary \partial W – I’m writing it this way to emphasize that a cobordism is a kind of cospan. Here, it’s a cospan between spans.

This is somewhat familiar to me, though I’d been thinking mostly about examples of cospans between cospans – in fact, thinking of both as cobordisms. From a categorical point of view, this is very similar, except that with spans you compose not by gluing along a shared boundary, but taking a fibred product over one of the objects (in this case, one of the spheres). Abstractly, these are dual – one is a pushout, and the other is a pullback – but in practice, they look quite different.
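To see what “compose by fibred product” means in the simplest possible setting, here is a toy sketch with spans of finite sets rather than branched covers. This is purely an illustration of the pullback, not the construction in the paper. A span A ← X → B is stored as a list with one (left, right) pair per element of the apex X; the function name is hypothetical.

```python
def compose_spans(first, second):
    """Compose A <- X -> B with B <- Y -> C by the fibred product over B.

    Each span is a list with one (left, right) pair per apex element.
    The composite apex has one element per pair (x, y) whose images in B
    agree; it maps to A through x and to C through y.
    """
    return [(a, c)
            for (a, b) in first
            for (b2, c) in second
            if b == b2]

# X has two elements, both sitting over b1.
span_ab = [("a1", "b1"), ("a2", "b1")]
# Y has two elements over b1 and one over b2.
span_bc = [("b1", "c1"), ("b1", "c2"), ("b2", "c1")]

composite = compose_spans(span_ab, span_bc)
```

Only the elements of Y over b1 can be matched with anything in X, so the composite apex has 2 × 2 = 4 elements. Counting apex elements over each pair of endpoints turns this composition into matrix multiplication, which is one reason spans show up so often as a stand-in for linear maps.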

However, this higher-categorical stuff can be put aside temporarily – they get back to it later, but to start with, they just collapse all the hom-categories into hom-sets by taking morphisms to be connected components of the categories. That is, they think about taking morphisms to be cobordism classes of manifolds (in a setting where both manifolds and cobordisms have some branched-covering information hanging around that needs to be respected – they’re supposed to be morphisms, after all).

So the result is a category. Because they’re writing for noncommutative geometry people, who are happy with the word “groupoid” but not “category”, they actually call it a “semigroupoid” – but as they point out, “semigroupoid” is essentially a synonym for (small) “category”.

Apparently it’s quite common in NCG to do certain things with groupoids \mathcal{G} – like taking the groupoid algebra \mathbb{C}[\mathcal{G}] of \mathbb{C}-linear combinations of morphisms, with a product that comes from multiplying coefficients and composing morphisms whenever possible. The corresponding general thing is a categorical algebra. There are several quantum-mechanical-flavoured things that can be done with it. One is to let it act as an algebra of operators on a Hilbert space.

This is, again, a fairly standard business. The way it works is to define a Hilbert space \mathcal{H}(G) at each object G of the category, which has a basis consisting of all morphisms whose source is G. Then the algebra acts on this, since any morphism M' which can be post-composed with one M starting at G acts (by composition) to give a new morphism M' \circ M starting at G – that is, it acts on basis elements of \mathcal{H}(G) to give new ones. Extending linearly, algebra elements (combinations of morphisms) also act on \mathcal{H}(G).
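To make the construction concrete, here is a minimal sketch in Python. The category used is a toy stand-in of my own choosing (the poset 0 ≤ 1 ≤ 2, with one morphism (i, j) whenever i ≤ j), not the category of branched covers from the paper, and the names `compose`, `multiply`, `basis` and `act` are hypothetical. Algebra elements and vectors are both stored as dictionaries from morphisms to coefficients.

```python
from collections import defaultdict

# Toy category (an assumption for illustration only): the poset 0 <= 1 <= 2,
# with exactly one morphism (i, j) from i to j whenever i <= j.
OBJECTS = (0, 1, 2)
MORPHISMS = [(i, j) for i in OBJECTS for j in OBJECTS if i <= j]


def compose(g, f):
    """Return the composite "g after f", or None if they are not composable."""
    if f[1] != g[0]:  # target of f must equal source of g
        return None
    return (f[0], g[1])


def multiply(a, b):
    """Product in the category algebra: multiply coefficients and
    compose morphisms whenever possible; non-composable pairs give zero."""
    out = defaultdict(complex)
    for g, cg in a.items():
        for f, cf in b.items():
            h = compose(g, f)
            if h is not None:
                out[h] += cg * cf
    return dict(out)


def basis(G):
    """Basis of H(G): all morphisms whose source is G."""
    return [m for m in MORPHISMS if m[0] == G]


def act(a, v, G):
    """Act with the algebra element a on a vector v in H(G).
    Post-composition never changes the source, so the result stays in H(G)."""
    assert all(m[0] == G for m in v), "v must be a combination of morphisms out of G"
    return multiply(a, v)


# A vector in H(0) and two algebra elements.
v = {(0, 0): 1, (0, 1): 3}
a = {(1, 2): 2}
b = {(0, 1): 1, (1, 1): 1}

# Acting with a product agrees with acting twice: this is what makes it a representation.
assert act(multiply(a, b), v, 0) == act(a, act(b, v, 0), 0)
```

The only design choice here is that "composing whenever possible" is implemented by silently dropping non-composable pairs, which is the same as saying their product is zero in the algebra.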

So this gives, at each object G, an algebra of operators acting on a Hilbert space \mathcal{H}(G) – the main components of a noncommutative space (actually, these need to be defined by a spectral triple: the missing ingredient in this description is a special Dirac operator). Furthermore, the morphisms (which in this case are, remember, given by those spans of branched covers) give correspondences between these.

Anyway, I don’t really grasp the big picture this fits into, but reading this paper with Masoud is interesting. It ties into a number of things I’ve already thought about, but also suggests all sorts of connections with other topics and opportunities to learn some new ideas. That’s nice, because although I still have plenty of work to do getting papers written up on work already done, I was starting to feel a little bit narrowly focused.

I recently got back to London, Ontario from a trip to Ottawa, the first purpose of which was to attend the Ottawa Mathematics Conference. The other purpose was to visit family and friends, many of whom happen to be located there, which is one reason it’s taken me a week or so to get around to writing about the trip. Now, the OMC was a general-purpose conference, mainly for grad students, and some postdocs, to give short talks (plus a couple of invited faculty from Ottawa’s two universities – the University of Ottawa, and Carleton University – who gave lengthier talks in the mornings). This is not a type of conference I’ve been to before, so I wasn’t sure what to expect.

From one, fairly goal-oriented, point of view, the style of the conference seemed a little scattered. There was no particular topic of focus, for instance. On the other hand, for someone just starting out in mathematical research, this type of thing has some upsides. It gives a chance to talk about new work, see what’s being done across a range of subjects, and meet people in the region (in this case, mainly Ottawa, but also elsewhere across Eastern and Southern Ontario). The only other general-purpose mathematics conference I’ve been to so far was the joint meeting of the AMS in New Orleans in 2007, which had 5000 people, and where anyone attending talks would pick special sessions suiting their interests. I do think it’s worthwhile to find ways of circumventing the various pressures toward specialization in research – specialization may be useful in some ways, but balance is also good, particularly for Ph.D. students, for whom specialization is the name of the game.

One useful thing – again, particularly for students – is the reminder that the world of mathematics is broader than just one’s own department, which almost certainly has its own specialties and peculiarities. For example, whereas here at UWO “Applied” mathematics (mostly involving computer modelling) is done in a separate department, this isn’t so everywhere. Or, again, while my interactions in the UWO department focus a lot on geometry and topology (there are active groups in homotopy theory and noncommutative geometry, for example), it’s been a while since I saw anyone talk about combinatorics, or differential equations. Since I actually did a major in combinatorics at U of Waterloo, it was kind of refreshing to see some of that material again.

There were a couple of invited talks by faculty. Monica Nevins from U of Ottawa gave a broad and enthusiastic survey of representation theory for graduate students. Brett Stevens from Carleton talked about “software testing”, which surprised me by actually being about combinatorial designs. Basically, it’s about the problem of how, if you have many variables with many possible values each, to design a minimal collection of “settings” for those variables which tests all possible combinations of, say, two variables (or three, etc.). One imagines the variables representing circumstances software might have to cope with – combinations of inputs, peripherals, and so on. The combinatorial problem is then this: if there are 10 variables with 10 possible values each, you can’t possibly test all 10 billion combinations, but you might be able to test all possible settings of any given pair of variables, and do so much more efficiently than by exhaustive search, by combining some tests together.
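To give a feel for the pairwise version of the problem, here is a rough sketch of a greedy test-suite builder in Python. This is a simple heuristic of my own for illustration, not the combinatorial-design constructions from the talk, and it makes no claim to produce a minimal suite; the function name `pairwise_suite` is hypothetical.

```python
from itertools import combinations, product


def pairwise_suite(num_vars, num_values):
    """Greedily build tests (tuples of values, one per variable) until every
    setting of every pair of variables appears in at least one test."""
    # Each entry (i, a, j, b) with i < j means "variable i = a and variable j = b".
    uncovered = {
        (i, a, j, b)
        for i, j in combinations(range(num_vars), 2)
        for a, b in product(range(num_values), repeat=2)
    }
    suite = []
    while uncovered:
        # Seed the test with one uncovered pair, so every test makes progress.
        i, a, j, b = min(uncovered)
        test = {i: a, j: b}
        for k in range(num_vars):
            if k in test:
                continue

            def gain(value):
                # Count uncovered pairs that setting variable k to value would cover.
                return sum(
                    1
                    for m, w in test.items()
                    if ((m, w, k, value) if m < k else (k, value, m, w)) in uncovered
                )

            test[k] = max(range(num_values), key=gain)
        row = tuple(test[k] for k in range(num_vars))
        for p, q in combinations(range(num_vars), 2):
            uncovered.discard((p, row[p], q, row[q]))
        suite.append(row)
    return suite


# Small example: 4 variables with 3 values each (81 exhaustive combinations).
suite = pairwise_suite(4, 3)
print(len(suite), "tests cover every pair of settings")
```

For the 10-variables-by-10-values example, any pairwise suite needs at least 100 tests, since two variables alone already have 100 joint settings. How close one can get to that bound is exactly where the design theory comes in.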

Among the other talks were several combinatorial ones – error correcting codes using groups, path ideals in simplicial trees (which I understand to be a sort of generalization to simplicial sets of what trees are for graphs), heuristic algorithms for finding minimal cost collections of edges in weighted graphs that leave the graph with at least a given connectivity, and so on. Charles Starling from U of O gave an interesting talk about how to associate a topological space to an aperiodic tiling (roughly, any finite-size region in an aperiodic tiling is repeated infinitely many times – so the points of the space are translations, and two translations are within \epsilon of one another if they produce matching regions about the origin of size \frac{1}{\epsilon} – then the thing is to study cohomology of such spaces, and so forth).

The talk immediately following mine was by Mehmetcik Pamuk about homotopy self-equivalences of 4-manifolds, which used a certain braid of exact sequences of groups of automorphisms (among other things). I expected this to be very interesting, and it was certainly intriguing, but I can’t adequately summarize it – whatever he was saying, it proved to be hard to pick up from just a 25 minute talk. I did like something he said in his introduction, though: nowadays, if a topologist says they’re doing “low-dimensional” topology, they mean dimension 3, and “high-dimensional” means dimension 4. This is a glib but indicative way to point out that the topology of manifolds in dimensions 1 and 2 is well understood (the connected components are, respectively, circles and n-holed tori), and that dimensions 5 and above have been straightened out more recently thanks to Smale.

There were some quite applied talks which I missed, though I did catch one on “gravity waves”, which turn out not to be gravitational waves, but the kind of waves produced in fluids of varying density acted on by gravity. (In particular, due to layers of temperature and pressure in the atmosphere, sometimes denser air sits above less dense air, and gravity is trying to reverse this, producing waves. This produces those long rippling patterns you sometimes see in high-altitude clouds. Lidia Nikitina told us about some work modelling these in situations where the ground topography matters, such as near mountains – and had some really nice pictures to illustrate both the theory and the practice.)

On the second day there were quite a few talks of an algebraic or algebro-geometric flavour – about rings of algebraic invariants, about enumerating lines in special “blow-up” varieties, function fields associated to hyperelliptic curves, and so on – but although this is interesting, I had a harder time extracting informative things to say about these, so I’ll gloss over them glibly. However, I did appreciate the chance to gradually absorb a little more of this area of math by osmosis.

The flip side of seeing what many other people are doing was getting a chance to see what other people had to say about my own talk – about groupoids, spans, and 2-vector spaces. One of the things I find is that, while here at UWO the language of category theory is widely used (at least by the homotopy theorists and noncommutative geometry people I’ve been talking to), it’s not as familiar in other places. This seems to have been going on for some time – since the 1970s if I understand the stories correctly. After Mac Lane and Eilenberg introduced categories in the 1940s, the concept had significant effects in algebraic geometry/topology and homological algebra, and spread out from there. There was some deep enthusiasm – possibly well-founded, though I won’t claim so – that category theory was a viable replacement for set theory as a “foundation” for mathematics. True or not, that idea seemed to be one of those which was picked up by mathematicians who didn’t otherwise know much about category theory, and it seems to be one that’s still remembered. So maybe it had something to do with the apparent fall from fashion of category theory. I’ve heard that theory suggested before: roughly, that many mathematicians thought category theory was supposed to be a new foundation for mathematics, couldn’t see the point, and lost interest.

Now, my view of foundations is roughly suggested in my explanation of the title of this blog. I tend to think that our understanding of the world comes in bits and pieces, which we refine, then try to stick together into larger and more inclusive bits and pieces – the “Atlas” of charts of the title. This isn’t really just about the physical world, but the mathematical world as well (in fact I’m not really a Platonist who believes in a separate “world” of mathematical objects – though that’s a different conversation). This is really just a view of epistemology – namely, empirical methods work best because we don’t know things for sure, not being infinitely smart. So the “idealist”-style program of coming up with some foundational axioms (say, for set theory), and deriving all of mathematics from them without further reference to the outside doesn’t seem like the end of the story. It’s useful as a way of generating predictions in physics, but not of testing them. In mathematics, it generates many correct theorems, but doesn’t help identify interesting, or useful, ones.

So could category theory be used in foundations of mathematics? Maybe – but you could also say that mathematics consists of manipulating strings in a formal language, and strings are just words in a free monoid, so actually all of mathematics is the theory of monoids with some extra structure (giving rules of inference in the formal language). Yet monoid theory – indeed, algebra generally – is not mainly interesting as foundations, and probably neither is category theory.

On the whole, it was an interesting step out of the usual routine.
