Looks like the Standard Model is having a bad day – Fermilab has detected CP-asymmetry about 50 times what it predicts in some meson decay. As they say – it looks like there might be some new physics for the LHC to look into.

That said, this post is mostly about a particular voting system which has come back into the limelight recently, but also runs off on a few tangents about social choice theory and the assumptions behind it. I’m by no means an expert in the mathematical study of game theory and social choice theory, but I do take an observer’s interest in them.

A couple of years ago, during an election season, I wrote a post on Arrow’s theorem, which I believe received more comments than any other post I’ve made on this blog – which may only indicate that it’s more interesting than my usual subject matter, but I suppose is also a consequence of mentioning anything related to politics on the Internet. Arrow’s theorem is in some ways uncontroversial – nobody disputes that it’s true, and in fact the proof is pretty easy – but what significance, if any, it has for the real world can be controversial. I’ve known people who wouldn’t continue any conversation in which it was mentioned, probably for this reason.

On the other hand, voting systems are now in the news again, as they were when I made the last post (at least in Ontario, where there was a referendum on a proposal to switch to the Mixed Member Proportional system). Today it’s in the United Kingdom, where the new coalition government includes the Liberal Democrats, who have been campaigning for a long time (longer than the party has had that name) for some form of proportional representation in the British Parliament. One thing you’ll notice if you click that link and watch the video (featuring John Cleese), is that the condensed summary of how the proposed system would work doesn’t actually tell you… how the proposed system would work. It explains how to fill out a ballot (with rank-ordering of candidates, instead of selecting a single one), and says that the rest is up to the returning officer. But obviously, what the returning officer does with the ballot is the key to the whole affair.

In fact, collecting ordinal preferences (that is, a rank-ordering of the options on the table) is the starting point for any social choice algorithm in the sense that Arrow’s Theorem talks about. The “social choice problem” is to give a map that takes a preference order for each individual and produces a “social” preference order, using some algorithm. One can do a wide range of things with this information. Even the “first-past-the-post” system can start with ordinal preferences: this method just counts the number of first-place rankings for each option, ranks the one with the largest count first, and declares indifference to all the rest.

The Lib-Dems have been advocating for some sort of proportional representation, but there are many different systems that fall into that category and they don’t all work the same way. The Conservatives have promised some sort of referendum on a new electoral system involving the so-called “Alternative Vote”, also called Instant Runoff Voting (IRV), or sometimes the Australian Ballot, since it’s used to elect the Australian House of Representatives.

Now, Arrow’s theorem says that every voting system will fail at least one of the conditions of the theorem. The version I quoted previously has three conditions: Unrestricted Range (no candidate is excluded by the system before votes are even counted); Monotonicity (votes for a candidate shouldn’t make them less likely to win); and Independence of Irrelevant Alternatives (if X beats Y one-on-one, and both beat Z, then Y shouldn’t win in a three-way race). Most voting systems used in practice fail IIA, and surprisingly many fail monotonicity. Both possibilities allow forms of strategic voting, in which voters can sometimes achieve a better result, according to their own true preferences, by stating those preferences falsely when they vote. This “strategic” aspect to voting is what ties this into game theory.

In this case, IRV fails both IIA and monotonicity. This is connected to the fact that IRV also fails the Condorcet condition, which says that if there’s a candidate X who beats every other candidate one-on-one, X should win a multi-candidate race (a failure which, obviously, can only happen if the voting system fails IIA).

In the IRV algorithm, one effectively uses the preference ordering to “simulate” a runoff election, in which people vote for their first choice from $n$ candidates, then the one with the fewest votes is eliminated, and the election is held again with $(n-1)$ candidates, and so on until a single winner emerges. In IRV, this is done by transferring each vote for the discarded candidate to that voter’s next-ranked remaining candidate, recounting, discarding again, and so on. (The proposal in the UK would be to use this system in each constituency to elect individual MPs.)
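To make the counting procedure concrete, here is a minimal Python sketch of it. This is my own illustration, not the rule book of any actual election: the function name `irv_winner`, the `(weight, ranking)` ballot format, the alphabetical tie-break, and the small A/B/C profile are all assumptions made for the example.

```python
from collections import Counter

def irv_winner(ballots):
    """Return (winner, rounds) for a list of (weight, ranking) ballots.

    Each ranking is a tuple of candidate names, most preferred first.
    Ties for last place eliminate the alphabetically later candidate,
    an arbitrary choice made only for this sketch.
    """
    remaining = {c for _, ranking in ballots for c in ranking}
    rounds = []
    while True:
        # Count each ballot for its highest-ranked candidate still in the race.
        tally = Counter({c: 0 for c in remaining})
        for weight, ranking in ballots:
            top = next((c for c in ranking if c in remaining), None)
            if top is not None:
                tally[top] += weight
        rounds.append(dict(tally))
        total = sum(tally.values())
        leader = max(sorted(remaining), key=lambda c: tally[c])
        # Stop once someone holds a strict majority (or stands alone).
        if tally[leader] * 2 > total or len(remaining) == 1:
            return leader, rounds
        # Otherwise discard the candidate with the fewest votes and recount.
        loser = min(sorted(remaining, reverse=True), key=lambda c: tally[c])
        remaining.remove(loser)

# Hypothetical profile: 4 voters rank A > B > C, 3 rank B > C > A, 2 rank C > B > A.
ballots = [
    (4, ("A", "B", "C")),
    (3, ("B", "C", "A")),
    (2, ("C", "B", "A")),
]
winner, rounds = irv_winner(ballots)
print(winner, rounds)
```

In this made-up profile nobody has a majority in the first round, C is discarded, and C’s two votes transfer to B, who then wins 5 to 4.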

Here’s an example of how IRV might fail these criteria, and permit strategic voting. The example assumes a close three-way election, but this isn’t the only possibility.

Suppose there are three candidates: X, Y, and Z. There are six possible preference orders a voter could have, but to simplify, we’ll suppose that only three actually occur, as follows:

| Percentage | Choice 1 | Choice 2 | Choice 3 |
|---|---|---|---|
| 36 | X | Z | Y |
| 33 | Y | Z | X |
| 31 | Z | Y | X |

One could imagine Z is a “centrist” candidate somewhere between X and Y. It’s clear here that Z is the Condorcet winner: in a two-person race with either X or Y, Z would win by nearly a 2-to-1 margin. Yet under IRV, Z has the fewest first-choice ballots, and so is eliminated, and Y wins the second round. So IRV fails the Condorcet criterion. It also fails the Independence of Irrelevant Alternatives, since X loses in a two-candidate vote against either Y or Z (by 64-36), hence should be “irrelevant”, yet the fact that X is on the ballot causes Z to lose to Y, whom Z would otherwise beat.
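The head-to-head numbers are easy to check mechanically. The following sketch uses the three-row profile from the table above; the helper names `pairwise_support` and `condorcet_winner` are just labels I made up for this example.

```python
from itertools import permutations

# Profile from the table above: (percentage, ranking from first to last choice).
profile = [
    (36, ("X", "Z", "Y")),
    (33, ("Y", "Z", "X")),
    (31, ("Z", "Y", "X")),
]

def pairwise_support(profile, a, b):
    """Total weight of ballots that rank candidate a above candidate b."""
    return sum(w for w, r in profile if r.index(a) < r.index(b))

def condorcet_winner(profile):
    """Return the candidate who beats every rival one-on-one, or None."""
    candidates = sorted(profile[0][1])
    for a in candidates:
        if all(pairwise_support(profile, a, b) > pairwise_support(profile, b, a)
               for b in candidates if b != a):
            return a
    return None

# Support for the first candidate of each ordered pair in a two-way race.
head_to_head = {(a, b): pairwise_support(profile, a, b)
                for a, b in permutations("XYZ", 2)}

print(head_to_head)               # Z beats X 64-36 and beats Y 67-33
print(condorcet_winner(profile))  # Z
```

So Z wins both two-way races, and X loses both of the races it is in, 64-36 each time.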

This tends to undermine the argument for IRV that it eliminates the “spoiler effect” (another term for the failure of IIA): here, X is the “spoiler”.

The failure of monotonicity is well illustrated by the same example. X-supporters can get a better result for themselves if 6 of their 36 percent lie, and rank Y first instead of X (even though they like Y the least), followed by X. This would mean only 30% rank X first, so X is eliminated, and Y runs against Z. Then Z wins 61-39 against Y, which X-supporters prefer. Thus, although the X-supporters switched to Y – who would otherwise have won – Y now loses. (Of course, switching to Z would also have worked – but this shows that an increase of support for the winning candidate could actually cause that candidate to LOSE, if it comes from the right place.) This kind of strategic voting can happen with any algorithm that proceeds in multiple rounds.
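Here is a quick check of that arithmetic, as a sketch under stated assumptions: the “honest” profile is the three-row table from earlier, and the “strategic” profile moves 6 of the 36 X-first voters onto a Y > X > Z ballot. The function `irv` is a stripped-down counter written for this example, and it assumes there are no exact ties.

```python
from collections import Counter

def irv(profile):
    """Instant runoff on (weight, ranking) ballots; returns the winner.

    Keeps eliminating the last-placed candidate until one is left, which
    gives the same winner as stopping at the first majority. Assumes no
    exact ties, which holds for the two profiles below.
    """
    remaining = set(profile[0][1])
    while len(remaining) > 1:
        tally = Counter({c: 0 for c in remaining})
        for weight, ranking in profile:
            # Each ballot counts for its top choice among those remaining.
            tally[next(c for c in ranking if c in remaining)] += weight
        remaining.remove(min(tally, key=tally.get))
    return remaining.pop()

honest = [
    (36, ("X", "Z", "Y")),
    (33, ("Y", "Z", "X")),
    (31, ("Z", "Y", "X")),
]

# Six X-supporters insincerely move Y to the top of their ballots.
strategic = [
    (30, ("X", "Z", "Y")),
    (6,  ("Y", "X", "Z")),
    (33, ("Y", "Z", "X")),
    (31, ("Z", "Y", "X")),
]

print(irv(honest))     # Y
print(irv(strategic))  # Z
```

The only difference between the two profiles is that Y moved up on six ballots, and that is enough to turn Y from the winner into the loser of the final round.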

Clearly, though, this form of strategic voting is more difficult than the kind seen in FPTP – “vote for your second choice to vote against your third choice”, which is what usually depresses the vote for third parties, even those who do well in polls. Strategic voting always involves having some advance knowledge about what the outcome of the election is likely to be, and changing one’s vote on that basis: under FPTP, this means knowing, for instance, that your favourite candidate is a distant third in the polls, and your second and third choices are the front-runners. Under IRV, it involves knowing the actual percentages much more accurately, and coordinating more carefully with others (to make sure that not too many people switch, in the above example). This sort of thing is especially hard to do well if everyone else is also voting strategically, disguising their true preferences, which is where the theory of such games with imperfect information gets complicated.

So there’s an argument that in practice strategic voting matters less under IRV.

Another criticism of IRV – indeed, of any voting system that selects a single candidate per district – is that it tends toward a two-party system. This is “Duverger’s Law” (which, if it is a law in the sense of a theorem, must be one of those facts about asymptotic behaviour that depend on a lot of assumptions, since we have a FPTP system in Canada, and four main parties). Whether this is bad or not is contentious – which illustrates the gap between analysis and conclusions about the real world. Some say two-party systems are bad because they disenfranchise people who would otherwise vote for small parties; others say they’re good because they create stability by allowing governing majorities; still others (such as the UK’s LibDems) claim they create instability, by leading to dramatic shifts in ruling party, instead of quantitative shifts in ruling coalitions. As far as I know, none of these claims can be backed up with the kind of solid analysis one has with strategic voting.

Getting back to strategic voting: perverse voting scenarios like the ones above will always arise when the social choice problem is framed as finding an algorithm taking $n$ voters’ preference orders, and producing a “social” preference order. Arrow’s theorem says any such algorithm will fail one of the conditions mentioned above, and the Gibbard-Satterthwaite theorem says that some form of strategic voting will always exist to take advantage of this, if the algorithm has unlimited range. Of course, a “limited range” algorithm – for example, one which always selects the dictator’s preferred option regardless of any votes cast – may be immune to strategic voting, but not in a good way. (In fact, the GS theorem says that if strategic voting is impossible, the system is either dictatorial or a priori excludes some option.)

One suggestion to deal with Arrow’s theorem is to frame the problem differently. Some people advocate Range Voting (that’s an advocacy site, in the US context – here is one advocating IRV which describes possible problems with range voting – though criticism runs both ways). I find range voting interesting because it escapes the Arrow and Gibbard-Satterthwaite theorems; this in turn is because it begins by collecting cardinal preferences, not ordinal preferences, from each voter, and produces cardinal preferences as output. That is, voters give each option a score in the range between 0% and 100% – or 0.0 and 10.0 as in the Olympics. The winner (as in the Olympics) is the candidate with the highest total score. (There are some easy variations in non-single-winner situations: take the candidates with the top $n$ scores, or assign seats in Parliament proportional to total score using a variation on the same scheme.) Collecting more information evades the hypotheses of these theorems. The point is that Arrow’s theorem tells us there are fundamental obstacles to coherently defining the idea of the “social preference order” by amalgamating individual ones. There’s no such obstacle to defining a social cardinal preference: it’s just an average. Then again, it’s usually pretty clear what a preference order means, but it’s less clear for cardinal preferences, so the extra information being collected might not be meaningful. After all, many different cardinal preferences give the same order, and these all look the same when it comes to behaviour.

Now, as the above links suggest, there are still some ways to “vote tactically” with range voting, but many of the usual incentives to dishonesty (at least as to preference ORDER) disappear. The incentives to dishonesty are usually toward exaggeration of real preferences, that is, falsely assigning cardinal values to ordinal preferences: if your preference order is X > Y > Z, you may want to assign 100% to X, and 0% to Y and Z, to give your preferred candidate the strongest possible help. Another way to put this is: if there are $n$ candidates, a ballot essentially amounts to choosing a vector in $\mathbb{R}^n$, and vote-counting amounts to taking an average of all the vectors. Then assuming one knew in advance what the average was going to be, the incentive in voting is to pick a vector pointing from the actual average to the outcome you want.
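A toy illustration of the exaggeration incentive, with numbers invented purely for the purpose (scores from 0 to 10, three other voters, and one voter whose true order is X > Y > Z). I use totals rather than averages, which picks the same winner when the number of ballots is fixed.

```python
CANDIDATES = ("X", "Y", "Z")

def range_totals(ballots):
    """Sum the scores each candidate receives; a ballot is a score vector."""
    return {c: sum(b[i] for b in ballots) for i, c in enumerate(CANDIDATES)}

def range_winner(ballots):
    """Candidate with the highest total score (no ties in this example)."""
    totals = range_totals(ballots)
    return max(totals, key=totals.get)

# Hypothetical scores (X, Y, Z) from the rest of the electorate.
others = [(10, 4, 0), (3, 10, 5), (0, 8, 10)]

# Our voter's true preference is X > Y > Z.
honest = (10, 6, 0)       # a sincere score for the middle choice
exaggerated = (10, 0, 0)  # same order at the top, but Y pushed down to zero

print(range_totals(others + [honest]), range_winner(others + [honest]))
print(range_totals(others + [exaggerated]), range_winner(others + [exaggerated]))
```

With the sincere ballot the totals are X 23, Y 28, Z 15 and Y wins; with the exaggerated one they are X 23, Y 22, Z 15 and X wins. The voter never misstates who they like best, only how much they like the runner-up.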

But this raises the same problem as before: the more people can be expected to vote strategically, the harder it is to predict where the actual average is going to be in advance, and therefore the harder it is to vote strategically.

There are a number of interesting books on political theory, social choice, and voting theory, from a mathematical point of view. Two that I have are Peter Ordeshook’s “Game Theory and Political Theory”, which covers a lot of different subjects, and William Riker’s “Liberalism Against Populism” which is a slightly misleading title for a book that is mostly about voting theory. I would recommend either of them – Ordeshook’s is the more technical, whereas Riker’s is illustrated with plenty of real-world examples.

I’m not particularly trying to advocate one way or another on any of these topics. If anything, I tend to agree with the observation in Ordeshook’s book – that a major effect of Arrow’s theorem, historically, has been to undermine the idea that one can use terms like “social interest” in any sort of uncomplicated way, and to turn the focus of social choice theory from an optimization question – how to pick the best social choice for everyone – into a question in the theory of strategy games – how to maximize one’s own interests under a given social system. I guess what I’d advocate is that more people should understand how to examine such questions (and I’d like to understand the methods better, too) – but not to expect that these sorts of mathematical models will solve the fundamental issues. Those issues live in the realm of interpretation and values, not analysis.