
Don't Count Your Models Before They've Hatched

24 Oct 2006 01:44 pm

Ezra Klein touts a new statistical model forecasting "an expected Democratic gain of 32 seats with Democratic control (a gain of 18 seats or more) a near certainty." Ezra remarks, "All the usual disclaimers apply, but things would have to go mighty awry for this election to slip through the left's fingers."

Well, let me offer some disclaimers. One is that there are two kinds of models based on historical data. One kind looks at the historical data, devises a model that fits the historical data well, and then offers a prediction based on the model. That's what these guys have done. In another kind of model, you do the same thing and then, having offered your prediction, you wait for the election outcome and it turns out that your prediction was good. Then, next time around, the same thing happens. That is to say that in the second sort of situation your model is not only based on historical data, but has an actual track record of success. I'd be a lot more confident in a model with a track record, since there are actually any number of formulae that might fit the historical data well.

More concretely, I still worry that we might see a new al-Qaeda video aimed at tipping things toward the GOP. I wish more liberals were putting this worry, and Brad Plumer's argument about it, out there before it happens.

Comments (26)

Haven't you heard? The GOP was so worried that Osama wouldn't make a video to help them like he did in 2004 that the RNC went ahead and made a nice al-Qaeda recruiting video on their own. It was on all the news channels.

My only electoral model is despair.

Whatever.

agreed that we should be out ahead of the next al-Qaeda video.

Even more important, we should be out ahead of Bush's surprise bombing of Iran.

Maybe it won't happen. But I don't like the moves afoot with the Navy.

The researchers wouldn't have to wait until the next election to better evaluate the quality of their model. They could hold out a set of elections from the set from which they derive the model's parameters, then use the held-out set to fairly evaluate the quality of the model. (i.e. They could see how well their model predicts the outcomes in the held-out set.)

I haven't read the paper, so I'm not sure if they did this. If they did, I don't think your quibble stands. If they didn't, I'd wonder why not.

Actually, all models of this sort are purely based on historical data (since if we had future data, there wouldn't be much reason to build a model). Let's say you have all the historical data and your model. Common practice is to take some of the data out (in time series data, that would be the most recent observations--in this case, the most recent election cycle(s)), build your model on the remaining data, then see how well it works on the holdout data.

The 2nd kind of model you describe uses future data as its holdout. That's fine, but it means you have to wait a year before applying a model, which is totally unnecessary. The statisticians can just use the most recent election cycle as their holdout set and use it to test the model. Indeed, I would be a little surprised if they didn't do that.
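To make the holdout idea concrete, here is a minimal sketch, not the method from the paper under discussion. It assumes a one-variable linear model and uses made-up numbers in place of real election data; the names `cycles`, `fit_line`, and `holdout_errors` are hypothetical. It fits on the older cycles only and then checks the prediction against the most recent cycles, which the fit never saw.

```python
# Made-up data, oldest cycle first: (predictor, outcome).
# These are NOT real election figures; they only illustrate the procedure.
cycles = [(-4.0, -12), (2.0, 5), (-1.0, -3), (6.0, 20),
          (3.0, 8), (-5.0, -16), (1.0, 4), (7.0, 22)]

def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def holdout_errors(points, n_holdout=1):
    """Fit on everything except the last n_holdout cycles, then score on those."""
    train, test = points[:-n_holdout], points[-n_holdout:]
    intercept, slope = fit_line(train)
    return [abs((intercept + slope * x) - y) for x, y in test]

errors = holdout_errors(cycles, n_holdout=2)
print("absolute errors on the held-out cycles:", errors)
```

The held-out cycles play the role of "the future": the model's parameters are fixed before it ever sees them.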

ErnieP beat me to it--and said it better.

Whatever model is used may be thwarted by the coming election day antics. Screw-ups, both unplanned and planned, might significantly raise the noise floor for any useful signal they might glean from the model.

Today is October 24. Four years ago, Paul Wellstone's plane crashed on October 25.

Don't discount the possibility of "things going mighty awry" at the last minute.

Agree totally. I wish some of the rest of Left Blogistan would cease and desist with the victory laps and concentrate on GOTV. This thing will pivot on turn-out.

Matt, any way you could prevail on, say, myDD and firedoglake to get real?

A model that is validated using hold-out data is better supported than one that isn't. However, Matt's overall point still stands. Suppose I come up with thousands of different random models, and evaluate them all using hold-one-out cross validation. The one that scores the best will, in effect, have been "trained" on the hold out data.

In other words, in something as complicated as election forecasting, a model that is fit to past election results will end up being biased by factors particular to them. If the election process is described by stationary statistics, and you have enough historical elections to use in your training set, then the bias will be negligible. However, that is almost certainly not the case here.
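A toy simulation of the "thousands of random models" point, under loudly stated assumptions: outcomes are pure coin flips, each "model" is just a fixed random guesser, and a single held-out set stands in for hold-one-out cross validation to keep the sketch short. All names and sizes here are hypothetical. The model that scores best on the held-out elections looks skilled there, yet there is nothing for it to have learned.

```python
import random

N_VALIDATION = 20   # held-out past elections (toy number)
N_FRESH = 200       # elections nobody looked at during model selection
N_MODELS = 2000     # "thousands of different random models"

rng = random.Random(0)
# Coin-flip outcomes: by construction there is no signal to find.
validation_outcomes = [rng.randint(0, 1) for _ in range(N_VALIDATION)]
fresh_outcomes = [rng.randint(0, 1) for _ in range(N_FRESH)]

def accuracy(predictions, outcomes):
    """Fraction of predictions that match the outcomes."""
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    return hits / len(outcomes)

def random_model(seed):
    """A 'model' that guesses at random, but reproducibly for a given seed."""
    model_rng = random.Random(seed)
    val_preds = [model_rng.randint(0, 1) for _ in range(N_VALIDATION)]
    fresh_preds = [model_rng.randint(0, 1) for _ in range(N_FRESH)]
    return val_preds, fresh_preds

best_val_acc, best_fresh_acc = -1.0, None
for seed in range(1, N_MODELS + 1):
    val_preds, fresh_preds = random_model(seed)
    val_acc = accuracy(val_preds, validation_outcomes)
    if val_acc > best_val_acc:
        # Selection uses only the held-out score; the fresh score is just recorded.
        best_val_acc = val_acc
        best_fresh_acc = accuracy(fresh_preds, fresh_outcomes)

print(f"winner's accuracy on held-out elections: {best_val_acc:.2f}")
print(f"winner's accuracy on fresh elections:    {best_fresh_acc:.2f}")
```

The gap between the two printed numbers is the selection effect: picking the winner by its held-out score quietly turns the held-out set into training data.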

Maybe if you guys had a better foreign policy, voters would not run to the Republicans every time al Qaeda put out a new video.

Maybe if the GOP actually cared about our country instead of enriching themselves, they wouldn't be doing Bin Laden's work for him.

Every time the GOP says "vote for us or die," Osama smiles.

Just to restate my point to make it a little clearer: the problem is that if researchers used past elections as part of the process of developing their model, even if only in a hold-one-out validation procedure, then they will inevitably end up fitting their model to the validation data. This is because they will tweak the model parameters until the model does well on the validation data.

If, on the other hand, there was a past election that the researchers were completely unaware of, and then they tested their final model on that election, then that would be a true validation of the model, in just the same way that a future election will be.

Speaking from a data mining perspective. Say one had perhaps 100 elections in which you had sufficient data to be useful -- I'd say it's probably even less than that, as what you really want is a lot of pre-election polling, and that didn't come into widescale use until the last half-century.

In the simplest case, you simply take about 2/3rds of the data to train your model, then test it on the remaining third. (In actuality, you'd want to do some other stuff -- some elections are more 'interesting' than others and you'd want to expand your samples to account for that).

This doesn't overfit the data -- your model must accurately predict the training data AND the test data. I mean, it COULD overfit the data if you kept going back and nudging it, but I'm used to automated processes here. All I get out of test data is how well I predicted it. I don't get stuff like "I was too optimistic about GOP chances in elections where condition X was less than Y". Just "67% accurate" or "15% more Type II errors than Model XYZ".

It'd be difficult to create a 100% model on the test data unless you allow some really rigorous feedback -- the sort you explicitly allow for training data and disallow for test data.
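A rough sketch of the simplest case described above, with everything in it assumed for illustration: the elections are synthetic, the "model" is a single threshold on a made-up poll margin, and the names `elections`, `accuracy`, and `best_threshold` are hypothetical. About two thirds of the data is used to pick the threshold, and the remaining third gives back only one aggregate number.

```python
import random

rng = random.Random(42)

# Synthetic elections (made up): a poll margin and whether that side won.
elections = []
for _ in range(99):
    margin = rng.uniform(-10, 10)
    won = (margin + rng.gauss(0, 3)) > 0   # noisy link between margin and result
    elections.append((margin, won))

# Shuffle, then use two thirds for training and one third for testing.
rng.shuffle(elections)
cut = len(elections) * 2 // 3
train, test = elections[:cut], elections[cut:]

def accuracy(threshold, data):
    """Share of elections where 'margin > threshold' matches the actual result."""
    return sum((m > threshold) == won for m, won in data) / len(data)

# "Training": choose the threshold that scores best on the training set only.
candidates = [m for m, _ in train]
best_threshold = max(candidates, key=lambda t: accuracy(t, train))

# "Testing": a single summary score, with no per-election feedback.
test_accuracy = accuracy(best_threshold, test)
print(f"{test_accuracy:.0%} accurate on the held-out third")
```

As long as that one score is read once and not used to go back and retune the threshold, the test third stays an honest check.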

Maybe if you guys had a better foreign policy, voters would not run to the Republicans every time al Qaeda put out a new video.

Al, we'd still be the party of blacks, gays, atheists, and man-hating women.

SCMT: and latte-drinkers, Volvo-drivers, etc. But none of those things has anything to do with Matthew's expressed worry about an OBL video.

My only point is that you wouldn't have to worry about a late-arriving OBL video causing people to vote Republican if, you know, people didn't vote Republican when they see an OBL video. You might want to consider how to accomplish that.

On the other hand, it is entirely possible that OBL videos don't have any effect on people's voting at all... but that would ruin the (widely-held?) theory on the left that the late-breaking video in '04 caused Kerry to lose.

I think OBL counts as a brown person, and therefore a Democrat. We can't win on this issue until the Nazis come back. I think Bush might be overseeing the recreation of our Cold War enemies, but I don't know if he has time--only two years--to convince a reunified Germany to aggressively rearm. That might have to be left to McCain.

the (widely-held?) theory on the left that the late-breaking video in '04 caused Kerry to lose.

Strawman, from Al the Inveterate Fuckwit. But I didn't realise that the CIA's chief analysts were part of 'the left' (see The One Percent Doctrine, where Suskind's sourcing almost certainly is John McLaughlin.)

Everyone should make their own Osama video and post it to YouTube. Flood the zone!

Barone takes a different point of view.

http://www.usnews.com/usnews/opinion/baroneblog/archives/061024/the_house_elect.htm#more

The best political analysis is one part deduction, and seven parts ESP.

Bill Clinton is a psychic whore.

Al, if your guys had a better foreign policy OBL wouldn't be making videos. It's all irrelevant anyway, though, as it looks like the New Jersey Supreme Court will come through to save the day for the Republicans.

What BrklynLibrul said. Even October/November surprises aside, Dem cakewalks in recent history have had a tendency to fail to caketalk (or whatever). We need to get our voters to the polls. (Speaking of which, BL, there's a MoveOn phonebanking office in Brooklyn, email me if you're interested in making some calls...great way to have an impact outside of a district where the real election happened on 9/12.)

Whoops, forgot my email address: tps12@columbia.edu

It's GOTV, stupid! The repubs beat us cause we are lazy fcks.

but I don't know if he has time--only two years--to convince a reunified Germany to aggressively rearm.

I stand corrected.


Comments closed November 07, 2006.

Copyright © 2007 by The Atlantic Monthly Group. All rights reserved.