
Garbage In, Double-Garbage Out

07 Dec 2007 04:12 pm

John Hollinger's playoff predictor at ESPN.com seems like an uncommonly dumb feature. It starts with his interesting but flawed PowerRankings formula, and then: "Based on those rankings, each day the computer plays out the remainder of the season 5,000 times to see the potential range of projected outcomes. The results reveal the most likely win-loss record for each team -- and how likely it is for each team to make the playoffs, win the NBA title, win the lottery, and so on."

All this serves to do, however, is exaggerate flaws in the original model by compounding them over and over again.


Comments (19)

Wait -- you mean you don't think Boston and Orlando are the most likely championship winners??

Running repeated randomized trials is a way to produce a probability distribution over the outcome, based on the underlying model. It will not exaggerate flaws in the original model. It just reveals the predictions of the model without those predictions being biased by random noise, which can happen if too few trials are considered.
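To make that concrete, here is a minimal sketch of the idea in Python. It is not Hollinger's method: the single team, the fixed per-game `win_prob`, and the `wins_needed` playoff cutoff are all simplifying assumptions made up for illustration. The point is only that more trials steady the estimate of what the model already implies; they do not change the model.

```python
import random

def simulate_playoff_odds(win_prob, wins_now, games_left, wins_needed,
                          trials=5000, seed=0):
    """Estimate P(final wins >= wins_needed) by replaying the rest of the season.

    All inputs are hypothetical: one team, a constant chance of winning each game.
    """
    rng = random.Random(seed)
    made_it = 0
    for _ in range(trials):
        # Each remaining game is an independent weighted coin flip.
        wins = wins_now + sum(rng.random() < win_prob for _ in range(games_left))
        if wins >= wins_needed:
            made_it += 1
    return made_it / trials

# Same model, different numbers of trials: only the noise in the estimate changes.
few = simulate_playoff_odds(0.55, 10, 64, 45, trials=50)
many = simulate_playoff_odds(0.55, 10, 64, 45, trials=5000)
print(f"50 trials: {few:.3f}   5000 trials: {many:.3f}")
```

If `win_prob` is wrong, both numbers are wrong in the same direction; the 5,000-trial figure is just a less noisy readout of the same assumption.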

There's nothing paradoxical about having two teams from the same conference being the most likely championship winners. If that conference is very skewed by having two dominant teams, then those teams may in fact be the most likely to win the championship because they are the most likely to get to the championship. This can happen even if they are not any better than the best team in the other conference.

I was wondering when you were going to blog on this. It's kinda stupid. Some teams have a 0% chance of making the playoffs. I have a 0% chance of making the playoffs. But not the Clippers. They have some sort of chance.

It's not clear how running what is basically a statistical Monte Carlo exercise exaggerates or compounds the flaws.

The results of the exercise have exactly the same flaws as the original, but the exercise allows us to translate his predictions into actual outcomes. Thus, we can directly evaluate the benefits and flaws of his system in terms of wins and losses instead of an abstract "Power" number. In that sense, it's actually a much better feature than raw Power Ranking numbers.

Ok, this has to go down as one of your worst blog posts ever. I expected the link to the Power Rankings to go to some previous post of yours explaining why they were bad. It doesn't. So you just claim the rankings are "flawed" with no further explanation.

While I can think of many things I'd do to the ranking system to make it better, overall I think it is pretty good. We know point differential is better at predicting future outcomes than win/loss. It takes opponents' records into account.

Overall, I think this is a pretty cool use of computer simulation that actually sheds light on how well teams are playing right now.

Also, there is regression to the mean built in, and frankly, the final standings make sense to me even if the playoff odds don't.

The problem for the teams in the West is that, ultimately, there are 9 legitimately good teams in the West.

San Antonio, Phoenix, and Utah are the top 3 teams for sure. The other six are really good, and have the potential to add a big player.

In the East there are two very good teams right now in Orlando and Boston, and then two good teams in Detroit and Cleveland.

However, the real flaw is that it doesn't take injuries into account. What Hollinger ought to be doing right now is regressing the future standings on his 2007-2008 projections and not just to the mean. This is what Baseball Prospectus does with its PECOTA adjusted playoff odds reports.

What everyone else said about Monte Carlo analysis.

And actually, doing a Monte Carlo analysis seems like it could potentially improve the PowerRankings formula, because it allows you to do sensitivity analysis. That is, you can subtly adjust the input parameters, and see whether the adjustments actually make a difference to the predicted outcome.

I don't know whether this would actually help in this instance, because I don't know squat about PowerRankings, but it seems like an obvious thing to do.
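A rough sketch of that kind of sensitivity check, in Python. Everything here is a stand-in: `win_prob` plays the role of whatever input the PowerRankings formula would feed the simulation, and the 64 games left and 35-win cutoff are invented numbers. Reusing the same random seed for every run means any change in the output comes from the changed input rather than from simulation noise.

```python
import random

def playoff_odds(win_prob, games_left=64, wins_needed=35, trials=5000, seed=1):
    """Hypothetical one-team season simulation with a constant win probability."""
    rng = random.Random(seed)  # identical draws on every call with the same seed
    hits = 0
    for _ in range(trials):
        wins = sum(rng.random() < win_prob for _ in range(games_left))
        hits += wins >= wins_needed
    return hits / trials

baseline = 0.55  # assumed input, not taken from any real ranking
results = {}
for delta in (-0.02, 0.0, 0.02):
    p = baseline + delta
    results[delta] = playoff_odds(p)
    print(f"win_prob={p:.2f} -> playoff odds {results[delta]:.3f}")
```

If a small nudge to the input swings the playoff odds a lot, the headline percentages deserve less trust than their decimal places suggest.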

What Joel W. said...

MP: The "0 percent chance" of the Clippers making the playoffs actually means "less than 0.5% chance" which is quite believable.

Duh, sorry, actually he says "0.0% chance" which means "less than 0.05% chance", which is also believable. His model won't be more precise on how fantastically unlikely it would be, because he isn't saying more than the fact that it didn't happen in a few thousand simulations. I would think the probability gets less than one in a gazillion well before the Clippers are actually mathematically eliminated.
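For a sense of scale, here is a small Python sketch of how much "it never happened in the simulations" can actually tell you. It assumes the event came up zero times in 5,000 independent runs (the displayed "0.0%" could also hide one or two hits) and asks for the largest true probability still consistent with that at 95% confidence.

```python
def upper_bound_zero_hits(trials, confidence=0.95):
    """Largest probability p still plausible after 0 hits in `trials` independent runs.

    Solves (1 - p) ** trials = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / trials)

bound = upper_bound_zero_hits(5000)
print(f"0 hits in 5,000 runs: true chance could still be up to about {bound:.2%}")
```

That works out to roughly 0.06%, close to the familiar "rule of three" approximation of 3 / 5000. So 5,000 runs cannot distinguish one-in-two-thousand from one-in-a-gazillion.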

Matt, based on your stated logic, you should be equally skeptical of global warming climate models, since they use all sorts of Monte Carlo analysis in order to determine the likely range of warming, sea level rise, ice pack density, humidity, etc etc.

Now, back to Basketball, what would be interesting is if you could take Hollinger's model, and run it against historical data, and see if it properly predicts who ends up in the playoffs and who wins, for each of the last 30 years. If it could successfully do that, it would be a great model, your snark aside.

Now, of course, you might find that it comes up with some unusual results. You might even have to add some "fudge factors" to different metrics each year, in order to ensure that it would properly pick the winner in 2000, 2001, 2002, etc. Most distressingly, you might have to pick different metrics to fudge each year, essentially adding an arbitrary and hindsight biased "fix" to the data for each year.

If you did that for every year in order to make the historical models fit the historical data, then I'd say that the model was pretty crappy. Why would it be able to predict this year or future years, when it can't predict previous years without lots of tinkering? That's the definition of a bad prediction model.

Thank goodness that there's no fudge factors in any of our climate models.
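The historical check proposed above could be scored with something like a Brier score, sketched here in Python. The forecasts and outcomes are made-up numbers purely for illustration, not real Hollinger predictions or real results. To avoid the fudge-factor problem, the score should come from seasons that were not used to tune the model.

```python
def brier_score(forecasts, outcomes):
    """Mean squared gap between forecast probabilities and results (1 = made the playoffs)."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Invented toy data: four teams' playoff odds and whether each team got in.
forecasts = [0.9, 0.7, 0.2, 0.1]
outcomes = [1, 1, 0, 0]

model = brier_score(forecasts, outcomes)
coin_flip = brier_score([0.5] * len(outcomes), outcomes)
print(f"model: {model:.4f}   always-50%: {coin_flip:.4f}")
```

Lower is better. A model that cannot beat the always-50% baseline on seasons it never saw is not telling you much, however many times it is simulated.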

Fuck stats.

Have you people seen the Iverson games the past two nights?

Why only 5,000 simulations? Coolstandings does 1,000,000 simulations every day for baseball, which has a longer season. 5,000 replications will still leave a fairly substantial margin of error, probably on the same order as the "true" day to day fluctuations due to wins and losses. Doesn't ESPN have access to any newfangled computers?
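The simulation noise alone is easy to put a number on. This Python sketch uses the standard error of a proportion, sqrt(p(1-p)/N), and assumes independent runs and a true probability near 50%, which is the worst case. It says nothing about errors in the underlying rankings, only about the noise from running a finite number of simulations.

```python
import math

def mc_standard_error(p, trials):
    """Standard error of a Monte Carlo probability estimate from independent runs."""
    return math.sqrt(p * (1.0 - p) / trials)

for n in (5_000, 1_000_000):
    se = mc_standard_error(0.5, n)
    print(f"{n:>9,} runs: about +/- {1.96 * se:.2%} at 95% confidence")
```

That comes to about 1.4 percentage points for 5,000 runs versus about 0.1 for a million.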

Was I the only one who followed the link to playoff predictor and was stunned to find Detroit as a lock for the playoffs? I mean, they're 6-6, on a four-game losing streak, with two games against Dallas and Green Bay still to go. They'll be lucky to go 8-8, and that won't be good enough for a wildcard bid this year, especially now that the NFC seems to have found its feet again.

(I hate this time of year, when people start wasting their time with games that aren't football, even though football is being played.)

Umm, coolstandings does it w/ basketball too, which I didn't know. Obviously, it's a useful comparison:

http://www.coolstandings.com/basketball/

Coolstandings seems to have a heavier regression mechanism.

ed,

Doing 1,000,000 replications would not likely produce a much smaller margin of error than doing 5,000. It's different than increasing the sample size in a survey. All it would do is smooth out the probability distribution over what is already probably something quite smooth; you might get slight changes in predicted values, but the width of the confidence band is determined by the data available to make the prediction, not by how many times you predict it.

Was I the only one who followed the link to playoff predictor and was stunned to find Detroit as a lock for the playoffs? I mean, they're 6-6, on a four-game losing streak

You're not taking into account the trade for Cabrera and Willis.

Coolstandings doesn't take the strength of the schedule already played into account, so I'd say Hollinger's rankings are probably more accurate, at least this early in the season.

What is this football thing? Isn't that what they play in Europe? Kinda like Quidditch but without broomsticks?

I agree that the predictor is dumb at this point of the year, but in the final weeks, it has the potential to be awesome.

It reminds me of the Baseball Prospectus postseason odds that were awesome at the end of the baseball season this year.

http://www.baseballprospectus.com/statistics/ps_odds.php

Of course there will be flaws, but ultimately, it is a hell of a lot better than Stephen A. SMITH screaming about something he knows nothing about.


Comments closed December 21, 2007.

Copyright © 2008 by The Atlantic Monthly Group. All rights reserved.