Gassy

08 Oct 2007 02:23 pm

[Figure: oil2.png, detail of the gas price and Bush approval chart, September 2005 to September 2007]

We all know that correlation doesn't prove causation anyway, but the issue I'd like to raise about the purported tight link between the price of gasoline and George W. Bush's approval rating is that it's hardly clear to me that there's even a correlation here beyond the basic fact that Bush's approval rating has generally gone down since 9/11 and oil prices have generally gone up.

Consider, if you will, the detail to the left. This shows the data from September of 2005 to September of 2007, a period during which the final price of gas was very close to the initial price, but Bush's approval rating fell by a small but clear amount. Nothing about eyeballing this chart would lead you to conclude that gas prices were driving changes in Bush's approval rating. Sometimes the two indexes move in the same direction and sometimes they move in different directions. But since each index can only go in one of two directions, one would expect totally unrelated quantities to move in the same general directions about half the time.
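The "about half the time" figure can be checked by simply listing the cases. The sketch below is an idealization rather than anything read off the chart: it assumes each index is equally likely to go up or down in a given period and that the two move independently. The names `DIRECTIONS` and `same_direction_probability` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumption: each index is equally likely to move "up" or "down" in a period,
# and the two indexes move independently of each other.
DIRECTIONS = ("up", "down")


def same_direction_probability():
    """Enumerate the four equally likely joint outcomes and count the matches."""
    outcomes = list(product(DIRECTIONS, repeat=2))  # (gas move, approval move)
    matches = [pair for pair in outcomes if pair[0] == pair[1]]
    return Fraction(len(matches), len(outcomes))


print(same_direction_probability())  # 1/2
```

Two of the four outcomes (up/up and down/down) are matches, so unrelated series agree in direction half the time under these assumptions.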

All that said, obviously we do know that economic conditions are one of several factors that affect presidential approval ratings and that gasoline prices are an important determinant of people's assessments of their economic well-being. We also know that sometimes gas prices go up because of events (Katrina, for example) that independently make the president look bad. But the initial formulation of the gas-approval chart is meant to show a very tight link between the two quantities, and it seems to me that the link just isn't there.

Comments (24)

I'm no statistician, but looking at the whole graph (not just the detail you've pulled out) it seems pretty clear to me that the two lines move as if one affects (but does not dictate) the other.

Matt, you really need to take a statistics course.

it's hardly clear to me that there's even a correlation here

This is something that you can test rigorously.

But since each index can only go in one of two directions, one would expect totally unrelated quantities to move in the same general directions about half the time.

This is what significance testing (among other approaches) is all about.

I do wish people would draw scatter plots when they are making statements about correlations. It's very difficult to accurately judge correlation from parallel time series.

(As an aside, I put much of the blame on Microsoft Excel - the default graph for a pair of time series is exactly what is shown in the graph. Doing anything else requires effort.)

If I remember my statistics right, if the odds of either plot moving up or down by itself are 1 out of 2 (1/2), then the odds of both going up at the same time are 1/2 * 1/2 = 1/4, and likewise 1/4 for both going down. So if the two measurements are totally independent of each other, the odds of them going in the same direction are 1/4 + 1/4 = 1/2, which is what you would expect to see from chance alone.

Why not do a true statistical analysis? The time is irrelevant to the correlation. For each point in time there is a gas price and an approval rating. Make the approval ratings the x-axis and the gas prices the y-axis. There are formulas for determining the correlation, but a plot like the one I described should help clarify things more. Obviously correlation and causation are different.
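For reference, the usual formula for this is the Pearson correlation coefficient. Naming it is an assumption here, since the comment does not say which formula it has in mind. With x_i the approval rating and y_i the gas price at time point i, n the number of time points, and bars denoting sample means:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

A value of r near +1 or -1 means the points in the scatter plot fall close to a straight line; a value near 0 means no linear relationship.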

I would suggest that they move in a correlated manner because they are both significantly influenced by the situation in Iraq.

As the situation gets worse, global oil markets price in more risk raising the prices. Independently the American public wonders what the F we are still doing there.

An alternate theory is that they both could be significantly influenced by the declining value of the dollar, which is caused by our budget/trade deficits. As the dollar gets weaker, the relative price of oil increases. As the dollar gets weaker, the American public realizes what a cluster-F we are in.

Mr Yglesias, I hate to beat a dead horse here, but it is very easy to address the problem you think you've identified. So easy, in fact, that it is the first thing anyone analyzing time series data will do. I would be surprised if someone hasn't already. The data can be de-trended, by subtracting from each data point the value of the immediately previous data point (such that your time series measures differences, rather than levels). Or an error correction model can separate long-term from short-term relationships. Just by eye-balling the data, I am pretty sure that you would find a correlation even if you eliminated the trend entirely. But like I said, this is a question that can be answered very easily with very simple computations. And my guess is that whoever created the graph you've posted has answered the question already.
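As a rough sketch of the first-differencing idea, the code below builds two synthetic series that share nothing except opposite time trends, then compares the correlation of the levels with the correlation of the period-to-period changes. All numbers are invented for illustration (they are not the gas or approval data), and `make_trending_pair`, `first_differences`, and `pearson` are hypothetical helper names.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def first_differences(series):
    """Replace levels with changes: value at t minus value at t-1."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def make_trending_pair(n=200, seed=0):
    """Synthetic data (an assumption): opposite trends plus independent noise."""
    rng = random.Random(seed)
    gas = [1.5 + 0.01 * t + rng.gauss(0, 0.1) for t in range(n)]
    approval = [70 - 0.2 * t + rng.gauss(0, 2.0) for t in range(n)]
    return gas, approval


gas, approval = make_trending_pair()
print("levels:     ", round(pearson(gas, approval), 2))
print("differences:", round(pearson(first_differences(gas),
                                    first_differences(approval)), 2))
```

In this made-up case the levels are strongly negatively correlated purely because of the trends, while the differenced series are close to uncorrelated. Real data could of course come out either way, which is the point of running the computation.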

Brendan Nyhan was all over this last year (click name for link).

Henwood's finding that "78% of the movement in Bush's ratings could be correlated with changes in gas prices" is based on an incorrectly specified statistical model. He used the logarithm of nominal gas prices to predict President Bush's approval rating. There are two problems with this, one conceptual and one technical. The conceptual problem is simple: 9/11. It boosted President Bush's approval ratings into the stratosphere, and they've more or less declined consistently since then. Meanwhile, gas prices have trended upward over Bush's presidency. The two series are correlated, but any variable that trended upward during this time period would show a similar relationship (I can "explain" 62 percent of the variance in Bush approval using a variable that just counts the number of months he has been in office). Second, the model is incorrectly specified (it suffers from what's called serial correlation, which means the errors at time t and time t+1 are correlated).
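To make "serial correlation" concrete, here is a minimal sketch that measures how strongly the error at time t is related to the error at time t+1. It uses simulated errors, not the residuals of the model discussed above; the helper names and the choice of a first-order autoregressive process with coefficient 0.8 are assumptions made for illustration.

```python
import random


def lag1_autocorrelation(residuals):
    """Sample correlation between each residual and the one that follows it."""
    n = len(residuals)
    mean = sum(residuals) / n
    deviations = [r - mean for r in residuals]
    numerator = sum(deviations[t] * deviations[t + 1] for t in range(n - 1))
    denominator = sum(d * d for d in deviations)
    return numerator / denominator


def ar1_errors(n, phi, seed):
    """Simulated errors where each value carries over phi times the previous one."""
    rng = random.Random(seed)
    errors = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        errors.append(phi * errors[-1] + rng.gauss(0, 1))
    return errors


# phi = 0.0 gives independent errors; phi = 0.8 gives serially correlated ones.
print("independent errors:", round(lag1_autocorrelation(ar1_errors(500, 0.0, 1)), 2))
print("correlated errors: ", round(lag1_autocorrelation(ar1_errors(500, 0.8, 1)), 2))
```

A lag-1 autocorrelation well away from zero in a model's residuals is a sign that the usual standard errors, which assume independent errors, cannot be taken at face value.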

We also know that sometimes gas prices go up because of events (Katrina, for example) that independently make the president look bad.

Even more significantly - both gasoline prices and Bush's popularity are tied to events in the Middle East.

The secular trend in gas prices is up, up, up, due to greater global demand.

The natural trend of GWB's approval ratings is down, down, down (except when some foreign policy event makes people want to believe their President knows what he's doing) because to know him is not to love him.

That said, I'm sure the high price of gas plays some role in discontent with Bush, as it did with Carter.

Ug, we need some bloggers that aren't afraid of math. Detrend the data, calculate the residuals, correlate those, and post the damned numbers. Then one could do some informed interpretation.

Stupid eyeball analyses of graphs really don't tell us much of anything.

Give me the data in a csv file and I'd be happy to do that.
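Absent the actual csv file, the recipe (detrend, take residuals, correlate) can at least be sketched on synthetic data. Everything below is an assumption for illustration: the detrending is a straight-line least-squares fit on the time index, the two series are invented, and they are deliberately built with a shared "shock" so that some correlation survives detrending.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def linear_detrend(series):
    """Residuals from a least-squares line fitted against time 0..n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    intercept = y_mean - slope * t_mean
    return [y - (intercept + slope * t) for t, y in enumerate(series)]


def make_pair_with_shared_shock(n=120, seed=0):
    """Synthetic series: opposite trends, a common shock, independent noise."""
    rng = random.Random(seed)
    gas, approval = [], []
    for t in range(n):
        shock = rng.gauss(0, 1)  # pushes gas up and approval down together
        gas.append(2 + 0.02 * t + 0.3 * shock + 0.15 * rng.gauss(0, 1))
        approval.append(60 - 0.4 * t - 5 * shock + 2.5 * rng.gauss(0, 1))
    return gas, approval


gas, approval = make_pair_with_shared_shock()
print("raw series:", round(pearson(gas, approval), 2))
print("residuals: ", round(pearson(linear_detrend(gas),
                                   linear_detrend(approval)), 2))
```

A straight-line trend is only one choice; a different trend model (or differencing) could give a different answer on real data.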

Every time I see a head like this one ("Gassy") on a Yglesias post, and especially on Monday, I assume that he is critiquing the Sunday news talk shows.

Even if they are correlated, I wouldn't draw any premature conclusions from the data.

For example, shorter people are known to live longer than tall people. So, should you cut off feet? Of course not.

Matt, your intuitions are basically correct. The comments by people who believe themselves to be statistically sophisticated are a great example of why a little knowledge can be a dangerous thing.

You are right that time series tend to be correlated even when there is no causal relationship between them at all. Standard statistical tests, which are based on the assumption of independent observations, can then be highly misleading. This is true whether the time series are "mean reverting" or "random walks." Showing a scatterplot, as some commenters suggest, will not solve this problem.

Taking first differences, as matt_c suggests, will generally only solve the problem if one of the time series is a "random walk," which I think is unlikely for these data.

(For what it's worth, I have a Ph.D. in statistics with a concentration in econometrics.)
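The claim that unrelated time series tend to look correlated can be illustrated with a small simulation. This sketch covers only the random-walk case (not the mean-reverting one), and the series length, number of trials, and the 0.5 cutoff for a "large" correlation are arbitrary choices made for illustration, as are the function names.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def white_noise(n, rng):
    """Independent observations: the case standard tests assume."""
    return [rng.gauss(0, 1) for _ in range(n)]


def random_walk(n, rng):
    """Each value is the previous value plus an independent step."""
    level, walk = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        walk.append(level)
    return walk


def share_with_large_correlation(make_series, trials=300, n=100,
                                 threshold=0.5, seed=0):
    """Fraction of independent pairs whose |correlation| exceeds threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        r = pearson(make_series(n, rng), make_series(n, rng))
        if abs(r) > threshold:
            hits += 1
    return hits / trials


print("white noise pairs: ", share_with_large_correlation(white_noise))
print("random walk pairs: ", share_with_large_correlation(random_walk))
```

Every pair here is independent by construction, so any large correlation among the random walks is spurious. Pairs of independent observations almost never cross the cutoff, while pairs of random walks do so a substantial share of the time.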

I like this graph: Linky

The comments by people who believe themselves to be statistically sophisticated are a great example of why a little knowledge can be a dangerous thing.

In defense of those critiquing Matt here, doing any statistical analysis on the data is superior to eyeballing the two time series. I see this figure a lot - used to suggest that gas prices are the dominant factor in Bush's popularity. This simply doesn't hold up under any real analysis - never mind common sense.

The reason the correlation seems so good at first glance is that Bush's ratings have steadily declined since 9/11, while gas prices have steadily risen. As the link Barbar provided illustrates, you can find lots of variables that follow that first order trend (including obviously nonsensical ones).

Ed, I'm sorry if I seemed to be oversimplifying the problem. I'm not sure what it is I've written that indicates I have only a little knowledge of statistics, or that what knowledge I have is dangerous. If I told you I was using the word "trend" in place of (the, yes, more generally applicable) "unit root" because I assumed that the latter was too technical a term for this discussion, would you let me have an opinion on this subject, too? Pretty please?

Note that first differencing the data wasn't the only suggestion I made.

My point was not that this was the only solution to the problem Yglesias seems to be so obsessed with, but that there are tools for dealing with these problems, and that the appropriate response is not "oh well, I guess we'll never know if they're correlated, because they may just be co-trending."

doing any statistical analysis on the data is superior to eyeballing the two time series

I emphatically disagree. Doing a misleading statistical analysis is worse than none at all.

matt_c, I guess I'm less sure than you are that the right answer is so easy to find through statistical methods. Yes, there are tools, and the tools are useful, but the tools themselves rest on certain unverifiable assumptions. You are quite right that a much more complete analysis could be done with these data that would shed light on this question. But it is not "very easy to address the problem" of time-series correlation, because the statistical properties of the underlying series are not known and can only be imperfectly estimated. There is a lot of art in there with the science. And I still think Matt's point was basically right, and fine for a blog post.

I emphatically disagree. Doing a misleading statistical analysis is worse than none at all.

But the statistical analysis is already being done - badly. It's the whole reason for the original plot: an attempt to prove a correlation between gas prices and Bush's approval rating. It's the point of the USA Today article and the point of Chris Bowers' lengthy discussion of the implications of the graph.

You're complaining that those commenting here ask for an actual (or at least better) analysis of the data. They are complaining that no actual analysis is even being attempted.

Matt makes a good point that the correlation doesn't look so good on close inspection. Anyone defending the original interpretation should be able to back it up statistically. Otherwise the plot is garbage.

Ed -- Matt's point was that if you get rid of the trend, you get rid of the correlation. My point is that this is not necessarily true. I'm still not sure what your point is. "Multivariate time series analysis is hard"?

Bill James has shown over the years that simple quantitative analyses with a lot of thought put into what you're actually analyzing in the real world are generally more effective than applying ultra-sophisticated statistical methodologies to poorly thought through models of reality. Bill James isn't a great statistician, but he knows a huge amount about baseball and has real good horse sense of how it works.

In contrast a lot of statistical wizards don't know much about anything else -- see for example my debate in Slate in 1999 with Steven "Freakonomist" Levitt over whether legalizing abortion lowered crime. His complicated model based on state data said it did while my simple reality checks based on national data said the results were sharply in the wrong direction, so I doubted his theory. Finally, six years later, two economists redid his model and discovered he'd made two fatal errors in it.

So, a better approach than thinking just about Bush would be to look at other Presidents' experiences. Gasoline shortages definitely hurt Carter in 1979, leading him to give his notorious "malaise" speech, and probably hurt Nixon in late 1973-74, although people were mad at him at the time for a lot of things. However, both of those traumas included not just rising prices but also long lines at the gas pump due to price controls.

There was a short sharp bump upward in the summer of 2000. Did that hurt Clinton and Gore? I don't know but somebody could look it up. What about the rise and fall in 1990-1991 -- what effect did that have on Bush? What about the fall to under $1 gas in 1986 when Reagan got the Saudis to open the spigots to bury the Soviets? I don't know what the results of a careful look at the data would show. I'd guess gas prices tend to have a modest effect on Presidential popularity, but that they don't play the dominant role with Bush's sinking approval ratings, which I suspect are driven largely by people getting to know him better.

Matt's point was that if you get rid of the trend, you get rid of the correlation.

More to the point, if you look at Bush's popularity figures, they're dominated by 3 events - 9/11, the invasion of Iraq and the capture of Saddam. Each time he had an immediate jump in approval followed by a nearly linear drop back to the mean. I don't think anyone can seriously argue the price of gas dominated his poll ratings during that time.

That brings us to about January of 2004. Now look at pollkatz's chart. Still see a trend?


Comments closed October 22, 2007.

Copyright © 2007 by The Atlantic Monthly Group. All rights reserved.