Tuesday, November 9, 2010

Polling in Minnesota Governor's races; part 2

This is part two of a series I began yesterday. Part one can be found here.

In yesterday's post I went through some past Minnesota elections and the polling associated with them. What I found is that polling in Minnesota Governor's races has exhibited a consistent DFL bias of 4-5 points on average.

I concluded the post with this:

There are two possibilities; either the polls themselves are wrong, or the polls are right and the GOP candidate is receiving a wave of support at the end of the race.

It's those possibilities that I will deal with today.

The case for the polls being wrong

Before the election I pointed out something in the polling of the Minnesota Governor's race that others had been seeing at the national level: automated polls tended to be more favorable to GOP prospects than traditional live interviewer polls.

Here's the final breakdown of automated versus live interviewer polls for this year's Governor's race:

[Table: final automated vs. live interviewer polls, 2010 Governor's race]

Prior to the election this looked like a significant GOP-leaning "house effect." Now that the election has taken place, we can say that it was the live interviewer surveys that had a significant DFL bias.

For as much heat as Rasmussen has taken for their polling this year, in Minnesota they were one of the more accurate pollsters. Certainly much more accurate than the Humphrey Institute or St. Cloud State.

The same is true of 2006: the automated polls fared much better.

[Table: automated vs. live interviewer polls, 2006 Governor's race]

And again, the DFL candidate's share isn't affected nearly as much by the type of poll; it is the GOP candidate that experiences about a 5-point drop-off.

Beyond being more accurate, the automated pollsters in the above lists also have much better pollster ratings than those in the live interviewer list, so we should expect them to produce better polls.

In fact, in both 2006 and 2010 the most accurate public poll released in the last month of the campaign was the final SurveyUSA poll, and, not coincidentally, SurveyUSA is the best-rated pollster to have polled those races.

The least accurate polls of those two elections came from the Humphrey Institute and St. Cloud State, which, also not coincidentally, are among the worst-rated pollsters.

Regardless, the relative inaccuracy of the live interviewer polls and pollsters shouldn't resolve itself into a consistent DFL bias. If it were just a case of bad polling, we would expect the live interviewer polls to overestimate the GOP vote share on occasion as well, but this hasn't happened.

Something else is going on besides bad polling.

One possibility: response bias

Fundamentally the difference between live interviewer polls and automated polls is that in the former you speak with an actual person who asks questions and records answers. In the latter the questions are asked by a computer and the responses are received by pressing digits on the phone.

Because of these differences, robo-polls have a lower response rate than live interviewer polls. When you're polled by a live interviewer firm, the person conducting the call can help persuade you to stay on the line for the entire survey, and people may be less inclined to hang up on an actual person.

When being polled by a computer some people will hang up part way through and there is no one there to convince them to complete the survey. This is just one way in which an automated poll can exhibit a response bias.

Another way:

What if voters are more likely to admit their tolerance for marijuana to an automated script, which may create the feeling of greater anonymity? Marijuana usage remains fairly stigmatized in polite society in America, enough so that even liberal politicians like Barbara Boxer, Dianne Feinstein, Jerry Brown and Barack Obama have refused to state their support for legalizing the drug. But as most Americans between ages 20 and 55 have smoked marijuana, they may not consider it such a big deal in the privacy of their homes -- or the privacy of the ballot booth.

In this theory Tom Emmer is marijuana, and the Minnesota electorate is embarrassed to tell a live interviewer that they support him.

I'm somewhat doubtful that this is the case. After all, why would it show up only in Minnesota Governor's races (only the most recent of which featured Tom Emmer) and not in Senate or presidential races?

Nonetheless, the relative accuracy of the robo-polls might have something to do with the response bias that goes with them.

Another possibility: likely voter screens

The polling in the 2004 and 2008 presidential elections and the 2008 Senate election was very accurate; the polling in 2006 and 2010, less so.

Could it be that the electorate that shows up in Minnesota mid-terms is substantially different from the electorate that shows up in presidential years?

In that case, the better pollsters may be better precisely because of their likely voter screens, and the automated/live interviewer difference I'm picking up may really be a difference in pollster quality.

Additionally, a ton of polls were conducted in the presidential years and the 2008 Senate race, and sheer quantity of data tends to wash out whatever problems individual pollsters have.

When you are dealing with a smaller amount of polling, however, as in the 2006 and 2010 races, the quality of that polling deserves greater weight. This dovetails into the next topic: the possibility that the polls are actually right, or at least some of them.

The case for the late GOP surge

Let's look at a smaller subset of polls from past Governor's races: those done by automated pollsters and conducted closest to the election.

[Table: automated polls conducted closest to the election, past Governor's races]

Using just the polls from automated pollsters done closest to the election gives us a much better idea of the outcome than including all the previous polls. This would tend to support the idea that the election itself is tightening down the stretch.
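To make that comparison concrete, here's a small Python sketch of the two averaging approaches. The poll numbers below are purely hypothetical stand-ins (the actual figures were in the tables above, which haven't survived); only the mechanics matter: a full-history average carries the live interviewer polls' DFL lean, while averaging just the final automated polls yields a tighter margin.

```python
# Hypothetical toplines, (DFL, GOP), purely for illustration.
all_polls = {
    "live interviewer A": (44, 34),
    "live interviewer B": (43, 36),
    "automated A":        (40, 38),
    "automated B":        (39, 38),
}
final_automated = {k: v for k, v in all_polls.items() if k.startswith("automated")}

def average_margin(polls):
    """Average DFL-minus-GOP margin across a set of polls."""
    dfl = sum(p[0] for p in polls.values()) / len(polls)
    gop = sum(p[1] for p in polls.values()) / len(polls)
    return round(dfl - gop, 1)

print(average_margin(all_polls))        # 5.0 -- inflated by the live polls
print(average_margin(final_automated))  # 1.5 -- a much tighter race
```

With these made-up numbers, the all-poll average shows a 5-point DFL lead while the final automated polls alone show 1.5, which is the shape of the pattern described above.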

Let's take another look at the last poll SurveyUSA did this year.

SurveyUSA (10/28, likely voters, 10/14 in parentheses):
Mark Dayton (D): 39 (42)
Tom Emmer (R): 38 (37)
Tom Horner (I): 13 (14)
Undecided: 6 (4)
(MoE: ±4%)

At the time, I wrote that the trend lines could just be noise within the margin of error. The other possibility, though, was that Emmer had in fact closed the gap in the two weeks since the previous poll, going from down five to down one.
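As a rough check on how much of that movement sampling noise alone could produce, here's a short Python sketch of the standard margin-of-error arithmetic. It assumes a simple random sample at 95% confidence; the sample size of roughly 600 is back-calculated from the reported ±4% MoE rather than taken from SurveyUSA's release.

```python
import math

def moe(n, p=0.5, z=1.96):
    """95% margin of error for a single proportion, simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

n = 600  # implied by the reported +/-4% margin of error
print(round(moe(n) * 100, 1))  # 4.0

# The change in one candidate's number between two independent polls of the
# same size has a margin of error about sqrt(2) times larger than a single
# poll's -- roughly +/-5.7 points here.
print(round(math.sqrt(2) * moe(n) * 100, 1))  # 5.7
```

Under those assumptions, Dayton's 3-point slide from 42 to 39 sits comfortably inside the ±5.7-point band for poll-to-poll changes, so "float within the margin of error" was a live possibility.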

There is also the internal POS poll the Emmer camp released showing a tied race, which anticipated the SurveyUSA poll that came out later.

If this is the case, and the race tightened significantly down the stretch rather than having been tight all along, then the live interviewer polls done a few weeks earlier were not as inaccurate as they look now.

In fact there is some evidence that this is true on a macro level, not just in Minnesota:

But there were more differences between polling in governors’ races and Senate polling than anticipated. A particularly noteworthy trait of the polls of governors’ races is the tendency in some cases for those polls to regress toward the mean.


This is all based –- like everything else in our models -– on historical evidence. In governors’ elections since 1998, for example, there have been 24 candidates in open-seat races who, 90 days before the election, had a lead of 10 points to 20 points in the polls (i.e., “about” 15 points). Three-quarters of these candidates –- 18 out of 24 –- underperformed their polling on Election Day. On average, they won not by 15 points, but by 9 (although only one actually lost).

In the end a poll is just a snapshot in time, so if the dynamics of the race are changing, a poll will quickly become irrelevant.

It's easy to point to reasons for these changing dynamics in 2002 and 2006; 2010, however, is a little trickier. In 2002 the Tim Penny campaign collapsed during October, and how that would play out on election day was probably still in flux down to the end.

In 2006 Mike Hatch melted down a week before the election and that provided an easy explanation for his failure to win. What's the explanation for 2010?

It may be that the things that looked like explanations for 2002 and 2006 were in fact hiding the real explanation. This is nothing more than speculation, but it's possible that the bulk of the undecided vote is Republican-leaning independents who are holding out to see if the IP candidate can gain some traction before finally committing to the GOP candidate.

Or it could simply be some combination of the factors discussed above; a tightening race, a dearth of quality polling, difference in likely voter screens, an unusual mid-term election cycle.

I don't have an answer to ldc's question wrapped up in a nice bow, but if I can take a few thousand words and not come to a satisfying conclusion, then I've done well.

With that, here's another tune to twitter away the afternoon with.


  1. "It may be that those things that looked like explanation's for 2006 and 2002 were in fact hiding the real explanation. This is nothing more than theory, but it's possible that the bulk of the undecided vote are Republican leaning independents who are holding out to see if the IP candidate can gain some traction before finally committing [reluctantly] to the GOP candidate."

    This seems logical to me.

  2. I should have phrased that differently. Saying it's theory implies I have some evidence, which I don't. Speculation would have been the correct word to use.