An Expert Explains How You Should Read the Polls

It’s now less than a week until Election Day, and if you’re like most American news junkies, you’re watching your news feed, feeling like an impatient kid on December 21, shaking Christmas presents that might actually have coal in them. To make matters worse, the candidates don’t have much left to do or say at this point, so our unblinking eyes are mostly inundated with polling data. In the absence of candidate debates or hard news, all we can learn is who has an unexpected lead in a new poll, who just surged ahead in which swing state, and exactly how each new result can be projected prematurely on a speculative blue and red electoral map.

Polls are supposed to give us the answer to a simple question—is Hillary Clinton or Donald Trump more likely to win?—but when different polls answer that question differently, and when even different aggregation models are spitting out wildly different numbers (Thursday morning, FiveThirtyEight gave Clinton a 67 percent chance of victory, while the New York Times had her at 86 percent), scrutinizing the polls can make us feel like we know less than when we started.


Depending on which polls you read in the past few days, Trump is ahead nationally, just behind Clinton, or being trounced by Clinton. Trump is leading by seven points in North Carolina, and also tied in North Carolina.

So what gives? To walk me through this poll confusion, I got in touch with Princeton professor Sam Wang, who is a data scientist, a neuroscientist, and the operator of the Princeton Election Consortium blog. He told me how to crunch numbers if I want to do my own poll aggregation, and how to find an escape hatch from poll mania when things get too crazy.

VICE: I’m trying to understand what’s going to happen in this election, but FiveThirtyEight, the Upshot, the Huffington Post, and others all give me wildly different numbers. What’s going on?
Sam Wang: Different aggregators make different assumptions about how much the polls may be off in either direction, or how much opinion may swing in the next six days. Obviously Clinton is favored, but by how much?

At this moment, Huffington Post’s model says Clinton is leading with 47.8 percent to Trump’s 41.9 percent in the polls, which allows them to project a very probable victory for Clinton. Is FiveThirtyEight’s 68.9 percent probability of a Clinton victory a similar measure of the outcome?
No, that’s totally wrong. People are terrible at gauging probabilities. Anything in the 20–80 percent [range] is up in the air. A chance of one in five of the “surprising” outcome is like a game of Russian roulette. I think people get vote share and probability mixed up.

OK, so what’s the difference between vote share and probability?
If a candidate is polling at 60 percent to 40 percent, his win probability is basically 100 percent. As margins get larger, probability gets way larger. Even a 5 percent margin is pretty definitive if the aggregation is done well.
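
To make the distinction concrete, here is a rough sketch of how a polling margin might be turned into a win probability. It assumes the true margin is normally distributed around the polled margin, and it assumes an uncertainty of 3 percentage points. Neither that figure nor the function name `win_probability` comes from Wang or from any aggregator's model; both are illustrative only.

```python
from math import erf, sqrt

def win_probability(margin_pts, uncertainty_pts=3.0):
    """Chance that the leading candidate actually wins.

    margin_pts: polled lead in percentage points (for example, 5.0).
    uncertainty_pts: assumed standard deviation of the polling error,
        in points. The default of 3.0 is a placeholder, not a measured value.
    """
    z = margin_pts / uncertainty_pts
    # Standard normal cumulative distribution, written with erf.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Vote-share margins and the probabilities this toy model assigns to them.
for margin in (0.0, 2.0, 5.0, 20.0):
    print(f"lead of {margin:4.1f} points -> win probability {win_probability(margin):.3f}")
```

Under these assumptions a tie is a coin flip, a 5-point lead comes out at roughly 95 percent, and a 20-point lead (60 percent to 40 percent) is effectively certain. A larger assumed uncertainty pulls every probability back toward 50 percent, which is one way different aggregators can look at the same polls and report different odds.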

Aggregators forecast outcomes multiple times a day. Are the actual probabilities really changing all the time?
Ideally, a forecast shouldn’t change that much over time. Effectively, [FiveThirtyEight’s] forecast acts a lot like a snapshot of current conditions, even though they call it something else. Because they also have what they call a “Now-cast,” they basically have two snapshots, one blurrier than the other.

Sometimes a poll—like the ABC/Washington Post one from Tuesday—shows a sudden drastic shift in public opinion. What can I make of that?
Basically, single polls can always be off for two reasons: Taking a sample of voters can be a little off in one direction or another, simply by chance; also, each pollster has to apply judgement when getting the sample composition to match the pattern of people who will vote.
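
The first of those two reasons, chance error from sampling, can be estimated with the standard textbook formula for a simple random sample. The sketch below is not something Wang describes; the function name `margin_of_error` and the sample sizes are illustrative, and real polls carry additional error from weighting that this formula ignores.

```python
from math import sqrt

def margin_of_error(sample_size, share=0.5, z=1.96):
    """Approximate 95 percent margin of error, in percentage points,
    for one candidate's share in a simple random sample.

    share=0.5 is the worst case; z=1.96 corresponds to 95 percent confidence.
    """
    return z * sqrt(share * (1.0 - share) / sample_size) * 100.0

# Hypothetical sample sizes, chosen only to show how the error shrinks.
for n in (400, 1000, 2500):
    print(f"{n} respondents -> about +/- {margin_of_error(n):.1f} points")
```

Note that this is the error on a single candidate's share. The error on the gap between two candidates is larger, which is part of why one poll can show a swing that other polls do not.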

But when I dig into the details of how a poll got made, sometimes I feel less certain, like polls don’t mean anything at all. Should I be digging through all those details to get more or better information?
I didn’t say consumers should do it. The point is that pollsters apply their expertise. Individuals can just calculate the median, or rely on aggregators such as HuffPollster or whoever.

How do I calculate the median?
Arrange the polls in order of margin, and then take the middle value. For example, if results are Trump at plus 1 percent, Clinton at plus 2 percent, Clinton at plus 5 percent, then Clinton at plus 2 percent is likely to be closer to the truth than the other two values. Anybody can do this.
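
For anyone who would rather let a computer do the sorting, the same calculation can be sketched in a few lines of Python. The sign convention (positive for a Clinton lead, negative for a Trump lead) and the helper names are choices made for this example, and the numbers are the three from Wang's illustration, not real poll results.

```python
from statistics import median

def median_margin(margins):
    """Middle value of a list of poll margins, in percentage points.

    With an even number of polls, statistics.median averages the two
    middle values.
    """
    return median(margins)

def describe(margin):
    """Turn a signed margin into a readable label."""
    if margin > 0:
        return f"Clinton +{margin}"
    if margin < 0:
        return f"Trump +{-margin}"
    return "Tied"

# Trump +1, Clinton +2, Clinton +5, as in the example above.
polls = [-1.0, 2.0, 5.0]
print(describe(median_margin(polls)))
```

The median is used here instead of the average because one outlying poll barely moves it: replacing the Clinton +5 result with Clinton +15 would leave the middle value unchanged.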

What if I need to do the opposite—get just the minimum information I need to stay on top of the news?
For those who are anxious, they should go find a place where their efforts make the most difference. For example, at the Princeton Election Consortium I list which states have races where a few votes have the most power to shape the overall outcome. That’s not the presidential race—that is essentially settled unless there’s a double coastal tsunami. But the Senate is very much in play. People can go get out the vote in key races like New Hampshire, Nevada, North Carolina, Pennsylvania, Indiana, or Missouri. Or they can get out the vote in key House districts—I have an app for that on my website. The other way to get rid of stress is to ration your news intake to once a day. Maybe at the end of the day with a beer. And whatever you do, turn off the TV. That’s an unusually frustrating way to get information, repetitive and not thoughtful.

So just look at an aggregator, and only do it once a day?
Words to live by!

Follow Mike Pearl on Twitter.