Presidential polling data again misses the mark in 2020

Just as they did in 2016, the election polls underestimated support for Donald Trump. Although he didn't defeat Joe Biden, the result was closer than the polls predicted.

Unlike 2016, when presidential polling data and the predictive models it feeds got the general election wrong, the polls got the overall result right in 2020.

According to the website, which tracks each individual poll, Democratic president-elect Joe Biden was expected to defeat President Donald Trump. When Pennsylvania was called in Biden's favor on Saturday, Biden surpassed the 270 Electoral College votes needed to clinch the election, and he will be the 46th president of the United States.

But while the presidential polls fed analytic models that properly predicted the overall outcome, they still missed their mark in many cases by underestimating the percentage of voters who would vote for Trump.

Even in the popular vote, the tally was closer than expected: Biden was projected to win by about 8 percentage points but now leads by 3, with a small percentage of votes still being counted.

Had a couple of closely contested states ultimately gone for Trump rather than Biden, the data feeding the predictive models would, as in 2016, have been flawed just enough to produce an election surprise.

"The most important thing in polling is getting the call right; the bonus is by how much," said Michael Cohen, founder and CEO of Cohen Research Group and an adjunct professor at Johns Hopkins University and the University of California, Washington Center. "As an industry, we got the call right in Biden, but the spread was outside the margins of acceptable error for national polling."

Just as the polls missed blocs of voters who swung the election in Trump's favor in 2016, they missed blocs of Trump voters in battleground states again, particularly in Pennsylvania and the Midwest states of Michigan, Ohio and Wisconsin.

Four years ago, Clinton was predicted to win Michigan and Wisconsin, while Ohio and Pennsylvania were considered toss-ups that would go narrowly for Trump. Instead, Trump won all four states. In addition, Clinton was expected to narrowly win Florida and North Carolina, which also went for Trump, and Trump ultimately won 304-227 in the Electoral College.

Presidential polling data in 2020 showed Biden winning five of those six states -- all but Ohio -- but again underestimated the turnout for Trump.

The difference in 2020, however, was that Biden's expected margin of victory was bigger in most of those states than Clinton's had been in 2016. Though Trump performed better in each of those states than the polls showed, Biden was still able to win Michigan, Pennsylvania and Wisconsin.

An average of the seven most recent polls in Michigan leading up to the election showed Biden ahead by 5.4%. In Pennsylvania, it was 3.7%. And in Wisconsin, an average of the five most recent polls had Biden up 9.2%.

When the votes were counted, however, Biden won Michigan by 2.7%, Pennsylvania by 0.6% and Wisconsin by 0.7%. Florida and North Carolina, meanwhile, are both expected to go for Trump, who won Florida by 3.4% and leads in North Carolina by 1.4%.

In Ohio, an average of the seven most recent polls showed Trump winning by 0.9%. Instead, he won by 8.2%.
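
The size of each miss comes from subtracting the final margin from the polling-average margin. The following is a minimal Python sketch using only the figures quoted above. The sign convention (positive means Biden ahead) and the names `polled`, `actual` and `polling_miss` are assumptions made for illustration, not part of any pollster's method.

```python
# Margins in percentage points, taken from the figures quoted in this article.
# Assumed sign convention: positive = Biden ahead, negative = Trump ahead.
polled = {"Michigan": 5.4, "Pennsylvania": 3.7, "Wisconsin": 9.2, "Ohio": -0.9}
actual = {"Michigan": 2.7, "Pennsylvania": 0.6, "Wisconsin": 0.7, "Ohio": -8.2}


def polling_miss(state: str) -> float:
    """Return how many points the polling average overstated Biden's margin."""
    return round(polled[state] - actual[state], 1)


for state in polled:
    print(f"{state}: polls overstated Biden's margin by {polling_miss(state)} points")
```

Under that convention every value is positive, which means the miss ran in the same direction in all four states.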

In each instance the polls overestimated Biden's percentage and underestimated Trump's, whether the presidential polling data was near or inside the margin of error, as in Michigan and Pennsylvania, or well outside it, as in Wisconsin and Ohio.
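
For context on what a margin of error represents, a commonly used approximation for a simple random sample at 95% confidence is z * sqrt(p * (1 - p) / n). The sketch below is illustrative only: the function name `margin_of_error` and the sample size of 1,000 are hypothetical and are not reported for any poll in this article. Real polls apply weighting that can widen the margin.

```python
import math


def margin_of_error(sample_size: int, share: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error, in percentage points, for one candidate's share.

    Assumes a simple random sample; share=0.5 gives the widest (most cautious) margin.
    """
    return 100 * z * math.sqrt(share * (1 - share) / sample_size)


# Hypothetical sample size, chosen only for illustration.
print(round(margin_of_error(1000), 1))
```

This figure applies to a single candidate's share. The uncertainty on the gap between two candidates is larger, roughly double in a close two-way race.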

One reason for the underrepresentation of Trump voters in the polls may simply be how difficult it is to get people to respond to pollsters, said David Paleologos, director of the Suffolk University Political Research Center in Boston.

"There is always the idea of a 'shy voter,' i.e., the person who supports a candidate but won't admit to it on surveys, or -- possibly more significantly -- doesn't trust polls or institutions and therefore won't respond to them or participate in them, which could skew the numbers in one direction or another," he said.

In addition, Paleologos continued, polling, while no longer limited only to phone calls and now employing emails and texts as well, still somewhat relies on cold calling potential respondents.

"There are always challenges in capturing people who don't answer landlines or don't answer their [cell]phones at all to numbers that they don't recognize or trust," he said.

Beyond voters who don't admit to supporting a candidate or simply don't respond to a poll request, in 2016 the misses were blamed in part on a lack of polling of non-college educated voters and a lack of polling in the states that ultimately swung the election for Trump in the days leading up to the election. The polls in those states were largely done weeks before the election and missed the late surge in Trump's favor.

Polls in 2020 attempted to correct for those mistakes, but other lessons are already emerging from 2020.

Among them, it's clear that there's no single Latino voting bloc Democrats can count on.

In Florida, where there's a large Cuban American population in the Miami area, nearly half of all Latino voters went for Trump. In Arizona, meanwhile, where Mexican Americans make up the bulk of the Latino population, 63% of all Latino voters went for Biden.

"It's crucial that we delineate within the Hispanic demographic," Cohen said. "Cubans and Mexicans are termed as Hispanic, but the Obama administration's opening of Cuba caused a backlash in South Florida, something Biden should have seen coming and didn't really have an answer for in the campaign."

In the Midwest, meanwhile, polls conducted by organizations that poll nationally were once again unable to account for Trump voters. Some of the polls conducted by local organizations, however, proved more accurate at the state level.

In Iowa, for example, an average of the five most recent polls showed Trump leading by 1.6% heading into the election. Four of the five -- including polls from Emerson College in Boston and North Carolina-based Public Policy Polling -- had either Trump or Biden up by just 1%. A poll by Selzer & Company, based in Des Moines, Iowa, however, showed Trump ahead by 7%. He won the state by 8.2%.

"The national media polling organizations need to be given bigger budgets to hire expertise within battleground states so the spread of their calls can be much closer," Cohen said. "But it's expensive."

And cost is indeed a hindrance.

Polling is mostly conducted by media outlets and universities, and each has limited resources. As much as each polling organization would perhaps like to conduct polls each day leading up to the election in battleground states and have people on the ground in each state to lend a level of expertise, that's just not realistic. They need to selectively allocate their limited resources.

"We always remind people that polls are a snapshot in time, and that they are only so useful as a predictor," Paleologos said. "Polls are a tool, and an important one, but they're an imperfect one, subject to a poll's margin of error."
