Summary numbers (Updated October 27)

  • Voterfile-based predicted turnout rate among registered voters: Depending on assumptions, between 70% and 74.5%. The current 2020 tide estimate, weighted for the current number of registered voters, is 73.3%.

  • Expected number of voters: Depending on assumptions, between 5.48 and 6.01 million. The current 2020 tide estimate, weighted for the current number of registered voters, is 5.9 million.

The model-estimated error range (plus or minus 1.0 percentage point) is likely too small, especially considering the impact of the pandemic. But even if that error range is doubled, the 2020-based prediction is still confident that Michigan will break a record, with a lower bound far above 5.5 million voters.

These predictions are very high – way too high, honestly. But the same model and approach yielded similarly unrealistic-looking predictions before the November 2018 midterm and before the presidential primary this March, and these turned out to be, respectively, a slight underestimate and accurate. It is important to discuss why the model is making this assessment and then consider reasons to doubt the validity of its components.

Context

Michigan has seen a huge increase in registered voters since 2018 because of changes in the law passed by voters in Proposal 3. The current total of 8.05 million is a 7% increase over 2016 and 2018. This total is larger than projections of the Census voting-age population estimate of 7.87 million. I discuss below what to make of this bump.

Voter turnout in Michigan since 2018 has been unprecedented, with turnout records in:

  • 2018 August primary (2.2 million)
  • 2018 November midterm (4.3 million)
  • 2020 presidential primary (2.3 million, highest with an incumbent running for renomination in one party)
  • 2020 August primary (2.5 million, despite no competitive statewide race)

This translates into a much more experienced Michigan electorate: A comparison of voterfiles from September shows that nearly 57% of registered voters have voted in a statewide election since 2016. In 2016, the comparable figure was only 51% of registered voters. Election-year registrations in 2020 are lower than in 2016 (7.0% vs. 7.7%), but registrations over the last three years are not (27.6% vs. 26.9%).

Model-based Predictions

Model estimates of voter turnout are generated by looking at the turnout history of every registered voter in Michigan as listed in the state’s database in early September. The model estimates how age, address, and past voting behavior typically predict voter turnout. It then generates an individual-level (and jurisdiction-level) estimate of the probability of voting in 2020 and simulates an election across nearly 8 million observations. Details are discussed below.
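To make the simulation step concrete, here is a minimal Python sketch, not the actual model code. It assumes each voter's predicted probability is already in hand and treats voters as independent draws; the function name `simulate_turnout` and the six toy probabilities are hypothetical.

```python
import random

def simulate_turnout(probabilities, n_sims=1000, seed=0):
    """Simulate elections by drawing a vote / no-vote outcome for each voter."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        # A voter turns out when a uniform draw falls below their probability.
        totals.append(sum(1 for p in probabilities if rng.random() < p))
    return totals

# Hypothetical probabilities for a toy electorate of six registered voters.
toy_probs = [0.95, 0.90, 0.80, 0.60, 0.35, 0.10]
totals = simulate_turnout(toy_probs)
expected = sum(toy_probs)               # expected number of voters
mean_sim = sum(totals) / len(totals)    # average across simulated elections
```

With real inputs, the spread of `totals` across simulations would give a rough range for the number of voters, though it would ignore uncertainty in the probabilities themselves.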

Model estimates allow for a year-specific tide, and this is currently estimated from voter turnout in the March presidential primary and the August statewide party primary. Since there is some uncertainty over how the pandemic has changed these tides, the following table presents separate turnout estimates for tide estimates of 2020, 2018 (very high turnout tide), and 2016 (average or mediocre tide). It also presents observed estimates, which summarize across observations within the state voterfile as accessed from the Secretary of State in early September, and compares them to “updated” estimates, which weight new voters so that the total number of registered voters matches the Secretary of State’s current count.

2020 Turnout Prediction by Assumption

| Tide | Updated for Late Registrants? | # Registered (millions) | Turnout Rate | # Voters (millions) |
|---|---|---|---|---|
| 2020 | No | 7.818 | 0.7325975 | 5.728 |
| 2018 | No | 7.818 | 0.7417390 | 5.799 |
| 2016 | No | 7.818 | 0.7004790 | 5.477 |
| 2020 | Yes | 8.054 | 0.7359114 | 5.936 |
| 2018 | Yes | 8.054 | 0.7450701 | 6.010 |
| 2016 | Yes | 8.054 | 0.7036559 | 5.676 |

Note that even with a 2016 tide, voter turnout is expected to be much higher than in 2016. That is because the tide parameter represents the effect of a year conditional on a voter’s time-constant tendency to turn out. Since Michigan has a lot more registered voters who have previously voted, the model assumes such voters are more likely to vote regardless of tide. Tide effects matter more for predicting turnout among newly registered voters.

5.9 million, are you serious?

Yes, at least the model is. I think it is way too high. I would be more comfortable with a prediction of 5.5 million. But I am not going to change the model to get the result I want. The point of the model is to demonstrate what past patterns of behavior and our current history of turnout among Michigan’s registered voters mean for turnout this November. Those past patterns among the 8 million registered voters in Michigan suggest that 5.9 million would be the most consistent result.

And before I cast doubts on my own prediction, here are three things to consider.

  1. Caution in calling the state: There is a lot of consternation about how and when media outlets might call an election. Those calls require assessments and predictions of how many ballots will be cast in the state. Even if the model’s prediction is wrong, it is based on a method and logic that have been right in the past. Efforts to call an election need to recognize that record-high vote totals are probable. Extreme caution must be taken in assessing how many votes are enough to call which candidate won this state.

  2. The method and model’s been right (and me wrong) before: In 2018 this model and method predicted a turnout rate of registered voters of 57.1%. That rate was nearly four points higher than the previous modern high of 53.6% in 2006. I thought it was wrong. It seemed crazy to expect over 4.2 million voters in a midterm election. But, in the end, over 4.3 million ballots were cast and the turnout rate was within the model’s 90% error range, at 58.1%.

  3. Back-of-the-envelope calculations suggest equally high numbers: When 2006 set the record with 53.6% of registered voters voting in a midterm, 2008 followed with a new record of 67.5% of registered voters voting in a general election. With 58.1% of registered voters voting in 2018, that new record represents an 8.3% increase (a factor of 1.083) over the previous record in 2006. If that increase over the previous high carries over into the 2020 general election, the calculation becomes: 8.05 million registered voters in 2020, multiplied by the previous record turnout rate of 67.5% in 2008, multiplied by the 1.083 factor increase seen in 2018. That generates a similarly high estimate of 5.88 million voters in 2020.
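The back-of-the-envelope arithmetic in the third point can be checked in a few lines of Python. This is only a sketch of the calculation described above; the variable names are my own, and the factor is kept at the 1.083 used in the text (the unrounded ratio is slightly higher).

```python
registered_2020 = 8.05e6        # registered voters in 2020
record_midterm_2006 = 0.536     # previous midterm record (share of registered voters)
midterm_2018 = 0.581            # 2018 midterm turnout rate
record_general_2008 = 0.675     # previous general election record

# Ratio of the 2018 record to the 2006 record.
factor = midterm_2018 / record_midterm_2006
factor_used = 1.083             # value used in the text

projected_rate = record_general_2008 * factor_used
projected_voters = registered_2020 * projected_rate

print(f"ratio of records: {factor:.3f}")
print(f"projected turnout rate: {projected_rate:.1%}")
print(f"projected voters: {projected_voters / 1e6:.2f} million")
```

The projected rate comes out near 73%, which is why this crude calculation lands so close to the model's 2020 tide estimate.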

Why might it be wrong?

  1. Changes in voterfile maintenance: The Michigan Qualified Voter File (QVF) is considered by pollsters to be one of the worst voterfiles in the country, with a lot of “dead” references (voters who are no longer registered in the state). Republicans tend to be a little more active in pruning this file as the law (and time and money) allow. Before Secretary Jocelyn Benson, Michigan had not had a Democratic Secretary of State since Richard Austin left in 1995. Current model estimates may have that active pruning by Republican Secretaries of State baked into their estimates of the likelihood of repeat voting among “dead” listings. The current status of the QVF, and whether Secretary of State Benson has failed to be proactive in eliminating dead registrants, has been one source of partisan dispute between her and current State Senator Ruth Johnson, a former Secretary of State. The Secretary of State has been dealing with a lot of other challenges, so I am not passing judgment.

  2. Biden’s lead in polls and COVID spiking could reduce turnout to 2016 levels: Michigan suddenly emerged as a battleground state after Trump’s surprise victory in 2016. Voter turnout and campaigning have been intense in the years since. But polls now show a strong lead for Biden. And it seems some of the battleground House races have cooled down (Slotkin, Stevens). Additionally, the state has recently seen COVID-19 cases spike to levels not seen since April. The confluence of these circumstances could reduce voter interest to levels more typically associated with a presidential election in Michigan, like those Michigan saw in 2016.

Data will be updated on gitlab

This blog post may be updated next week, but in the meantime I will be updating estimates on my gitlab page: https://gitlab.com/smidtc-electionscience/2020predictions/-/tree/master

These data updates include spreadsheets with:

What can we learn?

Although the conventional wisdom says otherwise, political scientists find little evidence that higher turnout benefits Democrats over Republicans. This seems to hold for 2020 and Michigan.

Despite voterfile-based expectations of extremely high turnout, the balance of voters across the state continues to shift away from Democratic strongholds. This is largely because of the consistent decline of registered voters in Wayne County. This is good news for Republicans considering the counties with the greatest declines in population tend to support Democrats (Wayne, Genesee, Saginaw).

Using each county’s average presidential party split across 2012 and 2016 and then adding up the total split by observed and expected turnout, the Democratic party advantage in Michigan has shifted from a +4.9 D electorate in 2012, to a +4.3 D electorate in 2016, to an expected +4.1 D electorate in 2020.
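As a rough illustration of that calculation, the Python sketch below computes a turnout-weighted average of county splits. It assumes the "D-R Avg. Split" column is the Democratic share minus the Republican share, and it uses only five counties copied from the county table further down, so its result is for this subset and will not reproduce the statewide +4.1 D figure. The function name `weighted_split` is hypothetical.

```python
# (county, D-R average split, expected voters under the 2020 tide),
# copied from the county table; only a five-county subset.
counties = [
    ("WAYNE", 0.4211735, 910036.9),
    ("OAKLAND", 0.0809642, 857988.3),
    ("MACOMB", -0.0377365, 489932.0),
    ("KENT", -0.0538937, 403398.0),
    ("OTTAWA", -0.3254070, 186184.1),
]

def weighted_split(rows):
    """Average the county splits, weighting each county by its expected voters."""
    total_voters = sum(voters for _, _, voters in rows)
    return sum(split * voters for _, split, voters in rows) / total_voters

subset_split = weighted_split(counties)
print(f"D-R split for this subset: {subset_split:+.3f}")
```

Swapping the expected-voter column for observed 2012 or 2016 turnout while holding the splits fixed is what isolates the effect of a shifting electorate from any change in how counties vote.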

Not a younger electorate, but different, and in ways that hurt Trump.

Trends in registration and voter turnout suggest an increase in the relative composition of younger and older voters, and a relative decline in the voting power of middle-aged voters associated with Generation X. This is not good news for Republicans considering Trump performs best compared to Biden among this generation.

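Electorate Age Composition by Year (percent)

| Age Group | 2014 | 2016 | 2018 | 2020 |
|---|---|---|---|---|
| 18-29 | 6.0 | 13.6 | 11.7 | ~16.2 |
| 30-44 | 16.8 | 21.3 | 20.2 | ~21.0 |
| 45-59 | 30.6 | 29.1 | 27.8 | ~26.0 |
| 60+ | 46.6 | 36.1 | 40.3 | ~37.2 |
| Median age | 58 | 53 | 55 | ~53 |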

Why this big increase in young voters? Many of them registered and turned out already in 2018, and model predictions expect that most of them will be repeat voters in 2020. However, if the voterfile has not been updated like it has in the past to purge voters who left Michigan, it may also be the case that these young voters are not living in Michigan and, thus, will not be voting in the state.

Current predictions by Congressional District and County

Congressional District

With updating

District 2020 Tide 2018 Tide 2016 Tide
1 448682.1 454328.7 429745.5
2 439499.0 444477.8 420459.7
3 449001.4 454773.7 429815.7
4 411526.7 416690.4 393164.3
5 401435.9 407190.3 382122.1
6 420001.1 425577.2 400220.2
7 412629.9 417724.0 394604.7
8 492495.2 497396.0 473954.7
9 428504.8 433674.0 410591.9
10 426213.2 431310.9 408326.8
11 495855.2 500979.2 478802.2
12 424717.6 430204.3 405397.3
13 305812.8 311031.4 287282.3
14 380142.0 385333.8 361945.8

Without updating

District 2020 Tide 2018 Tide 2016 Tide
1 439476.0 444899.0 420933.4
2 417727.8 422506.2 399559.6
3 418450.8 423795.2 400550.4
4 394679.2 399677.6 377066.0
5 382615.2 387980.8 364299.0
6 406923.0 412316.6 387691.8
7 408152.2 413172.0 390311.0
8 449011.4 453496.0 432078.4
9 422789.4 427985.2 405028.4
10 424958.6 429973.4 407151.0
11 471593.2 476422.2 455345.2
12 419714.0 425142.0 400685.8
13 305955.8 311116.8 287587.6
14 366005.4 371045.4 348516.8

County

Updated to match current voter registration numbers

County Code Registered Voters D-R Avg. Split 2020 Tide 2018 Tide 2016 Tide
ALCONA 1 9848 -0.2902125 7188.675 7286.442 6904.731
ALGER 2 7575 -0.1149894 5762.427 5831.443 5513.699
ALLEGAN 3 93608 -0.2434571 70931.270 71865.460 67931.830
ALPENA 4 24792 -0.1702546 17219.200 17470.110 16478.750
ANTRIM 5 21976 -0.2550694 17086.230 17267.610 16415.910
ARENAC 6 12696 -0.1916405 8841.024 8966.906 8453.337
BARAGA 7 6686 -0.1858169 4205.141 4254.664 3995.385
BARRY 8 49633 -0.2577081 38219.260 38748.260 36568.910
BAY 9 84567 -0.0352517 62853.930 63658.280 60112.130
BENZIE 10 16421 -0.0897758 13048.320 13156.010 12576.920
BERRIEN 11 134643 -0.0964746 85467.580 86603.960 81112.670
BRANCH 12 33646 -0.2818953 21151.930 21444.630 19980.620
CALHOUN 13 106044 -0.0543427 69673.090 70719.520 66055.680
CASS 14 44464 -0.2182697 27596.530 28042.120 26079.770
CHARLEVOIX 15 23684 -0.1946924 18025.010 18260.940 17350.490
CHEBOYGAN 16 22550 -0.2153653 16514.000 16729.860 15808.080
CHIPPEWA 17 26096 -0.1592725 17407.870 17556.550 16576.630
CLARE 18 25166 -0.1832299 16032.090 16326.070 15236.310
CLINTON 19 62064 -0.0948706 54093.720 54663.660 52209.310
CRAWFORD 20 11982 -0.2195292 8071.455 8179.532 7586.786
DELTA 21 31226 -0.1602492 20634.420 20956.290 19663.120
DICKINSON 22 23287 -0.2845264 15651.080 15857.010 14858.840
EATON 23 87689 -0.0118613 69427.940 70157.660 66709.280
EMMET 24 30547 -0.1829157 24637.290 24875.370 23714.750
GENESEE 25 348485 0.1887308 244563.000 248169.600 232656.800
GLADWIN 26 21844 -0.2095885 14605.500 14818.730 13907.600
GOGEBIC 27 14278 -0.0341576 8935.357 9132.916 8509.917
GRAND TRAVERSE 28 80794 -0.1217784 67869.400 68742.410 65407.150
GRATIOT 29 28192 -0.1478703 19768.600 20036.760 18763.210
HILLSDALE 30 36397 -0.3553765 25102.080 25484.520 23851.440
HOUGHTON 31 26489 -0.1244981 20282.160 20520.630 19402.580
HURON 32 26148 -0.2651278 18887.660 19149.340 18074.720
INGHAM 33 213416 0.2741329 172145.700 174130.500 164741.500
IONIA 34 46747 -0.2193859 33983.030 34480.530 32248.210
IOSCO 35 22516 -0.1749577 16139.480 16341.620 15393.320
IRON 36 10447 -0.1864663 6939.030 7034.456 6626.799
ISABELLA 37 46082 0.0281694 31676.360 32143.110 29952.350
JACKSON 38 121916 -0.1295161 84900.480 86081.160 80861.860
KALAMAZOO 39 209953 0.1306618 167067.700 169051.300 159841.200
KALKASKA 40 15974 -0.3160465 11397.170 11530.580 10833.490
KENT 41 499140 -0.0538937 403398.000 408311.200 386673.600
KEWEENAW 42 2047 -0.1700355 1613.348 1628.236 1558.232
LAKE 43 9956 -0.0893137 5856.471 5958.503 5555.875
LAPEER 44 71810 -0.2493168 54952.420 55642.920 52714.700
LEELANAU 45 21344 -0.0478160 18136.920 18279.970 17636.930
LENAWEE 46 78797 -0.1118984 53977.250 54651.210 51460.530
LIVINGSTON 47 160715 -0.2648053 140668.500 141906.300 135759.600
LUCE 48 4550 -0.3223634 3290.389 3319.552 3141.602
MACKINAC 49 10095 -0.1972400 6858.570 6930.123 6554.991
MACOMB 50 690192 -0.0377365 489932.000 495986.400 468379.100
MANISTEE 51 20912 -0.0469941 15081.250 15283.540 14443.160
MARQUETTE 52 54605 0.0915083 41331.580 41928.660 39634.840
MASON 53 24647 -0.1348572 17668.880 17880.890 16900.640
MECOSTA 54 31073 -0.1799285 22352.350 22685.720 21300.780
MENOMINEE 55 20085 -0.1618183 12374.430 12531.070 11651.650
MIDLAND 56 69398 -0.1747673 55236.660 55899.610 52856.040
MISSAUKEE 57 12013 -0.4328780 9233.942 9313.159 8828.393
MONROE 58 130200 -0.1056908 86795.280 87988.890 82501.770
MONTCALM 59 47665 -0.2134246 32821.770 33212.000 31187.340
MONTMORENCY 60 8553 -0.3084663 6259.561 6309.156 6004.208
MUSKEGON 61 140645 0.0937126 93571.710 94877.930 88947.670
NEWAYGO 62 39623 -0.2855443 26860.770 27211.740 25602.690
OAKLAND 63 1031075 0.0809642 857988.300 866786.600 826976.900
OCEANA 64 21139 -0.1883908 13990.000 14130.300 13368.570
OGEMAW 65 17820 -0.2141595 12714.800 12880.770 12157.460
ONTONAGON 66 5732 -0.1762080 3970.013 4039.391 3792.150
OSCEOLA 67 18370 -0.3242425 13218.420 13384.670 12610.780
OSCODA 68 7274 -0.3025338 4709.462 4819.500 4510.304
OTSEGO 69 22956 -0.2868772 16395.390 16669.540 15624.740
OTTAWA 70 219733 -0.3254070 186184.100 187834.400 179183.700
PRESQUE ISLE 71 11522 -0.1872974 8503.561 8563.539 8165.522
ROSCOMMON 72 22920 -0.1671944 15204.980 15431.780 14538.320
SAGINAW 73 156235 0.0539384 114040.300 115439.700 108843.300
ST CLAIR 74 133900 -0.1929637 94626.360 95889.540 90242.130
ST JOSEPH 75 47225 -0.2164792 28627.860 29093.080 26937.610
SANILAC 76 31534 -0.3252763 21226.380 21456.780 20281.020
SCHOOLCRAFT 77 7202 -0.1773448 4546.849 4598.313 4309.236
SHIAWASSEE 78 55717 -0.0797335 42922.270 43513.040 40962.490
TUSCOLA 79 42738 -0.2421559 30295.360 30830.250 28944.170
VAN BUREN 80 60590 -0.0674517 41679.850 42308.260 39602.450
WASHTENAW 81 316759 0.3869044 244280.800 246868.600 234788.100
WAYNE 82 1400197 0.4211735 910036.900 923379.600 864239.900
WEXFORD 83 27366 -0.2587324 19080.760 19310.900 18093.770

Where does this prediction come from? An explanation and disclaimer

The State of Michigan provides a public record of registered voters in Michigan and their turnout history in the State’s Qualified Voter File (QVF). The data in this file include a registered voter’s name, age, address, registration date (for resident’s current jurisdiction), and voter history (for resident’s current jurisdiction).

Knowing a person’s age, registration date, and location provides only a limited amount of information for making a prediction. But knowing a person’s voter history allows some ability to classify voters by whether they are likely to vote in certain types of elections.

There remain some important qualifications to these predictions. Most importantly, this is not a model or prediction of campaign-driven turnout! These predictions set a baseline expectation for what turnout should be given prior history of turnout for current voters and how that history matches with three election-type factors: 1) gubernatorial vs. presidential election cycle; 2) ballot competition (proportion of prominent offices with two-party competition); and 3) year-specific tides. There is no information about polls, spending, or interest in ballot proposals. If turnout is higher or lower than what is predicted, then that is a good sign that the choices and campaigns either mobilized or dissuaded registered voters from turning out. Record high turnout in the 2018 August primary was somewhat expected because the primary had one of the highest levels of ballot competition Michigan had recently seen. But that was not the case in August 2020. Record high turnout is also, however, what this model uses to predict this year’s election tide. Hence, these predictions of a high tide in turnout may be off to the degree the general election campaign does not match the intensity of the March presidential primary campaign or the August statewide party primaries.

These predictions cannot fully account for newly registered voters. Since the Secretary of State doesn’t collect these and make them fully available until the final week of the campaign, it would be too late to wait to predict based on this file. I am making a guess that most of the increase in registration by county is similar to the set of newly registered voters observed in the September 1 QVF that I have available and adjusted for the typical young registered voter boost Michigan sees right before an election. This assumption is likely wrong, but it is all that can be done without updated data from the State. And the State Elections Bureau has more important things to do right now than to deal with more FOIAs.
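One simple way to picture this kind of adjustment is sketched below in Python. It is an assumption-laden illustration, not the procedure actually used: it scales up the newly registered voters observed in the file by a single weight so that the weighted count matches the current registration total, and it ignores the young-voter adjustment mentioned above. The function name and toy numbers are hypothetical.

```python
def updated_expected_voters(p_established, p_new, current_registered):
    """Reweight new registrants so weighted counts match the current total."""
    file_count = len(p_established) + len(p_new)
    gap = current_registered - file_count      # registrants missing from the file
    weight_new = 1.0 + gap / len(p_new)        # each observed new voter stands in for more
    expected = sum(p_established) + weight_new * sum(p_new)
    return expected, weight_new

# Toy county: four established voters, two new registrants in the file,
# and a current registration count of eight (all hypothetical).
expected, weight_new = updated_expected_voters(
    p_established=[0.9, 0.8, 0.7, 0.6],
    p_new=[0.5, 0.4],
    current_registered=8,
)
print(expected, weight_new)
```

The key assumption is that unobserved late registrants turn out at the same rate as the new registrants already in the file, which is exactly the guess flagged above as likely wrong.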

Model specifics

Define $y_{ijt}$ as the set of QVF observations of whether a registered voter $i$ in precinct $j$ turned out to vote in election $t$. Voter probabilities of turning out to vote are then modeled as follows:

$$
\Pr[y_{ijt} = 1] = \Lambda\big( x_{it}'\beta + z_{t}'\delta + Aug_t( \gamma_{1} + \zeta_{1j} + \nu_{1i} ) + Comp_t(\gamma_{2} + \nu_{2i}) + \zeta_{0j} + \nu_{0i} \big)
$$

where

  • $x_{it}$ is a set of time-varying individual-level predictors: the logarithm of age; the interaction of log age with August primaries; the interaction of log age with the midterm cycle; the logarithm of months since registration; and dummies for an initial observation, whether a voter had recently registered in that quarter, and whether that recent registration was during a presidential cycle.
  • $z_{t}$ is a set of election-specific indicators: whether the contest is a presidential primary; whether that election is during a midterm cycle; and the year of the contest.
  • $Aug_t$ is an indicator of whether the election is an August primary. The relationship between this variable and voter turnout is allowed to vary at the precinct level and at the individual level ($\zeta_{1j} + \nu_{1i}$). In other words, August turnout is assumed to be typically higher or lower in some precincts and among some individuals for reasons beyond what is in the model. This specification essentially allows some individuals and precincts to be high-turnout in November but not in August (and vice versa).
  • $Comp_t$ is a (logged) measure counting statewide and U.S. House contests with major party competition or, for August, with primary competition. This effect is allowed to vary by individual ($\nu_{2i}$). In short, it allows that some people show up to vote regardless of the amount of competition on the ballot, whereas other people’s likelihood of voting is responsive to the amount of competition on the ballot.
  • $\zeta_{0j}$ is a precinct-specific, time-constant parameter that captures that precinct’s unexplained tendency to have higher or lower registered voter turnout rates.
  • $\nu_{0i}$ is an individual-specific, time-constant parameter that captures that individual’s unexplained tendency to have higher or lower voter turnout rates.
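The pieces above combine into a single linear predictor that is passed through a link function. The Python sketch below shows that structure under the assumption that the link is the logistic function, consistent with the multilevel logit described in the next section. All parameter values are made up for illustration, and `fixed_part` stands in for the sum of the two covariate terms (the individual predictors and the election indicators).

```python
import math

def inv_logit(eta):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def turnout_probability(fixed_part, aug, comp, gamma1, gamma2,
                        zeta0, zeta1, nu0, nu1, nu2):
    """Probability of voting for one voter in one election."""
    eta = (fixed_part
           + aug * (gamma1 + zeta1 + nu1)   # August effect, varying by precinct and person
           + comp * (gamma2 + nu2)          # competition effect, varying by person
           + zeta0 + nu0)                   # precinct and individual intercepts
    return inv_logit(eta)

# Two hypothetical voters in the same precinct and November election,
# differing only in their individual intercept.
common = dict(fixed_part=0.2, aug=0, comp=1.6, gamma1=-1.5, gamma2=0.3,
              zeta0=0.1, zeta1=0.0, nu1=0.0, nu2=0.0)
habitual = turnout_probability(nu0=2.0, **common)
occasional = turnout_probability(nu0=-1.0, **common)
print(round(habitual, 3), round(occasional, 3))
```

Because the August indicator is zero in a November election, the August-specific terms drop out entirely, which is how a voter can be reliable in November yet rarely vote in primaries.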

Subsampling and prediction generation

Multilevel logit model estimates were performed over 20 separate subsamples of 100 randomly sampled precincts (weighted by population) with up to 50 randomly sampled voters within each precinct. Since the average voter has a voter history spanning ten past contests, there are about 50,000 observations for each of the 20 estimates. These parameters are then averaged and used to generate Empirical Bayes estimates of the precinct- and individual-specific error components ($\zeta$, $\nu$), using 5 integration points. These Empirical Bayes estimates are then combined with the model’s fixed portion to generate a prediction of each voter’s probability of turning out to vote in November.
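A minimal Python sketch of that sampling design follows. It is illustrative only and makes several assumptions: precincts are drawn without replacement, a precinct's population is proxied by its number of registered voters, and the toy voterfile is randomly generated. The function names are hypothetical, and the model fitting and Empirical Bayes steps are not shown.

```python
import random

def draw_subsample(precinct_voters, n_precincts=100, max_voters=50, seed=0):
    """Pick precincts with size-weighted probability, then cap voters per precinct."""
    rng = random.Random(seed)
    # Efraimidis-Spirakis keys: weighted sampling without replacement.
    keyed = sorted(precinct_voters,
                   key=lambda j: rng.random() ** (1.0 / len(precinct_voters[j])),
                   reverse=True)
    chosen = keyed[:n_precincts]
    return {j: rng.sample(precinct_voters[j],
                          min(max_voters, len(precinct_voters[j])))
            for j in chosen}

def average_estimates(estimates):
    """Average each parameter across the separate subsample fits."""
    return {name: sum(e[name] for e in estimates) / len(estimates)
            for name in estimates[0]}

# Toy voterfile: 300 hypothetical precincts with 20 to 400 voter IDs each.
toy_rng = random.Random(1)
toy_file = {j: [f"{j}-{i}" for i in range(toy_rng.randint(20, 400))]
            for j in range(300)}
subsamples = [draw_subsample(toy_file, seed=s) for s in range(20)]

# 100 precincts x up to 50 voters x about 10 contests per voter.
approx_obs = 100 * 50 * 10
```

Fitting on 20 modest subsamples and averaging keeps each estimation tractable, at the cost of ignoring some between-subsample variation in the parameters.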