Is the PA "firewall" justified? A programmatic analysis (tldr: seems plausible as a "tie", but nothing to feel safe from - more of a necessary condition for a D win than a sufficient one?)
hangingonthetelephon is in Pennsylvania

Much has been made of Joshua Smithley's prediction of a 390k vote-by-mail (VBM) firewall for Kamala. It was originally framed as the margin at which VP Harris' supporters could start to feel confident in PA, but it has since been reframed as the "break even" point, and Smithley has further suggested that it will be "revised" up.

As far as I could tell, he did not indicate at all how he actually came up with that number, so it is hard to really say if it is justified or not. I decided to do some simple modeling to see if it is.

Methodology

We will take the "break even" interpretation: we model various scenarios for total ballots requested, total ballots returned for each party, how the returns break for each party (i.e. some D's return as R votes, etc), how the rest of the population turns out, etc, and use the modeled results to determine the election day margin Mr. Trump would need to tie (literally, not statistically) VP Harris in the election.

To do so, we will take priors over a variety of parameters. Because I have limited knowledge of these things, I used uniform-random priors with fairly wide ranges to capture a very diverse range of outcomes; however the code (linked at the bottom) is incredibly simple to edit, so feel free to update the priors.

  1. The total voting age population of Pennsylvania ~ U(1e7, 1.1e7)
  2. The total number of VBM ballots requested ~ U(1.8e6, 2.2e6)
  3. The fraction of VBM ballots requested by D-registered citizens ~ U(0.6, 0.75)
  4. The fraction of the remaining VBM ballots requested by R-registered citizens ~ U(0.8, 0.9)
  5. [Remaining ballots are I-registered citizens]
  6. The fraction of democrat-registered ballots returned (for any party) ~ U(0.6, 0.8)
  7. The fraction of republican-registered ballots returned (for any party) ~ U(0.55, 0.75)
  8. The fraction of I ballots returned (for any party) ~ U(0.5, 0.7)
  9. [note that I assumed a slightly higher D return rate]
  10. The fraction of returned-democratic ballots which are votes for Harris ~ U(0.8, 0.9)
  11. The fraction of remaining returned-democratic ballots which are votes for Trump ~ U(0.5, 0.9)
  12. [remaining returned democratic ballots are votes for third-party]
  13. The fraction of returned-republican ballots which are votes for Trump ~ U(0.8, 0.9)
  14. The fraction of remaining returned-republican ballots which are votes for Harris ~ U(0.2, 0.9)
  15. [Remaining returned republican ballots are votes for third-party]
  16. The fraction of returned-independent ballots which are votes for Harris ~ U(0.2, 0.9)
  17. The fraction of remaining returned-independent ballots which are votes for Trump ~ U(0.2, 0.9)
  18. [Remaining returned independent ballots are votes for third-party]
  19. [We now have enough information to deterministically compute the D VBM net total lead in votes]
  20. Election day turnout as fraction of population that did not request a VBM ballot ~ U(0.6, 0.8)
  21. The fraction of election day voters who vote third party ~ U(0.0, 0.05)
  22. [This means we now know the exact number of voters who are voting either D or R on election day, and can compute the election day margin Trump would need to hit to reach a perfect tie]

We perform the sampling above 40,000 times and compute the Dems' net lead in returned ballots, their actual VBM vote lead, and the election day margin Trump would need to achieve to tie. One motivation for doing it this way is that we don't need to take any priors on how the election day ballots split (except for the small one on third party votes cast).
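For anyone who wants to see the sampling procedure without opening the notebook, here is a minimal sketch of it in NumPy. It is a reconstruction from the list of priors above, not the notebook's actual code: the function name `simulate`, the seed, reading the ballots-requested prior as millions, and defining the break-even margin as the VBM vote lead divided by the number of D-or-R election day voters are all my assumptions.

```python
import numpy as np

def simulate(n_samples=40_000, seed=0):
    """Draw every prior n_samples times; return per-draw outcomes (hypothetical helper)."""
    rng = np.random.default_rng(seed)

    def u(lo, hi):
        # One independent uniform draw per simulated scenario.
        return rng.uniform(lo, hi, n_samples)

    population = u(1.0e7, 1.1e7)
    requested = u(1.8e6, 2.2e6)  # ASSUMPTION: the U(1.8, 2.2) prior is in millions

    # Requests by registration: D share first, R share of the remainder, rest are I.
    d_req = requested * u(0.60, 0.75)
    r_req = (requested - d_req) * u(0.80, 0.90)
    i_req = requested - d_req - r_req

    # Ballots actually returned, by registration.
    d_ret = d_req * u(0.60, 0.80)
    r_ret = r_req * u(0.55, 0.75)
    i_ret = i_req * u(0.50, 0.70)

    # Vote shares: a first share, then a share of what remains (the rest is third party).
    d_h = u(0.8, 0.9)
    d_t = (1 - d_h) * u(0.5, 0.9)
    r_t = u(0.8, 0.9)
    r_h = (1 - r_t) * u(0.2, 0.9)
    i_h = u(0.2, 0.9)
    i_t = (1 - i_h) * u(0.2, 0.9)

    harris_vbm = d_ret * d_h + r_ret * r_h + i_ret * i_h
    trump_vbm = d_ret * d_t + r_ret * r_t + i_ret * i_t

    returned_lead = d_ret - r_ret        # what the "firewall" number tracks
    vote_lead = harris_vbm - trump_vbm   # actual VBM vote lead

    # Election day: turnout among people who did not request a VBM ballot,
    # minus third-party voters, leaves the D-or-R election day electorate.
    ed_voters = (population - requested) * u(0.6, 0.8)
    two_party = ed_voters * (1 - u(0.0, 0.05))

    # ASSUMPTION: a tie needs Trump's election day lead to equal the VBM vote
    # lead, expressed here as a fraction of D-or-R election day voters.
    margin_needed = vote_lead / two_party
    return returned_lead, vote_lead, margin_needed

returned_lead, vote_lead, margin_needed = simulate()
print(f"median returned-ballot lead: {np.median(returned_lead):,.0f}")
print(f"median VBM vote lead:        {np.median(vote_lead):,.0f}")
print(f"median break-even margin:    {np.median(margin_needed):.1%}")
```

To try different priors, change the ranges passed to `u(...)`; everything downstream is deterministic given the draws.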

Results

With all that out of the way, let's take a look at what these priors yield:

https://imgur.com/rdjy9n3

The priors naturally result in Harris building a lead of about 360k to 530k via VBM (in terms of actual votes, not returned ballots!) and Trump needing around a 6%-9% victory in the *election day* vote to break even with Kamala. In the scatter plot, however, we can see an extremely clear correlation between the Democratic VBM actual-vote margin and the election day margin needed by Reps to break even. Every 100k actual votes that Democrats add to their VBM lead forces Republicans to increase their election day victory margin by 1.71%. A 390k lead corresponds to about a 6.6% margin on election day, give or take a percent or so.
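As a rough sanity check on that slope, the same relationship can be worked out by hand from the midpoints of the priors. The sketch below assumes the break-even margin is simply the VBM vote lead divided by the number of D-or-R election day voters, and that the ballots-requested prior is in millions; the variable and function names are mine, not the notebook's.

```python
# Midpoints of the uniform priors listed in the post.
population = 10.5e6     # U(1e7, 1.1e7)
vbm_requested = 2.0e6   # ASSUMPTION: U(1.8, 2.2) read as millions
ed_turnout = 0.70       # U(0.6, 0.8), among people who did not request VBM
third_party = 0.025     # U(0.0, 0.05), election day third-party share

# Election day voters who choose either D or R.
two_party_ed_voters = (population - vbm_requested) * ed_turnout * (1 - third_party)

def breakeven_margin(vbm_vote_lead):
    """Election day margin (as a fraction) Trump needs to cancel a VBM vote lead."""
    return vbm_vote_lead / two_party_ed_voters

print(f"two-party election day voters: {two_party_ed_voters:,.0f}")
print(f"margin per 100k of VBM lead:   {breakeven_margin(100_000):.2%}")
print(f"margin needed at a 390k lead:  {breakeven_margin(390_000):.2%}")
```

At these midpoints the margin works out to roughly 1.7% per 100k votes and about 6.7% at a 390k lead, which lines up with the fitted numbers from the scatter plot.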

However, keep in mind... the number that the firewall refers to is actually the returned ballots, not the actual vbm vote tallies... let's look at those plots:

https://imgur.com/V5N02Hn

In almost all scenarios, the Dems naturally reach a 390k lead in returned ballots over R returned ballots, suggesting my priors might be a bit aggressive. However, the margin correlation, though still strong, is quite a bit more uncertain: every 100k ballots added to the *returned* D-ballot lead only forces the R candidate to add an additional 1.28% to their election day margin of victory to tie. A 390k lead corresponds to forcing the R candidate to just a 5.1% lead on election day, but it could be as low as 3% or as high as 6.5% or so.
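One plausible reason the returned-ballot slope is shallower: a returned ballot is not a full net vote, because some of each party's ballots cross over or go third party. The sketch below is a heuristic at the prior midpoints, not the regression from the notebook. It ignores independents, assumes the ballots-requested prior is in millions, and uses names I made up for illustration.

```python
# Midpoints of the vote-break priors from the post.
d_for_harris = 0.85       # U(0.8, 0.9)
d_rest_for_trump = 0.70   # U(0.5, 0.9), share of the remaining D ballots
r_for_trump = 0.85        # U(0.8, 0.9)
r_rest_for_harris = 0.55  # U(0.2, 0.9), share of the remaining R ballots

# Net votes gained by a party per returned ballot from its own registrants.
net_per_d_ballot = d_for_harris - (1 - d_for_harris) * d_rest_for_trump
net_per_r_ballot = r_for_trump - (1 - r_for_trump) * r_rest_for_harris

# D-or-R election day voters at prior midpoints.
# ASSUMPTION: ballots requested is 2.0 million (midpoint of U(1.8, 2.2) in millions).
two_party_ed_voters = (10.5e6 - 2.0e6) * 0.70 * (1 - 0.025)

for label, net in [("D", net_per_d_ballot), ("R", net_per_r_ballot)]:
    slope = net * 100_000 / two_party_ed_voters
    print(f"{label}: {net:.3f} net votes per returned ballot "
          f"-> about {slope:.2%} of election day margin per 100k returned")
```

Each returned ballot is worth roughly three quarters of a net vote at these midpoints, which is about the ratio between the 1.28% and 1.71% slopes. The extra scatter in the returned-ballot plot comes from these break rates varying from draw to draw.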

Interpretation

To me, these priors (a) are already a bit aggressive in the leads they build for Harris through VBM, and (b) still produce margins that are pretty feasible for Trump to hit on election day. So it seems reasonable to think that if the Dems have a 390k lead in returned ballots, the race could be a tossup - but they really need to build up more than that to force a higher election day margin for Trump.

Code - try it yourself in a Jupyter notebook and tweak the priors!

Obviously I set a variety of priors here - you might have better numbers! Feel free to plug them in yourself and run the notebook to get new results.

https://colab.research.google.com/drive/1lNJp4L3EeNxQbZuH5ERYC1gyAV9i0D6i?usp=sharing

Edit

If anyone has Twitter, please tweet this at Smithley; I'm curious what he would use as inputs for the priors!

Comments
[not loaded or deleted]

Michael Dukakis chose Texas Sen. Lloyd Bentsen as his VP pick in 1988, with the expectation that Bentsen would help lock Texas for the ticket.

Bentsen was hugely popular in Texas, and because of a quirky law, he was able to run for reelection while also running on the Dukakis ticket. He won reelection to the Senate with 59 percent of the vote.

The Dukakis/Bentsen ticket got blown out in Texas, earning only 43 percent of the vote.

The vice presidential pick doesn't matter unless they're a disaster. Just as there is one universe where a Harris/Shapiro ticket wins Pennsylvania, there is one where they lose and people are wondering why this immensely popular Pennsylvania governor hitched his wagon to a losing cause.

[not loaded or deleted]

Could is the relevant term. This could also be a wash if Harris and Walz win. Or maybe not. Maybe Harris picks Shapiro and they lose because Harris was unpopular and made Shapiro unpopular because he had to defend unpopular positions.

The last time a vice presidential candidate was picked for his swing state value and it worked was when JFK picked LBJ for Texas, and there are many historians who argue Texas was delivered only because of fraud. Otherwise, VPs are picked for personality and fit.

State politics are just different from national. Larry Hogan was a popular Republican governor of blue Maryland. He's going to lose decisively in his bid for a Senate seat.

Posted
3 weeks ago