Predicting the Philadelphia District Attorney Primary - May 16, 2017

I have developed a very tentative prediction model for the Democratic primary in the Philadelphia District Attorney's race on May 16, 2017.

The incumbent, Seth Williams, dropped out amid a corruption scandal. This left the city with a wide-open race. The Democratic party establishment has not endorsed anyone. Instead, individual Democratic leaders have spread their support among the seven candidates.

My model is based on several weighted factors. The weights are somewhat arbitrary. For instance, I would give polling more weight if there were independent polls and more polls in general.

Note: I'm a Krasner supporter so my model may be biased by that fact.

The social media data points were measured on May 13, 2017.

1. Polling
50% weight
As there have not been any independent polls, I use the two internal polls that campaigns have released. There is evidence of bias in both: one had its candidate in first place, and the other had its candidate tied for first. In the past, Harry Enten (538) has examined Congressional polls and found that House polls had an average bias of 6% and Senate polls an average bias of 4.25%. That Congressional polling bias is for races that typically have two candidates and are closer to 50/50. As the top candidates in the two Philly polls received only 20%, I've decided to estimate a smaller poll bias of 2%. For example, in the Untermeyer campaign's poll, his score is reduced from 16% to 14%.
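As a rough sketch of that adjustment (the function name and data layout are my own illustration, not the formulas in my spreadsheet), the sponsoring candidate's score in an internal poll is simply reduced by the assumed 2 points:

```python
# Assumed bias, in percentage points, in favor of the campaign that released the poll.
POLL_BIAS_POINTS = 2.0


def adjust_internal_poll(results, sponsor, bias=POLL_BIAS_POINTS):
    """Return a copy of the poll with the sponsor's score reduced by `bias`.

    `results` maps candidate name -> poll percentage.
    Only the sponsoring candidate's number is changed.
    """
    adjusted = dict(results)
    adjusted[sponsor] = max(0.0, adjusted[sponsor] - bias)
    return adjusted


# The one figure given in the text: Untermeyer's own poll had him at 16%.
untermeyer_poll = {"Untermeyer": 16.0}
print(adjust_internal_poll(untermeyer_poll, "Untermeyer"))
```

The other candidates' numbers in that poll are left untouched in this sketch; whether to redistribute the 2 points among them is a separate judgment call.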

2. Ward Leader Endorsements
10% weight
According to the Philadelphia Weekly, the number of ward leader endorsements for six of the candidates was surprisingly similar (7-11). There are 65 ward leaders in Philadelphia, and they can be critical in turning out the vote, especially in primaries, where turnout is lower.

3. Google Search Trends
10% weight for searches in the past 30 days
5% weight for searches in the past 90 days

Krasner leads the search trends in both time frames (38.9% in the past 30 days and 32% in the past 90 days). El-Shabazz does very well in the 90-day time frame (his announcement story was big). O'Neill does very badly, with a zero rating for searches in both time frames.

4. Facebook Likes
10% weight
Krasner has 47% of the total Facebook likes. It is hard to say how much this is due to supporter enthusiasm versus advertising. Khan is in second place with 24%. Negrin does surprisingly poorly (especially considering how well he does on Twitter and YouTube) with only 2.5% of the FB likes.

5. Twitter Followers
5% weight
Negrin dominates this measure with 74.5% of the followers. He is the only candidate using a preexisting account; he has been tweeting regularly since 2009 and has built up a wide following.

6. YouTube Video Views
5% weight
I counted the number of views for the most popular video that related to the District Attorney's campaign and that featured the candidate (e.g., I excluded the debate videos). Negrin and Krasner have produced quality videos. Khan's video was low quality and had only 195 views. Negrin released his video in December 2016 and has received 4500 views. By contrast, Krasner released his video only a week ago and has received 1237 views. So counting the total number of views may underestimate Krasner's support.

7. Fundraising and Spending
5% weight
This includes the amount “spent” plus the “account balance” at the end of the last reporting period for the campaigns and any affiliated PACs. I used the data from
Krasner does very well thanks to the $1.45 million in the Soros-funded PAC, and Untermeyer does well due to self-financing.
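To show how the factors above combine, here is a minimal sketch of a weighted average using the weights listed in sections 1-7. The factor names and the example candidate's shares are hypothetical placeholders, not values from my spreadsheet:

```python
# Weights (in percent) as listed in sections 1-7 above; they sum to 100.
WEIGHTS = {
    "polling": 50.0,
    "ward_endorsements": 10.0,
    "google_30d": 10.0,
    "google_90d": 5.0,
    "facebook_likes": 10.0,
    "twitter_followers": 5.0,
    "youtube_views": 5.0,
    "fundraising": 5.0,
}


def weighted_score(shares, weights=WEIGHTS):
    """Combine one candidate's per-factor shares (in percent) into one score."""
    total_weight = sum(weights.values())
    return sum(weights[f] * shares[f] for f in weights) / total_weight


# Hypothetical candidate (made-up numbers): 10% in the polls, 30% on everything else.
candidate_x = {factor: 30.0 for factor in WEIGHTS}
candidate_x["polling"] = 10.0

print(round(weighted_score(candidate_x), 1))  # half the weight at 10, half at 30 -> 20.0
```

Because polling carries half the total weight, a candidate who is strong on every other factor can only move so far from their poll number.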

Overall Estimated Scores
Deni: 4.9%
El-Shabazz: 8.9%
Khan: 19.3%
Krasner: 26.5%
Negrin: 19.0%
O'Neill: 6.9%
Untermeyer: 14.5%

Confidence Interval: very large (who knows?)

Krasner benefits from strong FB, Google Search and fundraising numbers to score higher in this final rating than he does in the poll average. A major caution is that my model of the Clinton/Sanders primary showed that Sanders needed around 65% of the FB likes or Google Search Trends to be at 50% support in a state. So the unadjusted social media scores may create a less accurate final score than a pure polling average approach. We shall see on Tuesday night whether this model is accurate or not, though even then we won't know for certain, as the correlation (or lack thereof) could be due to bad or good luck.

My Data Spreadsheet and data sources

Updated Algorithm

I created an endorsements index based on a PhillyMag article from May 13.

I gave three points to major city politicians and ward leaders, two points to state/national politicians, and 1-3 points per organization, though almost all organizations got only 1 point (the FOP and the Inquirer get 3 points). Slightly over half of the total endorsement points came from wards.

Endorsement scores
Deni: 1
El-Shabazz: 30
Khan: 34
Krasner: 75
O'Neill: 62
Negrin: 60
Untermeyer: 28

Untermeyer gets 27 out of 28 endorsement points from wards, which is interesting given the rumor/anecdote that ward leader endorsements are traded/bought in exchange for campaign funding. Untermeyer has the second-best-funded campaign, but his endorsement base outside of the wards is very small. By contrast, the best-funded candidate, Krasner, has a much wider endorsement base.
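For comparison with the other factors, the endorsement points can be turned into percentage shares. This sketch assumes a simple points-divided-by-total conversion, which is my illustration of the idea rather than a formula copied from the spreadsheet:

```python
# Endorsement index points from the list above.
endorsement_points = {
    "Deni": 1,
    "El-Shabazz": 30,
    "Khan": 34,
    "Krasner": 75,
    "O'Neill": 62,
    "Negrin": 60,
    "Untermeyer": 28,
}

total_points = sum(endorsement_points.values())

# Each candidate's share of all endorsement points, in percent.
endorsement_shares = {
    name: 100.0 * points / total_points
    for name, points in endorsement_points.items()
}

for name, share in sorted(endorsement_shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.1f}%")
```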

The new algorithm replaces ward endorsements (10% weight) with this new endorsement index (15% weight). I get the additional 5% weight by reducing Twitter and YouTube to 2.5% weight each (which really hurts Negrin).
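A quick sanity check of that reweighting, with hypothetical factor names of my own choosing: the ward factor is swapped out, Twitter and YouTube are halved, and the weights should still sum to 100.

```python
# Original weights (percent) from the first version of the model.
original_weights = {
    "polling": 50.0,
    "ward_endorsements": 10.0,
    "google_30d": 10.0,
    "google_90d": 5.0,
    "facebook_likes": 10.0,
    "twitter_followers": 5.0,
    "youtube_views": 5.0,
    "fundraising": 5.0,
}

# Apply the changes described above.
updated_weights = dict(original_weights)
del updated_weights["ward_endorsements"]       # drop the ward-only factor (10%)
updated_weights["endorsement_index"] = 15.0    # add the broader index (15%)
updated_weights["twitter_followers"] = 2.5     # halved from 5%
updated_weights["youtube_views"] = 2.5         # halved from 5%

print(sum(original_weights.values()), sum(updated_weights.values()))
```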

The New Scores (including the updated Google Search trends for the past 30 days, as of May 15, 3:30pm)
Deni: 4.5%
El-Shabazz: 9.1%
O'Neill: 8.2%
Khan: 19.4%
Krasner: 28.1%
Negrin: 16.2%
Untermeyer: 14.5%

How did the model do?

Actual vote share in percent, with my prediction in parentheses:
Deni 1.46 (4.5)
El-Shabazz 11.62 (9.1)
Khan 20.35 (19.4)
Krasner 38.19 (28.1)
O'Neill 5.95 (8.2)
Negrin 14.25 (16.2)
Untermeyer 8.16 (14.5)

Overall, my total absolute error was 27.2 percentage points, or an average of 3.9 points per candidate.
Early in the night I was very happy, as my model had only 16 points of error, but as the vote came in, Krasner and El-Shabazz increased their shares, and Untermeyer and O'Neill fell.
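The error figure is the sum of the absolute gaps between each candidate's actual share and my prediction. A minimal sketch using the rounded numbers from the list above (it lands within about a tenth of a point of the total I quoted, presumably because of rounding):

```python
# (actual %, predicted %) for each candidate, from the list above.
results = {
    "Deni": (1.46, 4.5),
    "El-Shabazz": (11.62, 9.1),
    "Khan": (20.35, 19.4),
    "Krasner": (38.19, 28.1),
    "O'Neill": (5.95, 8.2),
    "Negrin": (14.25, 16.2),
    "Untermeyer": (8.16, 14.5),
}

# Absolute miss per candidate, in percentage points.
abs_errors = {name: abs(actual - predicted)
              for name, (actual, predicted) in results.items()}

total_error = sum(abs_errors.values())
mean_error = total_error / len(abs_errors)

print(f"total: {total_error:.1f} points, mean: {mean_error:.1f} points per candidate")
print("largest miss:", max(abs_errors, key=abs_errors.get))
```

More than a third of the total error comes from underestimating Krasner alone.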

It looks like Democratic turnout was up from around 65,000 in 2013 to 150,000. This probably wrecked the polling.
Krasner's ground game was probably far ahead of the others. This is hard to measure because I'm guessing that most campaigns won't reveal the details of their canvassing and get out the vote operation.

The only indicators of a big victory were 1) the level of enthusiasm I saw for this race (it was greater than for Bernie Sanders), 2) the ground game of Reclaim Philly and related allies, which looked very strong (though I'd never been involved in an election campaign before), 3) FB likes (Krasner had 47% of them), and 4) Google Search Trends. Notably, with Search Trends, Krasner was getting 52% in the past day and 39% in the past 30 days.

I think the high Google Search Trends were a result of canvassing, and TV ads played only a secondary role in generating them (this is why Untermeyer's search trends were not that high).

By contrast, Krasner's best poll result (of the two public polls), after weighting the undecideds, was a mere 25%.