Improving my Democratic Primary Prediction Model and Mapping Sanders Support

I'm learning a lot and have made significant improvements to my model.

Notably, I've added a turnout variable, and I'm assuming that turnout will be proportional to Obama's presidential vote in 2012. This is likely flawed, but I don't have a better idea for how to predict turnout.
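To make the turnout assumption concrete, here is a minimal sketch of what "proportional to Obama's 2012 vote" could look like. The county names, vote counts, and statewide turnout total are all made up for illustration, and the function name `estimate_turnout` is hypothetical. This is not the actual code behind the model.

```python
def estimate_turnout(obama_2012_votes, total_turnout):
    """Split an assumed total turnout across counties in proportion
    to each county's Obama 2012 vote (illustrative assumption only)."""
    total_obama = sum(obama_2012_votes.values())
    if total_obama <= 0:
        raise ValueError("Obama 2012 vote total must be positive")
    return {
        county: total_turnout * votes / total_obama
        for county, votes in obama_2012_votes.items()
    }


# Hypothetical counties and vote counts, not real data.
obama_2012 = {"County A": 60000, "County B": 30000, "County C": 10000}

# Hypothetical statewide primary turnout to distribute.
turnout = estimate_turnout(obama_2012, total_turnout=50000)

for county, expected in turnout.items():
    print(f"{county}: {expected:.0f} expected primary voters")
```

Each county's share of primary turnout then simply equals its share of the 2012 Obama vote, which is exactly why the assumption is fragile: it ignores any change in enthusiasm or demographics since 2012.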

I also added FB likes by county. Interestingly, the FB likes by state are still statistically significant even with the county-level likes in the model.

I created a caucus-only model which has a much smaller confidence interval for its estimates (40% of the general model's interval).

Here are my predictions, alongside betting odds from Predictit. Note that Predictit gives the chance of winning, whereas I'm predicting vote share.

CT: Clinton 55 / Sanders 45 (Predictit: Clinton 62%)
DE: Clinton 64.5 / Sanders 35.5 (Predictit: Clinton 82%)
NY: Clinton 66 / Sanders 34 (Predictit: Clinton 85%)
MD: Clinton 72 / Sanders 28 (Predictit: Clinton 88%)
RI: Clinton 52 / Sanders 48 (Predictit: Sanders 56%)
PA: Clinton 57.5 / Sanders 42.5 (Predictit: Clinton 81%)
WI: Sanders 61.5 / Clinton 38.5 (Predictit: Sanders 71%)
WY: Sanders 76 / Clinton 24 (Predictit: Sanders 95%)

It is interesting how well my predictions fit with those of Predictit. Maybe I should include its odds in my model.

If you assume that one standard deviation is 15 percentage points, then there is a 31.7% chance that the result lands more than one standard deviation away from the estimate, and a 15.9% chance that it lands more than one standard deviation away in a given direction (e.g. far enough to flip a race from Clinton to Sanders). An 84% chance of winning therefore corresponds to a lead of about one standard deviation, so Predictit's 81-88% odds are rough estimates of how large people think one standard deviation is. If you average Clinton's predicted leads over 50% in those four races (DE, NY, MD, PA), you get (14.5 + 16 + 22 + 7.5) / 4 = 60 / 4 = 15 points. This means there is a 15.9% chance that a result deviates by 15 points or more in one direction, and a 2.3% chance that it deviates by 30 points or more, from the polls (or my estimates). With 50+ races, at least one is likely to deviate a lot. This might be an overestimate and/or a reflection of the polling failure in Michigan.
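The arithmetic above can be checked with a short script. This is only a sketch: it assumes prediction errors are normally distributed (which is where the 15.9% and 2.3% tail figures come from), and the helper names `upper_tail` and `win_probability` are my own for illustration.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def win_probability(share, sigma):
    """Chance the predicted leader ends above 50%, assuming the error
    on the predicted share is normal with standard deviation sigma."""
    return 1.0 - upper_tail((share - 50.0) / sigma)


# Clinton's predicted leads over 50% in DE, NY, MD, PA (from the list above).
leads = [14.5, 16, 22, 7.5]
sigma = sum(leads) / len(leads)  # assumed size of one standard deviation

one_sided_1sd = upper_tail(1.0)      # miss by 1 sd or more, one direction
two_sided_1sd = 2 * upper_tail(1.0)  # miss by 1 sd or more, either direction
one_sided_2sd = upper_tail(2.0)      # miss by 2 sd or more, one direction

# Expected number of races (out of 50) missing by 2 sd in one direction.
expected_big_misses = 50 * one_sided_2sd

print(f"sigma = {sigma:.1f} points")
print(f"P(miss > 1 sd, either direction) = {two_sided_1sd:.1%}")
print(f"P(miss > 1 sd, one direction)    = {one_sided_1sd:.1%}")
print(f"P(miss > 2 sd, one direction)    = {one_sided_2sd:.1%}")
print(f"Expected 2-sd misses in 50 races = {expected_big_misses:.2f}")
print(f"Implied win chance for a 57.5% share = {win_probability(57.5, sigma):.1%}")
```

The same `win_probability` helper gives a rough way to put my vote-share predictions on Predictit's win-chance scale, though the result depends entirely on the assumed 15-point standard deviation.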

Note that this model has a terrible record and has been outperformed by polls and other models. However this is the "new and improved" model that features a turnout variable. We'll see soon how the WI prediction holds. The model ignores polls and is based on demographics, FB likes, and past voting. I could easily be off by 5-10%. I'm hoping to be off by less than 5%.

On a practical note, what can Sanders supporters learn from this? I'm going out on a limb and saying that if you want Sanders to win, focus on recruiting FB likes. While this might be totally wrong, it is also possible that increasing FB likes causes the FB algorithm to increase the number of pro-Sanders stories in your feed, which in turn increases your support and enthusiasm.

Also I made a map!

Other Models
Tyler Pedigo

Related Sites
Bernie's Path to the Nomination

IN: Sanders 53.5, Clinton 46.5
KY: Clinton 56, Sanders 44
OR: Sanders 72, Clinton 28
WV: Sanders 52.5, Clinton 47.5