Chicago Voting Data Analysis
Voting fraud and Benford’s law:
I wanted to replicate a few of the plots created in this video about Benford’s Law and voting fraud. I’m doing this mostly because it is fun, and because I want my research students to try it too. I’m not a political scientist, and I’ll leave the interpretation of these data to experts. To summarize the point of the video, I quote the following passage from Deckert et al. (2011):
“Benford’s Law is problematical at best as a forensic tool when applied to elections.”
On to the video
I will also list here a few of the expert sources pointed out in the video. If you want an expert viewpoint, start there.
- Check out Steve Mould’s Numberphile video about Benford’s Law.
- There’s more on Mark Nigrini’s work here:
- “Benford’s Law and the Detection of Election Fraud” 2011 paper. (Requires access to this journal. ECU has access as of the date of this writing.)
- And for balance, here is a paper critical of that other paper (but only of its use of a ‘second digit’ check; it does not dispute the main Benford’s Law claims).
- And here is a paper by the same author specifically about the 2020 US election results.
What is Benford’s law?
Benford’s law states that, for data spanning multiple orders of magnitude, the first digit \(d\) follows the probability distribution \(P(d) = \log_{10}\left(1+\frac{1}{d}\right)\) for \(d = 1, \ldots, 9\). A leading 1 is therefore expected about 30.1% of the time, and a leading 9 only about 4.6% of the time. We are going to look at that with voting data in Chicago.
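To make the formula concrete, here is a minimal Python sketch that tabulates the expected first-digit probabilities. The function name `benford_probability` is just an illustrative choice, not something from the video or the papers.

```python
import math


def benford_probability(d: int) -> float:
    """Expected probability that the leading digit equals d under Benford's law."""
    if not 1 <= d <= 9:
        raise ValueError("the leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Expected probability for each possible leading digit.
expected = {d: benford_probability(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.3f}")
```

The nine probabilities sum to 1 because the logarithms telescope: the product of \((d+1)/d\) for \(d = 1, \ldots, 9\) is 10.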
Dataset
These data were obtained from the Chicago Board of Election Commissioners website.
I cleaned these data elsewhere and saved the voteTally object to this file:
Presidential candidates who were on the ballot in Chicago were:
- Joe Biden (Democrat)
- Donald Trump (Republican)
- Howie Hawkins (Green)
- Gloria La Riva (Party for Socialism and Liberation)
- Brian Carroll (American Solidarity Party)
- Jo Jorgensen (Libertarian)
The third-party candidates were not included in Parker’s video, but I thought it might be fun to include them.
The following functions will be used to process the data further:
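As a rough illustration of this kind of processing, here is a minimal Python sketch that pulls the first digit from each precinct’s vote count and compares the observed shares with Benford’s law. It assumes a candidate’s per-precinct totals are available as a list of non-negative integers. The names `first_digit` and `first_digit_shares` and the sample counts are hypothetical; they are not taken from the voteTally object or from the Chicago data.

```python
import math
from collections import Counter


def first_digit(n: int) -> int:
    """Return the leading decimal digit of a positive integer."""
    if n <= 0:
        raise ValueError("the first digit is only defined here for positive counts")
    return int(str(n)[0])


def first_digit_shares(counts):
    """Observed share of each leading digit 1-9, skipping precincts with zero votes."""
    digits = [first_digit(c) for c in counts if c > 0]
    tally = Counter(digits)
    total = len(digits)
    return {d: (tally[d] / total if total else 0.0) for d in range(1, 10)}


# Made-up precinct tallies, purely for illustration (not Chicago data).
example_counts = [12, 130, 145, 0, 98, 201, 17, 1003, 56, 310]

observed = first_digit_shares(example_counts)
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d in range(1, 10):
    print(f"{d}: observed {observed[d]:.2f}, Benford {benford[d]:.2f}")
```

Precincts with zero votes are skipped because zero has no leading digit between 1 and 9. That matters most for the candidates with very small totals.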
Biden plots
Trump plots
“Third party” candidates
Parker didn’t include so-called third-party candidates, but I didn’t want to leave them out. Granted, the total votes for these candidates were quite small. Here is a histogram of all of the precinct vote distributions. (No pretty colors because I’m lazy.)