It might be fun to monkey around with crunching this data as a holiday project, but have you actually looked at this shit? One of the first claims is that you have to download the CSV to see ballots received after the 3rd. I had no issue using the visualizer to find late ballots.
That said, this one is more interesting than a lot of the other ones. You could actually try to go somewhere with it, though definitions do matter.
Ballot returned date is "the date the county marked the ballot as received after the voter mailed the voted ballot back to the county."
The reason that this is important is that there's a distinction between "postmarked" and "received." Theoretically, any ballot which was delivered but not properly added to the rolls within those three days would still be seen as "valid" if the postmark was correct (or, lacking a postmark, if the ballot came bundled with other ballots that were postmarked within the proper window, for example).
This dataset doesn't make clear when these ballots were postmarked or actually received in the mail, only, apparently, when they were processed. That distinction isn't recorded here, but it would be the primary place to try to make a case for fraud.
The secretary of state representative would be lying if she meant that only 9,428 ballots were processed by this spreadsheet's definition (even if you restrict the analysis to the 4th/5th/6th), but may not be lying if she meant postmarked. There is also some question as to whether rejected ballots are reflected in this dataset (were they excluded, or counted as received?).
There are a number of anomalies you can dig into here. There are records with no application information that nonetheless got a ballot and returned it, and there are applications somehow dated to the early 20th century, perhaps the witness-protection entries.
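If you wanted to pull those anomalies out yourself, something like the following would do it. To be clear, the file name and column names are guesses on my part; the real export's headers may differ.

```python
import pandas as pd

# Sketch only: the file name and the column names ("application_date",
# "ballot_return_date") are assumptions -- adjust to the real export.
df = pd.read_csv("absentee_file.csv", dtype=str, low_memory=False)
df["application_date"] = pd.to_datetime(df["application_date"], errors="coerce")
df["ballot_return_date"] = pd.to_datetime(df["ballot_return_date"], errors="coerce")

# Ballots marked as returned despite having no application date on record
no_app = df[df["application_date"].isna() & df["ballot_return_date"].notna()]

# Applications with implausibly old dates (the early-20th-century entries)
ancient = df[df["application_date"] < pd.Timestamp("1950-01-01")]

print(f"{len(no_app)} returned ballots with no application date")
print(f"{len(ancient)} applications dated before 1950")
```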
The analyzer tool they're using is pretty clearly just comparing the values of the two dataset versions cell by cell. The methodology isn't perfect: if you run the 'unique' test on a single dataset, you find that a little under 4,000 values are repeated/identical.
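For reference, that check is trivial to reproduce; this is roughly what the 'unique' test appears to be doing when pointed at a single file (file name assumed):

```python
import pandas as pd

# Flag rows that are exact duplicates of another row in the same dataset.
df = pd.read_csv("absentee_file.csv", dtype=str, low_memory=False)
dupes = df[df.duplicated(keep=False)]
print(f"{len(dupes)} rows in this single file are repeated/identical")
```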
This also means that filling in any value that was previously null counts as a change, so those ~112k changes it alleges are more than likely just the "returned" date being filled in after the 3rd.
You could write a script that checks whether the differences between matched pairs are changes to values that were not originally null, but the tool the hereistheevidence people are using isn't really capable of doing that with the options they have checked.
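If someone wanted to do it properly, a sketch of that script might look like the below. The file names and the "voter_id" join key are assumptions on my part, and duplicate IDs just get dropped rather than handled carefully:

```python
import pandas as pd

# Assumed file names and join key; duplicate IDs are simply dropped.
old = (pd.read_csv("absentee_v1.csv", dtype=str, low_memory=False)
       .drop_duplicates(subset="voter_id").set_index("voter_id"))
new = (pd.read_csv("absentee_v2.csv", dtype=str, low_memory=False)
       .drop_duplicates(subset="voter_id").set_index("voter_id"))

# Align the two versions on the rows and columns they share
rows = old.index.intersection(new.index)
cols = old.columns.intersection(new.columns)
o, n = old.loc[rows, cols], new.loc[rows, cols]

changed = (o != n) & ~(o.isna() & n.isna())  # cells that differ at all
filled = changed & o.isna() & n.notna()      # blank in v1, filled in in v2
edited = changed & o.notna()                 # had a value in v1 that later changed

print("cells filled in from blank:", int(filled.sum().sum()))
print("cells whose existing value changed:", int(edited.sum().sum()))
```

The point of splitting it that way is that "filled in from blank" and "existing value changed" get reported separately, which is exactly the distinction the analyzer tool can't give you.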
Further, even if one assumes that 69k of those changes were just return dates being filled in, that still leaves in the neighborhood of 43k of the ~112k changes unexplained. Of course, if something as minor as a typo in a county name was corrected between versions, that alone would necessarily flag a large number of differences.
It's hard to make any real assertions without looking at the original dataset, but if you were going to argue that these numbers need an explanation, this is broadly the way to do it. I still don't imagine this leads to any fire, but it would be interesting to see the causes behind some of those changed cells.