Introduction
Warning – this is a long and detailed examination of a complicated trainwreck
[Update] The IAIDQ has issued a press release on this topic…Election Throws A Spotlight On Poor Data Quality. [/update]
In every democracy citizens must be able to trust that the State will not impede their right to vote through any act or omission on the part of the State or its agents. Regular visitors to the iqtrainwrecks.info blog will know that Ireland has its fair share of problems with its electoral register. Of course, that isn’t news.
However, the Washington Post reported last weekend (18th October) that the US elections are being plagued by similar issues. The New York Times covers the same ground in this story from 9th October. With a slightly important vote coming up on the 4th of November, that is news.
In a saga that has found its way to the US Supreme Court (in at least one case so far), voters are being forced to re-establish their eligibility to vote before the election on November 4th. As the Post points out, “many voters may not know that their names have been flagged” which could “cause added confusion on Election Day”.
So what is going on (apart from the lawyers getting richer off the inevitable lawsuits and voters finding themselves reduced to just “Rs” as they lose their Vote)? Where is the trust being lost? Why is this an IQ Trainwreck?
A Change of Process and a Migration of Data
Under the Help America Vote Act, responsibility for the management of electoral registers was moved from local (i.e. county level) management to state administration. This has been trumpeted as a more efficient and accurate way to manage the accuracy of electoral lists. After all, the states also have the driver licensing data, social welfare data and other data sources to use to validate that a voter is a voter and not a goldfish.
However, where discrepancies arise between the information on the voter registration and other official records, the voter registration is rejected. And as anyone who has dealt with ‘officialdom’ can testify, very often those errors are outside the control of the ‘data subject’ (in this case the voter). The legislation requires election officials to use the state databases first, with recourse to the Federal databases (such as social security) supposedly reserved as a ‘last resort’ because, according to the New York Times, “using the federal databases is less reliable than the state lists and is more likely to incorrectly flag applications as invalid”.
Of course, for a comment on the accuracy of state databases I’ve found this story on The Risks Digest which seems to sum things up (however, as a caveat I’ll point out that the story is 10 years old, but my experience is that when crappy data gets into a system it’s hard to get it out). In the linked-to story, the author (living in the US) tells of her experience with her driver’s license, which insisted on merging her first initial and middle name (the format she prefers to use) to create a new non-name that didn’t match her other details. That error then propagated onto her tax information and appeared on a refund cheque she received.
In short, it would seem she might have a problem voting (if her driver’s license and tax records haven’t been corrected since).
Accuracy of Master Data, and consistency of Master Data
The anecdote above highlights the need for accuracy in the master data that the voter lists are being validated against. For example, the Washington Post article cites the example of Wisconsin, which flags voters for data discrepancies “as small as a middle initial or a typo in a birth date”.
I personally don’t use the apostrophe in my surname. I’m O Brien, not O’Brien. Also, you can spell my first name over a dozen different ways (not counting outright errors). A common alternate spelling is Darragh, as opposed to Daragh. It looks like in Wisconsin I’d have high odds of joining the four members of their 6-strong state elections board who all failed validation due to mismatches on data.
In Alabama, there is a constitutional ban on voting by people convicted of felony crimes of “moral turpitude”. The Governor’s Office has issued one master list of 480 offences, which included “disrupting a funeral” as a felony. The Courts Administrator and Attorney General issued a second list of more violent crimes to be used in the voter validation process. Unfortunately, it seems that the Governor’s list was used until very recently instead of the more ‘lenient’ list provided by the Courts Administrator.
Combine this with problems with the accuracy of other master data, such as lists of people who were convicted of the aforementioned felonies and there is a recipe for disenfranchisement. Which is exactly what has happened to a former governor (a Republican at that) called Guy Hunt.
In 1993 Mr Hunt had been convicted of a felony related to ethics violations. He received a pardon in 1998. In 2008 his name was included on a “monthly felons check” sent to a county Registrar. Mr Hunt’s name shouldn’t have been on the list.
According to the Washington Post article, Mr Hunt isn’t the only person who was included on the felon list. 40% of the names on the list seen by the Washington Post had only committed misdemeanors. In short, the information was woefully inaccurate.
But it is being used to de-register voters and deprive them of their right to have their say on the 4th of November.
The Washington Post also cites cases where US citizens have been flagged as non-citizens (and therefore not entitled to vote) due to problems with social security numbers. Apparently some election officials have found the social security systems to be “not 100% accurate”. But this is the reason why they are supposed to be used only when the state systems on their own are insufficient to verify the voter. That’s the law (apparently).
Data in the wrong fields
So, we have established that a raft of inaccuracies, inconsistencies and other issues with the master data seem to have created a need for some voter records to be checked against the federal systems which (it seems to me) have a big health warning around them.
However, it would seem that the number of Social Security searches being done is too high for it to be down to data being absent from state systems. According to the New York Times, in the year to September 30th the state of Nevada had filed almost three quarters of a million requests and found 715,000 non-matches. In Georgia, they ran almost 2 million checks and found more than 260,000 non-matches.
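To put those figures in proportion, here is a rough back-of-the-envelope sketch. It assumes the rounded numbers quoted above are the only inputs, and it treats "almost three quarters of a million" and "almost 2 million" as upper bounds on the number of checks, so the rates it prints are minimums rather than exact values. The function and variable names are my own.

```python
# Figures as quoted from the New York Times above. The request totals are
# rounded ("almost ..."), so they are treated as upper bounds here, which
# makes each computed rate a lower bound.
reported = {
    "Nevada": {"requests_upper_bound": 750_000, "non_matches": 715_000},
    "Georgia": {"requests_upper_bound": 2_000_000, "non_matches": 260_000},
}


def minimum_non_match_rate(requests_upper_bound: int, non_matches: int) -> float:
    """Smallest share of checks that can have come back as non-matches."""
    return non_matches / requests_upper_bound


rates = {
    state: minimum_non_match_rate(**figures)
    for state, figures in reported.items()
}

for state, rate in rates.items():
    print(f"{state}: at least {rate:.1%} of checks were non-matches")
```

On those assumptions, at least 95% of Nevada's checks and at least 13% of Georgia's failed to match.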
Nevada explained their need for searches by pointing to the fact that county clerks had input the driver’s licence and social security numbers in the wrong fields before the data was sent to the state. In other words they had to go on fishing trips because they didn’t know what context to look at the data in to get actionable information.
Apparently, other states have had similar difficulties with data entry.
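This kind of error can be caught at the point of entry. What follows is only a minimal sketch of the idea and not a description of any state's actual system: the licence format (one letter followed by seven digits) is invented purely for illustration, since real formats vary by state, and the function name is my own. The Social Security number check looks at nothing more than the nine-digit shape.

```python
import re

# Nine digits, with or without the usual 3-2-4 dashes.
SSN_PATTERN = re.compile(r"\d{3}-?\d{2}-?\d{4}")
# Hypothetical licence format, for illustration only (real formats vary by state).
LICENCE_PATTERN = re.compile(r"[A-Z]\d{7}")


def check_id_fields(licence_no: str, ssn: str) -> str:
    """Return 'ok', 'swapped' or 'invalid' for a pair of keyed-in identifiers."""
    licence_no = licence_no.strip().upper()
    ssn = ssn.strip().upper()

    if LICENCE_PATTERN.fullmatch(licence_no) and SSN_PATTERN.fullmatch(ssn):
        return "ok"
    # Each value has the shape expected of the *other* field.
    if SSN_PATTERN.fullmatch(licence_no) and LICENCE_PATTERN.fullmatch(ssn):
        return "swapped"
    return "invalid"


print(check_id_fields("A1234567", "123-45-6789"))
print(check_id_fields("123456789", "a1234567"))
```

A check like this at the county clerk's desk would stop a swapped pair before it ever reached the state, rather than leaving the state to guess which field holds which number.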
What else might be contributing?
We’ve already established that the validation of crummy quality data against master data sets that are in themselves of crummy quality is a contributing factor in the number of non-matches and referrals to the Social Security system.
However, information doesn’t get to be of crummy quality by itself, and this tale presents examples of the other common contributory cause – crummy processes, or unclear processes, or processes which people don’t know about or which are applied inconsistently.
Unfortunately Federal law allows for each state to decide what constitutes a match in its data. For example, some states may allow one or two character variations in spelling (Daragh vs Darragh or O Brien vs O’Brian for example) whereas others may not recognize nicknames.
Furthermore, despite federal law requiring that anyone whose name is flagged be notified so they have a chance to prove their eligibility, it seems that voters are not always alerted. The Washington Post points out that anyone who is still under a cloud of ineligibility come polling day can cast a “provisional ballot”. But again, whether these ballots are actually counted can depend on local and state rules.
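To make the difference between match rules concrete, here is a small sketch comparing a strict character-for-character rule with one that tolerates up to two single-character edits. It is illustrative only: I am not claiming any state implements its matching this way, the two-edit threshold is simply taken from the "one or two character variations" mentioned above, and the function names are mine.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def strict_match(a: str, b: str) -> bool:
    """Character-for-character comparison."""
    return a == b


def tolerant_match(a: str, b: str, max_edits: int = 2) -> bool:
    """Case-insensitive comparison allowing up to max_edits differences."""
    return edit_distance(a.casefold(), b.casefold()) <= max_edits


for left, right in [("Daragh", "Darragh"), ("O Brien", "O'Brian")]:
    print(left, "/", right,
          "strict:", strict_match(left, right),
          "tolerant:", tolerant_match(left, right))
```

The same two records pass under one rule and fail under the other, which is exactly the situation a voter faces when each state picks its own definition of a match.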
In the New York Times article a supervisor of elections in Florida (a state which has had its electoral information quality problems in the past) is described as being “angry to learn from the state recently that it was her responsibility to contact each flagged voter to clear up the discrepancies before Election Day”. And the relevant legislation has been in force for nearly six years (and one election).
Who is the customer?
What is complicating matters even further is an apparent confusion as to who the ‘customer’ is of the elections register information. This is evidenced by the Supreme Court action resulting from an attempt by Ohio’s Republican party to force the Secretary of State for Ohio, Jennifer Brunner (Democrat), to produce lists of voters who encountered problems matching their voter registration information against driver’s license and Social Security data.
In the heat of the political wrangling, Brunner described the case as “another partisan lawsuit”, only to be berated by the Republican chairman in Ohio for continuing “to do everything she can to help her candidate”. And the litigation isn’t just in Ohio. There are legal actions in train in Michigan, Florida and Georgia as well, with actions having taken place in Montana. In the Montana case a federal judge ruled that the Republicans had filed the case “with the express intent to disenfranchise voters”.
And there was me thinking that the customers of the electoral register information were the voters (in that they can vote). Instead it seems that the politicians view themselves as the sole customers of the information, with the lawyers certainly deriving significant benefit.
Unfortunately, however, in a tight election or one that is being aggressively fought, it seems that the tactic du jour is to try to take advantage of the crummy state of electoral lists and other data to find some form of competitive advantage. Indeed, efforts to keep names off the voter lists seem to be a new trend primarily in the battleground states in the US election.
But is this an IQTrainwreck?
We have a rule of thumb in these parts that applies here. If, for any given information quality issue, N > 1, where N is the number of lawyers involved, then it is an IQtrainwreck. That’s even before we begin to consider the implications for the democratic process and the ability of a fired up electorate to accept a result if they have misgivings about the key information resource in the elections process.
Wendy R. Weiser of the Brennan Center for Justice, speaking in the Washington Post, describes the voter register problem as “this season’s big issue”. Speaking in the New York Times, Daniel P. Tokaji (a law professor at Ohio State University) says:
“Just as voting machines were the major issue that came out of the 2000 presidential election and provisional ballots were the big issue from 2004, voter registration and these statewide lists will be the top concern this year”
Judging by the number of press releases issued by the Brennan Center on this topic in recent days, this is a significant issue across a number of states. However, as many of the commentators quoted in the Brennan Center’s press releases say, the problem is solvable. Just not before November 4th.
Perhaps what the American people need to help them vote isn’t a piece of legislation. Perhaps they just need to be able to trust that they won’t be deprived of their right to vote because of poor quality information.
Here I am thinking: Why doesn’t this great nation just do what we do in my humble country, Denmark? We have a single public registration system covering all the citizen roles. When we have an election or referendum, they push the button and the ballots are written. That’s it. MDM at work.
But I guess this shows that there are often reasons outside the control of any data quality initiative which mean you just can’t copy a best practice from another organisation.
It also reminds me that data quality solutions may work very well in one country but be almost useless in another country.
As globalization moves forward this challenge becomes more and more important. Enterprises tend to standardize worldwide on tools and services, shared service centres take care of data covering many countries, and so on. When an employee works with data from another country he often wrongly applies his local standards to these data and thereby challenges the data quality more than seen before.
So this is not easy, you have to both think and act global and local.
The Washington Post has more coverage of the election issues in this article by Mary Pat Flaherty.
Henrik, I’m jealous of your country’s system which seems to work so well. Ireland’s is supposed to work like that as well but is plagued by problems with the quality of the information (see my personal blog for more details). 3 years ago we had an idea of how bad it was. Then the government ‘cleaned it up’ and now we have no idea how bad it really is. All we do know is that the defective processes and governance that caused the issue are still there. Next election cycle is next year (for local government and EU Parliament), and we will probably have at least 1 referendum during that period as well. But the electoral register issue has faded from the radar (again).
The Brennan Center has a good post on its blog about how the problems in state and Federal databases, combined with lack of clarity on voter registration forms can exclude candidates from voting.
It can affect everyone… even Joe the Plumber (yes, that Joe the Plumber)
And here is an article from the ASQ website about the problems in 2000…
http://www.asq.org/advocacy/issues-actions/20041029electionprocess.html
This article (again from the Washington Post) suggests to me some ‘concerns’ in Virginia about electronic voting amongst other things.
http://www.washingtonpost.com/wp-dyn/content/article/2008/10/30/AR2008103004224.html
Cripes… this story I found on rawstory.com is a bit worrying. It highlights the importance of quality of information design as part of any information quality process.
http://rawstory.com/rawreplay/?p=2316
As I understand the US public registration of citizens – and also the Irish – you can have a row with your name, address and birthday in several different registers, like an Electoral Register, a Driver License Register, a Social Security Register and so on. And then of course the names, addresses, birthdays could differ between these registers.
How to fix this is one of the first things you learn about data modelling, and also a core practice in Master Data Management. What you do is have one register with a unique key for each person born in, moving to or working in a country. Name, address, birthday, gender and nationality will be obvious attributes. Then electoral authorities, driving license issuers and the social security administration may have additional information needed for their specific area, but always referencing a citizen with the unique key – and thereby having name, address and birthday from the master.
I know I am very naïve disregarding constitutions, politics, traditions, and so on, but as said, this is basically how it has been working in Denmark for 40 years now.
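The model described above can be sketched in a few lines. This is only a toy illustration of the principle and not how the Danish system is actually built: the class names, the key format and all of the sample data are made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    """Master record: the only place name, address and birthday are stored."""
    person_id: str
    name: str
    address: str
    birthday: date


@dataclass
class ElectoralEntry:
    """Electoral register row: holds only the key plus election-specific data."""
    person_id: str
    polling_district: str


@dataclass
class DrivingLicence:
    """Licence register row: holds only the key plus licence-specific data."""
    person_id: str
    licence_class: str


# Invented sample data.
master = {
    "P-0001": Person("P-0001", "Example Voter", "1 Example Street", date(1970, 1, 1)),
}
electoral_register = [ElectoralEntry("P-0001", "District 7")]
licences = [DrivingLicence("P-0001", "B")]


def ballot_label(entry: ElectoralEntry) -> str:
    """Build the ballot address line by looking the person up in the master."""
    person = master[entry.person_id]
    return f"{person.name}, {person.address} ({entry.polling_district})"


print(ballot_label(electoral_register[0]))

# A change of address is made once, in the master, and every register sees it.
master["P-0001"].address = "2 Sample Road"
print(ballot_label(electoral_register[0]))
```

Because the electoral and licence rows never carry their own copy of the name or address, there is nothing for them to disagree about.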