
X Marks the Spot? Mapping errors cost a house…

Via Nic Jefferis comes this great example of the impact of a data inaccuracy, and one that is all too common!

http://www.engadget.com/2016/03/24/texas-wrong-house-torn-down-google-maps/

Demolition workers relied on Google Maps to bring them to the address of a residential property that was to be demolished after storm damage.

The problem was that Google Maps had the location in the wrong place. The actual address was a block away. This resulted in a home being demolished, despite the owners having applied for funding to repair the storm damage it had suffered.

Yet again, the builder’s maxim of “Measure twice, cut once” appears to apply to a classic information quality problem.


Space triggers early release of Scottish Exam Results

It was widely reported yesterday that students in Scotland who had signed up for SMS notification of their results had received them a day early, giving them a jump on their less technically minded compatriots and competitors and causing stress and distress to students, parents, educators, civil servants, and politicians.

The Scottish Education Authorities began a root cause investigation as the secrecy and security of the examination results system seemed to have been compromised.

Right now I suspect you are settling in for a tale of hackers and Jason Bourne-like derring-do. Well, here at IQtrainwrecks we never get that lucky. After all, this is a blog that looks at information and data quality problems.

According to The Register, the root cause of this problem is good old data interchange and exchange across organisations.

  1. A template spreadsheet was used to perform the data interchange between the Scottish Education Authority and AQL, the company which provides the SMS gateway and related processing to the Education Authority pro bono. A batch template is used rather than an on-line interface as the service is only used once a year.
  2. The template was populated and saved in a later version of Microsoft Excel.
  3. The process of populating and saving the spreadsheet appended a white space to the end of each date stamp (the date that the SMS was to be sent).
  4. The ETL process interpreted the “DATE” field as text (which it was thanks to that errant space) and rejected the field on the load. Luckily AQL had developed error handling for situations where a date field couldn’t be loaded and applied a default… the day of the file load (which was the day before the messages were to go out).
  5. As a result the SMS system read the file and sent the messages a full day early.

One way of looking at this is that a technical information management issue has resulted in a gaggle of Scottish students getting to go celebrating a full 24 hours early and now making life changing decisions about their future education with the biggest hangovers of their young lives. Which will obviously end well.

Another way of looking at it is that this highlights the importance of proper standards and defined information flows, particularly where the cycle time of the process is long and the frequency of the operation is low. What kind of “pre-flight” checks could have been built into the governance to prevent this? What assumptions were being made that should have been challenged?
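
As a rough illustration of what such a “pre-flight” check might look like, here is a minimal Python sketch. It is not AQL’s actual process: the row layout (recipient plus date text), the DD/MM/YYYY format, the sample data and the function name `preflight_check` are all assumptions made for the example. The idea is simply to trim stray whitespace before parsing, and to report a bad date rather than quietly substituting a default.

```python
from datetime import date, datetime


def preflight_check(rows, expected_send_date, date_format="%d/%m/%Y"):
    """Return a list of problems found; an empty list means the batch may load.

    rows: iterable of (recipient, send_date_text) pairs (assumed layout).
    expected_send_date: the date the messages are agreed to go out.
    """
    problems = []
    for row_no, (recipient, send_date_text) in enumerate(rows, start=1):
        cleaned = send_date_text.strip()  # drop stray leading/trailing whitespace
        try:
            parsed = datetime.strptime(cleaned, date_format).date()
        except ValueError:
            # No silent default: an unparseable date is reported, not replaced.
            problems.append(f"row {row_no}: unparseable date {send_date_text!r}")
            continue
        if parsed != expected_send_date:
            problems.append(
                f"row {row_no}: date {parsed} is not the agreed {expected_send_date}"
            )
    return problems


# Made-up sample data: the first date carries the kind of trailing space
# described in the list above.
batch = [("recipient-1", "02/01/2020 "), ("recipient-2", "not a date")]
issues = preflight_check(batch, expected_send_date=date(2020, 1, 2))
```

With this sample data the trailing space is tolerated, and only the genuinely unparseable second row is reported. A file with any reported problem would be held back for a human to look at instead of being loaded with a default date.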

Organ Donor Records Mix-up

The Sunday Times reported in April 2010 that NHS Blood and Transplant, which runs the UK organ donor register, last year wrote to new donors with their consent details. After respondents complained that the information was incorrect, it was discovered that 800,000 individuals’ details had been recorded incorrectly. 45 of those affected have since died and had their incorrectly recorded wishes carried out!

“The mistake occurred in 1999 when a coding error on driving licences wrongly specifying donors’ wishes was transferred to the organ registry.”

400,000 of the affected records have been changed, and the remaining 400,000 people will be contacted soon and asked to update their consent.

Fruit of a poisoned tree – Information Quality meets Data Protection

Sears, the US retailer, has been ordered to delete all customer data it obtained through the use of on-line tracking software it installed on customers’ computers.

While the programme was an opt-in programme for which customers were paid US$10, the extent of the information captured was far beyond what customers might have considered “reasonable” and included data capture that a reasonable person might class as “questionable”. The Register tells us:

The FTC said that while customers had been warned that, once downloaded, software would track their browsing, it had in fact tracked browsing on third party websites, secure browsing including banking and transactions and even some non-internet computer activity.

“The FTC charged… that the software also monitored consumers’ online secure sessions – including sessions on third parties’ Web sites – and collected consumers’ personal information transmitted in those sessions, such as the contents of shopping carts, online bank statements, drug prescription records, video rental records, library borrowing histories, and the sender, recipient, subject, and size for web-based e-mails,” said an FTC statement.

Under EU law, there are protections for individuals as regards the nature of information that can be captured and how it should be captured. These rules are encapsulated in the Data Protection regulations that apply in all EU countries.

A key part of those principles and rules is that the “data subject” (the person to whom the data relates) needs to be given a clear and upfront statement of what information is being captured about them, why, what uses it will be put to, and who it may be shared with.

The FTC specifically criticised Sears for how they presented the information on what was being captured:

“Only in a lengthy user license agreement, available to consumers at the end of a multi-step registration process, did Sears disclose the full extent of the information the software tracked,” said an FTC statement. “The [FTC] complaint charged that Sears’s failure to adequately disclose the scope of the tracking software’s data collection was deceptive and violates the FTC Act.”

So, failing to take adequate care and attention in setting and meeting your customers’ expectations about how you will use their data can seriously jeopardise your ability to capitalise on your information assets. Furthermore, it can result in reputational damage and other loss. Managing that expectation improves the quality of the data you have (e.g. customers won’t input spurious data, or you will be legally allowed to use it for other purposes) as well as meeting obligations for trust and transparency with how you manage your customers’ privacy through effective data protection.

In this case, the data gathered was fruit of a poisoned tree and Sears could not retain it or use it, negating the value of any investment they had made in the tracking programme.

Interestingly, the FTC initiated this case themselves, suggesting that US-based Regulators may be taking data protection more seriously. Doubly interesting is the fact that the principles they are setting out are similar to EU regulations.

No child left behind (except for those that are)

Steve Sarsfield shares with us this classic tale of IQ Trainwreck-ry from Atlanta, Georgia.

An analysis of student enrollment and transfer data carried out by the Atlanta Journal-Constitution reveals a shocking number of students who appear to be dropping out of school and off the radar in Georgia. This suggests that the dropout rate may be higher and the graduation rate lower than previously reported.

Last year, school staff marked more than 25,000 students as transferring to other Georgia public schools, but no school reported them as transferring in, the AJC’s analysis of enrollment data shows.

Analysis carried out by the State agency responsible was able to track down some of the missing students. But poor quality information makes any further tracking problematic if not impossible.

That search located 7,100 of the missing transfers in Georgia schools, state education spokesman Dana Tofig wrote in an e-mailed statement. The state does not know where an additional 19,500 went, but believes other coding errors occurred, he wrote. Some are dropouts but others are not, he said.
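
The kind of reconciliation described here can be sketched as a simple set difference. This is only an illustration, not the AJC’s or the State’s actual method: it assumes every record carries a stable student identifier (which, as noted further down, is not a safe assumption in this case), and the identifiers and the function name `unmatched_transfers` are made up.

```python
def unmatched_transfers(transfers_out, transfers_in):
    """Return student IDs marked as transferring out that no school reported receiving."""
    return sorted(set(transfers_out) - set(transfers_in))


# Made-up sample identifiers, for illustration only.
marked_out = ["S001", "S002", "S003", "S004"]
reported_in = ["S002", "S004", "S999"]
missing = unmatched_transfers(marked_out, reported_in)
```

Here `S001` and `S003` end up in `missing`: they were marked as leaving one school, but no school recorded them arriving.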

In a comment which should warm the hearts of Information Quality professionals everywhere, Cathy Henson, a Georgia State education law professor and former state board of education chairwoman says:

“Garbage in, garbage out.  We’re never going to solve our problems unless we have good data to drive our decisions.”

She might be interested in reading more on just that topic in Tom Redman’s book “Data Driven”.

Drop out rates constitute a significant IQ Trainwreck because:

  • Children who should be helped to better education aren’t. (They get left behind.)
  • Schools are measured against Federal Standards, including drop out rates, which can affect funding.
  • Political and business leaders often rely on these statistics for decision making, publicity, and campaigning.
  • Companies consider the drop out rate when planning to locate in Georgia or elsewhere as it is an indicator of future skills pools in the area.

The article quotes Bob Wise on the implications of trying to fudge the data. His comment sums up the impact of masking drop outs by miscoding (by accident or design):

“Entering rosy data won’t get you a bed of roses,” Wise said. “In a state like Georgia that is increasingly technologically oriented, it will get you a group of people that won’t be able to function meaningfully in the workforce.”

The article goes on to highlight yet more knock-on impacts from the crummy data and poor quality information that the study showed:

  • Federal standard formulae for calculation of dropouts won’t give an accurate figure if there is mis-coding of students as “transfers” from one school to another.
  • A much touted unique student identifier has been found to be less than unique, with students often being given a new identifier in their new school.
  • Inconsistencies exist in other data, for example students who were reported “removed for non-attendance” but had zero absent days recorded against them.
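
The last inconsistency in that list is the sort of thing a cross-field rule can catch. A minimal sketch follows, in which the field names (`student_id`, `withdrawal_reason`, `absent_days`), the sample records and the function name are all hypothetical; the real Georgia data will be laid out differently.

```python
def inconsistent_withdrawals(records):
    """Return records removed for non-attendance despite zero recorded absences."""
    return [
        r for r in records
        if r["withdrawal_reason"] == "removed for non-attendance"
        and r["absent_days"] == 0
    ]


# Made-up sample records, for illustration only.
records = [
    {"student_id": "S1", "withdrawal_reason": "removed for non-attendance", "absent_days": 0},
    {"student_id": "S2", "withdrawal_reason": "removed for non-attendance", "absent_days": 12},
    {"student_id": "S3", "withdrawal_reason": "transferred", "absent_days": 0},
]
flagged = inconsistent_withdrawals(records)
```

Only `S1` is flagged: the stated reason for removal contradicts the attendance data held on the same record.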

Given the impact on students, the implications for school rankings and funding, the costs of correcting errors, and the scale and extent of problems uncovered, this counts as a classic IQTrainwreck.

Information Quality Problems in Danish EU Elections

In a story that is bound to bring tears to the eyes of at least one regular contributor to on-line discussions about information quality, news reaches us from Jan Erik Ingvaldsen of a series of information quality disasters in the Danish EU elections which take place on Sunday.

Denmark is also holding a referendum on the same day, which some are viewing as a root cause for these problems.

Here is the story in Danish (big thank you to Jan Erik for this link via twitter)

Here is a rough translation in Google English. As Google’s machine translations are never 100% accurate, we’d welcome any links to this story in English.

The problems (as we can make them out from the translation):

  • In Ikast-Brande, a municipality of 1700 people, voters received polling cards directing them to the wrong polling stations (wrong people, wrong ballot boxes)
  • In Stevns, the polling station is correct (right people going to the right ballot box), but the address given for the polling station is wrong.
  • In Vejle the wrong zip code was included in the map of the electoral district (i.e. people entitled to vote in a given polling station)
  • 2700 voters in three constituencies have been instructed to vote in two different locations
  • Some voters who are entitled to vote in both the Referendum and the EU Parliament elections have only received polling cards for one of the two ballots
  • Some postal voters report that they have not received any information on the candidates running in the area their vote is being counted in.

Danish speaker – please correct my listing of the problem (admitting your problems is the first step on the road to recovery).
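
Of the problems listed above, voters being sent to two different locations is the kind a simple check before the polling cards are printed might catch. Here is a minimal sketch, assuming the cards can be reduced to (voter ID, polling station) pairs; that layout, the sample data and the function name are assumptions for illustration, not a description of the Danish systems.

```python
from collections import defaultdict


def voters_with_multiple_stations(polling_cards):
    """Map each voter ID to its polling stations when more than one was assigned."""
    stations = defaultdict(set)
    for voter_id, station in polling_cards:
        stations[voter_id].add(station)
    return {v: sorted(s) for v, s in stations.items() if len(s) > 1}


# Made-up sample data, for illustration only.
cards = [("V1", "Station A"), ("V1", "Station B"), ("V2", "Station A")]
conflicts = voters_with_multiple_stations(cards)
```

In this sample only `V1` shows up in `conflicts`, with both stations listed.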

Jan Erik Ingvaldsen has a great post which summarises the problems in English.

Despite all of this, some elections commentators say that the preparations for the election are good, and they will sort out the problems when the complaints come in. Hanging Chads anyone?

While the fact that the Danes are running an EU election and a referendum on the same day, with different voter eligibility rules for each, goes some way to explaining the challenges that might have contributed to these problems, the defence is weakened by the fact that Ireland is running two ballots as well this week for EU Parliament elections and Local Government Councils and has not had reports of similar problems (yet).

Of course, the problem in Ireland is that our electoral register is wonderfully inaccurate.

For the impact that poor quality information can have on democratic processes, the view that the errors and impact can be “inspected out”, and because investigating this story made me have to figure out Google’s attempts at translation, this is being classed as a definite IQ Trainwreck.

Also, as Jan Erik points out in his blog, if this was happening in a South American or African nation there would be widespread media outcry.

Thanks to Jan Erik (@jeric40 on twitter) for flagging this one to our attention.

These are the IQ trainwrecks in your neighbourhood

Stumbled upon this lovely pictorial IQTrainwreck today on Twitter. Thanks to Angela Hall (@sasbi) for taking the time to snap the shot and tweet it and for giving us permission to use it here. As Angela says on her Twitpic tweet:

Data quality issue in the neighborhood? How many street signs (with diff names) are needed? Hmmmm

In the words of Bob Dylan: “How many roads must a man walk down?”

Trusted Electoral Information

Introduction

Warning – this is a long and detailed examination of a complicated trainwreck

[Update] The IAIDQ has issued a press release on this topic…Election Throws A Spotlight On Poor Data Quality. [/update]

In every democracy citizens must be able to trust that the State will not impede their right to vote through any act or omission on the part of the State or its agents. Regular visitors to the iqtrainwrecks.info blog will know that Ireland has its fair share of problems with its electoral register. Of course, that isn’t news.

However, the Washington Post reported last weekend (18th October) that the US elections are being plagued by similar issues. The New York Times covers the same ground in this story from 9th October. With a slightly important vote coming up on the 4th of November, that is news.

In a saga that has found its way to the US Supreme Court (in at least one case so far), voters are being forced to re-establish their eligibility to vote before the election on November 4th. As the Post points out, “many voters may not know that their names have been flagged” which could “cause added confusion on Election Day”.

So what is going on (apart from the lawyers getting richer off the inevitable lawsuits and voters finding themselves reduced to just “Rs” as they lose their Vote)? Where is the trust being lost? Why is this an IQ Trainwreck?

A Change of Process and a Migration of Data

Under the Help America Vote Act, responsibility for the management of electoral registers was moved from locally managed (i.e. county level) to state administered. This has been trumpeted as a more efficient and accurate way to manage the accuracy of electoral lists. After all, the states also have the driver licensing data, social welfare data and other data sources to use to validate that a voter is a voter and not a goldfish.

However, where discrepancies arise between the information on the voter registration and other official records, the voter registration is rejected. And as anyone who has dealt with ‘officialdom’ can testify, very often those errors are outside the control of the ‘data subject’ (in this case the voter). The legislation requires election officials to use the state databases first, with recourse to the Federal databases (such as social security) supposedly reserved as a ‘last resort’ because, according to the New York Times, “using the federal databases is less reliable than the state lists and is more likely to incorrectly flag applications as invalid”.

Of course, for a comment on the accuracy of state databases I’ve found this story on The Risks Digest which seems to sum things up (however, as a caveat I’ll point out that the story is 10 years old, but my experience is that when crappy data gets into a system it’s hard to get it out). In the linked-to story, the author (living in the US) tells of her experience with her driver’s license, which insisted on merging her first initial and middle name (the format she prefers to use) to create a new non-name that didn’t match her other details. That error then propagated onto her tax information and appeared on a refund cheque she received.

In short, it would seem she might have a problem voting (if her driver’s license and tax records haven’t been corrected since).

Accuracy of Master Data, and consistency of Master Data

The anecdote above highlights the need for accuracy in the master data that the voter lists are being validated against. For example, the Washington Post article cites the example of Wisconsin, which flags voters for data discrepancies “as small as a middle initial or a typo in a birth date”.

I personally don’t use the apostrophe in my surname. I’m O Brien, not O’Brien. Also, you can spell my first name over a dozen different ways (not counting outright errors). A common alternate spelling is Darragh, as opposed to Daragh. It looks like in Wisconsin I’d have high odds of joining the four members of their 6-strong state elections board who all failed validation due to mismatches on data.
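
To show the difference between a strict character-for-character match and a slightly more forgiving one, here is a minimal Python sketch. It is not how Wisconsin (or anyone else) actually validates voters; the function names and the three outcomes are assumptions made for the example. Names are compared exactly first, then again after stripping case, spaces, apostrophes and accents, and anything still unmatched is sent for review rather than rejected.

```python
import unicodedata


def normalise_name(name):
    """Lower-case a name and keep only its letters (drops spaces, apostrophes, hyphens, accents)."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if ch.isalpha()).lower()


def compare_names(registration_name, reference_name):
    """Classify a pair of names as 'exact', 'normalised match', or 'review'."""
    if registration_name == reference_name:
        return "exact"
    if normalise_name(registration_name) == normalise_name(reference_name):
        return "normalised match"
    return "review"  # route to a person instead of rejecting outright


apostrophe_case = compare_names("Daragh O Brien", "Daragh O’Brien")
spelling_case = compare_names("Daragh O Brien", "Darragh O Brien")
```

The missing apostrophe comes back as a `normalised match`, while the Daragh/Darragh spelling difference comes back as `review`. Whether a review case should ever lead to de-registration is a policy question that no amount of string handling answers.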

In Alabama, there is a constitutional ban on people convicted of felony crimes of “moral turpitude” voting. The Governor’s Office has issued one masterlist of 480 offences, which included “disrupting a funeral” as a felony. The Courts Administrator and Attorney General issued a second list of more violent crimes to be used in the voter validation process. Unfortunately, it seems that the Governor’s list was used until very recently instead of the more ‘lenient’ list provided by the Courts Administrator.

Combine this with problems with the accuracy of other master data, such as lists of people who were convicted of the aforementioned felonies, and there is a recipe for disenfranchisement. Which is exactly what has happened to a former governor (a Republican at that) called Guy Hunt.

In 1993 Mr Hunt had been convicted of a felony related to ethics violations. He received a pardon in 1998. In 2008 his name was included on a “monthly felons check” sent to a county Registrar. Mr Hunt’s name shouldn’t have been on the list.

According to the Washington Post article, Mr Hunt isn’t the only person who was included on the felon list. 40% of the names on the list seen by the Washington Post had only committed misdemeanors. In short, the information was woefully inaccurate.

But it is being used to de-register voters and deprive them of their right to have their say on the 4th November.

The Washington Post also cites cases where US citizens have been flagged as non-citizens (and therefore not entitled to vote) due to problems with social security numbers. Apparently some election officials have found the social security systems to be “not 100% accurate”. But this is the reason why they are supposed only to be used when the state systems on their own are insufficient to verify the voter. That’s the law (apparently).


Plumbing the depths of information quality

Ireland’s Evening Herald newspaper carried a story recently about the costs and impacts of address data quality.

To cut a long story short, the family at Address A made enquiries with a heating contractor/plumber about the cost of a home heating system. A short time later, the plumbers arrived at Address A (who had only made an enquiry) and proceeded to rip up floorboards and fit radiators etc.

After 4 hours working, the plumbers in House A got a call. They were at the wrong address. They should have been at House B, which was an address that differed by one word from House A.

The owner of House A subsequently sued to have her home restored and to compensate her for distress. Going to the wrong address (and not noticing it for four hours) cost the plumbing firm €5000 plus legal costs.

Schoolboy millionaire dreams

In a story that, in this reader’s view, says as much about the lack of imagination of modern youth as it does about information quality management, comes news that a teenager in the UK has found himself stg£300 in debt after using an ATM card his bank sent him.

When the young man went to take some cash out, he was pleased to find that he could take out the maximum £300 and blew it all on the things teenagers blow their money on – iPods and suchlike. When he later checked his balance he found he had a bank balance of (apparently) stg£2 million.

The boy had been waiting on payments due to him under the UK Government’s education maintenance allowance (EMA) scheme, under which students get £30 a week to encourage them to stay on at school, so he didn’t initially question how he had sufficient funds in his account for the £300 withdrawal. He was, it seems, overjoyed to find the seven figure sum in his account, apparently as a result of his having gone to school when he should have (although one might suggest he should pay closer attention in maths classes so he can work out how many weeks he’d have to be in school to earn £2 million – the equivalent of 1282 years).
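
For anyone who wants to check the maths, the 1282-year figure can be reproduced in a few lines of Python. The one assumption here is that the allowance is paid in all 52 weeks of the year, which the story does not state.

```python
balance_pounds = 2_000_000   # approximate balance reported in the story
allowance_per_week = 30      # weekly EMA payment quoted above
weeks_per_year = 52          # assumption: a payment in every week of the year

weeks_needed = balance_pounds / allowance_per_week
years_needed = weeks_needed / weeks_per_year
```

That works out at roughly 66,667 weeks of payments, or a little over 1282 years.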

Subsequently the bank corrected its error and the young man found himself £300 overdrawn.

While the bank has suggested that the boy “should have known better” (and it is hard to argue with that), it is clear that the bank did make an error in the information associated with this boy’s account, and that is the IQ Trainwreck here. Some process within the bank erroneously put a large amount of money in the wrong account, resulting in a foolish teenager digging himself into debt he could not afford (and the lad’s contribution to his own woes cannot be ignored).