Professional cycling has been haunted by the ghost of doping in recent years, with the confession from Lance Armstrong as the latest high point, following other confessions, for example by fellow Tour de France winner Bjarne Riis.
The word denial is probably the most central term in all this mess. The riders have kept denying the facts past the threshold of absurdity.
We see a lot of the same kind of denial within the realm of data management, where data quality issues obvious to everyone are denied, often with the sentiment that of course there are a lot of data quality issues around, but certainly not with my data. My data is clean.
But it ain’t.
When working with data quality issues some of the big questions are: How bad is it? Is it getting worse? Can we do something about it? Who should do something about it?
These questions are basically the same as those around the changing climate on this planet including rising sea levels.
This morning I read an article on BBC News reporting that several scientific teams have joined forces in an attempt to quantify exactly how much sea levels are rising. The short answer is that the sea level is now 11.1 millimeters (7⁄16 of an inch) higher than in 1992.
The sea is rising because of melting ice, primarily in Antarctica and Greenland, as seen below:
So I think it’s high time to ask the people of Antarctica, and not least the people of Greenland, to do something serious about their ice melting and flooding innocent people in the rest of the world.
Most organizations have a lot of data quality issues, and there is a wealth of possible solutions for dealing with these challenges.
What you usually do is categorize the problems into three different types of best resolutions:
You could go ahead and solve the data quality problems today, but you probably have better and more important things to do right now.
Your organization may have a global SAP rollout going on, or other resource-demanding implementations. Therefore it is wisest to deal with the data quality issues when everything is running smoothly.
Maybe a resolution has been tried before and didn’t work. The chances that different people management, a different orchestration of processes and developments in available technology will change that are very slim.
May the force be with you
Many problems solve themselves over time or, hopefully, don’t get noticed by anyone. If things get ugly, you always have your lightsaber.
Finally, WordPress.com, the hosted version of WordPress that I am using, has added geography to the stats.
The counter has been running for 14 days now, so I have taken a first look at the numbers.
First of all, I’m pleased that during these 14 days I have had visitors from 67 different countries around the globe:
Most visitors have been from the United States, followed by my current home country, the United Kingdom, and then my former home country, Denmark:
Note: This figure was made by copying the results into Excel.
If grouped by regions of the world, it looks like this:
The world has certainly become a small place. Of course your interactions are biased towards your neighborhood, but in blogging as well as in business our success will increasingly become dependent on meeting, understanding and interacting with (maybe not so) strange people of the world.
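For readers who would rather script the regional grouping than do it in Excel, here is a minimal Python sketch. The country-to-region mapping, the function name `visits_by_region` and the visit counts are all hypothetical and made up for illustration; they are not the actual numbers from my stats page.

```python
from collections import Counter

# Hypothetical mapping from country to region; extend it to cover
# whatever countries appear in your own stats.
REGION_BY_COUNTRY = {
    "United States": "North America",
    "United Kingdom": "Europe",
    "Denmark": "Europe",
}


def visits_by_region(visits_by_country, region_by_country=REGION_BY_COUNTRY):
    """Sum per-country visit counts into per-region totals.

    Countries missing from the mapping are collected under "Unknown"
    rather than silently dropped.
    """
    totals = Counter()
    for country, visits in visits_by_country.items():
        region = region_by_country.get(country, "Unknown")
        totals[region] += visits
    return dict(totals)


# Made-up counts, only to show the shape of the input.
example = {"United States": 30, "United Kingdom": 20, "Denmark": 10, "Atlantis": 1}
print(visits_by_region(example))
```

Keeping an explicit "Unknown" bucket is a small data quality measure in itself: an unmapped country shows up in the result instead of quietly disappearing from the totals.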
Sometimes I, along with other folks in my social network circles and groups, describe myself as a data geek.
Another non-anonymous data geek, Rich Murnane, recently started a series of excellent cartoons on his blog about DataGeek’s first days on a new job. Hard work indeed.
Then the data geeky corporate Twitter account of IBM Initiate made a twittpoll asking: Do you consider yourself a data geek or a management geek?
It’s a hard question. Because you know that a lot about better data is really about better management, and it’s much more admirable to be a management geek than a poor data geek.
Anyway, I stood firm and admitted that I am a data geek. Because the world has always been crowded with management consultants who pay little attention to the needs of the data. Someone has to take care of the data. It’s hard, but it’s worth it.
When we move around in traffic we may have different roles at different times. Sometimes I drive a car, sometimes I’m a pedestrian and sometimes I ride a bicycle. The traffic infrastructure tries to separate these roles by having roads for cars, sidewalks (pavements) for pedestrians and bicycle paths for bicycles. But at intersections these separations meet and create questions of who has the upper hand, and sometimes not all three constructions are available, so pedestrians and bicycle riders may have to use a road made for cars.
I have just completed a short (kind of) holiday where we took our bicycles on a tour around parts of the Baltic Sea coast through four different countries: Denmark, Germany, Poland and Sweden. We started and ended in Copenhagen, which is known for having extremely good conditions for bicycling, captured by the term “Copenhagenization”.
The quality and availability of bicycle paths varied a lot along the route. Sometimes you felt that the bicycle paths were constructed to make the life of bicycle riders as miserable as possible. When the bicycle path wasn’t there or was too bad we hit the road, which was extremely unpopular among the car drivers. Not least German Mercedes drivers love their horns.
But I guess it’s nothing personal. When I drive my car I also think pedestrians and bicycle riders are a pain in the …
Such cases of not liking a role you have yourself at another time also apply to a lot of other situations in life. For example, I’m not very excited about all the data quality checks and mandatory fields I have to deal with in the CRM system when I have sold a data quality tool or service. I see them as a pain in the …
And oh yes, after finishing the cycling tour I did have some pain…
When did the first data quality issue occur? In the section titled History of its data quality article, Wikipedia says that it began with the mainframe computer in the United States of America.
Fellow data quality blogger Steve Sarsfield wrote a blog post a few years ago called A Brief History of Data Quality, in which he says: “Believe it or not, the concept of data quality has been touted as important since the beginning of the relational database”.
However, a predominant sentiment in the data quality realm is that data quality is not about technology. It is about people. People are the culprits behind data quality flaws, and as the main part of the problem people should also be the overwhelming part, if not the only part, of the solution.
So I guess data quality challenges were introduced when people showed up in the real world. How and when that happened is a matter of debate, as discussed in the blog post Out of Africa.
As explained in the post Movable Types, the invention of movable type in printing some hundreds of years ago (the most important invention since someone invented the wheel for the first time) gave a big boost to knowledge sharing among people, and also a big boost to data and information quality issues.
But I think the saying “To err is human, but to really foul things up you need a computer” is valid. Consequently, I also think you may need a computer to help clean up the mess and to prevent the mess from happening again. End of (hi)story.