Turning a Blind Eye to Data Quality

The idiom “turning a blind eye” originates from the Battle of Copenhagen, where Admiral Nelson ignored a signal giving him permission to withdraw by raising the telescope to his blind eye and saying: “I really do not see the signal”.

Nelson went on to win the battle.

As a data quality practitioner you are often amazed by how enterprises turn a blind eye to data quality challenges and, despite horrible data quality conditions, keep on and win the battle by growing as successful businesses.

The evidence that poor data quality is costing enterprises huge sums has been out there for a long time. But business successes are made over and again despite bad data. There may be casualties, but the business goals are met anyway. So poor data quality is just something that makes the fight harder, not impossible.

I guess we have to change the messaging about data quality improvement away from the doomsday prophecies, which make decision makers turn a blind eye to data quality challenges, and be more specific about maybe smaller but tangible wins where data quality improvement and business efficiency go hand in hand.

Warning about warnings

In the two months I have been living where I do now, I have seen as many “wet floor” and “slippery ground” warnings as I had seen in my entire life until then. And I’m not that young.

The amount of these warnings all over makes me think that the message is: “Yes, we know that you may slip and hurt yourself. Actually, we don’t care and we don’t intend to do anything about it. But at least now you can’t say that we didn’t warn you”.

It also makes me think about what is being done about poor data quality all over. There are lots of warnings out there and lots of methods available for measuring bad data. But when it comes to actually doing something to solve the problems, well, warning signs seem to be the most preferred remedy.

I’m as guilty as anyone else I guess. I have even proposed a data quality immaturity model once.

Doing something about “wet floor” and “slippery ground” often involves both a short-term workaround and a long-term solution. And actually, “wet floor” is often due to a recent cleaning action.

A common saying is: “Don’t Bring Me Problems—Bring Me Solutions!”.

Let’s try to put up fewer warning signs and work on having less slippery ground, including immediately after a cleaning action.

Faster than the Speed of Light

One phrase I have always disliked is: “It can’t be done”.

What couldn’t be done yesterday at some place may be done today at your place.

Everyone knows that the bumblebee can’t fly. Except the bumblebee. So it does fly.

We all know that nothing can travel faster than the speed of light.

Well, that was until a recent experiment at CERN showed that something apparently did travel faster than light, as told in an article on Sky News this morning called Amazement as Speed of Light ‘Is Broken’.

So, don’t always say that data quality can’t be improved because of this and that. Maybe it really couldn’t be done yesterday, but things have changed today.

International Data Steward of the Year

The 11th of October has been declared International Data Steward Day by the Data Roundtable, and yesterday I threw in my candidate for The Data Steward of the Year. So for the next month I will be lobbying the fine selection of judges.

It’s going to be hard work, as my candidate is behind from the start: she will not see the 11th October 2011 as 10.11.11 but as 11.10.11. Let’s see if the contest is truly international or if the US candidates are playing on home ground.
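For the technically minded, this is a classic data quality trap. Here is a minimal sketch in Python, with made-up values, showing how the very same string yields two different dates depending on which locale convention the parser assumes:

```python
from datetime import datetime

raw = "10.11.11"  # the same string a US and a European reader will disagree about

as_us = datetime.strptime(raw, "%m.%d.%y")        # month first: 11th October 2011
as_european = datetime.strptime(raw, "%d.%m.%y")  # day first: 10th November 2011

print(as_us.date())        # 2011-10-11
print(as_european.date())  # 2011-11-10
```

Unless the format convention is stored along with the date, there is no way to tell afterwards which interpretation was intended.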

Tear Down This Wall!

Today is the 50th anniversary of the building of the Berlin Wall. Fortunately, the wall is gone today, torn down as suggested by Ronald Reagan in 1987 with his famous words: “Mr. Gorbachev, tear down this wall!”

But today we have another bad wall: the one dividing an enterprise into two parts, Business and IT.

I disagree. So do many other people, for example Michael Baylon in this blog post called Is IT part of the business?

Yes, IT is part of the business. Tear down this wall!

Marco Polo and Data Provenance

Besides being a data geek I am also interested in pre-modern history. So it’s always nice when I’m able to combine data management and history.

A recurring subject in historian circles is the suspicion that the explorer Marco Polo never actually went to China.

As said in the linked article from The Telegraph: “It is more likely that the Venetian merchant adventurer picked up second-hand stories of China, Japan and the Mongol Empire from Persian merchants whom he met on the shores of the Black Sea – thousands of miles short of the Orient”.

When dealing with data and ramping up data quality, a frequent challenge is that some data wasn’t captured by the data consumer – in some cases not even by the organization using the data. Some of the data stored in company databases is second-hand data, and in some cases the overwhelming part of the data was captured outside the organization.

As with the book about Marco Polo’s (alleged) travels, called “Description of the World”, this doesn’t mean that you can’t trust anything. But maybe some data is mixed up a bit and maybe some obvious data is missing.
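One way of dealing with this, sketched below under my own assumptions, is to record the provenance of each record along with the record itself, so the second-hand data can at least be flagged for extra scrutiny. The field names are my invention, not taken from any particular product:

```python
from dataclasses import dataclass

@dataclass
class CustomerRecord:
    """A master data record carrying its own provenance."""
    name: str
    city: str
    source: str               # who captured the data, e.g. own data entry or a purchased list
    captured_firsthand: bool  # False for second-hand data captured outside the organization

records = [
    CustomerRecord("Acme Corp", "Venice", source="own CRM entry", captured_firsthand=True),
    CustomerRecord("Silk Road Ltd", "Beijing", source="purchased trade list", captured_firsthand=False),
]

# Second-hand records are not worthless, but they deserve extra scrutiny.
for record in records:
    if not record.captured_firsthand:
        print(f"Review before trusting: {record.name} (source: {record.source})")
```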

I have touched on this subject earlier in the post Outside Your Jurisdiction, and I identified second-hand data as one of the Top 5 Reasons for Downstream Cleansing.

Proactive Data Governance at Work

That data governance is 80 % about people and processes and 20 % (if not less) about technology is a common statement in the data management realm.

This blog post is about the 20 % (or less) technology part of data governance.

The term proactive data governance is often used to describe whether a given technology platform is able to support data governance in a good way.

So, what is proactive data governance technology?

Obviously it must be the opposite of reactive data governance technology, which must be something about discovering completeness issues, as in data profiling, and fixing uniqueness issues, as in data matching.
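To make the contrast concrete, here is a toy illustration, with fabricated records, of that reactive side: profiling already-captured data for completeness and spotting possible duplicates after the fact:

```python
from collections import Counter

# Records that have already entered the database, flaws and all.
records = [
    {"name": "Acme Corp", "postal_code": "1050", "phone": None},
    {"name": "Beta Ltd", "postal_code": None, "phone": "+45 12 34 56 78"},
    {"name": "Acme Corp", "postal_code": "1050", "phone": None},  # a possible duplicate
]

# Completeness profiling: how many records have a value in each field?
for field in ("name", "postal_code", "phone"):
    filled = sum(1 for record in records if record[field] is not None)
    print(f"{field}: {filled / len(records):.0%} complete")

# A naive uniqueness check; real data matching is of course much harder.
name_counts = Counter(record["name"] for record in records)
print("Possible duplicates:", [name for name, count in name_counts.items() if count > 1])
```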

Proactive data governance technology must be implemented in data entry and other data capture functionality. The purpose of the technology is to assist people responsible for data capture in getting the data quality right from the start.

If we look at master data management (MDM) platforms we have two possible ways of getting data into the master data hub:

  • Data entry directly in the master data hub
  • Data integration by data feed from other systems such as CRM, SCM and ERP solutions and from external partners

In the first case the proactive data governance technology is part of the MDM platform, often implemented as workflows with assistance, checks, controls and permission management. We see this most often in product information management (PIM) and in business-to-business (B2B) customer master data management. Here the insertion of a master data entity like a product, a supplier or a B2B customer involves many different employees, each with responsibility for a set of attributes.
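A hedged sketch of what such a workflow check could look like: each role owns a set of attributes on a new product record, and the record is only released when every owner has filled in their part. The roles and attributes are invented for illustration:

```python
# Invented example: which role is responsible for which attributes.
ATTRIBUTE_OWNERS = {
    "product manager": {"name", "category"},
    "logistics": {"weight", "dimensions"},
    "marketing": {"description"},
}

def missing_attributes(record: dict) -> dict:
    """Return, per responsible role, the attributes still lacking a value."""
    return {
        role: sorted(attr for attr in attrs if not record.get(attr))
        for role, attrs in ATTRIBUTE_OWNERS.items()
        if any(not record.get(attr) for attr in attrs)
    }

draft = {"name": "Blue Widget", "category": "widgets", "weight": "1.2 kg"}
print(missing_attributes(draft))
# {'logistics': ['dimensions'], 'marketing': ['description']}
```

The point is that the check happens before the record is released, not after bad data has spread downstream.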

The second case is most often seen in customer data integration (CDI) involving business-to-consumer (B2C) records, but it certainly also applies to enriching product master data, supplier master data and B2B customer master data. Here the proactive data governance technology is implemented in the data import functionality, or even in the systems of entry, best done as Service Oriented Architecture (SOA) components that are hooked into the master data hub as well.
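The idea can be sketched as a single shared validation component, callable both from the hub’s import job and from the systems of entry, so every path into the hub is guarded by the same rules. The rules below are placeholders of my own making:

```python
def validate_party(record: dict) -> list:
    """Return a list of data quality issues; an empty list means the record may enter the hub."""
    issues = []
    if not record.get("name"):
        issues.append("name is missing")
    if not record.get("country"):
        issues.append("country is missing, so the address cannot be verified")
    return issues

# The CRM calls the component at entry time ...
print(validate_party({"name": "Acme Corp"}))                   # one issue reported
# ... and the hub's import job runs the very same component on incoming feeds.
print(validate_party({"name": "Acme Corp", "country": "DK"}))  # []
```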

It is a matter of taste whether we call such technology proactive data governance support or upstream data quality. From what I have seen so far, it does work.

Howcatchem

I’m sad to learn that Peter Falk has died. Peter Falk is best known as Lieutenant Columbo in the American television series Columbo, which has been shown all over the world, including in my country when I was a teenager.

I think Columbo would have been a great data quality specialist too. Underestimated by the smart guys, but focused on the important details while exercising the art of the “howcatchem”. Quite the opposite of his “whodunit” colleagues, who chase the villains on horseback down a crowded city street, or do the same in a car while smashing most other cars on the way.

Book Review: Berson and Dubov on MDM

A few days ago Julian Schwarzenbach over at the Data and Process Advantage Blog published a review of the book “Master Data Management and Data Governance” by Alex Berson and Larry Dubov. Link to Julian’s review here.

And hey, that’s the book I have been reading too during the last months. So why not write my own review.

I agree very much with Julian’s positive review of the book. It is a very comprehensive book – and a thick and heavy one, as I have learned from bringing it with me on travels, which is where I usually read offline stuff. But master data management and related data governance is a big and heavy discipline with a lot of details that have to be dealt with.

I have probably annoyed fellow travellers in trains and airplanes while reading the book with exclamations such as: Yes, precisely, that’s what I have always said, good point, and so on. Because I agree with many of the issues described and the solutions discussed in the book.

For the mandatory bit of criticism that must be included in every book review, I will bring up my pet bashing of United States and English language centricity. Well, it’s actually not that bad, as the book in many places does indicate that other angles and pains exist than those prominent in the United States and the English language.

Oh, and I can live with my surname being spelled “Sorensen” instead of “Sørensen” in the references, and with a related date being formatted as “11/22/2009”, which to me reads as the 11th day of the 22nd month of the year 2009.

Pick Any Two

The project triangle expresses the dilemma that you probably want your project to be good, fast and cheap, but in practice you are only able to prioritize two of these three desirable options. In short:

Good, fast, cheap – pick any two

The pick any two among three theme can be applied to a lot of other activities, stating three desirable terms of which only two can be combined in real life.

So what could be the pick any two among three theme for data quality?

Of course the good, fast, cheap dilemma also goes for data quality projects. But as data quality management isn’t just a project but an ongoing program, what else?

I have one suggestion:

Fit for purpose, real world alignment, fix it as we go – pick any two

The term “fit for purpose” has become more or less synonymous with “high quality data” and is thus chosen here to express the good angle of data quality.

Some data, especially what we call master data, is used for multiple purposes within an organization. Therefore some kind of real world alignment is often used as a fast track to improving data quality, where you don’t spend time analyzing how data may fit multiple purposes at the same time in your organization. Real world alignment may also fulfill future requirements regardless of the current purposes of use.

Managing data that is both fit for multiple purposes and aligned with the real world is not something you just do in a cheap way by fixing it as you go. You may pick any two options in these combinations:

  • Make some data fit for purpose by fixing it as the pains show up.
  • Align data with the real world, typically by exploiting external reference data as the prices go down (see the sketch below).
  • Lay out a thorough plan for having fit-for-multiple-purposes data aligned with the real world.
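To illustrate the second option, here is a minimal sketch, with a faked reference table, of real world alignment through external reference data: the authoritative source overrules whatever was typed in:

```python
# Stand-in for a purchased external reference file of postal codes.
POSTAL_REFERENCE = {"1050": "Copenhagen K", "8000": "Aarhus C"}

def align_city(record: dict) -> dict:
    """Replace the stored city with the authoritative one for the postal code, if known."""
    reference_city = POSTAL_REFERENCE.get(record.get("postal_code"))
    if reference_city and record.get("city") != reference_city:
        record = {**record, "city": reference_city}
    return record

print(align_city({"postal_code": "1050", "city": "Copenhagn"}))
# {'postal_code': '1050', 'city': 'Copenhagen K'}
```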
