The Role of Technology in Data Quality Management

A recent article called Data’s Credibility Problem by Thomas Redman in Harvard Business Review has rightfully received a lot of mentions in the data quality community on social media, including Twitter.

I agree with many things in the article except I have to question the credibility of this saying:

“The solution is not better technology: It’s better communication between the creators of data and the data users”.

There is a lot of truth in this saying. But in my eyes it is not valid.

If the human race had relied solely on communication, we would still be discussing whether a wheel should have the shape of a square or a circle. There is a balance between fruitful communication and throwing technology at problems, and you may emphasize one side or the other depending on whether you sell data quality consultancy or data quality tools.

I would say:

“The solution is better communication between the creators of data, the data users and the innovators of data quality technology”.

Now, how do I best spread this message….



6 thoughts on “The Role of Technology in Data Quality Management”

  1. John Owens 3rd December 2013 / 21:06

    Hi Henrik

    Technology is only an enabling tool. It also has no judgement. It can enable you to do both the right thing and the wrong thing.

    If the people creating data do not communicate with those who are going to use it, then no amount of technology is going to bridge the void that this creates.

    I wonder if the loss of fundamental skills in the field of Data Quality Management is due to a whole generation who have become over-reliant on data cleansing technology?

    I am no Luddite. I am all for ever better technology but only when it is developed, chosen and employed by people with the right skills and knowledge to do so properly.

    Kind regards
    John

    • Henrik Liliendahl Sørensen 3rd December 2013 / 21:20

      Thanks a lot John for commenting within minutes from the antipole.

    • Henrik Liliendahl Sørensen 4th December 2013 / 09:43

      Thanks for the follow-up post, Gary. From reading the paper that elaborates on Redman’s stance, and a lot of other postings out there, another problem is that data quality technology is often regarded as batch cleansing. As Redman says in his paper, we must shift our focus to how data is created in the first place. I agree with that, and as a consequence I have spent the last couple of years helping to develop technology that enables doing just that.

  2. gino fortunato 4th December 2013 / 20:28

    Excellent points here. I think the original statement would be worded better as “The solution is better technology AND better communication between the creators of data and the data users”. The original statement seems to imply that data quality tools cannot get better. Clearly that will ‘always’ be false. But there are limits on the amount of communication between the users and creators of data. For example, at the time of data creation, the data users may not be aware of what they want to do with the data. They may not be aware that the data is being created! To take the point to the extreme, the users of the data may come many generations after the data creation. Think of economists trying to measure economic activity from hundreds of years ago or more.

    • Henrik Liliendahl Sørensen 4th December 2013 / 21:49

      Thanks for adding in Gino.
