Data Quality 2.0

Data Quality tools mainly support you with automation of Data Profiling and Data Matching.

There has to be something called “Data Quality 2.0”. The term is already popping up here and there. My thoughts about what it includes are as follows:

Exploiting rich external reference data

Using external reference data in data quality improvement is not new at all. But we will see a lot of new sources in the cloud, and governments around the world are planning to release huge amounts of public sector data. More on this in the post: Government says so.

Sharing data is key to a single version of the truth. The rise of social networking is also a new and exciting source of shared data.

International capabilities

While working with data quality tools and services for many years I have found that many tools and services are very national. So you might discover that a tool or service will work wonders with data from one country, but be quite ordinary or in fact useless with data from another country.

As globalization moves forward this challenge becomes more and more important. Enterprises tend to standardize worldwide on tools and services, shared service centres take care of data covering many countries, and so on. When an employee works with data from another country he often wrongly applies his local standards to these data and thereby challenges the data quality more than seen before.

It will be a must that data quality tools in the future work globally.
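The pitfall of applying a local standard to foreign data can be illustrated with a small sketch. It is only an illustration: the date example, the function name `parse_date` and the two-country format table are assumptions made here, not something taken from a particular tool.

```python
from datetime import date, datetime

# Hypothetical lookup of national date conventions (illustration only).
DATE_FORMATS = {
    "US": "%m/%d/%Y",  # month first
    "GB": "%d/%m/%Y",  # day first
}


def parse_date(raw: str, country: str) -> date:
    """Parse a date string using the convention of the data's country."""
    return datetime.strptime(raw, DATE_FORMATS[country]).date()


raw = "03/04/2010"
# The same string gives two different dates depending on the standard applied.
print(parse_date(raw, "US").isoformat())  # 2010-03-04
print(parse_date(raw, "GB").isoformat())  # 2010-04-03
```

Neither result raises an error, so a local standard used on data from another country corrupts the value silently.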

Also visit The Tower of Babel and have a look at Sweden meets United States.

Service orientation

In my opinion the rise of Service Oriented Architecture is a golden opportunity to finally get the benefits from data quality tools that the technology and approaches seen until now have not delivered, as described in the post: Service Oriented Data Quality.

One of the really great advantages of SOA components is support for multi-purpose information quality as discussed in: Fit for what purpose?. This is where Data Quality 2.0 meets MDM 2.0. One of the often forgotten features here is upstream prevention by error-tolerant search.
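As a minimal sketch of what upstream prevention by error-tolerant search could look like: before a new record is saved, the typed name is compared with the names already stored, so a near-duplicate is caught at entry time. The function name `find_existing`, the sample company names and the 0.8 similarity cutoff are assumptions chosen for illustration. The matching itself uses `difflib` from the Python standard library, which is a simple stand-in for whatever matching a real data quality service would offer.

```python
import difflib


def find_existing(name, known_names, cutoff=0.8):
    """Return stored names that closely resemble the typed name."""
    # Compare case-insensitively, but return the names as they are stored.
    lowered = {known.lower(): known for known in known_names}
    hits = difflib.get_close_matches(
        name.lower(), list(lowered), n=3, cutoff=cutoff
    )
    return [lowered[hit] for hit in hits]


# Hypothetical customer names already in the data store.
customers = ["Copenhagen Shipping", "Acme Trading Ltd", "Nordic Data House"]

# A misspelled entry still finds the existing record.
print(find_existing("Copenhagn Shiping", customers))
```

If the search returns a candidate, the user can pick the existing record instead of creating a duplicate that has to be cleansed downstream later.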

Small business support

The real world is crowded with organisations where bureaucratic data owner, steward and custodian hierarchies, comprehensive data governance policies and excessive technology implementation make no sense (and give no ROI).

Nevertheless many modest organisations do store, and in my experience also use, huge amounts of data. So we need more agile methods and tools to cover the needs of these organisations. Check this post: The Statue of Liberty versus The Little Mermaid.

Human like technology

User friendly, configurable software that is able to think and make decisions as well as, or better than, humans is becoming more and more realistic in the field of data quality too. Add to that the classic advantages of software compared to humans: speed and repeatability. In the end technology always prevails, and we will build some beautiful data stores around it.


11 thoughts on “Data Quality 2.0”

  1. Interesting points Henrik, I’ve not actually heard of people using DQ 2.0 but I guess it was inevitable.

    I would like to think that DQ 2.0 is a sign of maturity, moving away from downstream cleansing and cost-creating activities to an upstream prevention model where we actively design DQ management into our new technologies, business processes and the very fabric of how we deliver services within the organisation.

    Sadly, I think we’ll still be at DQ 1.0 for quite some time!

  2. Well said Dylan. Stepping up on the maturity ladder is also often mentioned when defining the other 2.0’s: Enterprise 2.0, Web 2.0, Data Warehouse 2.0….

    Upstream prevention is paramount, I certainly agree on that point too.

    Thanks a lot.

  3. Great read Henrik. Like Dylan I believe DQ 2.0 is still some way off. Part of the problem is not fully appreciating what’s involved in globalisation of DQ technologies.

    Don’t get me wrong, I utterly agree global DQ is essential to move forward.

    My issue is that the resource required for a vendor to confidently build such technology is vast. Having the expert knowledge in every country they wish to support – understanding the nuances of addressing, communication, language, dialects etc. is mind blowing!

    I’m not saying it’s impossible or shouldn’t be done, but having developed data quality technology for 10 years now I know how tricky global accessibility can be. Even more so for SMEs.

    And I can’t stress enough Dylan’s point that a shift to prevention rather than downstream cleansing is essential. DQ 1.5 perhaps??

    @tizma

  4. Thanks Andy. What a nice day with clever people joining the blog. Very true observations about international aspects. 2.0 is the goal, 1.5 is a milestone on the way from 1.0.

  5. Oh Henrik! Please don’t ruin my day by telling me that yet another buzz word is in the offing! Heaven forfend! Don’t we have enough to contend with?

    I have an in-built horror of such terminology. The terms come around with boring regularity. They provide a shield for “experts” to hide behind and usually serve only to distract and to provide excuses for not tackling the real issue: the quality of the data!

    I’ve been banging on since the year dot about needing to understand global markets and cultures, and of moving the data quality effort to upstream prevention instead of downstream cleansing, but while I may be on DQ 2.0 (gruesome!), then there are plenty of examples of companies still at DQ 0.0. As we all move at different speeds … and sometimes backwards … I don’t expect we’ll all have our noses in the same direction any time soon!

  6. Graham, I was waiting for that – but: When in Rome do as the Romans do. I will indeed agree with you on this some other place some other time.

  7. Summing up the post and the comments so far.

    Data Quality X.X are merely maturity milestones where:

    Data Quality 0.0 may be seen as a Laissez-faire state where nothing is done.

    Data Quality 1.0 may be seen as projects improving data quality, typically using batch cleansing with nationally oriented techniques in order to make data fit for purpose.

    Data Quality 2.0 may be seen as enterprise-wide and small business data quality prevention using multi-cultural combined techniques, exploiting cloud-based reference data in order to keep data fit for multiple purposes.
