The "First Time Right" principle is a sound principle for data quality, and indeed getting data right the first time is a fundamental concept in the instant Data Quality service I'm working with these days.
However, some blog posts in the data quality realm this week have pointed out that there is a life, and sometimes an end of life, after data has hopefully been captured right the first time.
In the post From Cradle to Grave by Guy Mucklow on the Postcode Anywhere blog, the bad consequences of chasing a debt from a customer who is no longer among us are examined.
Asset In, Garbage Out: Measuring data degradation is the title of a post by Rob Karel on Informatica Perspectives. In it, Rob goes through the dangers data may encounter after being entered right the first time.
Some years ago I touched on the subject in the post Ongoing Data Maintenance. As told there, I'm convinced, after having seen it work, that a good approach to also getting it right the last time is to capture data in a way that makes it maintainable.
Some techniques for doing this are:
- Where possible collect external identifiers
- Atomize data instead of squeezing several different elements into one attribute
- Make the data model reflect the real world
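To illustrate the first two techniques, here is a minimal sketch in Python. The squeezed record, the class names, and the DUNS value are all hypothetical; the point is that atomized attributes plus an external identifier (such as a D&B DUNS number) make a record matchable and refreshable later in its life.

```python
from dataclasses import dataclass
from typing import Optional

# Anti-pattern: several different elements squeezed into one attribute.
# Hard to validate, hard to match, hard to keep up to date.
squeezed = {"customer": "Acme Corp, 12 Main St, Springfield, Jane Doe"}

@dataclass
class Address:
    street: str
    city: str

@dataclass
class BusinessPartner:
    name: str
    address: Address          # atomized: each real-world element has its own place
    contact_person: str
    duns_number: Optional[str] = None  # external identifier enabling ongoing maintenance

# The same customer captured in a maintainable way (values are made up):
acme = BusinessPartner(
    name="Acme Corp",
    address=Address(street="12 Main St", city="Springfield"),
    contact_person="Jane Doe",
    duns_number="123456789",  # hypothetical DUNS
)
```

With the external identifier in place, a periodic job can look the record up at the reference source and refresh the atomized attributes one by one, instead of parsing a free-text blob.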
And oh, it's not the first time, nor the last time, I touch on this subject. It needs constant attention.
It's always best to start correctly and to have an onboarding process that integrates correct, verified information right from the start. This saves a lot of trouble further on. However, this cannot work without an ongoing process for keeping information fresh and up to date. I always hear: "how many changes can there be in a year, really?" As an example, there are about 40 million changes a month in D&B's global database of 226 million records, so the answer is… MANY.
Thanks for commenting, Bernt-Olof. Indeed the business world, which is nicely reflected in the Dun & Bradstreet WorldBase, isn't exactly a constant.
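A quick back-of-the-envelope check on the figures Bernt-Olof quotes shows just how non-constant it is (only the numbers given above are used):

```python
changes_per_month = 40_000_000   # changes per month in the global database
records = 226_000_000            # records in the database

annual_changes = changes_per_month * 12          # 480 million changes a year
changes_per_record = annual_changes / records    # average churn per record

print(f"{changes_per_record:.1f} changes per record per year")
```

On average every record is touched roughly twice a year, so a record captured right the first time will not stay right for long without ongoing maintenance.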