Ongoing Data Maintenance

Getting data right at entry, at the root, is important, and most (if not all) data quality professionals agree that this is a better approach than cleansing the data downstream.

The problem, however, is that most data erodes as time passes. What was right at the time of capture will at some point no longer be right.

Therefore data entry should ideally capture not only a snapshot of correct information but also the raw data elements that make the data easy to maintain.

An obvious example: If I tell you that I am 49 years old, that may be just the piece of information you need to complete a business process. But if you ask me for my birth date instead, you can still work out my age with a bit of calculation. Based on that raw data you will also know when I turn 50 (all too soon), and your organization will know my age if we do business again later.
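The bit of calculation in the age example can be sketched in Python. This is only an illustration: the function names `age_on` and `date_turning` and the sample birth date are hypothetical, and treating a 29 February birthday as 1 March in non-leap years is an assumption that differs between jurisdictions.

```python
from datetime import date


def age_on(birth_date: date, as_of: date) -> int:
    """Return the age in whole years on the date as_of."""
    years = as_of.year - birth_date.year
    # One year less if the birthday has not yet come around in the as_of year.
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def date_turning(birth_date: date, age: int) -> date:
    """Return the date on which the person reaches the given age."""
    try:
        return birth_date.replace(year=birth_date.year + age)
    except ValueError:
        # Born on 29 February and the target year is not a leap year:
        # fall back to 1 March (an assumption, see the note above).
        return date(birth_date.year + age, 3, 1)


# Hypothetical birth date, chosen only so the figures match the example.
birth = date(1960, 3, 15)

print(age_on(birth, date(2009, 11, 17)))  # 49
print(age_on(birth, date(2010, 11, 17)))  # 50, with no new data entry
print(date_turning(birth, 50))            # 2010-03-15
```

Storing the birth date rather than the age means the derived value never goes stale: the same record answers the question correctly whenever it is asked.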

Birth dates are stable personal data. Gender mostly is too. But most other data changes over time. In many cultures names change upon marriage and perhaps divorce, and people may change names after discovering bad numerology. People move, or a street name may be changed.

There are many privacy concerns around identifying individual persons, and the norms differ between countries. In Scandinavia we are used to being identified by our unique citizen ID, though even here within debatable limits. Still, solutions are on offer for maintaining raw data that deliver valid and timely B2C information at the precision asked for, when it is needed.

By contrast, identifying a business entity is broadly accepted everywhere. Public sector registrations are a basic source of identifying IDs, with varying uniqueness and completeness around the world. Private providers have developed proprietary ID systems like the D-U-N-S Number from D&B. All in all, such solutions are good sources for ongoing maintenance of your B2B master data assets.

Addresses, whether they belong to business or consumer/citizen entities or are just addresses, are available as external reference data covering more and more spots on the Earth. Ongoing development in open government data helps with availability and completeness, and these data are often deployed in the cloud. Right now the focus is mostly on visual presentation on maps, but no doubt more services will follow.

Getting data right at entry and being able to maintain its real-world alignment is the challenge if you don’t regard your data asset as a throw-away commodity.

Figure 1: one year old prime information

PS: If you forgot to maintain your data: before dumping it, data cleansing might be a sustainable alternative.


8 thoughts on “Ongoing Data Maintenance”

  1. Charles Blyth 17th November 2009 / 16:27

    Henrik,

    A great detailed and clear post on a critical consideration for data management practitioners. Well explained as usual. For me ensuring data is as current as possible is key to MDM, and specifically CDI, success. You aren’t nearly 50 are you?

    Cheers

    Charles
    PS: Love the graphic 🙂

    • Henrik Liliendahl Sørensen 17th November 2009 / 16:42

      Thanks Charles. In Scandinavia we are constantly reminded about our age as the Citizen ID we have to use almost everywhere contains our birth date. So, no way around.

    • Henrik Liliendahl Sørensen 17th November 2009 / 16:45

      Thanks Per.

  2. Jim Harris 17th November 2009 / 16:45

    Great post Henrik,

    You are absolutely right that what is sometimes lost in the “cleanse vs. prevent” debate about data quality best practices is that however you “achieve quality” for your data, your job isn’t done – you must also provide ongoing maintenance.

    Data decay is the enemy of accuracy, relevancy, and many other dimensions of data quality.

    Best Regards,

    Jim

    P.S. I guess that we should expect such wisdom from a man of your age 😉

    • Henrik Liliendahl Sørensen 17th November 2009 / 16:58

      Hi Jim. Nice to see you in here too. You learn all through life – today I discovered the reply threading in WordPress.

  3. Alan Stein 24th June 2010 / 10:25

    Interesting post, Henrik. So many of these concepts are very simple when they’re broken down, but organizational layers, bureaucracy and politics make them difficult.

    The gatherers of the data are traditionally far apart from the users of the data and the managers/stewards of the data. Bringing them closer together so that they appreciate each others’ jobs can help get back to that simplicity.

  4. Henrik Liliendahl Sørensen 24th June 2010 / 15:26

    Thanks Alan, I agree. Communication across silos and borders is essential when improving data quality.
