Reduplication: The next big thing?

Today I got a very exciting Master Data Management assignment. Usually I work on deduplication, where two or more rows in a database are merged into one golden record because the original rows represent the same real-world entity.

But in this case we are going to split one row into several rows with random keys (a so-called MNUID = Messy Non-Unique IDentifier). Names and addresses also have to be misspelled in different ways so that they are not easily recognized as being the same.

My client, the Danish Tax Authorities, has for years tried to develop methods for taxation above 100% and has finally arrived at this simple but very efficient method. Until now you, as one person or one company, have paid up to 60% tax, but now each duplicate row will pay 60%. So in phase one, with two duplicate rows, you may in fact pay 120%, and in later phases this will be extended to larger duplicate groups paying much higher percentages.

Some foreign tax authorities have already shown deep interest in this model (called Intelligent Reduplication for Supertaxation). First among them are our Scandinavian neighbors, but eventually it may spread to the rest of the world.


6 thoughts on “Reduplication: The next big thing?”

  1. Rich Murnane 1st April 2010 / 13:09

    What a great project, would kind of feel like being a kid in the candy store I bet…Rich

  2. Jim Harris 1st April 2010 / 13:46

    Excellent report Henrik!

    Finally someone demonstrates the real value of poor data quality! No wonder so many organizations and past clients have resisted fixing the “duplicate problem.”

    Denmark has once again proven its thought leadership in the areas of data quality and master data management.

    Jealous American Regards,

    Jim

    🙂

  3. Steve Sarsfield 1st April 2010 / 14:09

    Henrik,
    Please keep this strategy quiet. It may catch on in other parts of the world, leaving us all in debt to the tax man when we die.
    Steve

  4. Henrik Liliendahl Sørensen 2nd April 2010 / 06:38

    Thanks Rich, Jim and Steve for commenting on this post, which is of course an April fool’s hoax.

  5. Jacqueline Roberts 9th April 2010 / 18:53

    Henrik,

    I understand that your post is an April fool’s hoax; however, your thought process is actually a necessary functionality in the PIM arena. Many times we will have a kit part submitted to us for cleansing and we need to split the record into the individual items that make up the kit. We refer to this function as a “part split”.

    Nice to know that even if you are April fooling, you touch upon a relevant topic!!!

  6. Henrik Liliendahl Sørensen 9th April 2010 / 19:04

    Thanks Jackie, you are absolutely right. Sometimes we actually have to split rows in order to align data with the real world.

    I have done this with party master data too, for example with rows containing names such as:
    • Margaret & John Smith
    • Margaret Smith. John Smith
    • Johnson & Johnson Limited, John Smith
    • Johnson Furniture Inc., Sales Dept

    More here.
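
The kind of row splitting described in the last comment can be sketched in a few lines of Python. This is only a minimal illustration built around the four example names above, not the method actually used in any project: the `split_party_row` and `looks_like_company` functions and the `LEGAL_FORMS` word list are assumptions made up for this sketch, and real party data would need far more robust rules.

```python
import re

# Hypothetical, deliberately short list of legal-form words that
# suggest a chunk is an organisation name rather than a person.
LEGAL_FORMS = {"limited", "ltd", "inc", "corp", "llc"}


def looks_like_company(name):
    """Return True if any word in the name is a known legal form."""
    words = re.findall(r"\w+", name.lower())
    return any(word in LEGAL_FORMS for word in words)


def split_party_row(raw):
    """Split one messy name field into a list of party names.

    Rules (assumptions for this sketch):
    1. A comma separates parties.
    2. A company name is kept whole, including any '&' or '.'.
    3. Otherwise a full stop followed by whitespace separates parties.
    4. 'First & First Last' is read as two people sharing a last name.
    """
    parties = []
    for chunk in re.split(r",\s*", raw.strip()):
        if not chunk:
            continue
        if looks_like_company(chunk):
            parties.append(chunk)
            continue
        for piece in re.split(r"\.\s+", chunk):
            piece = piece.strip()
            if not piece:
                continue
            if " & " in piece:
                left, right = (p.strip() for p in piece.split(" & ", 1))
                # "Margaret & John Smith": lend the shared last name
                # to the first person.
                if " " not in left and " " in right:
                    left = f"{left} {right.split()[-1]}"
                parties.extend([left, right])
            else:
                parties.append(piece)
    return parties


for row in [
    "Margaret & John Smith",
    "Margaret Smith. John Smith",
    "Johnson & Johnson Limited, John Smith",
    "Johnson Furniture Inc., Sales Dept",
]:
    print(row, "->", split_party_row(row))
```

Note that the sketch only splits the text. Deciding whether a piece such as "Sales Dept" is a party in its own right or an attribute of the company is a separate modelling question.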
