When building the baseline for customer data in a new master data management hub, you often rely on heavy data matching to de-duplicate the current stock of customer master data, so that you start, so to speak, with a cleansed, duplicate-free set of data.
I have been involved in such a process many times, and the result has never been free of duplicates. For two reasons:
- Even with the best data matching tool and the best external reference data available, you can’t settle all real-world alignments with the confidence needed, and manual verification is costly and slow.
- To make the data fit for its business purposes, duplicates are often required, and for good reasons.
Being able to store the full story from the result of the data matching efforts is what makes me, and the database, most happy.
The “golden record” is often not in fact a single record but a hierarchical structure that reflects both the real-world entity, as far as we can determine it, and the instances of this real-world entity in forms that are suitable for different business processes.
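To make the hierarchy idea a bit more concrete, here is a minimal sketch in Python. All class names, field names and sample values are my own illustrative assumptions, not the data model of any particular MDM product. The point is only that the party (the real-world entity) sits at the top, the source instances hang below it, and the outcome of the matching is kept on each link rather than thrown away.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceRecord:
    """One customer record as it exists in a source system (hypothetical fields)."""
    source_system: str
    source_id: str
    name: str
    purpose: str  # business process this instance serves, e.g. "billing"


@dataclass
class MatchLink:
    """Ties a source record to a party and keeps the matching outcome."""
    record: SourceRecord
    match_score: float      # confidence reported by the matching step (assumed 0..1)
    verified: bool = False  # True once someone has manually confirmed the link


@dataclass
class Party:
    """The real-world entity, as far as the matching could establish it."""
    party_id: str
    display_name: str
    links: List[MatchLink] = field(default_factory=list)

    def instances_for(self, purpose: str) -> List[SourceRecord]:
        """Return the instances kept for one business purpose."""
        return [link.record for link in self.links if link.record.purpose == purpose]

    def unverified(self) -> List[MatchLink]:
        """Return the links that still await manual verification."""
        return [link for link in self.links if not link.verified]


# Made-up sample data: one party with two deliberately kept instances.
party = Party("P-1", "Acme Ltd")
party.links.append(
    MatchLink(SourceRecord("crm", "C-17", "ACME Limited", "sales"), 0.97, verified=True)
)
party.links.append(
    MatchLink(SourceRecord("erp", "E-42", "Acme Ltd.", "billing"), 0.81)
)

billing_ids = [r.source_id for r in party.instances_for("billing")]
pending = len(party.unverified())
```

In this sketch nothing is merged away: both instances survive under the same party, and the uncertain link stays flagged until somebody has had the time to verify it.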
Some of the tricky constructions that exist in the real world and are usual suspects for multiple instances of the same real world entity are described in the blog posts:
The reasons for having business rules leading to multiple versions of the truth are discussed in the posts:
I’m looking forward to yet another party master data hub migration next week under the above conditions.