A Master Data initiative is often described as making a “golden view” of all Master Data records held by an organization in various databases used by different applications serving a range of business units.
In doing that (either in the initial consolidation or the ongoing insertion and update) you will time and again encounter situations where two versions of the same element must be merged into one version of the truth.
In some MDM hub styles the decision is to be taken at consolidation time, in other styles the decision is prolonged until the data (links) is consumed in a given context.
In the following I will talk about Party Master Data being the most common entity in Master Data initiatives.
This spring Jim Harris made a brilliant series of articles on DataQualityPro on the subject of identifying duplicate customers ending with part number 5 dealing with survivorship. Here Jim describes all the basic considerations on how some data elements survives a merge/purge and others will be forgotten and gives good examples with US consumer/citizens.
Taking it from there Master Data projects may have the following additional challenges and opportunities:
- Global Data adds diversity into the rule set of consolidation data on record level as well as field level. You will have to comprise on simple global rules versus complex optimized rules (and supporting knowledge data) for each country/culture.
- Multiple types of Party Master Data must be handled when Business Partners includes business entities having departments and employees and not at least when they are present together with consumers/citizens.
- External Reference Data is becoming more and more common as part of MDM solutions adding valid, accurate and complete information about Business Partners. Here you have to set rules (on field level) of whether they override internal data, fills in the blanks or only supplements internal data.
- Hierarchy building is closely related to survivorship. Rules may be set for whether two entities goes into two hierarchies with surviving parts from both or merges as one with survivorship. Even an original entity may be split into two hierarchies with surviving parts.
What is essential in survivorship is not loosing any valuable information while not creating information redundancy.
An example of complex survivorship processing may be this:
A membership database holds the following record (Name, Address, City):
- Margaret & John Smith, 1 Main Street, Anytown
An eShop system has the following accounts (Name, Address, Place):
- Mrs Margaret Smith, 1 Main Str, Anytown
- Peggy Smith, 1 Main Street, Anytown
- Local Charity c/o Margaret Smith, 1 Main Str, Anytown
A complex process of consolidation including survivorship may take place. As part of this example the company Local Charity is matched with an external source telling it has a new name being Anytown Angels. The result may be this “golden view”:
ADDRESS in Anytown on Main Street no 1 having
• HOUSEHOLD having
– CONSUMER Mrs. Margaret Smith aka Peggy
– CONSUMER Mr. John Smith
• BUSINESS Anytown Angels having
– EMPLOYEE Mrs. Margaret Smith aka Peggy
Observe that everything survives in a global applicable structure in a fit hierarchy reflecting local rules handling multiple types of party entities using external reference data.
But OK, we didn’t have funny names, dirt, misplaced data…..