The Place for Data Matching in and around MDM

Data matching has increasingly become a component of Master Data Management (MDM) solutions. This has mostly been the case for MDM solutions for customer data, but it is also a component of MDM solutions for product data, not least when these solutions move into the multi-domain MDM space.

The deployment of data matching was discussed nearly 5 years ago in the post Deploying Data Matching.

While MDM solutions have since been picking up a growing share of the data matching being done, it is still a fairly small proportion of data matching that is performed within MDM solutions. Even if you have an MDM solution with data matching capabilities, you might still consider where data matching should be done. Some considerations I have come across are:

Acquisition and silo consolidation circumstances

A common use case for data matching is as part of an acquisition or internal consolidation of data silos where two or more populations of party master data, product master data and other important entities are to be merged into a single version of truth (or trust) in terms of uniqueness, consistency and other data quality dimensions.
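To make the use case concrete, here is a minimal sketch of matching two populations of party records by name. It assumes Python's standard `difflib` as a stand-in for a real matching engine; the record layout, the `candidate_matches` helper and the 0.85 threshold are hypothetical choices for illustration, not taken from any particular MDM product.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_matches(population_a, population_b, threshold=0.85):
    """Pair up records from two populations whose names look alike.

    The threshold is an assumed value; real projects tune it per data set.
    """
    pairs = []
    for rec_a in population_a:
        for rec_b in population_b:
            score = similarity(rec_a["name"], rec_b["name"])
            if score >= threshold:
                pairs.append((rec_a["id"], rec_b["id"], round(score, 2)))
    return pairs


# Hypothetical sample data: one silo from each side of an acquisition.
acquirer = [
    {"id": "A1", "name": "Acme Trading Ltd"},
    {"id": "A2", "name": "Nordic Foods"},
]
acquired = [
    {"id": "B1", "name": "ACME  Trading Ltd."},
    {"id": "B2", "name": "Baltic Shipping"},
]

matches = candidate_matches(acquirer, acquired)
```

Comparing every record with every other record only works for small populations. Real consolidation projects usually add blocking or indexing, and they match on more attributes than the name.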

While the MDM hub must be the end goal for storing that truth (or trust), there may be good reasons for doing the data matching before the actual on-boarding of the master data.

These considerations include

The point of entry

For many good reasons, the MDM solution isn’t always the system of entry. Doing the data matching at the stage where data is put into the MDM hub may be too late. Exposing the data matching capabilities as a Service Oriented Architecture component may be a better way, as pondered in the post Service Oriented Data Quality.
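One way to picture the service idea is a single matching component that both a point-of-entry check and a batch job call, so the same rule applies in both places. The sketch below is an in-process stand-in under stated assumptions: the `MatchingService` class, its method names and the threshold are hypothetical, and `difflib` again replaces a real matching engine. A real deployment would expose the same logic over a network interface.

```python
from difflib import SequenceMatcher


class MatchingService:
    """Hypothetical in-process stand-in for a shared data matching service."""

    def __init__(self, master_records, threshold=0.85):
        self.master_records = list(master_records)
        self.threshold = threshold  # assumed value, to be tuned per data set

    @staticmethod
    def _key(name):
        """Lowercase and collapse whitespace before comparing."""
        return " ".join(name.lower().split())

    def find_matches(self, name):
        """Return (record, score) pairs at or above the threshold, best first."""
        scored = []
        for record in self.master_records:
            score = SequenceMatcher(
                None, self._key(name), self._key(record["name"])
            ).ratio()
            if score >= self.threshold:
                scored.append((record, score))
        return sorted(scored, key=lambda pair: pair[1], reverse=True)

    def check_at_entry(self, name):
        """Point-of-entry use: list existing records the new name may duplicate."""
        return [record["id"] for record, _ in self.find_matches(name)]

    def check_batch(self, incoming):
        """Batch use: apply the same rule to every incoming record."""
        return {rec["id"]: self.check_at_entry(rec["name"]) for rec in incoming}


service = MatchingService([{"id": "M1", "name": "Acme Trading Ltd"}])

# A typo made in an entry screen still finds the existing master record.
entry_hits = service.check_at_entry("Acme Tradng Ltd")

# The same rule applied to a batch load from another system.
batch_hits = service.check_batch([
    {"id": "N1", "name": "acme trading ltd."},
    {"id": "N2", "name": "Baltic Shipping"},
])
```

Keeping the rule in one place is the point of the design: the entry screen and the batch load cannot drift apart in what they consider a duplicate.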

Avoiding data matching

Even as a long-time data matching practitioner, I’m afraid I have to bring up the subject of avoiding data matching, as further explained in the post The Good, The Better and The Best Way of Avoiding Duplicates.


4 thoughts on “The Place for Data Matching in and around MDM”

  1. Gary Allemann 6th November 2014 / 17:11

    Many MDM solutions are deployed as data aggregation or integration hubs. Data matching performed as part of the data quality process before the data enters MDM can be a great way to enhance the value of MDM. Great insights, Henrik

  2. Jean-Michel Franco 7th November 2014 / 15:25

    Interesting as always, Henrik. Great topic, and by the way something I touched upon in a recent blog post on “data quality everywhere”.
    I’m unclear, however, on “who” should provide what you refer to as the “data matching capabilities as Service Oriented Architecture component”. Shouldn’t this be a “master data service”, that is, a web service operated by the MDM solution?

    Same question on your last point. My understanding is that what is specific there is that you have to deal with multiple data points for reference data, correct? But isn’t it still matching, happening in real time rather than “after the fact”? And wouldn’t you then need a point of reference for rules, for example to prioritize the different sources in case they hold different “versions of truth”? And how would you deal with data stewardship in case you cannot decide through algorithms between those versions?

  3. Henrik Liliendahl Sørensen 7th November 2014 / 16:10

    Thanks a lot for commenting Gary and Jean-Michel.

    @Jean-Michel: I can probably best answer your questions by telling a bit about some implementation projects I have been involved in on the vendor side, with data quality tools for master data.

    The SOA approach was taken when I worked with a product called Omikron Data Quality Server. In several projects we supported master data repositories with data matching where the same rules were applied both in batch processing and at point of entry. In many cases the matching capabilities in the master data solutions simply weren’t good enough.

    In relation to your second question, I am today working with a solution called iDQ™. In an interactive on-boarding business process it performs a series of fuzzy searches against internal master data and 3rd party reference data. The result is a mashup of data from which data entry personnel (data stewards if you like) can make an informed decision on using/enriching a current entity or creating a new entity based on the best available external reference data.

    Technology-wise, we are using the fuzzy data matching tool capability only as a fuzzy search tool.
