No One MDM Solution Can Fully Satisfy All Current and Future Use Cases

The title of this post is taken from the Gartner Critical Capabilities for Master Data Management Solutions.

One implication of this observation is that, when selecting your solution, you will not be able to rely on a generic analyst ranking of solutions, as examined in the post Generic Ranking of Vendors versus an Individual Selection Service.

This is the reason The Disruptive MDM / PIM / DQM List exists.

Another implication is that even the best fit MDM solution will not necessarily cover all your needs.

One example is within data matching, where I have found that the embedded solutions in MDM tools often have only limited capabilities. To close this gap, there are best-of-breed data matching solutions on the market that can supplement the MDM solutions.

Another example close to me is within multienterprise (business ecosystem wide) MDM, as MDM solutions are focused on each given organization. Here your interaction with a trading partner, and the interaction by the trading partner with you, can be streamlined with a solution like Product Data Lake.

What is a Golden Record?

The term golden record is a core concept within Master Data Management (MDM) and Data Quality Management (DQM). A golden record is a representation of a real world entity that may be compiled from multiple different representations of that entity in a single or in multiple different databases within the enterprise system landscape.

A golden record is optimized towards meeting data quality dimensions such as:

  • Being a unique representation of the real world entity described
  • Having a complete description of that entity covering all purposes of use in the enterprise
  • Holding the most current and accurate data values for the entity described

In Multidomain MDM we work with a range of different entity types such as party (with customer, supplier, employee and other roles), location, product and asset. The golden record concept applies to all of these entity types, but in slightly different ways.

Party Golden Record

Having a golden record that facilitates a single view of customer is probably the best-known example of using the golden record concept. Managing customer records and dealing with duplicates of those is the most frequent data quality issue around.

Preventing duplicate records from entering your MDM world is the best approach. If you are not able to do that, you have to apply data matching capabilities. When identifying a duplicate you must be able to intelligently merge any conflicting views into a golden record as examined in the post Three Master Data Survivorship Approaches.
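As an illustration of merging conflicting views, here is a minimal Python sketch of one possible survivorship rule: for each attribute, keep the most recently updated non-empty value. The record layout (`updated`, `data`) and the function name `build_golden_record` are assumptions made for this example, not something taken from a specific MDM product.

```python
from datetime import date

def build_golden_record(records):
    """Merge duplicate records into one golden record.

    Assumed survivorship rule: per attribute, the most recently
    updated non-empty value wins.
    """
    golden = {}
    # Process oldest first, so newer non-empty values overwrite older ones.
    for rec in sorted(records, key=lambda r: r["updated"]):
        for field, value in rec["data"].items():
            if value not in (None, ""):
                golden[field] = value
    return golden

# Two hypothetical duplicate customer records from different sources.
duplicates = [
    {"updated": date(2018, 5, 1),
     "data": {"name": "John Smith", "phone": "555-0100", "email": ""}},
    {"updated": date(2019, 9, 12),
     "data": {"name": "John A. Smith", "phone": "", "email": "john@example.com"}},
]

golden = build_golden_record(duplicates)
print(golden)
```

Here the newer record supplies the name and email, while the phone number survives from the older record because the newer one has no value for it. Real survivorship rules often also weigh source trust and data validity.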

To a lesser degree we see the same challenges in getting a single view of suppliers. Ultimately, and this is one of my favourite subjects, you will want to have a single view of any business partner, also where the same real world entity has customer, supplier and other roles towards your organization.

Location Golden Record

Having the same location only represented once in a golden record and applying any party, product and asset record, and ultimately golden record, to that record may be seen as quite academic. Nevertheless, striving for that concept will solve many data quality conundrums.

Location management has different meanings and importance for different industries. One example is that a brewery does business with the legal entity (party) that owns a bar, café or restaurant. However, even though the owner of that place changes, which happens a lot, the brewery is still interested in being the brand served at that place. Also, the brewery wants to keep records of logistics around that place and the historic volumes delivered to that place. Utility and insurance are other examples of industries where the location golden record (should) matter a lot.

Knowing the properties of a location also supports the party deduplication process. For example, if you have two records with the name “John Smith” on the same address, the probability of that being the same real world entity is dependent on whether that location is a single-family house or a nursing home.
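The idea can be sketched in Python as a lookup of a prior probability per location type. The probabilities, the location type names and the function `same_entity_probability` are invented for illustration only, and the sketch only handles an exact (case-insensitive) name match.

```python
# Hypothetical priors that two same-name records at one address describe
# the same person, by location type. The values are illustrative only.
SAME_PERSON_PRIOR = {
    "single_family_house": 0.95,
    "apartment_building": 0.60,
    "nursing_home": 0.30,
}

def same_entity_probability(name_a, name_b, location_type, default=0.5):
    """Return an assumed probability that two records at the same
    address are the same real world entity."""
    if name_a.strip().lower() != name_b.strip().lower():
        # This sketch does not attempt fuzzy name matching.
        return 0.0
    return SAME_PERSON_PRIOR.get(location_type, default)

print(same_entity_probability("John Smith", "John Smith", "single_family_house"))
print(same_entity_probability("John Smith", "John Smith", "nursing_home"))
```

In a real matching engine such priors would be estimated from data rather than hard-coded.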

Product Golden Record

Product Information Management (PIM) solutions became popular with the rise of multi-channel, where having the same representation of a product in offline and online channels is essential. The self-service approach in online sales also brought the requirement of managing a lot more product attributes than seen before, which again points to a solution that handles the product entity centrally.

In large organizations that have many business units around the world you struggle with having a local view and a global view of products. A given product may be a finished product to one unit but a raw material to another unit. Even a global SAP rollout will usually not clarify this – rather the contrary.

While third party reference data helps a lot with handling golden records for party and location, this is less the case for product master data. Classification systems and data pools do exist, but will certainly not take you all the way. With product master data we must, in my eyes, rely more on second party master data, meaning sharing product master data within the business ecosystems where you operate.

Asset (or Thing) Golden Record

In asset master data management you also have different purposes where having a single view of a real world asset helps a lot. There are namely financial purposes and logistic purposes that have to be aligned, but also a lot of other purposes depending on the industry and the type of asset.

With the rise of the Internet of Things (IoT) we will have to manage a lot more assets (or things) than we usually have considered. When a thing (a machine, a vehicle, an appliance) becomes intelligent and now produces big data, master data management and indeed multi-domain master data management become imperative.

You will want to know a lot about the product model of the thing in order to make sense of the produced big data. For that, you need the product (model) golden record. You will want to have deep knowledge of the location in time of the thing. You cannot do that without the location golden records. You will want to know the different party roles in time related to the thing. The owner, the operator, the maintainer. If you want to avoid chaos, you need party golden records.

Data Matching and Deduplication

The two terms data matching and deduplication are often used synonymously.

In the data quality world deduplication is used to describe a process where two or more data records that describe the same real-world entity are merged into one golden record. This can be executed in different ways as told in the post Three Master Data Survivorship Approaches.

Data matching can be seen as an overarching discipline to deduplication. Data matching is used to identify the duplicate candidates in deduplication. Data matching can also be used to identify matching data records between internal and external data sources as examined in the post Third-Party Data Enrichment in MDM and DQM.

As an end-user organization you can implement data matching / deduplication technology from either pure play Data Quality Management (DQM) solution providers or through data management suites and Master Data Management (MDM) solutions as reported in the post DQM Tools In and Around MDM Tools.

When matching internal data records against external sources one often used approach is utilizing the data matching capabilities at the third-party data provider. Such providers as Dun & Bradstreet (D&B), Experian and others offer this service in addition to offering the third-party data.

To close the circle, end-user organizations can use the external data matching result to improve the internal deduplication and more. One example is to apply a matched D-U-N-S Number from D&B for company records as a strong deduplication candidate selection criterion. In addition, such data matching results may often result not in a deduplication, but in building hierarchies of master data.
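Using an externally matched identifier as a candidate selection criterion can be sketched in Python as grouping records on that identifier. The field name `duns`, the record layout and the nine-digit numbers below are made up for this example.

```python
from collections import defaultdict

def candidate_groups_by_duns(records):
    """Group record ids that share the same matched identifier.

    Only groups with more than one record are returned, as these are
    the deduplication (or hierarchy) candidates.
    """
    groups = defaultdict(list)
    for rec in records:
        duns = rec.get("duns")
        if duns:  # Records without an external match are skipped.
            groups[duns].append(rec["id"])
    return {duns: ids for duns, ids in groups.items() if len(ids) > 1}

# Hypothetical company records with made-up identifiers.
records = [
    {"id": "C1", "name": "Acme Ltd", "duns": "123456789"},
    {"id": "C2", "name": "ACME Limited", "duns": "123456789"},
    {"id": "C3", "name": "Beta Inc", "duns": "987654321"},
    {"id": "C4", "name": "Gamma ApS", "duns": None},
]

groups = candidate_groups_by_duns(records)
print(groups)
```

Records C1 and C2 end up in the same candidate group even though their names differ. Whether such a group should be merged or kept as separate nodes in a hierarchy is a decision for the next step.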

10 MDMish TLAs You Should Know

TLA stands for Three Letter Acronym. The world is full of TLAs. The IT world is indeed full of TLAs. The Data Management world is also full of TLAs. Here are 10 TLAs from the data management space that surrounds Master Data Management:

MDM: Master Data Management can be defined as a comprehensive method of enabling an enterprise to link all of its critical data to a common point of reference. When properly done, MDM improves data quality, while streamlining data sharing across personnel and departments. In addition, MDM can facilitate computing in multiple system architectures, platforms and applications. You can find the source of this definition and 3 other – somewhat similar – definitions in the post 4 MDM Definitions: Which One is the Best?

The most addressed master data domains are parties encompassing customer, supplier and employee roles, things such as products and assets, as well as locations.

PIM: Product Information Management is a discipline that overlaps MDM. In PIM you focus on product master data and a long tail of specific product information – often called attributes – that is needed for a given classification of products.

Furthermore, PIM deals with how products are related as for example accessories, replacements and spare parts as well as the cross-sell and up-sell opportunities there are between products.

PIM also handles how products have digital assets attached.

This data is used in omni-channel scenarios to ensure that the products you sell are presented with consistent, complete and accurate data. Learn more in the post Five Product Information Management Core Aspects.

DAM: Digital Asset Management is about handling extended features of digital assets often related to master data and especially product information. The digital assets can be photos of people and places, product images, line drawings, certificates, brochures, videos and much more.

Within DAM you are able to apply tags to digital assets, you can convert between the various file formats and you can keep track of the different format variants – like sizes – of a digital asset.

You can learn more about how these first 3 mentioned TLAs are connected in the post How MDM, PIM and DAM Stick Together.

DQM: Data Quality Management deals with assessing and improving the quality of data in order to make your business more competitive. It is about making data fit for the intended (multiple) purpose(s) of use, which most often is best achieved by real-world alignment. It is about people, processes and technology. When it comes to technology there are different implementations as told in the post DQM Tools In and Around MDM Tools.

The most used technologies in data quality management are data profiling, which measures what the stored data looks like, and data matching, which links data records that do not have the same values but describe the same real world entity.
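Both technologies can be illustrated with a few lines of Python. This is only a sketch: the `profile` function computes just three simple measures, and the matching uses the general-purpose `difflib.SequenceMatcher` from the standard library rather than a dedicated data matching engine.

```python
from difflib import SequenceMatcher

def profile(values):
    """Basic data profiling: how many values, how many filled, how many distinct."""
    filled = [v for v in values if v not in (None, "")]
    return {"count": len(values), "filled": len(filled), "distinct": len(set(filled))}

def similarity(a, b):
    """Fuzzy similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

names = ["John Smith", "Jon Smith", "", "John Smith"]
print(profile(names))

# The values differ, but may describe the same real world entity.
print(similarity("John Smith", "Jon Smith"))
```

A score close to 1.0 flags the two names as a match candidate. Where to set the threshold between a match and a non-match is a tuning decision that trades false positives against false negatives.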

RDM: Reference Data Management encompasses those typically smaller lists of data records that are referenced by master data and transaction data. These lists do not change often. They tend to be externally defined but can also be internally defined within each organization.

Examples of reference data are hierarchies of location references such as countries, states/provinces and postal codes, different industry code systems and how they map to each other, and the many product classification systems to choose from.

Learn more in the post What is Reference Data Management (RDM)?

CDI: Customer Data Integration is considered as the predecessor to MDM, as the first MDMish solutions focused on federating customer master data handled in multiple applications across the IT landscape within an enterprise.

The most addressed sources of customer master data are CRM applications and ERP applications. However, most enterprises have several other applications where customer master data are captured.

You may ask: What Happened to CDI?

CDP: Customer Data Platform is an emerging kind of solution that provides a centralized registry of all data related to parties regarded as (prospective) customers at an enterprise.

In that way CDP goes far beyond customer master data by encompassing traditional transaction data related to customers and the emerging big data sources too.

Right now, we see such solutions coming both from MDM solution vendors and CRM vendors as reported in the post CDP: Is that part of CRM or MDM?

ADM: Application Data Management is about not just master data, but all critical data that is somehow shared between personnel and departments. In that sense MDM covers all master data within an organization, ADM covers all (critical) data in a given application, and the intersection is the master data in a given application.

ADM is an emerging term and we still do not have a well-defined market – if there ever will be one – as examined in the post Who are the ADM Solution Providers?

PXM: Product eXperience Management is another emerging term that describes a trend to distance some PIM solutions from the MDM flavour and move them more towards digital experience / customer experience themes.

In PXM the focus is on personalization of product information, Search Engine Optimization and exploiting Artificial Intelligence (AI) in those quests.

Read more about it in the post What is PxM?

PDS: Product Data Syndication connects MDM, PIM (and other) solutions at each trading partner with each other within business ecosystems. As this is an area where we can expect future growth along with the digital transformation theme, you can get the details in the post What is Product Data Syndication (PDS)?

One example of a PDS service is the Product Data Lake solution I have been working with during the last couple of years. Learn why this PDS service is needed here.

MDM License Distribution

One of the hard facts presented in the Gartner Magic Quadrant for Master Data Management (MDM) Solutions is how the vendor licenses are distributed between the various master data domains. You can find these figures from the previous quadrant in the post Counting MDM Licenses.

The Latest MDM Magic Quadrant also includes these numbers. In order to highlight how the vendors have different profiles, let us concentrate on the innovative solutions registered on The Disruptive MDM / PIM / DQM List.

MDM License Distribution
Source: Gartner

The above figure shows, for each vendor, the three domains where the vendor has sold the most licenses and how many customers are handling multiple domains.

Contentserv is coming from a strong position in the Product Information Management (PIM) market and still has the vast majority of its licenses attached to product master data.

Enterworks is also coming from the PIM space and is, with its ecosystem wide (or interenterprise as Gartner says) approach, building up the multidomain grip through encompassing supplier master data.

Informatica is covering all domains with its suite of 360 solutions and has a good portion of customers doing multidomain MDM.

Reltio does cover all domains but is increasingly focusing on the customer domain with its connected customer 360 offering that encompasses all customer data.

Riversand is another vendor coming from the PIM space that is now growing into the multidomain MDM sphere with its new cloud-native platform.

Semarchy is, with its Intelligent Data Hub concept, going beyond multidomain MDM into handling more kinds of data such as reference data and critical application data.

This diversity means that you cannot just use a generic ranking as presented in the magic quadrant when selecting the best fit solution for your intended use. You must make a tailored selection.

Most Visited Posts in 2019

Another year has gone as this blog is well into the 11th year of being online.

The 3 most visited blog posts this year were:

Data Matching, Machine Learning and Artificial Intelligence: A post from November 2018 about how AI and data matching have been combined for many years and how this theme has got a revival with the general rise of Artificial Intelligence (AI).

A Data Management Mind Map: A post with not so much text but instead an image reflecting how some of the most addressed data management disciplines can be mind-mapped.

Forrester vs Gartner on MDM/PIM: A post about how the two most acknowledged analyst firms rate the vendors on the Master Data Management (MDM) / Product Information Management (PIM) market. Early next year we can expect a new MDM Magic Quadrant from Gartner, so let us see how things look then.

Looking forward to what the next year – and decade – brings in the data quality, MDM and PIM space and to writing some posts about it.

Happy New Year.

So, you have the algorithm! But do you have the data?

In the game of winning in business by using Artificial Intelligence (AI) there are two main weapons you can use: Algorithms and data. In a recent blog post Andrew White of Gartner, the analyst firm, says that It’s all about the data – not the algorithm.

In the Master Data Management (MDM) space the equipping of solutions with AI capabilities has been going on for some time as reported in the post Artificial Intelligence (AI) and Master Data Management (MDM).

So, the next question is how to provide the data. It is questionable whether every single organization has sufficient (and well managed) master data to make a winning formula. Most organizations must, for many use cases, look beyond the enterprise firewall to get the training data, or better the data fuelled algorithms, to win the battles and the whole game.

An example of such a scenario is examined in the post Artificial Intelligence (AI) and Multienterprise MDM.

Are These Familiar Hierarchies in Your MDM / DQM / PIM Solution?

The term family is used in different contexts within Master Data Management (MDM), Data Quality Management (DQM) and Product Information Management (PIM) when working with hierarchy management and entity resolution.

Here are three frequent examples:

Consumer / citizen family

When handling party master data about consumers / citizens we can deal with the basic definition of a family, being a group consisting of two parents and their children living together as a unit.

This is used when the business scenario does not only target each individual person but also a household with a shared economy. When identifying a household, a common parameter is that the persons live at the same postal address (at the same time) while observing constellations such as:

  • Nuclear families consisting of a female and a male adult (and their children)
  • Rainbow families where the gender is not an issue
  • Extended families consisting of more than two generations
  • Persons who happen to live at the same postal address
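Identifying household candidates by shared postal address can be sketched in Python as below. The address normalization is deliberately naive, the field names are assumptions for this example, and the sketch ignores the time dimension (living at the address at the same time) as well as the question of which constellation the persons form.

```python
from collections import defaultdict

def normalize_address(address):
    """Naive normalization: lower-case, drop commas, collapse whitespace."""
    return " ".join(address.lower().replace(",", " ").split())

def household_candidates(persons):
    """Return groups of person names that share a normalized postal address."""
    by_address = defaultdict(list)
    for person in persons:
        by_address[normalize_address(person["address"])].append(person["name"])
    return [names for names in by_address.values() if len(names) > 1]

# Hypothetical person records.
persons = [
    {"name": "Anna Jensen", "address": "Main Street 1, 2100 Copenhagen"},
    {"name": "Peter Jensen", "address": "main street 1  2100 Copenhagen"},
    {"name": "Lars Holm", "address": "Park Road 5, 8000 Aarhus"},
]

households = household_candidates(persons)
print(households)
```

The two persons at Main Street 1 form a household candidate. Deciding whether they are a family or just persons who happen to live at the same address needs more evidence, such as family names, age and the type of location.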

There are multicultural aspects of these constellations, including the different family name constructions around the world and the varying frequency and acceptance of rainbow families as well as the frequency of extended families.

Company family tree

When handling party master data about companies / organizations, a valuable piece of information is how the companies / organizations are related, most commonly pictured as a company family tree with mothers and sisters. In theory this tree can have an infinite number of levels. The basic levels are:

  • A global ultimate mother being the company that ultimately owns (fully or partly) a range of companies in several countries.
  • A national ultimate mother being the company that owns (fully or partly) a range of companies in a given country.
  • A legal entity being the basic registered company within a country having some form of a business entity identifier.
  • A branch owned by a legal entity and operating from a given postal / visiting address.
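The levels above can be sketched in Python as a walk up a parent link. The company names and the `COMPANIES` structure are hypothetical, and the sketch assumes that the national ultimate mother is the highest company reached while still staying in the same country as the starting company.

```python
# Hypothetical company records: each points to its owner (parent) or None.
COMPANIES = {
    "Acme Holding": {"country": "US", "parent": None},
    "Acme Nordic": {"country": "DK", "parent": "Acme Holding"},
    "Acme Denmark": {"country": "DK", "parent": "Acme Nordic"},
}

def ancestors(name, companies):
    """Return the chain from the company itself up to the top of the tree."""
    chain, seen = [], set()
    while name is not None and name not in seen:  # Guard against cycles.
        seen.add(name)
        chain.append(name)
        name = companies[name]["parent"]
    return chain

def global_ultimate(name, companies):
    """The company at the very top of the family tree."""
    return ancestors(name, companies)[-1]

def national_ultimate(name, companies):
    """The highest ancestor reached before the chain leaves the country."""
    country = companies[name]["country"]
    national = name
    for ancestor in ancestors(name, companies)[1:]:
        if companies[ancestor]["country"] != country:
            break
        national = ancestor
    return national

print(global_ultimate("Acme Denmark", COMPANIES))
print(national_ultimate("Acme Denmark", COMPANIES))
```

For the legal entity Acme Denmark, the national ultimate mother is Acme Nordic and the global ultimate mother is Acme Holding. Partial ownership, which the levels above allow for, would require more than a single parent link per company.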

You can build your own company tree describing your customers, suppliers and other business partners. Alternatively, or as a supplement, you can rely on third party business directories. It is worth noticing here that a national source will only go to the national ultimate mother level, while a global source can include the global ultimate mother and thus form larger families.

Having a company family view in your master data repository is a valuable information asset within credit risk, supply risk, discount opportunities, cross-selling and more.

Product family

The term “product family” is often used to define a level in a homegrown product classification / product grouping scheme. It is used to define a level that can have levels above and levels below with other terms as “product line”, “product category”, “product class”, “product group”, “product type” and more.

Sometimes it is also used as a term to define a product with a family of variants below, where variants are the same product produced and kept in stock in different colours, sizes and more.
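A product with a family of variants can be modelled in Python as below. The class names, attributes and SKU values are assumptions made for this sketch; real PIM data models carry many more attributes per variant.

```python
from dataclasses import dataclass, field

@dataclass
class Variant:
    """One stock-kept variant of a product."""
    sku: str
    colour: str
    size: str

@dataclass
class ProductFamily:
    """A product with a family of variants below it."""
    name: str
    variants: list = field(default_factory=list)

    def add_variant(self, sku, colour, size):
        self.variants.append(Variant(sku, colour, size))

    def skus(self):
        return [variant.sku for variant in self.variants]

# Hypothetical product with two variants.
family = ProductFamily("Classic T-Shirt")
family.add_variant("TS-RED-M", "red", "M")
family.add_variant("TS-BLUE-L", "blue", "L")
print(family.skus())
```

Shared attributes live on the family, while only the differentiating attributes (here colour and size) are kept per variant.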

Read more about Stock Keeping Units (SKUs), product variants, product identification and product classification in the post Five Product Information Management Core Aspects.

Welcome EntityWise on The Disruptive MDM / PIM / DQM List

There is yet another new entry on The Disruptive MDM / PIM / DQM List.

EntityWise is a data matching solution specializing in the healthcare sector. At EntityWise they use machine learning and artificial intelligence (AI) based technology to overcome the burden of inspecting suspect duplicates.

As such EntityWise is a good example of the long tail of Data Quality Management (DQM) solutions that provide a good return on investment at organizations with specific data quality issues.

Learn more about EntityWise here.

Combining Data Matching and Multidomain MDM

Two of the most addressed data management topics on this blog are data matching and multidomain Master Data Management (MDM). In addition, I have also founded two LinkedIn Groups for people interested in one or both of these topics.

The Data Matching Group has close to 2,000 members. In here we discuss nerdy stuff such as deduplication, identity resolution, deterministic matching using match codes, algorithms, pattern recognition, fuzzy logic, probabilistic learning, false negatives and false positives.

Check out the LinkedIn Data Matching Group here.

The Multi-Domain MDM Group has close to 2,500 members. In here we exchange knowledge on how to encompass more than a single master data domain in an MDM initiative. In that way the group also covers the evolution of MDM as the discipline – and solutions – have emerged from Customer Data Integration (CDI) and Product Information Management (PIM).

Check out the LinkedIn Multi-Domain MDM Group here.

The result of combining data matching and multi-domain MDM is golden records. The golden records are the foundation of having a 360-degree / single view of parties, locations, products and assets as examined in The Disruptive MDM / PIM / DQM List blog post Golden Records in Multidomain MDM.