Getting a 360-degree view (or single view) of your customers has been a quest in data management for as long as I can remember.
This has been the (unfulfilled) promise of CRM applications since they emerged 25 years ago. Data quality tools have been very much about deduplication of customer records. Customer Data Integration (CDI) and the first Master Data Management (MDM) platforms were aimed at that conundrum. Now we see the notion of a Customer Data Platform (CDP) gaining traction.
There are three basic steps in getting a 360-degree view of those parties that have a customer role within your organization – and these steps are not at all easy ones:
Step 1 is identifying those customer records that typically are scattered around in the multiple systems that make up your system landscape. You can do that (endlessly) by hand, using the very different deduplication functionality that comes with ERP, CRM and other applications, using a best-of-breed data quality tool or the data matching capabilities built into MDM platforms. Doing this with adequate results takes a lot of effort, as pondered in the post Data Matching and Real-World Alignment.
Step 2 is finding out which data records and data elements survive as the single source of truth. This is something a data quality tool can help with, but it is best done within an MDM platform. The three main options for that are examined in the post Three Master Data Survivorship Approaches.
Step 3 is gathering all data besides the master data and relating that data to the master data entity that identifies and describes the real-world entity with a customer role. Today we see both CRM solution vendors and MDM solution vendors offering the technology to enable that, as told in the post CDP: Is that part of CRM or MDM?
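The three steps above can be sketched in a few lines of simplified Python. Everything here is illustrative: the sample records, the name-similarity threshold and the "most recently updated record wins" survivorship rule are assumptions for the sake of the example, not a prescription.

```python
from difflib import SequenceMatcher

# Hypothetical customer records scattered across two systems (illustrative data)
records = [
    {"id": "CRM-1", "name": "Acme Corp", "updated": "2021-03-01"},
    {"id": "ERP-7", "name": "ACME Corporation", "updated": "2020-11-15"},
]

# Step 1: identify duplicates with a (very naive) name similarity measure
def is_match(a, b, threshold=0.7):
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio() >= threshold

# Step 2: survivorship - here the most recently updated record wins
def survivor(cluster):
    return max(cluster, key=lambda r: r["updated"])

cluster = [r for r in records if is_match(records[0], r)]
golden = survivor(cluster)

# Step 3: relate other data (e.g. transactions) to the golden record
transactions = [{"source_id": "ERP-7", "amount": 1200}]
cluster_ids = {r["id"] for r in cluster}
golden["transactions"] = [t for t in transactions if t["source_id"] in cluster_ids]
```

In a real MDM platform each step is of course far richer (probabilistic matching, configurable survivorship rules per attribute, full lineage), but the shape of the pipeline is the same.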
Organizations typically hold product data in three different kinds of applications:
In an ERP application together with all other kinds of master data and transaction data.
In an MDM (Master Data Management) application either as a Product MDM implementation or a multidomain MDM implementation together with other master data domains.
In a PIM (Product Information Management) application.
Each of these applications has its pros and cons, and where an organization utilizes several of these applications we often see that there is no single source of truth for all product data, but that some product attributes are controlled by one application and other attributes are controlled by another application. Recently I wrote a post on the Pimnews think forum with a walk-through of different kinds of product attributes and whether they typically are controlled in PIM or ERP / MDM. The post had the title Six Product Attribute Levels.
The overwhelming majority of organizations still use ERP as the place for product data – often supplemented by satellite spreadsheets with product data.
However, more and more organizations, not least larger global ones, are implementing MDM solutions, and midsize organizations too are implementing PIM solutions. The solution market was previously split between MDM and PIM solutions, but we now see some of the PIM solution providers also encompassing MDM capabilities. On the Disruptive MDM/PIM List there is a selection of solutions that are either more MDM-ish or more PIM-ish, as examined in the post MDM, PIM or Both.
A Request for Proposal (RFP) process for a Master Data Management (MDM) and/or Product Information Management (PIM) solution has a hard fact side as well as soft sides, as covered in the post The Soft Sides of MDM and PIM RFPs.
The hard fact side is the detailed requirements a potential vendor has to answer, in most cases in an Excel sheet the buying organization has prepared – often with extensive help from a consultancy.
Here are the topics I have seen most frequently included as hard facts in such RFPs:
MDM and PIM: Does the solution have functionality for hierarchy management?
MDM and PIM: Does the solution have workflow management included?
MDM and PIM: Does the solution support versioning of master data / product information?
MDM and PIM: Does the solution allow you to tailor the data model in a flexible way?
MDM and PIM: Does the solution handle master data / product information in multiple languages / character sets / script systems?
MDM and PIM: Does the solution have capabilities for (high speed) batch import / export and real-time integration (APIs)?
MDM and PIM: Does the solution have capabilities within data governance / data stewardship?
MDM and PIM: Does the solution integrate with “a specific application”? – most commonly SAP, MS CRM/ERPs or Salesforce?
MDM: Does the solution handle multiple domains, for example customer, vendor/supplier, employee, product and asset?
MDM: Does the solution provide data matching / deduplication functionality and formation of golden records?
MDM: Does the solution have integration with third-party data providers for example business directories (Dun & Bradstreet / National registries) and address verification services?
MDM: Does the solution underpin compliance rules, for example data privacy and data protection regulations such as GDPR / other regimes?
PIM: Does the solution support product classification and attribution standards such as eClass and ETIM (or other industry-specific / national standards)?
PIM: Does the solution support publishing to popular marketplaces (a form of outgoing Product Data Syndication)?
PIM: Does the solution have functionality to ease collection of product information from suppliers (incoming Product Data Syndication)?
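The hard fact side of such an RFP often boils down to a weighted scoring of vendor answers. As a minimal sketch (the requirement names, weights and answer scale below are all made up for illustration), it could look like this:

```python
# Hypothetical RFP requirements with importance weights (higher = more important)
requirements = {
    "hierarchy_management": 3,
    "workflow": 2,
    "multi_language": 2,
    "data_matching": 3,
}

# Vendor answers scored 0 (not supported) to 2 (fully supported) - illustrative values
vendor_answers = {
    "hierarchy_management": 2,
    "workflow": 1,
    "multi_language": 2,
    "data_matching": 0,
}

def weighted_score(answers, weights):
    """Sum each requirement's weight times the vendor's answer score."""
    return sum(weights[req] * answers.get(req, 0) for req in weights)

score = weighted_score(vendor_answers, requirements)
max_score = 2 * sum(requirements.values())  # best possible result on this scale
```

In practice this lives in the RFP Excel sheet rather than in code, but making the weights explicit up front is what keeps the comparison between vendors honest.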
The notion of a data-centred application type called a Customer Data Platform (CDP) seems to be trending these days. A CDP solution is a centralized registry of all data related to parties regarded as (prospective) customers at an enterprise.
This kind of solution comes from two solution markets: CRM and MDM.
It will be interesting to follow how CDP solutions evolve and whether it is CRM or MDM vendors who will do best in this discipline. One guess could be that MDM vendors will provide “the best” solutions but CRM vendors will sell the most licenses. We will see.
Looking back at the first blog posts, I think the themes touched upon are still valid.
The first post from June 2009 was about data architecture. 2,000 years ago the Roman writer, architect and engineer Marcus Vitruvius Pollio wrote that a structure must exhibit the three qualities of firmitas, utilitas, venustas – that is, it must be strong or durable, useful, and beautiful. This is as true today – both in architecture and data architecture – as told in the post Qualities in Data Architecture.
A recurring topic on this blog has been a discussion around the common definition of data quality as being that the data is fit for the intended purpose of use. The opening of this topic was made in the post Fit for what purpose?
Diversity in data quality has been another recurring topic. Several old tales, including in Genesis and the Qur’an, tell the story of a great tower built by mankind at a time when all people spoke a single language. Since then mankind has been confused by having multiple languages. And indeed, we still are, as pondered in the post The Tower of Babel.
Thanks to all who are reading this blog and not least to all who from time to time take the time to comment, like and share.
Reference Data Management (RDM) is a small but important extension to Master Data Management (MDM). Together with a large extension, being big data and data lakes, mastering reference data is increasingly being part of the offerings from MDM solution vendors as told in the post Extended MDM Platforms.
Reference data are the smaller lists of values that give context to master data and ensure that we use the same (or linkable) codes for describing master data entities. Examples are country codes, currency codes and units of measure.
Reference data tend to be externally defined and maintained typically by international standardization bodies or industry organizations, but reference data can also be internally defined to meet your specific business model.
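A small sketch can make the externally/internally defined split concrete. Below, the ISO country codes stand in for an externally maintained list, while the crosswalk from legacy codes is the kind of internally defined reference data that fits a specific business model (the legacy codes here are hypothetical):

```python
# Externally defined reference data: a small excerpt of ISO 3166-1 alpha-2 country codes
iso_countries = {"DK": "Denmark", "DE": "Germany", "US": "United States"}

# Internally defined reference data: a crosswalk from hypothetical legacy system codes
legacy_to_iso = {"DEN": "DK", "GER": "DE", "USA": "US"}

def to_iso(code):
    """Normalize a code against the ISO list, via the crosswalk if needed."""
    if code in iso_countries:
        return code
    return legacy_to_iso.get(code)  # None signals an unknown code to be stewarded
```

An RDM solution adds governance on top of this: who may change the crosswalk, approval workflows and audit trails for every update.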
3 RDM Solutions from MDM Vendors
Informatica has recently released the first version of a new RDM solution: MDM – Reference 360. This is by the way the first true Software as a Service (SaaS) solution from Informatica in the MDM space. This solution emphasizes building a hierarchy of reference data lists, the ability to make crosswalks between the lists, workflow (approval) around updates and audit trails.
Reltio has embraced RDM as an integral part of their Reltio Cloud solution, where the “RDM capabilities improves data governance and operational excellence with an easy to use application that creates, manages and provisions reference data for better reporting and analytics.”
Semarchy has a solution called Semarchy xDM. The x indicates that this solution encompasses all kinds of enterprise-grade data, thus both master data and reference data, while “xDM extends the agile development concept to its implementation paradigm”.
There are intersections between data modelling and data quality. In examining those we can use a data quality mind map published recently on this blog:
Data Modelling and Data Quality Dimensions:
Some data quality dimensions are closely related to data modelling and a given data model can impact these data quality dimensions. This is the case for:
Data integrity, as the relationship rules in a traditional entity-relationship-based data model foster the integrity of the data controlled in databases. The weak sides are that sometimes these rules are too rigid to describe actual real-world entities and that the integrity across several databases is not covered. To discover the latter, we may use data profiling methods.
Data validity, as field definitions and relationship rules control that only data considered valid can enter the database.
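As a minimal sketch of how the data model enforces these two dimensions, here is an example using SQLite from Python (the tables and columns are made up for illustration): the foreign key is an integrity rule, the CHECK constraint a validity rule.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FK rules
con.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
con.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),  -- integrity rule
    amount REAL CHECK (amount > 0)                         -- validity rule
)""")

con.execute("INSERT INTO customer VALUES (1, 'Acme Corp')")
con.execute("INSERT INTO orders VALUES (1, 1, 99.5)")       # passes both rules
try:
    con.execute("INSERT INTO orders VALUES (2, 42, 10.0)")  # unknown customer id
except sqlite3.IntegrityError:
    pass  # the model rejects data that breaks referential integrity
```

Note that these rules only guard a single database; the integrity problems across several databases mentioned above are exactly what they cannot see.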
Some other data quality dimensions must be addressed with extended data models and/or alternative methodologies. This is the case for:
Data completeness: A common scenario is that a data model born in the United States will set the state field within an address as mandatory and probably accept only a value from a reference list of 50 states. This will not work in the rest of the world. So, in order not to get garbage, or no data at all, you will need to either extend the model or loosen the model and control completeness in another way.
With data about products the big pain is that different groups of products require different data elements. This can be solved with a very granular data model – with possible performance issues – or a very customized data model – with scalability and other issues as a result.
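Loosening the model and controlling completeness separately, as in the US-state scenario above, can be sketched like this (the state list excerpt and rule are illustrative assumptions):

```python
# Excerpt of US state codes; a full reference list has 50 entries
US_STATES = {"CA", "NY", "TX"}

def validate_address(address):
    """Return a list of completeness issues instead of rejecting the record outright."""
    issues = []
    if address.get("country") == "US":
        # The state field is only mandatory where it actually applies
        if address.get("state") not in US_STATES:
            issues.append("US addresses require a valid state")
    return issues
```

Reporting issues rather than hard-blocking the record is one way to avoid getting garbage in the state field from the rest of the world while still monitoring completeness where the rule holds.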
Data uniqueness: A common scenario here is that names and addresses can be spelled in many ways even though they reflect the same real-world entity. We can use identity resolution (and data matching) to detect this and then model how we link data records with real-world duplicates together in a looser or tighter way.
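The looser-or-tighter linking idea can be sketched with match-score thresholds. The similarity measure and the two thresholds below are simplistic stand-ins for real identity resolution, chosen only to illustrate the principle:

```python
from difflib import SequenceMatcher

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def link_type(name_a, name_b, merge_at=0.9, review_at=0.7):
    """Classify a candidate pair by match score (thresholds are illustrative)."""
    score = similarity(name_a, name_b)
    if score >= merge_at:
        return "same entity"         # tight link: merge into one golden record
    if score >= review_at:
        return "possible duplicate"  # loose link: keep apart, flag for stewardship
    return "no link"
```

The point is that uniqueness is not a yes/no property of the data model; it is a graded decision that the surrounding process (automatic merge versus data steward review) has to handle.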
Some of the emerging technologies in the data storing realm are presenting new ways of solving the challenges we have with data quality and traditional entity-relationship based data models.
In the Product Data Lake venture I am working on right now, we are also aiming to solve the data integrity, data validity and data completeness issues with product data (or product information if you like) using these emerging technologies. This includes solving issues with geographical diversity and varying completeness requirements through a granular data model that is scalable, not only within a given company but also across a whole business ecosystem encompassing many enterprises belonging to the same (data) supply chain.
The building next to my home office was originally two cement silos standing in an industrial harbor area among other silos. These two silos are now transformed into a connected office building as this area has been developed into a modern residence and commercial quarter.
Master Data Management (MDM) is on a similar route.
The first quest for MDM has been to be a core discipline in transforming siloed data stores within a given company into a shared view of the core entities that must be described in the same way across different departmental views. Going from the departmental stage to the enterprise wide stage is examined in the post Three Stages of MDM Maturity.
But as told in this post, it does not stop there. The next transformation is to provide a shared view with trading partners in the business ecosystem(s) where your company operates. Because the shared data in your organization is also a silo when digital transformation puts pressure on each company to become a data integrated part of a business ecosystem.
Master Data Management (MDM) will play a crucial role in sustaining the needed data quality for AI and with the rise of digital transformation encompassing business ecosystems we will also see an increasing need for ecosystem wide MDM – also called multienterprise MDM.
Right now, I am working with a service called Product Data Lake where we strive to utilize AI including using Machine Learning (ML) to understand and map data standards and exchange formats used within product information exchange between trading partners.
The challenge in this area is that we have many different classification systems in play as told in the post Five Product Classification Standards. Besides the industry and cross sector standards we still have many homegrown standards as well.
Some of these standards (such as eClass and ETIM) also cover standards for the attributes needed for a given product classification, but still we have plenty of homegrown standards (or no standards) for attribute requirements as well.
Add to that the different preferences for exchange methods and we get a chaotic system where human intervention makes Sisyphus look like a lucky man. Therefore, we have great expectations about introducing machine learning and artificial intelligence in this space.
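To give a feel for the mapping problem, here is a toy sketch where attribute names from two hypothetical standards are aligned by token overlap. Real ML approaches go far beyond this, and the attribute names and normalization rules here are invented for illustration:

```python
# Hypothetical attribute names from a source and a target classification standard
source_attrs = ["Nominal voltage", "Cable length", "Housing colour"]
target_attrs = ["voltage_nominal", "length_of_cable", "color_of_housing"]

def normalize(name):
    """Lowercase, split into tokens and align a common spelling variant."""
    return set(name.lower().replace("_", " ").replace("colour", "color").split())

def best_match(attr, candidates):
    """Map an attribute to the target with the largest Jaccard token overlap."""
    def overlap(cand):
        a, b = normalize(attr), normalize(cand)
        return len(a & b) / len(a | b)
    return max(candidates, key=overlap)

mapping = {attr: best_match(attr, target_attrs) for attr in source_attrs}
```

Even this crude heuristic shows why the problem invites machine learning: the signal is there in the attribute names and values, but the variations (word order, spelling, language) are endless.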
There are three kinds of solutions for handling product master data and related digital assets:
Master Data Management (MDM) solutions that are either focused on product master data or are multi-domain MDM solutions covering the product domain as well as the party domain, the location domain, the asset domain and more.
Product Information Management (PIM) solutions.
Digital Asset Management (DAM) solutions.
According to Gartner Analyst Simon Walker a short distinction is:
MDM of product master data solutions help manage structured product data for enterprise operational and analytical use cases
PIM solutions help extend structured product data through the addition of rich product content for sales and marketing use cases
DAM solutions help users create and manage digital multimedia files for enterprise, sales and marketing use cases
The figure below shows what kind of data is typically included in an MDM solution, a PIM solution and/or a DAM solution, respectively.