So, you have the algorithm! But do you have the data?

In the game of winning in business by using Artificial Intelligence (AI), there are two main weapons you can use: algorithms and data. In a recent blog post, Andrew White of the analyst firm Gartner says that It’s all about the data – not the algorithm.

In the Master Data Management (MDM) space, equipping solutions with AI capabilities has been going on for some time, as reported in the post Artificial Intelligence (AI) and Master Data Management (MDM).

So, the next question is how to provide the data. It is questionable whether every single organization has sufficient (and well-managed) master data to make a winning formula. Most organizations must, for many use cases, look beyond the enterprise firewall to get the training data or, better, the data-fuelled algorithms needed to win the battles and the whole game.

An example of such a scenario is examined in the post Artificial Intelligence (AI) and Multienterprise MDM.

Are These Familiar Hierarchies in Your MDM / DQM / PIM Solution?

The term family is used in different contexts within Master Data Management (MDM), Data Quality Management (DQM) and Product Information Management (PIM) when working with hierarchy management and entity resolution.

Here are three frequent examples:

Consumer / citizen family

When handling party master data about consumers / citizens we can deal with the basic definition of a family, being a group consisting of two parents and their children living together as a unit.

This is used when the business scenario targets not only each individual person but also a household with a shared economy. When identifying a household, a common parameter is that the persons live at the same postal address (at the same time), while observing constellations such as:

  • Nuclear families consisting of a female and a male adult (and their children)
  • Rainbow families where the gender is not an issue
  • Extended families consisting of more than two generations
  • Persons who happen to live at the same postal address

There are multicultural aspects of these constellations, including the different family name constructions around the world and the varying frequency and acceptance of rainbow families, as well as the varying frequency of extended families.
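The "same postal address at the same time" parameter can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Residency` record, its field names and the `household_candidate_pairs` function are hypothetical, addresses are assumed to be standardized already, and the output is a list of candidate pairs that still needs review against the constellations above.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from itertools import combinations
from typing import Optional


@dataclass(frozen=True)
class Residency:
    """Hypothetical record: one person living at one address for a period."""
    person: str
    address: str                      # assumed to be standardized already
    moved_in: date
    moved_out: Optional[date] = None  # None means the person still lives there


def overlaps(a: Residency, b: Residency) -> bool:
    """True if the two residency periods share at least one day."""
    a_end = a.moved_out or date.max
    b_end = b.moved_out or date.max
    return a.moved_in <= b_end and b.moved_in <= a_end


def household_candidate_pairs(residencies):
    """Return (address, person, person) for persons at the same address at the same time."""
    by_address = defaultdict(list)
    for residency in residencies:
        by_address[residency.address].append(residency)

    pairs = []
    for address, group in by_address.items():
        for a, b in combinations(group, 2):
            if a.person != b.person and overlaps(a, b):
                pairs.append((address, a.person, b.person))
    return pairs
```

A pair found this way may be a family, but it may just as well be two persons who happen to share a postal address, so the result is a starting point for household identification rather than a conclusion.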

Company family tree

When handling party master data about companies / organizations, a valuable piece of information is how the companies / organizations are related, most commonly pictured as a company family tree with mothers and sisters. In theory, such a tree can have any number of levels. The basic levels are:

  • A global ultimate mother being the company that ultimately owns (fully or partly) a range of companies in several countries.
  • A national ultimate mother being the company that owns (fully or partly) a range of companies in a given country.
  • A legal entity being the basic registered company within a country having some form of a business entity identifier.
  • A branch owned by a legal entity and operating from a given postal / visiting address.

You can build your own company tree describing your customers, suppliers and other business partners. Alternatively, or as a supplement, you can rely on third-party business directories. It is worth noting here that a national source will only go to the national ultimate mother level, while a global source can include the global ultimate mother and thus form larger families.

Having a company family view in your master data repository is a valuable information asset within credit risk, supply risk, discount opportunities, cross-selling and more.
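As a minimal sketch of how such a tree can be navigated, the Python below walks the ownership chain from a legal entity upwards. The `Company` structure, its field names and the functions are hypothetical, each company is assumed to have at most one owner, and the national ultimate mother is simplified to the highest company in the chain registered in the same country as the starting company.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Company:
    """Hypothetical legal entity record with a single owning company."""
    company_id: str
    name: str
    country: str
    parent_id: Optional[str] = None  # None for the top of the family tree


def ownership_chain(company_id: str, companies: Dict[str, Company]) -> List[Company]:
    """Return the company followed by each owner, bottom to top."""
    chain = []
    seen = set()  # guards against circular ownership in dirty data
    current = companies[company_id]
    while current is not None and current.company_id not in seen:
        seen.add(current.company_id)
        chain.append(current)
        current = companies.get(current.parent_id) if current.parent_id else None
    return chain


def global_ultimate(company_id: str, companies: Dict[str, Company]) -> Company:
    """The company at the top of the ownership chain."""
    return ownership_chain(company_id, companies)[-1]


def national_ultimate(company_id: str, companies: Dict[str, Company]) -> Company:
    """The highest company in the chain in the same country as the start."""
    chain = ownership_chain(company_id, companies)
    country = chain[0].country
    return [c for c in chain if c.country == country][-1]
```

With a national source only, the chain stops at the national level, which is why the two functions return the same company until a global source adds the foreign owners.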

Product family

The term “product family” is often used to define a level in a homegrown product classification / product grouping scheme. It is a level that can have levels above and below it, named with other terms such as “product line”, “product category”, “product class”, “product group”, “product type” and more.

Sometimes it is also used as a term to define a product with a family of variants below, where variants are the same product produced and kept in stock in different colours, sizes and more.

Read more about Stock Keeping Units (SKUs), product variants, product identification and product classification in the post Five Product Information Management Core Aspects.
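The second meaning, a product with a family of variants below it, could be modelled roughly as in the following Python sketch. The class names, the attributes `colour` and `size`, and the SKU values are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Variant:
    """Hypothetical stock keeping unit: one colour / size combination."""
    sku: str
    colour: str
    size: str


@dataclass
class ProductFamily:
    """Hypothetical product with a family of variants below it."""
    name: str
    variants: List[Variant] = field(default_factory=list)

    def add_variant(self, sku: str, colour: str, size: str) -> None:
        self.variants.append(Variant(sku, colour, size))

    def options(self, attribute: str) -> List[str]:
        """Distinct values of one variant attribute, for example 'colour'."""
        return sorted({getattr(v, attribute) for v in self.variants})
```

The shared product description lives on the family, while each variant only carries what differs and what identifies the unit in stock.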

Welcome EntityWise on The Disruptive MDM / PIM / DQM List

There is yet another new entry on The Disruptive MDM / PIM / DQM List.

EntityWise is a data matching solution specializing in the healthcare sector. At EntityWise they use machine learning and artificial intelligence (AI) based technology to overcome the burden of inspecting suspect duplicates.

As such, EntityWise is a good example of the long tail of Data Quality Management (DQM) solutions that provide a good return on investment for organizations with specific data quality issues.

Learn more about EntityWise here.

Combining Data Matching and Multidomain MDM

Two of the most addressed data management topics on this blog are data matching and multidomain Master Data Management (MDM). In addition, I have founded two LinkedIn Groups for people interested in one or both of these topics.

The Data Matching Group has close to 2,000 members. In here we discuss nerdy stuff such as deduplication, identity resolution, deterministic matching using match codes, algorithms, pattern recognition, fuzzy logic, probabilistic learning, false negatives and false positives.
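To make two of these terms concrete, here is a small Python sketch contrasting a deterministic match code with a fuzzy similarity score. The match code recipe (first four consonants of the name plus the postal code) is an assumption chosen for illustration, not a standard, and the fuzzy comparison simply uses `difflib.SequenceMatcher` from the standard library.

```python
import re
from difflib import SequenceMatcher


def match_code(name: str, postal_code: str) -> str:
    """Deterministic key (illustrative recipe): 4 consonants + postal code."""
    letters = re.sub(r"[^a-z]", "", name.lower())
    consonants = re.sub(r"[aeiou]", "", letters)
    return f"{consonants[:4]}|{postal_code.replace(' ', '')}"


def similarity(a: str, b: str) -> float:
    """Fuzzy score between 0 and 1 based on matching character blocks."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Same match code despite different casing, punctuation and spacing.
same_key = match_code("Smith & Sons", "2100") == match_code("SMITH SONS", "21 00")

# Different match codes for a spelling variation: a false negative for the
# deterministic approach, while the fuzzy score still flags a likely match.
missed = match_code("Smith & Sons", "2100") != match_code("Smyth and Sons", "2100")
score = similarity("Smith & Sons", "Smyth and Sons")
```

Where to put the threshold on the fuzzy score is the trade-off between false negatives and false positives: a low threshold catches more real duplicates but also pairs records that are not the same real-world entity.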

Check out the LinkedIn Data Matching Group here.

The Multi-Domain MDM Group has close to 2,500 members. In here we exchange knowledge on how to encompass more than a single master data domain in an MDM initiative. In that way the group also covers the evolution of MDM, as the discipline and the solutions have emerged from Customer Data Integration (CDI) and Product Information Management (PIM).

Check out the LinkedIn Multi-Domain MDM Group here.

The result of combining data matching and multi-domain MDM is golden records. The golden records are the foundation of having a 360-degree / single view of parties, locations, products and assets as examined in The Disruptive MDM / PIM / DQM List blog post Golden Records in Multidomain MDM.

Welcome Reifier on the Disruptive MDM / PIM List

The Disruptive MDM / PIM List is a list of solutions in the Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM) space.

The list presents both larger solutions that are also included by the analyst firms in their market reports and smaller solutions that you do not hear so much about, but that may be exactly the solution that addresses the specific challenges you have.

The latest entry on the list, Reifier, is one of the latter ones.

Matching data records and identifying duplicates in order to achieve a 360-degree view of customers and other master data entities is the most frequently mentioned data quality issue. Reifier is an artificial intelligence (AI) driven solution that tackles that problem.

Read more about Reifier here.

Three Not So Easy Steps to a 360-Degree Customer View

Getting a 360-degree view (or single view) of your customers has been a quest in data management for as long as I can remember.

This has been the (unfulfilled) promise of CRM applications since they emerged 25 years ago. Data quality tools have been very much about deduplication of customer records. Customer Data Integration (CDI) and the first Master Data Management (MDM) platforms were aimed at that conundrum. Now we see the notion of a Customer Data Platform (CDP) gaining traction.

There are three basic steps in getting a 360-degree view of those parties that have a customer role within your organization – and these steps are not at all easy ones:

  • Step 1 is identifying those customer records that typically are scattered around in the multiple systems that make up your system landscape. You can do that (endlessly) by hand, by using the very different deduplication functionality that comes with ERP, CRM and other applications, or by using a best-of-breed data quality tool or the data matching capabilities built into MDM platforms. Doing this with adequate results takes a lot, as pondered in the post Data Matching and Real-World Alignment.
  • Step 2 is finding out which data records and data elements survive as the single source of truth. This is something a data quality tool can help with, but it is best done within an MDM platform. The three main options for that are examined in the post Three Master Data Survivorship Approaches.
  • Step 3 is gathering all data besides the master data and relating those data to the master data entity that identifies and describes the real-world entity with a customer role. Today we see both CRM solution vendors and MDM solution vendors offering the technology to enable that, as told in the post CDP: Is that part of CRM or MDM?
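Step 2 can be illustrated with a very small Python sketch. It assumes that step 1 has already grouped the matched records into a cluster, and it applies one simple rule chosen for illustration: the most recently updated non-empty value survives. This rule, the function name and the field names `source` and `updated` are assumptions and are not meant to represent the approaches in the post referred to above.

```python
from datetime import date


def golden_record(cluster):
    """Merge matched records: the most recent non-empty value wins (assumed rule).

    cluster: list of dicts, each with 'source', 'updated' (a date)
    and any number of attribute values.
    """
    golden = {}
    # Oldest first, so values from newer records overwrite older ones.
    for record in sorted(cluster, key=lambda r: r["updated"]):
        for key, value in record.items():
            if key in ("source", "updated"):
                continue
            if value not in (None, ""):
                golden[key] = value
    return golden


cluster = [
    {"source": "CRM", "updated": date(2019, 5, 1),
     "name": "Ann Smith", "email": "", "phone": "555-0100"},
    {"source": "ERP", "updated": date(2018, 1, 1),
     "name": "A. Smith", "email": "ann@example.com", "phone": None},
]
merged = golden_record(cluster)
```

Here the name and phone come from the newer CRM record, while the email survives from the older ERP record because the CRM value is empty. Real survivorship rules are usually richer, for example trusting specific sources for specific attributes.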

Top 15 MDM / PIM Requirements in RFPs

A Request for Proposal (RFP) process for a Master Data Management (MDM) and/or Product Information Management (PIM) solution has a hard-fact side as well as a soft side, the latter examined in the post The Soft Sides of MDM and PIM RFPs.

The hard-fact side is the detailed requirements a potential vendor has to answer, in most cases in the Excel sheet the buying organization has prepared, often with extensive help from a consultancy.

Here is what I have seen as the most frequently included topics for the hard facts in such RFPs:

  • MDM and PIM: Does the solution have functionality for hierarchy management?
  • MDM and PIM: Does the solution have workflow management included?
  • MDM and PIM: Does the solution support versioning of master data / product information?
  • MDM and PIM: Does the solution allow the data model to be tailored in a flexible way?
  • MDM and PIM: Does the solution handle master data / product information in multiple languages / character sets / script systems?
  • MDM and PIM: Does the solution have capabilities for (high speed) batch import / export and real-time integration (APIs)?
  • MDM and PIM: Does the solution have capabilities within data governance / data stewardship?
  • MDM and PIM: Does the solution integrate with “a specific application” (most commonly SAP, MS CRM/ERPs, Salesforce)?
  • MDM: Does the solution handle multiple domains, for example customer, vendor/supplier, employee, product and asset?
  • MDM: Does the solution provide data matching / deduplication functionality and formation of golden records?
  • MDM: Does the solution have integration with third-party data providers, for example business directories (Dun & Bradstreet / national registries) and address verification services?
  • MDM: Does the solution underpin compliance with rules such as the data privacy and data protection regulations in GDPR and other regimes?
  • PIM: Does the solution support product classification and attribution standards such as eClass and ETIM (or other industry-specific / national standards)?
  • PIM: Does the solution support publishing to popular marketplaces (form of outgoing Product Data Syndication)?
  • PIM: Does the solution have functionality to ease the collection of product information from suppliers (incoming Product Data Syndication)?

Learn more about how I can help in the blog page about MDM / PIM Tool Selection Consultancy.

Human Errors and Data Quality

Every time there is a survey about what causes poor data quality the most ticked answer is human error. This is also the case in the Profisee 2019 State of Data Management Report where 58% of the respondents said that human error is among the most prevalent causes of poor data quality within their organization.

This topic was also examined some years ago in the post called The Internet of Things and the Fat-Finger Syndrome.

Even the Romans knew this, as Seneca the Younger said “errare humanum est”, which translates to “to err is human”. He also added “but to persist in error is diabolical”.

So, how can we not persist in having human errors in data then? Here are three main approaches:

  • Better humans: There is a whip called Data Governance. In a data governance regime you define data policies and data standards. You build an organizational structure with a data governance council (or any better name), have data stewards and data custodians (or any better title). You set up a business glossary. And then you carry on with a data governance framework.
  • Machines: Robotic Process Automation (RPA) has, besides operational efficiency, the advantage that machines, unlike humans, do not make mistakes when they are tired and bored.
  • Data Sharing: Human errors typically occur when typing in data. However, most data are already typed in somewhere. Instead of retyping data, and thereby potentially introducing your own misspelling or other mistake, you can connect to data that is already digitalized and validated. This is especially doable for master data, as examined in the article about Master Data Share.

IoT and Business Ecosystem Wide MDM

Two of the disruptive trends in Master Data Management (MDM) are the intersection of Internet of Things (IoT) and MDM and business ecosystem wide MDM (aka multienterprise MDM).

These two trends will go hand in hand.

The latest MDM market report from Forrester (the other analyst firm) was mentioned in the post Toward the Third Generation of MDM.

In here Forrester says: “As first-generation MDM technologies become outdated and less effective, improved second generation and third-generation features will dictate which providers lead the pack. Vendors that can provide internet-of-things (IoT) capabilities, ecosystem capabilities, and data context position themselves to successfully deliver added business value to their customers.”

This statement resonates with me in my current job as co-founder and CTO at Product Data Lake, as told in the post Adding Things to Product Data Lake.

In business ecosystem wide MDM, business partners collaborate around master data. This is a prerequisite for handling the asset master data involved in IoT, as there are many parties involved, including manufacturers of smart devices, operators of these devices, maintainers of the devices, owners of the devices and the data subjects these devices gather data about.

In the same way, forward-looking solution providers involved with MDM must collaborate, as pondered in the post Linked Product Data Quality.