The Gartner Magic Quadrant for Master Data Management (MDM) Solutions 2018 was published last month.
Among the market numbers revealed in the report were the number and distribution of MDM licenses at the included vendors. These covered each vendor's top-three master data domains and estimated license counts, as well as the number of customers managing multiple domains:
One should of course be aware of the data quality issues involved in comparing these numbers, as they are to some degree estimates based on different perceptions at the included vendors. So, let me just highlight these observations:
The overall number of MDM licenses and unique MDM customers (at the included vendors) is not high. Fewer than 10,000 organizations worldwide are running such a solution. The potential new market out there for the salesforce at the MDM vendors is huge.
If you find an existing MDM solution user organization, it probably has a solution from SAP or Informatica – or maybe IBM. For completeness, Oracle has been dropped from the MDM quadrant; they practically do not promote their MDM solutions anymore, but there are still existing solutions operating out there.
The reign of Customer MDM is over. Product MDM is selling and multidomain is becoming the norm. Several MDM vendors are making their way into the quadrant from a Product Information Management (PIM) base as reported in the post The Road from PIM to Multidomain MDM.
PS: If you, as an end customer organization or an MDM and PIM vendor, want to work with me on the consequences for MDM solutions, here are some Popular Offerings for you.
Ultima Thule is a name for a distant place beyond the known world and the nickname of the most distant object in the solar system closely observed by a man-made object, today, the 1st January 2019. Before the flyby, scientists were unsure if it was two objects, a peanut-shaped object or another shape. The images revealing what it is will be downloaded during the next couple of months.
In a comment to this post Nadim observes that this Gartner quadrant is mixing up pure MDM players and PIM players.
That is true. It has always been a discussion point whether one should combine or separate solutions for Master Data Management (MDM) and Product Information Management (PIM). This is a question to be asked by end user organizations, and it is certainly a question the vendors on the market(s) ask themselves.
If we look at the vendors included in the 2018 Magic Quadrant the PIM part is represented in some different ways.
I would say that two of the newcomers, Viamedici and Contentserv (yellow dots in the figure below), are mostly PIM players today. This is also mentioned as a caution by Gartner and is a reason for their current bottom-left’ish placement in the quadrant. But both companies want to become more multidomain MDM’ish.
8 years ago, I was engaged at Stibo Systems as part of their first steps on the route from PIM to multidomain MDM. Enterworks and Riversand (the orange dots in the figure above) are on the same road.
Informatica has taken a different path towards the same destination, as they back in 2012 bought the PIM player Heiler. Gartner has some cautions about how well the MDM and PIM components make up a whole in the Informatica offerings, and similar cautions were expressed around the Forrester PIM Wave, as seen in the comments to the post There is no PIM quadrant, but there is a PIM wave.
But there was also a good deal of steadiness. Informatica still holds pole position in the race towards the top-right corner. Orchestra EBX, now disguised as Tibco EBX, is trailing them in the leaders quadrant. Old challengers such as IBM, SAP and Stibo are watching them among the newcomers in the challengers quadrant, and still as the only visionary – according to Gartner – we have Riversand.
In the niche players quadrant, we also still have Ataccama and Enterworks.
But there is still a lot of free space in the top-right corner. There is still room for disruption. Gartner mentions some traditional forces still on the move, being the good old 360-degree view of party data (customer, patient and the a bit US-biased provider) as well as Product Information Management (PIM), maybe in new wrappings as PCM or PXM.
If Gartner is still postponing this year’s MDM quadrant, they may even manage to reflect this change. We are of course also waiting to see if newcomers will make it into the quadrant and push the number of vendors in there back above 10. Some of the candidates will be the likes of Reltio and Semarchy.
Else, back to the takeover of Orchestra by Tibco: this is not the first time Tibco has bought something in the MDM and Data Quality realm. Back in 2010, Tibco bought the data quality tool and data matching front runner Netrics, as reported in the post What is a best-in-class match engine?
Back then, Tibco did not defend Netrics’ position in the Gartner Magic Quadrant for Data Quality Tools. The latest Data Quality Tools quadrant is, like the MDM quadrant, from 2017 and was touched on this blog here.
So, it will be exciting to see how Tibco will defend both the joint Tibco MDM solution, which in 2017 was a sliding niche player at Gartner, and the Orchestra MDM solution, which in 2017 was a leader in the Gartner MDM quadrant.
Data matching is a sub-discipline within data quality management. Data matching is about establishing a link between data elements and entities that do not have the same value but refer to the same real-world construct. The most common example is establishing a link between two different data records probably describing the same person, as for example:
Bob Smith at 1 Main Str in Anytown
Robert Smith at One Main Street in Any Town
Data matching can be applied to other master data entity types such as companies, locations, products and more.
In the data matching world there have always been attempts to apply machine learning (or artificial intelligence if you like). This is because deterministic approaches usually result in too many false negatives, being actual matching entities not found by the computer. Probabilistic / fuzzy logic approaches usually work better, but often not well enough.
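To make the distinction concrete, here is a minimal sketch (not any vendor's actual engine) comparing a deterministic equality check with a fuzzy similarity score on the two example records above; the normalization table is a hypothetical, hand-picked one for illustration:

```python
from difflib import SequenceMatcher

def normalize(record: str) -> str:
    # Crude normalization: lowercase and expand a few common variants.
    # A real matcher would use far richer name/address standardization.
    replacements = {"str": "street", "one": "1", "robert": "bob"}
    words = record.lower().replace(".", "").split()
    return " ".join(replacements.get(w, w) for w in words)

a = "Bob Smith at 1 Main Str in Anytown"
b = "Robert Smith at One Main Street in Any Town"

# Deterministic: exact equality -> a false negative for this pair
exact_match = (a == b)          # False

# Fuzzy: similarity ratio on normalized strings against a chosen threshold
score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
fuzzy_match = score > 0.8       # True for this pair
```

The threshold (0.8 here) is the usual tuning knob: set it too high and false negatives return; too low and false positives appear.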
One of my own attempts with machine learning was made within a solution at Dun & Bradstreet Nordic called GlobalMatchBox. One happy result of the machine learning capability was described in the post The Art in Data Matching.
In recent years I have embraced product master data and product data quality within my business activities. The pain points in handling product information do in some cases include matching product entities, but even more it is about matching the different taxonomies in use for product data, not least between trading partners in business ecosystems.
In software architecture, publish–subscribe is a messaging pattern where senders of messages, called publishers, do not program the messages to be sent directly to specific receivers, called subscribers, but instead categorize published messages into classes without knowledge of which subscribers, if any, there may be. Similarly, subscribers express interest in one or more classes and only receive messages that are of interest, without knowledge of which publishers, if any, there are.
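The pattern can be sketched in a few lines; this is a hypothetical in-memory broker, not Product Data Lake's actual implementation, with message classes modeled as topic strings:

```python
from collections import defaultdict
from typing import Callable

class Broker:
    """Minimal publish-subscribe broker: publishers and subscribers
    only know message classes (topics), never each other."""
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        # A subscriber expresses interest in one class of messages
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver to every handler interested in this class; the
        # publisher never learns who (if anyone) received the message
        for handler in self._subscribers[topic]:
            handler(message)

broker = Broker()
received = []
broker.subscribe("product.updated", received.append)
broker.publish("product.updated", {"sku": "A-100", "price": 9.95})
# received now holds [{'sku': 'A-100', 'price': 9.95}]
```

In practice the broker is a separate piece of infrastructure, and delivery is asynchronous rather than an in-process function call as above.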
This kind of thinking is behind the service called Product Data Lake I am working with now. Whereas a publish-subscribe service is usually something that goes on behind the firewall of an enterprise, Product Data Lake takes this theme into the business ecosystem that exists between trading partners as told in the post Product Data Syndication Freedom.
Therefore, a modification to the publish-subscribe concept in this context is that we actually do make it possible for publishers of product information and subscribers of product information to care a little about who sends and who receives the messages, as exemplified in the post Using a Business Entity Identifier from Day One. However, the scheme for that is a modern one resembling a social network where partnerships are requested and accepted/rejected.
As messages between global trading partners can be highly asynchronous and as the taxonomy in use often will be different, there is a storage part in between. How this is implemented is examined in the post Product Data Lake Behind the Scenes.
4 years ago, a post on this blog was called The Scary Data Lake. The post was about the fear that the then new data lake concept would lead to data swamps with horrific data quality, data dumps no one would ever use, data cesspools with all the badly governed data and data sumps that would never be part of the business processes.
For sure, there have been mistakes with data lakes. But it seems that the data lake concept has matured and the understanding of what a data lake is good for is increasing. The data lake concept has even grown out of the analytic world and into more operational cases, as told in the post Welcome to Another Data Lake for Data Sharing.
Some of the things we have learned are to apply well-known data management principles to data lakes too. This encompasses metadata management, data lineage capabilities and data governance, as reported in the post Three Must Haves for your Data Lake.
A couple of weeks ago Microsoft, Adobe and SAP announced their Open Data Initiative. While this, as far as we know, is only a statement for now, it has of course attracted some interest because three giants in the IT industry have agreed on something – mostly interpreted as agreeing to oppose Salesforce.com.
Forming a business ecosystem among players in the market is not new. However, what we usually see is that a group of companies agrees on a standard and then each one of them puts a product or service that adheres to that standard on the market. The standard then caters for the interoperability between the products and services.
In this case it seems to be something different. The product or service is operated by Microsoft based on their Azure platform. There will be some form of a common data model. But it is a data lake, meaning that we should expect that data can be provided in any structure and format and that data can be consumed in any structure and format.
In all humbleness, this concept is the same as the one that is behind Product Data Lake.
The Open Data Initiative from Microsoft, Adobe and SAP focuses on customer data and seems to be about enterprise-wide customer data. While it technically could also support ecosystem-wide customer data, privacy concerns and compliance issues will restrict that scope in many cases.
At Product Data Lake, we do the same for product data. Only here, the scope is business-ecosystem wide, as the big pain with product data is the flow between trading partners, as examined here.
The intersection between Artificial Intelligence (AI) and Master Data Management (MDM) – and the associated discipline Product Information Management (PIM) – is an emerging topic.
A use case close to me
In my work setting up a service called Product Data Lake, the inclusion of AI has become an important topic. The aim of this service is to translate between the different taxonomies in use at trading partners, for example when a manufacturer shares its product information with a merchant.
In some cases the manufacturer, the provider of product information, may use the same standard for product information as the merchant. This may be deep standards such as eCl@ss and ETIM or pure product classification standards such as UNSPSC. In this case we can apply deterministic matching of the classifications and the attributes (also called properties or features).
However, most often there are uncovered areas even when two trading partners share the same standard. And then again, the most frequent situation is that the two trading partners are using different standards.
As always, applying too much human interaction is costly, time-consuming and error-prone. Therefore, we are very eagerly training our machines to be able to do this work in a cost-effective way, within a much shorter time frame and with a repeatable and consistent outcome, to the benefit of the participating manufacturers, merchants and other enterprises involved in exchanging products and the related product information.
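As a toy illustration of what such automated taxonomy matching has to do, the sketch below scores attribute names from two different (made-up) standards by string similarity and routes each suggestion towards acceptance or human review; a production solution would add synonym lists, unit checks and trained models rather than plain string similarity:

```python
from difflib import SequenceMatcher

# Hypothetical attribute names from two different product data standards
source_attrs = ["Nominal voltage", "Housing colour", "Weight incl. packaging"]
target_attrs = ["Voltage, nominal", "Color of housing", "Gross weight"]

def best_match(attr: str, candidates: list[str]) -> tuple[str, float]:
    # Suggest the most similar target attribute with a confidence score
    scored = [(c, SequenceMatcher(None, attr.lower(), c.lower()).ratio())
              for c in candidates]
    return max(scored, key=lambda pair: pair[1])

for attr in source_attrs:
    candidate, score = best_match(attr, target_attrs)
    # Low-confidence suggestions would be queued for human review instead
    print(f"{attr!r} -> {candidate!r} (score {score:.2f})")
```

The point is the division of labor: the machine proposes mappings with a confidence score, and human effort is spent only on the low-confidence remainder.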
Learning from others
This week I participated in a workshop around exchanging experiences and proving use cases for AI and MDM. The above-mentioned use case was one of several use cases examined here. And for sure, there is a basis for applying AI with substantial benefits for the enterprises that get this. The workshop was arranged by Camelot Management Consultants within their Global Community for Artificial Intelligence in MDM.