The 2020 Gartner Magic Quadrant for Data Quality Solutions is out.
In the report, Gartner assumes that: “By 2022, 60% of organizations will leverage machine-learning-enabled data quality technology for suggestions to reduce manual tasks for data quality improvement”.
The data quality tool vendor rankings according to Gartner look much the same as last year. Precisely is the brand that appeared in last year’s report as Syncsort and Pitney Bowes.
You can get a free reprint of the report from Talend or Informatica.
The question is whether the machine-learning-based solutions will come from the crowd of vendors in a somewhat stalled quadrant, or whether the disruption will come from new solution providers. You can find some of the upcoming machine learning / Artificial Intelligence (AI) based vendors on The Disruptive MDM / PIM / DQM List.
In the game of winning in business by using Artificial Intelligence (AI), there are two main weapons you can use: algorithms and data. In a recent blog post, Andrew White of the analyst firm Gartner says: It’s all about the data – not the algorithm.
In the Master Data Management (MDM) space, equipping solutions with AI capabilities has been going on for some time, as reported in the post Artificial Intelligence (AI) and Master Data Management (MDM).
So, the next question is how to provide the data. It is questionable whether every single organization has sufficient (and well-managed) master data to make a winning formula. Most organizations must, for many use cases, look beyond the enterprise firewall to get the training data, or better yet the data-fuelled algorithms, to win the battles and the whole game.
An example of such a scenario is examined in the post Artificial Intelligence (AI) and Multienterprise MDM.
There is yet another new entry on The Disruptive MDM / PIM / DQM List.
EntityWise is a data matching solution specializing in the healthcare sector. At EntityWise they use machine learning and artificial intelligence (AI) based technology to overcome the burden of inspecting suspect duplicates.
As such, EntityWise is a good example of the long tail of Data Quality Management (DQM) solutions that provide a good return on investment for organizations with specific data quality issues.
Learn more about EntityWise here.
The Disruptive MDM / PIM List is a list of solutions in the Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM) space.
The list presents both larger solutions that are also included by the analyst firms in their market reports and smaller solutions that you do not hear so much about, but that may be exactly the solution that addresses the specific challenges you have.
The latest entry on the list, Reifier, is one of the latter ones.
Matching data records and identifying duplicates in order to achieve a 360-degree view of customers and other master data entities is the most frequently mentioned data quality issue. Reifier is an artificial intelligence (AI) driven solution that tackles that problem.
Read more about Reifier here.
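To make the matching problem described above concrete, here is a minimal sketch of flagging suspect duplicates with plain string similarity. It is not how Reifier or any other listed solution works; the record layout, the `suspect_duplicates` helper and the 0.85 threshold are all assumptions chosen for illustration. AI-driven tools aim to learn such rules and thresholds from examples instead of having them hand-set.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(record):
    """Join the field values, lower-case them and collapse whitespace."""
    return " ".join(" ".join(record.values()).lower().split())


def similarity(a, b):
    """Return a ratio between 0.0 and 1.0 for two records."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def suspect_duplicates(records, threshold=0.85):
    """Return (index, index, score) for pairs at or above the assumed threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((i, j, round(score, 2)))
    return pairs


# Hypothetical customer records, invented for this example.
customers = [
    {"name": "John Smith", "city": "London"},
    {"name": "Jon Smith", "city": "London"},
    {"name": "Maria Garcia", "city": "Madrid"},
]

for i, j, score in suspect_duplicates(customers):
    print(f"Records {i} and {j} look like duplicates (score {score})")
```

Comparing every pair is only workable for small datasets; real matching solutions usually narrow the candidate pairs first and leave the borderline cases for human inspection.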
Every time there is a survey about what causes poor data quality the most ticked answer is human error. This is also the case in the Profisee 2019 State of Data Management Report where 58% of the respondents said that human error is among the most prevalent causes of poor data quality within their organization.
This topic was also examined some years ago in the post called The Internet of Things and the Fat-Finger Syndrome.
Even the Romans knew this as Seneca the Younger said that “errare humanum est” which translates to “to err is human”. He also added “but to persist in error is diabolical”.
So, how can we not persist in having human errors in data then? Here are three main approaches:
- Better humans: There is a whip called Data Governance. In a data governance regime you define data policies and data standards. You build an organizational structure with a data governance council (or any better name), have data stewards and data custodians (or any better title). You set up a business glossary. And then you carry on with a data governance framework.
- Machines: Robotic Process Automation (RPA) has, besides operational efficiency, the advantage that machines, unlike humans, do not make mistakes when they are tired and bored.
- Data Sharing: Human errors typically occur when typing in data. However, most data is already typed in somewhere. Instead of retyping data, and thereby potentially introducing your own misspelling or other mistake, you can connect to data that is already digitalized and validated. This is especially doable for master data as examined in the article about Master Data Share.
Yesterday I had the pleasure of attending the Informatica MDM 360 and Data Governance Summit in London, including being on a panel discussing best practices for your MDM 360 journey. The rise of Artificial Intelligence (AI) in Master Data Management (MDM) was a main theme at this event.
Informatica has a track record of innovating in new technologies in the data management space while also acquiring promising newcomers in order to fast track their market offering. So it is with AI and MDM at Informatica too. Informatica currently has two tracks:
- clAIre – the clairvoyant component in the Informatica portfolio that “using machine learning and other AI techniques leverages the industry-leading metadata capabilities of the Informatica Intelligent Data Platform to accelerate and automate core data management and governance processes”.
- Informatica Customer 360 Insights which is the new branding of the recent AllSight acquisition. You can learn about that over at The Disruptive Master Data Management Solutions List in the entry about Informatica Customer 360 Insights.
At the Informatica event the synergy between these two tracks was presented as the Intelligent 360 View. Naturally, marketing synergies are the first results of an acquisition. Later we will – hopefully – see actual synergies when the technologies are to be aligned, positioned and delivered to customers who want to be an intelligent enterprise of the future.
Like any other IT-enabled discipline, Master Data Management (MDM) continuously undergoes transformation while adopting emerging technologies. In the following I will focus on five trends that, as seen today, seem to be disruptive:
MDM in the Cloud
According to Gartner, the share of cloud-based MDM deployments increased from 19% in 2017 to 24% in 2018, and I am sure that number will increase again this year. But does it come as SaaS (Software as a Service), PaaS (Platform as a Service) or IaaS (Infrastructure as a Service)? And what about DaaS (Data as a Service)? Learn more in the post MDM, Cloud, SaaS, PaaS, IaaS and DaaS.
Extended MDM Platforms
There is a tendency in the Master Data Management (MDM) market for solution providers to aim at delivering an extended MDM platform to underpin customer experience efforts. Such a platform will not only handle traditional master data, but also reference data and big data (as in data lakes), as well as linking to transactions. Learn more in the post Extended MDM Platforms.
AI and MDM
There is an interdependency between MDM and Artificial Intelligence (AI). AI and Machine Learning (ML) depend on data quality, which is sustained with MDM, as examined in the post Machine Learning, Artificial Intelligence and Data Quality. And you can use AI and ML to solve MDM issues, as told in the post Six MDM, AI and ML Use Cases.
IoT and MDM
The scope of MDM will increase with the rise of the Internet of Things (IoT), as reported in the post IoT and MDM. We will probably see the highest maturity for that first in the Industrial Internet of Things (IIoT), also referred to as Industry 4.0, as pondered in the post IIoT (or Industry 4.0) Will Mature Before IoT.
Ecosystem wide MDM
Doing Master Data Management (MDM) enterprise wide is hard enough. But it does not stop there. Increasingly, every organization will be an integrated part of a business ecosystem where collaboration with business partners will be a part of digitalization, and thus we will need to work on the same foundation around master data. Learn more in the post Multienterprise MDM.
One of the hottest trends in the Master Data Management (MDM) world today is how to exploit Artificial Intelligence (AI) and ignite that with Machine Learning (ML).
This aspiration is not new. It has been going on for years, and you may argue about when computerized decision support and automation go from applying advanced algorithms to being AI. However, the AI and ML theme is gaining traction today as part of digital transformation and, whatever we call it, there are substantial business outcomes to pursue.
As told in the post Machine Learning, Artificial Intelligence and Data Quality, perhaps all use cases for applying AI are dependent on data quality, and MDM plays a crucial role in sustaining data quality efforts.
Some of the use cases for AI and ML in the MDM realm I have come across over the years are:
Translating between taxonomies: As reported in the post Artificial Intelligence (AI) and Multienterprise MDM, emerging technologies can help in translating between the taxonomies in use when digital transformation sets a new bar for utilizing master data in business ecosystems.
Transforming unstructured to structured: A lot of data is kept in an unstructured way, and in order to systematically exploit this data in AI-supported business processes we need to make the data more structured. AI and ML can help with that too.
Data quality issue prevention: Simple rules for checking integrity and validating data are good, but unfortunately not good enough for ensuring data quality. AI is a way to exploit statistical methods and complex relationships.
Categorizing data: Digital transformation, spiced up with increasing compliance requirements, has made data categorization a must and AI and ML can be an effective way to solve this task that usually is not possible for humans to cover across an enterprise.
Data matching: Establishing a link between multiple descriptions of the same real-world entity across an enterprise and out to third party reference data has always been a pain. AI and ML can help as examined in the post The Art in Data Matching.
Improving insight: The scope of MDM can be enlarged to Extended MDM Platforms, where other data such as transactions and big data are used to build a 360-degree view of the master data entities. AI and ML are a prerequisite for doing that.
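The data quality issue prevention use case in the list above can be sketched with a small contrast between a simple rule and a statistical check. This is only an illustration under stated assumptions: the sample weights, the function names and the 3.5 cutoff on the modified z-score (based on the median absolute deviation) are chosen for the example and do not come from any particular MDM product.

```python
from statistics import median


def rule_violations(values):
    """Simple integrity rule: a value must be a positive number."""
    return [i for i, v in enumerate(values) if v <= 0]


def statistical_outliers(values, cutoff=3.5):
    """Flag values far from the median using the MAD-based modified z-score."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread to measure against, so nothing can be flagged.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > cutoff
    ]


# Hypothetical unit weights in kilograms; 1200.0 was probably keyed in grams.
weights = [1.1, 1.3, 1.2, 1200.0, 1.25, 1.15, -1.0]

print("Rule violations at positions:", rule_violations(weights))
print("Statistical outliers at positions:", statistical_outliers(weights))
```

The value 1200.0 passes the positivity rule and is only caught by the statistical check, which is the gap the use case points at. Machine learning approaches extend the same idea to relationships between several attributes at once.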
The previous post on this blog was called Machine Learning, Artificial Intelligence and Data Quality. It examined how Artificial Intelligence (AI) is impacted by data quality and how AI can impact data quality.
Master Data Management (MDM) will play a crucial role in sustaining the needed data quality for AI and with the rise of digital transformation encompassing business ecosystems we will also see an increasing need for ecosystem wide MDM – also called multienterprise MDM.
Right now, I am working with a service called Product Data Lake where we strive to utilize AI including using Machine Learning (ML) to understand and map data standards and exchange formats used within product information exchange between trading partners.
The challenge in this area is that we have many different classification systems in play as told in the post Five Product Classification Standards. Besides the industry and cross sector standards we still have many homegrown standards as well.
Some of these standards (such as eClass and ETIM) also cover standards for the attributes needed for a given product classification, but still, we have plenty of homegrown standards (or no standards) for attribute requirements as well.
Add to that the different preferences for exchange methods and we get a chaotic system where human intervention makes Sisyphus look like a lucky man. Therefore, we have great expectations about introducing machine learning and artificial intelligence in this space.
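As a rough sketch of the mapping task described above, the snippet below suggests a target attribute for each homegrown field name using simple string similarity. It does not show how Product Data Lake works; the target attribute names are hypothetical and not taken from eClass or ETIM, and the `suggest_mapping` helper and its 0.6 cutoff are assumptions. A learning-based approach would replace the fixed similarity measure with a model trained on mappings that people have already confirmed.

```python
from difflib import get_close_matches

# Hypothetical target attribute names; not taken from eClass or ETIM.
TARGET_ATTRIBUTES = [
    "nominal voltage",
    "cable length",
    "housing colour",
    "net weight",
]


def suggest_mapping(source_names, targets=TARGET_ATTRIBUTES, cutoff=0.6):
    """Suggest the closest target per source name, or None for manual review."""
    mapping = {}
    for name in source_names:
        cleaned = name.lower().replace("_", " ")
        candidates = get_close_matches(cleaned, targets, n=1, cutoff=cutoff)
        mapping[name] = candidates[0] if candidates else None
    return mapping


# Hypothetical field names as a supplier might send them.
supplier_fields = ["Nominal_Voltage", "Net Wt", "Warranty"]

for source, target in suggest_mapping(supplier_fields).items():
    print(f"{source!r} -> {target!r}")
```

Fields that come back as `None` are the ones still left for a person to map, which is where the Sisyphus work remains until better suggestions are available.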
Next week, I will elaborate on the multienterprise MDM and artificial intelligence theme at the Master Data Management Summit Europe in London.
Using machine learning (ML) and then artificial intelligence (AI) to automate business processes is a hot topic and on the wish list at most organizations. However, many, including yours truly, warn that automating business processes based on data with data quality issues is a risky thing.
In my eyes, we need to take a phased approach and use ML and AI twice to ensure the right business outcomes from AI automated business processes. ML and AI can be used to rationalize data and overcome data quality issues as exemplified in the post The Art in Data Matching.
Instead of applying ML and AI to whatever dirty dataset is at hand for a given business process, the right way is to use ML and AI to understand and assess relevant datasets within the organization, and then use the resulting rationalized data, which can be understood by machines, for sustainable automation of business processes.
Most of these rationalized data will be master data, where there is a movement to include ML and AI in Master Data Management solutions by forward looking vendors as examined in the post Artificial Intelligence (AI) and Master Data Management (MDM).