The Future of Disruptive MDM is in the Cloud

Two recent posts on the Gartner blog are about databases in the cloud. The Future of Database Management Systems Is Cloud by Merv Adrian ponders why cloud is now the default platform for managing data, and The Future of Database Management Systems Is Cloud by Donald Feinberg does the same. Well, the two posts are identical.

This also means that the default platform for Master Data Management (MDM) will be the cloud. On top of that, the other disruptive MDM trends will also work best in the cloud.

Disruptive MDM in the Cloud

  • We increasingly see Extended MDM Platforms that also handle reference data and big data. Both these data types are predominantly external in nature and therefore they are better collected, or even better connected, in the cloud.
  • Services for Artificial Intelligence (AI) and Master Data Management (MDM) are delivered by vendors as cloud solutions.
  • Encompassing IoT and MDM means collaboration between many parties and this is, with all the relationships to take care of, only possible with cloud platforms.
  • We will see several other use cases for business-ecosystem-wide, cross-company sharing of master data in what Gartner has coined Multienterprise MDM.

Modern Data Management, Paella, Herodotus, Darwin and Einstein

Reltio has a blog series with the tag #moderndatamasters. The posts are interviews with people in the data management world. The other day it was my turn to share my story.

Kate Tickner from Reltio took me through some serious questions such as:

  • How would you define “modern” data management and what does it /should it mean for organisations that adopt it?
  • What are your top 3 tips or resources to share for aspiring modern data masters?
  • Can you tell us a little more about the concepts behind Product Data Lake and your vision for how it could be used in the future?
  • What trends or changes do you predict to the data management arena in the next few years?

You can read the interview here on the Reltio blog.

At the end we touched on:

  • What do you like to do outside of work?
  • Which 3 people – living or dead, real or fictional – would you invite to a dinner party and why?
  • What are you cooking?

For the dinner party I would make paella and, based on my interest in history, I picked three historical persons who have also been featured on this blog: Herodotus, Darwin and Einstein.

Reltio moderndatamasters

Data Quality and the Climate Issue

The similarities between raising awareness of data quality issues and of the climate issue were touched on 10 years ago here on this blog in the post Data Quality and Climate Politics.

The challenges are still the same.

There are many examples published where the results of climate change are pictured. A recent one is the image from Greenland showing huskies pulling sleds not over the usual ice, but through water.

(Image taken by Steffen Malskær Olsen, @SteffenMalskaer, here published on CNN)

We also see statistics showing a development towards melting ice masses with rising sea levels as the foreseeable result. However, statistics can always be questioned. Is the ice thickening somewhere else? Has this happened many times before?

These kinds of questions show the layers we must go through in getting from data quality to information quality, then to decision quality and, on top, the wisdom in applying the right knowledge, whether that is to achieve business outcomes or to avoid climate change.

DIKW data quality

RDM: A Small but Important Extension to MDM

Reference Data Management (RDM) is a small but important extension to Master Data Management (MDM). Together with a large extension, namely big data and data lakes, mastering reference data is increasingly part of the offerings from MDM solution vendors, as told in the post Extended MDM Platforms.

RDM

Reference Data

Reference data are the smaller lists of values that give context to master data and ensure that we use the same (or linkable) codes for describing master data entities. Examples are:

Reference data tend to be externally defined and maintained typically by international standardization bodies or industry organizations, but reference data can also be internally defined to meet your specific business model.
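As a small illustration of what "the same (or linkable) codes" can mean in practice, here is a minimal sketch of a crosswalk between two code lists for the same reference data. The values are ISO 3166-1 country codes in the two-letter and three-letter formats. The function names are made up for this example and do not come from any RDM product.

```python
# Illustrative crosswalk between two code lists for the same reference data.
# The values are ISO 3166-1 country codes; all names are hypothetical.
ALPHA2_TO_ALPHA3 = {"DK": "DNK", "DE": "DEU", "GB": "GBR", "US": "USA"}

# The reverse crosswalk is derived, so the two lists cannot drift apart.
ALPHA3_TO_ALPHA2 = {a3: a2 for a2, a3 in ALPHA2_TO_ALPHA3.items()}


def to_alpha3(code):
    """Translate a two-letter code to its three-letter counterpart."""
    key = code.strip().upper()
    if key not in ALPHA2_TO_ALPHA3:
        raise ValueError(f"unknown country code: {code!r}")
    return ALPHA2_TO_ALPHA3[key]


def canonical(code):
    """Return the two-letter code for a value given in either format."""
    key = code.strip().upper()
    if key in ALPHA2_TO_ALPHA3:
        return key
    if key in ALPHA3_TO_ALPHA2:
        return ALPHA3_TO_ALPHA2[key]
    raise ValueError(f"unknown country code: {code!r}")


def same_country(code_a, code_b):
    """True if two codes, in either format, point at the same country."""
    return canonical(code_a) == canonical(code_b)
```

With such a crosswalk, two systems that store "GB" and "GBR" respectively can still be linked on country without either of them changing its own code list.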

3 RDM Solutions from MDM Vendors

Informatica has recently released the first version of a new RDM solution: MDM – Reference 360. This is, by the way, the first true Software as a Service (SaaS) solution from Informatica in the MDM space. This solution emphasizes building a hierarchy of reference data lists, the ability to make crosswalks between the lists, workflow (approval) around updates and audit trails.

Reltio has embraced RDM as an integral part of their Reltio Cloud solution, where the “RDM capabilities improves data governance and operational excellence with an easy to use application that creates, manages and provisions reference data for better reporting and analytics.”

Semarchy has a solution called Semarchy xDM. The x indicates that this solution encompasses all kinds of enterprise grade data and thus both master data and reference data, while “xDM extends the agile development concept to its implementation paradigm”.

MDM Trend: Data as a Service

A recent post on this blog was called Five Disruptive MDM Trends. One of the trends mentioned there is MDM in the cloud, and one form of Master Data Management in the cloud is Data as a Service (DaaS).

DaaS within MDM

Using Data as a Service in the cloud within MDM solutions is a great way of ensuring data quality. You have access to real-time validation and enrichment of master data, and you can also use third party and second party services in the on-boarding processes. That way you avoid typing in data, with the unavoidable human errors that otherwise are the most common root cause of data quality issues.

Some of the most common data services useful in MDM are:

Address Verification and Geocoding

When handling location data, it is crucial in MDM to have a valid and standardized description of postal addresses and, in many cases, also a code that tells the geographic position.

Postal address verification can be done either through a global service such as Loqate from GB Group or AddressDoctor, which is part of the Informatica offering, or through national services that are better (but also more narrowly) aligned with a given address format within a country and with the specific extra services available in some countries.

Geocodes can either be latitude and longitude (typically expressed in the WGS84 reference system) or flat-map-friendly systems such as UTM coordinates.
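To make the two kinds of geocodes a bit more concrete, here is a minimal sketch that checks whether a latitude and longitude pair is within range and derives which UTM zone the position falls in. It uses the standard rule of 6-degree-wide zones numbered from 180°W. It is an assumption-laden simplification: it ignores the special zone boundaries around Norway and Svalbard, does not enforce the usual UTM latitude limits (80°S to 84°N), and does not compute the easting and northing themselves. The function names are made up for this example.

```python
def is_valid_lat_lon(lat, lon):
    """Range check only: says nothing about whether an address exists there."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def utm_zone(lat, lon):
    """Return (zone number, hemisphere) for a latitude/longitude pair.

    Simplified: standard 6-degree zones, no Norway/Svalbard exceptions.
    """
    if not is_valid_lat_lon(lat, lon):
        raise ValueError(f"latitude/longitude out of range: {lat}, {lon}")
    zone = int((lon + 180.0) // 6) + 1
    zone = min(zone, 60)  # a longitude of exactly 180 belongs to zone 60
    hemisphere = "N" if lat >= 0 else "S"
    return zone, hemisphere


# Approximate city centre positions, used for illustration only.
print(utm_zone(55.68, 12.57))    # Copenhagen
print(utm_zone(40.71, -74.01))   # New York
print(utm_zone(-33.87, 151.21))  # Sydney
```

A check like the first function is the kind of simple validation you can run locally before handing a record to an external address verification or geocoding service.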

Business Directory Services

When handling party master data such as B2B customers, suppliers and other business partners, it is useful to validate and enrich the data with third party reference data and in some cases even onboard through these sources.

Again, there are global and local options. The most commonly used global one is Dun & Bradstreet, which operates a database called WorldBase that holds business entities from all over the world in a uniform format and also provides data about company family trees on a global basis. Alternatively, many countries have a national service provided by the government, with formats and data elements specific to that country.

Citizen Directory Services

When handling party master data such as B2C customers, employees and other personal data, the third-party possibilities are generally sparser, naturally because of privacy concerns.

In Scandinavia, where I live, these data are available from public sources based on either our national ID or a correct name and address.

Data pools and Product Data Lake

When handling product master data and product information, data pools are available for some product groups and product attributes in some geographies. The most commonly used global service is GDSN from GS1.

Alternatively (or as a supplement), for all other product groups, product attributes and digital assets, and in all other geographies, you can use a service like the one I am working with, which is called Product Data Lake.

Data Modelling and Data Quality

There are intersections between data modelling and data quality. In examining those we can use a data quality mind map published recently on this blog:

Data modelling and data quality

Data Modelling and Data Quality Dimensions:

Some data quality dimensions are closely related to data modelling and a given data model can impact these data quality dimensions. This is the case for:

  • Data integrity, as the relationship rules in a traditional entity-relationship based data model foster the integrity of the data controlled in databases. The weak sides are that these rules are sometimes too rigid to describe actual real-world entities and that integrity across several databases is not covered. To discover issues of the latter kind, we may use data profiling methods.
  • Data validity, as field definitions and relationship rules ensure that only data that is considered valid can enter the database.
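How a data model itself can guard integrity and validity can be sketched with a tiny relational schema. The example below uses SQLite through Python's standard sqlite3 module. The table and column names are made up for illustration. A foreign key stands in for a relationship rule, and NOT NULL and CHECK constraints stand in for field definitions.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two tables and a few constraints."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE country ("
        " code TEXT PRIMARY KEY CHECK (length(code) = 2))"  # validity rule
    )
    conn.execute(
        "CREATE TABLE customer ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"  # validity rule
        " country_code TEXT NOT NULL REFERENCES country(code))"  # integrity rule
    )
    with conn:
        conn.executemany("INSERT INTO country VALUES (?)", [("DK",), ("US",)])
    return conn


def try_insert_customer(conn, name, country_code):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customer (name, country_code) VALUES (?, ?)",
                (name, country_code),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_demo_db()
print(try_insert_customer(conn, "Acme", "DK"))  # accepted
print(try_insert_customer(conn, "Acme", "XX"))  # rejected: unknown country
print(try_insert_customer(conn, None, "DK"))    # rejected: name is missing
```

Note that the rules only apply inside this one database, which is exactly the weak side mentioned above: nothing here protects integrity across several databases.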

Some other data quality dimensions must be solved with either extended data models and/or alternative methodologies. This is the case for:

  • Data completeness:
    • A common scenario is that, for example, a data model born in the United States will set the state field within an address as mandatory and probably accept only a value from a reference list of 50 states. This will not work in the rest of the world. So, in order to avoid getting crap or getting no data at all, you will need to either extend the model or loosen the model and control completeness in other ways.
    • With data about products the big pain is that different groups of products require different data elements. This can be solved with a very granular data model – with possible performance issues, or a very customized data model – with scalability and other issues as a result.
  • Data uniqueness: A common scenario here is that names and addresses can be spelled in many ways even though they reflect the same real-world entity. We can use identity resolution (and data matching) to detect this and then model how we link data records that are real-world duplicates together in a looser or tighter way.
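The data matching idea in the last bullet can be sketched with nothing more than Python's standard library. The example below normalizes name and address strings and compares them with difflib.SequenceMatcher. The threshold of 0.85 and the sample records are assumptions picked for illustration. Real identity resolution typically works on parsed name and address elements and uses far more refined comparison logic.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lower-case, replace punctuation with spaces and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def similarity(a, b):
    """Similarity between 0 and 1 of two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_duplicates(records, threshold=0.85):
    """Return index pairs of records that look like the same real-world entity."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similarity(records[i], records[j]) >= threshold:
                pairs.append((i, j))
    return pairs


records = [
    "John Smith, 1 Main Street",
    "Jon Smith, 1 Main St.",
    "Mary Jones, 99 Oak Avenue",
]
print(candidate_duplicates(records))
```

The output is a list of candidate pairs rather than a merged record. Whether such candidates are linked loosely or merged tightly is the modelling decision mentioned above.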

Emerging technologies:

Some of the emerging technologies in the data storage realm are presenting new ways of solving the challenges we have with data quality and traditional entity-relationship based data models.

Graph databases and document databases allow for describing and operating data models better aligned with the real world. This topic was examined in the post Encompassing Relational, Document and Graph the Best Way.

In the Product Data Lake venture I am working with right now we are also aiming to solve the data integrity, data validity and data completeness issues with product data (or product information if you like) using these emerging technologies. This includes solving issues with geographical diversity and varying completeness requirements through a granular data model that is scalable, not only seen within a given company but also across a whole business ecosystem encompassing many enterprises belonging to the same (data) supply chain.
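As a rough illustration of varying completeness requirements, the sketch below keeps the required attributes per product group, plus extra requirements per country, as data rather than as fixed mandatory columns. This is not the actual Product Data Lake model. The product groups, attribute names and country rule are all hypothetical.

```python
# Hypothetical completeness rules kept as data instead of mandatory columns.
REQUIRED_BY_GROUP = {
    "power tool": {"name", "voltage", "weight_kg"},
    "paint": {"name", "colour", "volume_l"},
}

# Hypothetical extra requirements for a product group in a given country.
EXTRA_BY_GROUP_AND_COUNTRY = {
    ("power tool", "GB"): {"plug_type"},
}


def missing_attributes(product, country):
    """List the required attributes that are absent or empty for a country."""
    group = product["group"]
    required = REQUIRED_BY_GROUP[group] | EXTRA_BY_GROUP_AND_COUNTRY.get(
        (group, country), set()
    )
    present = {key for key, value in product.items() if value not in (None, "")}
    return sorted(required - present)


drill = {"group": "power tool", "name": "Drill", "voltage": 18, "weight_kg": None}
print(missing_attributes(drill, "DK"))
print(missing_attributes(drill, "GB"))
```

Because the rules are data, a new product group or a new country can be added without changing the structure of the product records themselves.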

Connecting Silos

The building next to my home office was originally two cement silos standing in an industrial harbor area among other silos. These two silos have now been transformed into a connected office building as this area has been developed into a modern residential and commercial quarter.

Master Data Management (MDM) is on a similar route.

The first quest for MDM has been to be a core discipline in transforming siloed data stores within a given company into a shared view of the core entities that must be described in the same way across different departmental views. Going from the departmental stage to the enterprise wide stage is examined in the post Three Stages of MDM Maturity.

But as told in this post, it does not stop there. The next transformation is to provide a shared view with trading partners in the business ecosystem(s) where your company operates. After all, the shared data in your organization is also a silo when digital transformation puts pressure on each company to become a data integrated part of a business ecosystem.

A concept for doing that is described on the blog page called Master Data Share.

Silos
Connected silos in Copenhagen North Harbor – and connecting data silos enterprise wide and then business ecosystem wide

The Intelligent Enterprise of the Future, Informatica Style

Yesterday I had the pleasure of attending the Informatica MDM 360 and Data Governance Summit in London including being in a panel discussing best practices for your MDM 360 journey. The rise of Artificial Intelligence (AI) in Master Data Management (MDM) was a main theme at this event.

Informatica has a track record of innovating in new technologies in the data management space while also acquiring promising newcomers in order to fast track their market offering. So it is with AI and MDM at Informatica too. Informatica currently has two tracks:

  • clAIre – the clairvoyant component in the Informatica portfolio that “using machine learning and other AI techniques leverages the industry-leading metadata capabilities of the Informatica Intelligent Data Platform to accelerate and automate core data management and governance processes”.
  • Informatica Customer 360 Insights which is the new branding of the recent AllSight acquisition. You can learn about that over at The Disruptive Master Data Management Solutions List in the entry about Informatica Customer 360 Insights.

At the Informatica event the synergy between these two tracks was presented as the Intelligent 360 View. Naturally, marketing synergies are the first results of an acquisition. Later we will – hopefully – see actual synergies when the technologies are to be aligned, positioned and delivered to customers who want to be an intelligent enterprise of the future.

Infa Intelligent Enterprise of the Future

Five Disruptive MDM Trends

Like any other IT-enabled discipline, Master Data Management (MDM) continuously undergoes transformation while adopting emerging technologies. In the following I will focus on five trends that, seen today, seem to be disruptive:

Disruptive MDM

MDM in the Cloud

According to Gartner the share of cloud-based MDM deployments increased from 19% in 2017 to 24% in 2018, and I am sure that number will increase again this year. But does it come as SaaS (Software as a Service), PaaS (Platform as a Service) or IaaS (Infrastructure as a Service)? And what about DaaS (Data as a Service)? Learn more in the post MDM, Cloud, SaaS, PaaS, IaaS and DaaS.

Extended MDM Platforms

There is a tendency in the Master Data Management (MDM) market for solution providers to aim to deliver an extended MDM platform to underpin customer experience efforts. Such a platform will not only handle traditional master data, but also reference data and big data (as in data lakes), as well as linking to transactions. Learn more in the post Extended MDM Platforms.

AI and MDM

There is an interdependency between MDM and Artificial Intelligence (AI). AI and Machine Learning (ML) depend on data quality, which is sustained with MDM, as examined in the post Machine Learning, Artificial Intelligence and Data Quality. And you can use AI and ML to solve MDM issues as told in the post Six MDM, AI and ML Use Cases.

IoT and MDM

The scope of MDM will increase with the rise of Internet of Things (IoT) as reported in the post IoT and MDM. Probably we will see the highest maturity for that first in Industrial Internet of Things (IIoT), also referred to as Industry 4.0, as pondered in the post IIoT (or Industry 4.0) Will Mature Before IoT.

Ecosystem wide MDM

Doing Master Data Management (MDM) enterprise wide is hard enough. But it does not stop there. Increasingly every organization will be an integrated part of a business ecosystem where collaboration with business partners will be a part of digitalization and thus we will have a need for working on the same foundation around master data. Learn more in the post Multienterprise MDM.

Six MDM, AI and ML Use Cases

One of the hottest trends in the Master Data Management (MDM) world today is how to exploit Artificial Intelligence (AI) and ignite that with Machine Learning (ML).

This aspiration is not new. It has been going on for years, and you may argue about when computerized decision support and automation go from applying advanced algorithms to being AI. However, the AI and ML theme is getting traction today as part of digital transformation and, whatever we call it, there are substantial business outcomes to pursue.

As told in the post Machine Learning, Artificial Intelligence and Data Quality, perhaps all use cases for applying AI are dependent on data quality, and MDM plays a crucial role in sustaining data quality efforts.

Some of the use cases for AI and ML in the MDM realm I have come across over the years are:

6 MDM, AI and ML use cases

Translating between taxonomies: As reported in the post Artificial Intelligence (AI) and Multienterprise MDM emerging technologies can help in translating between the taxonomies in use when digital transformation sets a new bar for utilizing master data in business ecosystems.

Transforming unstructured to structured: A lot of data is kept in an unstructured way, and in order to systematically exploit these data in AI-supported business processes we need to make the data more structured. AI and ML can help with that too.

Data quality issue prevention: Simple rules for checking integrity and validating data are good, but unfortunately not good enough for ensuring data quality. AI is a way to exploit statistical methods and complex relationships.
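To give a feel for the difference between a simple rule and a statistical method, here is a minimal sketch of a robust outlier check based on the median and the median absolute deviation (the modified z-score, with the commonly used constant 0.6745 and cut-off 3.5). This is plain statistics rather than machine learning, and the sample values are invented. It only illustrates the first step beyond fixed validation rules.

```python
from statistics import median


def modified_z_scores(values):
    """Modified z-scores based on the median absolute deviation (MAD)."""
    centre = median(values)
    mad = median([abs(v - centre) for v in values])
    if mad == 0:
        # Degenerate case: at least half of the values are identical,
        # so this sketch makes no judgement and scores everything as 0.
        return [0.0 for _ in values]
    return [0.6745 * (v - centre) / mad for v in values]


def find_outliers(values, cutoff=3.5):
    """Return the values whose modified z-score exceeds the cut-off."""
    scores = modified_z_scores(values)
    return [v for v, score in zip(values, scores) if abs(score) > cutoff]


# Invented unit weights in kg for the same product from six source records.
weights = [10.0, 10.5, 9.8, 10.2, 99.0, 10.1]
print(find_outliers(weights))
```

A fixed rule such as "weight must be positive" would accept every value in the sample, while the statistical check singles out the one that does not fit with the rest.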

Categorizing data: Digital transformation, spiced up with increasing compliance requirements, has made data categorization a must and AI and ML can be an effective way to solve this task that usually is not possible for humans to cover across an enterprise.

Data matching: Establishing a link between multiple descriptions of the same real-world entity across an enterprise and out to third party reference data has always been a pain. AI and ML can help as examined in the post The Art in Data Matching.

Improving insight: The scope of MDM can be enlarged to Extended MDM Platforms where other data such as transactions and big data are used to build a 360-degree view of the master data entities. AI and ML are a prerequisite for doing that.