Welcome to Another Data Lake for Data Sharing

A couple of weeks ago Microsoft, Adobe and SAP announced their Open Data Initiative. While this, as far as we know, is only a statement for now, it has of course attracted some interest because three giants of the IT industry have agreed on something, which has mostly been interpreted as an agreement to oppose Salesforce.com.

Forming a business ecosystem among players in the market is not new. However, what we usually see is that a group of companies agrees on a standard and then each one of them puts a product or service, that adheres to that standard, on the market. The standard then caters for the interoperability between the products and services.

In this case it seems to be something different. The product or service is operated by Microsoft based on their Azure platform. There will be some form of a common data model. But it is a data lake, meaning that we should expect that data can be provided in any structure and format and that data can be consumed into any structure and format.

In all humility, this is the same concept as the one behind Product Data Lake.

The Open Data Initiative from Microsoft, Adobe and SAP focuses on customer data and seems to be about enterprise wide customer data. While it could technically also support ecosystem wide customer data, privacy concerns and compliance issues will restrict that scope in many cases.

At Product Data Lake, we do the same for product data. Only here, the scope is business ecosystem wide as the big pain with product data is the flow between trading partners as examined here.


It is time to apply AI to MDM and PIM

The intersection between Artificial Intelligence (AI) and Master Data Management (MDM) – and the associated discipline Product Information Management (PIM) – is an emerging topic.

A use case close to me

In my work setting up a service called Product Data Lake, the inclusion of AI has become an important topic. The aim of this service is to translate between the different taxonomies in use at trading partners, for example when a manufacturer shares product information with a merchant.

In some cases the manufacturer, the provider of product information, may use the same standard for product information as the merchant. This may be deep standards such as eCl@ss and ETIM or pure product classification standards such as UNSPSC. In this case we can apply deterministic matching of the classifications and the attributes (also called properties or features).
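As a minimal sketch of what deterministic matching could look like when both partners use the same standard, the Python below links a product by exact equality of class and attribute codes. The codes, the field names and the `match_deterministically` helper are made up for illustration; they are not taken from eCl@ss, ETIM, UNSPSC or the Product Data Lake service.

```python
def match_deterministically(provider_product, receiver_schema):
    """Link a provider's product to a receiver's schema by exact code equality.

    Returns (matched, uncovered): attribute values the receiver can take over
    directly, and attribute codes the receiver's schema does not cover.
    """
    if provider_product["class_code"] not in receiver_schema:
        # Class is unknown to the receiver: nothing can be matched by code.
        return {}, sorted(provider_product["attributes"])

    wanted = receiver_schema[provider_product["class_code"]]
    matched = {code: value
               for code, value in provider_product["attributes"].items()
               if code in wanted}
    uncovered = sorted(code for code in provider_product["attributes"]
                       if code not in wanted)
    return matched, uncovered


# Hypothetical codes, for illustration only.
receiver_schema = {"CLASS-001": {"ATTR-LENGTH", "ATTR-COLOUR"}}
product = {
    "class_code": "CLASS-001",
    "attributes": {"ATTR-LENGTH": "120 mm", "ATTR-COLOUR": "red", "ATTR-IP": "IP44"},
}

matched, uncovered = match_deterministically(product, receiver_schema)
print(matched)
print(uncovered)
```

Anything that ends up in `uncovered` cannot be resolved by code equality alone and would need linking by other means.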


However, most often there are uncovered areas even when two trading partners share the same standard. And then again, the most frequent situation is that the two trading partners are using different standards.

In that case we initially will use human resources to do the linking. Our data governance framework for that includes upstream (manufacturer) responsibility, downstream (merchant) responsibility and our ambassador concept.

As always, applying too much human interaction is costly, time consuming and error prone. Therefore, we are very eagerly training our machines to be able to do this work in a cost-effective way, within a much shorter time frame and with a repeatable and consistent outcome to the benefit of the participating manufacturers, merchants and other enterprises involved in exchanging products and the related product information.

Learning from others

This week I participated in a workshop on exchanging experiences and proving use cases for AI and MDM. The above-mentioned use case was one of several use cases examined there. And for sure, there is a basis for applying AI with substantial benefits for the enterprises that get this. The workshop was arranged by Camelot Management Consultants within their Global Community for Artificial Intelligence in MDM.

Making Your MDM Vendor Longlist and Shortlist

Various analyst firms are making more or less periodic reports with vendor rankings and recommendations for the Master Data Management (MDM) market.

The latest one from Constellation Research is their Constellation ShortList™ Master Data Management authored by R “Ray” Wang.

The public part of the Q3 2018 report reveals 6 shortlisted vendors among over 15 evaluated ones.

Shortlist

Some viable solutions among them indeed.

PS: If you would prefer not to start with a generic shortlist, you can compile your relevant longlist and weighted shortlist based on The Disruptive Master Data Management Solutions List.

MDM Hype Cycle, GDSN, Data Quality, Multienterprise MDM and Product Data Syndication

Gartner, the analyst firm, has a hype cycle for Information Governance and Master Data Management.

Back in 2012 there was a hype cycle for just Master Data Management. It looked like this:

Hype cycle MDM 2012
Source: Gartner

I have made a red circle around the two rightmost terms: “Data Quality Tools” and “Information Exchange and Global Data Synchronization”.

Now, 6 years later, the terms included in the cycle are the ones below:

Hype Cycle MDM 2018
Source: Gartner

The two terms “Data Quality Tools” and “Information Exchange and Global Data Synchronization” are not mentioned here. I do not think it is because they ever fulfilled their purpose. I think they are being supplemented by something new. One of the terms that has emerged since 2012 is Multienterprise MDM, shown in the red circle.

As touched on in the post Product Data Quality, we have seen data quality tools in action for years when it comes to customer (or party) master data, but not that much when it comes to product master data.

Global Data Synchronization has revolved around the GS1 concept of GDSN (Global Data Synchronization Network) and the exchange of product data between trading partners. However, after 40 years in play this concept only covers a fraction of the products traded worldwide and only for very basic product master data. Product data syndication between trading partners for a lot of product information and related digital assets must still be handled otherwise today.

In my eyes Multienterprise MDM comes to the rescue. This concept was examined in the post Ecosystem Wide MDM. You can gain business benefits from extending enterprise wide product master data management to be multienterprise wide. This includes:

  • Working with the same product classifications or being able to continuously map between different classifications used by trading partners
  • Utilizing the same attribute definitions (metadata around products) or being able to continuously map between different attribute taxonomies in use by trading partners
  • Sharing data on product relationships (available accessories, relevant spare parts, updated succession for products, cross-sell information and up-sell opportunities)
  • Having shared access to latest versions of digital assets (text, audio, video) associated with products.

This is what we work for at Product Data Lake – including Machine Learning Enabled Data Quality, Data Classification, Cloud MDM Hub Service and Multienterprise Metadata Management.

MDM Vendor Landscape 2018

The Information Difference MDM Landscape Q2 2018 is out.

The report confirms the trend of increasing uptake of cloud Master Data Management solutions as examined in the recent post called The Rise of Cloud MDM.

According to the report, the coexistence of big data and master data is another trend, and more and more MDM vendors are embracing all master data domains, although, as stated, “most vendors have their roots in either customer or product data, and their particular functionality and track record of deployment is usually deeper where the software had its roots”.

The plot of vendors looks like this:

MDM Landscape Q2 2018

You can read the full report here.

PS: Many of the vendors – and some more – are presented in depth on The Disruptive MDM List.

PPS: If you represent a vendor not on The Disruptive MDM List, you can register here.

The Disruptive MDM List is Growing

The Disruptive MDM Solutions List is a list of available:

  • Master Data Management (MDM) solutions
  • Customer Data Integration (CDI) solutions
  • Product Information Management (PIM) solutions
  • Digital Asset Management (DAM) solutions.

The list will help you when making vendor longlists and shortlists for new implementations of these solutions.

The latest entry is Magnitude MDM. This solution was previously known as Kalido, one of the first MDM solutions to emerge on the nascent market for MDM technology back in the mid 00’s.

You can see the current list of MDM solutions here.


Avoid Duplicates by Avoiding Peer-to-Peer Integrations

When working in Master Data Management (MDM) programs, some of the main pain points always on the list are duplicates. As explained in the post Golden Records in Multi-Domain MDM, these may be duplicates in party master data (customer, supplier and other roles) as well as duplicates in product master data, assets, locations and more.

Most of the data quality technology available to solve these problems revolves around identifying duplicates. This is a very intriguing discipline where I have spent some of my best years. However, this is only a remedy for the symptoms of the problem and not a means to eliminate the root cause, as touched on in the post The Good, Better and Best Way of Avoiding Duplicates.

The root causes are plentiful and as all challenges they involve technology, processes and people.

Having an IT landscape with multiple applications where master data are created, updated and consumed is a basic problem, and a remedy to that is the main reason for being for Master Data Management (MDM) solutions. The challenge is to implement MDM technology in a way that the MDM solution will not just become another silo of master data but instead be a solution for sharing master data within the enterprise and, ultimately, in the digital ecosystem around the enterprise.

The main enemy from a technology perspective is, in my experience, peer-to-peer system integration solutions. If you have chosen application X to support a business objective and application Y to support another business objective and you learn that there is an integration solution between X and Y available, this is very bad news, because short term cost and timing considerations will make that option obvious. But in the long run it will cost you dearly if the master data involved are handled in other applications as well, because then you will have blind spots all over the place through which duplicates will enter.

The only sustainable solution is to build a master data hub through which master data are integrated and thus shared with all applications inside the enterprise and around the enterprise. This hub must encompass a shared master data model and related metadata.
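To make the hub idea concrete, here is a toy Python sketch in which every application registers parties through one shared place, so a second application finds the existing record instead of creating its own. The `MasterDataHub` class, its match key and the application names are assumptions for illustration only; real matching takes far more than the naive normalisation used here.

```python
class MasterDataHub:
    """Toy hub: every application creates and reads parties through one place."""

    def __init__(self):
        self._records = {}   # hub id -> shared master record
        self._by_key = {}    # match key -> hub id

    @staticmethod
    def _match_key(name, country):
        # Deliberately naive: lower-case, collapse whitespace, upper-case country.
        return (" ".join(name.lower().split()), country.upper())

    def register(self, source_app, name, country):
        """Return the hub id, reusing an existing record when the key matches."""
        key = self._match_key(name, country)
        hub_id = self._by_key.get(key)
        if hub_id is None:
            hub_id = len(self._records) + 1
            self._records[hub_id] = {"name": name, "country": country, "sources": set()}
            self._by_key[key] = hub_id
        self._records[hub_id]["sources"].add(source_app)
        return hub_id

    def get(self, hub_id):
        return self._records[hub_id]


hub = MasterDataHub()
crm_id = hub.register("CRM", "Acme  Corp", "dk")
erp_id = hub.register("ERP", "acme corp", "DK")
print(crm_id == erp_id)                      # both applications share one record
print(sorted(hub.get(crm_id)["sources"]))
```

With a peer-to-peer integration between only two of the applications, a third application would have no such shared entry point, which is exactly the blind spot described above.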

 

Welcome Dynamicweb PIM on the Disruptive MDM and PIM List

The Disruptive Master Data Management Solutions List is a list of available:

  • Master Data Management (MDM) solutions
  • Customer Data Integration (CDI) solutions
  • Product Information Management (PIM) solutions
  • Digital Asset Management (DAM) solutions.

You can use this site as a supplement to the likes of Gartner, Forrester, MDM Institute and others when selecting an MDM / CDI / PIM / DAM solution, not least because this site will include both larger and smaller disruptive MDM, PIM and similar solutions.

The latest entry on the list is Dynamicweb PIM. This is a mature cloud-based Product Information Management (PIM) solution that can be deployed either as a stand-alone PIM implementation or in their combined all-in-one platform together with content management, ecommerce and marketing and tightly integrated with popular ERP and CRM solutions. This integrated approach offers a short time to value opportunity for midsized companies on the quest to ramp up online sales.

Read more about the Dynamicweb PIM solution here.


Three Remarkable Observations about Reltio

The latest entry on The Disruptive Master Data Management Solutions List is Reltio. I have been following Reltio for more than 5 years and have had the chance to do some hands-on work lately.

In doing that, I think there are three observations that make the Reltio Cloud solution a remarkable MDM offering.

More than Master Data

While the Reltio solution emphasizes master data, the platform can include the data that revolves around master data as well. That means you can bring transactions and big data streams to the platform and apply analytics, machine learning, artificial intelligence and those shiny new things in order to move these disciplines from a purely analytical world to exploiting the data and capabilities in the operational world.

The thinking behind this approach is that you cannot get a 360-degree view of customer, vendor and other party roles, or a 360-degree view of products, by only having a snapshot compound description of the entity in question. You also need the raw history, the relationships between entities and access to details for various use cases.

In fact, Reltio provides not just operational MDM, but through a module called Reltio IQ also brings continuously mastered data and correlated transactions into an Apache Spark environment for analytics and Machine Learning. This eliminates the traditional friction of synchronizing data models between MDM and analytical environments. It also allows for aggregated results to be synchronized back into the MDM profiles, by storing them as analytical attributes. These attributes are then available for use in an operational context, such as marketing segmentation, sales recommendations, GDPR exposure and more.

Multiple Storing Capabilities

There is an ongoing debate in the MDM community these days about whether you should use relational database technology, NoSQL technology or graph technology. Reltio utilizes all three of them for the purposes where each approach makes the most sense.

Reference data are handled as relational data. The entities are kept using a wide column store, which is a technique encompassing scalability known from pure column stores but with some of the structure known from relational databases. Finally, the relationships are handled using graph techniques, which have been a recurring subject on this blog.

Reltio calls this multi-model polyglot persistence, and they embrace the latest technologies from multiple clouds such as AWS and Google Cloud Platform (GCP) under the covers.

Survival of the Fit Enough

One thing that MDM solutions do is make a golden record from different systems of record, where the same real-world entity is described in many ways and the descriptions are therefore considered duplicate records. Identifying those records is hard enough. But then comes the task of merging the conflicting values together, so the most accurate values survive in the golden record.

Reltio does that very elegantly by actually not doing it. Survivorship rules can be set up based on all the needed parameters, such as recency, provenance and more, and you may also allow more than one value to survive, as touched on in the post about the principle of Survival of the Fit Enough.

In Reltio there is no purge of the values that do not immediately survive. The golden record is not stored physically. Instead, Reltio keeps one (or even more than one) virtual golden record by letting the original source records stay. Therefore, you can easily roll back or update the single view of the truth.

The Reltio platform allows survivorship rules to be customized in rulesets for an unlimited number of roles and personas, in effect supporting multiple personalized versions of the truth. In an operational MDM context this allows sales, marketing, compliance, and other teams to see the data values that they care about most, while collaborating continuously in what Reltio calls the Self-Learning Enterprise.
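The idea of a virtual golden record can be illustrated with a small Python sketch that keeps every source record and computes the surviving values on read, with one ruleset per role. This is not Reltio's API or implementation; the record layout, the rule functions (`most_recent`, `prefer_source`, `keep_all`) and the role names are assumptions chosen for illustration.

```python
from datetime import date

# Source records stay untouched; nothing is merged or purged.
SOURCE_RECORDS = [
    {"source": "CRM", "updated": date(2018, 3, 1), "email": "a@old.example", "phone": None},
    {"source": "ERP", "updated": date(2018, 6, 1), "email": "a@new.example", "phone": "+45 1234"},
    {"source": "WEB", "updated": date(2018, 9, 1), "email": None, "phone": "+45 5678"},
]


def most_recent(candidates):
    """Survivorship by recency: the newest non-empty value wins."""
    return [max(candidates, key=lambda c: c["updated"])["value"]]


def prefer_source(*ranking):
    """Survivorship by provenance: the best-ranked source that has a value wins."""
    def rule(candidates):
        by_source = {c["source"]: c["value"] for c in candidates}
        return [by_source[s] for s in ranking if s in by_source][:1]
    return rule


def keep_all(candidates):
    """'Fit enough': let every distinct value survive."""
    return sorted({c["value"] for c in candidates})


def golden_record(records, ruleset):
    """Compute a virtual golden record on read from the retained sources."""
    view = {}
    for attribute, rule in ruleset.items():
        candidates = [
            {"source": r["source"], "updated": r["updated"], "value": r[attribute]}
            for r in records if r[attribute] is not None
        ]
        view[attribute] = rule(candidates) if candidates else []
    return view


# One ruleset per role gives each role its own view of the same sources.
RULESETS = {
    "sales": {"email": most_recent, "phone": keep_all},
    "compliance": {"email": prefer_source("CRM", "ERP"), "phone": most_recent},
}

sales_view = golden_record(SOURCE_RECORDS, RULESETS["sales"])
compliance_view = golden_record(SOURCE_RECORDS, RULESETS["compliance"])
print(sales_view)
print(compliance_view)
```

Because the views are derived rather than stored, changing a rule or correcting a source record changes the golden record on the next read, with nothing to undo.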


 

Ecosystem Wide Product Information Management

The concept of doing Master Data Management (MDM) not only enterprise wide but ecosystem wide was examined in the post Ecosystem Wide MDM.

As mentioned, product master data is an obvious domain where business outcomes may occur first when stretching your digital transformation to encompass business ecosystems.

The figure below shows the core delegates in the ecosystem wide Product Information Management (PIM) landscape we support at Product Data Lake:

Ecosystem Wide PIM

Your enterprise is in the centre. You may have or need an in-house PIM solution where you manipulate and make product information more competitive as elaborated in the post Using Internal and External Product Information to Win.

At Product Data Lake we collaborate with providers of Artificial Intelligence (AI) capabilities and similar technologies in order to improve data quality and analyse product information.

As shown at the top, there may be a relevant data pool with a consensus structure for your industry available, where you exchange some of your product information with trading partners. At Product Data Lake we embrace that scenario with our reservoir concept.

Otherwise, you will need to make partnerships with individual trading partners. At Product Data Lake we make that happen with a win-win approach. This means that providers can push their product information in a uniform way with the structure and with the taxonomy they have. Receivers can pull the product information in a uniform way with the structure and with the taxonomy they have. This product data syndication concept is outlined in the post Sell more. Reduce costs.
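A rough Python sketch of this push and pull idea follows: the provider uploads data in its own structure, and the receiver gets it translated into its own taxonomy through a link table. The attribute names, the `LINKS` table and the `push` and `pull` functions are hypothetical and only illustrate the principle; they do not describe the actual Product Data Lake interface.

```python
# Hypothetical link table between a provider's and a receiver's attribute names.
# Each entry: provider attribute -> (receiver attribute, value converter).
LINKS = {
    "Laenge_mm": ("length_cm", lambda v: v / 10),
    "Farbe": ("colour", lambda v: {"rot": "red", "blau": "blue"}.get(v, v)),
}


def push(lake, provider_id, sku, attributes):
    """Provider uploads product information exactly as it is structured in-house."""
    lake[(provider_id, sku)] = dict(attributes)


def pull(lake, provider_id, sku, links):
    """Receiver downloads the same product expressed in its own taxonomy."""
    pushed = lake[(provider_id, sku)]
    translated, unlinked = {}, []
    for name, value in pushed.items():
        if name in links:
            target, convert = links[name]
            translated[target] = convert(value)
        else:
            unlinked.append(name)   # needs a link before it can flow
    return translated, unlinked


lake = {}
push(lake, "manufacturer-1", "SKU-42", {"Laenge_mm": 1200, "Farbe": "rot", "Gewicht_g": 350})
translated, unlinked = pull(lake, "manufacturer-1", "SKU-42", LINKS)
print(translated)
print(unlinked)
```

Neither side has to change its own structure; the only shared artefact is the set of links between the two taxonomies.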