Trending Topic: Graph and MDM

Using graph data stores and their related capabilities has become a trending topic in the Master Data Management (MDM) space. This opportunity was first examined five years ago here on the blog in the post Will Graph Databases become Common in MDM? It seems so.

Recently David Borean, Chief Data Science Officer at the disruptive MDM vendor AllSight, wrote the blog post The real reason why Master Data Management needs Graph. In it, David confirms the commonly held understanding that graph databases are superior to relational databases when it comes to handling relationships within master data. But David also brings up how graph databases can support multiple versions of the truth.
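
To make the relationship argument concrete, here is a minimal sketch of how party relationships in a customer master could be stored and traversed as first-class graph elements, using the Neo4j Python driver and Cypher. The node labels, property names and connection details are illustrative assumptions, not taken from any particular MDM product.

```python
# Minimal sketch: master data relationships as a graph (illustrative only).
# Assumes a local Neo4j instance; labels and properties are made up for the example.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def load_sample(tx):
    # A customer, the household she belongs to and the company she works for.
    tx.run(
        """
        MERGE (p:Party {name: 'Jane Smith'})
        MERGE (h:Household {id: 'H-001'})
        MERGE (c:Party {name: 'Acme Corp'})
        MERGE (p)-[:MEMBER_OF]->(h)
        MERGE (p)-[:WORKS_FOR]->(c)
        """
    )

def related_parties(tx, name):
    # Traversing relationships is a single query, with no join tables needed.
    result = tx.run(
        "MATCH (p:Party {name: $name})-[r]->(other) "
        "RETURN type(r) AS relation, labels(other) AS target, other",
        name=name,
    )
    return [record.data() for record in result]

with driver.session() as session:
    session.execute_write(load_sample)
    for row in session.execute_read(related_parties, "Jane Smith"):
        print(row)

driver.close()
```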

Several other vendors, such as Semarchy and Reltio, emphasize graph in MDM in their market messaging.

Aaron Zornes of The MDM Institute is another proponent of using graph technology within MDM as mentioned over at The Disruptive MDM Solutions blog in the post MDM Fact or Fiction: Who Knows?

What do you think: Will graph databases really break through in MDM soon? Will it be as stand-alone graph technology (as for example from neo4j) or embedded in MDM vendor portfolios?

Seven Flavors of MDM

Master Data Management (MDM) can take many forms. An exciting side of being involved in MDM implementations is that every implementation is a little bit different, which also makes room for a lot of different technology options. There is no single best MDM solution out there; there are a lot of options, and some will be the best fit for a given MDM implementation.

The available solutions also change over the years – typically by spreading to cover more land in the MDM space.

In the following I will briefly introduce the basics of the seven flavors of MDM. A given MDM implementation will typically be focused on one of these flavors with some elements of the other flavors, and a given piece of technology will have its origin in one of these flavors and to a greater or lesser degree encompass some of the other flavors.

The traditional MDM platform

A traditional MDM solution is a hub for master data aiming at delivering a single source of truth (or trust) for master data within a given organization either enterprise wide or within a portion of an enterprise. The first MDM solutions were aimed at Customer Data Integration (CDI), because having multiple and inconsistent data stores for customer data with varying data quality is a well-known pain point almost everywhere. Besides that, similar pain points exist around vendor data and other party roles, product data, assets, locations and other master data domains and dedicated solutions for that are available.

Product Information Management (PIM)

A special breed of solutions for Product Information Management, aimed at having consistent product specifications across the enterprise to be published in multiple sales channels, has been around for years. We have seen a continuous integration of the market for such solutions into the traditional MDM space, as many of these solutions have morphed into being a kind of MDM solution.

Digital Asset Management (DAM)

Not least in relation to PIM, we have a distinct discipline around handling digital assets such as text documents, audio files, video and other rich media data that are different from the structured and granular data we can manage in data models in common database technologies. A post on this blog examines How MDM, PIM and DAM Stick Together.

Big Data Integration

The rise of big data is having a considerable influence on what MDM solutions will look like in the future. You may handle big data directly inside MDM or link to big data outside MDM, as told in the post about The Intersection of MDM and Big Data.

Application Data Management (ADM)

Another area where you have to decide where master data stops and handling other data starts is when it comes to transactional data and other forms of data handled in dedicated applications such as ERP, CRM, PLM (Product Lifecycle Management) and plenty of other industry specific applications. This conundrum was touched upon in a recent post called MDM vs ADM.

Multi-Domain MDM

Many MDM implementations focus on a single master data domain such as customer, vendor or product, or you see MDM programs that have a multi-domain vision and overall project management, but quite separate tracks for each domain. We have, though, seen many technology vendors preparing for the multi-domain future either by:

  • Being born in the multi-domain age, as for example Semarchy
  • Acquiring the capabilities, as for example Informatica and IBM
  • Extending from PIM, as for example Riversand and Stibo Systems

MDM in the cloud

MDM follows the source applications up into the cloud. New MDM solutions naturally come as cloud solutions. The traditional vendors introduce cloud alternatives to, or based on, their proven on-premise solutions. There is only one direction here: more and more cloud MDM – also as customer and business partner engagement will take place in the cloud.

Ecosystem wide MDM

Doing MDM enterprise wide is hard enough. But it does not stop there. Increasingly, every organization will be an integrated part of a business ecosystem where collaboration with business partners is part of digitalization, and thus we will need to work on the same foundation around master data, as reported in the post Ecosystem Wide MDM.

Welcome Enterworks, Contentserv and SyncForce on The Disruptive MDM List

I am happy to welcome three new entries on The Disruptive Master Data Management Solutions List.

This site is meant to be a list of available:

  • Master Data Management (MDM) solutions
  • Customer Data Integration (CDI) solutions
  • Product Information Management (PIM) solutions
  • Digital Asset Management (DAM) solutions

Organizations on the lookout for a solution of the above kind can use this site as an alternative to the likes of Gartner, Forrester, MDM Institute and others, not least because this site will include the market leaders as well as smaller and disruptive solutions with specific use case, geographical, industry or other best-of-breed capabilities.

The new entries are:

  • EnterWorks, who is among the market leaders in multi-domain master data solutions for acquiring, managing and transforming a company’s multi-domain master data into persuasive and personalized content for marketing, sales, digital commerce and new market opportunities.
  • Contentserv, who offers a real-time Product Experience Platform. This integrated and product-centric solution seamlessly combines the functionalities of multi-domain Master Data Management, Product Information Management and Marketing Content Management.
  • SyncForce, who makes your product portfolio digitally available at the click of a button, in every shape and form, both internally and externally, so you can shift your attention from firefighting to building successful business with your trading partners.

You can visit the list here.

 If you are a vendor, you can register your solution here.

Why it is not a Product Data Warehouse, but a Product Data Lake

There is a need for a new solution to sharing product information between trading partners. Product Data Lake is that new solution. Using the term data lake as a part of the name for the solution is very deliberate. Here is why:

Volume

When setting up a warehouse, and a data warehouse as well, you have to estimate the storage size and the throughput. There will be a limit to how much data you can store and how much data you can upload and download within a given period.

Our vision is that Product Data Lake will be the process driven key service for exchanging any sort of product information within business ecosystems all over the world, with the aim of optimally assisting self-service purchase of every kind of product.

In order to achieve that vision, we need to be able to scale up drastically. Therefore, we use a document-oriented database called MongoDB to store product information.
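
As a hedged illustration of why a document store fits this purpose, here is a minimal pymongo sketch where two providers describe products with different attribute sets side by side. The database, collection and field names are assumptions for the example, not the actual Product Data Lake schema.

```python
# Illustrative sketch: flexible product documents in MongoDB (pymongo).
# Database, collection and field names are assumptions, not the real schema.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
products = client["product_data_lake_demo"]["products"]

# Two providers can describe products with different structures side by side.
products.insert_many([
    {
        "provider": "manufacturer_a",
        "sku": "A-1001",
        "attributes": {"colour": "red", "weight_kg": 1.2},
    },
    {
        "provider": "distributor_b",
        "item_no": "B-77",
        "specs": [{"name": "Color", "value": "Red"}, {"name": "Weight", "value": "1.2 kg"}],
    },
])

# No upfront consensus schema is needed to query across the collection.
for doc in products.find({"provider": "manufacturer_a"}):
    print(doc["sku"], doc["attributes"])
```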

Even if you choose to implement a Product Data Lake instance for a single business ecosystem, you will benefit from the high scalability.

Velocity

Business ecosystems change all the time. You need to be able to adapt your data management rapidly, not least when it comes to exchanging product information.

Swapping trading partners is one thing. That often means dealing with other product information requirements and opportunities and adhering to other standards.

We will also see business ecosystems in new shapes in the future. There will be fewer nodes between manufacturers and point-of-sales and point-of-sales will more likely be online marketplaces.

However, the changes will not happen as a big bang but in varying pace for each industry, geography and organization.

The rigid consensus structure of a data warehouse, and of product information exchange solutions that resemble a data warehouse, will not cope with that change. The data lake concept, in the form of Product Data Lake, will.

In Product Data Lake, you as a provider upload product information in your structure and format, and you as a receiver download it in your structure and format. The linking and transformation take place inside Product Data Lake using linked metadata.
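
As a rough sketch of that idea, with made-up attribute names and a plain dictionary standing in for the linked metadata, the transformation from a provider's structure to a receiver's structure could look like this:

```python
# Rough sketch of linking a provider's attribute names to a receiver's
# via shared metadata. All names are invented for the example.
provider_record = {"Farve": "Roed", "Vaegt": "1,2 kg"}        # provider's structure

# Linked metadata: provider attribute -> receiver attribute (plus a value mapping).
attribute_links = {"Farve": "colour", "Vaegt": "weight"}
value_links = {("Farve", "Roed"): "Red"}

def transform(record):
    """Return the record expressed in the receiver's structure."""
    out = {}
    for src_attr, value in record.items():
        target_attr = attribute_links.get(src_attr, src_attr)
        out[target_attr] = value_links.get((src_attr, value), value)
    return out

print(transform(provider_record))   # {'colour': 'Red', 'weight': '1,2 kg'}
```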

Variety

While everyone agrees that a common standard for all product information would be the best answer, we must on the other hand accept that using a common standard for every kind of product and every piece of information needed is quite utopian. We do not even have a uniquely spelled term in English for standardization/standardisation.

Also, we must foresee that one organization will mature at a different pace than another organization in the same business ecosystem.

These observations are the reasons behind the launch of Product Data Lake. In Product Data Lake we encompass the use of (in prioritized order):

  • The same standard in the same version
  • The same standard in different versions
  • Different standards
  • No standards

Learn about some of these standards in the post Five Product Classification Standards.

Which MDM and/or PIM Solution to Choose?

More and more organizations are implementing Master Data Management (MDM) and Product Information Management (PIM) solutions.

When the implementation comes to the phase where you must choose one or more solutions and you go for the buy option (which is recommended), it can be hard to get an overview of the available solutions. You can turn to the Gartner option, but their Quadrant only shows the more expensive options, and Gartner is a bit old school, as reported here.

An additional option will be to see how the vendors themselves present their solutions in a crisp way. This is what is going on at The Disruptive Master Data Management Solutions List.

As a solution provider, you can register your solution on this site in order to be considered by organizations looking for a:

  • Master Data Management (MDM) solution
  • Customer Data Integration (CDI) solution
  • Product Information Management (PIM) solution
  • Digital Asset Management (DAM) solution

Registration takes place here.

Master Data or Data Lakes in Business Ecosystems

The concept of a data lake has until now mainly been used to describe how data gathered by a given organization can be organized to serve analytical purposes. The data lake concept is closely tied to the term big data, which means that a data lake caters for handling huge volumes of data with high velocity, where data comes in heaps of varieties. A data lake is much more agile than data marts and data warehouses, where you have to determine the purpose of the use of data beforehand.

In my eyes, the idea that every organization should gather all the data of interest behind its own firewall does not make sense. Some data will for sure be only for the eyes of people behind the corporate walls; here you should indeed put your private data into your own data lake. But a lot of data are commonly known data that everyone should not have to spend time collecting and eventually cleansing; here you should collaborate within your industry or another sphere around data lakes for public data.

Perhaps most importantly, you should share data lakes with other members of your business ecosystem. You probably already share data within business ecosystems with your trading partners. Until now, such sharing has resembled the concepts of data marts and data warehouses, being very purpose-specific exchanges of data built on common, beforehand understood standards for data.

Right now I am working on a cloud service called Product Data Lake. Here we host a data lake for sharing product data in the business ecosystems of manufacturers, distributors, merchants and large end users of product information. You can join our journey by following us on LinkedIn at the Product Data Lake LinkedIn Company Page.

Product Data Lake Version 1.4 is Live

Our February 2018 version of the Product Data Lake cloud service is live. New capabilities include:

  • Subscriber clusters
  • Put APIs

Subscriber Clusters

As a Product Data Lake customer, you can be a subscriber to our public cloud (www.productdatalake.com) or install the Product Data Lake software on your private cloud.

Now there is a hybrid option: being a member of a subscriber cluster. A subscriber cluster is an option for, for example, an affiliated group of companies, where you can share product data internally while at the same time sharing product data with trading partners outside your group using the same account.

Put APIs

Already existing means of feeding Product Data Lake include FTP file drops, traditional file upload from your desktop or network drives, or manually entering data into Product Data Lake. Now you can also use our put APIs for system-to-system data exchange.
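
As a purely hypothetical sketch of such a system-to-system exchange, a push from an ERP or PIM system could look roughly like the following. The endpoint, authentication scheme and payload fields are assumptions for illustration only, not the documented Product Data Lake API (see the overview PDF for that).

```python
# Hypothetical sketch of pushing a product to a REST-style put API.
# Endpoint, headers and payload fields are assumptions for illustration only;
# consult the Product Data Lake documentation for the actual API.
import requests

BASE_URL = "https://api.example.com/v1"   # placeholder, not the real service URL
API_KEY = "your-api-key"                  # placeholder credential

payload = {
    "sku": "A-1001",
    "attributes": {"colour": "red", "weight_kg": 1.2},
}

response = requests.put(
    f"{BASE_URL}/products/A-1001",
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
response.raise_for_status()
print("Upload accepted:", response.status_code)
```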

Get the Overview

Get the full Product Data Lake Overview here (opens a PDF file).

Where a Major Tool is Not So Cool

During my engagements in selecting and working with the major data management tools on the market, I have from time to time experienced that they often lack support for specialized data management needs in minor markets.

Two such areas I have been involved with as a Denmark based consultant are:

  • Address verification
  • Data masking

Address verification:

The authorities in Denmark offer free of charge access to very up-to-date, granular and accurate address data that, besides the envelope form of an address, also comes with a data management friendly key (usually referred to as KVHX) at the unit level for each residential and business address within the country. Besides the existence of the address, you also have access to what activity takes place at the address, for example whether it is a single-family house, a nursing home or a campus, and other useful information for verification, matching and other data management activities.
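
For illustration, the open Danish address data can be queried through the public DAWA web service. The sketch below keeps the call minimal, and the exact endpoint, parameters and response fields should be checked against the current documentation rather than taken as a reference.

```python
# Illustrative sketch: looking up a Danish address in the open address data
# via the public DAWA web service. Endpoint, parameters and response fields
# are kept minimal and should be verified against the current documentation.
import requests

def verify_address(query: str):
    resp = requests.get(
        "https://dawa.aws.dk/adresser",
        params={"q": query, "per_side": 1},
        timeout=30,
    )
    resp.raise_for_status()
    matches = resp.json()
    if not matches:
        return None
    hit = matches[0]
    # Each address carries a stable unit-level id usable as a key in matching.
    return {"id": hit.get("id"), "full_text": hit.get("adressebetegnelse")}

print(verify_address("Rådhuspladsen 1, 1550 København V"))
```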

If you want to verify addresses with the major international data management tools I have come across, many of these goodies are gone, as for example:

  • Address reference data are refreshed only once per quarter
  • The key and the access to more information are not available
  • A price tag for data has been introduced

Data Masking:

In Denmark (and other Scandinavian countries) we have a national identification number (known as personnummer) that is used much more intensively than the national IDs known from most other countries, as told in the post Citizen ID within seconds.

The data masking capabilities in major data management solutions come with pre-built functions for national IDs – but only covering major markets, such as the United States Social Security Number, the United Kingdom NINO and the kind of national ID in use in a few other large western countries.

So, GDPR compliance is just a little bit harder here even when using a major tool.
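
Rolling your own masking is not hard, though. Here is a minimal sketch of masking the Danish personnummer (the familiar DDMMYY-SSSS format) with a custom function, the kind of routine you end up writing yourself because the pre-built national ID functions do not cover it.

```python
# Minimal sketch: masking a Danish personnummer (DDMMYY-SSSS) with a custom
# function, since pre-built national ID masking rarely covers this format.
import re

CPR_PATTERN = re.compile(r"\b(\d{6})-?(\d{4})\b")

def mask_personnummer(text: str) -> str:
    """Blank out the four sequence digits, keeping only the birth date part."""
    return CPR_PATTERN.sub(lambda m: f"{m.group(1)}-XXXX", text)

print(mask_personnummer("Customer 010203-1234 called about the order."))
# -> "Customer 010203-XXXX called about the order."
```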

Figure: pre-built national ID masking options, from the IBM Data Masking documentation.

6 Decades of the LEGO® Brick and the 2nd Decade of MDM

28th January 2018 marks the 60th anniversary of the iconic LEGO® brick.

As I was raised close to the LEGO headquarters in Billund, Denmark, I also remember having a considerable amount of LEGO® bricks to play with as a child back in the 60’s, in the first decade of the current LEGO® brick design. At that time a brick was a brick, and you had to combine a few sizes and colours of bricks into something resembling a usable thing from the real world. Since then, the range of shapes and colours of the pieces from the LEGO factory has grown considerably.

Master Data Management (MDM) went into its 2nd decade some years ago, as reported in the post Happy 10 Years Birthday MDM Solutions. MDM has some basic building blocks, as proposed by former Gartner analyst John Radcliffe back in the 00’s and touched upon in the post The Need for a MDM Vision.

These blocks indeed look like the original LEGO® bricks.

Through the 2nd decade of MDM and in coming decades we will probably see a lot of specialised blocks in many shapes describing and covering the people, process and technology parts of MDM. Let us hope that they will all stick well together as the LEGO® bricks have done for the past 60 years.

PS: Some of the sticking together is described in the post How MDM, PIM and DAM Stick Together.

Welcome AllSight on the Disruptive MDM List

I am thrilled to welcome AllSight as the next disruptive MDM solution on The Disruptive Master Data Management Solutions list.

I resonate very well with the AllSight Advantage, which is: “The hardest part about understanding the customer is representing them within archaic systems designed to manage ‘customer records’. AllSight manages all customer data in its original format. It creates a realistic and accurate likeness of who your customer actually is. Really knowing your customer is the first step to being intelligent about your customers.”

A true disruptive approach in my eyes.

Check out the full Disruptive Master Data Management Solutions list here.