Trending Topic: Graph and MDM

Using graph data stores and utilizing the related capabilities has become a trending topic in the Master Data Management (MDM) space. This opportunity was first examined 5 years ago here on the blog in the post Will Graph Databases become Common in MDM? It seems so.

Recently David Borean, Chief Data Science Officer at the disruptive MDM vendor AllSight, wrote the blog post The real reason why Master Data Management needs Graph. In it, David confirms the common understanding that graph databases are superior to relational databases when it comes to handling relationships within master data. But David also brings up how graph databases can support multiple versions of the truth.
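
To make the relationship point a bit more concrete, here is a minimal Python sketch that keeps master data relationships as labelled edges in memory and walks them. The entity names, the relationship types and the source labels are all made up for illustration. A real graph database would do this with its own query language rather than a hand-written traversal, and this is not the approach described in David's post.

```python
from collections import defaultdict, deque

# Hypothetical master data relationships:
# (from entity, relationship type, to entity, source system).
EDGES = [
    ("Acme Holding", "OWNS", "Acme Ltd", "crm"),
    ("Acme Ltd", "OWNS", "Acme Retail", "crm"),
    ("Acme Ltd", "OWNS", "Acme Logistics", "erp"),
    ("Jane Doe", "WORKS_FOR", "Acme Retail", "crm"),
]


def build_graph(edges, sources=None):
    """Build an adjacency list, optionally keeping only edges from some sources."""
    graph = defaultdict(list)
    for start, relation, end, source in edges:
        if sources is None or source in sources:
            graph[start].append((relation, end))
    return graph


def reachable(graph, start, relation):
    """Return every node reachable from start by following one relation type."""
    seen = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for rel, target in graph.get(node, []):
            if rel == relation and target not in seen:
                seen.append(target)
                queue.append(target)
    return seen
```

Calling `reachable(build_graph(EDGES), "Acme Holding", "OWNS")` walks the whole ownership hierarchy, while `build_graph(EDGES, sources={"crm"})` gives the hierarchy as one source system sees it. The latter is only one simple way to picture how a single store could hold more than one version of the truth.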

Several other vendors, such as Semarchy and Reltio, are emphasizing graph in MDM in their market messaging.

Aaron Zornes of The MDM Institute is another proponent of using graph technology within MDM as mentioned over at The Disruptive MDM Solutions blog in the post MDM Fact or Fiction: Who Knows?

What do you think: Will graph databases really break through in MDM soon? Will it be as standalone graph technology (as for example from Neo4j) or embedded in MDM vendor portfolios?

Data Pool vs Data Lake

Within Product Information Management (PIM) – or Product Master Data Management if you like – there is a concept of a data pool.

Recently Justine Rodian of Stibo Systems made a nice blog post with the title Master Data Management Definitions: The Complete A-Z of MDM. Herein Justine explains a lot of terms within Master Data Management (MDM). A data pool is described as this:

“A data pool is a centralized repository of data where trading partners (e.g., retailers, distributors or suppliers) can obtain, maintain and exchange information about products in a standard format. Suppliers can, for instance, upload data to a data pool that cooperating retailers can then receive through their data pool.”

Now, during the last couple of years I have been working on the concept of applying the data lake approach to product information exchange between trading partners. Justine describes a data lake this way:

“A data lake is a place to store your data, usually in its raw form without changing it. The idea of the data lake is to provide a place for the unaltered data in its native format until it’s needed…..” 

Product Data Lake
MacRitchie Reservoir in Singapore

For a provider of product information, typically a manufacturer, the benefit of interacting via a data lake as opposed to a data pool is that they do not have to go through standardization before uploading. They thereby avoid having to shoehorn the data into a specific form, which almost certainly means leaving out important information and depending on consensus between competing manufacturers.

For a receiver of information, typically a merchant such as a retailer or B2B dealer, the benefit of interacting via a data lake as opposed to a data pool is that they can request the data in the form they will use to be most competitive and thereby sell more and reduce costs in product information sharing. This will be further accelerated if the merchant uses several data pools.

In Product Data Lake we even combine the best of the two approaches by encompassing data pools in our reservoir concept – to stay in the water body lingo. Here data pools are refreshed with modern data management technology and less rigid incoming and outgoing streams as announced in the post Product Data Lake Version 1.3 is Live.

Ecosystem Wide MDM

Doing Master Data Management (MDM) enterprise wide is hard enough. The ability to control master data across your organization is essential to enable digitalization initiatives and ensure the competitiveness of your organization in the future.

But it does not stop there. Increasingly every organization will be an integrated part of a business ecosystem where collaboration with business partners will be a part of digitalization and thus we will have a need for working on the same foundation around master data.

The different master data domains will have different roles to play in such endeavors. Party master data will be shared to some degree, but there are competitive factors as well as data protection and privacy factors to be observed.


Product master data – or product information if you like – is an obvious master data domain where you can gain business benefits from extending master data management to be ecosystem wide. This includes:

  • Working with the same product classifications or being able to continuously map between different classifications used by trading partners
  • Utilizing the same attribute definitions (metadata around products) or being able to continuously map between different attribute taxonomies in use by trading partners
  • Sharing data on product relationships (available accessories, relevant spare parts, updated succession for products, cross-sell information and up-sell opportunities)
  • Having access to latest versions of digital assets (text, audio, video) associated with products

The concept of ecosystem wide Multi-Domain MDM is explored further in the article about Master Data Share.

MDM vs ADM

The term Application Data Management (ADM) has recently been circulating in the Master Data Management (MDM) world as touched on in The Disruptive MDM List blog post MDM Fact or Fiction: Who Knows?

Not least Gartner, the analyst firm, has touted this as one of two Disruptive Forces in MDM Land. However, Gartner is not always your friend when it comes to short, crisp and easily digestible definitions and explanations of the terms they promote.

In my mind the two terms MDM and ADM relate as seen below:

ADM MDM.png

So, ADM takes care of a lot of data that we do not usually consider being master data within a given application while MDM takes care of master data across multiple applications.

The big question is how we handle the intersection (and sum of intersections in the IT landscape) when it comes to applying technology.

If you have an IT landscape with a dominant application, for example SAP ECC, you are tempted to handle the master data within that application as your master data hub or to use a tightly integrated tool provided by the vendor, for example SAP MDG. For specific master data domains, you might for example regard your CRM application as your customer master data hub. Here MDM and ADM melt into one process and technology platform.

If you have an IT landscape with multiple applications, you should consider implementing a specific MDM platform that receives master data from and provides master data to applications that take care of all the other data used for specific business objectives. Here MDM and ADM will be in separate processes using best-of-breed technology.

Product Data Completeness

Completeness is one of the most frequently mentioned data quality dimensions as touched on in the post How to Improve Completeness of Data.

While every data quality dimension applies to all domains of Master Data Management (MDM), some dimensions apply a bit more to one of the domains or the intersections of the domains as explained in the post Multi-Domain MDM and Data Quality Dimensions.

With product master data (or product information if you like) completeness is often a big pain. One reason is that completeness means different requirements for different categories of products as pondered in the post Hierarchical Completeness within Product Information Management.

At Product Data Lake we develop a range of cloud service offerings that will help you improve completeness of product data. These are namely:

  • Measuring completeness against those industry standards that have attribute requirements, such as eClass and ETIM
  • For manufacturers measuring completeness against downstream trading partner requirements (if not fully governed by an industry standard)
  • For merchants measuring incoming completeness when pulling from manufacturers
  • Measuring against completeness required by marketplaces.
  • Transforming product information to meet conformity and thereby ability to populate according to requirements
  • Translating product information in order to populate attributes in more languages
  • Transferring product information by letting manufacturers push it in their way and letting merchants pull it their way as described in the post Using Pull or Push to Get to the Next Level in Product Information Management.
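
The measuring part can be pictured with a small Python sketch. It assumes that each product category has its own list of required attributes, in line with the hierarchical completeness idea mentioned above. The category names, the attribute names and the `REQUIRED` table are hypothetical and not taken from eClass, ETIM or any Product Data Lake service.

```python
# Hypothetical required attributes per product category. Real requirements
# would come from a standard, a trading partner or a marketplace.
REQUIRED = {
    "power_drill": ["voltage", "weight", "chuck_size"],
    "paint": ["colour", "volume", "finish"],
}


def is_filled(value):
    """Treat None and blank strings as missing."""
    return value is not None and str(value).strip() != ""


def completeness(product, required=REQUIRED):
    """Return (share of required attributes filled, list of missing ones)."""
    wanted = required.get(product["category"], [])
    if not wanted:
        # No requirements known for this category: nothing can be missing.
        return 1.0, []
    attributes = product.get("attributes", {})
    missing = [name for name in wanted if not is_filled(attributes.get(name))]
    return (len(wanted) - len(missing)) / len(wanted), missing
```

A drill with only the voltage filled in would score one third and come back with `weight` and `chuck_size` listed as missing, which is the kind of feedback a manufacturer can act on before a merchant pulls the data.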

Product Information on Demand

Video on demand has become a popular way to watch television series, films and other entertainment and Netflix is probably the best-known brand for delivering that.

The great thing about watching video on demand is that you do not have to enjoy the service at the exact same time as everyone else, as was the case back in the days when watching TV or going to the movies were the options available.

At Product Data Lake we will bring that convenience to business ecosystems, as the situation today with broadcasting product information in supply chains very much resembles the situation we had before video on demand came around in the TV/Movie world.

As a provider of product information (being a manufacturer or upstream distributor), you will push your product information into Product Data Lake when you have the information available. Moreover, you will only do that once for each product and piece of information. No more coming to each theatre near your audience and extensive reruns of old stuff.

As a receiver of product information (being a downstream distributor, reseller or large end user), you will pull product information when you need it. That will be when you take a new product into your range or do a special product sale as well as when you start to deal with a new piece of information. No more having to be home at a certain time when your supplier does the show or waiting ages for a rerun when you missed it.

Learn more about how Product Data Lake makes your life in Product Information Management (PIM) easier by following us here on LinkedIn.


Three Major Sectors within Product Information Exchange

When working with Product Information Management (PIM), and not least with product information exchange between trading partners, I have noticed three major sectors where the requirements and means differ quite a bit.

These sectors are:

  • Food, beverage and pharmaceuticals: These are highly regulated sectors where the rules for taxonomy, completeness and exchange formats are advanced. Exchange standards and underpinning services such as GS1/GDSN are well penetrated, at least for basic data elements among major players. This sector accounts for circa 1/6 of the world trade.
  • Fashion, books and mainstream electronics: The products within this sector can be described with commonly accepted taxonomies and do not differ that much, though there certainly is room for more commonly adhered-to standards in some areas. The trade here is becoming more penetrated by marketplaces with their specific product information requirements. This sector accounts for circa 1/6 of the world trade.
  • The rest (including building materials, special electronics, machinery, homeware): This is a diverse segment of product groups and the product groups themselves are diverse. The requirements for product information completeness and other data quality dimensions are overwhelming and the choices of standards are many, so most often two trading partners will be on different pages. This sector accounts for circa 2/3 of the world trade.

Note: Automotive (vehicles) is a special vertical, where the main products (for example cars) resemble mainstream electronics and all the spare parts resemble special electronics. Some retailers (like department stores) cover all sectors and therefore need hybrid solutions to their product information exchange handling challenges.

The main drivers for better product information handling are compliance – not least within food, beverage and pharmaceuticals – and self-service purchasing (as in ecommerce), where the latter has raged for many years within fashion, books and mainstream electronics and now also is rising in more B2B (business-to-business) biased product groups such as building materials, special electronics and machinery.

Learn more about how to tackle these diverse needs in product information exchange in the article and discussion about Product Data Lake.


Embracing Standards versus Imposing Standards

When working with Product Information Management (PIM) and the recurring challenges in exchanging product information between trading partners, the idea of everyone adhering to the same standard is tempting.

This idea also governs the many product data pools around. However, there are some serious considerations against this idea, namely:

  • Being on the same standard, not to mention the same version, within your business ecosystem is quite utopian (being that within your own organization is hard enough).
  • It is not desirable to have the same product information as your competitors if you are going to compete on other factors than price.

In my eyes it is a better idea to forget about imposing a rigid standard for everyone and instead embrace the many available standards for product information, where your organization utilizes those that are best for you at the given time and your various trading partners utilize those that are best for them at a given time.

The solution for that is Product Data Lake.

Why it is not a Product Data Warehouse, but a Product Data Lake

There is a need for a new solution to sharing product information between trading partners. Product Data Lake is that new solution. Using the term data lake as a part of the name for the solution is very deliberate. Here is why:

Volume

When setting up a warehouse, including a data warehouse, you have to estimate the storage size and the throughput. There will be a limit to how much data you can store and how much data you can upload and download within a given period.

Our vision is that Product Data Lake will be the process driven key service for exchanging any sort of product information within business ecosystems all over the world, with the aim of optimally assisting self-service purchase of every kind of product.

In order to achieve that vision, we need to be able to scale up drastically. Therefore, we use a document-oriented database called MongoDB to store product information.

Even if you choose to implement a Product Data Lake instance for a single business ecosystem, you will benefit from the high scalability.

Velocity

Business ecosystems change all the time. You need to be able to rapidly adapt your data management, not least when it comes to exchanging product information.

Swapping trading partners is one thing. That often means dealing with other product information requirements and opportunities and adhering to other standards.

We will also see business ecosystems in new shapes in the future. There will be fewer nodes between manufacturers and points of sale, and points of sale will more likely be online marketplaces.

However, the changes will not happen as a big bang but at a varying pace for each industry, geography and organization.

The rigid consensus structure of a data warehouse, and product information exchange solutions that resemble a data warehouse, will not cope with that change. The data lake concept, in the form of Product Data Lake, will.

In Product Data Lake you as a provider upload product information in your structure and format and you as a receiver download in your structure and format. The linking and transformation take place inside Product Data Lake using linked metadata.
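
As a rough illustration of linking and transformation, the Python sketch below rewrites a provider record into a receiver's structure by following a table of attribute links. The attribute names, the `LINKS` table and the unit conversion are hypothetical. This shows the general idea of metadata-driven mapping and is not how Product Data Lake is actually implemented.

```python
# Hypothetical attribute links between one provider and one receiver.
# Each link names the provider attribute, the receiver attribute and an
# optional conversion applied to the value on the way through.
LINKS = [
    {"from": "Gewicht_kg", "to": "weight_g", "convert": lambda kg: round(kg * 1000)},
    {"from": "Farbe", "to": "colour", "convert": None},
]


def transform(record, links=LINKS):
    """Rewrite a provider record into the receiver's structure.

    Returns the transformed record and the provider attributes that had no link.
    """
    linked = {link["from"] for link in links}
    output = {}
    for link in links:
        if link["from"] in record:
            value = record[link["from"]]
            convert = link["convert"]
            output[link["to"]] = convert(value) if convert else value
    unlinked = sorted(name for name in record if name not in linked)
    return output, unlinked
```

The provider keeps its own attribute names and units and the receiver gets its own, while the list of unlinked attributes shows where a link is still missing rather than silently dropping information.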

Variety

While everyone agrees that a common standard for all product information is the best answer, we must on the other hand accept that using a common standard for every kind of product and every piece of information needed is quite utopian. We do not even have a common, uniquely spelled term in English for standardization/standardisation.

Also, we must foresee that one organization will mature at a different pace than another organization in the same business ecosystem.

These observations are the reasons behind the launch of Product Data Lake. In Product Data Lake we encompass the use of (in prioritized order):

  • The same standard in the same version
  • The same standard in different versions
  • Different standards
  • No standards
Learn about some of these standards in the post Five Product Classification Standards.
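
The four cases in the list can be told apart with a few comparisons, as in the Python sketch below. The `StandardUse` type and the rule that a pair counts as "no standards" as soon as one side uses none are my assumptions for illustration, and the version labels in the usage note are only examples.

```python
from typing import NamedTuple, Optional


class StandardUse(NamedTuple):
    """Which standard (if any) a trading partner uses, and in which version."""
    name: Optional[str]
    version: Optional[str] = None


def exchange_case(provider: StandardUse, receiver: StandardUse) -> str:
    """Classify a provider/receiver pair into one of the four cases."""
    # Assumption: if either side uses no standard, the pair has none in common.
    if provider.name is None or receiver.name is None:
        return "no standards"
    if provider.name != receiver.name:
        return "different standards"
    if provider.version != receiver.version:
        return "same standard, different versions"
    return "same standard, same version"
```

For example, `exchange_case(StandardUse("ETIM", "7.0"), StandardUse("ETIM", "8.0"))` falls into the second case, where a version-to-version mapping is enough, while an ETIM provider facing an eClass receiver falls into the third.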

The Old PIM World and The New PIM World


Product Information Management (PIM) is challenged by the fact that product data is the kind of data that usually flows across companies. The most common route starts with the hard facts about a product originating at the manufacturer. Then the information may be used on the brand’s own website, propagated to a marketplace (online shop-in-shop) and propagated downstream to distributors and merchants.

The challenge to the manufacturer is that this represents many ways of providing product information, not least when it comes to distributors and merchants, as these will require different structures and formats using various standards and will not be on the same maturity level.

Looking at this from the downstream side as a merchant, you have the opposite challenge. Manufacturers provide product information in different structures and formats using various standards and are not on the same maturity level.

Supply chain participants can meet this challenge in a passive or an active way. Unfortunately, many have chosen – or are about to choose – the passive way. It goes like this:

  • As a manufacturer, we have a product data portal where trading partners who want to do business with us, who obviously are the best manufacturer in our field, can download the product information we have in our structure and format using the standards we have found best.
  • As a merchant, we have a supplier product data portal where trading partners who want to do business with us, the leading player in our field, can upload the product information we for the time being will require in our structure and format using the standard(s) we have found best.
  • As a distributor, you could take both these standpoints.

This approach seems to work if you are bigger than your trading partner. And many times, one will be bigger than the other. But unless you are very big, you will in many cases not be the biggest. And in all cases where you are the biggest, you will not be a company that is easy to do business with, which eventually will decide how big you will stay.

Using (often local) industry product data hubs is a workaround, but the challenges shine through. Often it means that everyone will only benefit from what anyone can do, and it still calls for many touch points when doing business internationally and across several product data groups.

The better way is the active way, creating a win-win situation for all trading partners as described in the article about Product Data Lake Business Benefits.