Three Essential Trends in Data Management for 2024

As the New Year approaches, it is time to predict the hot topics in data management for the coming year. My top three candidates are:

  • Continued Enablement of Augmented Data Management
  • Embracing Data Ecosystems
  • Data Management and ESG

Continued Enablement of Augmented Data Management

The term augmented data management is still a hyped topic in the data management world. “Augmented” here describes an extension of the capabilities now available for doing data management, with these characteristics:

  • Inclusion of Machine Learning (ML) and Artificial Intelligence (AI) methodology and technology to handle data management challenges that until now have been poorly solved using traditional methodology and technology.
  • Encompassing graph approaches and technology to scale and widen data management coverage towards data that is less structured and has more variation than data that until now has been formally managed as an asset.
  • Aiming at automating data management tasks that until now have been solved in manual ways or simply not been solved at all due to the size and complexity of the work involved.

It is worth noting that the Artificial Intelligence theme has lately been dominated by generative AI, not least ChatGPT. However, generative AI will in my eyes not be the most frequently used AI flavor for data management. Learn more about data management and AI in the post Three Augmented Data Management Flavors.

Embracing Data Ecosystems

The strength of data ecosystems was most recently examined here on the blog in the post From Platforms to Ecosystems.

Data ecosystems include:

  • The infrastructure that connects ecosystem participants and helps organizations transform from local and linear ways of doing business toward virtual and exponential operations.
  • A single source of truth that extends across business partner ecosystems by providing all ecosystem participants with access to the same data.
  • Business model and process transformation across industries to support agile reconfiguration of business models and processes through information exchange inside and between ecosystems.

In short, your organization cannot grow faster than your competitors by hiding all data behind your firewall. You must share relevant data within your business ecosystem in an effective manner.

Data Management and ESG

ESG stands for Environmental, Social and Governance. This is often called sustainability. In a business context, sustainability is about how your products and services contribute to sustainable development.

When working as a data management consultant, I have seen more and more companies placing ESG at the top of the agenda and therefore embarking on programs to infuse ESG concepts into data management. If you can tie a proposed data management effort to ESG, you have a good chance of getting that effort approved and funded.

Capturing ESG data is very much about sharing data with your business partners. This includes getting new product data elements from upstream trading partners and providing such data to downstream trading partners. These new data elements are often not covered by traditional ways of exchanging product data. Getting the traditional product information through data supply chains is already a challenge, so adding the new ESG dimension is a daunting task for many organizations.

Therefore, we are ramping up to also cover ESG data in the collaborative product data syndication service I am involved in, called Product Data Lake.

From Platforms to Ecosystems

Earlier this year Gartner published a report with the title Top Trends in Data and Analytics, 2023. The report is currently available on the Parsionate site here.

The report names three opportunities within this theme:

  • Think Like a Business
  • From Platforms to Ecosystems
  • Don’t Forget the Humans

While thinking like a business and not forgetting the humans are universal opportunities that have always been and always will be relevant, the move from platforms to ecosystems is a current opportunity worth a closer look.

Here data sharing, according to Gartner, is essential. Some recommended actions are to:

  • Consider adopting data fabric design to enable a single architecture for data sharing across heterogeneous internal and external data sources.
  • Brand data reusability and resharing as a positive for business value, including ESG (Environmental, Social and Governance) efforts.

Data Fabric is the Gartner buzzword that resembles the competing non-Gartner buzzword Data Mesh. According to Gartner, organizations use data fabrics to capture data assets, infer new relationships in datasets and automate actions on data.

Data sharing can be internal and external.

In my mind there are two pillars in internal data sharing:

  • MDM (Master Data Management) with the aim of sharing harmonized core data assets, such as business partner records and product records, across multiple lines of business, geographies, and organizational disciplines.
  • Knowledge graph approaches where MDM is supplemented by modern capabilities in detecting many more relationships (and entities) than before as explained in the post MDM and Knowledge Graph.
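
To make the knowledge graph supplement a bit more tangible, here is a minimal sketch using the networkx library; the entity identifiers, names and relation labels are invented for illustration and do not reflect any specific MDM solution.

```python
# A toy knowledge graph supplementing master data with explicit relationships.
# Entity identifiers, names and relation labels are invented for illustration.
import networkx as nx

g = nx.MultiDiGraph()
g.add_node("CUST-1001", type="Customer", name="Acme Corp")
g.add_node("SUPP-3003", type="Supplier", name="Widget Makers Ltd")
g.add_node("PROD-2002", type="Product", name="Widget X")
g.add_edge("CUST-1001", "PROD-2002", relation="purchased")
g.add_edge("SUPP-3003", "PROD-2002", relation="supplies")

# Traverse the graph: which suppliers relate to a customer via shared products?
for _, product in g.out_edges("CUST-1001"):
    for supplier, _ in g.in_edges(product):
        if g.nodes[supplier]["type"] == "Supplier":
            print(f"{g.nodes['CUST-1001']['name']} is linked to "
                  f"{g.nodes[supplier]['name']} via {g.nodes[product]['name']}")
```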

Turning to external data sharing, we see solutions emerging here too, however at a much slower pace than I had anticipated. The main obstacle seems to be that internal data sharing is still not mature in many organizations and that external data sharing requires interaction between at least two data-mature organizations.

4 Concepts in the Gartner Hype Cycle for Digital Business Capabilities that will Shape MDM

Some months ago, Gartner published the latest Hype Cycle for Digital Business Capabilities.

The hype cycle includes 4 concepts that in my mind will shape the future of Master Data Management (MDM) and data management at large. These are:

  • Industrie 4.0
  • Business Ecosystems
  • Digital Twin
  • Machine Customer

Industrie 4.0

You will find Industrie 4.0 near the trough of disillusionment, almost ready to climb the slope of enlightenment. Several of the recent MDM blueprints I have worked with have Industrie 4.0 as an overarching theme.

Industrie 4.0 is about using intelligent devices in manufacturing and is thus closely connected to the term Industrial Internet of Things (IIoT). The impact of Industrie 4.0 spans the whole supply chain, encompassing not only product manufacturing companies but also, for example, product merchants and product service providers.

With intelligent devices in the supply chain, product MDM will evolve from handling data about product models to handling data about each individual instance of a product.
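
As a hedged illustration of that shift, the sketch below contrasts a record for a product model with a record for an individual product instance; the field names, identifiers and endpoint format are invented.

```python
from dataclasses import dataclass, field

@dataclass
class ProductModel:
    """Traditional product MDM: one record per product model."""
    model_id: str
    name: str
    manufacturer: str
    attributes: dict = field(default_factory=dict)

@dataclass
class ProductInstance:
    """Instance-level MDM for intelligent devices: one record per physical unit."""
    serial_number: str
    model_id: str            # link back to the model record
    installed_at: str        # e.g. a plant or facility identifier
    firmware_version: str
    telemetry_endpoint: str  # where the connected device reports sensor data

model = ProductModel("PUMP-100", "Centrifugal Pump 100", "Acme Pumps",
                     {"max_flow_l_min": 250})
unit = ProductInstance("SN-000123", model.model_id, "PLANT-DE-01", "2.4.1",
                       "mqtt://plant-de-01/pumps/SN-000123")
print(unit)
```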

Business Ecosystems

The concept of business ecosystems has just passed the peak of inflated expectations.

In a modern business environment, no organization can do everything – or even most things – themselves. Therefore, any enterprise needs to partner with other organizations when working on new digitally powered business models.

This also calls for increased sharing of data, including master data, with business partners. This leads to the rise of interenterprise MDM, which by the way is at about the same position in the Hype Cycle for Data Analytics and MDM.

An example of interenterprise data sharing is Product Data Syndication.

Digital Twin

On the climbing side of the peak of inflated expectations we find the concept of a digital twin.

A digital twin is a virtual representation of a real-world entity such as an asset, person, organization, or process. This fits well with what MDM does, as MDM traditionally has been providing virtual descriptions of customers, suppliers, and products.

With the digital twin flavour, you can sharpen and extend MDM in two ways:

  • Take a more real-world view of customers and suppliers by treating these as roles of business partners, along with handling many other external and internal organizational entities.
  • Put more asset types than direct products under the MDM umbrella, with improved data governance as a result.

Machine Customers

A bit further down the climbing side of the peak you will see the concept of the machine customer.

The expectation is that more and more buying tasks will be automated so there will be no human interaction in the bulk of purchasing processes.

This will only be possible if the products involved are digitally described in sufficient detail by those who sell them and categorized the same way on the selling and buying sides.

This seems like a job for Master Data Management and the adjacent Product Information Management (PIM) discipline, where the buying side needs the right capabilities not only for direct trading products but also for indirect supplies.

Also, the concept of augmented MDM will play a role here by applying Artificial Intelligence (AI) to the MDM and PIM side of enabling the machine customer.

The Full Report

You can download the full hype cycle report including the complete visual cycle from the parsionate website: Gartner Hype Cycle for Digital Business Capabilities.

What is Product Data Syndication (PDS)?

Product Information Management (PIM) has a sub discipline called Product Data Syndication (PDS).

While PIM basically is about how to collect, enrich, store and publish product information within a given organization, PDS is about how to share product information between manufacturers, merchants and marketplaces.

Marketplaces

Marketplaces are the new kids on the block in this world. Amazon and Alibaba are the best known ones, however there are plenty of them internationally, within given product groups and nationally. Merchants can provide product information related to the goods they are selling on a marketplace. A disruptive force in the supply (or value) chain world is that today manufacturers can sell their goods directly on marketplaces and thereby leave out the merchants. It is though still only a fraction of trade that has been diverted this way.

Each marketplace has its own requirements for how product information should be uploaded, encompassing which data elements are needed, the requested taxonomy and data standards, as well as the data syndication method.
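
As a hedged sketch of what such requirements can look like from a data point of view, here is a minimal completeness check against one invented marketplace profile; the attribute names, taxonomy and upload method are illustrative only.

```python
# Invented marketplace profile: real marketplaces publish their own
# required attributes, taxonomy and upload method.
marketplace_requirements = {
    "example-marketplace": {
        "required_attributes": ["gtin", "title", "brand", "category_code", "image_url"],
        "taxonomy": "GPC",
        "upload_method": "api",
    }
}

def missing_attributes(product: dict, marketplace: str) -> list:
    """Return the required attributes a product record does not yet carry."""
    required = marketplace_requirements[marketplace]["required_attributes"]
    return [attr for attr in required if not product.get(attr)]

product = {"gtin": "05701234567890", "title": "Widget X", "brand": "Acme"}
print(missing_attributes(product, "example-marketplace"))
# -> ['category_code', 'image_url']
```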

Data Pools

One way of syndicating (or synchronizing) data from manufacturers to merchants is going through a data pool. The best known one is the Global Data Synchronization Network (GDSN) operated by GS1 through data pool vendors, where 1WorldSync is the dominant one. Here, trading partners follow the same classification, taxonomy and structure for a group of products (typically food and beverage) and their most common attributes in use in a given geography.

There are plenty of other data pools available, focusing on given product groups either internationally or nationally. The concept here is also that everyone will use the same taxonomy and have the same structure and range of data elements available.

Data Standards

Product classifications can be used to apply the same data standards. GS1 has a product classification called GPC. Some marketplaces use the UNSPSC classification provided by the United Nations and – perhaps ironically – also operated by GS1. Other classifications, which additionally encompass the attribute requirements, are eClass and ETIM.

A manufacturer can have product information in an in-house ERP, MDM and/or PIM application. In the same way, a merchant (retailer or B2B dealer) can have product information in an in-house ERP, MDM and/or PIM application. Most often, a given pair of manufacturer and merchant will not use the same data standard, taxonomy, format and structure for product information.
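
To illustrate why each manufacturer-merchant pair typically needs a mapping step, here is a minimal sketch; the schemas on both sides are invented and not taken from any actual standard.

```python
# Invented schemas: the manufacturer exports attributes under its own names,
# while the merchant expects a different structure, so a mapping is needed.
manufacturer_record = {
    "ItemNo": "A-4711",
    "DescriptionShort": "Cordless drill 18V",
    "NetWeightKg": "1.3",
    "ClassCode": "EC000123",   # illustrative classification code
}

field_mapping = {
    "ItemNo": "supplier_article_number",
    "DescriptionShort": "product_name",
    "NetWeightKg": "net_weight_kg",
    "ClassCode": "classification",
}

merchant_record = {target: manufacturer_record[source]
                   for source, target in field_mapping.items()}
print(merchant_record)
```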

1-1 Product Data Syndication

Data pools have not substantially penetrated the product data flows encompassing all product groups and all the needed attributes and digital assets. Besides that, merchants also have a desire to provide unique product information and thereby stand out in the competition with other merchants selling the same products.

Thus, the highway in product data syndication is still 1-1 exchange. This highway has these lanes:

  • Exchanging spreadsheets, typically orchestrated so that the merchant requests the manufacturer to fill in a spreadsheet with the data elements defined by the merchant.
  • A supplier portal, where the merchant offers an interface to their PIM environment where each manufacturer can upload product information according to the merchant’s definitions.
  • A customer portal, where the manufacturer offers an interface where each merchant can download product information according to the manufacturer’s definitions.
  • A specialized product data syndication service where the manufacturer can push product information according to their definitions and the merchant can pull linked and transformed product information according to their definitions.

In practice, the chain from manufacturer to the end merchant may have several nodes (distributors/wholesalers) that reload the data by getting product information from an upstream trading partner and passing this product information on to a downstream trading partner.
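
A minimal sketch of the push/pull idea behind the specialized syndication service lane above, reusing a field mapping like the one sketched earlier; the class and method names are invented for illustration.

```python
class SyndicationHub:
    """Toy product data syndication hub: providers push, receivers pull."""

    def __init__(self):
        self._catalogues = {}  # provider -> records in the provider's own schema
        self._mappings = {}    # (provider, receiver) -> field mapping

    def register_mapping(self, provider, receiver, field_mapping):
        self._mappings[(provider, receiver)] = field_mapping

    def push(self, provider, records):
        """The upstream partner uploads records according to its own definitions."""
        self._catalogues.setdefault(provider, []).extend(records)

    def pull(self, provider, receiver):
        """The downstream partner receives records transformed to its definitions."""
        mapping = self._mappings[(provider, receiver)]
        return [{target: record.get(source) for source, target in mapping.items()}
                for record in self._catalogues.get(provider, [])]

hub = SyndicationHub()
hub.register_mapping("acme-manufacturing", "example-merchant",
                     {"ItemNo": "sku", "DescriptionShort": "name"})
hub.push("acme-manufacturing",
         [{"ItemNo": "A-4711", "DescriptionShort": "Cordless drill 18V"}])
print(hub.pull("acme-manufacturing", "example-merchant"))
```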

Data Quality Implications

Data quality is, as always, a concern when information producers and information consumers must collaborate, and in a product data syndication context the extended challenge is that the upstream producer and the downstream consumer do not belong to the same organization. This ecosystem-wide data quality and Master Data Management (MDM) issue was examined in the post Watch Out for Interenterprise MDM.

MDM Terms on the Move in the Gartner Hype Cycle

The latest Gartner Hype Cycle for Data and Analytics Governance and Master Data Management includes some of the MDM trends that have been touched here on the blog.

If we look at the post peak side, there are these five terms in motion:

  • Single domain MDM, represented by the two most common domains: MDM of Product Data and MDM of Customer Data.
  • Multidomain MDM.
  • Interenterprise MDM, which was previously coined Multienterprise MDM by Gartner and which I like to call Ecosystem Wide MDM.
  • Data Hub Strategy, which I like to call Extended MDM.
  • Cloud MDM.
Source: Gartner

The hype cycle from last year was examined in the post MDM Terms in Use in the Gartner Hype Cycle.

Compared to last year, this is what has happened to MDM:

  • Multidomain MDM has moved on from the Trough of Disillusionment to climbing up the Slope of Enlightenment. I have been waiting for this to happen for 10 years, both in the hype cycle and in the real world, since I founded the Multi-Domain MDM Group on LinkedIn back then.
  • Interenterprise MDM has swapped places with Cloud MDM, so this term is now ahead of Cloud MDM. It is though hard to imagine Interenterprise MDM without Cloud MDM, and MDM in the cloud will also, according to Gartner, reach the Plateau of Productivity before ecosystem wide MDM. The promise of this is also in accordance with a poll I made, as told in the post Interenterprise MDM Will be Hot.

You can get the full report from the MDM consultancy parsionate here.

Privacy and Confidentiality Concerns in Interenterprise Data Sharing

Exchange of data between enterprises – aka interenterprise data sharing – is becoming a hot topic in the era of digital transformation. As told in the post Data Quality and Interenterprise Data Sharing, this approach is the cost-effective way to ensure data quality for the fast-increasing amount of data every organization has to manage when introducing new digital services.

McKinsey Digital recently elaborated on this theme in an article with the title Harnessing the power of external data. As stated in the article: “Organizations that stay abreast of the expanding external-data ecosystem and successfully integrate a broad spectrum of external data into their operations can outperform other companies by unlocking improvements in growth, productivity, and risk management.”

The arguments against interenterprise data sharing I hear most often revolve around privacy and confidentiality concerns.

Let us have a look at this challenge within the two most common master data domains: Party data and product data.

Party Data

The firm CDQ talks about the case for sharing party data in the post Data Sharing: A Brief History of a Crazy Idea. As stated there: the pain can be bigger than the concern.

Enforced data privacy and data protection regulations such as GDPR must (and should) be adhered to. They set a very strict limit for exchanging Personally Identifiable Information, only leaving room for the legitimate cases of data portability.

However, information about organizations can be shared not only by exploiting public third-party sources such as business directories but also through data pools between like-minded organizations. Here you must consider whether your typos in company names, addresses and more really are that confidential.
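
As a hedged illustration of how shared party data can surface such typos, the sketch below uses Python's standard difflib to flag likely duplicate company name spellings across two partners' records; the names are invented.

```python
from difflib import SequenceMatcher

our_records = ["Acme Corporaton", "Widget Makers Ltd"]          # note the typo
partner_records = ["Acme Corporation", "Widget Makers Ltd."]

def likely_same(a, b, threshold=0.85):
    """Crude name similarity; real matching would also use addresses, identifiers, etc."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

for ours in our_records:
    for theirs in partner_records:
        if ours != theirs and likely_same(ours, theirs):
            print(f"Possible typo or duplicate: '{ours}' vs '{theirs}'")
```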

Product Data

The case for exchanging product data is explained in the post The Role of Product Data Syndication in Interenterprise MDM.

Though the vast amount of product data is meant to become public, concerns about confidentiality also exist for product data. Trading prices are an obvious area. The timing of releasing product data is another concern.

In the Product Data Lake syndication service I work with, there are measures to ensure the right level of confidentiality. These include encryption and controlling with whom you share what and when you do it.

Data governance plays a crucial role in orchestrating interenterprise data sharing with the right approach to data privacy and confidentiality. How this is done in, for example, product data syndication is explained on the page about Product Data Lake Documentation and Data Governance.

Data Quality and Interenterprise Data Sharing

When working with data quality improvement there are three kinds of data to consider:

First-party data is the data that is born and managed internally within the enterprise. This data has traditionally been the focus of data quality methodologies and tools, with the aim of ensuring that data is fit for the purpose of use and correctly reflects the real-world entity that the data is describing.

Third-party data is data sourced from external providers who offer a set of data that can be utilized by many enterprises. Examples are location directories, business directories such as the Dun & Bradstreet Worldbase and public national directories, and product data pools such as the Global Data Synchronization Network (GDSN).

Enriching first-party data with third-party data is a means to ensure better data completeness, better data consistency, and better data uniqueness.
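
A minimal sketch of such an enrichment step, assuming a hypothetical third-party business directory keyed by a registration number; the field names and values are invented.

```python
# Hypothetical third-party directory keyed by a registration number.
business_directory = {
    "123456789": {"legal_name": "Acme Corporation", "country": "DK",
                  "industry_code": "2511"},
}

def enrich(first_party_record):
    """Fill gaps from the directory without overwriting values already captured."""
    reference = business_directory.get(first_party_record.get("registration_number"), {})
    enriched = dict(first_party_record)
    for key, value in reference.items():
        enriched.setdefault(key, value)
    return enriched

crm_record = {"registration_number": "123456789", "legal_name": "Acme Corp"}
print(enrich(crm_record))
# The captured legal_name is kept; country and industry_code are added.
```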

Second-party data is data sourced directly from a business partner. Examples are supplier self-registration, customer self-registration and inbound product data syndication. Exchange of this data is also called interenterprise data sharing.

The advantage of using second-party data from a data quality perspective is that you are closer to the source, which, all things being equal, means that the data more accurately reflects the real-world entity it is describing.

In addition to that, and compared to third-party data, you will have the opportunity to operate with data that exactly fits your operating model and makes you unique compared to your competitors.

Finally, second-party data obtained through interenterprise data sharing will reduce the cost of capturing data compared to first-party data; otherwise, the ever-increasing demand for more elaborate high-quality data in the age of digital transformation will overwhelm your organization.

The Balancing Act

Getting optimal data quality with the least effort is about balancing the use of internal and external data, where you can exploit interenterprise data sharing by combining second-party and third-party data in the way that makes most sense for your organization.

As always, I am ready to discuss your challenge. You can book a short online session for that here.

The Role of Product Data Syndication in Interenterprise MDM

Interenterprise Master Data Management is on the rise as reported in the post Watch Out for Interenterprise MDM. Interenterprise MDM is about how organizations can collaborate by sharing master data with business partners in order to optimize their own master data and create new data-driven revenue models together with business partners.

One of the most obvious places to start with Interenterprise MDM is Product Data Syndication (PDS). While PDS until now has mostly been applied when syndicating product data to marketplaces, there is a huge potential in streamlining the flow of product information from manufacturers to merchants and end users of product information.

Inbound and Outbound Product Data Syndication

There are two scenarios in interenterprise Product Data Syndication:

  • Outbound, where your organization, as part of a supply chain, provides product information to your range of customers. The challenge is that with no PDS functionality in between, you must cater for many (hundreds or thousands of) different structures, formats, taxonomies and exchange methods requested by your customers.
  • Inbound, where your organization, as part of a supply chain, receives product information from your range of suppliers. The challenge is that with no PDS functionality in between, you must cater for many (hundreds or thousands of) different structures, formats, taxonomies and exchange methods coming in.

Learn more in the post Inbound and Outbound Product Data Syndication.

4 Main Use Cases for Collaborative PDS

There are these four main use cases for exchanging product data in supply chains:

  • Exchanging product data for resell products, where manufacturers and brands forward product information to the end point-of-sale at a merchant. With the rise of online sales both in business-to-consumer (B2C) and business-to-business (B2B), buying decisions are self-service based, which means a dramatic increase in the demand for product data throughput.
  • Exchanging product data for raw materials and packaging. Here there is a rising demand for automating the quality assurance process, blending processes in organic production and controlling sustainability-related data through data lineage capabilities.
  • Exchanging product data for parts used in MRO (Maintenance, Repair and Operation). As these parts are becoming components of the Industry 4.0 / Industrial Internet of Things (IIoT) wave, there will be a drastic demand for providing rich product information when delivering these parts.
  • Exchanging product data for indirect products, where upcoming use of Artificial Intelligence (AI) in all procurement activities also will lead to requirements for availability of product information in this use case.  

Learn more in the post 4 Supplier Product Data Onboarding Scenarios.

Collaborative PDS at Work

In the Product Data Lake venture I am working on now, we have made a framework – and a piece of Software as a Service – that is able to leverage the concepts of inbound and outbound PDS and enable the four mentioned use cases for product data exchange.

The framework is based on reusing popular product data classifications (such as GPC, UNSPSC, ETIM, eClass and ISO) and attribute requirement standards (such as ETIM and eClass). Also, trading partners can use their preferred data exchange method (FTP file drop, for example with BMEcat, API, or plain import/export) on each side.
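
A hedged sketch of how such per-partner preferences could be declared; the profile shape and option names are invented and not taken from the actual Product Data Lake configuration.

```python
# Invented configuration shape: each trading partner declares its preferred
# classification and exchange method once; the service bridges the differences.
partner_profiles = {
    "acme-manufacturing": {
        "role": "provider",
        "classification": "ETIM",
        "exchange_method": "ftp",   # e.g. periodic BMEcat file drops
    },
    "example-merchant": {
        "role": "receiver",
        "classification": "GPC",
        "exchange_method": "api",   # pulls transformed records on demand
    },
}

def needs_class_translation(provider, receiver):
    return (partner_profiles[provider]["classification"]
            != partner_profiles[receiver]["classification"])

print(needs_class_translation("acme-manufacturing", "example-merchant"))  # True
```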

All in all, the big win is that each upstream provider (typically a manufacturer / brand) can upload one uniform product catalogue to the Product Data Lake and each downstream receiver (a merchant or user organization) can download a uniform product catalogue covering all suppliers.

Watch Out for Interenterprise MDM

In the recent Gartner Magic Quadrant for Master Data Management Solutions there is a bold statement:

By 2023, organizations with shared ontology, semantics, governance and stewardship processes to enable interenterprise data sharing will outperform those that don’t.

The interenterprise data sharing theme was covered a couple of years ago here on the blog in the post What is Interenterprise Data Sharing?

Interenterprise data sharing must be leveraged through interenterprise MDM, where master data are shared between many companies, for example in supply chains. The evolution of interenterprise MDM and the current state of the discipline was touched upon in the post MDM Terms In and Out of The Gartner 2020 Hype Cycle.

In the 00’s the evolution of Master Data Management (MDM) started with single domain / departmental solutions dominated by Customer Data Integration (CDI) and Product Information Management (PIM) implementations. These solutions were in the best cases underpinned by third-party data sources such as business directories, for example the Dun & Bradstreet (D&B) Worldbase, and second-party product information sources, for example the GS1 Global Data Synchronization Network (GDSN).

In the previous decade, multidomain MDM with enterprise-wide coverage became the norm. Here the solution typically encompasses customer, vendor/supplier, product and asset master data. Increasingly, GDSN is supplemented by other forms of Product Data Syndication (PDS). Third-party and second-party sources are delivered in the form of Data as a Service that comes with each MDM solution.

In this decade we will see the rise of interenterprise MDM, where the solutions to some extent become business ecosystem wide, meaning that you will increasingly share master data and possibly the MDM solutions with your business partners – or else you will fade in the wake of the overwhelming data load you will have to handle yourself.

So, watch out if you are not applying interenterprise MDM.

PS: That goes for MDM end user organizations and MDM platform vendors as well.

4 Supplier Product Data Onboarding Scenarios

When working with Product Information Management (PIM) and Product Master Data Management (Product MDM), one of the most important and challenging areas is how you effectively onboard product master data / product information for products that you do not produce in-house.

There are 4 main scenarios for that:

  • Onboarding product data for resell products
  • Onboarding product data for raw materials and packaging
  • Onboarding product data for parts used in MRO (Maintenance, Repair and Operation)
  • Onboarding product data for indirect products

Onboarding product data for resell products

This scenario is the main scenario for distributors/wholesalers, retailers and other merchants. However, most manufacturers also have a range of products that are not produced in-house but are essential supplements when selling their own produced products.

The process involves getting the most complete set of product information available from the supplier in order to provide the optimal set of product information needed to support a buying decision by the end customer. With the increase in online sales, the buying decision today is often self-serviced. This has dramatically increased the demand for product information throughput.

Onboarding product data for raw materials and packaging

This scenario exists at manufacturers of products. Here the objective is to get the product information needed to do quality assurance and, in organic production, to apply the right blend in order to produce a consistent finished product.

Also, the increasing demand for sustainability measures is driving the need for information on the provenance of the finished product and the packaging, including the origin of the ingredients and the circumstances under which these components were produced.

Onboarding product data for parts used in MRO

Product data for parts used in Maintenance, Repair and Operation is a main scenario at manufacturers, related to running the production facilities. However, most organizations have facility management around logistics facilities, offices, and other constructions where products for MRO are needed.

With the rise of the Internet of Things (IoT), these products are becoming more and more intelligent and are operated in an automatic way. For that, product information is needed to a degree not seen until now.

Onboarding product data for indirect products

Every organization needs products and services such as furniture, office supplies, travel services and much more. The need for onboarding product data for these purchases is still minimal compared to the above-mentioned scenarios. However, a foreseeable increased use of Artificial Intelligence (AI) in procurement operations will ignite the requirement for product data onboarding for this scenario too in the coming years.

The Need for Collaborative Product Data Syndication

The sharp rise in the need for product data onboarding calls for increased collaboration between suppliers and Business-to-Business (B2B) customers. It is here worth noticing that many organizations have both roles in one or the other scenario. The discipline that is most effectively applied to solve the challenges is Product Data Syndication. This is further explained in the post Inbound and Outbound Product Data Syndication.