What is Collaborative Product Data Syndication?

Product Data Syndication (PDS) is a sub-discipline within Product Information Management (PIM), as explained in the post What is Product Data Syndication (PDS)?

Collaborative PDS can be achieved at scale with a specialized product data syndication service, where the manufacturer can push product information according to their definitions and the merchant can pull linked and transformed product information according to their definitions.
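
As a minimal sketch of how such a push/pull service could work, consider the Python outline below. The attribute names and the linking map are hypothetical stand-ins; real trading partners will have their own definitions.

```python
# A minimal sketch of push/pull product data syndication.
# All attribute names and mappings are hypothetical examples.

# The manufacturer pushes product data according to its own definitions.
manufacturer_record = {
    "ItemNo": "M-1001",
    "ItemName": "Cordless Drill 18V",
    "NetWeightKg": 1.4,
}

# The service maintains a linking map that translates the manufacturer's
# attribute names into the merchant's definitions.
attribute_map = {
    "ItemNo": "supplier_sku",
    "ItemName": "product_title",
    "NetWeightKg": "weight_kg",
}

def transform(record: dict, mapping: dict) -> dict:
    """Re-key a pushed record into the pulling party's definitions."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}

# The merchant pulls the linked and transformed record.
merchant_record = transform(manufacturer_record, attribute_map)
print(merchant_record)
# {'supplier_sku': 'M-1001', 'product_title': 'Cordless Drill 18V', 'weight_kg': 1.4}
```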

With Collaborative Product Data Syndication, you can get the best of both worlds:

  • You can have the market standard content that keeps you from falling behind your competitors.
  • At the same time, you can have unique content coming through that puts you ahead of your competitors.

The advantages of collaborative PDS versus other PDS approaches were examined in the post Collaborative Product Data Syndication vs Data Pools and Marketplaces.

The Product Data Lake solution I am involved with utilizes the data lake concept to handle the complexities of having many different data standards for product information in play within supply chains and to encompass the many different preferences for exchange methods.

Our approach is not to reinvent the wheel, but to collaborate with partners in the industry. This includes:
·       Experts within a given product type, such as building materials (and the sub-sectors in this industry), machinery, chemicals, automotive, furniture and homeware, electronics, work clothes, fashion, books and other printed materials, food and beverage, pharmaceuticals and medical devices. You may also be a specialist in certain standards for product data. You will link the taxonomies in use at two trading partners or within a larger business ecosystem.
·       Product data cleansing specialists who have proven track records in optimizing product master data and product information. You will prepare the product data portfolio at a trading partner and extend the service to other trading partners or within a larger business ecosystem.
·       System integrators who can integrate product data syndication flows into Product Information Management (PIM) and other solutions at trading partners and consult on the surrounding data quality and data governance issues. You will enable the digital flow of product information between two trading partners or within a larger business ecosystem.
·       Tool vendors who can offer in-house Product Information Management (PIM) / Master Data Management (MDM) solutions or similar solutions in the ERP and Supply Chain Management (SCM) sphere. You will be able to provide, supplement or replace customer data portals at manufacturers and supplier data portals at merchants and thus offer truly automated and interactive product data syndication functionality.
·       Technology providers with data governance solutions, data quality management solutions and Artificial Intelligence (AI) / machine learning capabilities for classifying and linking product information to support the activities performed by other delegates and subscribers.
·       Reservoirs: Product Data Lake is a unique opportunity for service providers with product data portfolios (data pools and data portals) to utilize modern data management technology and offer a comprehensive way of collecting and distributing product data within the business processes used by subscribers.

The Disruptive MDM/PIM/DQM List 2022: Reltio

Today it is time to present the fourth vendor to be on The Disruptive MDM / PIM / DQM List in 2022. That is Reltio.

I have been following Reltio here on the blog since 2013, as this MDM vendor has grown from a start-up to a recognized solution provider, recently manifested by being named a Leader in the Forrester MDM Wave and receiving 120 million USD in funding last month.

Reltio is a multi-domain cloud-native MDM solution covering a broad range of MDM capabilities.

You can learn more about Reltio here.

MDM and Knowledge Graph

As examined in a previous post with the title Data Fabric and Master Data Management, the use of the knowledge graph approach is on the rise.

Utilizing a knowledge graph has an overlap with Master Data Management (MDM).

If we go back 10 years, MDM and Data Quality Management had a small niche discipline that was called (among other things) entity resolution, as explored in the post Non-Obvious Entity Relationship Awareness. The aim of this was the same as what today can be delivered at a much larger scale using knowledge graph technology.

During the past decade there have been examples of using graph technology for MDM, as for example mentioned in the post Takeaways from MDM Summit Europe 2016. However, most attempts to combine MDM and graph have been limited to visualizing the relationships in MDM using a graph presentation.

When utilizing knowledge graph approaches, you will be able to detect many more relationships than those that are currently managed in MDM. This fact is the foundation for a successful co-existence between MDM and knowledge graph, with these synergies:

  • MDM hubs can enrich the knowledge graph with proven descriptions of the entities that are the nodes (vertices) in the knowledge graph.
  • Additional detected relationships (edges) and entities (nodes) from the knowledge graph that are of operational and/or general analytic interest enterprise-wide can be proven and managed in MDM.

In this way you can create new business benefits from both MDM and knowledge graph.
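
As an illustration of these two synergies, here is a minimal sketch in Python using the networkx library; the entities, attributes and relationship below are hypothetical.

```python
import networkx as nx

# A hypothetical knowledge graph with two nodes and one detected relationship.
graph = nx.DiGraph()
graph.add_node("P1", name="Robert Smith")
graph.add_node("C1", name="Acme Corp")
graph.add_edge("P1", "C1", relation="works_for", source="detected")

# Synergy 1: enrich a graph node with the proven golden record from the MDM hub.
mdm_golden_record = {"name": "Robert Smith", "address": "123 Main Street, Anytown"}
graph.nodes["P1"].update(mdm_golden_record)

# Synergy 2: hand detected relationships of enterprise-wide interest over
# to MDM, where they can be proven and managed.
candidates_for_mdm = [
    (source, target, data["relation"])
    for source, target, data in graph.edges(data=True)
    if data.get("source") == "detected"
]
print(graph.nodes["P1"])   # the enriched node
print(candidates_for_mdm)  # [('P1', 'C1', 'works_for')]
```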

The Disruptive MDM/PIM/DQM List 2022: Magnitude Software

In the round of presenting the solutions on The Disruptive MDM / PIM / DQM List 2022, the next vendor is Magnitude Software.

Magnitude Software has two solutions on the list:

  • Kalido MDM, where you can define and model critical business information from any domain – customer, product, financial, vendor, supplier, location and more – to create and manage accurate, integrated and governed data that business users trust.
  • Agility Multichannel PIM, which has the capabilities to get products to market faster with a simple-to-use, comprehensive Product Information Management solution that makes it easy to support commerce across digital and traditional channels.

Learn more about Kalido MDM here and Agility Multichannel PIM here.

What’s in a Data Governance Framework?

When you are going to implement data governance one key prerequisite is to work with a framework that outlines the key components of the implementation and ongoing program.

There are many frameworks available. A few are public while most are legacy frameworks provided by consultancy companies.

Anyway, the seven main components that you will (or should) see in a data governance framework are these:

  • Vision and mission: Formalizing a statement of the desired outcome, the business objectives to be reached and the scope covered.
  • Organization: Outlining how the implementation team and the continuing core team are to be organized, their mandate and job descriptions, as well as outlining the forums needed for business engagement.
  • Roles and responsibilities: Assigning the wider roles involved across the business, often set in a RACI matrix with responsible, accountable, consulted and informed roles for data domains and the critical data elements within.
  • Business Glossary: Creation and maintenance of a list of business terms and their definitions that must be used to ensure that the same vocabulary is used enterprise-wide when operating with and analyzing data.
  • Data Policies and Data Standards: Documentation of the overarching data policies enterprise-wide and for each data domain and the standards for the critical data elements within.
  • Data Quality Measurement: Identification of the key data quality indicators that support general key performance indicators in the business and the desired goals for these.
  • Data Innovation Roadmap: Forecasting the future need for new data elements and relationships to be managed to support key business drivers such as digitalization and globalization.

Other common components in and around a data governance framework are the funding/business case, data management maturity assessment, escalation procedures and other processes.
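
To make the components tangible, the skeleton of a framework can be captured in a simple structure, as in the Python sketch below. All entries are hypothetical placeholders, not prescribed content.

```python
# A hypothetical skeleton mirroring the seven main components above.
data_governance_framework = {
    "vision_and_mission": "Trusted data supporting our digitalization objectives",
    "organization": {
        "core_team": ["data governance lead", "data stewards"],
        "forums": ["data governance council"],
    },
    "roles_and_responsibilities": {
        # RACI per data domain: Responsible, Accountable, Consulted, Informed.
        "customer": {"R": "data steward", "A": "sales director", "C": "IT", "I": "finance"},
    },
    "business_glossary": {
        "customer": "A party with whom we have a commercial relationship",
    },
    "policies_and_standards": {
        "customer": ["postal addresses follow the national postal standard"],
    },
    "data_quality_measurement": {
        "customer_address_completeness": {"target": 0.98},
    },
    "data_innovation_roadmap": [
        "add data elements supporting digitalization initiatives",
    ],
}
```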

What else have you seen or should be seen in a data governance framework?   

The Disruptive MDM/PIM/DQM List 2022: Contentserv

One of the recurring entries on The Disruptive MDM/PIM/DQM List is Contentserv.

Contentserv operates under the slogan: Futurize your customers’ product experience.

Using Contentserv, you will be able to develop the groundbreaking product experiences your customers expect — across multiple channels. Contentserv helps you unleash the potential of your product information, using its unique combination of advanced technologies.

Contentserv has combined multiple data management technologies in a single platform for controlling the total product experience. The platform facilitates collecting data from suppliers, enriching it into high-grade content, and then personalizing it for use in targeted marketing and promotions.

Learn more about the Contentserv Product Experience Platform here.

PS: You can also find some compelling success stories from Contentserv on the Case Study List here.

What is Product Data Syndication (PDS)?

Product Information Management (PIM) has a sub-discipline called Product Data Syndication (PDS).

While PIM basically is about how to collect, enrich, store and publish product information within a given organization, PDS is about how to share product information between manufacturers, merchants and marketplaces.

Marketplaces

Marketplaces are the new kids on the block in this world. Amazon and Alibaba are the best-known ones; however, there are plenty of them internationally, nationally and within given product groups. Merchants can provide product information related to the goods they are selling on a marketplace. A disruptive force in the supply (or value) chain world is that today manufacturers can sell their goods directly on marketplaces and thereby leave out the merchants. It is, though, still only a fraction of trade that has been diverted this way.

Each marketplace has its own requirements for how product information should be uploaded, encompassing which data elements are needed, the requested taxonomy and data standards, as well as the data syndication method.
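
As a sketch of what checking a feed against such requirements could look like, consider this Python outline; the required elements and the accepted taxonomy values are hypothetical.

```python
# Hypothetical marketplace requirements: required data elements and the
# accepted category values in the marketplace's taxonomy.
required_elements = {"sku", "title", "category", "image_url"}
accepted_categories = {"power-tools", "hand-tools", "accessories"}

def check_feed_record(record: dict) -> list:
    """Return the issues preventing this record from being uploaded."""
    issues = sorted(f"missing element: {e}" for e in required_elements - record.keys())
    category = record.get("category")
    if category is not None and category not in accepted_categories:
        issues.append(f"category '{category}' is not in the marketplace taxonomy")
    return issues

record = {"sku": "M-1001", "title": "Cordless Drill 18V", "category": "drills"}
print(check_feed_record(record))
# ['missing element: image_url', "category 'drills' is not in the marketplace taxonomy"]
```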

Data Pools

One way of syndicating (or synchronizing) data from manufacturers to merchants is going through a data pool. The best-known one is the Global Data Synchronization Network (GDSN) operated by GS1 through data pool vendors, of which 1WorldSync is the dominant one. Here trading partners follow the same classification, taxonomy and structure for a group of products (typically food and beverage) and their most common attributes in use in a given geography.

There are plenty of other data pools available focusing on given product groups either internationally or nationally. The concept here is also that everyone uses the same taxonomy and has the same structure and range of data elements available.

Data Standards

Product classifications can be used to apply the same data standards. GS1 has a product classification called GPC. Some marketplaces use the UNSPSC classification provided by the United Nations and – perhaps ironically – also operated by GS1. Other classifications, which in addition encompass the attribute requirements, are eClass and ETIM.

A manufacturer can have product information in an in-house ERP, MDM and/or PIM application. In the same way a merchant (retailer or B2B dealer) can have product information in an in-house ERP, MDM (Master Data Management) and/or PIM application. Most often a pair of manufacturer and merchant will not use the same data standard, taxonomy, format and structure for product information.
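
A core task in product data syndication is therefore translating between the standards in play at each side. A minimal sketch, where the classification codes are made up and only stand in for real GPC and ETIM entries:

```python
# Hypothetical mapping between the manufacturer's GPC codes and the
# merchant's ETIM classes. The codes are made-up examples.
gpc_to_etim = {
    "10001234": "EC000123",  # cordless drills
    "10005678": "EC000456",  # circular saws
}

def translate_classification(gpc_code: str) -> str:
    """Translate a GPC code into the merchant's ETIM class."""
    try:
        return gpc_to_etim[gpc_code]
    except KeyError:
        raise ValueError(f"no ETIM mapping maintained for GPC code {gpc_code}")

print(translate_classification("10001234"))  # EC000123
```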

1-1 Product Data Syndication

Data pools have not substantially penetrated the product data flows encompassing all product groups and all the needed attributes and digital assets. Besides that, merchants also have a desire to provide unique product information and thereby stand out in the competition with other merchants selling the same products.

Thus, the highway in product data syndication is still 1-1 exchange. This highway has these lanes:

  • Exchanging spreadsheets, typically orchestrated so that the merchant requests the manufacturer to fill in a spreadsheet with the data elements defined by the merchant.
  • A supplier portal, where the merchant offers an interface to their PIM environment where each manufacturer can upload product information according to the merchant’s definitions.
  • A customer portal, where the manufacturer offers an interface where each merchant can download product information according to the manufacturer’s definitions.
  • A specialized product data syndication service where the manufacturer can push product information according to their definitions and the merchant can pull linked and transformed product information according to their definitions.

In practice, the chain from the manufacturer to the end merchant may have several nodes, being distributors/wholesalers that reload the data by getting product information from an upstream trading partner and passing this product information on to a downstream trading partner.

Data Quality Implications

Data quality is, as always, a concern when information producers and information consumers must collaborate, and in a product data syndication context the extended challenge is that the upstream producer and the downstream consumer do not belong to the same organization. This ecosystem-wide data quality and Master Data Management (MDM) issue was examined in the post Watch Out for Interenterprise MDM.

The Disruptive MDM/PIM/DQM List 2022: Datactics

A major rework of The Disruptive MDM/PIM/DQM List is in the making, as the number of visitors keeps increasing and so does the number of requests for individual solution lists.

It is good to see that some of the most innovative solution providers commit to being part of the list next year as well.

One of those is Datactics.

Datactics is a veteran data quality solution provider who is constantly innovating in this space. This year Datactics was one of the rare new entries in The Gartner Magic Quadrant for Data Quality Solutions 2021.

It will be exciting to follow the ongoing development at Datactics, who is operating under the slogan: “Democratising Data Quality”.

You can learn more about what their self-service data quality and matching solution looks like here.

(Figure: Core Datactics capabilities)

Data Fabric and Master Data Management

Data fabric has been named a key strategic technology trend in 2022 by Gartner, the analyst firm.

According to Gartner, “by 2024, data fabric deployments will quadruple efficiency in data utilization while cutting human-driven data management tasks in half”.

Master Data Management (MDM) and data fabric are overlapping disciplines as examined in the post Data Fabric vs MDM. I have seen data strategies where MDM is put as a subset to data fabric and data strategies where they are separate tracks.

In my head, there is a common theme: data sharing.

Then there is a difference in focus, where data fabric seems to be focusing on data integration. MDM is also about data integration, but more about data quality. Data fabric takes care of all data, while MDM obviously is about master data, though the coverage of business entities within MDM seems to be broadening.

Another term closely tied to data fabric – and increasingly with MDM as well – is knowledge graph. A knowledge graph is usually considered a means to achieve a good state of data fabric. In the same way, you can use a knowledge graph approach to achieve a good state of MDM when it comes to managing relationships – if you include a data quality facet.

What is your take on data fabric and MDM?

Five Pairs of Data Quality Dimensions

Data quality dimensions are some of the most used terms when explaining why data quality is important, what data quality issues can be and how you can measure data quality. Ironically, we sometimes use the same data quality dimension term for two different things or use two different data quality dimension terms for the same thing. Some of the troubling terms are:

Validity / Conformity – same same but different

Validity is most often used to describe whether data filled in a data field obeys a required format or is among a list of accepted values. Databases are usually good at enforcing this, like ensuring that an entered date has the day-month-year sequence asked for and is a date in the calendar, or cross-checking data values against another table to see if the value exists there.

The problems arise when data is moved between databases with different rules and when data is captured in textual forms before being loaded into a database.

Conformity is often used to describe whether data adheres to a given standard, like an industry or international standard. Due to complexity and other circumstances, this standard may not, or may only partly, be implemented as database constraints or by other means. Therefore, a given piece of data may seem to be a valid database value while not being in compliance with a given standard.

Sometimes conformity is linked to the geography in question. For example, whether a postal code conforms depends on the country the address is in. Therefore, the postal code 12345 conforms in Germany, but not in the United Kingdom.
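
A sketch of such a country-dependent conformity check in Python; the patterns below cover only the two countries in the example:

```python
import re

# Postal code patterns per country; only the two example countries are covered.
POSTAL_CODE_PATTERNS = {
    "DE": r"^\d{5}$",                             # Germany: five digits
    "GB": r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$",  # United Kingdom, e.g. SW1A 1AA
}

def conforms(postal_code: str, country: str) -> bool:
    """Check a postal code against the standard of the given country."""
    pattern = POSTAL_CODE_PATTERNS.get(country)
    return bool(pattern and re.match(pattern, postal_code))

print(conforms("12345", "DE"))  # True
print(conforms("12345", "GB"))  # False
```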

Accuracy / Precision – true, false or not sure

The difference between accuracy and precision is a well-known statistical subject.

In the data quality realm, accuracy is most often used to describe whether the data value corresponds correctly to a real-world entity. If we, for example, have the postal address of the person "Robert Smith" being "123 Main Street in Anytown", this data value may be accurate because this person (for the moment) lives at that address.

But if "123 Main Street in Anytown" has 3 different apartments, each having its own mailbox, the value does not, for a given purpose, have the required precision.

If we work with geocoordinates, we have the same challenge. A given accurate geocode may have sufficient precision to tell the direction to the nearest supermarket, but not be precise enough to know in which apartment the out-of-milk smart refrigerator is.
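
A sketch of the precision side of this: the number of decimals kept in a latitude roughly determines how small an area the geocode can point out, since one degree of latitude is roughly 111 kilometres.

```python
# One degree of latitude is roughly 111 km, so every decimal kept
# divides the positional precision by ten.
METRES_PER_DEGREE_LATITUDE = 111_000

def precision_in_metres(decimals: int) -> float:
    """Approximate precision of a latitude kept at the given number of decimals."""
    return METRES_PER_DEGREE_LATITUDE / 10 ** decimals

print(precision_in_metres(2))  # 1110.0 m: enough to find the neighbourhood
print(precision_in_metres(4))  # 11.1 m: enough to find the building
print(precision_in_metres(6))  # 0.111 m: enough to find the refrigerator
```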

Timeliness / Currency – when time matters

Timeliness is most often used to state if a given data value is present when it is needed. For example, you need the postal address of “Robert Smith” when you want to send a paper invoice or when you want to establish his demographic stereotype for a campaign.

Currency is most often used to state if the data value is accurate at a given time – for example if “123 Main Street in Anytown” is the current postal address of “Robert Smith”.
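
A minimal sketch separating the two checks, with hypothetical validity dates maintained on the address record:

```python
from datetime import date

# A hypothetical address record with maintained validity dates.
address_record = {
    "party": "Robert Smith",
    "postal_address": "123 Main Street, Anytown",
    "valid_from": date(2020, 1, 1),
    "valid_to": None,  # None means still current according to our records
}

def is_timely(record) -> bool:
    """Timeliness: is the data value present when we need it?"""
    return record is not None and record.get("postal_address") is not None

def is_current(record, on: date) -> bool:
    """Currency: is the data value accurate at the given point in time?"""
    ended = record["valid_to"]
    return record["valid_from"] <= on and (ended is None or on <= ended)

print(is_timely(address_record))                 # True
print(is_current(address_record, date.today()))  # True
```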

Uniqueness / Duplication – positive or negative

Uniqueness is the positive term, and duplication is the negative term, for the same issue.

We strive to have uniqueness by avoiding duplicates. In data quality lingo duplicates are two (or more) data values describing the same real-world entity. For example, we may assume that

  • “Robert Smith at 123 Main Street, Suite 2 in Anytown”

is the same person as

  • “Bob Smith at 123 Main Str in Anytown”

Completeness / Existence – to be, or not to be

Completeness is most often used to tell to what degree all required data elements are populated.

Existence can be used to tell whether a given dataset has all the data elements needed for a given purpose defined.

So “Bob Smith at 123 Main Str in Anytown” is complete if we need name, street address and city, but only 75 % complete if we need name, street address, city and preferred colour and preferred colour is an existent data element in the dataset.

Data Quality Management 

Master Data Management (MDM) solutions and specialized Data Quality Management (DQM) tools have capabilities to assess data quality dimensions and improve data quality within the different data quality dimensions.

Check out the range of the best solutions to cover this space on The Disruptive MDM/PIM/DQM List.