Linked Product Data Quality

Some years ago the theme of Linked Data Quality was examined here on the blog.

As stated in that post, a lot of product data is already out there waiting to be found, categorized, matched and linked.

Doing this is at the core of the Product Data Lake venture I am involved with. What we aim to do is link product information stored under different taxonomies at trading partners, preferably by referencing international and industry standards such as eCl@ss, ETIM, UNSPSC, the Harmonized System, GPC and more.

Our approach is not to reinvent the wheel, but to collaborate with partners in the industry. These include:

  • Experts within a given product type, such as building materials (and the sub-sectors in that industry), machinery, chemicals, automotive, furniture and home-ware, electronics, work clothes, fashion, books and other printed materials, food and beverage, or pharmaceuticals and medical devices. You may also be a specialist in certain standards for product data. As an ambassador you will link the taxonomies in use at two trading partners or within a larger business ecosystem.
  • Product data cleansing specialists who have a proven track record in optimizing product master data and product information. As an ambassador you will prepare the product data portfolio at a trading partner and extend the service to other trading partners or within a larger business ecosystem.
  • System integrators who can integrate product data syndication flows into Product Information Management (PIM) and other solutions at trading partners and consult on the surrounding data quality and data governance issues. As an ambassador, you will enable the digital flow of product information between two trading partners or within a larger business ecosystem.
  • Tool vendors who can offer in-house Product Information Management (PIM) / Master Data Management (MDM) solutions or similar solutions in the ERP and Supply Chain Management (SCM) sphere. As an ambassador you will be able to provide, supplement or replace customer data portals at manufacturers and supplier data portals at merchants and thus offer truly automated and interactive product data syndication functionality.
  • Technology providers with data governance solutions, data quality management solutions and Artificial Intelligence (AI) / machine learning capabilities for classifying and linking product information, supporting the activities of ambassadors and subscribers.
  • Reservoirs: Product Data Lake is a unique opportunity for service providers with product data portfolios (data pools and data portals) to utilize modern data management technology and offer a comprehensive way of collecting and distributing product data within the business processes used by subscribers.

See more on the Product Data Lake site, on the Product Data Lake showcase page on LinkedIn, or get in contact right away:

Become a Product Data Lake ambassador!

Welcome to Another Data Lake for Data Sharing

A couple of weeks ago, Microsoft, Adobe and SAP announced their Open Data Initiative. While this, as far as we know, is only a statement of intent for now, it has of course attracted some interest, given that three giants of the IT industry have agreed on something – mostly interpreted as agreeing to oppose Salesforce.com.

Forming a business ecosystem among players in the market is not new. However, what we usually see is that a group of companies agrees on a standard and then each of them puts a product or service that adheres to that standard on the market. The standard then caters for the interoperability between the products and services.

In this case it seems to be something different. The product or service is operated by Microsoft on their Azure platform. There will be some form of common data model. But it is a data lake, meaning that we should expect that data can be provided in any structure and format and consumed in any structure and format.

In all humbleness, this concept is the same as the one that is behind Product Data Lake.

The Open Data Initiative from Microsoft, Adobe and SAP focuses on customer data and seems to be about enterprise-wide customer data. While it technically could also support ecosystem-wide customer data, privacy concerns and compliance issues will restrict that scope in many cases.

At Product Data Lake, we do the same for product data. Only here, the scope is business-ecosystem-wide, as the big pain with product data is the flow between trading partners, as examined here.

[Image: Open Data Initiative – Microsoft, Adobe and SAP]

Digitalization has Put Data in the Forefront

20 years ago, when I started working as a contractor and entrepreneur in the data management space, data was not at the top of the agenda at many enterprises. Fortunately, that has changed.

An example is given by Schneider Electric CEO Jean-Pascal Tricoire in his recent blog post on how digitization and data can enable companies to be more sustainable. You can read it on the Schneider Electric Blog in the post 3 Myths About Sustainability and Business.

Manufacturers in the building material sector naturally emphasize sustainability. In his post Jean-Pascal Tricoire says: “The digital revolution helps answering several of the major sustainability challenges, dispelling some of the lingering myths regarding sustainability and business growth”.

One of three myths dispelled is: Sustainability data is still too costly and time-consuming to manage.

From my work with Master Data Management (MDM) and Product Information Management (PIM) at manufacturers and merchants in the building material sector I know that managing the basic product data, trading data and customer self-service ready product data is hard enough. Taking on sustainability data will only make that harder. So, we need to be smarter in our product data management. Smart and sustainable homes and smart sustainable cities need smart product data management.

In his post Jean-Pascal Tricoire mentions that Schneider Electric has worked with other enterprises in their ecosystem in order to be smarter about product data related to sustainability. In my eyes, the business ecosystem theme is key in the product data smartness quest, as pondered in the post How Manufacturers of Building Materials Can Improve Product Information Efficiency.


It is time to apply AI to MDM and PIM

The intersection between Artificial Intelligence (AI) and Master Data Management (MDM) – and the associated discipline Product Information Management (PIM) – is an emerging topic.

A use case close to me

In my work setting up a service called Product Data Lake, the inclusion of AI has become an important topic. The aim of this service is to translate between the different taxonomies in use at trading partners, for example when a manufacturer shares its product information with a merchant.

In some cases the manufacturer, the provider of product information, may use the same standard for product information as the merchant. This may be a deep standard such as eCl@ss or ETIM or a pure product classification standard such as UNSPSC. In this case we can apply deterministic matching of the classifications and the attributes (also called properties or features), as sketched below.
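To make the deterministic case concrete, here is a minimal sketch in Python. It assumes a simplified record layout where both trading partners reference the same standard's class and feature codes; the codes themselves are made-up placeholders, not real eCl@ss or ETIM identifiers.

```python
# Deterministic matching when both trading partners use the same
# classification standard: shared class and feature codes make the
# mapping a straight lookup. All codes below are illustrative only.

def match_attributes(supplier_record: dict, merchant_schema: dict) -> dict:
    """Map supplier attribute values onto the merchant's expected
    features, keyed by the shared standard's feature codes."""
    if supplier_record["class_code"] != merchant_schema["class_code"]:
        raise ValueError("Records are classified under different classes")
    matched = {}
    for feature_code, expected in merchant_schema["features"].items():
        value = supplier_record["features"].get(feature_code)
        if value is not None:
            matched[feature_code] = {"name": expected["name"], "value": value}
    return matched

supplier = {
    "class_code": "EC001234",  # hypothetical class code
    "features": {"EF000001": "230 V", "EF000002": "IP44"},
}
merchant = {
    "class_code": "EC001234",
    "features": {
        "EF000001": {"name": "Rated voltage"},
        "EF000002": {"name": "Degree of protection"},
    },
}
print(match_attributes(supplier, merchant))
```

When the codes line up like this, no human judgement is needed; the hard cases are the ones covered next.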

[Image: Product Data Syndication]

However, there are most often uncovered areas even when two trading partners share the same standard. And more often still, the two trading partners are using different standards altogether.

In that case, we will initially use human resources to do the linking. Our data governance framework for that includes upstream (manufacturer) responsibility, downstream (merchant) responsibility and our ambassador concept.

As always, applying too much human interaction is costly, time consuming and error prone. Therefore, we are eagerly training our machines to do this work in a cost-effective way, within a much shorter time frame and with a repeatable and consistent outcome, to the benefit of the participating manufacturers, merchants and other enterprises involved in exchanging products and the related product information.
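As a toy illustration of the machine-assisted step, the sketch below uses plain string similarity to propose candidate links between two attribute vocabularies, leaving confirmation to a human steward. A real implementation would use models trained on curated mappings; difflib simply stands in for that here.

```python
# Toy sketch of machine-assisted linking between two different
# taxonomies: propose candidate attribute pairs above a similarity
# threshold and leave final confirmation to a human steward.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def suggest_links(source_attrs: list[str], target_attrs: list[str],
                  threshold: float = 0.5) -> list[tuple[str, str, float]]:
    suggestions = []
    for src in source_attrs:
        best = max(target_attrs, key=lambda tgt: similarity(src, tgt))
        score = similarity(src, best)
        if score >= threshold:
            suggestions.append((src, best, round(score, 2)))
    return suggestions

print(suggest_links(["Rated Voltage", "Colour"],
                    ["Voltage rating", "Color", "Weight"]))
```

The threshold controls how much is auto-suggested versus left entirely to the human linking described above.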

Learning from others

This week I participated in a workshop around exchanging experiences and proving use cases for AI and MDM. The above-mentioned use case was one of several use cases examined there. And for sure, there is a basis for applying AI with substantial benefits for the enterprises who get this. The workshop was arranged by Camelot Management Consultants within their Global Community for Artificial Intelligence in MDM.

Share or be left out of business

Enterprises are increasingly going to be part of business ecosystems where collaboration between legal entities not belonging to the same company family tree will be the norm.

This trend is driven by digital transformation, as no enterprise can possibly master all the disciplines needed in applying a digital platform to traditional ways of doing business.

Enterprises are basically selfish. This is also true when it comes to Master Data Management (MDM). Most master data initiatives today revolve around aligning internal silos of master data and the surrounding processes to fit the business objectives of the enterprise as a whole. And that is hard enough.

However, in the future that will not be enough. You must also be able to share master data in the business ecosystems where your enterprise will belong. The enterprises that, in a broad sense, get this first will survive. Those who lag behind are in danger of being left out of business.

This is the reason for being of Master Data Share.

[Image: Master Data Share or be OOB]

Three Flavors of Data Monetization

The term data monetization is trending in the data management world.

Data monetization is about harvesting direct financial results from having access to data that is stored, maintained, categorized and made accessible in an optimal manner. Traditionally, data management and analytics have contributed indirectly to financial outcomes by aiming to keep data fit for purpose in the various business processes that produce value for the business. Today, the best performers are using data much more directly to create new services and business models.

In my view there are three flavors of data monetization:

  • Selling data: This is something that has been known to the data management world for years. Notable examples are the likes of Dun & Bradstreet, who sell business directory data, as touched on in the post What is a Business Directory? Another example is postal services around the world selling their address directories. This is the kind of data we know as third party data.
  • Wrapping data around products: If you have a product – or a service – you can add tremendous value and make it more sellable by wrapping data, potentially including third party data, around that product or service. These data will thus become second party data, as touched on in the post Infonomics and Second Party Data.
  • Advanced analytics and decision making: You can combine third party data, second party data and first party data (your own data) in order to do advanced analytics and fast operational decision making, so as to sell more, reduce costs and mitigate risks.

Please learn more about data monetization by downloading a recent webinar hosted by Information Builders, featuring their expert Rado Kotorov and yours truly, here.

[Image: Data Monetization]

Product Data Syndication Freedom

When working with product data syndication in supply chains, the big pain is that the data standards in use and the preferred exchange methods differ between supply chain participants.

As a manufacturer, you will have hundreds of re-sellers who probably have data standards different from yours and most likely want to exchange data in a different way than you do.

As a merchant, you will have hundreds of suppliers who probably have data standards different from yours and most likely want to exchange data in a different way than you do.

The aim of Product Data Lake is to take that pain away from both the manufacturer side and the merchant side. We offer product data syndication freedom by letting you as a manufacturer push product information using your data standards and your preferred exchange method, and letting you as a merchant pull product information using your data standards and your preferred exchange method, as sketched below.
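A minimal sketch of the push/pull idea follows: the manufacturer pushes records in its own structure, the hub stores them as-is, and each merchant pulls them translated through that merchant's own mapping. All field names are illustrative placeholders, not Product Data Lake's actual data model.

```python
# The hub stores records exactly as pushed by the manufacturer and
# translates them on the way out, per merchant, via a field mapping.

hub_storage: list[dict] = []

def push(manufacturer_record: dict) -> None:
    """Manufacturer side: push data in the manufacturer's own format."""
    hub_storage.append(manufacturer_record)

def pull(merchant_mapping: dict) -> list[dict]:
    """Merchant side: pull data translated into the merchant's fields."""
    return [
        {target: record[source]
         for source, target in merchant_mapping.items() if source in record}
        for record in hub_storage
    ]

push({"ItemNo": "A-100", "Descr": "Copper pipe 15 mm", "NetWeight": "0.8"})
# The merchant's mapping from the manufacturer's fields to its own:
print(pull({"ItemNo": "sku", "Descr": "product_name", "NetWeight": "weight_kg"}))
```

Each side keeps its own standards; only the mapping in the middle has to be maintained.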

[Image: Product Data Syndication]

If you want to know more, get in contact here:

Avoid Duplicates by Avoiding Peer-to-Peer Integrations

When working in Master Data Management (MDM) programs, one of the main pain points always on the list is duplicates. As explained in the post Golden Records in Multi-Domain MDM, this may be duplicates in party master data (customer, supplier and other roles) as well as duplicates in product master data, assets, locations and more.

Most of the data quality technology available to solve these problems revolves around identifying duplicates. This is a very intriguing discipline where I have spent some of my best years. However, it is only a remedy for the symptoms of the problem and not a means to eliminate the root cause, as touched on in the post The Good, Better and Best Way of Avoiding Duplicates.

The root causes are plentiful and, as with all challenges, they involve technology, processes and people.

Having an IT landscape with multiple applications where master data are created, updated and consumed is a basic problem, and remedying it is the main reason for being of Master Data Management (MDM) solutions. The challenge is to implement MDM technology in a way that the MDM solution does not just become another silo of master data but instead becomes a solution for sharing master data within the enterprise – and ultimately in the digital ecosystem around the enterprise.

The main enemy from a technology perspective is, in my experience, peer-to-peer system integration. If you have chosen application X to support one business objective and application Y to support another business objective, and you learn that there is an integration solution between X and Y available, this is very bad news. Short-term cost and timing considerations will make that option look obvious. But in the long run it will cost you dearly if the master data involved are handled in other applications as well, because then you will have blind spots all over the place through which duplicates will enter.

The only sustainable solution is to build a master data hub through which master data are integrated and thus shared with all applications inside the enterprise and around the enterprise. This hub must encompass a shared master data model and related metadata. A minimal sketch of the hub idea follows below.
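Here is a minimal sketch, assuming a naive normalized-name match purely for illustration; real hubs use far more sophisticated matching. The point is the shape: every application writes to and reads from one hub, so provenance is kept and duplicates are caught at the single point of entry.

```python
# Applications register records through one shared hub instead of
# point-to-point links, so every application sees the same record.
# Matching here is a naive normalized-name comparison for illustration.

class MasterDataHub:
    def __init__(self):
        self.records: dict[str, dict] = {}

    @staticmethod
    def _key(record: dict) -> str:
        return record["name"].strip().lower()

    def upsert(self, source: str, record: dict) -> str:
        """Merge an incoming record into the hub; return the hub key."""
        key = self._key(record)
        golden = self.records.setdefault(key, {"sources": {}})
        golden["sources"][source] = record   # keep provenance per source
        for field, value in record.items():  # naive "latest wins" survivorship
            golden[field] = value
        return key

hub = MasterDataHub()
hub.upsert("CRM", {"name": "ACME Corp", "city": "Berlin"})
hub.upsert("ERP", {"name": "ACME Corp", "vat": "DE123"})
print(hub.records["acme corp"])  # one golden record fed by two systems
```

With a peer-to-peer integration between X and Y, the same two records would have landed in each application separately, and the duplicate would only surface much later.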

 

Welcome Dynamicweb PIM on the Disruptive MDM and PIM List

The Disruptive Master Data Management Solutions List is a list of available:

  • Master Data Management (MDM) solutions
  • Customer Data Integration (CDI) solutions
  • Product Information Management (PIM) solutions
  • Digital Asset Management (DAM) solutions.

You can use this site as a supplement to the likes of Gartner, Forrester, the MDM Institute and others when selecting an MDM / CDI / PIM / DAM solution, not least because this site includes both larger and smaller disruptive MDM, PIM and similar solutions.

The latest entry on the list is Dynamicweb PIM. This is a mature cloud-based Product Information Management (PIM) solution that can be deployed either as a stand-alone PIM implementation or as part of their combined all-in-one platform together with content management, ecommerce and marketing, tightly integrated with popular ERP and CRM solutions. This integrated approach offers a short time-to-value opportunity for midsized companies on the quest to ramp up online sales.

Read more about the Dynamicweb PIM solution here.

[Image: Dynamicweb PIM]

Three Remarkable Observations about Reltio

The latest entry on The Disruptive Master Data Management Solutions List is Reltio. I have been following Reltio for more than 5 years and have had the chance to do some hands-on work lately.

In doing that, I have made three observations that make the Reltio Cloud solution a remarkable MDM offering.

More than Master Data

While the Reltio solution emphasizes master data, the platform can include the data that revolves around master data as well. That means you can bring transactions and big data streams to the platform and apply analytics, machine learning, artificial intelligence and those shiny new things, in order to move these disciplines from a purely analytical world to exploiting these data and capabilities in the operational world.

The thinking behind this approach is that you cannot get a 360-degree view of customers, vendors and other party roles, or a 360-degree view of products, by only having a snapshot compound description of the entity in question. You also need the raw history, the relationships between entities and access to details for various use cases.

In fact, Reltio provides not just operational MDM but, through a module called Reltio IQ, also brings continuously mastered data and correlated transactions into an Apache Spark environment for analytics and machine learning. This eliminates the traditional friction of synchronizing data models between MDM and analytical environments. It also allows aggregated results to be synchronized back into the MDM profiles by storing them as analytical attributes. These attributes are then available for use in an operational context, such as marketing segmentation, sales recommendations, GDPR exposure and more.

Multiple Storage Capabilities

There is an ongoing debate in the MDM community these days about whether you should use relational database technology, NoSQL technology or graph technology. Reltio utilizes all three of them, each for the purposes where that approach makes the most sense.

Reference data are handled as relational data. The entities are kept in a wide column store, a technique offering the scalability known from pure column stores but with some of the structure known from relational databases. Finally, the relationships are handled using graph techniques, which have been a recurring subject on this blog.

Reltio calls this multi-model polyglot persistence, and they embrace the latest technologies from multiple clouds such as AWS and Google Cloud Platform (GCP) under the covers.
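To make the routing idea tangible, here is a schematic sketch of polyglot persistence in general, with three in-memory structures standing in for a relational database, a wide column store and a graph database. It mirrors the principle described above, not Reltio's actual implementation.

```python
# Route each kind of master data to the store type it fits best.
# The three "stores" are in-memory stand-ins for a relational DB,
# a wide column store and a graph database respectively.

reference_store: dict[str, str] = {}   # relational-style: stable code tables
entity_store: dict[str, dict] = {}     # wide-column-style: sparse entity rows
graph_store: list[tuple] = []          # graph-style: (from, relationship, to)

def save_reference(code: str, label: str) -> None:
    reference_store[code] = label

def save_entity(entity_id: str, attributes: dict) -> None:
    entity_store.setdefault(entity_id, {}).update(attributes)

def save_relationship(src: str, rel: str, dst: str) -> None:
    graph_store.append((src, rel, dst))

save_reference("DK", "Denmark")                           # reference data
save_entity("cust-1", {"name": "ACME", "country": "DK"})  # entity data
save_entity("prod-9", {"name": "Copper pipe 15 mm"})
save_relationship("cust-1", "BUYS", "prod-9")             # relationship data
```

Each store can then be queried with the access pattern it is best at: joins for reference data, key lookups for entities, traversals for relationships.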

Survival of the Fit Enough

One thing MDM solutions do is make a golden record from different systems of record in which the same real-world entity is described in many ways, across what are therefore considered duplicate records. Identifying those records is hard enough. But then comes the task of merging the conflicting values together, so that the most accurate values survive in the golden record.

Reltio does that very elegantly by actually not doing it. Survivorship rules can be set up based on all the needed parameters, such as recency, provenance and more, and you may also allow more than one value to survive, as touched on in the post about the principle of Survival of the Fit Enough.

In Reltio there is no purge of the values that do not immediately survive. The golden record is not stored physically. Instead, Reltio keeps one (or even more than one) virtual golden record by letting the original source records stay. Therefore, you can easily roll back or update the single view of the truth, as sketched below.
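Here is a hedged sketch of that principle in general: source records are kept untouched and a golden record is computed on the fly from a survivorship rule (here simply: the most recent value wins). It illustrates the idea of a virtual golden record, not Reltio's actual engine.

```python
# Source records stay as-is; the golden record is computed on demand
# from a survivorship rule, so nothing is ever purged and any merge
# can be rolled back by recomputing with different rules or sources.
from datetime import date

sources = [
    {"system": "CRM", "updated": date(2018, 5, 1),
     "values": {"phone": "+45 1111", "city": "Copenhagen"}},
    {"system": "ERP", "updated": date(2018, 9, 1),
     "values": {"phone": "+45 2222"}},
]

def virtual_golden_record(records: list[dict]) -> dict:
    golden: dict = {}
    for rec in sorted(records, key=lambda r: r["updated"]):
        golden.update(rec["values"])  # later (more recent) values win
    return golden

print(virtual_golden_record(sources))
# {'phone': '+45 2222', 'city': 'Copenhagen'} – while both phone
# values remain available in the untouched source records.
```

Swapping the recency rule for provenance-based weighting, or keeping several surviving values per attribute, only changes the compute step, never the stored sources.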

The Reltio platform allows survivorship rules to be customized in rulesets for an unlimited number of roles and personas, in effect supporting multiple personalized versions of the truth. In an operational MDM context this allows sales, marketing, compliance and other teams to see the data values that they care about most, while collaborating continuously in what Reltio calls the Self-Learning Enterprise.

Going beyond operational MDM