Is blockchain technology useful within MDM?

This question was raised on this blog back in January this year in the post Tough Questions About MDM.

Since then, the term blockchain has been used more and more, both in general and in relation to Master Data Management (MDM). As you know, we love fancy new terms in our otherwise boring industry.

However, there are good reasons to consider the blockchain approach when it comes to master data. A blockchain approach can be described as centralized consensus, which can be seen as the opposite of a centralized registry. Now that the MDM discipline has been around for more than a decade, most practitioners agree that a single source of truth is not practically achievable within an organization of a certain size. Moreover, in the age of business ecosystems, it will be even harder to achieve between trading partners.

This way of thinking is the backbone of the MDM venture I’m working on right now, called Product Data Lake. Yes, we love buzzwords. As if cloud computing, social network thinking, big data architecture and preparing for the Internet of Things weren’t enough, we can add the blockchain approach as a predicate too.

In Product Data Lake this approach is used to establish consensus about the information and digital assets related to a given product and, where it makes sense, to each instance of that product (a physical asset or thing). If you are interested in how that develops, why not follow Product Data Lake on LinkedIn?


Data Management Platforms for Business Ecosystems

The importance of looking at your enterprise as a part of business ecosystems was recently stressed by Gartner, the analyst firm, as reported in an article with the very long title stating: Gartner Says CIOs Need to Take a Leadership Role in Creating a Business Ecosystem to Drive a Digital Platform Strategy.

In my eyes, this trend will have a huge impact on how data management platforms should be delivered in the future. Until now, much of the methodology and technology for data management platforms has been limited to how these things are handled within the corporate walls. We will need a new breed of data management platforms built for business ecosystems.


Such platforms will share characteristics with other new approaches to handling data. They will resemble social networks, where you request and accept connections. They will embrace data the way big data and data lakes do, where not every purpose of data consumption is set in stone before the data is collected. These platforms will predominantly be based in the cloud.

Right now I am working on putting such a data management service up in the cloud. The aim is to support product data sharing for business ecosystems. I will welcome you, and your trading partners, as subscribers to the service. If you help trading partners with Product Information Management (PIM), there is a place for you as an ambassador. Anyway, please start by following Product Data Lake on LinkedIn.

Adding Things to Product Data Lake

Product Data Lake went live last month. Nevertheless, we are already planning the next big things in this cloud service for sharing product data. One of them is exactly things. Let me explain.

Product data is usually data about a product model, for example a certain brand and model of a pair of jeans, a certain brand and model of a drilling machine or a certain brand and model of a refrigerator. Handling product data at the model level within business ecosystems is hard enough, and it is the original reason for Product Data Lake's existence.


However, we are increasingly required to handle data about each instance of a product model. Some use cases I have come across are:

  • Serialization, which is the numbering and tracking of each physical product. We know this from the serial numbers on our laptops. Another example is how medicine packs will now be required to be serialized to prevent fraud, as described in the post Spectre vs James Bond and the Unique Product Identifier.
  • Asset management. Asset is kind of the fourth domain in Master Data Management (MDM) besides party, product and location, as touched on in the post Where is the Asset. Gartner, the analyst firm, also usually classifies product and asset together as thing, as opposed to party, in theory (and soon in practice in their magic quadrants). Anyway, in asset management you handle each physical instance of the product model.
  • Internet of Things (IoT) is, according to Wikipedia, the internetworking of physical devices, vehicles (also referred to as “connected devices” and “smart devices”), buildings and other items—embedded with electronics, software, sensors, actuators, and network connectivity that enable these objects to collect and exchange data.

Fulfilling the promise of IoT, and the related term Industry 4.0, certainly requires commonly understood master data, from the product model through serialization to asset management, as reported in the post Data Quality 3.0 as a stepping-stone on the path to Industry 4.0.


Data Warehouse vs Data Lake, Take 2

The differences between a data warehouse and a data lake have been discussed a lot, for example here and here.

To summarize, the main point in my eyes is this: in a data warehouse, the purpose and structure are determined before uploading data, while with a data lake the purpose and structure of the data can be determined before downloading it. As a result, a data warehouse is characterized by rigidity and a data lake by agility.
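The contrast can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the column names, the record fields and the function names are hypothetical and say nothing about how any particular warehouse or lake product works.

```python
import json

# Hypothetical warehouse schema, fixed before any data is uploaded.
WAREHOUSE_COLUMNS = ("gtin", "brand", "model")


def warehouse_upload(table, record):
    """Accept a record only if it matches the predefined structure."""
    if set(record) != set(WAREHOUSE_COLUMNS):
        raise ValueError(f"record does not fit schema {WAREHOUSE_COLUMNS}")
    table.append(tuple(record[column] for column in WAREHOUSE_COLUMNS))


def lake_upload(lake, record):
    """Store the record as it is; no structure is imposed at upload time."""
    lake.append(json.dumps(record))


def lake_download(lake, wanted_fields):
    """Apply the structure the consumer needs at download time."""
    rows = []
    for raw in lake:
        document = json.loads(raw)
        # Fields a provider never supplied simply come back as None.
        rows.append({field: document.get(field) for field in wanted_fields})
    return rows


if __name__ == "__main__":
    lake = []
    lake_upload(lake, {"gtin": "1", "brand": "Acme", "voltage": 18})
    lake_upload(lake, {"gtin": "2", "colour": "blue"})
    print(lake_download(lake, ["gtin", "colour"]))
```

The warehouse function rejects the second record above because it does not fit the agreed columns, while the lake takes both and lets each consumer decide which fields matter when reading.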

Agility is a good thing, but of course, you have to put some control on top of it as reported in the post Putting Context into Data Lakes.

Furthermore, there are some great opportunities in extending the use of the data lake concept beyond the traditional use of a data warehouse. You should think beyond using a data lake within a given organization and envision how you can share a data lake within your business ecosystem. Moreover, you should consider not only using the data lake for analytical purposes but also embarking on a mission to use a data lake for operational purposes.

The venture I am working on right now has this second take on a data lake. The Product Data Lake exists in the context of sharing product information between trading partners in an agile and process-driven way. The providers of product information, typically manufacturers and upstream distributors, upload product information according to the data management maturity level of their organization. This information may very well, for now, be stored according to traditional data warehouse principles. The receivers of product information, typically downstream distributors and retailers, download product information according to the data management maturity level of their organization. This information may very well, for now, end up in a data store organized by traditional data warehouse principles.

The other approaches I have seen for sharing product information between trading partners are built on a data-warehouse-like solution between the trading partners, with a high degree of consensus around purpose and structure. Such solutions are, in my eyes, only successful when narrowly restricted to a given industry, probably within a given geography and for a given span of time.

By utilizing the data lake concept in the exchange zone between trading partners you can share information according to your own pace of maturing in data management and take advantage of data sharing where it fits in your roadmap to digitalization. The business ecosystems where you participate are great sources of data for both analytical and operational purposes and we cannot wait until everyone agrees on the same purpose and structure. It only takes two to start the tango.


Connecting Product Information

In our current work with the Product Data Lake cloud service, we are introducing a new way to connect product information that is stored at two different trading partners.

When doing that we deal with three kinds of product attributes:

  • Product identification attributes
  • Product classification attributes
  • Product features

Product identification attributes

The most commonly used product identification attribute today is the GTIN (Global Trade Item Number). This numbering system developed from the UPC (Universal Product Code), which is most popular in North America, and the EAN (International Article Number, formerly European Article Number).

Besides this generally used system, there are heaps of industry-specific and geography-specific product identification systems.

In principle, every product in a given product data store should have a unique value in a product identification attribute.

In practice, attributes such as a manufacturer's model number and a product description are used to identify products too.
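The uniqueness principle is easy to check mechanically. Here is a minimal Python sketch, assuming products are held as plain dictionaries; the function name, the attribute name and the sample values are made up for illustration.

```python
from collections import Counter


def duplicate_identifiers(products, id_attribute):
    """Return identifier values that occur on more than one product."""
    counts = Counter(
        product[id_attribute]
        for product in products
        if product.get(id_attribute) is not None
    )
    return sorted(value for value, count in counts.items() if count > 1)


if __name__ == "__main__":
    # Made-up sample records; the identifier values are not real GTINs.
    products = [
        {"gtin": "A", "description": "Drilling machine"},
        {"gtin": "B", "description": "Refrigerator"},
        {"gtin": "A", "description": "Drilling machine, new packaging"},
    ]
    print(duplicate_identifiers(products, "gtin"))
```

Products without a value in the chosen attribute are skipped here; in a real data store you would probably want to report those as well.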

Product classification attributes

A product classification attribute says something about what kind of product we are talking about. Thus, a range of products in a given product data store will have the same value in a product classification attribute.

As with product identification, there is no single commonly used standard. Some popular cross-industry classification standards are UNSPSC (United Nations Standard Products and Services Code®) and eCl@ss, but many other standards exist too, as told in the post The World of Reference Data.

Besides the variety of standards, a further complexity is that these standards are published in versions over time. Even if two trading partners use the same standard, they may not use the same version, and they may have used various versions depending on when the product was on-boarded.

Product features

A product feature says something about a specific characteristic of a given product. Examples are general characteristics such as height, weight and colour, and characteristics specific to a given product classification, such as voltage for a power tool.

Again, there are competing standards for how to define, name and identify a given feature.

The Product Data Lake tagging approach

In the Product Data Lake we use a tagging system to typify product attributes. This tagging system helps with:

  • Linking products stored at two trading partners
  • Linking attributes used at two trading partners

A product identification attribute is tagged with = followed by the system and optionally the variant of the system used. Examples are ‘=GTIN’ for a Global Trade Item Number and ‘=GTIN-EAN13’ for a 13-digit EAN number. An industry- and geography-specific tag could be ‘=DKVVS’ for a Danish plumbing catalogue number (VVS nummer). ‘=MODEL’ is the tag for a model number and ‘=DESCRIPTION’ is the tag for the product description.

A product classification tag starts with a #. ‘#UNSPSC’ is for a United Nations Standard Products and Services Code, where ‘#UNSPSC-19’ indicates a given main version.

A product feature is tagged with the feature id, an @ and the feature (sometimes called property) standard. ‘EF123456@ETIM’ will be a specific feature in ETIM (an international standard for technical products). ‘ABC123@ECLASS’ is a reference to a property in eCl@ss.
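To show how such tags could be used for linking attributes, here is a minimal Python sketch. It is not the Product Data Lake implementation: the Tag class, parse_tag, link_attributes and the partners' attribute names are all hypothetical, and the parsing rules are only my reading of the examples above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tag:
    kind: str                         # "identification", "classification" or "feature"
    system: str                       # e.g. "GTIN", "UNSPSC", "ETIM"
    variant: Optional[str] = None     # e.g. "EAN13", or a version such as "19"
    feature_id: Optional[str] = None  # e.g. "EF123456" (features only)


def parse_tag(raw):
    """Parse a tag string written with the =, # and @ conventions."""
    raw = raw.strip()
    if raw[:1] in ("=", "#"):
        kind = "identification" if raw[0] == "=" else "classification"
        system, _, variant = raw[1:].partition("-")
        if not system:
            raise ValueError(f"tag has no system: {raw!r}")
        return Tag(kind, system, variant or None)
    feature_id, separator, system = raw.partition("@")
    if separator and feature_id and system:
        return Tag("feature", system, feature_id=feature_id)
    raise ValueError(f"unrecognized tag: {raw!r}")


def link_attributes(provider, receiver):
    """Pair attribute names of two partners that carry exactly the same tag."""
    receiver_by_tag = {parse_tag(tag): name for name, tag in receiver.items()}
    links = {}
    for name, tag in provider.items():
        match = receiver_by_tag.get(parse_tag(tag))
        if match is not None:
            links[name] = match
    return links


if __name__ == "__main__":
    # Hypothetical attribute names at two trading partners, mapped to tags.
    provider = {"EAN": "=GTIN-EAN13", "Feature A": "EF123456@ETIM", "Model no": "=MODEL"}
    receiver = {"Barcode": "=GTIN-EAN13", "Property 7": "EF123456@ETIM", "Category": "#UNSPSC-19"}
    print(link_attributes(provider, receiver))
```

Matching on exact tag equality is a deliberate simplification. With it, ‘=GTIN’ does not link to ‘=GTIN-EAN13’, and two versions of the same classification standard are treated as different tags, so a fuller solution would need rules for variants and versions.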


Data Quality 3.0 as a stepping-stone on the path to Industry 4.0

The title of this blog post is the topic of my international keynote at the Stammdaten Management Forum 2016 in Düsseldorf, Germany, on the 8th of November 2016. You can see the agenda for this conference, which starts on the 7th and ends on the 9th, here.

Data Quality 3.0 is a term I have used over the years here on the blog to describe how I see data quality, along with other disciplines within data management, changing. This change is about going from focusing on internal data stores and cleansing within them to focusing on external sharing of data and using your business ecosystem and third-party data to drastically speed up data quality improvement.

Industry 4.0 is the current trend of automation and data exchange in manufacturing technologies. When we talk about big data, most will agree that success with big data exploitation hinges on proper data quality within master data management. In my eyes, the same can be said about success with Industry 4.0. The data exchange that is the foundation of automation must be secured by commonly understood master data.

So this is the promising way forward: by using data exchange in business ecosystems you improve the data quality of your master data. This improved master data in turn ensures successful data exchange within Industry 4.0.


Ways of Sharing Product Data in Business Ecosystems

Sharing product data within business ecosystems of manufacturers, distributors, retailers and end users has grown dramatically in recent years, driven by the increased use of e-commerce and other customer self-service sales approaches.

At Product Data Lake we recently ran a survey about how companies share product data today. The figures are shown below:

[Figure: our survey]

The result shows that there are different approaches out there. Spreadsheets still rule the world, though in this survey they are closely followed by external data portals. Direct system-to-system approaches are also present, while supplier portals seem to be not that common.

At the Product Data Lake we aim to embrace those different approaches. Well, regarding the use of spreadsheets and digital asset files via email, our embrace is meant to be that of a constrictor snake. The Product Data Lake is the solution to end the hailstorms of spreadsheets with product data within cross-company supply chains.

For external data portals, the Product Data Lake offers the concept of a data reservoir. A data reservoir in the Product Data Lake can have an industry focus or a special focus on certain data elements, for example sustainability data as described in the post Sustainability Data in PIM.

Direct system-to-system exchange can be orchestrated through the Product Data Lake, and supplier portals can be served by the Product Data Lake. In that way, existing investments in those approaches, which are typically implemented to serve basic data elements shared with your top trading partners, can be supplemented by a method that caters for exchange with all your trading partners and covers all data elements and digital assets.


Emerging Database Technologies for Master Data

The MDM Landscape Q2 2016 from Information Difference is out. MDM vendors usually celebrate these yearly analyst reports with tweets and posts about their prominent position, like Informatica trailed by Stibo Systems for being in the top right corner and Agility Multichannel closely followed by Orchestra Networks for having the happiest customers.

But the market analysis and the trends observed are good stuff as well.

This year I noticed the trend in the underlying technology used by MDM vendors to store the master data. The report says: “Some vendors have also decided to cut their ties with the relational database platform that has traditionally been the core storage mechanism for master data. Certain types of analysis e.g. of relationships between data, can be well handled by other types of emerging databases, such as graph databases like Neo4J and NoSQL databases like MongoDB. One vendor has recently switched its underlying platform entirely away from relational, and others have similar plans.”

While we usually see graph databases and NoSQL databases as something to use for analytical purposes, the trend of moving master data platforms to these technologies implies that operational purposes will be based on these technologies too.

This is close to home for me, as the master data service I’m working on right now is based on storing data for operational purposes in MongoDB (in the cloud).


Sustainability Data in PIM

The collection of product data to be handled within PIM (Product Information Management) systems is ever increasing. End customers want more and more data to support purchase decisions.

This theme was pondered in the post Self-Service Ready Product Data.

One new kind of product data to be aware of in the future is information about sustainability measures related to a given product. This is information about the environmental impact and the social impact of producing and consuming a product.

As the founder of the Product Data Lake, a solution for exchanging product data in business ecosystems, I am very pleased that sustainability information will be included as an important kind of product data ready to be exchanged between trading partners.


This is due to a cooperation with Earth Accounting. The Product Data Lake will be an integrated part of the information cooperative. It will help forward-looking manufacturers provide their own sustainability measures along with all other kinds of product data, and progressive distributors and retailers can receive and eventually publish sustainability data along with all other self-service ready product data.
