Approaches to Sharing Product Information in Business Ecosystems

One of the most promising aspects of digitalization is sharing information in business ecosystems. In the Master Data Management (MDM) realm, we will in my eyes see a dramatic increase in the sharing of product information between trading partners, as touched upon in the post Data Quality 3.0 as a stepping-stone on the path to Industry 4.0.

Standardization (or standardisation)

A challenge in doing that is how we link the different ways of handling product information within each organization in business ecosystems. While everyone agrees that a common standard is the best answer, we must on the other hand accept that using a common standard for every kind of product and every piece of information needed is quite utopian. We do not even have a single agreed spelling of the term in English.

Also, we must foresee that one organization will mature at a different pace than another organization in the same business ecosystem.

Product Data Lake

These observations are the reasons behind the launch of Product Data Lake. In Product Data Lake we encompass the use of (in prioritized order):

  • The same standard in the same version
  • The same standard in different versions
  • Different standards
  • No standards

In order to link the product information and the formats and structures at two trading partners, we support the following approaches:

  • Automation based on product information tagged with a standard as explained in the post Connecting Product Information.
  • Ambassadorship, which is a role taken by a product information professional, who collaborates with the upstream and downstream trading partner in linking the product information. Read more about becoming a Product Data Lake ambassador here.
  • Upstream responsibility. Here the upstream trading partner makes the linking in Product Data Lake.
  • Downstream responsibility. Here the downstream trading partner makes the linking in Product Data Lake.

Data Governance

Regardless of the mix of the above approaches, you will need a cross company data governance framework to control the standards used and the rules that apply to the exchange of product information with your trading partners. Product Data Lake has established a partnership with one of the most recommended authorities in data governance: Nicola Askham – the Data Governance Coach.

For a quick overview please have a look at the Cross Company Data Governance Framework.

Please request more information here.

Sign Up is Open

Over the past year and a half, many of the posts on this blog have been about Product Data Lake, a cloud service for sharing product data in the business ecosystems of manufacturers, distributors, retailers and end users of product information.

From my work as a data quality and Master Data Management (MDM) consultant, I have seen the need for a service to solve data quality issues when it comes to product master data. My observation has been that the root cause of these issues is found in the way that trading partners exchange product information and digital assets.

It is the aim of Product Data Lake to ensure:

  • Completeness of product information by enabling trading partners to exchange product data in a uniform way
  • Timeliness of product information by connecting trading partners in a process driven way
  • Conformity of product information by encompassing various international standards for product information
  • Consistency of product information by allowing upstream trading partners and downstream trading partners to interact using their in-house structures of product information
  • Accuracy of product information by ensuring transparency of product information across the supply chain.

You can learn more about how Product Data Lake works on the documentation site.


Sign Up is open on www.productdatalake.com

Data Warehouse vs Data Lake, Take 2

The differences between a data warehouse and a data lake have been discussed a lot, as for example here and here.

To summarize, the main point in my eyes is this: in a data warehouse, the purpose and structure are determined before uploading data, while in a data lake, the purpose and structure of data can be determined before downloading data. As a result, a data warehouse is characterized by rigidity and a data lake by agility.
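This schema-on-write versus schema-on-read distinction can be sketched in a few lines of Python. The names and structures below are purely illustrative, not any vendor's API: the warehouse-style store validates structure at load time, while the lake-style store accepts anything and leaves structure to the reader.

```python
# Schema-on-write (data warehouse style): structure is enforced at load time.
WAREHOUSE_SCHEMA = {"sku", "name", "weight_kg"}

def load_into_warehouse(table, record):
    # Reject any record that does not fit the predetermined structure.
    if set(record) != WAREHOUSE_SCHEMA:
        raise ValueError(f"record does not match schema: {record}")
    table.append(record)

# Schema-on-read (data lake style): anything is accepted at load time ...
def load_into_lake(lake, record):
    lake.append(record)

# ... and the purpose and structure are decided when data is pulled out.
def read_from_lake(lake, wanted_fields):
    return [{f: r.get(f) for f in wanted_fields} for r in lake]

lake = []
load_into_lake(lake, {"sku": "A1", "name": "Drill", "voltage": "230V"})
load_into_lake(lake, {"sku": "B2", "colour": "red"})
rows = read_from_lake(lake, ["sku", "colour"])  # structure chosen at read time
```

The agility lies in the last line: two downstream consumers can read the same lake with two different `wanted_fields` lists, without agreeing on a shared schema up front.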

Agility is a good thing, but of course, you have to put some control on top of it, as reported in the post Putting Context into Data Lakes.

Furthermore, there are some great opportunities in extending the use of the data lake concept beyond the traditional use of a data warehouse. You should think beyond using a data lake within a given organization and envision how you can share a data lake within your business ecosystem. Moreover, you should consider not only using the data lake for analytical purposes but also commence on a mission to utilize a data lake for operational purposes.

The venture I am working on right now has this second take on a data lake. The Product Data Lake exists in the context of sharing product information between trading partners in an agile and process driven way. The providers of product information, typically manufacturers and upstream distributors, upload product information according to the data management maturity level of their organization. This information may very well, for now, be stored according to traditional data warehouse principles. The receivers of product information, typically downstream distributors and retailers, download product information according to the data management maturity level of their organization. This information may very well, for now, end up in a data store organized by traditional data warehouse principles.

The other approaches I have seen for sharing product information between trading partners are built on having a data-warehouse-like solution between the trading partners, with a high degree of consensus around purpose and structure. Such solutions are in my eyes only successful when restricted narrowly to a given industry, probably within a given geography, for a given span of time.

By utilizing the data lake concept in the exchange zone between trading partners you can share information according to your own pace of maturing in data management and take advantage of data sharing where it fits in your roadmap to digitalization. The business ecosystems where you participate are great sources of data for both analytical and operational purposes and we cannot wait until everyone agrees on the same purpose and structure. It only takes two to start the tango.

Connecting Product Information

In our current work with the Product Data Lake cloud service, we are introducing a new way to connect product information that is stored at two different trading partners.

When doing that we deal with three kinds of product attributes:

  • Product identification attributes
  • Product classification attributes
  • Product features

Product identification attributes

The most commonly used notion for a product identification attribute today is GTIN (Global Trade Item Number). This numbering system has developed from the UPC (Universal Product Code), most popular in North America, and the EAN (International Article Number, formerly European Article Number).

Besides this generally used system, there are heaps of industry and geographical specific product identification systems.

In principle, every product in a given product data store should have a unique value in a product identification attribute.

When identifying products in practice, attributes such as a manufacturer's model number and a product description are used too.
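One practical property of GTIN is that the identifier is machine-checkable. As an illustration, the last digit of a GTIN-13 (EAN-13) is a check digit computed with the standard GS1 mod-10 algorithm; the function names below are my own, not part of any standard library:

```python
def gtin13_check_digit(first_12_digits: str) -> int:
    """Compute the GS1 mod-10 check digit for a GTIN-13 (EAN-13).

    The twelve data digits are weighted 1, 3, 1, 3, ... from the left;
    the check digit brings the weighted sum up to a multiple of 10.
    """
    total = sum(int(d) * (1 if i % 2 == 0 else 3)
                for i, d in enumerate(first_12_digits))
    return (10 - total % 10) % 10

def is_valid_gtin13(gtin: str) -> bool:
    """True if the string is 13 digits and its check digit is correct."""
    return (len(gtin) == 13 and gtin.isdigit()
            and gtin13_check_digit(gtin[:12]) == int(gtin[12]))

valid = is_valid_gtin13("5901234123457")  # a commonly cited example number
```

A check like this catches single-digit typos in an exchanged identifier, which matters when the identification attribute is the linking key between two trading partners' data stores.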

Product classification attributes

A product classification attribute says something about what kind of product we are talking about. Thus, a range of products in a given product data store will have the same value in a product classification attribute.

As with product identification, there is no commonly used standard. Some popular cross-industry classification standards are UNSPSC (United Nations Products and Service Code®) and eCl@ss, but many other standards exist too, as told in the post The World of Reference Data.

Besides the variety of standards, a further complexity is that these standards are published in versions over time. Even if two trading partners use the same standard, they may not use the same version, and they may have used various versions depending on when a product was onboarded.

Product features

A product feature says something about a specific characteristic of a given product. Examples are general characteristics such as height, weight and colour, and specific characteristics within a given product classification, such as voltage for a power tool.

Again, there are competing standards for how to define, name and identify a given feature.

The Product Data Lake tagging approach

In the Product Data Lake we use a tagging system to typify product attributes. This tagging system helps with:

  • Linking products stored at two trading partners
  • Linking attributes used at two trading partners

A product identification attribute can be tagged starting with = followed by the system and optionally the variant of the system used. Examples are ‘=GTIN’ for a Global Trade Item Number and ‘=GTIN-EAN13’ for a 13-character EAN number. An industry- or geography-specific tag could be ‘=DKVVS’ for a Danish plumbing catalogue number (VVS nummer). ‘=MODEL’ is the tag for a model number and ‘=DESCRIPTION’ is the tag for the product description.

A product classification tag starts with a #. ‘#UNSPSC’ is for a United Nations Products and Service Code, where ‘#UNSPSC-19’ indicates a given main version.

A product feature is tagged with the feature id, an @ and the feature (sometimes called property) standard. ‘EF123456@ETIM’ is a specific feature in ETIM (an international standard for technical products). ‘ABC123@ECLASS’ is a reference to a property in eCl@ss.
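A tagging convention like the one above can be parsed mechanically, which is what makes automated linking possible. The sketch below is my own illustration of that idea, not Product Data Lake's actual implementation:

```python
def parse_tag(tag: str) -> dict:
    """Parse a Product Data Lake style attribute tag into a small dict.

    '=SYSTEM' or '=SYSTEM-VARIANT'   -> product identification attribute
    '#STANDARD' or '#STANDARD-VER'   -> product classification attribute
    'FEATUREID@STANDARD'             -> product feature
    """
    if tag.startswith("="):
        system, _, variant = tag[1:].partition("-")
        return {"kind": "identification", "system": system,
                "variant": variant or None}
    if tag.startswith("#"):
        standard, _, version = tag[1:].partition("-")
        return {"kind": "classification", "standard": standard,
                "version": version or None}
    if "@" in tag:
        feature_id, _, standard = tag.partition("@")
        return {"kind": "feature", "feature_id": feature_id,
                "standard": standard}
    raise ValueError(f"unrecognized tag: {tag}")
```

With tags reduced to a common structure like this, two trading partners' attributes can be matched on kind, system/standard and version, even when their in-house attribute names differ.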

Data Quality 3.0 as a stepping-stone on the path to Industry 4.0

The title of this blog post is a topic in my international keynote at the Stammdaten Management Forum 2016 in Düsseldorf, Germany on the 8th November 2016. You can see the agenda for this conference, which starts on the 7th and ends on the 9th, here.

Data Quality 3.0 is a term I have used over the years here on the blog to describe how I see data quality, along with other disciplines within data management, changing. This change is about going from focusing on internal data stores and cleansing within them to focusing on external sharing of data and using your business ecosystem and third party data to drastically speed up data quality improvement.

Industry 4.0 is the current trend of automation and data exchange in manufacturing technologies. When we talk about big data, most will agree that success with big data exploitation hinges on proper data quality within master data management. In my eyes, the same can be said about success with Industry 4.0. The data exchange that is the foundation of automation must be secured by commonly understood master data.

So this is the promising way forward: by using data exchange in business ecosystems, you improve the data quality of master data. This improved master data ensures successful data exchange within Industry 4.0.

Ways of Sharing Product Data in Business Ecosystems

Sharing product data within business ecosystems of manufacturers, distributors, retailers and end users has grown dramatically in recent years, driven by the increased use of e-commerce and other customer self-service sales approaches.

At Product Data Lake we recently had a survey about how companies share product data today. The figures were as seen below:

[Survey results chart]

The result shows that there are different approaches out there. Spreadsheets still rule the world, though in this survey they are closely followed by external data portals. Direct system-to-system approaches are also present, while supplier portals seem to be less common.

At the Product Data Lake we aim to embrace those different approaches. Well, regarding the use of spreadsheets and digital asset files via email, our embrace is meant to be that of a constrictor snake. The Product Data Lake is the solution to end the hailstorms of spreadsheets with product data within cross-company supply chains.

For external data portals, the Product Data Lake offers the concept of a data reservoir. A data reservoir in the Product Data Lake can be with an industry focus or with a special focus on certain data elements as for example sustainability data as described in the post Sustainability Data in PIM.

Direct system-to-system exchange can be orchestrated through the Product Data Lake, and supplier portals can be served by the Product Data Lake. In that way, existing investments in those approaches, which typically are implemented to share basic data elements with your top trading partners, can be supplemented by a method that caters for exchange with all your trading partners and covers all data elements and digital assets.

Launching too early or too late

Today the 28th August 2016 is one month away from the official launch of the Product Data Lake.

When to launch is an essential question for every start-up. Launching too early with an immature product is one common pitfall; launching too late with a complex product that does not fit the market is another.

At Product Data Lake we hope we have struck the right balance. You can see what we have chosen to put up in the cloud in this document.

Right now both the technical team at Larion in Ho Chi Minh City and the commercial team in Copenhagen are working hard to get the last details in place for the launch, which will happen as told on LinkedIn in the post Meet The Product Data Lake.

One thing we have in place is the company’s vehicle fleet. As you can see, this is in line with us being both environmentally and economically responsible.

[Image: Bicycles]

Ecommerce Suffers without Data Quality

Inadequate data quality is the enemy of any business. Proof of that for ecommerce too was revealed in a recent survey from the Danish E-commerce Association (FDIH). Over 7,000 respondents were asked if they would turn away from a web shop if the product information is incomplete or the product image is bad.

[FDIH survey chart]

52 % answered that they totally agree. 29 % more agreed, making it 81 % in all who would leave. 12 % were not sure, 4 % disagreed and 3 % totally disagreed.

The importance of the maintenance and publishing of adequate product information in order to support self-service sales approaches has been pondered on this blog many times as for example in the post Self-service Ready Product Data.

Having product images of good quality is a part of that, and in addition you often see missing product images, as reported in the post Image Coming Soon.

By the way: The root cause of incomplete product information and images is a lack of agile and process-driven sharing of this in business ecosystems. The remedy for that is the Product Data Lake, and we will be at the Danish E-Commerce Association event in Copenhagen on the 13th October 2016. More information about this event here.

Emerging Database Technologies for Master Data

The MDM Landscape Q2 2016 from Information Difference is out. MDM vendors usually celebrate these yearly analyst reports with tweets and posts about their prominent position, like Informatica trailed by Stibo Systems for being in the top right corner and Agility Multichannel closely followed by Orchestra Networks for having the happiest customers.

But the market analysis and the trends observed are good stuff as well.

This year I noticed the trend in the underlying technology used by MDM vendors to store the master data. The report says: “Some vendors have also decided to cut their ties with the relational database platform that has traditionally been the core storage mechanism for master data. Certain types of analysis e.g. of relationships between data, can be well handled by other types of emerging databases, such as graph databases like Neo4J and NoSQL databases like MongoDB. One vendor has recently switched its underlying platform entirely away from relational, and others have similar plans.”

While we usually see graph databases and NoSQL databases as something to use for analytical purposes, the trend of moving master data platforms to these technologies implies that operational purposes will be based on these technologies too.

This is close to me, as the master data service I’m working with right now is based on storing data for operational purposes in MongoDB (in the cloud).
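A minimal sketch of why a schema-less document model suits master data with varying attributes: each record carries only the features that make sense for its classification. The example below uses a plain in-memory list as a stand-in for a real MongoDB collection; with pymongo the calls would look much the same (`insert_one`, `find`), and the field names here are purely illustrative.

```python
# In-memory stand-in for a document collection; with pymongo this would be
# db.products.insert_one(...) and db.products.find(...) against MongoDB.
products = []

def insert_one(doc):
    products.append(doc)

def find(query):
    # Return all documents matching every key/value pair in the query,
    # mimicking a simple MongoDB equality query.
    return [d for d in products
            if all(d.get(k) == v for k, v in query.items())]

# Documents need not share a schema: a power tool has a voltage,
# a pen has a colour, and neither forces a column on the other.
insert_one({"gtin": "5901234123457", "class": "power tool", "voltage": "230V"})
insert_one({"gtin": "4006381333931", "class": "pen", "colour": "blue"})

matches = find({"class": "pen"})
```

For operational master data this flexibility is the point: new product classifications with new feature sets can be onboarded without migrating a relational schema first.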
