Five Product Classification Standards

When working with Product Master Data Management (MDM) and Product Information Management (PIM), one important facet is the classification of products. You can use your own internal classification(s), meaning product grouping and hierarchy management within your organization, and/or you can use one or several external classification standards.

Five External Standards

Some of the external standards I have come across are:

UNSPSC

The United Nations Standard Products and Services Code® (UNSPSC®), managed by GS1 US™ for the UN Development Programme (UNDP), is an open, global, multi-sector standard for classification of products and services. This standard is often used in public tenders and at some marketplaces.

GPC

GS1 has created a separate classification standard named GPC (Global Product Classification) for use within its data synchronization network, the Global Data Synchronization Network (GDSN).

Commodity Codes / Harmonized System (HS) Codes

Commodity codes, lately being worldwide harmonized and harmonised, represent the key classifier in international trade. They determine customs duties, import and export rules and restrictions as well as documentation requirements. National statistical bureaus may require these codes from businesses doing foreign trade.

eClass

eCl@ss is a cross-industry product data standard for the classification and description of products and services, with an emphasis on being an ISO/IEC compliant industry standard nationally and internationally. The classification guides the eCl@ss standard for product attributes (in eCl@ss called properties) that are needed for a product with a given classification.

ETIM

ETIM develops and manages a worldwide uniform classification for technical products. This classification guides the ETIM standard for product attributes (in ETIM called features) that are needed for a product with a given classification.

The Competition and The Neutral Hub

If you click on the links to some of these standards you may notice that they are actually competing against each other in the way they represent themselves.

At Product Data Lake we are the neutral hub in the middle of everyone. We cover everything from your internal grouping and tagging to any external standard. Our roadmap includes closer integration with the various external standards, embracing both product classification and product attribute requirements in multiple languages where provided. We do that with the aim of letting you exchange product information with your trading partners, who probably do the classification differently from you.

Data Born Companies and the Rest of Us

This post is a new feature on this blog: guest blogging by data management professionals from all over the world. First up is Harri Juntunen, Partner at Twinspark Consulting in Finland:

Data and the clever use of data in business have had, and will have, a significant impact on value creation in the next decade. That is beyond reasonable doubt. What is less clear is how this is going to happen. Before we answer the question, I think it is meaningful to make a conceptual distinction between data born companies and the rest of us.

Data born companies are companies that were conceived from data. Their business models are based on monetising the clever use of data. They have organised everything from their customer service to their operations to be capable of maximally harnessing data. Data and the capability to use data to create value are their core competency. These companies are the giants of the data business: Google, Facebook, Amazon, Uber, Airbnb. They are the standard small talk topics in data professionals’ discussions.

However, most companies are not data born. Most companies were originally established to serve a different purpose. They were founded to serve physical needs and to physically maintain things, be it food, spare parts or factories. Obviously, all of these companies in e.g. manufacturing and maintenance of physical things need data to operate. Yet, these companies are not organised around the principles of data born companies, with the capability to harness data as the driving force of their businesses.

We hear a lot of stories and successful examples about how data born companies apply augmented intelligence and other recent technology achievements. Surely, technologies built around data are important. The key question to me is: what, in practice, is our capability to harness all of these opportunities in companies that are not data born?

In my daily practice I see Excel sheets floating around within and between companies. I see a lot of manual work caused by unstandardised data, poor governance and bad data quality. Manual data work simply prevents companies from harnessing the capabilities created by data born companies. Yet, most companies follow the data born track without sufficient reflection. They adopt the latest technologies used by the data born companies. They repeat the same slogans: automation, advanced analytics, cognitive computing etc. And yet, they are not addressing the fundamental and mundane issues in their own capabilities to do business and create value with data. Humans are doing the machine’s job.

Why? Many things relate to this, but data quality and standardization are still pressing problems in everyday practice in many companies, let alone between companies. We can change this. The rest of us can be reborn from data just by taking a good look at our mundane data practices instead of aspiring to go for the next big thing.

P.S. The Google Brain team held a Reddit AMA a while ago and was asked “What do you think is underrated?”

The answer:

“Focus on getting high-quality data. “Quality” can translate to many things, e.g. thoughtfully chosen variables or reducing noise in measurements. Simple algorithms using higher-quality data will generally outperform the latest and greatest algorithms using lower-quality data.”

https://www.reddit.com/r/MachineLearning/comments/4w6tsv/ama_we_are_the_google_brain_team_wed_love_to/

About Harri Juntunen:

Harri is a seasoned data provocateur and an ardent advocate of getting the basics right. Harri says: People and data first, technology will follow.

You can contact Harri here:

+358 50 306 9296

harri.juntunen@twinspark.fi

www.twinspark.fi

 

Approaches to Sharing Product Information in Business Ecosystems

One of the most promising aspects of digitalization is sharing information in business ecosystems. In the Master Data Management (MDM) realm, we will in my eyes see a dramatic increase in sharing product information between trading partners, as touched upon in the post Data Quality 3.0 as a stepping-stone on the path to Industry 4.0.

Standardization (or standardisation)

A challenge in doing that is how we link the different ways of handling product information within each organization in a business ecosystem. While everyone agrees that a common standard is the best answer, we must on the other hand accept that using a common standard for every kind of product and every piece of information needed is quite utopian. We do not even have a common spelling of the term in English.

Also, we must foresee that one organization will mature at a different pace than another organisation in the same business ecosystem.

Product Data Lake

These observations are the reasons behind the launch of Product Data Lake. In Product Data Lake we encompass the use of (in prioritized order):

  • The same standard in the same version
  • The same standard in different versions
  • Different standards
  • No standards
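
As an illustration only, the four cases above could be told apart in code roughly as follows. This is a minimal sketch and not how Product Data Lake is implemented: the names Classification, LinkCase and link_case are hypothetical, the version labels in the usage lines are illustrative, and treating "no standards" as "at least one partner has no standard" is my assumption.

```python
from enum import IntEnum
from typing import NamedTuple, Optional


class Classification(NamedTuple):
    """The classification standard and version used by one trading partner."""
    standard: str  # e.g. "UNSPSC", "ETIM", "ECLASS"
    version: str   # version label as used by that partner


class LinkCase(IntEnum):
    """The four cases from the list; a lower value means a higher priority."""
    SAME_STANDARD_SAME_VERSION = 1
    SAME_STANDARD_DIFFERENT_VERSION = 2
    DIFFERENT_STANDARDS = 3
    NO_STANDARD = 4


def link_case(upstream: Optional[Classification],
              downstream: Optional[Classification]) -> LinkCase:
    """Decide which of the four cases applies to a pair of trading partners."""
    # Assumption: if either partner has no standard, the pair is "no standards".
    if upstream is None or downstream is None:
        return LinkCase.NO_STANDARD
    if upstream.standard.upper() != downstream.standard.upper():
        return LinkCase.DIFFERENT_STANDARDS
    if upstream.version != downstream.version:
        return LinkCase.SAME_STANDARD_DIFFERENT_VERSION
    return LinkCase.SAME_STANDARD_SAME_VERSION


# Usage with illustrative version labels
supplier = Classification("UNSPSC", "19")
retailer = Classification("UNSPSC", "20")
print(link_case(supplier, retailer).name)
```

The lower the case number, the more of the linking can be expected to be automated; the higher the number, the more the linking relies on people.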

In order to link the product information and the formats and structures at two trading partners, we support the following approaches:

  • Automation based on product information tagged with a standard as explained in the post Connecting Product Information.
  • Ambassadorship, which is a role taken by a product information professional, who collaborates with the upstream and downstream trading partner in linking the product information. Read more about becoming a Product Data Lake ambassador here.
  • Upstream responsibility. Here the upstream trading partner makes the linking in Product Data Lake.
  • Downstream responsibility. Here the downstream trading partner makes the linking in Product Data Lake.

Data Governance

Regardless of the mix of the above approaches, you will need a cross company data governance framework to control the standards used and the rules that apply to the exchange of product information with your trading partners. Product Data Lake has established a partnership with one of the most recommended authorities in data governance: Nicola Askham, the Data Governance Coach.

For a quick overview please have a look at the Cross Company Data Governance Framework.

Please request more information here.

Data Warehouse vs Data Lake, Take 2

The differences between a data warehouse and a data lake have been discussed a lot, for example here and here.

To summarize, the main point in my eyes is this: in a data warehouse the purpose and structure are determined before uploading data, while in a data lake the purpose and structure can be determined later, before downloading data. As a result, a data warehouse is characterized by rigidity and a data lake by agility.
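
To make the distinction concrete, here is a minimal sketch in Python. It is a toy model under my own assumptions, not a description of any real product: the classes Warehouse and Lake, the REQUIRED_FIELDS schema and the sample record are all hypothetical.

```python
import json

# Hypothetical warehouse schema, decided before any data is uploaded.
REQUIRED_FIELDS = ("gtin", "description")


class Warehouse:
    """Structure is fixed before upload: records that do not fit are rejected."""

    def __init__(self):
        self.rows = []

    def upload(self, record: dict) -> None:
        missing = [f for f in REQUIRED_FIELDS if f not in record]
        if missing:
            raise ValueError(f"record rejected, missing fields: {missing}")
        # Only the agreed fields are kept; everything else is dropped.
        self.rows.append({f: record[f] for f in REQUIRED_FIELDS})


class Lake:
    """Records are stored as they are; structure is chosen at download time."""

    def __init__(self):
        self.blobs = []

    def upload(self, record: dict) -> None:
        self.blobs.append(json.dumps(record))

    def download(self, fields):
        # The consumer decides which fields matter when reading.
        for blob in self.blobs:
            record = json.loads(blob)
            yield {f: record.get(f) for f in fields}


record = {"gtin": "4006381333931", "colour": "blue"}
lake = Lake()
lake.upload(record)                       # accepted as-is
print(list(lake.download(["gtin", "colour"])))
```

The same record would be rejected by Warehouse.upload because it lacks a description. That is the rigidity, and it is also the control you give up unless you put some context on top of the lake.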

Agility is a good thing, but of course, you have to put some control on top of it as reported in the post Putting Context into Data Lakes.

Furthermore, there are some great opportunities in extending the use of the data lake concept beyond the traditional use of a data warehouse. You should think beyond using a data lake within a given organization and envision how you can share a data lake within your business ecosystem. Moreover, you should consider using the data lake not only for analytical purposes but also for operational purposes.

The venture I am working on right now has this second take on a data lake. The Product Data Lake exists in the context of sharing product information between trading partners in an agile and process driven way. The providers of product information, typically manufacturers and upstream distributors, upload product information according to the data management maturity level of that organization. This information may very well for now be stored according to traditional data warehouse principles. The receivers of product information, typically downstream distributors and retailers, download product information according to the data management maturity level of that organization. This information may very well for now end up in a data store organized by traditional data warehouse principles.

The other approaches I have seen for sharing product information between trading partners are built on having a data warehouse like solution between the trading partners, with a high degree of consensus around purpose and structure. Such solutions are in my eyes only successful when restricted narrowly to a given industry, probably within a given geography and for a given span of time.

By utilizing the data lake concept in the exchange zone between trading partners you can share information according to your own pace of maturing in data management and take advantage of data sharing where it fits in your roadmap to digitalization. The business ecosystems where you participate are great sources of data for both analytical and operational purposes and we cannot wait until everyone agrees on the same purpose and structure. It only takes two to start the tango.

Connecting Product Information

In our current work with the Product Data Lake cloud service, we are introducing a new way to connect product information that is stored at two different trading partners.

When doing that we deal with three kinds of product attributes:

  • Product identification attributes
  • Product classification attributes
  • Product features

Product identification attributes

The most commonly used product identification attribute today is the GTIN (Global Trade Item Number). This numbering system has developed from the UPC (Universal Product Code), which is most popular in North America, and the EAN (International Article Number, formerly European Article Number).
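
One practical consequence of using GTIN is that the last digit is a check digit, so an obviously mistyped number can be caught before two trading partners try to link on it. The sketch below follows the GS1 modulo 10 calculation as I understand it; the function names are my own, and it checks the form of the number only, not whether the GTIN has actually been assigned to a product.

```python
def gtin_check_digit(body: str) -> int:
    """Compute the check digit for a GTIN body (all digits except the last)."""
    if not (body.isascii() and body.isdigit()):
        raise ValueError("GTIN body must contain digits only")
    total = 0
    # Weights alternate 3, 1, 3, ... starting from the rightmost body digit.
    for position, char in enumerate(reversed(body)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """Return True for an 8, 12, 13 or 14 digit code with a correct check digit."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (8, 12, 13, 14):
        return False
    return gtin_check_digit(code[:-1]) == int(code[-1])


print(is_valid_gtin("4006381333931"))  # a 13 digit EAN style number
print(is_valid_gtin("036000291452"))   # a 12 digit UPC style number
```

Because the weights are counted from the right, the same function works for the 8, 12, 13 and 14 digit variants.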

Besides this generally used system, there are heaps of industry-specific and geography-specific product identification systems.

In principle, every product in a given product data store should have a unique value in a product identification attribute.

In practice, attributes such as a model number at a given manufacturer and a product description are also used to identify products.

Product classification attributes

A product classification attribute says something about what kind of product we are talking about. Thus, a range of products in a given product data store will have the same value in a product classification attribute.

As with product identification, there is no single commonly used standard. Some popular cross-industry classification standards are UNSPSC (United Nations Standard Products and Services Code®) and eCl@ss, but many other standards exist too, as told in the post The World of Reference Data.

Besides the variety of standards, a further complexity is that these standards are published in versions over time. Even if two trading partners use the same standard, they may not use the same version, and each of them may have used various versions depending on when a product was on-boarded.

Product features

A product feature says something about a specific characteristic of a given product. Examples are general characteristics such as height, weight and colour, and characteristics specific to a given product classification, such as voltage for a power tool.

Again, there are competing standards for how to define, name and identify a given feature.

The Product Data Lake tagging approach

In the Product Data Lake we use a tagging system to typify product attributes. This tagging system helps with:

  • Linking products stored at two trading partners
  • Linking attributes used at two trading partners

A product identification attribute can be tagged starting with = followed by the system and optionally the variant of the system used. Examples are ‘=GTIN’ for a Global Trade Item Number and ‘=GTIN-EAN13’ for a 13-digit EAN number. An industry and geography specific tag could be ‘=DKVVS’ for a Danish plumbing catalogue number (VVS nummer). ‘=MODEL’ is the tag of a model number and ‘=DESCRIPTION’ is the tag of the product description.

A product classification tag starts with a #. ‘#UNSPSC’ is for a United Nations Standard Products and Services Code, where ‘#UNSPSC-19’ indicates a given main version.

A product feature is tagged with the feature id, an @ and the feature (sometimes called property) standard. ‘EF123456@ETIM’ will be a specific feature in ETIM (an international standard for technical products). ‘ABC123@ECLASS’ is a reference to a property in eCl@ss.
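
To show how such tags can be read by a machine, here is a minimal parser sketch. It is based only on the examples above and is not the actual Product Data Lake implementation: the names Tag and parse_tag are hypothetical, and I assume that a hyphen always separates the system or standard from its variant or version.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    kind: str                 # "identification", "classification" or "feature"
    standard: str             # system or standard, e.g. "GTIN", "UNSPSC", "ETIM"
    qualifier: Optional[str]  # variant, version or feature id, if present


def parse_tag(tag: str) -> Tag:
    """Split a tag into its kind, its standard and an optional qualifier."""
    tag = tag.strip()
    if tag.startswith("=") or tag.startswith("#"):
        kind = "identification" if tag[0] == "=" else "classification"
        # Assumption: the first hyphen separates the standard from variant/version.
        standard, _, qualifier = tag[1:].partition("-")
        if not standard:
            raise ValueError(f"tag has no system or standard: {tag!r}")
        return Tag(kind, standard, qualifier or None)
    # Feature tags have the form <feature id>@<standard>.
    feature_id, separator, standard = tag.partition("@")
    if separator and feature_id and standard:
        return Tag("feature", standard, feature_id)
    raise ValueError(f"unrecognised tag: {tag!r}")


for example in ("=GTIN-EAN13", "=MODEL", "#UNSPSC-19", "EF123456@ETIM"):
    print(example, "->", parse_tag(example))
```

With tags parsed like this, two attributes at two trading partners can be compared on kind and standard first, and on variant or version second.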

Putting Context into Data Lakes

The term data lake has become popular along with the rise of big data. A data lake is a new way of storing data that is more agile than what we have been used to in data warehouses. This is mainly based on the principle that you should not have to think through every way of consuming data before storing the data.

This agility is also the main reason for fear around data lakes. A possible lack of control and standardization leads to warnings that a data lake will quickly turn into a data swamp.

In my eyes we need solutions built on the data lake concept if we want business agility, and we do want that. But I also believe that we need to put the data in data lakes in context.

Fortunately, there are many examples of movements in that direction. A recent article called The Informed Data Lake: Beyond Metadata by Neil Raden has a lot of good arguments around a better context driven approach to data lakes.

As reported in the post Multi-Domain MDM 360 and an Intelligent Data Lake the data management vendor Informatica is on that track too.

In all humility, my vision for data lakes is that a context driven data lake can serve purposes beyond analytical use within a single company and become a driver for business agility within business ecosystems like cross company supply chains, as expressed in the LinkedIn Pulse post called Data Lakes in Business Ecosystems.
