As reported in the post Gravitational Waves in the MDM World, there is a tendency in the MDM (Master Data Management) market and in MDM programmes in general to encompass both the party domain and the product domain.
The party domain is still often treated as two separate domains: the vendor (or supplier) domain and the customer domain. However, there are good reasons for seeing the intersection of vendor master data and customer master data as party master data. These reasons are most obvious when we look at the B2B (business-to-business) part of our master data, because:
You will always find that many real-world entities have a vendor role as well as a customer role in relation to you
The basic master data has the same structure (identification, names, addresses and contact data)
You need the same third-party validation and enrichment capabilities for customer roles and vendor roles.
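The point about one real-world entity playing both roles can be sketched as a small data model. This is a minimal illustration only; the class and field names are invented for this sketch and not taken from any specific MDM product.

```python
from dataclasses import dataclass, field

@dataclass
class Party:
    """One golden record per real-world entity; roles are attached, not duplicated."""
    party_id: str        # internal golden-record key (illustrative)
    legal_name: str
    address: str
    roles: set = field(default_factory=set)  # e.g. {"customer", "vendor"}

# The same company is stored once, even though it both sells to us
# (vendor role) and buys from us (customer role):
acme = Party("P-0001", "Acme Corp", "1 Main Street, Springfield",
             roles={"customer", "vendor"})

print(acme.roles == {"customer", "vendor"})  # True
```

Because identification, names, addresses and contact data live on the shared party record, third-party validation and enrichment only has to run once per entity, regardless of how many roles it holds.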
When we look at the product domain we also have a huge need to connect the buy side and the sell side of our business – and the make side for that matter where we have in-house production.
Multi-Domain MDM has, so to speak, the side effect of bringing the sell-side together with the buy- and make-side. PIM (Product Information Management), which we often see as the ancestor of product MDM, has the same challenge. Here we also need to bring the sell-side and the buy-side together – on three frontiers:
Bringing the internal buy-side and sell-side together, not least when looking at product hierarchies
Bringing our buy-side into synchronization with our upstream vendors'/suppliers' sell-side when it comes to product data
Bringing our sell-side into synchronization with our downstream customers' buy-side when it comes to product data
The Product Data Lake is a cloud service for sharing product data in the eco-systems of manufacturers, distributors, retailers and end users of product information.
As an upstream provider of product data, being a manufacturer or upstream distributor, you have these requirements:
When you introduce new products to the market, you want to make the related product data and digital assets available to your downstream partners in a uniform way
When you win a new downstream partner you want the means to immediately and professionally provide product data and digital assets for the agreed range
When you add new products to an existing agreement with a downstream partner, you want to be able to provide product data and digital assets instantly and effortlessly
When you update your product data and related digital assets, you want a fast and seamless way of pushing it to your downstream partners
When you introduce a new product data attribute or digital asset type, you want a fast and seamless way of pushing it to your downstream partners.
The Product Data Lake facilitates these requirements by letting you push your product data into the lake in your in-house structure, which may or may not be fully or partly compliant with an international standard.
As an upstream provider, you may want to push product data and digital assets from several different internal sources.
The Product Data Lake tackles this requirement by letting you operate several upload profiles.
As a downstream receiver of product data, being a downstream distributor, retailer or end user, you have these requirements:
When you engage with a new upstream partner, you want the means to quickly and seamlessly link and transform product data and digital assets for the agreed range from the upstream partner
When you add new products to an existing agreement with an upstream partner, you want to be able to link and transform product data and digital assets in a fast and seamless way
When your upstream partners update their product data and related digital assets, you want to be able to receive the updated product data and digital assets instantly and effortlessly
When you introduce a new product data attribute or digital asset type, you want a fast and seamless way of pulling it from your upstream partners
If you have a backlog of product data and digital asset collection with your upstream partners, you want a fast and cost-effective approach to backfill the gap.
The Product Data Lake facilitates these requirements by letting you pull your product data from the lake in your in-house structure, which may or may not be fully or partly compliant with an international standard.
In the Product Data Lake, you can take the role of being an upstream provider and a downstream receiver at the same time by being a midstream subscriber to the Product Data Lake. Thus, Product Data Lake covers the whole supply chain from manufacturing to retail and even the requirements of B2B (Business-to-Business) end users.
The Product Data Lake uses the data lake concept known from big data by letting the transformation and linking of data between many structures be done when the data are consumed for the first time. The goal is that the workload in this system resembles an iceberg, where 10% of the ice is above water and 90% is below. In the Product Data Lake, manually setting up the links and transformation rules should be 10% of the work, while the remaining 90% is automated in the exchange zones between trading partners.
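The exchange-zone idea can be sketched in a few lines: the upstream partner pushes data in its own structure, the downstream partner defines a mapping once (the manual share of the work), and every later record flows through that mapping automatically. All attribute names and values below are invented for illustration; this is not Product Data Lake's actual interface.

```python
# The upstream partner's in-house structure (illustrative German-style names):
upstream_record = {"ArtikelNr": "4711", "Farbe": "rot", "GewichtKg": "1.2"}

# One-off manual work: link upstream attribute names to the downstream model
# and attach value-level transforms where needed.
mapping = {"ArtikelNr": "sku", "Farbe": "colour", "GewichtKg": "weight_kg"}
transforms = {"colour": {"rot": "red"}.get, "weight_kg": float}

def consume(record, mapping, transforms):
    """Apply the stored mapping and transforms when data is consumed."""
    out = {}
    for src, value in record.items():
        dst = mapping[src]                 # rename to the downstream attribute
        fn = transforms.get(dst)           # optional value transformation
        out[dst] = fn(value) if fn else value
    return out

print(consume(upstream_record, mapping, transforms))
# {'sku': '4711', 'colour': 'red', 'weight_kg': 1.2}
```

Once `mapping` and `transforms` exist for a trading partner pair, every subsequent record from that partner is handled without manual effort, which is where the 90% automated share of the iceberg comes from.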
If you have ever visited some of the many castles around Europe, you may have noticed that there are many architectural similarities. You may also compare the basic structure of a castle with how we can imagine the data architecture related to Product Information Management (PIM).
In my vision of a product information castle, there is a main building with five floors of product information. There is a basement for pricing information, where we will often find the valuable things such as the crown jewels and other treasures. The hierarchy tower combines the pricing information and the different levels of product information. Besides the main castle, we find the logistics stables.
Hierarchy, pricing and logistics are part of the whole picture
What we do not see in this figure is the product lifecycle management wall around the castle area.
Now, let us get back to the main building and examine what is on each of the floors in the building.
Ground PIM level: Basic product data
On the ground level, we find the basic product data that is typically the minimum required for creating a product in any system of record. Here we find the primary product identification number or code that is the internal key to all other product data structures and transactions related to the product. Then there is usually a short product description. This description helps internal employees identify a product and distinguish it from other products. If an upstream trading partner produces the product, we may find the identification of that supplier here. If the product is part of internal production, we may have a material type indicating whether it is a raw material, a semi-finished product, a finished good or packing material.
Except for semi-finished products, we may find more things on the next floor.
PIM level 2: Product trade data
This level has product data related to trading the product. We may have a unique Global Trade Item Number (GTIN) that may be in the form of an International Article Number (EAN) or a Universal Product Code (UPC). Here we have commodity codes and a lot of other product data that supports buying, receiving, selling and delivering the product.
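The GTIN family mentioned above shares one check-digit scheme defined by GS1: starting from the rightmost digit of the number's body, digits are multiplied alternately by 3 and 1, summed up, and the check digit is the difference to the next multiple of ten. A small sketch:

```python
def gtin_check_digit(body: str) -> int:
    """GS1 check digit for a GTIN body (all digits except the last one).

    Works for EAN-13, UPC-A and GTIN-14 bodies alike, since they share
    the same weighting scheme.
    """
    total = sum(int(d) * (3 if i % 2 == 0 else 1)
                for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10

print(gtin_check_digit("400638133393"))  # 1 -> full EAN-13 is 4006381333931
print(gtin_check_digit("03600029145"))   # 2 -> full UPC-A is 036000291452
```

Validating incoming trade item numbers this way is a cheap data quality check on level 2, catching mistyped or truncated identifiers before they propagate into transactions.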
Most castles were not built in one go. Many castles started modestly with maybe just two floors and a tiny tower. In the same way, our product information management solutions for finished and trading goods are usually built on top of an older ERP solution holding the basic and trading data.
PIM Level 3: Basic product recognition data
On the third level, we find the two grand ballrooms of product information. These ballrooms were introduced when eCommerce started to grow up.
The extended product description is needed because the usual short product description used internally has no meaning to an outsider, as told in the post Customer Friendly Product Master Data. Some good best practices for governing the extended product description are to have a common structure for how the description is written, not to use abbreviations, and to have a strict vocabulary, as reported in the post Toilet Seats and Data Quality.
Having a product image is pivotal if you want to sell something without showing the real product face-to-face with the customer or other end user. A missing product image is a sign of a broken business process for collecting product data as pondered in the post Image Coming Soon.
PIM Level 4: Self-service product data
On the fourth level, we have three main chambers: product attributes, basic product relations and standard digital assets. These data are the foundation of customer self-service and should, unless you are the manufacturer, be collected from the manufacturer via supplier self-service.
Product attributes are also sometimes called product properties or product features. These are up to thousands of different data elements that describe a product. Some are very common for most products, like height, length, weight and colour. Some are very specific to the product category. This challenge is actually the reason of being for dedicated Product Information Management (PIM) solutions, as told in the post MDM Tools Revealed.
Basic product relations are the links between a product and other products, like a product that has several different accessories that go with it, or a product being the successor of another, now decommissioned, product.
Standard digital assets are documents like installation guides, line drawings and data sheets as examined in the post Digital Assets and Product MDM.
PIM Level 5: Competitive product data
On the upper fifth floor, we find elements like those on the fourth floor, but usually these are elements that you will not apply to all products, only to your top products, where you want to stand out from the crowd and distance yourself from your competitors.
Special content is descriptions of and stories about the product beyond the hard features. Here you tell why the product is better than other products and in which circumstances the product can be used. A common aim with these descriptions is also Search Engine Optimization (SEO).
X-sell (cross-sell) and up-sell product relations apply to your particular mix of products and may be made subjective, for example by looking at up-sell from a profit margin point of view. X-sell and up-sell relations may be defined from upstream by you or your upstream trading partners, but may also drip down on the roof from the behaviour of your downstream trading partners/customers, as manifested in the classic webshop message: "Those who bought product A also bought / looked at product B".
Advanced digital assets are broader and livelier material than the hard-fact line drawings and other documents. Increasingly, newer digital media types such as video are used for this purpose.
All in all the rooftop takes us to the upper side of the cloud.
Product Information Management (PIM) has over recent years emerged as an important technology-enabled discipline for every company taking part in a supply chain. These companies are manufacturers, distributors, retailers and large end users of tangible products requiring a drastically increased variety of product data to be used in eCommerce and other self-service based ways of doing business.
At the same time, we have seen the rise of big data. Now, if you look at every single company, product data handled by PIM platforms perhaps does not count as big data. Sure, the variety is a huge challenge and the reason of being for PIM solutions, as they handle this variety better than traditional Master Data Management (MDM) solutions and ERP solutions.
The variety is about very different requirements in data quality dimensions based on where a given product sits in the product hierarchy. Measuring completeness has to be done for the concrete levels in the hierarchy, as a given attribute may be mandatory for one product but absolutely ridiculous for another. An example is voltage for a power tool versus a hammer. With consistency, there may be attributes with common standards (for example product name), but many attributes will have specific standards for a given branch in the hierarchy.
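The completeness point can be made concrete: each branch of the hierarchy carries its own set of mandatory attributes, and completeness is measured against that set, not against a global attribute list. The categories, attribute names and products below are invented for this sketch.

```python
# Mandatory attributes per hierarchy branch (illustrative):
MANDATORY = {
    "power tools": {"name", "weight_kg", "voltage"},
    "hand tools":  {"name", "weight_kg"},   # no voltage for a hammer
}

def completeness(product: dict, category: str) -> float:
    """Share of the category's mandatory attributes that are filled in."""
    required = MANDATORY[category]
    filled = sum(1 for attr in required if product.get(attr) not in (None, ""))
    return filled / len(required)

drill  = {"name": "Drill X", "weight_kg": 1.8}   # voltage missing
hammer = {"name": "Hammer Y", "weight_kg": 0.6}

print(completeness(drill, "power tools"))   # ~0.67: incomplete for its branch
print(completeness(hammer, "hand tools"))   # 1.0: complete, voltage never required
```

The same two filled-in attributes score differently in the two branches, which is exactly why a single global completeness percentage is misleading for product data.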
Product information also encompasses digital assets: PDF files with product sheets, line drawings and lots of other stuff, product images and an increasing amount of videos with installation instructions and other content. The volume is then already quite big.
A missing product image is a sign of a broken product data business process
Volume and velocity really come into play when we look at eco-systems of manufacturers, distributors and retailers. The total flow of product data can then be described with the common characteristics of big data: volume, velocity and variety. Even if you look at it for a given company and its first degree of separation with trading partners, we are talking about big data, where there is an overwhelming throughput of new product links between trading partners and updates to product information that are – or not least should have been – exchanged.
Within big data we have the concept of a data lake. A key success factor of a data lake solution is minimizing the use of spreadsheets. In the same way, we can use a data lake, sitting in the exchange zone between trading partners, for product information as elaborated further in the post Gravitational Collapse in the PIM Space.
When following articles, blog posts and other inspirational stuff in the data management realm, you frequently stumble upon sayings about a unique angle on what it is all about, like:
It is all about people, meaning that if you can change and control the attitude of the people involved in data management, everything will be just fine. The problem is that people have been around for thousands of years and we have not nailed that one yet – and probably will not do so in isolation within the data management realm. But sure, a lot of consultancy fees will still go down that drain.
It is all about processes. Yes it is. The only problem is that processes are dependent on people and technology.
It is all about technology. Well, no one actually says so. However, relying on that sentiment – and that shit does happen – is a frequent reason why data management initiatives go wrong.
The trick is to find a balance between a worthwhile people-focused approach, a heartfelt process way of going forward and a solid methodology for exploiting technology in the good cause of better data management, all aligned with achieving business benefits.
The previous post on this blog was called Gravitational Waves in the MDM World. Building further on space science, I would like to use the concept of gravitational collapse, which is the process that happens when a star or other space object is born. In this process, a myriad of smaller objects is gathered into a denser object.
PIM (Product Information Management) is part of the larger MDM (Master Data Management) world. The PIM solutions offered today serve very well the requirements for organizing and supporting the handling of product information inside each organization.
However, there is an instability when observing two trading partners. Today, the most common means of sharing product data is to exchange one or several spreadsheets with product identification and product attributes (sometimes also called properties or features). Such spreadsheets may also contain links to digital assets: product images, line drawing documents, installation videos and other rich media stuff.
As an upstream provider of product data, being a manufacturer or upstream distributor, you have these requirements:
When you introduce new products to the market, you want to make the related product data and digital assets available to your downstream partners in a uniform way
When you win a new downstream partner you want the means to immediately and professionally provide product data and digital assets for the agreed range
When you add new products to an existing agreement with a downstream partner, you want to be able to provide product data and digital assets instantly and effortlessly
When you update your product data and related digital assets, you want a fast and seamless way of pushing it to your downstream partners
When you introduce a new product data attribute or digital asset type, you want a fast and seamless way of pushing it to your downstream partners.
You may want to push product data and digital assets from several different internal sources.
As a downstream receiver of product data, being a downstream distributor, retailer or end user, you have these requirements:
When you engage with a new upstream partner, you want the means to quickly and seamlessly link and transform product data and digital assets for the agreed range from the upstream partner
When you add new products to an existing agreement with an upstream partner, you want to be able to link and transform product data and digital assets in a fast and seamless way
When your upstream partners update their product data and related digital assets, you want to be able to receive the updated product data and digital assets instantly and effortlessly
When you choose to use a new product data attribute or digital asset type, you want a fast and seamless way of pulling it from your upstream partners
If you have a backlog of product data and digital asset collection with your upstream partners, you want a fast and cost-effective approach to backfill the gap.
Fulfilling this with exchanging spreadsheets (and other peer-to-peer solutions) in the eco-system of trading partners is a chaotic mess.
If you look at it from upstream, being a manufacturer or upstream distributor, the challenge is that you probably have hundreds of downstream receivers of product information. Each one requires its own form of spreadsheet or other interface. They may even ask you to use their specific supplier portal, meaning hundreds of different learning exercises on your side.
As a downstream receiver of product information, being a downstream distributor, retailer or end user, you have the opposite challenges. You probably have hundreds of upstream providers. If you go for having your own supplier portal, you need to teach each of your suppliers, and you carry the software licence and other burdens.
There is a need for a service that sits between the upstream and downstream trading partners. This service should help the upstream trading partners, being manufacturers and upstream distributors, with sharing product data with many different downstream trading partners, as well as eliminate or reduce the downstream trading partners' need for implementing and maintaining supplier portals.
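The case for such an intermediate service is also simple arithmetic: with peer-to-peer exchange, every supplier/receiver pair needs its own mapping, while with a shared hub each party maps to the hub only once. The partner counts below are invented for illustration.

```python
# Back-of-envelope sketch of the integration burden (illustrative counts):
suppliers, receivers = 200, 300

peer_to_peer = suppliers * receivers   # every pair needs its own spreadsheet/interface
via_hub      = suppliers + receivers   # every party maps to the shared service once

print(peer_to_peer)  # 60000 potential point-to-point mappings
print(via_hub)       # 500 mappings in total via the hub
```

Even if only a fraction of the possible pairs actually trade, the point-to-point count grows with the product of the partner counts, while the hub count grows only with their sum.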
In the end such a service will collapse the doomed galaxy of spreadsheets into an agile process driven service for sharing product data – called the Product Data Lake.
One of the big news stories this week was the detection of gravitational waves. The big thing about this huge step in science is that we will now be able to see things in space we could not see before. These are things we have plenty of clues about, but we cannot measure them because they do not emit electromagnetic radiation, and the light from them is absorbed or reflected by cosmic bodies or dust before it reaches our telescopes.
We have kind of the same thing in the MDM (Master Data Management) world. We know that there is such a thing as multi-domain Master Data Management, but our biggest telescope, the Gartner magic quadrants, has until now only clearly identified customer Master Data Management and product Master Data Management, as most recently touched upon in the post The Perhaps Second Most Important MDM Quadrant 2015 is Out.
Indeed, many MDM programmes that actually do encompass all MDM domains split the efforts into traditional domains such as customer, vendor and product, with separate teams observing their part of the sky. It takes a lot to advocate that, despite vendors belonging to the buy side and customers to the sell side of the organization, there are strong ties between these objects. We can detect gravity in the sense that a vendor and a customer can be the same real-world entity, and that vendors and customers have the same basic structure, being a party.
Products behave differently depending on the industry your organization belongs to. You may make products utilizing raw materials you buy and transform into finished products you sell, and/or you may buy and sell the same physical product as a distributor, retailer or other value-adding node in the supply chain. In order to handle the drastically increased demand for product data related to eCommerce, PIM (Product Information Management) has been around for long, and many organizations everywhere in supply chains have already established PIM capabilities inside their organization, with or without, and inside or outside, product Master Data Management.
What we still need to detect is a good system for connecting the PIM portion of upstream sell-sides and downstream buy-sides in supply chains. Right now we only see a blurred galaxy of spreadsheets, as examined in the post Excellence vs Excel.
A commonly seen user requirement for Master Data Management (MDM) solutions is the ability to copy the content of the attributes of an existing entity when creating a new entity. For example, when creating a new product you may find it nice to copy all the field values from an existing similar product to the new product and then just change what is different for the new product – just like using copy and paste in Excel or other so-called productivity tools.
We all know the dangers of copy and paste and there are plenty of horror stories out there of the harsh consequences like when copying and pasting in a job application and forgetting to change the name of the targeted employer. You know: “I have always dreamed about working for IBM” when applying at Oracle.
The exact same bad things may happen when doing copy and paste while working with master data. You may forget to change exactly that one important piece of information, because you miss guidance on the copied data within your system of entry.
Using an inheritance approach is a better way. For product master data, this approach is based on having mature hierarchy management in place. When creating a new product, you place it in the hierarchy, where it will inherit the attributes common to products on the same branch and leave it to you to fill in the attributes that are specific to the new product. If a new product requires a new branch in the hierarchy, you are forced to think the common attributes for this branch through.
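The inheritance approach can be sketched as walking up the hierarchy, collecting attribute slots from each level, and then filling in only what is specific to the new product. The hierarchy nodes and attribute names below are invented for illustration.

```python
# Illustrative hierarchy: each node defines attribute slots and points to its parent.
HIERARCHY = {
    "tools":             {"attributes": {"brand": None, "weight_kg": None}, "parent": None},
    "tools/power tools": {"attributes": {"voltage": None}, "parent": "tools"},
}

def new_product(node: str, **specific):
    """Collect inherited attribute slots from the hierarchy, then apply
    the values that are specific to the new product."""
    attrs = {}
    while node is not None:
        entry = HIERARCHY[node]
        attrs = {**entry["attributes"], **attrs}  # nearer levels win on conflicts
        node = entry["parent"]
    attrs.update(specific)
    return attrs

drill = new_product("tools/power tools", brand="Acme", voltage="230V")
print(drill)  # {'brand': 'Acme', 'weight_kg': None, 'voltage': '230V'}
```

Unlike copy-and-paste from a sibling product, the empty inherited slots (here `weight_kg`) make it explicit which values still need to be entered, instead of silently carrying over another product's values.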
For party (customer, supplier and other business partner) master data, you may inherit from the outside world by fetching what is already digitalized, which includes names, addresses and other contact data, leaving you to fill in the party master data that is specific to your way of doing business.
Anecdotes are powerful when working on raising awareness of opportunities in the data quality, data governance and Master Data Management (MDM) realm. Such anecdotes are most often either external or internal data and information train-wrecks, while success stories are rarer – at least until now in my experience.
Using anecdotal evidence is useful when identifying major pain points with potential for improvement and is indispensable when striving to get a common understanding of the issues to be solved.
However, within data management as in all other disciplines, it is dangerous to jump to conclusions based on anecdotal evidence. We need some more scientific evidence to nail down the collection of issues and the prioritization of proper solutions.
The anecdotal evidence with the highest weight is that included according to the HiPPO (Highest Paid Person's Opinion) principle, as examined in the post When Rhino Hunt and the HiPPO Principle makes a Perfect Storm. Here we may have a clash between getting executive sponsorship and support for a given programme and actually doing the right things, based on scientific evidence, within the programme.
What are your experiences and lessons learned? How have you managed to balance anecdotal evidence and scientific evidence in data management?
This week I had the pleasure of speaking in Copenhagen at an event about The Evolution of MDM. The best speaking experiences are when there are questions and responses from the attendees. At this event, such lovely interruptions took us around some of the tough questions about Master Data Management (MDM), like:
Is the single source of truth really achievable?
Does MDM belong within IT in the organization?
Is blockchain technology useful within MDM?
Single source of truth
Many seasoned MDM practitioners have experienced attempts to implement a single source of truth for a given MDM domain within a given organization and seen the attempt fail miserably. The obstacles are plentiful, including business units with different goals and IT landscapes with heterogeneous capabilities.
Single place of trust
I think there is a common sentiment in the data management realm about lowering that bar a bit. Perhaps a single place of trust is a more realistic goal, as examined in the post Three Stages of MDM Maturity.
MDM in IT
We all know that MDM should belong to the business part of the organization, and that anchoring MDM (and BI and CRM and so many other disciplines) in the IT part of the organization is a misunderstanding. However, we often see MDM placed in the IT department, because IT already spans the needs of marketing, sales, logistics, finance and so on.
My take is that the actual vision, goals and holistic business involvement trump the formal organizational anchoring. Currently I work with two MDM programmes, one anchored in IT and one in finance. As an MDM practitioner, you have to deal with business and IT anyway.
Blockchain
Blockchain is a new technology disrupting business these days. Recently Andrew White of Gartner blogged about how blockchain thinking could go where traditional single view of master data approaches haven’t been able to go. The blog post is called Why not Blockchain Data Synchronization? As Andrew states: “The next year could be very interesting, and very disrupted.”