I am afraid that Gartner does not help

“The average financial impact of poor data quality on organizations is $9.7 million per year.” This is a quote from Gartner, the analyst firm, used by them to promote their services in building a business case for data quality.

AverageWhile this quote rightfully emphasizes on that a lot of money is at stake, the quote itself holds a full load of data and information quality issues.

On the pedantic side, the use of the $ sign in international communication is problematic. The $ sign represents a lot of different currencies as CAD, AUD, HKD and of course also USD.

Then it is unclear on what basis this average is measured. Is it among the +200 million organizations in the Dun & Bradstreet Worldbase? Is it among organizations on a certain fortune list? In what year?

Even if you knew that this is an average in a given year for the likes of your organization, such an average would not help you justify allocation of resources for a data quality improvement quest in your organization.

I know the methodology provided by Gartner actually is designed to help you with specific return on investment for your organization. I also know from being involved in several business cases for data quality (as well as Master Data Management and data governance) that accurately stating how any one element of your data may affect your business is fiendishly difficult.

I am afraid that there is no magic around as told in the post Miracle Food for Thought.

How to exchange product information with trading partners?

In the era of digitalization, you need to exchange product information with your trading partners in an agile and automated way. At Product Data Lake we are determined to offer a world class service for that. But what exactly are your needs?

PDL How it worksWhether you are a company participating in a cross company supply chain or you help your clients in doing that, you can help us to help you by taking this survey.

Thanks a lot in advance.

The Pros and Cons of Master Data (Management)

As a comment on my LinkedIn status about my previous post Jan van Til asks:

Wouldn’t the world be far better of without the concept of master data? What problems are solved by it? What problems are introduced by it? The balance? So … why do we keep toiling with master data?”

What problems are solved?

A common definition of data quality is that data are fit for the intended purpose of use. However, with master data we have multiple purposes of use of these core data entities. In an enterprise architecture with no focus on master data, the same real world construct will be described in many different databases in many different ways.

This causes a myriad of challenges. There are no one face to our customers, which makes us look stupid. We may offer the same product unintentionally with different prices and with different features, which is stupid. Our reporting, business intelligence, data science will be based on ambiguous data and therefore potentially cause stupid decisions.

If we have data quality issues with no centralized management of master data we will fund data cleansing and prevention initiatives many times for the same real world construct – and probably with varying outcome. In a centralized master data management set up, we can avoid reinventing the wheel as explained in the post The Database versus the Hub.

What problems are introduced?

Implementing Master Data Management (MDM) is not without tears. This involves the whole enterprise, and will increasingly caused by the rise of digitalization involve the business ecosystem as well. You will have to spend money and resources, which are not always easily justified.

MDM is a complex discipline involving many stakeholders. There is a high risk of running over budget and time and missing the goals.

As MDM is the remedy against data silos, you may end up with MDM as just another data silo within your enterprise. And still, your MDM hub may be a data silo in your business ecosystem.

The balance

Think big, start small. Make it agile. A good example of an agile approach to MDM is proposed by the MDM vendor Semarchy as described here on What is Evolutionary MDM™?

While you need a strong inner hub for master data in your enterprise, do not try to impose that hub on everything and everyone as for example examined in the post A Different End-to-End Solution for Product Information Management (PIM).

Multi-Side MDM

What is a Master Data Entity?

What is a customer? What is a product? You encounter these common questions when working with Master Data Management (MDM).

The overall question about what master data is has been discussed on this blog often as for example in the post A Master Data Mind Map.

Master Data

The two common questions posed as start of this blog post is said to be very dangerous. Well, here are my experiences and opinions:

What is a customer?

In my eyes, customer is a role you can assign to a party. Therefore, the party is the real master data entity. A party can have many other roles as employee, supplier and other kinds of business partner roles. More times than you usually imagine, the party can have several roles at the same time. Examples are customers also being employees and suppliers who are also customers.

From a data quality point of view, it does not have to matter if a party is a customer or not at a certain time. If your business rules requires you to register that party because the party has placed an order, got an invoice, paid an invoice or pre-paid an amount, you will need to take care of the quality of the information you have stored. You will also have to care about the privacy, not at least if the party is a natural person.

Uniqueness is the most frequent data quality issue when it comes to party master data. Again, it is essential to detect or better prevent if the same party is registered twice or more whether that party is a customer according to someone’s definition or not.

What is a product?

Also with products business rules dictates if you are going to register that product. If you are a reseller of products, you should register a product that you promote (being in your range). You could register a product, if you resell that product occasionally (sometimes called specials). If you are a manufacturer, you should register your finished products, your semi-finished products and the used raw materials. Most companies are actually both a reseller and a manufacturer in some degree. Despite of that degree practically all companies also deals with indirect goods as spare parts, office supplies and other stuff you could register as a product within your organisation in the same way your supplier probably have.

What we usually defines as a product is most often what rather should be called a product model. That means we register information about things that are made in the same way and up by the same ingredients and branded similarly. A thing, as each physical instance of a product model, will increasingly have business rules that requires it to be registered as told in the post Adding Things to Product Data Lake.

Big Data Fitness

A man with one watch knows what time it is, but a man with two watches is never quite sure. This old saying could be modernized to, that a person with one smart device knows the truth, but a person with two smart devices is never quite sure.

An example from my own life is measuring my daily steps in order to motivate me to be more fit. Currently I have two data streams coming in. One is managed by the app Google Fit and one is managed by the app S Health (from Samsung).

This morning a same time shot looked like this:

Google Fit:

google-fit

S Health:

s-health

So, how many steps did I take this morning? 2,047 or 2413?

The steps are presented on the same device. A smartphone. They are though measured on two different devices. Google Fit data are measured on the smartphone itself while S Health data are measured on a connected smartwatch. Therefore, I might not be wearing these devices in the exact same way. For example, I am the kind of Luddite that do not bring the phone to the loo.

With the rise of the Internet of Things (IoT) and the expected intensive use of the big data streams coming from all kinds of smart devices, we will face heaps of similar cases, where we have two or more sets of data telling the same story in a different way.

A key to utilize these data in the best fit way is to understand from what and where these data comes. Knowing that is achieved through modern Master Data Management (MDM).

At Product Data Lake we in all humbleness are supporting that by sharing data about the product models for smart devices and in the future by sharing data about each device as told in the post Adding Things to Product Data Lake.

The Rise of Business Ecosystems in Data Management

There are many signs showing that we are entering the age of business ecosystems. A recent example is an article from Digital McKinsey. This read worthy article is called Adopting an ecosystem view of business technology.

In here, the authors emphasizes on the need to adapt traditional IT functions to the opportunities and challenges of emerging technologies that embraces business ecosystems. I fully support that sentiment.

In my eyes, some of the emerging technologies we see are in large misunderstood as something meant for being behind the corporate walls. My favorite example is the data lake concept. I do not think a data lake will be an often seen success solely within a single company as explained in the post Data Lakes in Business Ecosystems.

The raise of technology for business ecosystems will also affect the data management roles we know today. For example, a data steward will be a lot more focused towards external data than before as elaborated in the post The Future of Data Stewardship.

Encompassing business ecosystems in data management is of course a huge challenge we have to face while most enterprises still have not reached an acceptable maturity when it comes internal data and information governance. However, letting the outside in will also help in getting data and information right as told in the post Data Sharing Is The Answer To A Single Version Of The Truth.

biz-eco

Golden Records in Multi-Domain MDM

The term golden record is a core concept within Master Data Management (MDM). A golden record is a representation of a real world entity that may be compiled from multiple different representations of that entity in a single or in multiple different databases within the enterprise system landscape.

GoldIn Multi-domain MDM we work with a range of different entity types as party (with customer, supplier, employee and other roles), location, product and asset. The golden record concept applies to all of these entity types, but in slightly different ways.

Party Golden Records

Having a golden record that facilitates a single view of customer is probably the most known example of using the golden record concept. Managing customer records and dealing with duplicates of those is the most frequent data quality issue around.

If you are not able to prevent duplicate records from entering your MDM world, which is the best approach, then you have to apply data matching capabilities. When identifying a duplicate you must be able to intelligently merge any conflicting views into a golden record.

In lesser degree we see the same challenges in getting a single view of suppliers and, which is one of my favourite subjects, you ultimately will want to have a single view on any business partner, also where the same real world entity have both customer, supplier and other roles to your organization.

Location Golden Records

Having the same location only represented once in a golden record and applying any party, product and asset record, and ultimately golden record, to that record may be seen as quite academic. Nevertheless, striving for that concept will solve many data quality conundrums.

GoldLocation management have different meanings and importance for different industries. One example is that a brewery makes business with the legal entity (party) that owns a bar, café, restaurant. However, even though the owner of that place changes, which happens a lot, the brewery is still interested in being the brand served at that place. Also, the brewery wants to keep records of logistics around that place and the historic volumes delivered to that place. Utility and insurance is other examples of industries where the location golden record (should) matter a lot.

Knowing the properties of a location also supports the party deduplication process. For example, if you have two records with the name “John Smith” on the same address, the probability of that being the same real world entity is dependent on whether that location is a single-family house or a nursing home.

Product Golden Record

Product Information Management (PIM) solutions became popular with the raise of multi-channel where having the same representation of a product in offline and online channels is essential. The self-service approach in online sales also drew the requirements of managing a lot more product attributes than seen before, which again points to a solution of handling the product entity centralized.

In large organizations that have many business units around the world you struggle with having a local view and a global view of products. A given product may be a finished product to one unit but a raw material to another unit. Even a global SAP rollout will usually not clarify this – rather the contrary.

GoldWhile third party reference data helps a lot with handling golden records for party and location, this is lesser the case for product master data. Classification systems and data pools do exist, but will certainly not take you all the way. With product master data we must, in my eyes, rely more on second party master data meaning sharing product master data within the business ecosystems where you are present.

Asset (or Thing) Golden Records

In asset master data management you also have different purposes where having a single view of a real world asset helps a lot. There are namely financial purposes and logistic purposes that have to aligned, but also a lot of others purposes depending on the industry and the type of asset.

With the raise of the Internet of Things (IoT) we will have to manage a lot more assets (or things) than we usually have considered. When a thing (a machine, a vehicle, an appliance) becomes intelligent and now produces big data, master data management and indeed multi-domain master data management becomes imperative.

You will want to know a lot about the product model of the thing in order to make sense of the produced big data. For that, you need the product (model) golden record. You will want to have deep knowledge of the location in time of the thing. You cannot do that without the location golden records. You will want to know the different party roles in time related to the thing. The owner, the operator, the maintainer. If you want to avoid chaos, you need party golden records.

Infonomics and Second Party Data

The term infonomics does not yet run unmarked through my English spellchecker, but there are some information available on Wikipedia about infonomics. Infonomics is closely related to the often-mentioned phrases in data management about seeing data / information as an asset.

Much of what I have read about infonomics and seeing data / information as an asset is related to what we call first party data. That is data that is stored and managed within your own company.

Some information is also available in relation to third party data. That is data we buy from external parties in order to validate, enrich or even replace our own first party data. An example is a recent paper from among others infonomic guru Doug Laney of Gartner (the analyst firm). This paper has a high value if you want to buy it as seen here.

Anyway, the relationship between data as an asset and the value of data is obvious when it comes to third party data, as we pay a given amount of money for data when acquiring third party data.

Second party data is data we exchange with our trading and other business partners. One example that has been close to me during the recent years is product information that follows exchange of goods in cross company supply chains. Here the value of the goods is increasingly depending on the quality (completeness and other data quality dimensions) of the product information that follows the goods.

In my eyes, we will see an increasing focus on infonomics when it comes to exchanging goods – and the related second party data – in the future. Two basic factors will be:

pdl-top-narrow

We Need More Product Data Lake Ambassadors

ambassador

Product Data Lake is the new solution to sharing product information between trading partners. While we see many viable in-house solutions to Product Information Management (PIM), there is a need for a solution to exchange product information within cross company supply chains between manufacturers, distributors and retailers.

Completeness of product information is a huge issue for self-service sales approaches as seen in ecommerce. 81 % of e-shoppers will leave a webshop with lacking product information. The root cause of missing product information is often an ineffective cross company data supply chain, where exchange of product data is based on sending spreadsheets back and forth via email or based on biased solutions as PIM Supplier Portals.

However, due to the volume of product data, the velocity required to get data through and the variety of product data needed today, these solutions are in no way adequate or will work for everyone. Having a not working environment for cross company product data exchange is hindering true digital transformation at many organizations within trade.

As a Product Information Management professional or as a vendor company in this space, you can help manufacturers, distributors and retailers in being successful with product information completeness by becoming a Product Data Lake ambassador.

The Product Data Lake encompasses some of the most pressing issues in world-wide sharing of product data:

The first forward looking professionals and vendors in the Product Information Management realm have already joined. I would love to see you as well as our next ambassador.

Interested? Get in contact:

← Back

Thank you for your response. ✨

PIM Supplier Portals: Are They Good or Bad?

A recent discussion on the LinkedIn Multi-Domain MDM group is about vendor / supplier portals as a part of Product Information Management implementations.

A supplier portal (or vendor portal if you like) is usually an extension to a Product Information Management (PIM) solution. The idea is that the suppliers of products, and thus providers of product information, to you as a downstream participant (distributor or retailer) in a supply chain, can upload their product information into your PIM solution and thus relieving you of doing that. This process usually replace the work of receiving spreadsheets from suppliers in the many situations where data pools are not relevant.

In my opinion and experience, this is a flawed concept, because it is hostile to the supplier. The supplier will have hundreds of downstream receivers of products and thus product information. If all of them introduced their own supplier portal, they will have to learn and maintain hundreds of them. Only if you are bigger than your supplier is and is a substantial part of their business, they will go with you.

Broken data supply chainAnother concept, which is the opposite, is also emerging. This is manufacturers and upstream distributors establishing PIM customer portals, where suppliers can fetch product information. This concept is in my eyes flawed exactly the opposite way.

And then let us imagine that every provider of product information had their PIM customer portal and every receiver had their PIM supplier portal. Then no data would flow at all.

What is your opinion and experience?