What’s a Six Pack?

I have earlier written about my Right the First Time enrolment at the local fitness club and how I am geekily using the dashboard on the workout equipment to follow my Fitness Data.

But it is probably (or actually certainly) too early to talk about the term “six pack” related to these efforts.

So let’s talk about a “six pack” related to master data management.

We may for example have a look at “a six pack of Carlsberg lager”.

Sometimes you may ask how many different products you are handling in a master data hub. In answering that question we may come up with a lot of different numbers, each being a Perfect Wrong Answer.

The real world isn’t flat. When dealing with product master data we certainly need to see the world in hierarchies such as:

  • Carlsberg lager as such is a product with some attributes and some relations to the customers liking this product or not.
  • The product may be brewed in the original country of origin (Denmark) or at a lot of other facilities around the world, thus making it a different product per supplier with respect to some attributes.
  • As a customer you buy the product in a certain packaging like a six pack of cans in a given size with a given label.

The bottom level presented here is what in data management terms is identified as a Stock Keeping Unit (SKU).
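The three levels above can be sketched as a small data model. This is only an illustration under assumptions: the class names, the SKU identifiers and the "Elsewhere" facility are made up for the example and do not come from any real product catalogue. It shows why "how many products?" has several answers, one per level.

```python
from dataclasses import dataclass, field


@dataclass
class Sku:
    """Bottom level: one sellable packaging (hypothetical identifiers)."""
    sku_id: str
    packaging: str


@dataclass
class SupplierVariant:
    """Middle level: the product as made at one facility."""
    facility: str
    skus: list[Sku] = field(default_factory=list)


@dataclass
class Product:
    """Top level: the product as such."""
    name: str
    variants: list[SupplierVariant] = field(default_factory=list)


def count_per_level(products):
    """Return (products, supplier variants, SKUs) as three separate counts."""
    variants = [v for p in products for v in p.variants]
    skus = [s for v in variants for s in v.skus]
    return len(products), len(variants), len(skus)


# Made-up example data for illustration only.
lager = Product("Carlsberg lager", [
    SupplierVariant("Denmark", [
        Sku("DK-6CAN", "six pack of cans"),
        Sku("DK-1BTL", "single bottle"),
    ]),
    SupplierVariant("Elsewhere", [
        Sku("XX-6CAN", "six pack of cans"),
    ]),
])

counts = count_per_level([lager])
print(counts)
```

Depending on the level you count at, the same hub holds one product, two supplier variants or three SKUs.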

Oh, and consuming the last “six pack” is probably (or actually certainly) not good for achieving the first mentioned “six pack”.

Customer Product Matrix Management

A customer/product matrix is a way of describing the relationships between customer types and product types/attributes.  

Example:

Note: Please find some data quality related product descriptions in the post Data Quality and World Food.

Filling out the matrix may be based on prejudices, gut feelings, assumptions, surveys, focus groups or data.

If we go for data we may do this by collecting available historical data related to sales and inquiries made by persons belonging to each customer type regarding products belonging to each product type.  

In doing that correctly we need two kinds of master data management and data quality assurance in place:

  • Customer Data Integration (CDI) for assigning the accurate customer type in the real world related to the uniquely identified person in transactions coming from all sources – here based on location master data.
  • Product Information Management (PIM) for categorizing the relevant fit for purpose product type.
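As a rough sketch of the data-driven approach, the matrix can be filled by counting transactions per cell once CDI has assigned the customer type and PIM the product type. The type labels and transactions below are invented placeholders, and the function names are hypothetical.

```python
from collections import Counter


def build_matrix(transactions):
    """Count transactions per (customer type, product type) cell."""
    return Counter((cust, prod) for cust, prod in transactions)


def row_shares(matrix, customer_type):
    """Share of each product type within one customer type's transactions."""
    row = {prod: n for (cust, prod), n in matrix.items() if cust == customer_type}
    total = sum(row.values())
    return {prod: n / total for prod, n in row.items()}


# Assumption: each transaction already carries cleansed type assignments.
transactions = [
    ("customer type A", "product type X"),
    ("customer type A", "product type X"),
    ("customer type A", "product type Y"),
    ("customer type B", "product type Y"),
]

matrix = build_matrix(transactions)
print(row_shares(matrix, "customer type A"))
```

The counts are only as good as the type assignments behind them, which is why the two master data disciplines above come first.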

This reminds me of multi-domain master data management. Customer master data (or shall we say party master data), product master data and location master data used to figure out how to do business. I like it – both the master data management part and the mentioned product types.

Electronic Data Processing

A comment on my last blog post took me back to the days when I started working with Information Technology (IT). At that time our métier actually wasn’t called IT but EDP (Electronic Data Processing) – at least that was the case in my home country Denmark where we used the local TLA being EDB (Elektronisk Data Behandling).

I have earlier touched on the long-standing discussion about whether “data quality” should be rebranded as “information quality”, for example in the post called new blog name, as this would also require a new name for this blog.

The words data and information are indeed used rather randomly. In MDM (Master Data Management) we have two main domains being Customer Data Integration (CDI) and Product Information Management (PIM). I wonder if customer data is old school and product information is new school?

Product Placement

This wasn’t actually meant as a blog post series about the place entity in multi-domain master data management. But I think I have been carried away by my work, so now it is.

Places are probably most commonly related to the party domain, as seen in the previous post called A Place in Time. But places certainly also have multiple relations to the product domain, thus forming a P trinity of parties, products and places in multi-domain master data management as seen in the post Your Place or My Place?

As with most things in the product domain, the product-place relations are usually very industry specific.

Some of the product-place relations I have worked with come from these industries:

Insurance

The fees you have to pay for some insurance products are related to the place where you live. In order to set the right fees (and for a lot of other reasons) an insurance company needs to analyze data based on the product-place relations. This may by the way go very wrong, as told in the post A Really Bad Address.

Hospitality

Your product is a place where the selling attributes include both the properties belonging to the place itself and the properties of the places nearby.

Real Estate

Do I have to say more than three words: Location, Location, Location.

Your product-place relations

Which product-place relations have you worked with?

Lots of Product Names

In master data management the two most prominent domains are:

  • Parties and
  • Products

In the quest for finding representations of parties actually being the same real world party and finding representations of products actually being the same real world product we typically execute fuzzy data matching of:

  • Party names as person names and company names
  • Product descriptions

However I have often seen party names being an integral part of matching products.

Some examples:

Manufacturer Names:

A product is most often regarded as distinct not only based on the description but also based on the manufacturer. So besides being sharp on matching product descriptions for light bulbs you must also consider whether, for example, the following manufacturer company names are the same or not:

  • Koninklijke Philips Electronics N.V.
  • Phillips
  • Philips Electronic
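One possible way to compare such names is to strip legal forms and generic words before measuring similarity. The sketch below uses Python's standard `difflib`; the noise word list and the 0.85 threshold are assumptions picked for this example, not a recommended configuration, and "Acme Lighting" is a made-up name.

```python
import re
from difflib import SequenceMatcher

# Assumption: a tiny hand-made noise list; a real one would be much longer
# and curated per language and industry.
NOISE = {"nv", "inc", "ltd", "koninklijke", "electronics", "electronic"}


def core_name(name):
    """Lower-case, drop punctuation and noise tokens, keep the rest."""
    tokens = re.sub(r"[^\w\s]", "", name.lower()).split()
    return " ".join(t for t in tokens if t not in NOISE)


def same_manufacturer(a, b, threshold=0.85):
    """Treat two names as the same party when the core names are similar."""
    ratio = SequenceMatcher(None, core_name(a), core_name(b)).ratio()
    return ratio >= threshold


names = [
    "Koninklijke Philips Electronics N.V.",
    "Phillips",
    "Philips Electronic",
]

for other in names[1:]:
    print(names[0], "~", other, "->", same_manufacturer(names[0], other))
```

Stripping noise first matters here: compared as raw strings, the long legal name and the misspelled short name share too little to pass any sensible threshold.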

Author Names:

A book is a product. The title of the book is the description. But also the author’s person name counts. So how do we collect the entire works made by the author:

  • Hans Christian Andersen
  • Andersen, Hans Christian
  • H. C. Andersen

as all three representations are superb bad data?
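A minimal sketch of one way to collect the three representations is to reduce each to a key of surname plus initials. The function name is hypothetical, and the rule is deliberately naive: it assumes the surname is either before a comma or the last word, and it would also merge two different authors who share surname and initials.

```python
def author_key(name):
    """Reduce a person name to (surname, initials of the given names)."""
    if "," in name:
        # "Surname, Given Names" form
        surname, given = (part.strip() for part in name.split(",", 1))
    else:
        # "Given Names Surname" form: assume the last word is the surname
        *given_parts, surname = name.split()
        given = " ".join(given_parts)
    initials = "".join(word[0] for word in given.replace(".", " ").split())
    return surname.lower(), initials.lower()


authors = [
    "Hans Christian Andersen",
    "Andersen, Hans Christian",
    "H. C. Andersen",
]

keys = {author_key(a) for a in authors}
print(keys)
```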

Bear Names:

A certain kind of teddy bear has a product description like “Plush magenta teddy bear”. But each bear may have a pet name like “Lots-O’-Huggin’ Bear” or just short “Lotso” as seen in the film “Toy Story 3”. And seriously: In real business I have worked with building a bear data model and the related data matching.

PS: For those who have seen Toy Story 3: Is that Lotso one or two real world entities?  

Matching Light Bulbs

This morning I noticed this lightbulb joke in a tweet from @mortensax:

Besides finding it amusing I also related to it since I have used an example with light bulbs in a webinar about data matching as seen here:

The use of synonyms in Search Engine Optimization (SEO) is very similar to the techniques we use in data matching.

Here the problem is that for example these two product descriptions may have a fairly high edit distance (very different character by character), but are the same:

  • Light bulb, A 19, 130 Volt long life, 60 W
  • Incandescent lamp, 60 Watt, A19, 130V

while these two product descriptions have an edit distance of only one substitution of a character, but are not the same product (though being in the same category):

  • Light bulb, 60 Watt, A 19, 130 Volt long life
  • Light bulb, 40 Watt, A 19, 130 Volt long life
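The contrast can be sketched with a plain Levenshtein distance next to a comparison of parsed attributes. The regular expressions and the choice of attributes (watts, volts, shape code) are assumptions made for these four descriptions only; real product matching would need far richer parsing and synonym handling.

```python
import re


def levenshtein(a, b):
    """Classic edit distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def attributes(description):
    """Extract (watts, volts, shape), treating W/Watt and V/Volt as synonyms."""
    text = description.lower()
    watts = re.search(r"(\d+)\s*(?:watt|w)\b", text)
    volts = re.search(r"(\d+)\s*(?:volt|v)\b", text)
    shape = re.search(r"\ba\s*(\d+)\b", text)
    return tuple(int(m.group(1)) if m else None for m in (watts, volts, shape))


same_1 = "Light bulb, A 19, 130 Volt long life, 60 W"
same_2 = "Incandescent lamp, 60 Watt, A19, 130V"
diff_1 = "Light bulb, 60 Watt, A 19, 130 Volt long life"
diff_2 = "Light bulb, 40 Watt, A 19, 130 Volt long life"

print(levenshtein(same_1, same_2), attributes(same_1) == attributes(same_2))
print(levenshtein(diff_1, diff_2), attributes(diff_1) == attributes(diff_2))
```

The first pair is far apart character by character but agrees on every parsed attribute, while the second pair is one character apart and disagrees on the wattage.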

Working with product data matching is indeed very enlightening.

My 2011 To Do List

These days are classic times for predicting something about next year in a blog post. This year I will make some egocentric predictions about what I am going to do next year. Fortunately I think these activities are pretty representative of the trends in the data quality realm.

My three most important challenges in working with data and information quality improvement and master data management will be:

Multi-Domain Master Data Quality

There are some different disciplines and product offerings around, such as:

  • Data Quality tools
  • Customer Data Integration (CDI) solutions
  • Product Information Management (PIM) platforms

These disciplines and the related software packages used to solve the challenges are constantly maturing and expanding to embrace the problems as a whole.

Find more about the subject in my posts on Multi-Domain MDM.

Exploiting rich external reference data sources in the cloud

Working with external reference sources as a means to improve data quality has been a focus area of mine for many years.

Recent developments in governments releasing rich sources of data will help with availability here, but new challenges will also arise, like working with conformity across data sources coming from many different countries in many different ways.

Much of the activity here will happen in the cloud.

See my take on the subject on the page Data Quality 3.0 and read about a concrete implementation in instant Data Quality.

Downstream data cleansing

Despite constant improvements in data quality tools and master data management solutions moving us from downstream batch cleansing to upstream prevention, there will still be lots of reasons for doing downstream cleansing projects.

Here are the top 5 reasons.

I expect to be involved in at least one of each type next year.

Sell-side vs Buy-side Master Data Quality

The two most prominent domains in master data management and related data quality improvement are:

  • Party master data and
  • Product master data

Party Master Data

Most of the talk about party master data is about customer master data (including prospect master data). This discipline is often called Customer Data Integration (CDI). Customer data is the sell-side of party master data. The organizations with the biggest pains in this area are mostly organizations with many customers (and prospects). The largest volumes of customer data are related to business-to-consumer (B2C) activities, but certainly we also see many grown customer databases in the business-to-business (B2B) realm.

The buy-side of party master data is supplier data. Fewer organizations have grown supplier databases, but surely big firms with many different departments and subsidiaries have supplier master data issues like the ones we see on the sell-side.

Also many organizations have a surprisingly large intersection of the same parties being both on the sell-side and on the buy-side. I have touched that subject in the post: 360° Business Partner View.
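Once customers and suppliers are resolved to the same party key, finding that intersection is a simple set operation. The sketch below assumes such a shared key already exists, which is the hard part in practice; the keys and the function name are made up for the example.

```python
def business_partner_view(customer_keys, supplier_keys):
    """Split resolved party keys into sell-side only, buy-side only and both."""
    customers, suppliers = set(customer_keys), set(supplier_keys)
    return {
        "customer_only": customers - suppliers,
        "supplier_only": suppliers - customers,
        "both": customers & suppliers,
    }


# Assumption: entity resolution has already assigned one key per real world party.
view = business_partner_view(
    ["party-001", "party-002", "party-003"],
    ["party-002", "party-003", "party-004"],
)
print(sorted(view["both"]))
```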

Product Master Data

Product Information Management (PIM) also has a sell-side and a buy-side. Here too the pains grow with the numbers. In contrast to party master data, high sell-side numbers are rarer than high buy-side numbers in product master data.

We often see a high number of sell-side products at retailers, where the same product is also buy-side at the same time, but where we may not have the same requirements for entity resolution. Most organizations don’t have such big issues (like problems with uniqueness) with their own produced products.

Otherwise a high number of buy-side products is related not so much to buying raw materials as to buying things such as spare parts and all kinds of small equipment and assets (with software licenses being the closest to herding cats, I guess).

Multi-Domain Master Data Management

With multi-domain master data management there is of course a connection between sell-side party master data and sell-side product master data with opportunities in analyzing to whom we sell what and discovering cross selling openings and so on.

On the buy-side there is great potential in looking into where we buy similar things from, looking into discount possibilities and so on.

Same same but different

A while ago I wrote a blog post about similarities and differences between party master data quality and product master data quality called Same Same But Different.

Besides the differences between party master data and product master data, I also find we have differences between sell-side and buy-side, making it four different but somewhat similar and connected disciplines in master data management and data quality improvement.

What is Multi-Domain MDM?

Doing master data management with several different entity types is most often seen as the federated discipline of handling Customer Data Integration (CDI) and Product Information Management (PIM) with the same software brand.

And sure, doing this (including making that software) is a challenge as there are basic differences between the two disciplines as discussed in the post Same Same But Different.

But doing both well at the same time is only a starting point. Making business value from the intersection between the two disciplines is the real challenge.

I learned that 20 years ago when I started a new client relationship (which was also before MDM, CDI and PIM were household TLAs).

The client’s headquarters was in the southern outskirts of Copenhagen, so on a good summer day I could go there on my bike. They imported otherwise wasted peels from oranges grown for our morning juice in the endless South American citrus plantations, and otherwise useless seaweed harvested in the hot waters around the countless Philippine islands.

Along with a few other raw materials the peels and seaweed were made into approximately a hundred different semi-finished products. Based on customer orders these were blended into not much more than a thousand different defined finished products being valuable ingredients for food and pharmaceutical production.

The number of different customers was also modest, as I remember not much more than a thousand different worldwide customers.

So, managing 1,000 different customers buying 1,000 different products shouldn’t be much of an MDM case. Of course customer data management with globally diverse entities had its challenges, and not least product information handling with rising regulatory demands in the food and pharmaceutical segment wasn’t a walkover either.

But some big hurdles were surely in the intersection between customer master data and product master data, and solving the issues almost always involved data quality related to core transactions referencing the entities described in the master data.

Same Same But Different

The two most common master data types are:

  • Party master data (customers, prospects, suppliers and other business partners)
  • Product master data

When working with data quality within master data management you may of course encounter some similarities between these two master data types, but you will certainly also meet a range of differences.

The basic activities such as standardization, consolidation and hierarchy building are the same.

Some of the differences I have learned are:

Multi-cultural issues:

  • Party master data is often stored in a single global format but should be transformed to embrace multi-cultural diversities.
  • Product master data may have multi-cultural issues but should be transformed into a single global format (of course embracing multi-language hierarchies and so on).

External reference data available:

  • For party master data the possibilities for real world alignment with external data sources are plenty.
  • For product master data the possibilities for real world alignment with external data sources are few.

Industry specific requirements:

  • Requirements for party master data quality are pretty much the same across industries, with few variations; B2B (corporate customers), B2C (private customers) or both being the most prominent.
  • Requirements for product master data quality vary tremendously across different industries.

Your say:

What are your examples of (similarities and) differences between party master data quality and product master data quality?
