Three kinds of MDM Data Models that come with a tool

Master Data Management (MDM) is a lot about data modelling. When you buy an MDM tool it will have some implications for your data model. Here are three kinds of data models that may come with a tool:

An off-the-shelf model

This kind is particularly popular with customer and other party master data models. Core party data are pretty much the same for every company. We have national identification numbers, names, addresses, phone numbers and that kind of stuff, where you do not have to reinvent the wheel.

Also, you will have access to rich reference data that comes with its own model, such as address directories (which you may regard as belonging to a separate location domain), business directories (for example the Dun & Bradstreet Worldbase) and in some countries citizen directories as well. MDM tools may come with a model shaped for these sources.

Tools which are optimized for data matching, including deduplication of party master data, will often shoehorn your party master data into a data model suited to that purpose.

A buildable model

When it comes to multi-domain MDM we will deal with entities that are not common to everyone.

Here a capability to build your model in the MDM tool is needed. One such tool I have worked with is Semarchy. Here semi-technical people are able to build and deploy incrementally more complex data models, which by default come equipped with the needed functionality for handling a golden copy and for auditing data onboarding and changes.

A dynamic model

Product Information Management (PIM) requires that your end users can build the model on the fly, as product data are so different between product groups.
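
What building the model on the fly could mean is sketched below in Python. This is a minimal illustration only: the class names ProductGroup and Product and the sample attributes are hypothetical and not taken from any PIM tool.

```python
from dataclasses import dataclass, field


@dataclass
class ProductGroup:
    """A product group whose attribute list can be extended at runtime."""
    name: str
    attributes: dict = field(default_factory=dict)  # attribute name -> expected type

    def add_attribute(self, attr_name, attr_type):
        self.attributes[attr_name] = attr_type


@dataclass
class Product:
    product_id: str
    group: ProductGroup
    values: dict = field(default_factory=dict)

    def set_value(self, attr_name, value):
        # Only attributes defined on the product group are accepted.
        if attr_name not in self.group.attributes:
            raise KeyError(f"{attr_name} is not defined for {self.group.name}")
        expected = self.group.attributes[attr_name]
        if not isinstance(value, expected):
            raise TypeError(f"{attr_name} expects {expected.__name__}")
        self.values[attr_name] = value


# Two product groups with entirely different attributes (invented examples).
drills = ProductGroup("Power drills")
drills.add_attribute("voltage", int)

paint = ProductGroup("Paint")
paint.add_attribute("colour", str)
paint.add_attribute("volume_litres", float)

drill = Product("D-100", drills)
drill.set_value("voltage", 18)
```

The point of the sketch is that the attribute definitions are data, not schema, so a new product group needs no change to the underlying tables or classes.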

In my current venture called Product Data Lake the model has these main entities:

(Figure: PDL Data Model)

This model resembles the data model in most PIM solutions (and PIM-based MDM solutions), except that we have the party and their two-way partnerships at the top, as Product Data Lake takes care of exchanging data between in-house PIM solutions at trading partners participating in business ecosystems.

Party and Product: The Core Entities in Most Data Models

Party and product are the most frequent master data domains around.

You often meet party in the form of one of the most frequent party roles, customer and supplier (or vendor), or under another term related to the context, for example citizen, patient, member, student, passenger and many more. These are the people and legal entities we are interacting with and with whom we usually exchange money – and information.

Products (or materials) are the things we buy, make and sell. They are the goods (or services) we exchange.

In my current venture called Product Data Lake our aim is to serve the exchange of information about products between trading partners who are customers and suppliers in business ecosystems.

For that, we have been building a data model. Below you see our first developed conceptual data model, which has party and product as the core entities.

(Figure: PDL conceptual data model)

As this is a service for business ecosystems, another important entity is the partnership between suppliers and customers of products and the information about the products.

The product link entity in this data model handles the identification of products by pairs of trading partners. In the same way, this data model has link entities for the identification of product attributes at pairs of trading partners (built on the same standards or not) as well as for digital asset types.
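
The idea of a link entity per pair of trading partners can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual Product Data Lake implementation: the names Partnership and ProductLinks and all identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partnership:
    """A supplier/customer pair; frozen so it can be used as a dictionary key."""
    supplier: str
    customer: str


class ProductLinks:
    """Maps the supplier's product identification to the customer's, per partnership."""

    def __init__(self):
        self._by_supplier_id = {}

    def link(self, partnership, supplier_product_id, customer_product_id):
        self._by_supplier_id[(partnership, supplier_product_id)] = customer_product_id

    def customer_id(self, partnership, supplier_product_id):
        # Returns None when the two partners have not linked this product.
        return self._by_supplier_id.get((partnership, supplier_product_id))


deal = Partnership(supplier="Supplier A", customer="Retailer B")
links = ProductLinks()
links.link(deal, "SUP-4711", "RET-000123")
```

The same shape would apply to attribute links and digital asset type links: the key is always the partnership plus the identification used on one side.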

If you are offering product information management services, and thus are a potential Product Data Lake ambassador, or you are part of a business ecosystem with trading partners, I will be happy to discuss adding the handling of trading partnerships and product information exchange to your current model.

 

What is Best Practice: Customer- and Vendor- or Unified Party Master Data Management?

Right now there is a good discussion going on in the Multi-Domain MDM Group on LinkedIn. A member asks:

“I’d like to hear back from anyone who has implemented party master data in either a single, unified schema or separate, individual schemas (Vendor, Customer, etc.).

What were the pros and cons of your approach? Would you do it the same way if you had it to do again?”

This is a classic consideration at the heart of multi-domain MDM. As I see it, and as I advise my clients, you should have a common party (or business partner) structure for identification, names, addresses and contact data. This should be supported by data quality capabilities strongly built on external reference data (third party data). Besides this common structure, there should be specific structures for customer, vendor/supplier and other party roles.
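
A common party structure with role-specific structures attached could look like the following minimal Python sketch. The class name Party, the field names and the sample role data are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    """Common structure: identification, names, addresses and contact data."""
    party_id: str
    name: str
    addresses: list = field(default_factory=list)
    contact_points: list = field(default_factory=list)
    # Role-specific structures hang off the common party, keyed by role name.
    roles: dict = field(default_factory=dict)

    def add_role(self, role, **role_data):
        self.roles[role] = role_data


# One real world company acting both as customer and as vendor.
acme = Party("P-1", "Acme Ltd", addresses=["1 Main Street"])
acme.add_role("customer", credit_limit=10000)
acme.add_role("vendor", payment_terms="30 days")
```

With this shape the name and address are maintained once, while credit limits and payment terms stay with the role they belong to.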

This subject was also recently examined here on the blog in the post Multi-Side MDM.

What is your opinion and experience with this question? Please have your say either here on the blog or in the LinkedIn Multi-Domain MDM Group.


MDM Tools Revealed

Every organization needs Master Data Management (MDM). But does every organization need an MDM tool?

In many ways the MDM tools we see on the market resemble common database tools. But there are some things the MDM tools do better than a common database management tool. The post called The Database versus the Hub outlines three such features:

  • Controlling hierarchical completeness
  • Achieving a Single Business Partner View
  • Exploiting Real World Awareness

Controlling hierarchical completeness and achieving a single business partner view are closely related to the two things data quality tools do better than common database systems, as explained in the post Data Quality Tools Revealed. These two features are:

  • Data profiling and
  • Data matching

Specialized data profiling tools are very good at providing out-of-the-box functionality for statistical summaries and frequency distributions for the unique values and formats found within the fields of your data sources, in order to measure data quality and find critical areas that may harm your business. These capabilities are often better and easier to use than what you find inside an MDM tool. However, in order to measure the improvement in a business context and fix the problems as more than a one-off, you need a solid MDM environment.
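
To give a feel for what a frequency distribution of values and formats is, here is a minimal Python sketch using only the standard library. It is a toy illustration, not how any profiling tool works internally; the function names and the sample postal codes are invented.

```python
from collections import Counter


def format_pattern(value):
    """Reduce a value to its format: digits become 9, letters become A."""
    pattern = []
    for ch in value:
        if ch.isdigit():
            pattern.append("9")
        elif ch.isalpha():
            pattern.append("A")
        else:
            pattern.append(ch)
    return "".join(pattern)


def profile(values):
    """Return a small statistical summary for one field."""
    values = list(values)
    return {
        "count": len(values),
        "distinct": len(set(values)),
        "empty": sum(1 for v in values if not v.strip()),
        "top_values": Counter(values).most_common(3),
        "formats": Counter(format_pattern(v) for v in values),
    }


postal_codes = ["2100", "2100", "SW1A 1AA", "", "90210"]
report = profile(postal_codes)
```

In this sample the formats counter shows three different postal code shapes plus one empty value, which is the kind of finding that points to the critical areas in a field.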

When it comes to data matching we also still see specialized solutions that are more effective and easier to use than what is typically delivered inside MDM solutions. Besides that, we also see business scenarios where it is better to do the data matching outside the MDM platform as examined in the post The Place for Data Matching in and around MDM.

Looking at the single MDM domains we also see alternatives. Customer Relationship Management (CRM) systems are popular as a choice for managing customer master data. But as explained in the post CRM systems and Customer MDM, CRM systems are said to deliver a Single Customer View but usually they don’t. The way CRM systems are built, used and integrated is a sure path to creating duplicates. Some remedies for that are touched on in the post The Good, Better and Best Way of Avoiding Duplicates.

With product master data we also have Product Information Management (PIM) solutions. From what I have seen, PIM solutions have one key capability that is essentially different from what a common database solution, and many MDM solutions built with party master data in mind, have. That is a flexible and super user oriented way of building hierarchies and assigning attributes to entities, in this case particularly products. If you offer customer self-service, as in eCommerce, with products that have varying attributes, you need PIM functionality. If you want to do this smartly, you need a collaboration environment for supplier self-service as well, as pondered in the post Chinese Whispers and Data Quality.

All in all the necessary components and combinations for a suitable MDM toolbox are plentiful and can be obtained by one-stop-shopping or by putting some best-of-breed solutions together.

Data Models and Real World Alignment

Usually data models are made to fit a specific purpose of use. As reported in the post A Place in Time this often leads to data quality issues when the data is going to be used for purposes different from the one originally intended. Among many examples we have, not least, heaps of customer tables like this one:

(Figure: Customer Table)

Compared to how the real world works this example has some diversity flaws, like:

  • state code as a key to a state table will only work with one country (the United States)
  • zipcode is a United States term only, as opposed to the more generic “Postal Code”
  • fname (First name) and lname (Last name) don’t work in cultures where given name and surname have the opposite sequence
  • The lengths of the state, zipcode and most other fields are obviously too small almost anywhere

More seriously we have:

  • fname and lname (First name and Last name) and probably also phone should belong to their own party entity acting as a contact related to the company
  • company name should belong to its own party entity acting in the role as customer
  • address1, address2, city, state, zipcode should belong to their own place entity, probably as the current visiting place related to the company
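
The three points above can be sketched as a small Python function that splits one flat customer row into two party entities, a place entity and the relations between them. The column names follow the ones mentioned in the list, except the key "company", which is an assumption, as are the class names and the sample row.

```python
from dataclasses import dataclass


@dataclass
class Party:
    kind: str  # "organisation" or "person"
    name: str = ""
    given_name: str = ""
    family_name: str = ""
    phone: str = ""


@dataclass
class Place:
    address_lines: list
    city: str
    region: str
    postal_code: str


@dataclass
class Relation:
    source: object
    role: str
    target: object


def split_customer_row(row):
    """Split one flat customer row into party, place and relation entities."""
    company = Party(kind="organisation", name=row["company"])
    contact = Party(kind="person", given_name=row["fname"],
                    family_name=row["lname"], phone=row["phone"])
    place = Place(
        address_lines=[line for line in (row["address1"], row["address2"]) if line],
        city=row["city"], region=row["state"], postal_code=row["zipcode"],
    )
    relations = [
        Relation(contact, "contact for", company),
        Relation(place, "visiting place of", company),
    ]
    return company, contact, place, relations


row = {"company": "Acme Ltd", "fname": "Jane", "lname": "Doe", "phone": "555-0100",
       "address1": "1 Main Street", "address2": "", "city": "Springfield",
       "state": "IL", "zipcode": "62701"}
company, contact, place, relations = split_customer_row(row)
```

Once the entities are separate, the same contact can relate to several companies and the same company to several places, which the flat table cannot express.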

In my experience looking at the real world will help a lot when making data models that can survive for years and withstand use cases different from the one immediately in question. I’m not talking about introducing scope creep but just about thinking a little bit about what the real world looks like when you are modelling something in that world, which usually is the case when working with Master Data Management (MDM).


A Master Data Mind Map

A challenge within many disciplines is to explain easily what the discipline is about, and that certainly is true for Master Data Management (MDM) too, as we often get the question: What is master data?

A good short explanation is:

“The description of the who, what and where in transaction data”.

It could also, with help from Wikipedia, be:

“Information that is key to the operation of a business”.

From Gartner (the analyst firm) we have:

“The consistent and uniform set of identifiers and extended attributes that describes the core entities of the enterprise”.

The latter one I would not try on friends and relatives though.

Examples are often a good way to go. Visualization is great too. So I have played with a mind map of what master data entities may be:

(Figure: Master Data mind map)


Will Graph Databases become Common in MDM?

One of my pet peeves in data quality for CRM and ERP systems is the often used way of looking at entities, not least party entities, in a flat data model as told in the post A Place in Time.

Party master data, and related location master data, will eventually be modeled in very complex models and surely we see more and more examples of that. For example I remember that a long time ago I worked with the ERP system that later became Microsoft Dynamics AX. Back then I had issues with the simplistic and not role aware data model. As I’m currently working in a project using the AX 2012 Address Book, it’s good to see that things have certainly developed.

This blog has quite a few posts on hierarchy management in Master Data Management (MDM) and even Hierarchical Data Matching. But I have to admit that even complex relational data models and hierarchical approaches in fact don’t align completely with the real world.

In a comment to the post Five Flavors of Big Data Mike Ferguson asked about graph data quality. In my eyes using graph databases in master data management will indeed bring us closer to the real world and thereby deliver a better data quality for master data.

I remember that at this year’s MDM Summit Europe Aaron Zornes suggested that a graph database will be the best choice for reflecting the most basic reference dataset, The Country List. Oh yes, and for master data too, you would then think, though I doubt that the relational database and hierarchy management will be out of fashion for a while.

So it could be good to know if you have seen or worked with graph databases in master data management beyond representing a static analysis result as a graph database.

(Figure: Property graph, from the Wikipedia article on graph database)


The Postal Address Hierarchy

Using postal addresses is a core element in many data quality improvement and master data management (MDM) activities.

As touched on many times on this blog, postal addresses are formatted very differently around the world. However, they may all be arranged in a sort of hierarchy, where there are up to 6 general levels:

  • Country
  • Region
  • City or district
  • Thoroughfare (street) or block
  • Building number
  • Unit within building

In addition to that the postal code (postcode or zip code) is part of many address formats. Seen in the hierarchical light the postal code is a tricky concept as it may identify a city, district, thoroughfare, a single building or even a given unit within or section of a building. The latter is true for my company address in the United Kingdom, where we have a very granular postcode system.
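
The six levels, with the postal code kept beside the hierarchy rather than inside it, could be sketched like this in Python. The class name, the field names and the sample address are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# The six general levels, from the top of the hierarchy and down.
LEVELS = ("country", "region", "city_or_district",
          "thoroughfare_or_block", "building_number", "unit")


@dataclass
class PostalAddress:
    country: str
    region: Optional[str] = None
    city_or_district: Optional[str] = None
    thoroughfare_or_block: Optional[str] = None
    building_number: Optional[str] = None
    unit: Optional[str] = None
    # The postal code may point at any level, so it is not a level itself.
    postal_code: Optional[str] = None

    def path(self):
        """The filled-in levels, ordered from country downwards."""
        return [getattr(self, level) for level in LEVELS if getattr(self, level)]


# A made-up address in a country where the region is not part of the format.
address = PostalAddress(country="Exampleland", city_or_district="Sampletown",
                        thoroughfare_or_block="Main Street", building_number="21 A",
                        postal_code="1234")
```

Every level except the country is optional here, which reflects that not all address formats use all six levels.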

Country

As discussed in the post The Country List even the top level of a postal address hierarchy isn’t a simple list fit for every purpose. Some issues are:

  • There are different sources with different perceptions of which are the countries on this planet
  • What we regard as countries comes in hierarchies
  • Several coding systems are available

Region

The region is an element in some address formats, like the states in the United States and the provinces in Canada, while other countries, like Germany, which is divided into quite independent Länder, do not have the region as a required part of the postal address. The same goes for Swiss cantons.

City or district

I once read that if you used the label city in a web form in Australia, you would get a lot of values like: “I do not live in a city”.

Anyway this level is often (but as mentioned certainly not always) where the postal code is applied. The postal code district may be a single town with surroundings, several villages or a district within a big city.

Thoroughfare (street) or block

Most countries use thoroughfares such as streets, roads, lanes, avenues, mews, boulevards and whatever else they are called around the world. Beware that the same street may have several spellings and even several names.

Japan is a counterexample of the use of thoroughfares, as here it’s the blocks between the thoroughfares that are part of the postal address.

Building number

Usually this element will be an integer. However formats with a letter behind the integer (example: 21 A) or a range of integers (example: 21-23) are most annoying. And then this British classic: One Main Grove. OMG.
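
Handling these formats could be sketched with a regular expression as below. This is a minimal Python illustration that only covers the three numeric shapes mentioned above; the function name and the returned keys are assumptions, and spelled-out numbers are deliberately left unparsed.

```python
import re

# An integer, optionally followed by either "-<integer>" (a range) or a single letter.
_BUILDING_NUMBER = re.compile(r"^\s*(\d+)\s*(?:-\s*(\d+)|([A-Za-z]))?\s*$")


def parse_building_number(text):
    """Return a dict with from, to and suffix, or None if the format is unknown."""
    match = _BUILDING_NUMBER.match(text)
    if match is None:
        return None
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    return {"from": start, "to": end, "suffix": match.group(3) or ""}


plain = parse_building_number("21")
with_letter = parse_building_number("21 A")
as_range = parse_building_number("21-23")
spelled_out = parse_building_number("One")  # left for manual handling
```

Returning None for anything else keeps values like the spelled-out number visible for follow-up instead of silently forcing them into an integer.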

Unit within a building

This element may or may not be present in a postal address, depending on whether the building is a single family house or company site, whether the postal delivery service sees it as such, or whether you can actually indicate where within the building the delivery goes or you go. The ups and downs of this level are examined in the post A Universal Challenge.


On MDM, Data Models and Big Data

As described in the post Small Data with Big Impact my guess is that we will see Master Data Management solutions as a core element in having data architectures that are able to make sustainable results from dealing with big data.

If we look at party master data, a serious problem with many ERP and CRM systems around is that the data model for party master data isn’t good enough for dealing with the many different forms in which the parties we hold data about are represented in big data sources. This makes the linking between traditional systems of record and big data very hard.

Having a Master Data Management (MDM) solution with a comprehensive data model for party master data is essential here.

Some of the capabilities we need are:

Storing multiple occurrences of attributes

People and companies have many phone numbers, they have many eMail addresses and they have many social identities and you will for sure meet these different occurrences in big data sources. Relating these different occurrences to the same real world entity is essential as reported in the post 180 Degree Prospective Customer View isn’t Unusual.

An MDM hub with a corresponding data model is the place to manage that challenge in one place.
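
Storing multiple occurrences and relating them to the same real world entity could be sketched as follows. This is a toy Python illustration with invented names and sample values; real matching would need far more than lower-casing and trimming.

```python
from collections import defaultdict


class PartyHub:
    """Keeps many contact points per party and resolves them back to the party."""

    def __init__(self):
        self._contact_points = defaultdict(set)  # party_id -> {(kind, value)}
        self._index = {}                          # (kind, value) -> party_id

    @staticmethod
    def _key(kind, value):
        # Very naive normalisation, only for the purpose of the sketch.
        return (kind, value.strip().lower())

    def add(self, party_id, kind, value):
        key = self._key(kind, value)
        self._contact_points[party_id].add(key)
        self._index[key] = party_id

    def resolve(self, kind, value):
        """Return the party that owns this contact point, or None."""
        return self._index.get(self._key(kind, value))

    def occurrences(self, party_id, kind):
        return sorted(v for k, v in self._contact_points[party_id] if k == kind)


hub = PartyHub()
hub.add("P-1", "email", "Jane@Example.com")
hub.add("P-1", "email", "jane.doe@example.org")
hub.add("P-1", "social", "@janedoe")
```

A mention of any of the three contact points in a big data source can then be traced back to the same party.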

Exploiting rich external reference data

As told in the post Where the Streets have Two Names and emphasized in the comments to the post the real world has plenty of examples of the same thing having many names. And this real world will be reflected in big data sources.

Your MDM solution should embrace external reference data solving these issues.

Handling the time dimension

In the post A Place in Time the flaws of the usual customer table in ERP and CRM systems are examined. One common issue is handling when attributes change. Change of address happens a lot. And this may be complicated by the fact that we may operate several address types at the same time, like visiting addresses, billing addresses and correspondence addresses. These different addresses will also pop up in big data sources. And the same goes for other attributes.

You must get that right in your MDM implementation.
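
One way of handling the time dimension is to give every address assignment a type and a validity period. The Python sketch below assumes half-open periods (valid from a date up to, but not including, the end date); the class name, field names and sample history are invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AddressAssignment:
    address_type: str            # for example "visiting", "billing", "correspondence"
    address: str
    valid_from: date
    valid_to: Optional[date] = None  # None means still current

    def is_valid_on(self, day):
        return self.valid_from <= day and (self.valid_to is None or day < self.valid_to)


def addresses_on(assignments, day, address_type):
    """All addresses of one type that were valid on the given day."""
    return [a.address for a in assignments
            if a.address_type == address_type and a.is_valid_on(day)]


history = [
    AddressAssignment("visiting", "1 Old Road", date(2010, 1, 1), date(2013, 6, 1)),
    AddressAssignment("visiting", "2 New Street", date(2013, 6, 1)),
    AddressAssignment("billing", "PO Box 9", date(2010, 1, 1)),
]
```

With this in place an old address found in a big data source still resolves to the right party instead of looking like an unknown one.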

(Figure: Customer Table. The usual but very wrong customer table that won’t work with big data.)


Where the Streets have one Name but Two Spellings

Last week’s post called Where The Streets have Two Names attracted a lot of comments both on this blog and in LinkedIn groups, such as Data Quality Professionals and The Data Quality Association, with a lot of examples from around the world of how this challenge actually exists more or less everywhere.

Recently I had the pleasure of experiencing a variant of the challenge when driving around in a rented car in the Saint Petersburg area in Russia. Here the streets usually only have one name, but that name may be presented in two different alphabets: the local Cyrillic or the Latin alphabet I’m used to, which was also included in the reference data on the Sat Nav. So while it was nice for me to type destinations in Latin letters, it was also nice to have directions in Cyrillic in order to follow the progress on road signs.

So here standardization (or standardisation) to one preferred language, alphabet or script system isn’t the best solution. Best of breed solutions for handling addresses must be able to handle several correct spellings for the same address.
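
Handling several correct spellings for the same street could be sketched as below. This is a minimal Python illustration; the class name StreetDirectory and the identifier "street-1" are hypothetical, and the Cyrillic spelling is the usual one for the street named in the Latin spelling.

```python
class StreetDirectory:
    """Keeps every correct spelling of a street and finds the street by any of them."""

    def __init__(self):
        self._spellings = {}  # street_id -> list of correct spellings
        self._lookup = {}     # case-folded spelling -> street_id

    def register(self, street_id, *spellings):
        self._spellings.setdefault(street_id, []).extend(spellings)
        for spelling in spellings:
            self._lookup[spelling.strip().casefold()] = street_id

    def find(self, spelling):
        """Return the street id for a spelling, or None if it is unknown."""
        return self._lookup.get(spelling.strip().casefold())

    def spellings(self, street_id):
        return list(self._spellings.get(street_id, []))


directory = StreetDirectory()
directory.register("street-1", "Невский проспект", "Nevsky Prospekt")
```

Nothing is standardised away: a destination typed in Latin letters and a road sign read in Cyrillic both lead to the same street, and both spellings remain available for presentation.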

(Figure: Nevsky Prospekt, St Petersburg. Street sign in Cyrillic with Latin subtitle.)
