Is Managing Master Data a Differentiating Capability?

If you are a Master Data Management (MDM) fanatic who sees the MDM solution as the centre of the universe, and you plan to attend the MDM Summit Europe 2013, then you might as well start to work on your consistency in booing and your accuracy in throwing rotten tomatoes.

In the session called Multi-Entity MDM for the Enterprise, Bert Hooyman will shock you by telling you that managing master data is not considered a differentiating capability at Royal Philips Electronics.

The solution at Philips is based on the information factory idea and built upon data warehouse technology. Master data and transactional data are treated equally.

Where others may struggle with Multi-Entity / Multidomain MDM, the path chosen by Philips already serves multiple business cases for combining party master data and product master data.

I guess the term “a Philips light bulb moment” could have been used too much, so let me just say that I look forward to being enlightened on how to do MDM in an energy-saving way.


The Greenland Problem in MDM

In a recent comment here on this blog the relevance of Master Data Management (MDM) solutions was questioned, because in real business life different business units see master data very differently even though the data describes the same real world entity. And it’s not the first time I have heard this argument.

The issue is similar to the Greenland problem in geography. On the Mercator projection, the most common way of visualizing a round earth on a flat map, Greenland keeps its true shape but looks about the same size as Africa, though Africa is over 10 times as large as Greenland.
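The size distortion can be put into numbers. The Mercator projection stretches a map by 1/cos(latitude) in both directions, so areas grow by the square of that factor. The snippet below is a minimal sketch of that arithmetic; the latitudes chosen for central Africa and central Greenland are rough assumptions for illustration.

```python
import math


def mercator_area_inflation(latitude_deg: float) -> float:
    """Factor by which the Mercator projection inflates areas at a latitude."""
    # Mercator stretches both map axes by 1/cos(latitude), so areas grow by its square.
    return 1.0 / math.cos(math.radians(latitude_deg)) ** 2


# Rough, assumed latitudes: the equator for central Africa, 72°N for central Greenland.
for place, latitude in [("central Africa", 0.0), ("central Greenland", 72.0)]:
    factor = mercator_area_inflation(latitude)
    print(f"{place}: areas drawn about {factor:.1f} times too large")
```

At the equator the factor is 1, while at Greenland's latitudes it is roughly tenfold, which is why the two land masses end up looking similar in size.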

As examined in the post Sharing data is key to a single version of the truth this is similar to the problems in fulfilling multiple uses embracing all business units in an enterprise:

  • If a map shows a limited part of the world the difference doesn’t matter that much. This is similar to fitting the purpose of use in a single business unit.
  • If the map shows the whole world, we may have all kinds of different projections offering different kinds of views on the world, each with some advantages and disadvantages, like when we do enterprise MDM.

Today we have new technology coming to the rescue. If you go into Google Earth the world indeed looks round and you may have any high altitude view of an apparently round world. If you go closer the map tends to be more and more flat.

My guess is that the solutions to the multiple uses conundrum within MDM will also be offered from the cloud, by innovative solutions that reflect the real world entities and relate them to a variety of business functions used in different business units, offering a range of views that support multiple purposes of use.


The Real Estate Domain

In the comments on the recent blog post about multidomain MDM (Master Data Management) it was discussed to what degree multidomain MDM is much more than CDI (Customer Data Integration) and PIM (Product Information Management).

While customer (or rather party) and product are important master entity types, there are of course a lot of other master entity types. The location domain is often mentioned as the third domain in MDM, and then there are some entity types most relevant for specific industries like an insurance policy or a vehicle in public transit, and in public transit we also have the calendar as an important master entity type.

One of the entity types that doesn’t belong to the party domain and in many ways is a different thing than a product is real estate (or real property, or just property if you like).

For a realtor, a piece of real estate looks like a product of course. And it’s all about location, location, location.

Right now I’m working with the instant Data Quality framework. Here we are embracing the party domain by having access to external reference sources about individuals and companies, we are embracing the location domain by having access to external reference sources about addresses and then we are also embracing the real estate domain by having access to external reference sources about properties.

Real properties have addresses in many cases and are therefore close to the location domain. For some business processes a property is a product with a product key, as mentioned for realtors. For some business processes it is a security, often identified by other keys than the postal address. It is related to different party roles like an occupier (or several) and an owner (or several) that may or may not be the same party (or parties).

What about you? Do you feel at home with the real estate entity type?


Hierarchical Single Source of Truth

Most data quality and master data management gurus, experts and practitioners agree that achieving a “single source of truth” is a nice term, but is not what data quality and master data management is really about as expressed by Michele Goetz in the post Master Data Management Does Not Equal The Single Source Of Truth.

Even among those people, including me, who think that emphasis on real world alignment could help in getting better data and information quality, as opposed to focusing on fitness for multiple different purposes of use, there is acknowledgement that there is a “digital distance” between real world aligned data and the real world, as explained by Jim Harris in the post Plato’s Data. Also, different publicly available reference data sources that should reflect the real world for the same entity are often in disagreement.

When working with improvement of data quality in party master data, which is the most frequent and common master data domain with issues, you encounter the same issues over and over again, like:

  • Many organizations have a considerable overlap of real world entities that are a customer and a supplier at the same time. Expanding to other party roles, this intersection is even bigger. This calls for a 360° Business Partner View.
  • Most organizations divide activities into business-to-business (B2B) and business-to-consumer (B2C). But the great majority of businesses are small companies where business and private is a mixed case, as told in the post So, how about SOHO homes.
  • When doing B2C including membership administration in non-profit you often have a mix of single individuals and households in your core customer database as reported in the post Household Householding.
  • As examined in the post Happy Uniqueness there is a lot of good fit for purpose of use reasons why customer and other party master data entities are deliberately duplicated within different applications.
  • Lately social master data management (Social MDM) has emerged as the new leg in mastering data within multi-channel business. Embracing a wealth of digital identities will become yet another challenge in getting a single customer view and in reaching for the impossible, and not always desirable, single source of truth.

A way of getting some kind of structure into this possible, and actually very common, mess is to strive for a hierarchical single source of truth where the concept of a golden record is implemented as a model with golden relations between real world aligned external reference data and internal fit for purpose of use master data.
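One way to picture such a hierarchy is as a small data model. The sketch below is only an illustration under assumptions: the class names, fields, source names and identifiers are hypothetical and not taken from any particular MDM product. A golden relation ties one real world aligned reference entity to the internal, fit for purpose records that describe it, without forcing those records to be merged.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceEntity:
    """A real world aligned record from an external reference source."""
    source: str       # hypothetical source name, e.g. a business directory
    external_id: str
    name: str


@dataclass
class InternalRecord:
    """A fit for purpose record kept in one internal application."""
    system: str       # hypothetical system name, e.g. "CRM" or "ERP"
    local_id: str
    role: str         # e.g. "customer" or "supplier"


@dataclass
class GoldenRelation:
    """Links one real world entity to the internal records that describe it."""
    reference: ReferenceEntity
    members: list[InternalRecord] = field(default_factory=list)

    def roles(self) -> list[str]:
        # The party roles this real world entity plays across applications.
        return sorted({member.role for member in self.members})


# Made-up example: the same company is a customer in one system and a supplier in another.
acme = GoldenRelation(ReferenceEntity("business-directory", "REF-001", "Acme Ltd"))
acme.members.append(InternalRecord("CRM", "C-42", "customer"))
acme.members.append(InternalRecord("ERP", "S-17", "supplier"))
print(acme.roles())
```

The deliberate duplicates stay where they are; the relation is what provides the 360° view.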

Right now I’m having an exciting time doing just that as described in the post Doing MDM in the Cloud.


Naming the Olympians

The British newspaper The Guardian has a feature on their website where you can get data about the Olympians. Link here: London 2012 Olympic athletes: the full list.

Browsing the list is a good reminder of the world-wide diversity we have with person names.

The names are here formatted with the surname(s) followed by the given name(s). The surname is in upper case.

The sequence of names is the one that Chinese and other East Asian Olympians are used to, as opposed to Olympians from places where the first name is the given name and the last name is the surname.

Having the surname in upper case also shows where Olympians have two surnames, as is the custom in Spanish cultures.
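The upper case convention makes the list easy to split by machine. Here is a minimal sketch, assuming every entry starts with one or more upper case surname tokens followed by the given names; the two-surname example is made up, and an entry with an upper case initial as given name would need extra care.

```python
def split_listed_name(listed: str) -> tuple[str, str]:
    """Split 'SURNAME(S) Given(s)' into (surname, given names)."""
    tokens = listed.split()
    # Count the leading run of upper case tokens: these form the surname(s).
    surname_count = 0
    while surname_count < len(tokens) and tokens[surname_count].isupper():
        surname_count += 1
    surname = " ".join(tokens[:surname_count])
    given = " ".join(tokens[surname_count:])
    return surname, given


print(split_listed_name("JIANG Wenwen"))
print(split_listed_name("GARCIA LOPEZ Maria"))  # made-up name with two surnames
```

The same function handles one or two surnames, because it relies on case and not on position.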

And oh yes. The South African guy has JIM as his surname.

Finally, this screen shot raises a good question: Is JIANG Wenwen superb at both synchronized swimming and track cycling, or are these two different Olympians with the same name? Some names are very common in China. A little googling tells me that they are two different persons. The synchronized swimmer is more related to her twin sister and swimming partner JIANG Tingting.

Let’s check if there is more than one “John Smith”.

Nope.

But it could be fun if “Kim Smith” and “Kimberley Smith” came from the same country.

Many Olympians actually don’t have the names reflected in this sheet as many have names in a different alphabet or script system.

The Danish cyclist “SORENSEN Nicki” actually shares my last name, as we know him as “Nicki Sørensen”. The Serbian, Ukrainian and Russian Olympians have their original names in the Cyrillic alphabet, but these have been transliterated to the English alphabet, and Olympians from countries with script systems other than an alphabet have had their names transcribed to the (English) alphabet.
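When matching such a list against data that keeps the original spelling, one option is to fold both sides to the same plain letters before comparing. The sketch below uses Python's standard `unicodedata` module. Note that a letter like “ø” has no Unicode decomposition, so it needs an explicit mapping; the small mapping table here is an illustrative assumption, not a complete list, and it does not cover Cyrillic or other scripts.

```python
import unicodedata

# Letters without a Unicode decomposition need an explicit mapping.
# This small table is an illustrative assumption, not a complete list.
EXTRA_FOLDS = str.maketrans({
    "ø": "o", "Ø": "O",
    "æ": "ae", "Æ": "AE",
    "ß": "ss",
    "ł": "l", "Ł": "L",
})


def fold_diacritics(name: str) -> str:
    """Replace accented letters with their plain base letters."""
    decomposed = unicodedata.normalize("NFKD", name.translate(EXTRA_FOLDS))
    # Drop the combining marks that the decomposition separated out.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_name(left: str, right: str) -> bool:
    """Compare two names while ignoring case and diacritics."""
    return fold_diacritics(left).casefold() == fold_diacritics(right).casefold()


print(fold_diacritics("Sørensen"))
print(same_name("SORENSEN", "Sørensen"))
```

Folding is lossy by design, so it fits matching and searching better than storing: the original spelling should be kept alongside.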

So, is the list bad data quality?


The Big Tower of Babel

3 years ago one of the first blog posts on this blog was called The Tower of Babel.

This post was the first of many posts about multi-cultural challenges in data quality improvement. These challenges include not only language variations but also different character sets reflecting different alphabets and script systems, naming traditions, address formats, measure units, privacy norms and government registration practices, to name some of the ones I have experienced.

When organizations are working internationally it may be tempting to build a new Tower of Babel, imposing the same language for metadata (probably English) and the same standards for names, addresses and other master data (probably the ones of the country where the headquarters is).

However, building such a high tower may end up the same way as the Tower of Babel known from the old religious tales.

Alternatively a mapping approach may be technically a bit more complex but much easier when it comes to change management.

The mapping approach is used in the Universal Postal Union’s (UPU) attempt to make a “standard” for worldwide addresses. The UPU S42 standard is mentioned in the post Down the Street. The S42 standard does not impose the same way of writing on envelopes all over the world, but facilitates mapping the existing ways into a common tagging mapped to a common structure.

Building such a mapping based “standard” for addresses, and other master data with international diversity, in your organization may be a very good way to cope with balancing the need for standardization and the risks in change management including having trusted and actionable master data.
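As a minimal sketch of what a mapping based approach could look like in practice: each country keeps its own field names and its own way of writing an address, and a mapping table translates both to and from a shared set of tags. The tag names, local field names and templates below are assumptions made up for illustration; they are not the actual UPU S42 element names. The Danish address is fictional.

```python
# Per-country mapping from local field names to common tags.
# All names here are illustrative assumptions, not UPU S42 element names.
FIELD_MAPS = {
    "DK": {"vejnavn": "street", "husnr": "house_number",
           "postnr": "postcode", "by": "city"},
    "GB": {"thoroughfare": "street", "building_number": "house_number",
           "postcode": "postcode", "post_town": "city"},
}

# Per-country rendering templates: the local way of writing is kept as it is.
LINE_TEMPLATES = {
    "DK": ["{street} {house_number}", "{postcode} {city}"],
    "GB": ["{house_number} {street}", "{city}", "{postcode}"],
}


def to_common(country: str, local_record: dict[str, str]) -> dict[str, str]:
    """Map a locally structured address to the common tags."""
    mapping = FIELD_MAPS[country]
    common = {mapping[key]: value
              for key, value in local_record.items() if key in mapping}
    common["country"] = country
    return common


def render(common: dict[str, str]) -> list[str]:
    """Write a commonly tagged address the way its own country expects."""
    templates = LINE_TEMPLATES[common["country"]]
    return [line.format(**common) for line in templates]


british = to_common("GB", {"building_number": "10", "thoroughfare": "Downing Street",
                           "post_town": "London", "postcode": "SW1A 2AA"})
danish = to_common("DK", {"vejnavn": "Eksempelvej", "husnr": "1",
                          "postnr": "1234", "by": "Eksempelby"})  # fictional address
print(render(british))
print(render(danish))
```

Adding a country then means adding two table entries, not changing how anyone already writes or stores addresses, which is where the change management benefit comes from.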

The principle of embracing and mapping international diversity is a core element in the service I’m currently working with. It’s not that the instant Data Quality service doesn’t stretch into the clouds. Certainly it is a cloud service pulling data quality from the cloud. It’s not that it isn’t big. Certainly it is based on big reference data.


Broken Links

When passing the results of data cleansing activities back to source systems I have often encountered what one might call broken links, which have called for designing data flows that don’t go by the book, don’t match the first picture of the real world and eventually prompt last minute alternate ways of doing things.

I have had the same experience when passing some real (and not real) world bridges lately.

The Trembling Lady: An Unsound Bridge

When walking around in London a sign on the Albert Bridge caught my eye. The sign instructs troops to break steps when marching over.

In researching the Albert Bridge on Wikipedia I learned that the bridge has an unsound construction that makes it vibrate, not least when a bunch of troops march across in rhythm. The bridge has therefore got the nickname “The Trembling Lady”.

It’s an old sign. The bridge is an old bridge. But it’s still standing.

In the same way we often have to deal with old systems running on unstable databases with unsound data models. That’s life. Though it’s not the way we want to see it, we must break the rhythm of otherwise perfectly cleansed data, as discussed in the post Storing a Single Version of the Truth.

The Øresund Bridge: The Sound Link

The sound between the city of Malmö in Sweden and København (Copenhagen) in Denmark can be crossed by the Øresund Bridge. If looking at a satellite picture you may conclude that the bridge isn’t finished. That’s because a part of the link is in fact an undersea tunnel as told in the post Geocoding from 100 Feet Under.

Your first image about what can be done and what can’t be done isn’t always the way of the world. Dig into some more sources, find some more charts and you may find a way.

However, life isn’t always easy. Sometimes charts and maps can be deceiving.

Wodna: The Sound of Silence

As reported in the post Troubled Bridge over Water I planned a cycling trip last summer. The route would take us across the Polish river Świna by a bridge I found on Google Maps.

When, after a hard day’s ride in the saddle, we reached the river, the bridge wasn’t there. We had to take a ferry across the river instead.

Maybe I should have known. The bridge on the map was named Wodna. That is Polish for (something with) water.


Nationally International

I am right now in the process of moving most of my business from the Kingdom of Denmark to the United Kingdom.

During that process I have become a regular customer at the Gatwick Express, the (sometimes) fast train going from London’s second largest airport to central London.

When buying tickets online they require you to enter a billing address. Here you can choose between entering a UK address or an international address.

If you enter a UK address the site takes advantage of the UK postal code system where you just have to enter a postcode, which is very granular in the UK, and a house number, and then the system will know your address.

Alternatively you can choose to enter an international address. In that case you will get a form with more fields for you to enter. But, in order not to be too international, the form still has the UK way of formatting an address.

Also the default country is United Kingdom which I guess is the only value that should not be applicable for this form.


Party, Product, Place. Period.

In a recent post here on this blog the Master Data Management domain usually called locations was examined and followed by excellent comments.

Also in the DAMA International LinkedIn group there was a great discussion around the location domain.

The comments touched two subjects:

  • Are locations just geographic locations, or can we deal with “digital locations” such as email addresses, phone numbers, websites, go-to-meeting IDs, social network IDs and so on as locations as well?
  • How do we model the relations between parties, products and locations?

Sometimes I like to use the word places instead of locations as we then have a P-trinity of parties, products and places.

I’m not sure if places have a stronger semantic link to geography than locations have. Anyway, my thoughts on the location domain were merely connected to geography. The digital locations mentioned are, in my eyes, more related to parties and not so much to products. The same is true for another good old substitute for a location or address, namely a mailbox (like “Postbox 1234”), which is a valid notion for the destination of a letter or small package, and is often seen in database columns otherwise filled with geographic locations.

So, sticking to places being physical, geographic locations: How do we model parties, products and places?

First of all it’s important that we are able to model different concepts within each domain in one single way. A very common situation in many enterprise data landscapes is that different forms of parties exist with different models, like a model for customers, a model for suppliers, a model for employees and other models for other business partner roles.

The association between a party entity and a location entity is in most cases a time dependent relation, like: this consumer was billed at this address in this period. The relation between the party and the product is the good old basic data model: we invoiced this and this product on that date. The product and place relation is very industry specific. One example will be that an on-site service contract applies to this address in this period.

Time, often handled as a period, will indeed add a fourth P to the P-trinity of party, product and place.
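A time dependent relation between a party and a place can be sketched as a record that carries its own validity period. The example below is a minimal illustration under assumptions: the class, the field names and the identifiers are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PartyPlaceRelation:
    """A party used a place in a given role during a given period."""
    party_id: str
    place_id: str
    role: str                  # e.g. "billing address"
    valid_from: date
    valid_to: Optional[date]   # None means the relation is still valid

    def covers(self, day: date) -> bool:
        # The period is inclusive at both ends.
        return self.valid_from <= day and (self.valid_to is None or day <= self.valid_to)


def places_on(relations: list[PartyPlaceRelation], party_id: str,
              role: str, day: date) -> list[str]:
    """Find the places a party used in a given role on a given day."""
    return [r.place_id for r in relations
            if r.party_id == party_id and r.role == role and r.covers(day)]


# Made-up history: the consumer moved the billing address at the start of 2012.
history = [
    PartyPlaceRelation("party-1", "place-A", "billing address",
                       date(2010, 1, 1), date(2011, 12, 31)),
    PartyPlaceRelation("party-1", "place-B", "billing address",
                       date(2012, 1, 1), None),
]
print(places_on(history, "party-1", "billing address", date(2011, 6, 1)))
print(places_on(history, "party-1", "billing address", date(2012, 6, 1)))
```

Keeping the period on the relation, and not on the party or the place, means that neither master entity has to change when a party moves.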


The Location Domain

When talking master data management we usually divide the discipline into domains, where the two most prominent domains are:

  • Customer, or rather party, master data management
  • Product, sometimes also named “things”, master data management

One of the most frequently mentioned additional domains is locations.

But even though locations are all around, we seldom see a business initiative aimed at enterprise wide location data management under a slogan of having a 360 degree view of locations. Most often locations are seen as a subset of either the party master data or in some cases the product master data.

Industry diversity

The need for having locations as focus area varies between industries.

In some industries like public transit, where I have been working a lot, locations are implicit in the delivered services. Travel and hospitality is another example of a tight connection between the product and a location. Also some insurance products have a location element. And do I have to mention real estate: Location, Location, Location.

In other industries the location has a more moderate relation to the product domain. There may be some considerations around plant and warehouse locations, but that’s usually not high volume and complex stuff.  

Locations as a main factor in exploiting demographic stereotypes are important in retailing and other business-to-consumer (B2C) activities. When doing B2C you often want to see your customer as the household where the location is a main, but treacherous, factor in doing so. We had a discussion on the house-holding dilemma in the LinkedIn Data Matching group recently.

Whenever you, or a partner of yours, are delivering physical goods or a physical letter of any kind to a customer, it’s crucial to have high quality location master data. The impact of not having that is of course dependent on the volume of deliveries.   

Globalization

If you ask me about London, I will instinctively think about the London in England. But there is a pretty big London in Canada too, that would be top of mind to other people. And there are other smaller Londons around the world.

Master data with location attributes increasingly comes in populations covering more than one country. It’s not that ambiguous place names don’t exist in single country sets. Ambiguous place names were the main reason why many countries have a postal code system. However the British, and the Canadians, invented systems including letters, unlike most other systems, which only have numbers, typically with an embedded geographic hierarchy.

Apart from the different standards in use, the possibilities for exploiting external reference data differ widely across countries when it comes to data quality dimensions such as timeliness, consistency, completeness and conformity, and when it comes to price.

Handling location data from many countries at the same time ruins many best practices of handling location data that have worked for handling location for a single country.

Geocoding

Instead of identifying locations in a textual way by having country codes, state/province abbreviations, postal codes and/or city names, street names and types or blocks and house numbers and names, it has become increasingly popular to use geocoding as a supplement or even an alternative.

There are different types of geocodes out there suitable for different purposes. Examples are:

  • Latitude and longitude picturing a round world
  • UTM X, Y coordinates picturing peels of the world
  • Web Mercator (WGS 84 based) X, Y coordinates picturing a world as flat as your computer screen

While geocoding has a lot to offer in identification and global standardization, we of course have a gap between geocodes and everyday language. If you want to learn more then come and visit me at N55°38′47″, E12°32′58″.
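Moving between these geocode types is mostly arithmetic. The sketch below reads the coordinates above as degrees, minutes and seconds, converts them to decimal degrees, and then projects them to the flat X, Y coordinates used by most web maps (commonly known as Web Mercator). The function names are my own, and the projection uses the usual spherical formulas with the WGS 84 equatorial radius.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 equatorial radius, as used by Web Mercator


def dms_to_decimal(degrees: int, minutes: int, seconds: float, hemisphere: str) -> float:
    """Convert degrees, minutes and seconds to signed decimal degrees."""
    sign = -1.0 if hemisphere in ("S", "W") else 1.0
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)


def to_web_mercator(latitude_deg: float, longitude_deg: float) -> tuple[float, float]:
    """Project latitude and longitude to flat X, Y coordinates in metres."""
    x = EARTH_RADIUS_M * math.radians(longitude_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4.0 + math.radians(latitude_deg) / 2.0))
    return x, y


latitude = dms_to_decimal(55, 38, 47, "N")
longitude = dms_to_decimal(12, 32, 58, "E")
print(f"Decimal degrees: {latitude:.6f}, {longitude:.6f}")
print("Web Mercator X, Y in metres:", to_web_mercator(latitude, longitude))
```

The projection is only defined up to about 85 degrees north and south, which is one more reminder that each geocode type suits some purposes better than others.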
