Big Data and Multi-Domain Master Data Management

The possible connection between “big data”, the hot buzz within IT today, and the good old topic of master data management has been discussed a lot lately. An example from CIO UK today is the article Big data without master data management is a problem.

As said in the article there is a connection through big master data (and big reference data) to big transaction data. Big transaction data is what we usually would call big data, because these are the really big ones.

The two most mentioned kinds of big transaction data are:

  • Social data and
  • Sensor data

I have also seen a lot of connections between these kinds of big data and master data in multiple domains.

Social Data

Connecting social data to Master Data Management (MDM) is an ongoing discussion I have been involved in for the last three years, lately through the new LinkedIn group called Social MDM.

The customer master data domain is in focus here, as the immediate connection is how to relate traditional systems of record holding customer master data to the systems of engagement where the big social data are waiting to be analyzed and eventually become part of day-to-day customer centric business processes.

However, being able to analyze, monitor and take action on what is being said about specific products in social data is another option, and eventually that has to be linked to product master data. In product master data management the focus has traditionally been on your own (resell) products. Effectively listening to social data will mean that you also have to manage data about competing products.

Attaching location to social data has been around for a long time. Connecting social data to your master data will also require that your location master data are well aligned with the real world.

Sensor Data

For many years I have been involved in data management within public transportation, where we have big data coming in from sensors of different kinds.

The big problem has for sure been connecting these transactions correctly to master data. The challenges here are described in the post Multi-Entity Master Data Quality.

The biggest problem is that all the different equipment generating the sensor data in practice can’t be at the same stage at the same time. If the data are related without care, this will eventually show very wrong information about who the passenger(s) were, what kind of trip it was, where the journey happened and under which timetable.
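One way to relate such data with care is a time-aware lookup: each sensor event is attached to the master data version that was valid at the event's timestamp, not to whatever happens to be current. The following is a minimal sketch in Python under that assumption. The vehicle identifiers, route names and the ASSIGNMENTS structure are hypothetical and not taken from any real transport system.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical master data: which route a vehicle served, and from when.
# Each list is sorted by its valid-from timestamp.
ASSIGNMENTS = {
    "bus-17": [
        (datetime(2012, 5, 1, 5, 0), "route-2A"),
        (datetime(2012, 5, 1, 12, 30), "route-9"),
    ],
}

def route_at(vehicle_id, event_time):
    """Return the route that was valid for the vehicle at event_time, or None."""
    history = ASSIGNMENTS.get(vehicle_id, [])
    starts = [valid_from for valid_from, _ in history]
    # Index of the last assignment that started at or before the event.
    index = bisect_right(starts, event_time) - 1
    if index < 0:
        return None  # unknown vehicle, or event predates all known master data
    return history[index][1]
```

Returning None instead of guessing keeps events that cannot be related safely visible for later inspection.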


Business Contact Reference Data

When selling data quality software tools and services I have often used external sources for business contact data. Not least when working with data matching and party master data management implementations in business-to-business (B2B) environments, I have seen uploads of these data into CRM sources.

A typical external source for B2B contact data will look like this:

Some of the issues with such data are:

  • Some of the contact names may refer to the same real world individual, as told in the post Echoes in the Database
  • People change jobs all the time. The external lists will typically have entries verified some time ago, and when you upload them to your own databases the data will quickly become useless due to data decay.
  • When working with large companies in customer and other business partner roles you often won’t interact with the top level people, but with people at lower levels who are not reflected in such external sources.
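The data decay issue can at least be made visible by checking how long ago each entry was verified. Here is a minimal sketch in Python, assuming that every uploaded contact carries a verification date. The field names and the one-year threshold are illustrative assumptions, not a recommendation from any particular data provider.

```python
from datetime import date, timedelta

def stale_contacts(contacts, today, max_age_days=365):
    """Return the names of contacts whose last verification is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [c["name"] for c in contacts if c["verified_on"] < cutoff]

# Hypothetical uploaded entries.
contacts = [
    {"name": "A. Example", "verified_on": date(2010, 1, 15)},
    {"name": "B. Sample", "verified_on": date(2012, 3, 1)},
]
flagged = stale_contacts(contacts, today=date(2012, 6, 1))
```

Flagged entries are candidates for re-verification before they are used in business processes.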

The rise of social networks has presented new opportunities for overcoming these challenges as examined in a post (written some years ago) called Who is working where doing what?

However, I haven’t yet seen many attempts to automate and include working with social network profiles in business processes. Surely there are technical issues and, not least, privacy considerations in doing so, as discussed in the post Sharing Social Master Data.

Right now we have a discussion going on in the LinkedIn Social MDM group about examples of connecting social network profiles and master data management. Please add your experiences in the group here – and join if you aren’t already a member.


Do You Want Social MDM?

This weekend I noticed a tweet from the MDM tool vendor Orchestra Networks:

There is clearly something completely wrong with this tweet. Why on earth should a French company use an American date format?

Apart from that there is a very good point. Why should tool vendors work on solving imagined future master data management issues, such as integrating social network profiles with traditional customer master data, while there are plenty of issues that need a better solution today?

Personally I think social MDM is going to be huge. I had some of my first musings on the subject some years ago in the post Social Master Data Management. Probably we will start with some Lean Social MDM, and that is honestly also as far as I have explored this field until now.

What about you? Do you want social MDM?


Avoiding Contact Data Entry Flaws

Contact data is the data domain most often mentioned when talking about data quality. Names and addresses and other identification data are constantly spelled wrong, or just differently, by the employees responsible for entering party master data.

Cleansing data a long time after it has been captured is a common way of dealing with this huge problem. However, preventing typos, mishearings and multi-cultural misunderstandings at data entry is a much better option wherever applicable.

I have worked with two different approaches to ensure the best data quality for contact data entered by employees. These approaches are:

  • Correction and
  • Assistance

Correction

With correction the data entry clerk, sales representative, customer service professional or whoever is entering the data will enter the name, address and other data into a form.

After submitting the form, or in some cases leaving each field on the form, the application will check the content against business rules and available reference data and return a warning or error message and perhaps a correction to the entered data.

As duplicated data is a very common data quality issue in contact data, a frequent example of such a prompt is a warning that a similar contact record already exists in the system.
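A duplicate warning of this kind can be sketched with a simple similarity score. The example below uses Python's standard difflib module. The 0.85 threshold and the normalization are assumptions made for illustration, and production matching engines typically use more elaborate techniques.

```python
from difflib import SequenceMatcher

def normalize(text):
    """Lower-case and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())

def similar_contacts(new_name, existing_names, threshold=0.85):
    """Return (name, score) pairs for existing names that resemble new_name."""
    candidate = normalize(new_name)
    hits = []
    for name in existing_names:
        score = SequenceMatcher(None, candidate, normalize(name)).ratio()
        if score >= threshold:
            hits.append((name, round(score, 2)))
    # Most similar first, so the best candidate can be shown at the top of the warning.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

If the returned list is non-empty, the form can show the candidates and ask whether the new entry is really a new contact.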

Assistance

With assistance we try to minimize the needed number of key strokes and interactively help with searching in available reference data.

For example, when entering address data, assistance based data entry will start with the highest geographical level:

  • If we are dealing with international data, the country will set the context and determine whether a state or province is needed.
  • Where postal codes (like ZIP) exist, this is the fast path to the city.
  • In some countries the postal code only covers one street (thoroughfare), so that’s settled by the postal code. In other situations we will usually have a limited number of streets that can be picked from a list or settled with the first characters.

(I guess many people know this approach from navigation devices for cars.)
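The top-down narrowing described above can be sketched as a few lookups against reference data. This is a minimal illustration in Python. The postal codes, city names and street names are made up, and the set of countries requiring a state or province is only an example, where a real service would query postal and address directories.

```python
# Made-up reference data for illustration only.
COUNTRIES_WITH_STATE = {"US", "CA", "AU"}
POSTAL_CODES = {
    ("DK", "9990"): ("Sample City", ["Harbour Street", "Hill Road", "High Street"]),
    ("NL", "9999 ZZ"): ("Sample Town", ["Canal Lane"]),
}

def needs_state(country):
    """The country sets the context: is a state or province field required?"""
    return country in COUNTRIES_WITH_STATE

def city_for(country, postal_code):
    """The postal code is the fast path to the city."""
    entry = POSTAL_CODES.get((country, postal_code))
    return entry[0] if entry else None

def street_candidates(country, postal_code, typed=""):
    """Streets within the postal code that start with the characters typed so far."""
    entry = POSTAL_CODES.get((country, postal_code))
    if entry is None:
        return []
    prefix = typed.casefold()
    return [street for street in entry[1] if street.casefold().startswith(prefix)]
```

When street_candidates returns a single street, as for the second made-up postal code, the street is settled without any typing.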

Once the valid address is known, you may look up companies registered at that address in business directories and, depending on the country in question, citizens living there in phone directories and other sources, and of course in the internal party master data. This avoids entering names and other data that are already known.

When catching business entities, a search for a name in a business directory often lets you pick up a range of identification data and other valuable data and, not least, a reference key for future data updates.

Lately I have worked intensively with an assistance based cloud service for business processes embracing contact data entry. We have some great testimonials about the advantages of such an approach here: instant Data Quality Testimonials.


Social MDM, Privacy and Data Quality

The term “Social MDM” has been promoted quite well this week, not least as part of the social media information stream from the ongoing user conference of the tool vendor Informatica.

In a blog post called Informatica 9.5 for Big Data Challenge #2: Social, Judy Ko of Informatica introduces the opportunities and challenges.

In the closing remarks Judy says: “There’s still a long way to go to bring social data into the mainstream enterprise, in part due to concerns over privacy and the potential “creepiness” factor of mining social data.”

As I understand it the spearhead Social MDM part of the tool release is a Facebook App that provides connectivity between Facebook and the MDM solution.

Industry analyst R “Ray” Wang examines this in the blog post News Analysis: Informatica Launches MDM 9.5. The analysis states that it now is time to “drive data out of Facebook and not into Facebook”.

The opportunities and challenges of driving data out of Facebook were discussed in a post called exactly Out of Facebook here on the blog some years ago.

Balancing privacy with data hoarding is for sure still a subject that is in no way settled and probably never will be.

Connecting systems of record in traditional MDM solutions with social network profiles is no walkover either. The classic data quality challenges with uniqueness of records and completeness of data only get more difficult, but there are also great opportunities for getting a better picture of your customers and other business partners.

If you are interested in Social MDM and the related challenges and opportunities there is a LinkedIn group for Social MDM.

The group is new, less than a month old at the present time, but there is already a lot of content to dip into, including:


Deduplication vs Identity Resolution

When working with data matching you often find that there basically is a bright view and a dark view.

Traditional data matching as seen in most data quality tools and master data management solutions is the bright view: it is about finding duplicates and making a “single customer view”. Identity resolution is the dark view: preventing fraud and catching criminals, terrorists and other villains.

These two poles were discussed in a blog post and the following comments last year. The post was called What is Identity Resolution?

While deduplication and identity resolution may be treated as polar opposites and seemingly contrary disciplines they are in my eyes interconnected and interdependent. Yin and Yang Data Quality.

At the MDM Summit in London last month one session was about the Golden Nominal, Creating a Single Record View. Here Corinne Brazier, Force Records Manager at the West Midlands Police in the UK, explained how a traditional data quality tool with some matching capabilities was used to deal with “customers” who don’t want to be recognized.

In the post How to Avoid Losing 5 Billion Euros it was examined how both traditional data matching tools and identity screening services can be used to prevent and discover fraudulent behavior.

Deduplication becomes better when some element of identity resolution is added to the process. That includes embracing big reference data in the process. Knowing what is known in available sources about the addresses that are being matched helps. Knowing what is known in business directories about companies helps. Knowing what is known in appropriate citizen directories when deduping records holding data about individuals helps.
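As a small illustration of how reference data can help, the sketch below resolves each record's address to a key from an address directory and then groups records sharing a key as duplicate candidates. The directory content, keys and record layout are hypothetical, and a real address directory would be far larger and would be queried through its own lookup service.

```python
from collections import defaultdict

# Hypothetical address directory mapping spelling variants to one reference key.
ADDRESS_DIRECTORY = {
    "1 main st": "ADDR-001",
    "1 main street": "ADDR-001",
    "22 park ave": "ADDR-002",
}

def address_key(raw_address):
    """Normalize the address text and look it up in the reference directory."""
    cleaned = " ".join(raw_address.lower().replace(".", "").split())
    return ADDRESS_DIRECTORY.get(cleaned)

def candidate_duplicates(records):
    """Group record ids by reference address key; keep keys shared by several records."""
    groups = defaultdict(list)
    for record in records:
        key = address_key(record["address"])
        if key is not None:
            groups[key].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

records = [
    {"id": 1, "address": "1 Main St."},
    {"id": 2, "address": "1  main street"},
    {"id": 3, "address": "22 Park Ave"},
]
```

Records sharing an address key are only candidates. Names and other attributes still have to be compared before anything is merged.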

Identity resolution techniques are based on the same data matching algorithms we use for deduplication. Here, for example, a fuzzy search technology helps a lot compared to using wildcards. And of course the same sources as mentioned above are a key to the resolution.

Right now I’m dipping deep into the world of big reference data such as address directories, business directories, citizen directories and the next big thing, social network profiles. I have no doubt that deduplication and identity resolution will be more yinyang than yin and yang in the future.
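The difference between wildcard search and fuzzy search can be shown with Python's standard library. In this sketch fnmatch stands in for wildcard search and difflib.get_close_matches stands in for a fuzzy search technology. The names and the 0.8 cutoff are illustrative assumptions, and dedicated identity resolution products use their own algorithms.

```python
from difflib import get_close_matches
from fnmatch import fnmatch

# Made-up names for illustration.
NAMES = ["Katherine Johnson", "Catherine Jonson", "Peter Hansen"]

def wildcard_search(pattern, names):
    """Wildcard search: only finds names that fit the literal pattern."""
    return [name for name in names if fnmatch(name.lower(), pattern.lower())]

def fuzzy_search(query, names, cutoff=0.8):
    """Fuzzy search: finds names that are similar enough to the query."""
    lowered = {name.lower(): name for name in names}
    matches = get_close_matches(query.lower(), list(lowered), n=5, cutoff=cutoff)
    return [lowered[match] for match in matches]
```

Here wildcard_search("Kath*", NAMES) misses the variant spelled with a C, while fuzzy_search("Katherine Johnson", NAMES) returns both spellings.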


Häagen-Dazs Datakvalitet

There is a term called foreign branding. Foreign branding describes an implied cachet or superiority of products and services with foreign-sounding names.

Häagen-Dazs ice cream is an example of foreign branding. Though the brand was established in New York the name was supposed to sound Scandinavian.

However, Häagen-Dazs does sound and look somewhat strange to a Scandinavian. The reason is probably that the letter combinations “äa” and “zs” are not part of any native Scandinavian words.

By the way, datakvalitet is the Scandinavian compound word for data quality.

Getting datakvalitet right in world wide data isn’t easy. What works in some countries doesn’t work in other countries, not least when we are talking datakvalitet regarding party master data such as customer master data, supplier master data and employee master data.

One of the reasons why datakvalitet for party master data is different is the varying possibilities for applying big reference data sources. For example, the availability of citizen data is different in New York than in Scandinavia. This affects the ways of reaching optimal datakvalitet, as reported in the post Did They Put a Man on the Moon.

As part of the ongoing globalization, handling international datakvalitet is becoming more and more common. Many enterprises try to deploy enterprise wide datakvalitet initiatives, and shared service centers handle party master data that is unfamiliar to the people working there. This often results in finding a strange word like Häagen-Dazs.


Social Commerce and Multi-Domain MDM

The term social commerce is said to be a subset of eCommerce where social media is used to ultimately drag prospects and returning customers to your website, where a purchase of products and services can be made.

In complex sales processes, typically for Business-to-Business (B2B) sales, the website may offer product information sheets, demo requests, contact forms and other pipeline steps.

This is the moment where your social media engaged (prospective) customer meets your master data as:

  • The (prospective) customer creates and maintains name, address and communication information by using registration functions
  • The (prospective) customer searches for and reads product information on web shops and information sites

One aspect of this transition is how master data is carried over, namely:

  • How is the social network profile used in engagement captured as part of (prospective) customer master data, and should it be part of master data at all?
  • How has product information from the governed master data hub been used as part of the social media engagement, and should the data governance of product data be extended to use in social media at all?

Any thoughts?


Social MDM and Systems of Engagement

Social Master Data Management has been an interest of mine for the last couple of years, and last week I tried to reach out to others in exploring this new era of Master Data Management by creating a group on LinkedIn called Social MDM.

When reading a nice blog with the slogan ”Welcome to the Real (IT) World!” by Max J. Pucher I came across a good illustration by John Mancini showing the history of IT and how the term “Systems of Record” is being replaced (or at least supplemented) by the term “Systems of Engagement”:

Master Data Management (MDM) includes having a System of Record (SOR) describing the core entities that take part in the transactional systems of record that support the daily business in every organization. For example, a golden MDM record describes the party that acts as a customer on an order record, while the products in the underlying order lines are described in golden MDM records for the things dealt with within the organization.

Social Master Data Management (Social MDM) will be about supplementing that System of Record so we are able to further describe the parties taking part in the new Systems of Engagement and link them with the old Systems of Record. These parties are reflected as social network profiles that are owned by the same human beings who are our (prospective) customers, are part of the same household or are a contact for a company being a (prospective) customer or any other business partner.

For a guy like me who started in IT in the mainframe era (just after it had ended according to the above illustration) and went on with mini computers, PCs and the internet, it’s very exciting to be moving on into the social and cloud era.

It will be good to be joined by even more data quality and MDM practitioners and anyone else in the LinkedIn Social MDM group.


At Least Two Versions of the Truth

Precisely one year ago I wrote a post called Single Company View examining the challenges of getting a single business partner view in business-to-business (B2B) party master data.

Yesterday Robert Hawker of Vodafone gave a keynote at the MDM Summit Europe 2012 about supplier master data management.

One of the points was that sometimes you really want exactly the same real world entity to be two golden records in your master data hub, as there may be totally different business activities carried out with the same legal entity. The Vodafone example was:

  • Having an antenna placed on the top of a building owned by a certain company and thus paying a fee for that
  • Buying consultancy services from the same company

I have met such examples many times when doing data matching as told in the post Entity Revolution vs Entity Evolution.
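One way to keep two golden records and still retain a single business partner view is to let both records carry a shared legal entity identifier. The sketch below is an assumption about how that might be modelled in Python. It is not how Vodafone or any specific MDM product does it, and all identifiers and names are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GoldenRecord:
    record_id: str
    legal_entity_id: str  # hypothetical shared key, e.g. a company registration number
    role: str             # the business activity this golden record covers
    name: str

def roles_for_entity(records, legal_entity_id):
    """List every role in which the same legal entity appears in the hub."""
    return [r.role for r in records if r.legal_entity_id == legal_entity_id]

hub = [
    GoldenRecord("G1", "REG-123", "antenna site landlord", "Example Properties"),
    GoldenRecord("G2", "REG-123", "consultancy supplier", "Example Properties"),
    GoldenRecord("G3", "REG-999", "consultancy supplier", "Another Company"),
]
```

A check like roles_for_entity(hub, "REG-123") before ending one relationship shows which other relationships exist with the same legal entity.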

However, on one occasion, many years ago, I worked in a company where not having a single business partner view nearly became a small disaster.

Our company delivered software for membership administration and was at the same time a member of an employer organisation that also happened to be a customer.

A new director got the brilliant idea that cancelling the membership of the employer organization was an obvious cost reduction.

The cancellation was sent. The employer organisation confirmed the cancellation, adding that they were very sorry that internal business rules at the same time forced them to stop being a customer.

Cancellation was cancelled of course and damage control was initiated.
