MDM Summit Europe 2012 Preview

I am looking forward to being at the Master Data Management Summit Europe 2012 next week in London. The conference runs in parallel with the Data Governance Conference Europe 2012.

Data Governance

As I live within a short walking distance of the venue, I won’t have as much time for thinking as Jill Dyché had when she recently attended a conference within driving distance, as reported in her blog post After Gartner MDM, in which Jill considers MDM and takes the road less traveled. In London Jill will be delivering a keynote called Data Governance, What Your CEO Needs to know.

On the Data Governance tracks there will be a panel discussion called Data Governance in a Regulatory Environment with some good folks: Nicola Askham, Dylan Jones, Ken O’Connor and Gwen Thomas.

Nicola is currently writing an excellent blog post series on the Six Characteristics Of A Successful Data Governance Practitioner. Dylan is the founder of DataQualityPro. Ken was the star on the OCDQblog radio show today discussing Solvency II and Data Quality.

Gwen, being the founder of The Data Governance Institute, is chairing the Data Governance Conference while Aaron Zornes, the founder of The MDM Institute, is chairing the MDM Summit.

Master Data, Social MDM and Reference Data Management

The MDM Institute recently issued an “MDM Alert” with Master Data Management & Data Governance Strategic Planning Assumptions for 2012-13, subtitled Pervasive & Pandemic MDM is in Your Future.

Some of the predictions are about reference data and Social MDM.

Social master data management has been a favorite subject of mine for the last couple of years, and I hope to catch up with fellow MDM practitioners and learn how far this has come outside my circles.

Reference Data is a term often used either instead of Master Data or as related to Master Data. Reference data are those data defined and initially maintained outside a single enterprise. Examples from the customer master data realm are a country list, a list of states in a given country or postal code tables for countries around the world.

The trend as I see it is that enterprises seek to benefit from having reference data in more depth than the often modestly populated lists mentioned above. In the customer master data realm such big reference data may be core data about:

  • Addresses being every single valid address typically within a given country.
  • Business entities being every single business entity occupying an address in a given country.
  • Consumers (or Citizens) being every single person living at an address in a given country.

There is often no single source of truth for such data.

As I’m working on an international launch of a product called instant Data Quality (iDQ™), I look forward to exploring how MDM analysts and practitioners see this field developing.

Bat-and-ball Data Quality

Lately Jim Harris of the OCDQblog has written two excellent blog posts, or may I say home runs, discussing data quality with inspiration from baseball.

In the post Quality Starts and Data Quality Jim explains that you may have a tough loss in business despite stellar data quality and a cheap win in business despite horrible data quality, but in the long run, by starting off with good data quality, your organization has a better chance to succeed.

In the follow-up post, called Pitching Perfect Data Quality, Jim ponders that business success is achievable without perfect data quality, but data quality has a role to play.

Now, although baseball is a very popular sport in the United States but largely unknown in the rest of the world, I think we all understand the metaphors.

We also have different but similar sports, with other rules, statistics and terms attached, around the world. The common name for these sports is bat-and-ball games.

In Britain, where I live now, cricket is huge and can be used to attract awareness of data issues. As recently as yesterday the Ordnance Survey, a government body that has registries with addresses, coordinates and maps, published a blog post called Anyone for cricket? British blogger Peter Thomas has also written, among others, a post on cricket and data quality called Wager.

Before coming to Britain I lived in Denmark, where we don’t know baseball and don’t know cricket, but sometimes at family picnics, perhaps after a Carlsberg and a snaps or two, we play a similar game called rundbold, with kid- and grandpa-friendly rules and scoreboard, usually using a tennis ball.

Data quality, not least data quality in relation to party master data, which is the most prominent domain within the discipline, is also a same-same-but-different game around the world, as told in the post Partnerships for the Cloud.

Understanding the rules, statistics and terms of baseball, cricket, rundbold and all the other bat-and-ball games of the world is a daunting task, even though we all know how to hit a ball with a bat.

Updating a Social Business Directory

Business directories have been around for ages. In the old days they were paper based, as with the yellow pages in a phone book. The yellow pages have since become searchable online. We also know commercial business directories such as the Dun & Bradstreet WorldBase, as well as government-operated nationwide directories of companies and industry-specific business directories.

Such business directories often take a crucial role in master data quality work as sources for data enrichment in the quest for getting as close as possible to a single version of the truth when dealing with B2B customer master data, supplier master data and other business partner master data.

A classic core data model for Master Data in CRM systems, SCM solutions and Master Data hubs when doing B2B is that you have:

  • Accounts being the BUSINESS entities who are your customers, suppliers, prospects and all kinds of other business partners
  • Contacts being the EMPLOYEEs working there and acting in roles such as decision makers, influencers, gate keepers, users and so on
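
As a minimal sketch, the account and contact model above could be expressed in Python roughly as follows. The class names, field names and example values are illustrative assumptions, not taken from any particular CRM system, SCM solution or master data hub.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Contact:
    """An employee working at an account and acting in some role."""
    name: str
    role: str  # e.g. "decision maker", "influencer", "gate keeper", "user"


@dataclass
class Account:
    """A business entity: customer, supplier, prospect or other partner."""
    legal_name: str
    partner_type: str  # e.g. "customer", "supplier", "prospect"
    contacts: List[Contact] = field(default_factory=list)

    def add_contact(self, name: str, role: str) -> Contact:
        # Contacts always hang off exactly one account in this simple model.
        contact = Contact(name=name, role=role)
        self.contacts.append(contact)
        return contact


# Hypothetical example data.
account = Account(legal_name="Example Trading Ltd", partner_type="customer")
account.add_contact("Jane Doe", "decision maker")
account.add_contact("John Roe", "user")
```

A real model would typically also carry identifiers from external business directories on the account, so that enrichment and matching have something to hook onto.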

Today we also have to think about social master data management, that is, exploiting reference data in social media as a supplementary source of external data.

As with all social activity, this exercise goes two ways:

  • Finding and monitoring your existing and wanted business partners in the social networks
  • Updating your own data

Most business entities in this world are actually one-man-bands. So is mine. Therefore I went to the LinkedIn company pages this morning and updated data about my company Liliendahl Limited: Unlimited Data Quality and Master Data Management consultancy for tool and service vendors.

Eating the MDM Elephant

The idiom of eating the elephant one bite at a time is often used when trying to envision a roadmap for Master Data Management (MDM).

It’s a bit of a contradiction to look at it that way, because the essence of MDM is an enterprise wide single source of truth eventually for all master data domains.

But it may be the only way.

To use a cliché, MDM is (as any discipline) about people, processes and technology.

An earlier post called Lean MDM described an approach to start consuming the elephant that focuses on data quality and entity resolution technology, starting with building universal data models for party master data and rationalizing the data within a short time frame.

I have often found that many organizations actually don’t want an entity revolution but are more comfortable with entity evolution when it comes to entity resolution, as examined in the post Entity Revolution vs Entity Evolution.

The term “Evolutionary MDM” is used by the MDM vendor Semarchy, as seen on a page called What is Evolutionary MDM?

The idea is to have technology that supports an evolutionary way of implementing MDM. This is in my eyes very important, as people, processes and technology may be prioritized in that order, but shouldn’t be handled in a serial manner that reveals the opportunities and restrictions related to technology at a very late stage in implementing MDM.

Know Your Foreign Customer

I’m not saying that Customer Master Data Management is easy. But within most companies the capabilities for handling domestic customer records are often stellar compared to the capabilities for handling foreign customer records.

It’s not that the knowledge, services and tools don’t exist. If you for example are headquartered in the USA, you will typically use best practice and services available there for domestic records. If you are headquartered in France, you will use best practice and services available there for domestic records. Using the best practices and services for foreign (seen from where you are) records is rarer and, if done, it is often done outside enterprise wide data management.

This situation can’t, and will not, continue to exist. With globalization running at full speed and more and more enterprise wide data management programs being launched, we will need best practices and services embracing worldwide customer records.

New regulatory compliance requirements will also add to this trend. Taking effect next year, the US Foreign Account Tax Compliance Act (FATCA) will urge both US companies and Foreign Financial Institutions to better know their foreign customers and other business partners.

In doing that, you have to know about addresses, business directories and consumer/citizen hubs for an often large range of countries as described in the post The Big ABC of Reference Data.

It may seem a daunting task for each enterprise to be able to embrace big reference data for all the countries where it has customers and other business partners.

My guess, well, actually my plan, is that there will be services, based in the cloud, helping with that, as indicated in the post Partnerships for the Cloud.

Fit for repurposing

Reading a blog post by David Loshin called Data Governance and Quality: Data Reuse vs. Data Repurposing I was, perhaps a bit off topic, inspired to ask whether data are of high quality if they are:

  • Fit for the purpose of use
  • Fit for repurposing

The first definition has been around for many years and has been adopted by many data quality practitioners. I have, however, often encountered situations where reusing data for purposes other than the original one has raised data quality issues with otherwise clean data. One of the first pieces on my own blog discussed that challenge in a post called Fit for what purpose?

Not least within master data management, where data are maintained for multiple uses, this problem is very common.

Data in a master data hub may either:

  • Be entered directly into the hub where multiple uses are handled
  • Be loaded from other sources where data capture was done

In the latter case the data governance necessary to ensure fitness for multiple uses must stretch to the data capture in these sources.

Now, if repurposing is seen as a future not yet discovered purpose of use, what can you then do to ensure that data today are fit for future repurposing?

The only answer is probably real world alignment, as discussed on a page called Data Quality 3.0. Make sure your data reflect the real world as closely as possible when captured, and make sure the data can be maintained in order to keep that alignment. And make sure this is done and facilitated where data are entered.

Sharing Social Master Data

If a company runs a Customer Relationship Management (CRM) system all employees are supposed to enter their interactions with customers and prospects including adding new accounts and contacts if it’s the first engagement.

With the rise of social networks first engagements are increasingly done in those networks. Furthermore new employees often bring old contacts from former employments with them thus utilizing an established relationship that probably is manifested in one or more already existing social network connections.

As explained in the post Social Master Data Management, the term “Social CRM” has been around for a while. We now see CRM solutions where the account and contact master data is primarily built on extracting those data from social networks.

I have just tried out such a solution called Nimble.

If you are more than a one-man-band company, it’s interesting to what degree you are willing (or forced) to share your connections as master data entities for the CRM solution.

In Nimble you can differentiate the setting for each network. I would probably freely choose a setup with Twitter and LinkedIn as shared with the team, but Facebook as private.

But that is just how I think based on my way of using social networks.

There is a fundamental data quality versus privacy issue around utilizing employees’ social network connections as master data for CRM and eventually enterprise wide Master Data Management (MDM).

All things being equal, data quality will be best if everyone contributes within reason. Not least in sales, but also more or less in other functions, you are hired partly because of your relations.

What do you think?

Informatics for adding value to information

Recently the Global Agenda Council on Emerging Technologies within the World Economic Forum has made a list of the top 10 emerging technologies for 2012. According to this list the technology with the greatest potential to provide solutions to global challenges is informatics for adding value to information.

As said in the summary: “The quantity of information now available to individuals and organizations is unprecedented in human history, and the rate of information generation continues to grow exponentially. Yet, the sheer volume of information is in danger of creating more noise than value, and as a result limiting its effective use. Innovations in how information is organized, mined and processed hold the key to filtering out the noise and using the growing wealth of global information to address emerging challenges.”

Big data all over

Surely “big data” is the buzzword within data management these days and looking for extreme data quality will be paramount.

Filtering out the noise and using the growing wealth of global information will help a lot in our endeavour to make a better world and to make better business.

In my focus area, being master data management, we also have to filter out the noise and exploit the growing wealth of information related to what we may call Big Master Data.

Big external reference data

The growth of master data collections is also seen in collections of external reference data.

For example the Dun & Bradstreet WorldBase, holding business entities from around the world, has lately grown quickly from 100 million entities to over 200 million entities. Most of the growth has been due to better coverage outside North America and Western Europe, with the BRIC countries coming in fast. A smaller world resulting in bigger data.

Also one of the BRIC countries, India, is on the way with a huge project for uniquely identifying and holding information about every citizen, which is over a billion people. The project is called Aadhaar.

When we extend such external registries to social networking services by doing Social MDM, we are dealing with a very fast growing number of profiles on Facebook, LinkedIn and other services.

Surely we need informatics for adding the value of big external reference data into our daily master data collections.

Five Moments of Truth in Subscriber Data Management

The term “Subscriber Data Management” with SDM as the TLA is the industry flavor in the telecommunication sector of the general term “Customer Data Management”.

Recently Teresa Cottam, research director of Telesperience, made a good introduction to the subject in an interview on DataQualityPro.com.

As we have the term “Customer Master Data Management”, we will then also have the term “Subscriber Master Data Management”.

Based on my experience with phone companies, “Subscriber Master Data Management” will be very much about (better) handling the subscriber’s life cycle.

These are probably the five most important moments in a subscriber’s life cycle(s):

  • A lead is born
  • Engaging a prospect
  • One more subscriber
  • Churn happens
  • Win-Back happiness

A lead is born

One of the most important things to do when capturing the data at this point is checking whether you already have the person/business behind the subscriber somewhere in the life cycle, or maybe even in other party roles, as examined in the post 360° Business Partner View.

Engaging a prospect

Much of the information prospects are asked about already exists somewhere in the cloud. Why not take advantage of these rich sources, as described in Reference Data at Work in the Cloud? By doing that you will have fewer keystrokes and a much better chance of getting it right the first time.

One more subscriber

After a successful sales process a new subscriber can be added to the subscriber list, often with more data being captured, such as a billing address and credit risk information like credit limit and terms of payment.

This is the point where many party entities are split into data silos. Maybe the current subscriber master data lives on in sales oriented systems while new subscriber data are reentered and enriched in an ERP system and other business applications.

Keeping these data silos aligned is the master data challenge as discussed in the post Boiling Data Silos.

Churn happens

A churn is often seen as the termination of a given subscription. But did the person/business behind the subscription really quit or is the service still covered by other subscriptions by the same person, by the household or within a company family tree?

Is the person no longer among us, or did the business dissolve?

Such questions can be answered better if you are practicing Ongoing Data Maintenance.
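
The churn question above can be sketched as a simple check in Python. This is only an illustration under stated assumptions: the `Subscription` class, the `party_id` and `group_id` keys and the example data are hypothetical, and they presuppose that subscriptions have already been linked to the person/business behind them and to a household or company family tree.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subscription:
    subscription_id: str
    party_id: str   # hypothetical key for the person/business behind it
    group_id: str   # hypothetical key for household or company family tree
    active: bool


def is_real_churn(terminated: Subscription, all_subs: List[Subscription]) -> bool:
    """Return True only if no other active subscription covers the same
    party, household or company family tree."""
    for sub in all_subs:
        if sub.subscription_id == terminated.subscription_id:
            continue  # skip the terminated subscription itself
        same_party = sub.party_id == terminated.party_id
        same_group = sub.group_id == terminated.group_id
        if sub.active and (same_party or same_group):
            return False
    return True


# Hypothetical example: S1 is terminated, but the same household keeps S2.
subs = [
    Subscription("S1", party_id="P1", group_id="H1", active=False),
    Subscription("S2", party_id="P2", group_id="H1", active=True),
    Subscription("S3", party_id="P9", group_id="H7", active=False),
]
household_still_covered = not is_real_churn(subs[0], subs)
lone_subscriber_left = is_real_churn(subs[2], subs)
```

The check is only as good as the party and household links behind it, which is where uniqueness in party master data comes in.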

Win-Back happiness

If a person or business really did quit, but then comes back, be sure to build on the data from the first engagement and not start from scratch again capturing master data and history. Doing so takes care of some of the 55 reasons to improve data quality related to party master data uniqueness.

Wildcard Search versus Fuzzy Search

My last post about search functionality in Master Data Management (MDM) solutions was called Search and if you are lucky you will find.

In the comments the use of wildcards versus fuzzy search was touched upon.

The problem with wildcards

I have a company called “Liliendahl Limited”, which is the spelling of the name as it is registered with Companies House for England and Wales.

But say someone is searching using one of the following strings:

  • “Liliendahl Ltd”,
  • “Liliendal Limited” or
  • “Liljendahl Limited”

Search functionality should in these situations return the hit “Liliendahl Limited”.

Using wildcard characters could, depending on the specific syntax, produce a hit in all combinations of the spelling with a string like this: “lil?enda*l l*”.

The problem is, however, that most users don’t have the time, patience and skills to construct these search strings with wildcard characters. And maybe the registered name was slightly different again and would not be matched by the wildcard characters used.
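
As a small illustration, the wildcard string above can be tried out in Python with the standard library `fnmatch` module, which uses shell-style wildcards (`?` for one character, `*` for any run of characters). Real search tools may use a different wildcard syntax, so treat this as a sketch of the idea rather than of any specific product.

```python
from fnmatch import fnmatchcase

REGISTERED_NAME = "Liliendahl Limited"


def wildcard_match(pattern: str, name: str) -> bool:
    """Case-insensitive shell-style wildcard match."""
    return fnmatchcase(name.lower(), pattern.lower())


pattern = "lil?enda*l l*"
spellings = [
    "Liliendahl Limited",
    "Liliendahl Ltd",
    "Liliendal Limited",
    "Liljendahl Limited",
]

# The hand-crafted pattern covers the registered name and all three variants.
results = {spelling: wildcard_match(pattern, spelling) for spelling in spellings}

# A plain misspelled query without wildcards finds nothing.
plain_query_hit = wildcard_match("Liliendal Limited", REGISTERED_NAME)
```

The pattern works, but only because its author already knew where the spelling variations would occur.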

Matching algorithms

Tools for batch matching of name strings have been around for many years. When doing a batch match you can’t practically use wildcard characters. Instead matching algorithms typically rely on one, or in the best case a combination, of these techniques:

The same techniques can be used for interactive search thus reaching a hit in one fast search.

Fuzzy search

I have worked with the Omikron FACT algorithm for batch matching. This algorithm has morphed into being implemented as a fuzzy search algorithm as well.

One area of use for this is when webshop users are searching for a product or service within your online shop. This feature is, along with other eCommerce capabilities, branded as FACT-Finder.

The fuzzy search capabilities are also used in a tool I’m involved with called iDQ. Here external reference data sources, in combination with internal master data sources, are searched in an error tolerant way, thus making data available for the user despite heaps of spelling possibilities.
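
To give a feel for error tolerant search, here is a minimal Python sketch using the standard library `difflib.SequenceMatcher` similarity ratio. This is a generic stand-in, not the FACT algorithm or the iDQ implementation; the directory entries other than “Liliendahl Limited” and the cutoff value of 0.6 are assumptions made for the example.

```python
from difflib import SequenceMatcher
from typing import List


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_search(query: str, names: List[str], cutoff: float = 0.6) -> List[str]:
    """Return names scoring at least `cutoff`, best match first."""
    scored = [(similarity(query, name), name) for name in names]
    hits = [(score, name) for score, name in scored if score >= cutoff]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in hits]


# Hypothetical directory with one real and two made-up company names.
directory = [
    "Liliendahl Limited",
    "Acme Trading Ltd",
    "Northwind Holdings Limited",
]

queries = ["Liliendahl Ltd", "Liliendal Limited", "Liljendahl Limited"]
top_hits = {query: fuzzy_search(query, directory)[0] for query in queries}
```

Each of the three misspelled queries ranks “Liliendahl Limited” first without the user typing a single wildcard character.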
