Some Kinds of Reference Data

The term “reference data”, and the related Reference Data Management (RDM), is commonly used in the data quality and Master Data Management (MDM) realm.

As with most terms it may be used with slightly different meanings. Usually, but not necessarily always, reference data are core data entities defined outside a given organization.

I have come across the kinds of reference data discussed below:

Reference Data in Investment Banking

The term “reference data” is well established in investment banking. Reference data are core master data entities such as counterparties, securities and currencies. These are the things you deal with in investment banking. They are not made up for a given bank or other single financial institution but are shared across the whole market and should optimally be the same for every institution at exactly the same point in time.

Small Reference Data

In Master Data Management in general we usually see reference data as value lists that help describe and standardize internal master data.

One example is a country list. A list of countries should be the same for every organization in the world. However, the available lists do differ, though most variations don’t have any business impact, such as the academic question of whether Antarctica should be in the list or not.

A list of codes describing which industry a given company belongs to is another example of reference data. As examined in the post What are they doing? you may choose to standardize on SIC codes, standardize on NACE codes or develop your own set of codes for that purpose.
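As a minimal sketch of how such a value list may be used to standardize internal master data, consider the check below. The list is a hypothetical, deliberately tiny excerpt keyed by ISO 3166-1 alpha-2 codes, and the names COUNTRY_CODES and validate_country are my own illustrations, not taken from any product.

```python
# Hypothetical, deliberately tiny excerpt of a country value list
# (ISO 3166-1 alpha-2 codes). A real list would be sourced externally.
COUNTRY_CODES = {
    "DK": "Denmark",
    "DE": "Germany",
    "IN": "India",
    "US": "United States",
}


def validate_country(record, reference=COUNTRY_CODES):
    """Return a copy of the record with a normalized country code.

    Raises ValueError if the code is not in the reference list.
    """
    code = record.get("country", "").strip().upper()
    if code not in reference:
        raise ValueError(f"Unknown country code: {code!r}")
    return {**record, "country": code}


if __name__ == "__main__":
    print(validate_country({"name": "Acme", "country": " dk "}))
```

Whether a code like AQ for Antarctica belongs in the list is exactly the kind of variation that decides if a record passes or fails such a check.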

Big Reference Data

In geography a country list is at the top level of defining locations. Further down we may have postal code systems within each country, such as ZIP codes in the United States, PLZ codes in Germany and PIN codes in India. Yet further down we have every single valid postal address, eventually all over the world. This is what I call big reference data.

One way of sourcing industry codes for your customers, suppliers and other business partners is to pick from, or enrich from, a business directory such as the D&B WorldBase or any of the many other business directories around. Such directories may also be seen as big reference data.

With the dramatic increase in the use of social media, social network profiles have emerged as a new kind of big reference data, serving as links to our internal master data.

Addressing Digital Identity

A physical address has traditionally been a core element of doing identity resolution. Stating a name and an address is the most widespread way of telling with which person or which company we are having (or aiming at having) a business or other form of relationship.

However, during the last 25 years a lot of things have moved from the physical world to the online world. Not least, a lot of things start in the online world but in many cases end up in the physical world. Today selling, the smart way, starts in social media. Final delivery may be digital or may be sending a package or a consultant to a physical address. A thing like dating most often starts in the online world today, but surely the aim is a physical encounter.

This new way of life has a tremendous effect on data quality and master data management. Within quality of contact data, the most frequent domain for data quality issues, we have traditionally dealt with verifying and deduplicating names and addresses.

As the best way of preventing data quality issues is looking at the root cause, we must address the fact that onboarding of contact data often starts with a digital identity, where a physical address isn’t present in the first place but will often be added at a later stage.

As described in the post Social MDM and Systems of Engagement, a new trend in master data management is to establish a link between the new systems of engagement and the old systems of record.

In the same way, data quality prevention and improvement will have to cover establishing a link between the new discipline of digital identity resolution and the good old address verification stuff.

MDM meets MDM

The three letter acronym MDM may mean several different things. It can, for example, be Master Data Management or Meter Data Management.

This morning I read an article on metering.com reporting that Meter Data Management varies across Europe. The article was about the different approaches taken in various European countries when it comes to accessing and storing meter readings in the energy sector.

I noticed that these different approaches very much resemble how public sector basic data about addresses, companies and citizens are made accessible and stored in the same countries. Scandinavia has a centralized approach, some other countries have hybrid solutions and Germany has a strictly decentralized approach.

As told in the post instant Data Quality at Work, I am right now working with Master Data Management in the energy sector. These projects certainly border on Meter Data Management.

So it’s good to see that reading the article about MDM (in this case Meter Data Management) makes just as much sense if you thought it was about MDM (being Master Data Management):

“The most important factors for supporting and choosing a particular model are cost efficiency, transparency, data security and efficient processes. The rationale for centralized MDM also seems to be strengthened .. because of the increased amount of information exchanged.

There is also a clear understanding .. that the chosen MDM model needs clear rules regarding data access, privacy and security, while enabling proportionate access to data by authorized parties to ensure that benefits can be delivered.

As such many regulatory changes are occurring .. that .. considers that efficient and secure information and data access for relevant stakeholders is fundamental for a proper .. market functioning and customer protection and empowerment .. and indeed different countries might require different MDM models on the basis of their market design specificities.”

So, MDM is just like MDM.

Beyond True Positives in Deduplication

The most frequent data quality improvement process around is deduplication of party master data.

A core functionality of many data quality tools is the capability to find duplicates in large datasets with names, addresses and other party identification data.

When evaluating the result of such a process we usually divide the result of found duplicates into:

  • False positives, being automated match results that actually do not reflect real world duplicates
  • True positives, being automated match results reflecting the same real world entity
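The division above can be sketched as follows, assuming that someone has manually verified which candidate pairs are real world duplicates. The function names and the record identifiers are hypothetical and only serve as illustration; this is not how any particular data quality tool does it.

```python
def split_matches(candidate_pairs, verified_duplicates):
    """Split automated match results into true and false positives.

    candidate_pairs: pairs of record ids found by the matching process.
    verified_duplicates: pairs confirmed (e.g. manually) as the same
    real world entity. The order within a pair does not matter.
    """
    verified = {frozenset(pair) for pair in verified_duplicates}
    true_positives = [p for p in candidate_pairs if frozenset(p) in verified]
    false_positives = [p for p in candidate_pairs if frozenset(p) not in verified]
    return true_positives, false_positives


def precision(true_positives, false_positives):
    """Share of the automated match results that are real duplicates."""
    total = len(true_positives) + len(false_positives)
    return len(true_positives) / total if total else 0.0


if __name__ == "__main__":
    found = [(1, 2), (3, 4), (5, 6)]
    confirmed = [(2, 1), (5, 6)]
    tp, fp = split_matches(found, confirmed)
    print(tp, fp, precision(tp, fp))
```

Note that the sketch only looks at what the process found. Duplicates the process missed are a separate matter.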

The difficulties in reaching the above result aside, you would think the rest is easy. Take the true positives, merge them into a golden record and purge the unneeded duplicate records in your database.

Well, I have seen so many well executed deduplication jobs ending just there, because there are a lot of reasons for not making the golden records.

Sure, a lot of duplicates “are bad” and should be eliminated.

But many duplicates “are good” and have actually been put into the databases for a good reason, supporting different kinds of business processes where one view is needed in one case and another view is needed in another case.

Many, many operational applications, including very popular ERP and CRM systems, do have inferior data models that are not able to reflect the complexity of the real world.

Only a handful of MDM (Master Data Management) solutions are able to do so, but even then the solutions aren’t easy, as most enterprises have an IT landscape with all kinds of applications with other business relevant functionality that isn’t replaced by an MDM solution.

What I like to do when working on getting business value from true positives is to build a so-called Hierarchical Single Source of Truth.

Social MDM and Complex Sales

Social Master Data Management (Social MDM) is about linking the increasing trend of doing business via social media, using what we may call “systems of engagement”, with the traditional way of supporting business using what we call “systems of record”.

Doing social MDM is a natural consequence of adopting social CRM (Social Customer Relationship Management). Many CRM solutions support Business-to-Business (B2B) activities, helping with keeping track of what’s going on with a lot of contacts related to a business account within so-called complex sales processes.

Traditional MDM in B2B environments has been much about a single view of the business account and the legal entity behind it. As social CRM is much about the relations to the business contacts, the people side of business, we need a solid master data foundation behind the people being those contacts.

The same individual may in fact be an important influencer related to a range of business accounts, each being a legal entity with whom you are aiming for a sales contract. You need a single view of that. So many sales contracts are based on a relation to a buyer moving from one business account to another. You need to be the winner in that game, and the answer to that may very well be your ability to do better social MDM.

Social MDM adds a new external source of reference data to MDM solutions for B2B customer master data management. This new source is professional social network profiles, where LinkedIn is the best known and most used service around.

It is early days for social MDM solutions so it is quite exciting for me to work with designing the first kind of such solutions around the MDM edition of the instant Data Quality service.

Stay tuned for more news in this field on this blog in the times to come.

Hierarchical Single Source of Truth

Most data quality and master data management gurus, experts and practitioners agree that achieving a “single source of truth” is a nice term, but is not what data quality and master data management is really about as expressed by Michele Goetz in the post Master Data Management Does Not Equal The Single Source Of Truth.

Even among those people, including me, who think that emphasis on real world alignment could help in getting better data and information quality, as opposed to focusing on fitness for multiple different purposes of use, there is acknowledgement that there is a “digital distance” between real world aligned data and the real world, as explained by Jim Harris in the post Plato’s Data. Also, different publicly available reference data sources that should reflect the real world for the same entity are often in disagreement.

When working with improvement of data quality in party master data, which is the most frequent and common master data domain with issues, you encounter the same issues over and over again, like:

  • Many organizations have a considerable overlap of real world entities that are customers and suppliers at the same time. Expanding to other party roles, this intersection is even bigger. This calls for a 360° Business Partner View.
  • Most organizations divide activities into business-to-business (B2B) and business-to-consumer (B2C). But the great majority of businesses are small companies where business and private is a mixed case, as told in the post So, how about SOHO homes.
  • When doing B2C, including membership administration in non-profits, you often have a mix of single individuals and households in your core customer database, as reported in the post Household Householding.
  • As examined in the post Happy Uniqueness, there are a lot of good fit for purpose of use reasons why customer and other party master data entities are deliberately duplicated within different applications.
  • Lately, doing social master data management (Social MDM) has emerged as the new leg in mastering data within multi-channel business. Embracing a wealth of digital identities will become yet another challenge in getting a single customer view and reaching for the impossible, and not always desirable, single source of truth.

A way of getting some kind of structure into this possible, and actually very common, mess is to strive for a hierarchical single source of truth where the concept of a golden record is implemented as a model with golden relations between real world aligned external reference data and internal fit for purpose of use master data.
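A minimal sketch of such a model could look like the one below. It assumes that one real world aligned external reference entity sits at the top and that the internal, fit for purpose of use records hang below it via golden relations instead of being merged and purged. All class and field names are hypothetical illustrations of the idea, not the data model of any specific MDM solution.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReferenceEntity:
    """A real world aligned entity from an external reference source."""
    source: str  # e.g. a business directory or a public register
    key: str     # the identifier used by that source


@dataclass
class InternalRecord:
    """A fit for purpose of use record in an operational application."""
    application: str  # e.g. "CRM" or "ERP"
    record_id: str
    role: str         # e.g. "customer" or "supplier"


@dataclass
class GoldenNode:
    """Top of the hierarchy: one reference entity, many internal records."""
    reference: ReferenceEntity
    records: list = field(default_factory=list)

    def link(self, record):
        # A golden relation: the internal record is kept, not purged.
        self.records.append(record)

    def roles(self):
        # Sorted list of the distinct party roles played by this entity.
        return sorted({record.role for record in self.records})


if __name__ == "__main__":
    node = GoldenNode(ReferenceEntity("business_register", "12345678"))
    node.link(InternalRecord("CRM", "C-001", "customer"))
    node.link(InternalRecord("ERP", "V-778", "supplier"))
    print(node.roles())
```

The point of the design is that the “good” duplicates survive in their applications while the hierarchy still tells you that they are the same real world entity.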

Right now I’m having an exciting time doing just that as described in the post Doing MDM in the Cloud.

Doing MDM in the Cloud

As reported in the post What to do in 2012, doing Master Data Management (MDM) in the cloud is one of three trends within MDM that, according to Gartner (the analyst firm), will shape the MDM market in the coming years.

Doing MDM in the cloud is an obvious choice if all your operational applications are in the cloud already. Such a solution was presented on Informatica Perspectives in the blog post Power the Social Enterprise with a Complete Customer View. The post includes a video where the situation with multiple instances of SalesForce.com solutions within the same enterprise is supported by a master data backbone in the cloud.

But even if all your operational applications are on premise, you may start by lifting some master data management functionality up into the cloud. I am currently working with such a solution.

When onboarding customer (and other party) master data much of the basic information needed is already known in the cloud. Therefore lifting the onboarding functionality up into the cloud makes a lot of sense. This is the premise, so to speak, for the MDM edition of the instant Data Quality (iDQ) solution that we are working on these days.

Cloud services for the other prominent MDM domain, product master data, also make a lot of sense. As told in the post Social PIM, a lot of basic product master data may be shared in the cloud, embracing the supply chain of manufacturers, distributors, retailers and end users.

In both these cases some of the master data management functionality is handled in the cloud, while the data integration stuff takes place where the operational applications reside, be that in the cloud and/or on premise.

Free and Open Public Sector Master Data

Yesterday the Danish Ministry of Finance announced an agreement between local authorities and the central government to improve and link public registers of basic data and to make data available to the private sector.

Once the public authorities have tidied up, merged the data and put a stop to parallel registration, annual savings in public administration could amount to 35 million EUR in 2020.

Basic open data includes private addresses, companies’ business registration numbers, cadastral numbers of real properties and more. These master data are used for multiple purposes by public sector bodies.

Private companies and other organizations can look forward to large savings when they no longer have to buy their basic data from the public authorities.

In my eyes this is a very clever move by the authorities exactly because of the two main opportunities mentioned:

  • The public sector will see savings and related synergies from a centralized master data management approach
  • The private sector will gain a competitive advantage from better and affordable reference data accessibility and thereby achieve better master data quality.

Denmark has, along with the other Nordic countries, always had a more mature public sector master data approach than we see in most other countries around the world.

I remember I worked with the committee that prepared a single registry for companies in Denmark back in the 80’s as mentioned in the post Single Company View.

Today I work with a solution called iDQ (instant Data Quality), which is about mashing up internal master data and a range of external reference data from social networks and, not least, public sector sources. In that realm there is certainly not something rotten in Denmark. Rather there is a good answer to the question about to be free and open or not to be.

Killing Keystrokes

Keystrokes are evil. Every keystroke represents a potential root cause of poor data quality by spelling things wrongly, putting the right thing in the wrong place, putting the wrong thing in the right place and so on. Besides that, every keystroke is a cost of work, summing up with all the other keystrokes to gigantic amounts of work costs.

In master data management (MDM) you will be able to get things right, and reduce working costs, by killing keystrokes wherever possible.

Killing keystrokes in Product Information Management (PIM)

I have seen my share of current business processes where product master data are reentered, or copied and pasted from different sources, being extracted from one product master data container and, often via spreadsheets, captured into another product master data container.

This happens inside organizations and it happens in the ecosystem of business partners in supply chains encompassing manufacturers, distributors and retailers.

As touched upon in the post Social PIM, there might be light at the end of the tunnel with the rise of tools, services and platforms setting up collaboration possibilities for sharing product master data and thus avoiding those evil keystrokes.

Killing keystrokes in Party Master Data Management

With party master data there are good possibilities of exploiting external data from big reference data sources and thus avoiding the evil keystrokes. The post instant Data Quality at Work tells about how a large utility company has gained better data quality, and reduced working costs, by using the iDQ™ service in that way within customer on-boarding and other business processes related to customer master data maintenance.

The next big thing in this area will be the customer data integration (CDI) part of what I call Social MDM, where you may avoid the evil keystrokes by utilizing the keystrokes already made in social networks by the people the master data is about.

The Big MDM Trend

Back in 2011 Gartner (the analyst firm) released a document titled Gartner Highlights Three Trends That Will Shape the Master Data Management Market.

The three things were:

  • Multi-Domain MDM
  • MDM in the Cloud
  • MDM and Social Networks

MDM and Social Networks (also called Social MDM) was described as shown below:

[Image: Gartner 3 MDM things 2011]

In a 2012 article on Computerweekly called Three trends that will shape the master data management market, also by John Radcliffe of Gartner, the three trends are repeated, however with social MDM now described in the context of MDM and big data:

[Image: Gartner 3 MDM things 2012]

The slightly different terms Gartner uses to describe the trends and what they entail follow the big trend of everyone else in the industry using the term “big data”, as discussed in the post Data Quality vs Big Data, where you see that the use of the term “big data” exploded just after the original Gartner piece on the three trends.
