Data Driven Data Quality

In a recent article Loraine Lawson examines how a vast majority of executives describe their business as “data driven” and how the changing world of data must change our approach to data quality.

As the article says, the world has changed since many data quality tools were created. One aspect is that “there’s a growing business hunger for external, third-party data, which can be used to improve data quality”.

Embedding third-party data into data quality improvement, especially in the party master data domain, has been a big part of my data quality work for many years.

Some of the interesting new scenarios are:

Ongoing Data Maintenance from Many Sources

As explained in the Wikipedia article about data quality, services such as the US National Change of Address (NCOA) service and similar services around the world have been around for many years as a basic use of external data for data quality improvement.

Using updates from business directories like the Dun & Bradstreet WorldBase and other national or industry specific directories is another example.

In the post Business Contact Reference Data I predict that professional social networks may be a new source of ongoing data maintenance in the business-to-business (B2B) realm.

Using social data in business-to-consumer (B2C) activities is another option, though one also haunted by complex privacy considerations.

Near-Real-Time Data Enrichment

Besides providing updates to basic master data, business directories typically also contain a lot of other data of value for business processes and analytics.

Address directories may also hold further information like demographic stereotype profiles, geo codes and property data elements.

Appending phone numbers from phone books and checking national suppression lists for mailing and phoning preferences are other forms of data enrichment widely used in direct marketing.

Traditionally these services have been implemented by sending database extracts to a service provider and receiving enriched files for uploading back from the service provider.

Lately I have worked with a new breed of self-service data enrichment tools placed in the cloud. They make it possible for end users to easily configure what to enrich from a palette of address, business entity and consumer/citizen related third-party data, and to execute the request as close to real time as the volume allows.

Such services also include the good old duplicate check, now much better informed by including third-party reference data.

Instant Data Quality in Data Entry

As discussed in the post Avoiding Contact Data Entry Flaws, third-party reference data such as address directories, business directories and consumer/citizen directories placed in the cloud may be used very efficiently in data entry functionality in order to get data quality right the first time and at the same time reduce the time spent in data entry work.

Not least in a globalized world, where names of people reflect the diversity of almost any nation today, where business names become more and more creative, and where data entry is done at shared service centers manned with people from cultures with other address formatting rules, there is an increased need for data entry assistance based on external reference data.

When you mash up advanced search in third-party data and internal master data during data entry, you will solve most of the common data quality issues around avoiding duplicates and getting data as complete and timely as needed from day one.
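
A mash-up of internal and third-party search at data entry could look something like the following minimal sketch. It assumes that both the internal master data and the third-party directory are available as plain lists of records with a `name` field, and it uses the fuzzy comparison in Python's standard `difflib` module. The function name `search_at_entry`, the record layout, the sample data and the 0.8 threshold are all illustrative assumptions, not taken from any particular product.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Return a 0..1 similarity score between two names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def search_at_entry(typed_name, internal_master, reference_directory, threshold=0.8):
    """Search internal master data and external reference data in one go.

    Returns (score, source, record) tuples, best score first, so the person
    entering data can pick an existing record instead of creating a duplicate.
    """
    candidates = []
    sources = (("internal", internal_master), ("reference", reference_directory))
    for source, records in sources:
        for record in records:
            score = similarity(typed_name, record["name"])
            if score >= threshold:
                candidates.append((score, source, record))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates


# Hypothetical sample data for illustration only.
internal = [{"id": "C-1", "name": "ACME Trading Ltd."}]
reference = [{"id": "R-9", "name": "Acme Trading Limited"},
             {"id": "R-10", "name": "Zenith Foods"}]

hits = search_at_entry("Acme Trading Ltd", internal, reference)
for score, source, record in hits:
    print(f"{score:.2f} {source} {record['id']} {record['name']}")
```

In this sketch a hit from the internal list signals a possible duplicate, while a hit from the reference list offers a verified record to copy from. A real service would of course search on more than the name.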

Pulling Data Quality from the Cloud

In a recent post here on the blog the benefits of instant data enrichment were discussed.

In the contact data capture context these are some examples:

  • Getting a standardized address at contact data entry makes it possible for you to easily link to sources with geo codes, property information and other location data.
  • Obtaining a company registration number or other legal entity identifier (LEI) at data entry makes it possible to enrich with a wealth of available data held in public and commercial sources.
  • Having a person’s name spelled according to available sources for the country in question helps a lot with typical data quality issues such as uniqueness and consistency.

However, if you are doing business in many countries it is a daunting task to connect with the best-of-breed sources of big reference data. Add to that the fact that many enterprises do both business-to-business (B2B) and business-to-consumer (B2C) activities, including interacting with small business owners. This means you have to link to the best sources available for addresses, companies and individuals.

A solution to this challenge is using Cloud Service Brokerage (CSB).

An example of a Cloud Service Brokerage suite for contact data quality is the instant Data Quality (iDQ™) service I’m working with right now.

This service can connect to big reference data cloud services from all over the world. Some are open data services in the contact data realm, some are international commercial directories, and some belong to the wealth of national reference data services for addresses, companies and individuals. Even social network profiles are on the radar.

The Big ABC of Reference Data

Reference Data is a term often used either instead of Master Data or as related to Master Data. Reference data are those data defined and (initially) maintained outside a single organisation. Examples from the party master data realm are a country list, a list of states in a given country or postal code tables for countries around the world.

The trend is that organisations seek to benefit from having reference data in more depth than the often modestly populated lists mentioned above.

In the party master data realm such reference data may be core data about:

  • Addresses, being every single valid address, typically within a given country.
  • Business entities, being every single business entity occupying an address in a given country.
  • Consumers (or Citizens), being every single person living at an address in a given country.

There is often no single source of truth for such data. Some of the challenges I have met for each type of data are:

Addresses

The depth (or precision if you like) of an address is a common problem. If the depth of address data is at the level of building numbers on streets (thoroughfares) or blocks, you have issues as described in the blog post called Multi-Occupancy.

Address reference data of course have issues with the common data quality dimensions such as:

  • Timeliness, because for example new addresses will exist in the real world but not yet in a given address directory.
  • Accuracy, as you are always amazed when comparing two official sources which should have the same elements, but don’t.

Business Entities

Business directories have been accessible for many years and are often used when handling business-to-business (B2B) customer master data and supplier master data management. Some hurdles in doing this are:

  • Uniqueness, as your view of what a given business entity is occasionally doesn’t match the view in the business directory, as discussed in the post 3 out of 10
  • Conformity, because for example an apparently simple exercise such as assigning an industry vertical can be a complex matter, as mentioned in the post What are they doing?

Consumers (or Citizens)

In business-to-consumer (B2C) or other activities involving citizens a huge challenge is identifying the individuals living on this planet as pondered in the post Create Table Homo Sapiens. Some troubles are:

  • Consistency isn’t easy, as governments around the world have found 240 (or so) different solutions to balancing privacy concerns and administrative effectiveness.
  • Completeness, as the rules and traditions not only between countries, but also within different industries, certain activities and various channels, are different.

Big Reference Data as a Service

Even though I have emphasized some data quality dimensions for each type of data, all dimensions apply to all types of data.

For organisations operating multinationally and/or across multiple channels, exploiting the wealth and diversity of external reference data is a daunting task.

This is why I see reference data as a service embracing many sources as a good opportunity for getting data quality right the first time. There is more on this subject in the post Reference Data at Work in the Cloud.

Small Business Owners

A challenge I encounter over and over again within Data Matching and customer Master Data Management is what to do with small business owners.

Examples of small business owners are:

  • Farmers
  • Healthcare professionals with their own clinic
  • Owners of small family-run shops
  • Administrators of modest membership organisations
  • Local hospitality providers like Basil Fawlty of Fawlty Towers
  • Independent Data Quality consultants like myself

When handling customer master data we often like to divide those into Business-to-consumer (B2C) or Business-to-business (B2B). We may have different source systems, different data models and different data owners and data stewards for each of the two divisions.

But small business owners usually belong to both divisions. In some transactions they act as private persons (B2C) and in some other transactions they act as a business contact (B2B). If you want to know your customer, have a single customer view, engage in social media and all that jazz, you must have a unique view of the person, the business and the household.
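
One way to picture such a unique view is the minimal sketch below. It assumes that records from the B2C and B2B divisions have already been given a shared `person_key` by some identity resolution step, which is the hard part and is not shown here. The class `CustomerRecord`, its fields and the sample keys are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomerRecord:
    division: str                        # "B2C" or "B2B"
    record_id: str
    person_key: str                      # assumed output of identity resolution
    business_key: Optional[str] = None   # set when the person acts for a business
    household_key: Optional[str] = None  # set when the household is known


def single_person_view(records):
    """Group records from both divisions around the person they refer to."""
    view = defaultdict(lambda: {"divisions": set(), "businesses": set(),
                                "households": set(), "records": []})
    for rec in records:
        entry = view[rec.person_key]
        entry["divisions"].add(rec.division)
        entry["records"].append(rec.record_id)
        if rec.business_key:
            entry["businesses"].add(rec.business_key)
        if rec.household_key:
            entry["households"].add(rec.household_key)
    return dict(view)


def small_business_owner_candidates(view):
    """People who appear in both the B2C and the B2B division."""
    return sorted(p for p, e in view.items() if {"B2C", "B2B"} <= e["divisions"])


# Hypothetical sample records for illustration only.
records = [
    CustomerRecord("B2C", "c-17", "P1", household_key="H1"),
    CustomerRecord("B2B", "b-4", "P1", business_key="BIZ1"),
    CustomerRecord("B2C", "c-18", "P2", household_key="H2"),
]
view = single_person_view(records)
print(small_business_owner_candidates(view))
```

Appearing in both divisions only makes a person a candidate. An employee who is also a private customer would show up the same way, so a real solution needs more evidence before treating someone as a small business owner.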

In several industries small business owners, the business and the household are a special target group with unique product requirements. This is true for industries such as banking, insurance, telco, real estate and law.

So there are plenty of business cases for multi-domain Master Data Management embracing customer master data and product master data.

The capability to handle a single customer view of small business owners is in my experience very poorly fulfilled in the Data Quality and Master Data Management solutions around. There is certainly room for improvement and entrepreneurship.

Lean Social MDM

I have previously written some blog posts about “Social MDM”, using the term to describe the trend of having social media (master) data as a new complexity on top of the already known conundrum of mastering traditional master data.

Stephan Zoder of IBM Initiate discussed this topic in a recent post called CMM is Actually High-Frequency, Social MDM (where CMM is about Customer Motivation Management).

As I also briefly examined the term “Lean MDM” last week, I wonder if it is possible to start embracing social media (master) data under a term such as “Lean Social MDM”.

The lean MDM post included a real-life project I have been involved in, which was about how the car rental giant Avis achieved lean MDM for the Scandinavian business.

An underlying business case for this project was that many decisions about car rental are made by individual persons who may act both as employees at (changing) employers and as private renters. Therefore the emphasis of the master data management was on the person in contact, user and private roles.

Having a “single person view” is in my eyes, if it wasn’t before, a good place to start your “Lean Social MDM” journey.

Proactive Data Governance at Work

A common statement in the data management realm is that data governance is 80 % about people and processes and 20 % (if not less) about technology.

This blog post is about the 20 % (or less) technology part of data governance.

The term proactive data governance is often used to describe whether a given technology platform is able to support data governance in a good way.

So, what is proactive data governance technology?

Obviously it must be the opposite of reactive data governance technology, which must be something about discovering completeness issues, as in data profiling, and fixing uniqueness issues, as in data matching.

Proactive data governance technology must be implemented in data entry and other data capture functionality. The purpose of the technology is to assist people responsible for data capture in getting the data quality right from the start.
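
As a rough illustration, the same completeness and uniqueness checks that reactive tools run after the fact can be run at the moment of capture. The sketch below assumes records are plain dictionaries and that the hub can be scanned as a list. The function names, the required fields and the crude match key are hypothetical choices, not a description of any specific MDM platform.

```python
def _text(record, field_name):
    """Return a field as stripped text, treating missing values as empty."""
    value = record.get(field_name)
    return "" if value is None else str(value).strip()


def match_key(record):
    """Crude match key: normalized name plus country and postal code."""
    name = " ".join(_text(record, "name").lower().split())
    country = _text(record, "country").upper()
    postal = _text(record, "postal_code").replace(" ", "").upper()
    return (name, country, postal)


def check_at_capture(new_record, hub_records,
                     required=("name", "country", "postal_code")):
    """Return a list of issues to show the user before the record is saved."""
    issues = []
    # Completeness: the check profiling would otherwise do afterwards.
    for field_name in required:
        if not _text(new_record, field_name):
            issues.append(f"missing {field_name}")
    # Uniqueness: the check matching would otherwise do afterwards.
    if not issues:
        key = match_key(new_record)
        for existing in hub_records:
            if match_key(existing) == key:
                issues.append(f"possible duplicate of {existing['id']}")
    return issues


# Hypothetical hub content for illustration only.
hub = [{"id": "H-1", "name": "Zenith Foods", "country": "dk", "postal_code": "2100"}]

print(check_at_capture({"name": "zenith  foods", "country": "DK",
                        "postal_code": "21 00"}, hub))
print(check_at_capture({"name": "New Co", "country": "DK"}, hub))
```

An empty list means the record can be saved as entered. Anything else is feedback for the person doing the data capture while they still have the source in front of them.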

If we look at master data management (MDM) platforms we have two possible ways of getting data into the master data hub:

  • Data entry directly in the master data hub
  • Data integration by data feed from other systems such as CRM, SCM and ERP solutions and from external partners

In the first case the proactive data governance technology is a part of the MDM platform, often implemented as workflows with assistance, checks, controls and permission management. We see this most often related to product information management (PIM) and in business-to-business (B2B) customer master data management. Here the insertion of a master data entity like a product, a supplier or a B2B customer involves many different employees, each with responsibilities for a set of attributes.

The second case is most often seen in customer data integration (CDI) involving business-to-consumer (B2C) records, but certainly also applies to enriching product master data, supplier master data and B2B customer master data. Here the proactive data governance technology is implemented in the data import functionality or even in the systems of entry, best done as Service Oriented Architecture (SOA) components that are hooked into the master data hub as well.

It is a matter of taste if we call such technology proactive data governance support or upstream data quality. From what I have seen so far, it does work.

Party On

The most frequent data domain addressed in data quality improvement and master data management is parties.

Some of the issues related to parties that keep on creating difficulties are:

  • Party roles
  • International diversity
  • Real world alignment

Party roles

Party data management is often labelled customer data management or customer data integration (CDI).

Indeed, customers are the lifeblood of any enterprise, even if we refer to those who benefit from our services as citizens, patients, clients or whatever term is in use in different industries.

But the full information chain within any organization also includes many other party roles as explained in the post 360° Business Partner View. Some parties are suppliers, channel partners and employees. Some parties play more than one role at the same time.

The classic question “what is a customer?” is of course important to answer in your master data management and data quality journey. But in my eyes there are a lot of things to be solved in party data management that don’t need to wait for the answer to that question, which anyway won’t be as simple as cutting the Gordian Knot, as said in the post Where is the Business.

International diversity

As discussed in the post The Tower of Babel more and more organizations are met with multi-cultural issues in data quality improvement within party data management.

Whether and when an organization has to deal with international issues is of course dependent on whether and to what degree that organization is domestic or active internationally. Even so, in some countries with several official languages, like Switzerland and Belgium, the multi-cultural topic is mandatory. Typically companies in large countries grow big before looking abroad, while in smaller countries, like my home country Denmark, even many fairly small companies must address international issues with data quality.

However, as Karen Lopez recently pondered in the post Data Quality in The Wild, Some Where …, actually everyone, even in the United States, has some international data somewhere looking very strange if not addressed properly.

Real world alignment

I often say that real world alignment, sometimes as opposed to the common definition of data quality as being fit for purpose, is the short cut to getting data quality right related to party master data.

It is however not a straightforward short cut. There are multiple challenges connected with getting your business-to-business (B2B) records aligned with the real world, as discussed in the post Single Company View. When it comes to business-to-consumer (B2C) or government-to-citizen (G2C), I think the dear people who sometimes comment on this blog did a fine job of balancing mutating tables and intelligent design in the post Create Table Homo_Sapiens.

B2C versus B2B Data Quality

Doing business with private consumers (business-to-consumer = B2C) and doing business with other businesses (business-to-business = B2B) present a lot of similar data quality challenges, but the two also differ in a lot of ways.

Some of my experiences (and thoughts) related to different master data domains are:

Customer master data

In B2C the number of customers, prospects and leads is usually high and characterized by relatively few interactions with each entity.  In B2B you usually have a relatively small number of customers with a high number of interactions.

One of the most automated activities in data quality improvement is matching master data records with information about customers. Many of the examples we see in marketing material, research documents, blog posts and so on are about matching in the B2C realm. This is natural, since the high number of records, typically with a low attached value, calls for automation.

Data matching in the B2B realm is indeed more complex due to numerous challenges like less standardized names of companies and typically more options in what constitutes a single customer. The high value attached to each customer also makes the risk of mistakes a showstopper for too much automation.
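
A common way to limit the risk of wrong automatic matches is to automate only the clear cases and route the rest to a person. The following is a minimal sketch of that idea, assuming company names are compared with Python's standard `difflib` after stripping a small, hypothetical list of legal forms. The function names and both thresholds are illustrative assumptions and would need tuning against real data.

```python
import re
from difflib import SequenceMatcher

# Small illustrative list; real legal forms vary widely by country.
LEGAL_FORMS = {"ltd", "limited", "inc", "incorporated", "corp",
               "corporation", "llc", "gmbh", "plc"}


def normalize_company(name):
    """Lower-case, drop punctuation and remove common legal form tokens."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(t for t in tokens if t not in LEGAL_FORMS)


def classify_pair(name_a, name_b, auto_threshold=0.95, review_threshold=0.80):
    """Return 'match', 'review' or 'no match' for two company names."""
    a, b = normalize_company(name_a), normalize_company(name_b)
    if not a or not b:
        return "review"  # nothing left to compare, so let a person decide
    score = SequenceMatcher(None, a, b).ratio()
    if score >= auto_threshold:
        return "match"
    if score >= review_threshold:
        return "review"
    return "no match"


print(classify_pair("Acme Trading Ltd.", "ACME Trading Limited"))
print(classify_pair("Acme Trading", "Acme Trade"))
print(classify_pair("Acme Trading", "Zenith Foods"))
```

The width of the review band is where the value of each customer comes in: the more a wrong merge costs, the more pairs you would send to manual review rather than match automatically.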

So in B2B we see an increasing adoption of workflows that ensure data quality during data capture, often by exploiting external reference data, which in general is also more available for business entities.

Location master data

The location of B2C customers means a lot. Accurate and timely delivery addresses for everything from direct mail to bringing goods to the premises are essential. Location data are used for recognizing household relations, assigning demographic stereotypes and in many cases calculating fees of different kinds. I had a near disaster experience with a really bad address in my early career.

Even though location data for B2B activities theoretically is just as important, I have often seen that a little less precision is fit for purpose or anyway lower prioritized than more pressing issues.

Product master data

Theoretically there should be no difference between B2C and B2B here, but I guess there is in practice?

The most interesting aspect is probably the multi-domain aspect examining the relations between customers and products.   

I had some experiences some years ago with the B2B realm as described in the post What is Multi-Domain MDM?: 1,000 B2B customers buying 1,000 different finished products can be a quite complicated data quality operation.

Within the B2C realm the most predominant multi-domain data quality issues I have met are related to analytics. As discussed in the post Customer/Product Matrix Management it is about typifying your customers correctly and categorizing your products adequately at the same time.

Single Business Partner View

If you search Google for “single customer view” you’ll get over 20,000 hits. If you search for “single business partner view” you’ll get zero, at least until I posted this blog post.

Some time ago I wrote about getting a 360° Business Partner View, elaborating on extending the 360° Customer View or Single Customer View (SCV) to embrace all sorts of party master data managed within the organization.

In fact there is at least the same amount of similar techniques used between

  • managing supplier master data and business-to-business (B2B) customer master data

as there is between

  • managing business-to-business (B2B) customer master data and business-to-consumer (B2C) customer master data.

If you look at Customer Relationship Management (CRM) systems, almost every package is aimed at managing B2B data, as the data model and the functionality support real world B2B structures and how the sales force and other employees interact with B2B customers and prospects.

Interacting with B2C customers and prospects is much more diverse and often supported by operational systems specialized for the industry in question like solutions for financial services, healthcare and so on.

A business partner is a party acting in the role of customer, prospect, supplier, reseller, distributor, agent or another form of partner. Sometimes the same party is acting in several roles at the same time, thus potentially being on both the sell side and the buy side of master data quality management.
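
Finding parties that sit on both sides can be sketched as below, assuming each party already has one identifier across systems and that role assignments are available as simple (party, role) pairs. Which roles count as sell side and which as buy side is an assumption made here for illustration, and the party identifiers are made up.

```python
from collections import defaultdict

# Assumed split of roles for illustration; an organization may draw it differently.
SELL_SIDE_ROLES = {"customer", "prospect", "reseller", "distributor", "agent"}
BUY_SIDE_ROLES = {"supplier"}


def roles_by_party(assignments):
    """Collect the set of roles played by each party."""
    roles = defaultdict(set)
    for party_id, role in assignments:
        roles[party_id].add(role)
    return dict(roles)


def parties_on_both_sides(assignments):
    """Return the parties holding at least one sell side and one buy side role."""
    return sorted(
        party_id
        for party_id, roles in roles_by_party(assignments).items()
        if roles & SELL_SIDE_ROLES and roles & BUY_SIDE_ROLES
    )


# Hypothetical role assignments for illustration only.
assignments = [
    ("party-1", "customer"),
    ("party-1", "supplier"),
    ("party-2", "prospect"),
    ("party-3", "supplier"),
    ("party-3", "distributor"),
]
print(parties_on_both_sides(assignments))
```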

As sell side and buy side have intersections within party master data, in some industries we may also go deeper into identity resolution and find intersections between B2B entities and B2C entities. I’ve described these matters in the post So, how about SOHO homes. The business case is that some products in some industries are aimed at the households of business owners and the small businesses at the same time. This is for example true for industries such as banking, insurance, telco, real estate and law.

All in all achieving a single view of business partners is a task going beyond traditional customer data integration (CDI) and stretching into areas traditionally belonging to Product Information Management (PIM). This is a business case for multi-domain master data management.

Customer Product Matrix Management

A customer/product matrix is a way of describing the relationships between customer types and product types/attributes.  

Example:

Note: Please find some data quality related product descriptions in the post Data Quality and World Food.

Filling out the matrix may be based on prejudices, gut feelings, assumptions, surveys, focus groups or data.

If we go for data we may do this by collecting available historical data related to sales and inquiries made by persons belonging to each customer type regarding products belonging to each product type.  

In doing that correctly we need two kinds of master data management and data quality assurance in place:

  • Customer Data Integration (CDI) for assigning the accurate customer type in the real world related to the uniquely identified person in transactions coming from all sources – here based on location master data.
  • Product Information Management (PIM) for categorizing the relevant fit for purpose product type.
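
Filling out the matrix from data could be sketched as follows, assuming the CDI and PIM work has already produced two lookups: one from customer identifier to customer type and one from product identifier to product type. The function name, the field names and the sample types are hypothetical. Transactions that cannot be resolved against either lookup are counted rather than guessed at.

```python
from collections import Counter


def build_matrix(transactions, customer_type_of, product_type_of):
    """Count transactions per (customer type, product type) cell.

    Returns the matrix and the number of transactions that could not be
    placed because the customer or product was missing from master data.
    """
    matrix = Counter()
    unresolved = 0
    for tx in transactions:
        customer_type = customer_type_of.get(tx["customer_id"])
        product_type = product_type_of.get(tx["product_id"])
        if customer_type is None or product_type is None:
            unresolved += 1
            continue
        matrix[(customer_type, product_type)] += 1
    return matrix, unresolved


# Hypothetical master data lookups and transactions for illustration only.
customer_type_of = {"c1": "urban household", "c2": "rural household"}
product_type_of = {"p1": "type A", "p2": "type B"}
transactions = [
    {"customer_id": "c1", "product_id": "p1"},
    {"customer_id": "c1", "product_id": "p1"},
    {"customer_id": "c2", "product_id": "p2"},
    {"customer_id": "c9", "product_id": "p1"},  # customer not in master data
]

matrix, unresolved = build_matrix(transactions, customer_type_of, product_type_of)
for (customer_type, product_type), count in sorted(matrix.items()):
    print(customer_type, product_type, count)
print("unresolved:", unresolved)
```

The unresolved count is a useful side effect: it shows how much of the history is left out of the matrix because of gaps in the master data.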

This reminds me of multi-domain master data management: customer master data (or shall we say party master data), product master data and location master data used to figure out how to do business. I like it, both the master data management part and the mentioned product types.
