The Data Enrichment ABC

A popular and indeed valuable method of avoiding decay of data quality in customer master data and other master data entities is setting up data enrichment services based on third-party reference data sources. Examples of such services are:

  • Relocation updates, like National Change Of Address services from postal services
  • Change of name, address and a variety of status updates from business directories and, in some countries, citizen directories too

When using such services you will typically want to consider the following options for how to deal with the updates:

A: Automatic Update

Here your internal master data will be updated automatically when a change is received from the external reference data source.

B: Interactive Update

Here the update will require some form of manual intervention, either to be fulfilled or excluded based on a human decision.

C: Excluded Update

Here an automated rule will exclude the update, as there may be a range of reasons why you don’t want to update certain entity segments under certain circumstances.

An example would be a utility supplier receiving a relocation update for the occupier at an installation address. This will trigger and support a complex business process far beyond changing the billing address.
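
As a sketch, the A/B/C routing could be expressed as a simple rule function. The rule conditions, field names and segments below are hypothetical; real rules would depend on your own entity segments and business processes:

```python
from enum import Enum

class Route(Enum):
    AUTOMATIC = "A"    # apply the update without human intervention
    INTERACTIVE = "B"  # queue for a data steward's decision
    EXCLUDED = "C"     # suppress the update by rule

def route_update(update: dict) -> Route:
    """Route an incoming enrichment update. The rules below are
    illustrative only; real rules depend on your entity segments."""
    # Hypothetical rule: never touch records in a frozen segment
    if update.get("segment") == "frozen":
        return Route.EXCLUDED
    # A relocation of the occupier at an installation address triggers
    # a wider business process, so a human must decide
    if update.get("type") == "relocation" and update.get("is_installation_address"):
        return Route.INTERACTIVE
    # Everything else, e.g. a plain NCOA address correction, is applied directly
    return Route.AUTOMATIC
```

The point of isolating the routing in one function is that the B pot stays small and auditable: whatever lands there goes to a data steward's work queue.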

As explained in the post When Computer Says Maybe, we need functionality within data quality tools and Master Data Management (MDM) solutions to support data stewards in cost-effectively handling these situations, and this certainly also applies to the B pot in data enrichment.

Right now I’m working on designing such data stewardship functionality within the instant Data Quality environment.

OK, so big data is about size (and veracity)

During the rise of the term “big data” there have been a lot of different definitions trying to express concisely what this very popular term is really about. Many of these definitions have included the sentiment that big data is not (only) about size. The three V’s, being Volume, Variety and Velocity, have been very popular. A fourth V, being Veracity, has been added, though this is hardly a definition of big data but rather a desirable capability of big (and any other) data.

But apparently big data is about size.

The Oxford English Dictionary has now included big data in its authoritative record of English words and terms, defining big data as:

“Data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges”.

It’s interesting that the challenges that make data big are not about analyzing the data but about data manipulation and data management. These are, by the way, the things you do to achieve veracity.

In the future, data quality will be more social

Every time I walk in and out of a plane at London Gatwick Airport I nod at an advert from the bank HSBC saying that in the future, selling will be more social:

Selling will be more social

A natural consequence of this will also be that data quality improvement (and master data management) will be more social.

One example is how complex sales, being sales processes typically found in business-to-business (B2B) environments, will be heavily dependent on integrating the exploitation of professional social networks, as discussed in the DataQualityPro interview about the benefits of Social MDM.

Traditional Master Data Management (MDM) and related data quality improvement in B2B environments have largely been about a single view of the business account and the legal entity behind it. As Social Customer Relationship Management (CRM) is much about the relationships with business contacts, the people side of business, we need a solid master data foundation behind the people who are those contacts.

The same individual may in fact be an important influencer related to a range of business accounts, being the legal entities with whom you are aiming for a sales contract. You need a single view of that. Many sales contracts are based on a relationship with a buyer moving from one business account to another. You need to be the winner in that game, and the answer may very well be your ability to do better social MDM and embrace the data quality issues related to that.

Social selling of course also relates to business-to-consumer (B2C) activities, and in doing that we will see new data quality issues. When exploiting social networks, in both B2B and B2C activities, you need to link traditional attributes such as name and address with new attributes from the online and social world, as explained in the post Multi-Channel Data Matching.
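
As a rough illustration of such linking, here is a minimal sketch comparing a traditional CRM record with a social profile. The field names, similarity measure and thresholds are my own assumptions, not a description of any particular matching tool:

```python
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    """Crude name similarity in [0, 1]; real matching would also use
    phonetic and token-based comparison."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_profiles(crm_record: dict, social_profile: dict, threshold: float = 0.8) -> bool:
    """Link a traditional record (name, city) with a social profile
    (display name, stated location). Field names and thresholds are
    illustrative assumptions."""
    name_score = name_similarity(crm_record["name"], social_profile["display_name"])
    # A shared location is weak supporting evidence, not proof
    same_city = crm_record.get("city", "").lower() == social_profile.get("location", "").lower()
    return name_score >= threshold or (name_score >= 0.6 and same_city)
```

In practice such a linking decision would feed a match review queue rather than an automatic merge, for exactly the data quality reasons discussed above.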

Besides exploiting social networks we will also see social collaboration as a means to improve data quality. Social collaboration will go beyond collaboration within a single company and extend to ecosystems of manufacturers, distributors, resellers and end users. A good example of this is the social collaboration platform called Actualog, which is about sharing product master data and thereby improving product data quality.

Social Score Credibility

A recent piece from Fliptop is called What’s the Score. It is a thorough walk-through of what is usually called social scoring, done in influence scoring platforms within social media, where Klout, Kred and PeerIndex are the best-known services of that kind.

The Fliptop piece has a section about faking, which was also the subject of a recent post on this blog. That post is called Fact Checking by Mashing Up, and is about how to link social network profiles with other known external sources in order to detect cheating. Linking social network profiles with other external and internal sources is what is known as Social MDM, a frequent subject on this blog for several years.

A social score must of course be seen in context, as it matters a lot what you are influential about when you want to use social scoring for business. As told in the post Klout Data Quality, this was a challenge two years ago, and it is probably still the case. Here too, I think linking with other (big) data sources and letting Social MDM be the hub will help.
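
A minimal sketch of putting a social score in context could look like this, assuming (purely for illustration) that a raw score can be normalized to its service’s scale and weighted by a topical relevance factor. This is not how Klout, Kred or PeerIndex actually compute anything:

```python
def contextual_score(raw_score: float, max_score: float, topic_relevance: float) -> float:
    """Normalize a raw influence score (different services use different
    scales) and weight it by topical relevance in [0, 1]. The
    multiplicative weighting is an illustrative assumption."""
    if not 0.0 <= topic_relevance <= 1.0:
        raise ValueError("topic_relevance must be in [0, 1]")
    return (raw_score / max_score) * topic_relevance
```

So a profile scoring 80 out of 100 but only half relevant to your topic would rank below one scoring 60 with full relevance, which is the whole point of seeing the score in context.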

Kred
Taken from Kred for my Twitter handle.

PS: I have no idea why moron ended up there. Einstein is OK.

Real World Alignment and Continental Drift

You can find many great analogies for working with data quality and Master Data Management (MDM) in world maps. One example is reported in the post The Greenland Problem in MDM, which is about how different business units have different views of the same real world entity.

Real world alignment is of course not without challenges, not least because the real world changes, as reported in a Daily Mail article about how modern countries would be placed on the landmasses as they were 300 million years ago.

World 300 M years ago

The image above may very well show how many master data repositories today reflect the real world. Yep, we may have the country list covered well enough. We may even do quite well if we look at each geographical unit independently. However, the big picture doesn’t fit the world as it is today.

Psychographic Master Data Management

As told in the post Psychographic Data Quality, marketers are moving from demographic marketing to psychographic marketing, where a lot more data than before is used to get the right message to the right suspect at the right time. This affects the way we work with data quality around customer master data and eventually how we do multi-domain master data management.

Using data to build psychographic profiles is not only about lead generation. It’s usable throughout the whole customer master data life cycle, for example by:

  • Finding the best suspects at the right moment
  • Keeping prospects on the optimal track, coordinated with the prospect’s needs
  • Ensuring a well-received customer experience and facilitating up-sell and cross-sell
  • Preventing churn
  • Making win-back possible

These opportunities apply to business-to-consumer (B2C) as well as business-to-business (B2B).

Location master data management is essential in this quest as well, because we are not abandoning the basic demographic attributes in the psychographic world. We are building a deeper data universe on top of the traditional demographic (and firmographic) data. Having accurate location master data only helps here.
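
To illustrate the layering idea, here is a hypothetical sketch of a customer master record with a psychographic layer on top of the demographic attributes. The attribute names and lifecycle stages are illustrative, not any standard model:

```python
from dataclasses import dataclass, field

@dataclass
class CustomerProfile:
    """A customer master record layering psychographic attributes on
    top of traditional demographic/location master data. All names
    here are illustrative assumptions."""
    # Traditional demographic and location master data
    name: str
    postal_address: str
    # Psychographic layer built from (big) data sources
    interests: list = field(default_factory=list)  # e.g. mined from social streams
    preferred_channel: str = "email"               # inferred contact preference
    lifecycle_stage: str = "suspect"               # suspect, prospect, customer, win-back

profile = CustomerProfile("Jane Doe", "1 Example St, London")
profile.interests.append("cycling")
```

The demographic fields stay the anchor for identity resolution, while the psychographic fields accumulate over the lifecycle stages listed above.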

Mastering product master data is essential in the psychographic world too. This does not only apply to having your product hierarchies well managed for your own products, but will eventually also lead to a need for handling data on your competitors’ products and services in order to listen to social data streams.

Master Data Management (MDM) will extend to Social Master Data Management and must support wider exploitation of big data sources by being the hub for the psychographic customer profiles and the reference for descriptions of the product and service realm related to the psychographic attributes.

The Internet of Things and the Fat-Finger Syndrome

When coining the term “the Internet of Things” Kevin Ashton said:

“The problem is, people have limited time, attention and accuracy—all of which means they are not very good at capturing data about things in the real world.”

Indeed, many, many data quality flaws are due to a human typing the wrong thing. We usually don’t do that intentionally. We do it because we are human.

Typographical errors, and their sometimes dramatic consequences, are often referred to as the “fat-finger syndrome”.

As reported in the post Killing Keystrokes, avoiding typing is a way forward, for example by sharing data instead of typing in the same data (a little bit differently) within every organization.
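
Where typing cannot be avoided, one cheap guard against fat-finger near-duplicates is to warn when a newly typed value is suspiciously close to an existing one. This sketch uses a simple similarity ratio; the threshold is an illustrative assumption:

```python
from difflib import SequenceMatcher

def likely_duplicate(new_entry: str, existing: list, threshold: float = 0.85):
    """Return an existing value that the new entry is probably a typo
    of, or None. The threshold is an illustrative assumption."""
    for candidate in existing:
        ratio = SequenceMatcher(None, new_entry.lower(), candidate.lower()).ratio()
        # Very similar but not identical: probably a fat-finger variant
        if ratio >= threshold and new_entry.lower() != candidate.lower():
            return candidate
    return None
```

A data entry screen could call this before saving and ask the user to confirm, which kills the duplicate at the keystroke rather than in a later cleansing batch.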

The Internet of Things, being common access to data provided by a huge number of well-defined devices, is another development helping to avoid typos.

It’s not that data coming from these devices can’t be flawed. As debated in the post Social Data vs Sensor Data, there may be challenges in sensor data due to errors made by the humans setting up the sensors.

Also, misunderstandings by humans when combining sensor data for analytics and predictions may cause consequences as bad as those caused by the traditional fat-finger syndrome.

All in all, I guess we won’t see a decrease in the need to address data quality in the future; we will just need different approaches, methodologies and tools to fight bad data and information quality.

Are you interested in what all this will be about? Why not join the Big Data Quality group on LinkedIn?

Data Quality Luxury

I am a bit of a map addict. So when planning a visit to the City of London today I turned to Google Maps. When zooming in I got this map:

Louis Vuitton

The pink establishment in the lower middle is the Royal Exchange, which today is filled with luxury shops. My first guess is that Google Maps has overlaid the map with positions from a business directory, where Paul Smith was placed inside the building but Louis Vuitton, due to a precision issue, was placed outside in front of the building.

But there may be other explanations.

As this list of shops in the Royal Exchange shows, there apparently isn’t a Louis Vuitton shop there.

So maybe Google Maps is timely real world aligned and Louis Vuitton was kicked out of the building (for being too cheap?) and now only has a booth on the steps in front of the building?

Of course, being a data quality geek, yours truly made a real world alignment check.

My report:

  • There’s no booth with bags (fake or real) in front of the building.
  • Paul Smith is exactly at the position within the building shown on the map.
  • There’s no Louis Vuitton shop in the building.
  • There’s a Louis Vuitton shop, with only one bag with no price tag per window (so it must be real), in the next building behind the Royal Exchange.

Conclusion:

It’s a precision issue with business directory positions on a map, where one happens to be spot on and the other isn’t. You can’t expect data quality luxury.
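
This kind of real world alignment check can also be done in code: given a building footprint, a point-in-polygon test tells whether a directory position actually falls inside the building. The footprint coordinates below are hypothetical, not the Royal Exchange’s actual outline:

```python
def point_in_polygon(lon: float, lat: float, polygon) -> bool:
    """Ray-casting test: is the point inside a footprint polygon
    given as a list of (lon, lat) vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges crossed by a ray going east from the point
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

# Hypothetical building footprint and two directory positions
footprint = [(-0.088, 51.513), (-0.086, 51.513), (-0.086, 51.514), (-0.088, 51.514)]
in_building = point_in_polygon(-0.087, 51.5135, footprint)   # True: inside the footprint
out_front = point_in_polygon(-0.090, 51.5135, footprint)     # False: out front on the steps
```

A directory position failing this test could be flagged for steward review instead of being plotted on the steps of the wrong building.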

Know Your Supplier

Social Responsibility for Retailers and Distributors is No Longer an Option is the title of a new blog post by Paul Sirface on the Stibo Systems Datafusion blog.

Herein Paul writes:

“While many companies know that they have to respond to consumers’ demands, those with an active Master Data Management strategy have the best chance of responding effectively. Multi-domain Master Data Management (MDM) is the perfect place to begin organizing and collecting the data on product related information within the supply chain, including supplier compliance.”

Know Your Customer (KYC) is a well-established term within data management, linked to fraud protection and anti-money laundering.

Know Your Supplier (KYS) is indeed an equally important side of party master data management.

While customer master data management is evolving from handling mostly domestic customer data quality issues to also handling international ones, supplier master data management has always been about international data quality challenges for most businesses.

As with customer master data, the starting point is having supplier master data that is well aligned with the real world and that can be maintained to reflect changes in the real world.

The Future of Data Stewardship

Data Stewardship is performed by data stewards.

What is a Data Steward?

A steward may in a general sense be:

  • One employed in a large household or estate to manage domestic concerns – typically an old role.
  • An employee on a ship, airplane, bus, or train who attends to passengers’ needs – typically a new role.

My guess is that data stewardship will tend to move from the first kind of role to the latter kind of role when it comes to data.

The current data steward role is predominantly seen as oversight of the housekeeping of the internal enterprise data assets. It’s about keeping everything there clean and tidy. It involves having routines and rules that ensure that things are done properly with data, according to the traditions and culture of the enterprise.

Big Data Stewardship

In the future enterprises will rely much more on external data. Exploiting third-party reference data and open government data, and digging into big data sources such as social data and sensor data, will shift the focus from looking mostly at keeping the internal data fit for purpose.

As such, you as a data steward will become more like the steward on a ship, airplane, bus or train. Data will come and go. After a nice welcoming smile you will have to carefully explain the safety procedures. Some data will be fairly easy to handle – mostly just spending the time sleeping. Other data will be demanding, asking for this and that and changing its mind shortly after. Some data will be frequent travellers and some data will be there for the first time.

So, are you ready to attend the next batch of travelling data on board your enterprise?

star trek enterprise
