Staying in Doggerland

Currently I’m travelling a lot between my present home in London, United Kingdom, and Copenhagen, Denmark, where I have most of my family and where the iDQ headquarters is.

When flying between London and Copenhagen you pass over the southern North Sea. In the old days (8,000 years ago) this area was land inhabited by human beings. This ancient land is known today as Doggerland.

Sometimes I feel like a citizen of Doggerland, not really belonging in the United Kingdom or Denmark.

I still have some phone subscriptions in Denmark that my family and I use there. The phone company seems to have a hard time getting a 360-degree customer view, as I have two different spellings of my name and two different addresses, as seen on the screen when I look myself up in the iDQ service:

Besides having a Customer Relationship Mess (CRM), the phone company has recently changed its outsourcing partner (from CSC to TCS). This has caused a lot of additional mess, apparently including the closing of one of my subscriptions because they failed to register my payments. They say they did send a reminder, but to the oldest of the addresses, where I don’t pick up mail anymore.

I called to settle the matter and asked if they could correct the address not in use anymore. They couldn’t. The operator did some kind of query into the citizen hub similar to what I can do on iDQ:

However, the customer service guy’s screen just showed that I have no address in Denmark in the citizen hub (called CPR), so he couldn’t change the address.

Apparently the phone company correctly picked up an accurate address from the citizen hub when I got the subscription, but failed to update it (along with the other subscriptions) when I moved to another domestic address, and now it doesn’t have an adequate business rule for when I’m registered at a foreign address.

So now I’m staying in Doggerland.


Beyond Address Validation

The quality of contact master data is the number one data quality issue around.

Lately there has been a lot of momentum among data quality tool providers in offering services for getting at least the postal address in contact data right. The new services are improved by:

  • Being cloud-based, offering validation services that are implemented at data entry and based on fresh reference data.
  • Being international, and thus providing address validation for customer and other party data in a globalized world.

Capturing an address that is aligned with the real world may have a significant effect on business outcomes as reported by the tool vendor WorldAddresses in a recent blog post.

However, a valid address based on address reference data only tells you if the address is valid, not if the addressee is (still) at the address, and you can’t be sure that the name and other master data elements are accurate and complete. Therefore you often need to combine address reference data with other big reference data sources such as business directories and consumer/citizen reference sources.

Using business directories is not new at all. Big reference sources such as the D&B WorldBase and many other directories have been around for many years and have been a core element in many data quality initiatives with customer data in business-to-business (B2B) environments and with supplier master data.

Combining address reference data and business entity reference data makes things even better, also because business directories don’t always come with a valid address.

Using publicly available reference data when registering private consumers, employees and other citizen roles has until now been practiced in some industries and for special reasons. Therefore the big reference data and the services are out there and are being used today in some business processes.

Mashing up address reference data, business entity reference data and consumer/citizen reference data is a big opportunity for many organizations in the quest for high quality contact master data, as most organizations actually interact with both companies and private persons if we look at the total mix of business processes.
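To make the idea of such a mashup a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the three in-memory lookup tables stand in for real address, business and citizen reference services, and names like `check_contact` and `ContactCheck` are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical in-memory stand-ins for three external reference sources.
ADDRESS_REFERENCE = {"1 main street, london"}
BUSINESS_DIRECTORY = {"acme ltd": "1 main street, london"}
CITIZEN_REFERENCE = {"jane doe": "2 side road, london"}


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so lookups are comparable."""
    return " ".join(text.lower().split())


@dataclass
class ContactCheck:
    address_valid: bool               # found in the address reference data
    party_known: bool                 # found in a business or citizen source
    party_at_address: Optional[bool]  # None when the party is unknown


def check_contact(name: str, address: str) -> ContactCheck:
    key_name, key_address = normalize(name), normalize(address)
    address_valid = key_address in ADDRESS_REFERENCE
    # Look for the party in the business source first, then the citizen source.
    registered = BUSINESS_DIRECTORY.get(key_name) or CITIZEN_REFERENCE.get(key_name)
    if registered is None:
        return ContactCheck(address_valid, False, None)
    return ContactCheck(address_valid, True, registered == key_address)
```

Calling `check_contact("Jane Doe", "1 Main Street, London")` in this toy setup reports a valid address and a known person, but also that the person is not registered at that address, which is exactly the case address validation alone would miss.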

The next big source is going to be social network profiles. As told in the post Social Master Data Management, social media will be an additional source of knowledge about our business partners. Again, you won’t find the full truth here either. You have to mash up all the sources.


Sharing Bigger Data

Yesterday I attended an event called Big Data Forum 2012 held in London.

Big data still seems to be a buzz term with many definitions. Anyway, surely it is about datasets that are bigger (and more complex) than before.

The Olympics is Going to be Bigger

One session at the big data forum was about how the BBC will use big data in covering the upcoming London Olympics on the BBC website.

James Howard, who I know as speckled_jim on Twitter, explained that the bulk of the content on the BBC Sports website is not produced by the BBC. The data is sourced from external data providers, and the structure of the content is actually also based on the external sources.

So for the Olympics there will be rich content about all the 10,000 athletes coming from all over the world. The BBC editorial stuff will be linked to this content, of course with emphasis on the British athletes.

I guess that other broadcasting bodies and sports websites from all over the world will base the bulk of their content on the same sources and then more or less link targeted own-produced content in the same way and with their own look and feel.

There are some data quality issues related to sourcing such data, Jim said. For example, you may have your own guideline for how to spell names from other script systems.

I have noticed exactly that issue in the news from major broadcasters. For example, the BBC spells the name of the new Egyptian president Mursi while CNN spells it Morsi.
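As a small illustration of how such spelling variants might be caught when linking data from different sources, here is a sketch using Python’s standard `difflib` module. The function names and the 0.75 threshold are my own assumptions for the example; real name matching across script systems needs far more than a character similarity score.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def same_person_candidate(a: str, b: str, threshold: float = 0.75) -> bool:
    """Flag two spellings as a possible match; the threshold is an assumption."""
    return name_similarity(a, b) >= threshold
```

With this sketch, `same_person_candidate("Mursi", "Morsi")` flags the two spellings as a candidate pair for review, while clearly different names fall below the threshold.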

Bigger Data in Party Master Data Management

The postal validation firm Postcode Anywhere recently had a blog post called Big Data – What’s the Big Deal?

The post has the well known sentiment that you may use your resources better by addressing data quality in “small data” rather than fighting with big data and that getting valid addresses in your party master data is a very good place to start.

I can’t agree more about getting valid addresses.

However I also see some opportunities in sharing bigger datasets for valid addresses. For example:

  • The reference dataset for UK addresses typically based on the Royal Mail Postal Address File (PAF) is not that big. But the reference dataset for addresses from all over the world is bigger and more complex. And along with increasing globalization we need valid addresses from all over the world.
  • Rich address reference data will be more and more available. The UK PAF file is not that big. The AddressBase from Ordnance Survey in the UK is bigger and more complex. So is similar location reference data with more information than basic postal attributes from all over the world, not least when handled together.
  • A valid address based on address reference data only tells you if the address is valid, not if the addressee is (still) at the address. Therefore you often need to combine address reference data with business directories and consumer/citizen reference sources. That means bigger and more complex data as well.

Similar to how the BBC is covering the Olympics, my guess is that organizations will increasingly share bigger public address, business entity and consumer/citizen reference data and link it to private master data that they find more accurate (like the spelling example), along with essential data elements that better support their way of doing business and make them more competitive.

My recent post Mashing Up Big Reference Data and Internal Master Data describes a solution for linking bigger data within business processes in order to get a valid address and beyond.


Finding the Truth in Social Business Directories

LinkedIn has a section called companies. When browsing around on LinkedIn you are sometimes prompted to follow a company that LinkedIn thinks will be of interest to you.

The other day my suggestions included two identical logos for the old Master Data Management (MDM) vendor called Siperian. Curious and data quality geeky as I am, I checked, and there are actually two Siperians on LinkedIn companies:

Both have an identical headquarters address in California, USA.

So, even MDM vendors have created duplicates.

Also, Siperian was acquired by the data integration giant Informatica some years ago, so you would expect that the Siperians had been emptied. But that is not the case. Some Siperian folks still claim to work for one of the Siperian duplicates (though many also for Informatica at the same time).

Now, I was not sure about the legal status of the old Siperian company. So I went to another social network called Companybook. On that site the company registry is based on an external business directory.

Here it seems that the Siperian company in Toronto, Canada actually still exists, though marked as owned by Informatica.

So, I’m still looking for that single source of the truth out there. Until then I will mash up the external sources out there with my internal MDM vendor knowledge, as told in yesterday’s post called Mashing Up Big Reference Data and Internal Master Data.


Mashing Up Big Reference Data and Internal Master Data

Right now I’m working on a cloud service called instant Data Quality (iDQ™).

It is basically a very advanced search engine capable of being integrated into business processes in order to get data quality right the first time and at the same time reducing the time needed for looking up and entering contact data.

With iDQ™ you are able to look up what is known about a given address, company and individual person in external sources (I call these big reference data) and what is already known in internal master data.

From a data quality point of view this mashup helps with solving some of the core data quality issues almost every organization has to deal with, being:

  • Avoiding duplicates
  • Getting data as complete as possible
  • Ensuring maximal accuracy

The mashup is also a very good foundation for taking real-time decisions about master data survivorship.
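A survivorship decision of this kind can be sketched in a few lines of Python. This is only an illustration under assumptions of my own, not how iDQ™ works internally: each candidate record carries a `verified` date, and the rule "the most recently verified non-empty value wins per field" is just one possible survivorship rule among many.

```python
from datetime import date


def survive(candidates):
    """Build a golden record from internal and external candidate records.

    Each candidate is a dict with "source", "verified" (a date) and
    "values" (field name -> value). Hypothetical rule: per field, the most
    recently verified non-empty value survives.
    """
    golden = {}
    # Newest first, so the first non-empty value seen for a field wins.
    for record in sorted(candidates, key=lambda r: r["verified"], reverse=True):
        for field, value in record["values"].items():
            if value and field not in golden:
                golden[field] = value
    return golden


internal = {
    "source": "internal CRM",
    "verified": date(2010, 5, 1),
    "values": {"name": "J. Smith", "address": "Old Road 1", "phone": "12345678"},
}
external = {
    "source": "external directory",
    "verified": date(2012, 6, 1),
    "values": {"name": "John Smith", "address": "New Street 2", "phone": ""},
}
golden_record = survive([internal, external])
```

Here the name and address survive from the more recently verified external source, while the phone number survives from the internal record because the external source has none.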

The iDQ™ service helps with getting data quality right the first time. However, you also need Ongoing Data Maintenance in order to keep data at a high quality. Therefore iDQ™ is built for triggering subscription services for external reference data.

At iDQ we are looking for partners world-wide who see the benefit of having such a cloud based master data service connected to providing business-to-business (B2B) and/or business-to-consumer (B2C) data services, data quality services and master data management solutions.

Here’s the contact data: http://instantdq.com/contact/


Hierarchy Management in Social MDM

Hierarchy management is a core feature in master data management (MDM). When it comes to integrating social data and social network profiles into MDM, hierarchy management will be very important too.

Aggregated Level of Social MDM in B2C

The primarily privacy-related challenges of social MDM, not least within business-to-consumer (B2C), have been a topic of a lot of blogging lately. Examples are:

One way of overcoming the privacy considerations is linking to social data and social network profiles at an aggregate level.

Using aggregate level linking is already well known in direct marketing with the use of demographic stereotypes. These stereotypes are based on groups of consumers often defined by their address and/or their age. Combining this knowledge with product master data was examined in the post Customer Product Matrix Management.

Social MDM will add new dimensions to this way of using hierarchies in master data and linking the data across multiple channels without the need to uniquely identify a real world person in every aspect.

Contact Level Social MDM in B2B

As discussed in the post Business Contact Reference Data, social network profiles have a lot to offer within mastering business-to-business (B2B) contact data.

While access to external reference data at the account level has been around for many years through public and commercial (and even open) business directories, the problem of identifying and maintaining correct and timely data about the contacts at these accounts has been huge.

Integrating with social networks can help here and social networks are actually also integrating more and more with the traditional business directories. LinkedIn has business directory links for larger companies today and lately I noticed a new professional social network called CompanyBook that is based on linking your profile to a (complete) business directory. By the way: The business directory data available in CompanyBook is surprisingly deep, for example revenue data is free for you to grab.

When it comes to contact data they are basically maintained out there by you. A service like LinkedIn is often described as a recruitment service. In my eyes it is a lot more than that. It is along with similar services a goldmine (within a minefield) for getting MDM within B2B done much better.


Big Data and Multi-Domain Master Data Management

The possible connection between today’s hot buzz within IT, “big data”, and the good old topic of master data management has been discussed a lot lately. An example from CIO UK today is this article called Big data without master data management is a problem.

As said in the article there is a connection through big master data (and big reference data) to big transaction data. Big transaction data is what we usually would call big data, because these are the really big ones.

The two most mentioned kinds of big transaction data are:

  • Social data and
  • Sensor data

I also have seen a lot of connections between these big data and master data in multiple domains.

Social Data

Connecting social data to Master Data Management (MDM) is an ongoing discussion I have been involved in for the last three years, lately through the new LinkedIn group called Social MDM.

The customer master data domain is in focus here, as the immediate connection here is how to relate traditional systems of record holding customer master data and the systems of engagement where the big social data are waiting to be analyzed and eventually be a part of day-to-day customer centric business processes.

However, being able to analyze, monitor and take action on what is being said about specific products in social data is another option, and eventually that has to be linked to product master data. In product master data management the focus has traditionally been on your own (resell) products. Effectively listening to social data will mean that you also have to manage data about competing products.

Attaching location to social data has been around for a long time. Connecting social data to your master data will also require that your location master data are well aligned with the real world.

Sensor Data   

During the past many years I have been involved in data management within public transportation, where we have big data coming in from sensors of different kinds.

The big problem has for sure been to connect these transactions correctly to master data. The challenges here are described in the post Multi-Entity Master Data Quality.

The biggest problem is that all the different equipment generating the sensor data in practice can’t be at the same stage at the same time, and this will eventually create data that, if related without care, will show very wrong information about who the passenger(s) were, what kind of trip it was, where the journey happened and under which timetable.
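One way of relating such transactions "with care" is to look up the master data version that was valid at the time of the event, rather than the version that happens to be current when the data is loaded. The sketch below shows this kind of as-of lookup in Python; the timetable versions, their dates and the function name are invented for the illustration.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical timetable versions as (valid_from, timetable_id),
# sorted by valid_from.
TIMETABLE_VERSIONS = [
    (datetime(2012, 1, 1), "winter-2012"),
    (datetime(2012, 4, 1), "summer-2012"),
]


def timetable_for(event_time: datetime):
    """Return the timetable version valid at event_time, or None if none was."""
    starts = [valid_from for valid_from, _ in TIMETABLE_VERSIONS]
    # Index of the last version that started at or before the event.
    index = bisect_right(starts, event_time) - 1
    if index < 0:
        return None
    return TIMETABLE_VERSIONS[index][1]
```

A sensor reading from late on 31 March would then be related to the winter timetable even if it reaches the database after the summer timetable has taken effect.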


Sometimes Big Brother is Confused

Google Maps knows a lot. It knows about addresses and it knows about companies on these addresses.

As with most services it seems that Google Maps gets the reference data from different sources.

The other day I went to visit “Channel 4”, the British TV channel that hosted the UK “Big Brother” reality show until recently.

I typed in the address “124 Horseferry Road, London, United Kingdom” and got the point:

However, it seems that there is a large building up to the left called “Channel 4 Television”. Strange. Then I tried with “Channel 4, 124 Horseferry Road, London, United Kingdom”:

Oh, so I will find “Channel Four Television, 124 Horseferry Road” in the “Channel 4 Television” building only 0.2 miles west of “124 Horseferry Rd”:


Data Driven Data Quality

In a recent article Loraine Lawson examines how a vast majority of executives describe their business as “data driven” and how the changing world of data must change our approach to data quality.

As said in the article the world has changed since many data quality tools were created. One aspect is that “there’s a growing business hunger for external, third-party data, which can be used to improve data quality”.

Embedding third-party data into data quality improvement especially in the party master data domain has been a big part of my data quality work for many years.

Some of the interesting new scenarios are:

Ongoing Data Maintenance from Many Sources

As explained in the Wikipedia article about data quality, services such as the US National Change of Address (NCOA) service and similar services around the world have been around for many years as a basic use of external data for data quality improvement.

Using updates from business directories like the Dun & Bradstreet WorldBase and other national or industry specific directories is another example.

In the post Business Contact Reference Data I have a prediction saying that professional social networks may be a new source of ongoing data maintenance in the business-to-business (B2B) realm.

Using social data in business-to-consumer (B2C) activities is another option, though also haunted by complex privacy considerations.

Near-Real-Time Data Enrichment

Besides updating changes of basic master data from business directories, these directories typically also contain a lot of other data of value for business processes and analytics.

Address directories may also hold further information like demographic stereotype profiles, geo codes and property data elements.

Appending phone numbers from phone books and checking national suppression lists for mailing and phoning preferences are other forms of data enrichment used a lot related to direct marketing.

Traditionally these services have been implemented by sending database extracts to a service provider and receiving enriched files for uploading back from the service provider.

Lately I have worked with a new breed of self service data enrichment tools placed in the cloud making it possible for end users to easily configure what to enrich from a palette of address, business entity and consumer/citizen related third-party data and executing the request as close to real-time as the volume makes it possible.

Such services also include the good old duplicate check now much better informed by including third-party reference data.
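The idea of a duplicate check informed by third-party reference data can be sketched as follows in Python. The assumption, made up for this example, is that a reference directory maps known spelling variants of a company name to one directory identifier, so that internal records resolving to the same identifier are duplicate candidates.

```python
from collections import defaultdict

# Hypothetical reference directory: known name variant -> directory id.
REFERENCE_IDS = {
    "acme": "REF-001",
    "acme ltd": "REF-001",
    "globex": "REF-002",
}


def normalize(name: str) -> str:
    """Lower-case, drop full stops and collapse whitespace."""
    return " ".join(name.lower().replace(".", "").split())


def find_duplicates(records):
    """Group internal (record_id, name) pairs by the reference id they resolve to.

    Returns only the reference ids that more than one internal record points at.
    """
    groups = defaultdict(list)
    for record_id, name in records:
        reference_id = REFERENCE_IDS.get(normalize(name))
        if reference_id is not None:
            groups[reference_id].append(record_id)
    return {ref: ids for ref, ids in groups.items() if len(ids) > 1}
```

For the internal records `[(1, "Acme"), (2, "ACME Ltd."), (3, "Globex")]` the sketch reports records 1 and 2 as duplicates of each other, even though their names are spelled differently.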

Instant Data Quality in Data Entry

As discussed in the post Avoiding Contact Data Entry Flaws, third-party reference data such as address directories, business directories and consumer/citizen directories placed in the cloud may be used very efficiently in data entry functionality in order to get data quality right the first time and at the same time reduce the time spent in data entry work.

Not least in a globalized world, where names of people reflect the diversity of almost any nation today, where business names become more and more creative and where data entry is done at shared service centers manned with people from cultures with other address formatting rules, there is an increased need for data entry assistance based on external reference data.

When mashing up advanced search in third-party data and internal master data during data entry, you will solve most of the common data quality issues around avoiding duplicates and getting data as complete and timely as needed from day one.


Business Contact Reference Data

When selling data quality software tools and services I have often used external sources for business contact data. Not least when working with data matching and party master data management implementations in business-to-business (B2B) environments, I have seen uploads of these data in CRM sources.

A typical external source for B2B contact data will look like this:

Some of the issues with such data are:

  • Some of the contact names may be the same real world individual, as told in the post Echoes in the Database.
  • People change jobs all the time. The external lists will typically have entries verified some time ago, and when you upload them to your own databases, data will quickly become useless due to data decay.
  • When working with large companies in customer and other business partner roles you often won’t interact with the top level people, but with people at lower levels not reflected in such external sources.

The rise of social networks has presented new opportunities for overcoming these challenges as examined in a post (written some years ago) called Who is working where doing what?

However, I haven’t seen many attempts yet to automate and include working with social network profiles in business processes. Surely there are technical issues and not least privacy considerations in doing so, as discussed in the post Sharing Social Master Data.

Right now we have a discussion going on in the LinkedIn Social MDM group about examples of connecting social network profiles and master data management. Please add your experiences in the group here – and join if you aren’t already a member.
