In a recent article, Loraine Lawson examines how a vast majority of executives describe their business as “data driven” and how the changing world of data must change our approach to data quality.
As the article says, the world has changed since many data quality tools were created. One aspect is that “there’s a growing business hunger for external, third-party data, which can be used to improve data quality”.
Embedding third-party data into data quality improvement, especially in the party master data domain, has been a big part of my data quality work for many years.
Some of the interesting new scenarios are:
Ongoing Data Maintenance from Many Sources
As explained in the Wikipedia article about data quality, services such as the US National Change of Address (NCOA) service and similar services around the world have been around for many years as a basic use of external data for data quality improvement.
Using updates from business directories like the Dun & Bradstreet WorldBase and other national or industry-specific directories is another example.
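To make the idea of ongoing maintenance concrete, here is a minimal sketch of applying a batch of address changes from an external feed to internal party records. The record shapes (`PartyRecord`, `AddressChange`) and the exact-match rule on normalized name and old address are my own assumptions for illustration; they do not reflect the format of NCOA or of any business directory, and a real service would use far more tolerant matching.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PartyRecord:
    # Hypothetical internal master data record.
    party_id: str
    name: str
    address: str


@dataclass(frozen=True)
class AddressChange:
    # Hypothetical shape of one update from an external change feed.
    name: str
    old_address: str
    new_address: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def apply_changes(records: List[PartyRecord], changes: List[AddressChange]) -> List[PartyRecord]:
    """Return a new list where records matching a change get the new address."""
    lookup: Dict[Tuple[str, str], str] = {
        (normalize(c.name), normalize(c.old_address)): c.new_address for c in changes
    }
    updated = []
    for record in records:
        key = (normalize(record.name), normalize(record.address))
        new_address = lookup.get(key)
        # Records without a matching change are passed through untouched.
        updated.append(replace(record, address=new_address) if new_address else record)
    return updated
```

Running such a job on a schedule, rather than as a one-off cleansing exercise, is what turns the external source into ongoing maintenance.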
In the post Business Contact Reference Data, I predicted that professional social networks may become a new source of ongoing data maintenance in the business-to-business (B2B) realm.
Using social data in business-to-consumer (B2C) activities is another option, though one haunted by complex privacy considerations.
Near-Real-Time Data Enrichment
Besides providing updates to basic master data, business directories typically also contain a lot of other data of value for business processes and analytics.
Address directories may also hold further information such as demographic stereotype profiles, geocodes and property data elements.
Appending phone numbers from phone books and checking national suppression lists for mailing and phoning preferences are other forms of data enrichment widely used in direct marketing.
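The phone append and suppression check described above can be sketched in a few lines. Everything here is an assumption made for illustration: the `phone_book`, `no_call` and `no_mail` structures stand in for whatever a real phone directory or national suppression list delivers, and the name-plus-address key is a deliberately naive matching rule.

```python
from typing import Dict, Set, Tuple

Key = Tuple[str, str]


def make_key(name: str, address: str) -> Key:
    """Build a simple lookup key from a lowercased, whitespace-collapsed name and address."""
    return (" ".join(name.lower().split()), " ".join(address.lower().split()))


def enrich(
    contact: Dict[str, str],
    phone_book: Dict[Key, str],  # hypothetical phone directory keyed by name + address
    no_call: Set[str],           # hypothetical list of phone numbers opted out of calls
    no_mail: Set[Key],           # hypothetical list of name + address keys opted out of mail
) -> Dict[str, object]:
    """Return a copy of the contact with an appended phone number and contact permissions."""
    key = make_key(contact["name"], contact["address"])
    enriched: Dict[str, object] = dict(contact)
    # Keep an existing phone number; only append from the directory when none is present.
    phone = contact.get("phone") or phone_book.get(key)
    enriched["phone"] = phone
    enriched["may_call"] = phone is not None and phone not in no_call
    enriched["may_mail"] = key not in no_mail
    return enriched
```

Note that the original contact is left unchanged, so the enriched values can be reviewed before they are written back to the master data.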
Traditionally, these services have been implemented by sending database extracts to a service provider and receiving enriched files back for upload.
Lately I have worked with a new breed of self-service data enrichment tools placed in the cloud. They make it possible for end users to easily configure what to enrich from a palette of address, business entity and consumer/citizen related third-party data, and to execute the request as close to real time as the volume allows.
Such services also include the good old duplicate check, now much better informed by including third-party reference data.
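One way a duplicate check can be better informed by reference data is to resolve each internal record to a directory entry first, and then treat records that resolve to the same entry as duplicates even when their names look nothing alike. The sketch below assumes a hypothetical directory that lists known names per reference ID; the IDs, company names and exact-alias matching are all invented for illustration and are not how any particular tool works.

```python
import re
from collections import defaultdict
from typing import Dict, List


def normalize(name: str) -> str:
    """Lowercase, drop punctuation and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


def build_alias_index(directory: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every known name in the (hypothetical) directory to its reference ID."""
    index: Dict[str, str] = {}
    for ref_id, names in directory.items():
        for name in names:
            index[normalize(name)] = ref_id
    return index


def find_duplicates(records: Dict[str, str], alias_index: Dict[str, str]) -> List[List[str]]:
    """Group internal record IDs that resolve to the same reference entry."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for record_id, name in records.items():
        ref_id = alias_index.get(normalize(name))
        if ref_id is not None:
            groups[ref_id].append(record_id)
    # Only groups with more than one record are duplicate candidates.
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]
```

A name-only comparison would hardly flag a legal name and a trading name as the same party; the shared reference ID is what connects them here. Records that do not resolve to any directory entry would still need a conventional fuzzy match.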
As discussed in the post Avoiding Contact Data Entry Flaws, third-party reference data such as address directories, business directories and consumer/citizen directories placed in the cloud may be used very efficiently in data entry functionality in order to get data quality right the first time and, at the same time, reduce the time spent in data entry work.
Not least in a globalized world, where names of people reflect the diversity of almost any nation today, where business names become more and more creative and where data entry is done at shared service centers staffed with people from cultures with other address formatting rules, there is an increased need for data entry assistance based on external reference data.
By mashing up advanced search in third-party data and internal master data during data entry, you will solve most of the common data quality issues around avoiding duplicates and getting data as complete and timely as needed from day one.
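As a rough sketch of such a mashup, the function below searches internal master data and an external directory with the same query and returns one combined pick list. The `Candidate` type, the substring matching and the "internal first" ordering are my own assumptions for illustration; a real data entry service would use error-tolerant search against cloud-based reference sources.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    source: str   # "internal" or "external"
    name: str
    address: str


def search(
    query: str,
    internal: List[Tuple[str, str]],  # (name, address) pairs already in master data
    external: List[Tuple[str, str]],  # (name, address) pairs from a hypothetical directory
    limit: int = 5,
) -> List[Candidate]:
    """Return internal matches first, then external matches not already known internally."""
    terms = query.lower().split()
    if not terms:
        return []

    def matches(name: str, address: str) -> bool:
        haystack = f"{name} {address}".lower()
        return all(term in haystack for term in terms)

    # Internal hits come first so the user reuses an existing record instead of creating a duplicate.
    hits = [Candidate("internal", n, a) for n, a in internal if matches(n, a)]
    known = {(c.name.lower(), c.address.lower()) for c in hits}
    # External hits offer complete, current data for parties not yet in master data.
    for n, a in external:
        if matches(n, a) and (n.lower(), a.lower()) not in known:
            hits.append(Candidate("external", n, a))
    return hits[:limit]
```

Picking an internal candidate avoids the duplicate; picking an external one means the new record starts out with verified name and address data rather than whatever was typed.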