The Big Data Secret of SPECTRE

I’m sorry if this blog is turning into a travel blog. But here’s a third Paris story.

Boulevard Haussmann is one of the city’s great thoroughfares (to use the right meta-data term) and is known to be where we can find the headquarters of SPECTRE.

While visiting SPECTRE today I learned a lot about how SPECTRE is exploiting big data as an important way of keeping up with the tough competition in its industry sector. But all that is of course a secret.

When I asked whether they still have trouble with Bond, the answer was:

[Image: Barry Nelson as Jimmy Bond in 1954. Caption: Jimmy Bond when he was a field agent]

“Bond? – Jimmy Bond? – The sexy data scientist who is working for NSA?”

“Oh no,” I replied. “James Bond.”

“Oh, yes,” the SPECTRE chief data manipulator replied. “He was with British Intelligence. But he has been moved to the EU Data Protection Service. He just got his license to fine. Now 2% and soon 5% of our global turnover each time. Very dangerous man. Very dangerous.”


Sharing is the Future of MDM

Over at the DataRoundtable blog Dylan Jones recently posted an excellent piece called The Future of MDM?

Herein Dylan examines how a lot of people in different organizations spend a lot of time on trying to get complete, timely and unique data about customers and other business partners.

A better future for MDM (Master Data Management) could certainly be one where every organization doesn’t have to do the same work over and over again. While self-registration by customers is a way of taking the burden off private enterprises and public sector bodies, we may do even better by not making the customer the data entry clerk who types in the same information over and over again.

Today there are several available options for customer and other business partner reference data:

  • Public sector registries, which are becoming more and more open, for example for the address part or even deeper, with due respect for privacy considerations that may differ between business entities and individuals.
  • Commercial directories, often built on top of public registries.
  • Personal data lockers like the Mydex service mentioned by Dylan.
  • Social network profiles.

My guess is that the future of MDM is going to be a mashup exploiting the above options.

Oh, and as representatives of such a mashup service, we at iDQ recently made sure we had accurate, complete and timely information filled in on our LinkedIn Company profile.


Doctor Livingstone, I Presume?

The title of this blog post is a famous quote from history (which, like most quotes, is disputed) said by Henry Morton Stanley (who actually was born John Rowlands) when he found Doctor Livingstone (David Livingstone) deep in the African jungle in 1871 after a six-month expedition with 200 men through unknown territory.

Today it’s much easier to find people. Mobile phone use, credit card transactions and tweet positions lead the way, unless of course you really, really don’t want to be found, as was the case with Osama bin Mohammed bin Awad bin Laden.

One of the biggest issues in data quality is real world alignment of the data registered about persons. As told in the post Out of Africa, there are some issues in the way we handle such data, such as:

  • Cultural diversity: Names, addresses, national IDs and other basic attributes are formatted differently country by country and to some degree within countries. Most data models with a person entity are built on the format(s) of the country where they are designed.
  • Intended purpose of use: Person master data are often stored in tables made for specific purposes like a customer table, a subscriber table, a contact table and so on. Therefore the data identifying the individual is directly linked with attributes describing a specific role of that individual.
  • “Impersonal” use: Person data is often stored in the same table as other party master types such as business entities, projects, households et cetera.
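One possible way of avoiding the second and third issue is to keep the identity of a party apart from the roles that party plays. The following is only a minimal sketch of that idea, not a design taken from any particular MDM product. The class names `Party` and `Role`, the field names and the helper `roles_of` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Party:
    """A real-world party: a person, business entity, household and so on."""
    party_id: int
    party_type: str    # hypothetical values: "person", "business", "household"
    name_parts: dict   # free-form, so naming conventions are not tied to one country
    country_code: str  # can drive how names and addresses are formatted


@dataclass
class Role:
    """A purpose-specific role (customer, subscriber, contact) pointing at a party."""
    party_id: int
    role_type: str
    attributes: dict = field(default_factory=dict)


def roles_of(party: Party, roles: List[Role]) -> List[str]:
    """Return the role types held by one party, sorted alphabetically."""
    return sorted(r.role_type for r in roles if r.party_id == party.party_id)


person = Party(1, "person", {"given": "David", "family": "Livingstone"}, "GB")
roles = [Role(1, "subscriber"), Role(1, "customer"), Role(2, "customer")]
print(roles_of(person, roles))
```

With this split the same individual is registered once, while customer, subscriber and contact attributes live in separate role records.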

Besides that I have found that many organizations don’t use the sources available today in getting data quality right when it comes to contact data.

It’s not that I suggest actually hacking into mobile phone use logs and so on. There are a lot of sources that do not compromise privacy and that let you exploit external reference data as explained in the post Beyond Address Validation.


Hierarchy Management in Social MDM

Hierarchy management is a core feature in master data management (MDM). When it comes to integrating social data and social network profiles into MDM, hierarchy management will be very important too.

Aggregated Level of Social MDM in B2C

The primarily privacy-related challenges of social MDM, not least within business-to-consumer (B2C), have been a topic of a lot of blogging lately.  Examples are:

One way of overcoming the privacy considerations is linking to social data and social network profiles at an aggregate level.

Using aggregate level linking is already well known in direct marketing with the use of demographic stereotypes. These stereotypes are based on groups of consumers often defined by their address and/or their age. Combining this knowledge with product master data was examined in the post Customer Product Matrix Management.

Social MDM will add new dimensions to this way of using hierarchies in master data and linking the data across multiple channels without the need to uniquely identify a real world person in every aspect.

Contact Level Social MDM in B2B

As discussed in the post Business Contact Reference Data, social network profiles have a lot to offer within mastering business-to-business (B2B) contact data.

While access to external reference data at the account level has been around for many years through public and commercial (and even open) business directories, the problem of identifying and maintaining correct and timely data about the contacts at these accounts has been huge.

Integrating with social networks can help here and social networks are actually also integrating more and more with the traditional business directories. LinkedIn has business directory links for larger companies today and lately I noticed a new professional social network called CompanyBook that is based on linking your profile to a (complete) business directory. By the way: The business directory data available in CompanyBook is surprisingly deep, for example revenue data is free for you to grab.

When it comes to contact data they are basically maintained out there by you. A service like LinkedIn is often described as a recruitment service. In my eyes it is a lot more than that. It is along with similar services a goldmine (within a minefield) for getting MDM within B2B done much better.


Data Driven Data Quality

In a recent article Loraine Lawson examines how a vast majority of executives describe their business as “data driven” and how the changing world of data must change our approach to data quality.

As said in the article the world has changed since many data quality tools were created. One aspect is that “there’s a growing business hunger for external, third-party data, which can be used to improve data quality”.

Embedding third-party data into data quality improvement especially in the party master data domain has been a big part of my data quality work for many years.

Some of the interesting new scenarios are:

Ongoing Data Maintenance from Many Sources

As explained in the Wikipedia article about data quality, services such as the US National Change of Address (NCOA) service and similar services around the world have been around for many years as a basic use of external data for data quality improvement.

Using updates from business directories like the Dun & Bradstreet WorldBase and other national or industry specific directories is another example.

In the post Business Contact Reference Data I have a prediction saying that professional social networks may be a new source of ongoing data maintenance in the business-to-business (B2B) realm.

Using social data in business-to-consumer (B2C) activities is another option, though one also haunted by complex privacy considerations.

Near-Real-Time Data Enrichment

Besides providing updates to basic master data, business directories typically also contain a lot of other data of value for business processes and analytics.

Address directories may also hold further information like demographic stereotype profiles, geo codes and property data elements.

Appending phone numbers from phone books and checking national suppression lists for mailing and phoning preferences are other forms of data enrichment used a lot related to direct marketing.

Traditionally these services have been implemented by sending database extracts to a service provider and receiving enriched files for uploading back from the service provider.

Lately I have worked with a new breed of self service data enrichment tools placed in the cloud making it possible for end users to easily configure what to enrich from a palette of address, business entity and consumer/citizen related third-party data and executing the request as close to real-time as the volume makes it possible.

Such services also include the good old duplicate check now much better informed by including third-party reference data.
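To illustrate how a duplicate check can be better informed by reference data, here is a minimal sketch. It assumes a hypothetical in-memory directory, `REFERENCE_DIRECTORY`, that maps known name variants to a directory ID. The IDs, company names and function names are made up for the example, and a real service would use far more advanced matching than this exact lookup.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical third-party directory: known name variants mapped to a directory ID.
REFERENCE_DIRECTORY: Dict[str, str] = {
    "acme corporation": "REF-001",
    "acme corp": "REF-001",
    "globex ltd": "REF-002",
}


def normalize(name: str) -> str:
    """Lower-case, drop dots and commas, and collapse whitespace."""
    cleaned = name.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def reference_id(name: str) -> Optional[str]:
    """Look up the directory ID for a name, or None if the directory has no entry."""
    return REFERENCE_DIRECTORY.get(normalize(name))


def find_duplicates(records: Dict[int, str]) -> List[List[int]]:
    """Group internal record IDs that resolve to the same directory ID."""
    groups = defaultdict(list)
    for record_id, name in records.items():
        ref = reference_id(name)
        if ref is not None:
            groups[ref].append(record_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


internal = {1: "ACME Corporation", 2: "Acme Corp.", 3: "Globex Ltd", 4: "Initech"}
print(find_duplicates(internal))
```

Records 1 and 2 are spelled differently, but both resolve to the same directory entry, so they are reported as one duplicate group. Record 4 has no directory entry and is left for other matching techniques.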

Instant Data Quality in Data Entry

As discussed in the post Avoiding Contact Data Entry Flaws, third-party reference data such as address directories, business directories and consumer/citizen directories placed in the cloud may be used very efficiently in data entry functionality in order to get data quality right the first time and at the same time reduce the time spent on data entry work.

Not least in a globalized world where names of people reflect the diversity of almost any nation today, where business names become more and more creative and where data entry is done at shared service centers manned with people from cultures with other address formatting rules, there is an increased need for data entry assistance based on external reference data.

When mashing up advanced search in third-party data and internal master data during data entry, you will solve most of the common data quality issues around avoiding duplicates and getting data as complete and timely as needed from day one.
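As a rough sketch of such a mashup, the example below searches a hypothetical internal master list and a hypothetical third-party directory at the same time while the user types. The data, the field names and the function `search_on_entry` are assumptions made for illustration. A simple prefix match stands in for the advanced search a real service would offer.

```python
from typing import Dict, List

# Hypothetical internal master data and third-party reference data.
INTERNAL_MASTER = [
    {"id": "C-17", "name": "Acme Corporation", "city": "Paris"},
]
THIRD_PARTY = [
    {"ref": "REF-001", "name": "Acme Corporation", "city": "Paris"},
    {"ref": "REF-003", "name": "Acme Consulting", "city": "Lyon"},
]


def search_on_entry(typed: str) -> List[Dict[str, str]]:
    """Suggest matches for what the user has typed so far.

    Internal records come first so an existing party is picked instead of
    creating a duplicate. Reference entries already present internally
    (same name and city) are left out.
    """
    needle = typed.strip().lower()
    if not needle:
        return []
    suggestions = []
    seen = set()
    for row in INTERNAL_MASTER:
        if row["name"].lower().startswith(needle):
            suggestions.append({"source": "internal", "key": row["id"],
                                "name": row["name"], "city": row["city"]})
            seen.add((row["name"].lower(), row["city"].lower()))
    for row in THIRD_PARTY:
        identity = (row["name"].lower(), row["city"].lower())
        if row["name"].lower().startswith(needle) and identity not in seen:
            suggestions.append({"source": "reference", "key": row["ref"],
                                "name": row["name"], "city": row["city"]})
    return suggestions


for hit in search_on_entry("acme"):
    print(hit["source"], hit["key"], hit["name"], hit["city"])
```

Picking an internal hit avoids a duplicate, while picking a reference hit lets the new record be filled in from the directory instead of being typed by hand.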


Business Contact Reference Data

When selling data quality software tools and services I have often used external sources for business contact data. Not least when working with data matching and party master data management implementations in business-to-business (B2B) environments, I have seen these data uploaded into CRM sources.

A typical external source for B2B contact data will look like this:

Some of the issues with such data are:

  • Some of the contact names may refer to the same real world individual, as told in the post Echoes in the Database.
  • People change jobs all the time. The external lists will typically have entries verified some time ago, and when you upload them to your own databases, the data will quickly become useless due to data decay.
  • When working with large companies in customer and other business partner roles you often won’t interact with the top level people, but people in lower levels not reflected in such external sources.
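The data decay issue can at least be made visible by keeping the verification date with each uploaded contact. The sketch below flags contacts whose last verification is older than a threshold. The 365-day default, the field names and the sample contacts are assumptions for illustration only.

```python
from datetime import date
from typing import Dict, List


def stale_contacts(contacts: List[Dict], today: date,
                   max_age_days: int = 365) -> List[str]:
    """Return names of contacts last verified more than max_age_days ago."""
    return [
        c["name"]
        for c in contacts
        if (today - c["verified_on"]).days > max_age_days
    ]


# Hypothetical contacts taken from an external list.
contacts = [
    {"name": "A. Example", "verified_on": date(2010, 3, 15)},
    {"name": "B. Example", "verified_on": date(2012, 1, 10)},
]
print(stale_contacts(contacts, today=date(2012, 6, 1)))
```

Passing `today` explicitly keeps the check repeatable. Flagged contacts are candidates for re-verification rather than automatic deletion.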

The rise of social networks has presented new opportunities for overcoming these challenges as examined in a post (written some years ago) called Who is working where doing what?

However, I haven’t seen many attempts yet to automate and include working with social network profiles in business processes. Surely there are technical issues and not least privacy considerations in doing so, as discussed in the post Sharing Social Master Data.

Right now we have a discussion going on in the LinkedIn Social MDM group about examples of connecting social network profiles and master data management. Please add your experiences in the group here – and join if you aren’t already a member.


Social MDM, Privacy and Data Quality

The term “Social MDM” has been promoted quite well this week, not least as part of the social media information stream from the ongoing user conference of the tool vendor Informatica.

In a blog post called Informatica 9.5 for Big Data Challenge #2: Social, Judy Ko of Informatica introduces the opportunities and challenges.

In the closing remarks Judy says: “There’s still a long way to go to bring social data into the mainstream enterprise, in part due to concerns over privacy and the potential “creepiness” factor of mining social data.”

As I understand it the spearhead Social MDM part of the tool release is a Facebook App that provides connectivity between Facebook and the MDM solution.

Industry analyst R “Ray” Wang examines this in the blog post News Analysis: Informatica Launches MDM 9.5. The analysis states that it now is time to “drive data out of Facebook and not into Facebook”.

The opportunities and challenges of driving data out of Facebook were discussed here on the blog some years ago in a post called exactly that: Out of Facebook.

Balancing privacy with data hoarding is still for sure a subject that in no way is settled and probably never will be.

Connecting systems of record in traditional MDM solutions with social network profiles is no walkover either. The classic data quality challenges with uniqueness of records and completeness of data only get more difficult, but there are also great opportunities for getting a better picture of your customers and other business partners.

If you are interested in Social MDM and the related challenges and opportunities there is a LinkedIn group for Social MDM.

The group is new, less than a month old at the present time, but there is already a lot of content to dip into, including:
