You probably won’t find the truth (and salsa) inside your firewall

In a Data Roundtable blog post published today, called Big Data in Your Kitchen, Phil Simon says:

“CXOs who believe that “data” is simply the content in their own internal databases are increasing being seen as anachronistic. More progressive leaders understand that data is everywhere, including–and especially–external to the enterprise.”

Bringing in external data was also touched on recently by Kim Loughead of Informatica in the post Bring The Outside In: Why Integrating External Data Sources Should Be Your Next Data Integration Project.

Herein Kim emphasizes that: “Innovation is driven by data and that data largely resides outside your firewall”.

My humble work in bringing in the outside revolves around a service called instant Data Quality (iDQ™). This service is about exploiting the increasing choice of external directories holding valuable information about the individuals, companies, addresses and properties we have so much trouble reflecting in our party master data hubs.

What about you? Are you anachronistic or do you bring in the outside? Or as it will sound in Phil’s Big Data Kitchen: Will you miss salsa tonight?

Data Quality Technology for Marketing

Also this year I visited the Technology for Marketing and Advertising (TFMA) event in London in order to take part in as many prize drawings as possible. And oh, also to catch up on new developments in applying data quality to marketing.

Translation Management and Social Intelligence

SDL has the slogan: Because Business is Global. I like it. Besides doing translation management SDL also excels in social intelligence. As discussed with the SDL representative at the booth, a core competency in doing this is to link social data with master data entities, a subject I touched on yesterday on Informatica Perspectives in the post called Social MDM and Future Competitive Analysis.

Proof that it is a small world: Informatica is an SDL reference customer for localization, as told here.

Utilizing Location Data

Entergate, a survey tool specialist, focused on a new tool called pointSurvey. It’s so new I can’t find any links on their website. The concept is embedding maps into surveys that relate to location data. Using the tool respondents may point out places of interest or draw out routes.

Surely this is a better way to capture locations than typing in postal addresses.

eMail Verification

BriteVerify says on their site:

“At BriteVerify, we take verification seriously – in fact, making sure that you receive the most accurate information possible is pretty much the only thing that matters to us. Well, that and pancakes. Mmmmm… pancakes.”

Somehow I missed the pancakes. But the eMail verification presented by BriteVerify was good.
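As a toy illustration of the first layer of such a check (this is not BriteVerify's actual service, which also verifies the domain and the mailbox itself), a simple syntax screen could look like this:

```python
import re

# Minimal syntax screen only; a real verification service goes much further
# and checks that the domain and mailbox actually exist.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")

def looks_valid(address: str) -> bool:
    """Return True if the address at least has a plausible shape."""
    return EMAIL_RE.match(address) is not None
```

Syntax screening like this catches the obvious typos at data entry; the harder (and more valuable) part is the verification behind it.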

Beware of False Positives in Data Matching

In a recent blog post by Kristen Gregerson of Satori Software you may learn A Terrible Tale in which the identities of two different real world individuals were merged into one golden record, with the most horrible result you may imagine tied to a recent special day celebrating the results of the other kind of matching going around.

Join the Data Matching Group on LinkedIn

As reported by Jim Harris some years ago in the post The Very True Fear of False Positives, the bad things happening from false positives in data matching are indeed a hindrance to doing data matching.

If we do data matching we should be aware that false positives will happen, we should know the probability of them happening, and we should know how to avoid the resulting heartache.
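To make the risk concrete, here is a toy sketch using only Python's standard library (not any vendor's match engine): two different real world persons with confusingly similar names get merged at a loose threshold.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Crude name similarity in [0, 1] - a stand-in for a real match engine."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def is_match(a: str, b: str, threshold: float) -> bool:
    return similarity(a, b) >= threshold

# Two different real world persons with confusingly similar names:
print(is_match("John A. Smith", "Jon A. Smith", 0.80))  # True: a false positive
print(is_match("John A. Smith", "Jon A. Smith", 0.97))  # False: stricter threshold
```

The heartache lives in the threshold: loosen it and different persons merge (false positives); tighten it and genuine duplicates slip through (false negatives).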

Indeed, using a data matching tool is better than relying on simple database indexes, and indeed there are differences in how good various data matching tools are at doing the job, not least at doing it under different circumstances, as told in the post What is a best-in-class match engine?

Curious about how data matching tools work (differently)? There is an eLearning course available co-authored by yours truly. The course is called Data Parsing, Matching and De-duplication.

Data Quality Does Matter!

The title of this blog post is the title of a seminar about data quality and data matching taking place in Copenhagen:

Data Quality Does Matter

The seminar is hosted by Affecto, a data management consultancy firm with a strong presence in the Nordic and Baltic countries, and Informatica, a leading data management tool vendor worldwide.

There will be three sessions on the seminar:

  • First you will learn about steps for working with a data quality platform to improve BI and master data management solutions.
  • Then you will see a walkthrough of the architecture and capabilities of the Informatica Data Quality platform.
  • And finally you shouldn’t miss the session with yours truly on data matching based on an Informatica Perspectives blog post called Five Future Data Matching Trends.

Hope to see you in Copenhagen, København, Köpenhamn, Kopenhagen, Copenhague, Copenaghen, Hafnia or whatever name you use for that place as told in the post about data matching and Diversity in City Names.

What’s so special about your party master data?

My last blog post was called Is Managing Master Data a Differentiating Capability? The post is an introduction to a conference session being a case story about managing master data at Philips.

During my years working with data quality and master data management it has always struck me how differently organizations are managing the party master data domain while in fact the issues are almost the same everywhere.

First of all, party master data describe real world entities that are the same to everyone. Everyone is gathering data about the same individuals and the same companies, sitting at the same addresses and having the same digital identities. The real world also comes in hierarchies, as households, company families and contacts belonging to companies, which are the same to everyone. We may call that the external hierarchy.

Added to that, everyone has some kind of demand for intended duplicates, as a given individual or company may have several accounts for specific purposes and roles. We may call that the internal hierarchy.

A party master data solution will optimally reflect the internal hierarchy, while most of the surrounding business processes are supported by CRM systems, ERP systems and special solutions for each industry.

Reflecting the external hierarchy will be the same for everyone, and there is no need for anyone to reinvent the wheel here. There are already plenty of data models, data services and data sources out there.
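As a purely hypothetical sketch (all names and identifiers invented), the two hierarchies could be modelled like this:

```python
from dataclasses import dataclass, field

@dataclass
class Party:
    """One node in the external hierarchy: a real world company or person."""
    name: str
    address: str
    parent: "Party | None" = None  # e.g. company family or household
    # Internal hierarchy: intended duplicates, i.e. purpose-specific accounts
    accounts: list = field(default_factory=list)

group = Party("Example Group A/S", "Main Street 1, Copenhagen")
subsidiary = Party("Example Retail A/S", "Main Street 1, Copenhagen", parent=group)
subsidiary.accounts += ["CRM-1001", "ERP-77"]  # same company, two roles
```

The external links (parent) are the same to everyone and can come from shared reference data; the accounts list is where your own business-specific view lives.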

Right now I’m working on a service called instant Data Quality that is capable of embracing and mashing up external reference data sources for addresses, properties, companies and individuals from all over the world.

The iDQ™ service already fits in at several places as told in the post instant Data Quality and Business Value. I bet it fits your party master data too.

Tomorrow’s Data Quality Tool

In a blog post called JUDGEMENT DAY FOR DATA QUALITY, published yesterday, Forrester analyst Michele Goetz writes about the future of data quality tools.

Michele says:

“Data quality tools need to expand and support data management beyond the data warehouse, ETL, and point of capture cleansing.”

and continues:

“The real test will be how data quality tools can do what they do best regardless of the data management landscape.”

As described in the post Data Quality Tools Revealed there are two things data quality tools do better than other tools:

  • Data profiling and
  • Data matching

Some of the new challenges I have worked with in designing tomorrow’s data quality tools are:

  • Point of capture profiling
  • Searching using data matching techniques
  • Embracing social networks

Point of capture profiling:

The sweet thing about profiling your data while you are entering them is that analysis and cleansing become part of the on-boarding business process. The emphasis moves from correction to assistance as explained in the post Avoiding Contact Data Entry Flaws. Exploiting big external reference data sources at the point of capture is a core element in getting it right before judgment day.
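A minimal sketch of the idea (the field names and rules here are hypothetical, not the iDQ™ service): profile the record while it is being typed and show hints immediately, instead of cleansing in batch afterwards.

```python
def profile_on_entry(record: dict) -> list:
    """Flag issues while the record is being entered, for immediate assistance."""
    issues = []
    if not record.get("name", "").strip():
        issues.append("name is missing")
    postal = record.get("postal_code", "")
    if postal and not postal.isdigit():
        issues.append("postal code should be digits only (Danish format assumed)")
    return issues

# The data entry screen can display these hints as the user types:
print(profile_on_entry({"name": " ", "postal_code": "21OO"}))
```

In a real point of capture solution the rules would of course be backed by external reference data, not hard-coded checks.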

Searching using data matching techniques:

Error tolerant searching is often the forgotten capability when core features of Master Data Management solutions and data quality tools are outlined. Applying error tolerant search to big reference data sources is, as examined in the post The Big Search Opportunity, a necessity for getting it right before judgment day.
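As a toy example of error tolerant search (using Python's standard library and an invented reference list, not a real match engine):

```python
from difflib import get_close_matches

# A tiny stand-in for a big external reference data source of city names
reference = ["Copenhagen", "Cologne", "Coventry", "Gothenburg"]

# An error tolerant lookup still finds the right entry despite the typo
print(get_close_matches("Copenhagn", reference, n=3, cutoff=0.7))  # ['Copenhagen']
```

An exact database index would return nothing for the misspelled query; error tolerant search returns the intended record.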

Embracing social networks:

The growth of social networks during recent years has been almost unbelievable. Traditionally data matching has been about comparing names and addresses. As told in the post Addressing Digital Identity, it will be a must to be able to link the new systems of engagement with the old systems of record in order to get it right before judgment day.

How have you prepared for judgment day?

Future Identities

Recently I stumbled upon a report called Future Identities in the UK. The purpose of the report is to give the UK government insight into how the identities of citizens will develop over the next 10 years. But the insight certainly also applies to how private companies will have to react to this development, and not just in the UK.

The report talks about three different kinds of identities:

  • Biometric identities
  • Biographical identities
  • Social identities

Applied to data quality and master data management I think these future kinds of identities will have these consequences:

Biometric identities relate to hard core identity resolution as in fighting terrorism, crime investigation and physical access control, but are sometimes even used in simple commercial checks as told in the post Real World Identity. My guess is that we will see biometrics used more as a means to achieve better data quality, but not considerably more, due to return on investment, as also examined in the post Citizen ID and Biometrics.

Biographical identities and the related attributes resemble what we often also call demographic attributes, used in handling data for direct marketing and other purposes of data management. Direct marketing may, as reported in the post Psychographic Data Quality, be in transition to go deeper into big data in order to become psychographic marketing.

Social identities are the new black. As discussed on this blog, latest in the post Defining Social MDM, my guess is that social master data management is going to be big and has to be partly interwoven with using traditional biographical attributes and even, like it or not, biometric attributes. The art of doing that in a proper way is going to be very exciting.

Multi-Entity MDM vs Multidomain MDM

At the upcoming MDM Summit Europe 2013 in London this April you will be able to learn about Multi-Entity MDM as well as Multi-Domain MDM.

So, what is the difference between Multi-Entity MDM and Multi-Domain MDM?

To my knowledge they are two terms with the same meaning: doing the two main preceding disciplines for MDM, Customer Data Integration (CDI) and Product Information Management (PIM), at the same time, presumably using the same software brand.

Multi-Entity MDM was probably the first term used and is still used by The MDM Institute, while Multidomain MDM is used by Gartner (the analyst firm) and most tool vendors today. For example, Stibo Systems has recently been focusing on multidomain, as in this press release about latest achievements.

Talking about Gartner and the vendor crowd, Gartner analyst Andrew White wrote a blog post the other day: Round-Up of Master Data Management (MDM) 2012, and looking forward to 2013.

Herein White bashes the vendors by saying:

“Vendor hype related to multidomain …. continued to be far in excess of reality”.

What do you think? Is Andrew White right about that? And what about Multi-Entity MDM, is that any better?

The Three Big M’s of Data Quality

Most organizations have a lot of data quality issues, and there is a wealth of possible solutions to deal with these challenges.

What you usually do is that you categorize the problems into three different types of best resolutions:

Mañana

You could go ahead with solving the data quality problems today but probably you have better and more important things to do right now.

Your organization may have a global SAP rollout going on or other resource demanding implementations. Therefore it is most wise to deal with the data quality issues when everything is running smoothly.

Mission impossible

Maybe a resolution has been tried before and didn’t work. Chances that alternate people management, different orchestration of processes and developments in available technology will change that are very slim.

May the force be with you

Many problems solve themselves over time or hopefully don’t get noticed by anyone. If things get ugly you always have your lightsaber.

Doing MDM in the Cloud

As reported in the post What to do in 2012, doing Master Data Management (MDM) in the cloud is one of three trends within MDM that, according to Gartner (the analyst firm), will shape the MDM market in the coming years.

Doing MDM in the cloud is an obvious choice if all your operational applications are in the cloud already. Such a solution was presented on Informatica Perspectives in the blog post Power the Social Enterprise with a Complete Customer View. The post includes a video where the situation with multiple instances of SalesForce.com solutions within the same enterprise is supported by a master data backbone in the cloud.

But even if all your operational applications are on premise you may start with lifting some master data management functionality up in the cloud. I am currently working with such a solution.

When onboarding customer (and other party) master data much of the basic information needed is already known in the cloud. Therefore lifting the onboarding functionality up into the cloud makes a lot of sense. This is the premise, so to speak, for the MDM edition of the instant Data Quality (iDQ) solution that we are working on these days.

Cloud services for the other prominent MDM domain being product master data also makes a lot of sense. As told in the post Social PIM a lot of basic product master data may be shared in the cloud embracing the supply chain of manufacturers, distributors, retailers and end users.

In both these cases some of the master data management functionality is handled in the cloud, while the data integration stuff takes place where the operational applications reside, be that in the cloud and/or on premise.
