Getting eMail Addresses Right the First Time

Checking whether an eMail address will bounce is essential for executing and measuring campaigns, newsletter operations and other activities based on sending eMails, as explained on the site Don’t Bounce by BriteVerify.

A good principle within data quality prevention and Master Data Management (MDM) is the first time right approach. There is a 1-10-100 rule saying:

“One dollar spent on prevention will save 10 dollars on correction and 100 dollars on failure costs”.

(Replace dollars with your favorite currency: Euros, pounds, rubles, rupees, whatever.)
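The arithmetic behind the rule can be sketched in a few lines. This is only an illustration: it assumes the 1, 10 and 100 figures are applied as relative costs per bad record, and the names `COST_PER_RECORD` and `cost_of_bad_records` are made up for this example.

```python
# Relative cost per bad record at each stage, read off the 1-10-100 rule.
# Applying the rule per record is an assumption made for this sketch.
COST_PER_RECORD = {"prevention": 1, "correction": 10, "failure": 100}


def cost_of_bad_records(bad_records: int, stage: str) -> int:
    """Return the relative cost of handling bad records at a given stage."""
    if stage not in COST_PER_RECORD:
        raise ValueError(f"Unknown stage: {stage!r}")
    return bad_records * COST_PER_RECORD[stage]


if __name__ == "__main__":
    # Compare the three stages for the same number of bad records.
    for stage in COST_PER_RECORD:
        print(stage, cost_of_bad_records(500, stage))
```

With 500 bad records the relative costs come out as 500, 5,000 and 50,000 units of your favorite currency.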

This also applies to capturing the eMail address of a (prospective) customer or other business partner. Many business processes today require communication through eMails in order to save costs and speed things up. If you register an invalid eMail address, or allow self-registration of an invalid eMail address, you have got yourself some costly scrap and rework, or maybe lost an opportunity.

As a natural consequence, the instant Data Quality MDM Edition, besides ensuring right names and correct postal addresses, also checks for valid eMail addresses.
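A first-time-right check at the point of capture could look something like the sketch below. It is not how instant Data Quality or BriteVerify works. It only rejects input that cannot be an eMail address on syntax alone, using a deliberately loose pattern that is an assumption for illustration. Whether a well-formed address will actually bounce can only be answered by a verification service, which is not shown here. The function names are hypothetical.

```python
import re

# Deliberately loose pattern: a non-empty local part, exactly one "@",
# and a domain with at least one dot. This is an illustrative assumption,
# not a full RFC 5322 validator and not a bounce check.
_EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s.]+(\.[^@\s.]+)+$")


def looks_like_email(address: str) -> bool:
    """Return True if the address passes a basic syntax check."""
    return bool(_EMAIL_PATTERN.match(address.strip()))


def capture_email(address: str) -> str:
    """Reject obviously invalid input at registration time."""
    cleaned = address.strip()
    if not looks_like_email(cleaned):
        raise ValueError(f"Not a plausible eMail address: {cleaned!r}")
    return cleaned
```

Calling `capture_email` in a registration form handler means the typo is caught while the person is still there to correct it, which is the prevention end of the 1-10-100 rule.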


Social IT and Business

The distinction between IT and business is an often-used concept in a lot of disciplines, from Enterprise Architecture (EA) over data quality management to Master Data Management (MDM). While it may be a good concept to use when assigning responsibilities and finding out who should be driving what, I have personally always kind of disliked it. It’s a concept used practically only by the IT side, and in my eyes IT is part of the business just as much as marketing, sales, accounting and all the other departments are.

So, now when business is going social, how does that affect the IT and business distinction?

Today, social business is largely done in enterprises without involving the internal IT department. The use of data and functionality in social business, done with so-called systems of engagement, is much more externally focused than the internally focused traditional systems of record.

However, if enterprises are to harvest the fruits of systems of engagement it must be done by linking the new systems of engagement to the old systems of record and that means involving internal IT. Please do this without dividing the enterprise into IT and business again. Please be more social this time.


What’s Different about MDM in France?

As told in yesterday’s post about French MDM vendors, I have been at an MDM (Master Data Management) event in Paris today.

An interesting takeaway from the event’s presentations and the mingling is that there are some differences between how MDM is handled in France (and the rest of continental Europe as I know it) and how it is handled in the English-speaking world. Some observations are:

People, process and technology

Many MDM gurus (and gurus in other disciplines) stress that you shouldn’t focus on technology (alone) but take people and process very seriously too. That’s not so important in France. Everyone knows that already.

Multi-Domain MDM

In France it’s common to start with product MDM and then continue with customer (party) MDM.

The Quadrant Magic

If you made a Gartner Magic Quadrant for MDM solutions in France you wouldn’t have a quadrant for customer data and another one for product data. There would be only one quadrant for (multi-domain) MDM and some of the local vendors would be leaders as discussed in the post MDM for Customer Data Quadrant: No challengers. No visionaries.


Trois acteurs français dans le marché du MDM (Three French Players in the MDM Market)

I am looking forward to going to Paris today in order to be at the Forum MDM, arranged by Micropole, which takes place tomorrow.

The French MDM market is a vibrant one, with several French-grown solutions also doing well on the world-wide MDM market:

Semarchy

As told in the post Eating the MDM Elephant, the MDM solution provider Semarchy has emphasized making an MDM platform that supports evolutionary MDM, which means that you don’t have to pre-think every aspect of your to-be MDM implementation before going ahead. This is indeed a recommendable approach.

Orchestra Networks

Master data and reference data are close terms, besides sometimes actually being used synonymously, and so are Master Data Management (MDM) and Reference Data Management (RDM). Orchestra Networks seems to be at the forefront in offering a well-founded solution for both disciplines at the same time.

Talend

The open source provider Talend has over the years developed a solution that started with data integration, then added data quality functionality, some years ago also included MDM and, in touch with the way of the world today, is now also embracing big data.

All the sponsors at the Forum MDM in Paris 24th October 2013


MDM for Customer Data Quadrant: No challengers. No visionaries.

The Gartner Magic Quadrant for Master Data Management of Customer Data Solutions is out. You may have a free look at it, for example by going through Talend’s press release on the matter here.

MDM Brands
PS: This isn’t the quadrant. Just a few vendor names.

It’s not a crowded picture. There are few solutions in there and several come from the same brand. And there are no challengers and no visionaries.

Gartner expects that challengers may arrive later, for example from among those who are building up multi-domain MDM solutions right now. It should be interesting to see what comes first: challengers in the customer MDM quadrant or a multi-domain MDM quadrant. Other analysts have a single view of MDM vendors.

From where will we see the visionaries then? Gartner says current niche players may spread into the visionary field. If we see new vendors emerging into the visionary field, it may in my eyes be based on the Growing Variety in Big Master Data, which includes widening the term customer data into taking care of all kinds of party master data.


What Should a Data Quality Tool Do?

Earlier this month we had this year’s magic quadrant for data quality tools from Gartner (the analyst firm). The magic quadrant always stirs up posts about data quality tools, and this is true again this year. For example, yours truly had a post here, and Lorraine Lawson had her say on ITBusinessEdge in the post Eight Questions to Ask Before Investing in Data Quality Tools.

Some of the questions asked by Lorraine relate to a founding principle of the magic quadrant, namely that a data quality tool should be able to do everything within data quality and even, as stated in Lorraine’s question 2: Can it be embedded into business process workflows or other technology-enabled programs or initiatives, such as MDM and analytics?

Thinking that question through to the end inevitably makes you wonder where data quality tools end and where applications for different business processes, with data quality built in, take over.

That question is close to me as I’m right now working with a tool for maintaining party master data with two main advantages:

  • Making the business process as smooth as possible
  • Ensuring data quality before data entry and throughout the data lifetime

So, it’s not a true data quality tool. It doesn’t do everything data quality. It’s not a true MDM platform. It doesn’t do everything master data. But I would say that it does do what it does better than the full monty behemoths.


Building an instant Data Quality Service for Quotes

Yesterday’s post, Introducing the Famous Person Quote Checker, touched upon the issue of all the quotes floating around in social media about things apparently said by famous persons.

The bumblebee can’t fly faster than the speed of light – Albert Einstein

If you were to build a service that could avoid postings with disputable quotes, what considerations would you have then? Well, I guess pretty much the same considerations as with any other data quality prevention service.

Here are three things to consider:

Getting the reference data right

Finding the right sources for say reference data for world-wide postal addresses was discussed in the post A Universal Challenge.

The same way, so to speak, it will be hard to find a single source of truth about what famous persons actually said. It will be a daunting task to make a registry of confirmed quotes.

Embracing diversity

Staying with postal addresses this blog has a post called Where the Streets have one Name but Two Spellings.

The same way, so to speak again, quotes are translated, transliterated and have gone through transcription from the original language and writing system. So every quote may have many true versions.

Where to put the check?

As examined in the post The Good, Better and Best Way of Avoiding Duplicates there are three options:

1) A good and simple option could be to periodically scan through postings in social media and, when a disputable quote is found, send an eMail to the culprit who did the posting. However, it’s probably too late, as even if you for example delete your tweet, the 250 retweets will still be out there. But it’s a reasonable way to start marking up all the disputable quotes out there.

2) A better option could be a real-time check. You type in a quote on a social media site and the service prompts you: “Hey Dude, that person didn’t say that”. The weak point is that you already did all the typing, and now you have to find a new quote. But it will work when people try to share disputable quotes.

3) The best option would be that you start typing “If you can’t explain it simply… “ and the service prompts a likely quote as: “Everything should be as simple as it can be, but not simpler – Albert Einstein”.
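Options 2 and 3 can be sketched roughly as follows. Everything here is an assumption made for illustration: the `CONFIRMED_QUOTES` registry is hypothetical and holds only the example quote from this post (not verified reference data), the 0.9 similarity threshold is arbitrary, and the type-ahead is plain prefix matching, which is much simpler than prompting a related quote for text that does not match literally.

```python
from difflib import SequenceMatcher

# Hypothetical registry of confirmed quotes as (person, quote) pairs.
# The single entry is the example used in this post, not verified data.
CONFIRMED_QUOTES = [
    ("Albert Einstein",
     "Everything should be as simple as it can be, but not simpler"),
]


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(text.lower().split())


def is_confirmed(person: str, quote: str, threshold: float = 0.9) -> bool:
    """Option 2: check a fully typed quote against the registry."""
    for known_person, known_quote in CONFIRMED_QUOTES:
        if _normalize(known_person) != _normalize(person):
            continue
        ratio = SequenceMatcher(
            None, _normalize(quote), _normalize(known_quote)
        ).ratio()
        if ratio >= threshold:
            return True
    return False


def suggest(typed_so_far: str, limit: int = 3) -> list[str]:
    """Option 3: offer confirmed quotes that start with the typed text."""
    prefix = _normalize(typed_so_far)
    if not prefix:
        return []
    matches = [
        f"{quote} – {person}"
        for person, quote in CONFIRMED_QUOTES
        if _normalize(quote).startswith(prefix)
    ]
    return matches[:limit]
```

The fuzzy comparison in `is_confirmed` is a small nod to the diversity point above: a quote with slightly different punctuation or spelling should still be recognized as a version of a confirmed one.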


Entity Resolution and Big Data

The Wikipedia article on Identity Resolution has this take on the difference between good old data matching and Entity Resolution:

“Here are four factors that distinguish entity resolution from data matching, according to John Talburt, director of the UALR Laboratory for Advanced Research in Entity Resolution and Information Quality:

  • Works with both structured and unstructured records, and it entails the process of extracting references when the sources are unstructured or semi-structured
  • Uses elaborate business rules and concept models to deal with missing, conflicting, and corrupted information
  • Utilizes non-matching, asserted linking (associate) information in addition to direct matching
  • Uncovers non-obvious relationships and association networks (i.e. who’s associated with whom)”

I have a gut feeling that Data Matching and Entity (or Identity) Resolution will melt together in the future as expressed in the post Deduplication vs Identity Resolution.

If you look at the above mentioned factors that distinguish data matching from identity resolution, some of the often mentioned features in the new big data technology shine through:

  • Working with unstructured and semi-structured data is probably the most mentioned difference between working with small data versus working with big data.
  • Working with associations is a feature of graph databases or other similar technologies as mentioned in the post Will Graph Databases become Common in MDM?

So, in the quest to evolve the matching of small data into Entity (or Identity) Resolution, we will be helped by general developments in working with big data.
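One of the factors quoted above, using asserted links in addition to direct matching, can be illustrated with a small sketch. It is a toy under stated assumptions and not how any particular tool works: direct matching is a simple string similarity from Python’s `difflib` with an arbitrary 0.85 threshold, asserted links are supplied as pairs of record ids, and the two are combined with a union-find structure. The function names and the sample records are made up.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar(a: str, b: str, threshold: float = 0.85) -> bool:
    """Direct matching: compare two names by string similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def resolve(records: dict[str, str],
            asserted_links: list[tuple[str, str]]) -> list[set[str]]:
    """Cluster record ids using direct matches plus asserted links."""
    parent = {rid: rid for rid in records}

    def find(x: str) -> str:
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        parent[find(a)] = find(b)

    # Step 1: direct matching on the record content itself.
    for a, b in combinations(records, 2):
        if similar(records[a], records[b]):
            union(a, b)

    # Step 2: asserted links that no string comparison would find.
    for a, b in asserted_links:
        union(a, b)

    clusters: dict[str, set[str]] = {}
    for rid in records:
        clusters.setdefault(find(rid), set()).add(rid)
    return list(clusters.values())


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    sample = {"r1": "John Smith", "r2": "Jon Smith", "r3": "J. S. Consulting"}
    print(resolve(sample, asserted_links=[]))
    print(resolve(sample, asserted_links=[("r2", "r3")]))
```

Without the asserted link, only the two similarly spelled records end up together. With it, the third record joins the same entity even though its name does not resemble the others.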


Why don’t MDM Implementations Stick?

Former Gartner (the analyst firm) MDM guru John Radcliffe has established his own business and blog, and started off by revealing some dirty secrets about how sticky MDM implementations are. Quote:

“Another interesting thing was something that we found during Magic Quadrant reference checking. Increasingly the initial MDM champion, who made the business case, chose the software and led the MDM program had now moved on. The new guy (or gal) in the role often didn’t have the same enthusiasm (putting it politely) for MDM generally, for the MDM software that was installed or for the incumbent MDM software supplier.”

You may read John Radcliffe’s blog here.

A pretty bad review of MDM vendors’ merits indeed. But, as I have experienced during several decades in the IT business, this is an observation that could probably be made well beyond the MDM realm.

However, it could be good to learn how MDM implementations could be stickier. What are MDM implementations missing? Is it:

  • The functionality in MDM solutions that needs improvement?
  • The often massive consultancy that comes with an MDM tool that doesn’t meet expectations?
  • Enterprises not actually being ready for MDM?

My take is: All of the above, in the order mentioned. Your take is?


Somehow Deduplication won’t Stick

18 years ago I cruised into the data quality realm when making my first deduplication tool. Back then it was an attempt to solve a business case of two companies that were considering merging and wanted to know the intersection of their customers. So far, so good.

Since then I have worked intensively with deduplication and other data matching tools and approaches and also co-authored a leading eLearning course on the matter as seen here.

Deduplication capability is a core feature of many data quality tools, and indeed probably the most mentioned data quality pain is lack of uniqueness, not least in party master data management.

However, most deduplication efforts don’t, in my experience, stick. Yes, we can process a file ready for direct marketing and purge the messages that might end up in the same offline or online inbox despite spelling differences. But taking it from there and using the techniques to achieve a single customer view is another story. Some obstacles are:

In the comments to the latter 3 year old post the intersection (and non-intersection) of Entity Resolution and Master Data Management (MDM) was discussed.

During my latest work I have become more and more convinced that achieving a single view of something is a lot about entity resolution as expressed in the post The Good, Better and Best Way of Avoiding Duplicates.
