Partnerships for the Cloud

24th February 2012

Earlier this month Loraine Lawson was kind enough to quote me in an article on IT Business Edge called New Partnerships Create Better Customer Data via the Cloud.

The article mentions some cloud services from StrikeIron and Melissadata. These services currently focus on improving North American (that is, US and Canadian) customer data.

I am involved in similar services that currently focus on improving Danish customer data, which then covers the rest of North America, namely Greenland.

Improving customer data from all over the world is surely a daunting task that needs partnerships.

The cloud is the same, but the reference data isn’t, and neither are the rules and traditions, as governments around the world have found 240 (or so) different solutions to balancing privacy concerns and administrative efficiency.

So, if you don’t partner, you risk getting solutions that are nationally international.


Wildcard Search versus Fuzzy Search

13th February 2012

My last post about search functionality in Master Data Management (MDM) solutions was called Search and if you are lucky you will find.

In the comments the use of wildcards versus fuzzy search was touched upon.

The problem with wildcards

I have a company called “Liliendahl Limited”, which is the spelling of the name as registered with Companies House for England and Wales.

But say someone is searching using one of the following strings:

  • “Liliendahl Ltd”,
  • “Liliendal Limited” or
  • “Liljendahl Limited”

In these situations, search functionality should return the hit “Liliendahl Limited”.

Using wildcard characters could, depending on the specific syntax, produce a hit for all of these spellings with a string like this: “lil?enda*l l*”.

The problem, however, is that most users don’t have the time, patience and skills to construct such search strings with wildcard characters. And the registered name may be spelled in some slightly different way that the chosen wildcard characters don’t cover.
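
As a minimal sketch of the point above, the wildcard string can be tried out with shell-style matching from Python's standard library. The syntax is an assumption ("?" for exactly one character, "*" for any run of characters), and the helper name wildcard_hit and the spelling "Lilliendahl Limited" are made up for illustration.

```python
from fnmatch import fnmatchcase


def wildcard_hit(pattern: str, name: str) -> bool:
    """Case-insensitive shell-style match: '?' is one character, '*' is any run."""
    return fnmatchcase(name.lower(), pattern.lower())


pattern = "lil?enda*l l*"

# The registered name and the three spellings from the post all fit the pattern.
for spelling in (
    "Liliendahl Limited",
    "Liliendahl Ltd",
    "Liliendal Limited",
    "Liljendahl Limited",
):
    print(spelling, wildcard_hit(pattern, spelling))

# A hypothetical spelling the pattern did not anticipate (double "l") is missed.
print("Lilliendahl Limited", wildcard_hit(pattern, "Lilliendahl Limited"))
```

The last line shows the weakness: the wildcard string only covers the variations its author thought of in advance.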

Matching algorithms

Tools for batch matching of name strings have been around for many years. When doing a batch match you can’t practically use wildcard characters. Instead matching algorithms typically rely on one, or in the best case a combination, of these techniques:

The same techniques can be used for interactive search thus reaching a hit in one fast search.
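
To give an idea of how an error-tolerant lookup behaves, here is a rough sketch using the generic similarity ratio from Python's difflib module. This is a stand-in chosen for illustration only; it is not the FACT algorithm and not necessarily one of the techniques referred to above. The register extract, the function names and the 0.8 cutoff are all assumptions.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] based on difflib's matching-block ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_search(query: str, names: list[str], cutoff: float = 0.8) -> list[tuple[str, float]]:
    """Return (name, score) pairs scoring at least `cutoff`, best first."""
    scored = [(name, similarity(query, name)) for name in names]
    hits = [pair for pair in scored if pair[1] >= cutoff]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical register extract; only the first name comes from the post.
REGISTER = [
    "Liliendahl Limited",
    "Lindahl Logistics Ltd",
    "Lilienthal Aviation Limited",
]

for query in ("Liliendahl Ltd", "Liliendal Limited", "Liljendahl Limited"):
    print(query, "->", fuzzy_search(query, REGISTER))
```

The user types the name as best they can, with no special syntax, and the scoring does the work that the wildcard characters would otherwise have to do.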

Fuzzy search

I have worked with the Omikron FACT algorithm for batch matching. This algorithm has since also been implemented as a fuzzy search algorithm.

One area of use for this is when webshop users are searching for a product or service within your online shop. This feature is, along with other eCommerce capabilities, branded as FACT-Finder.

The fuzzy search capabilities are also used in a tool I’m involved with called iDQ. Here external reference data sources, in combination with internal master data sources, are searched in an error tolerant way, thus making data available for the user despite heaps of spelling possibilities.


Reference Data at Work in the Cloud

5th January 2012

One of the product development programs I’m involved in is about exploiting rich external reference data and using these data to get data quality right the first time and to maintain optimal data quality over time.

The product is called instant Data Quality (abbreviated as iDQ ™). I have briefly described the concept in an earlier post called instant Data Quality.

iDQ ™ combines two concepts:

  • Software as a Service
  • Data as a Service

While most similar solutions are bundled with one specific data provider, the iDQ ™ concept embraces a range of data sources. The current scope is customer master data, where iDQ ™ may include Business-to-Business (B2B) directories, Business-to-Consumer (B2C) directories, real estate directories, Postal Address Files and even social media network data from external sources, together with internal master data, all presented at the same time in a compact mash-up.

The product has already gained substantial success in my home country, Denmark, leading to the formation of a company working solely with development and sales of iDQ ™.

The results iDQ ™ customers gain may seem simple, but they are the core advantages of better data quality that most enterprises are looking for, as stated by one of Denmark’s largest companies:

“For DONG Energy iDQ ™ is a simple and easy solution when searching for master data on individual customers. We have 1,000,000 individual customers. They typically relocate a few times during the time they are customers of us. We use iDQ ™ to find these customers so we can send the final accounts to the new address. iDQ ™ also provides better master data because here we have an opportunity to get names and addresses correctly spelled.

iDQ ™ saves time because we can search many databases at the time. Earlier we had to search several different databases before we found the right master data on the customer.”

Please find more testimonials (in Danish) here.

I hope to be able to link to testimonials in more languages in the future.


The Pond

1st October 2011

The term “The Pond” is often used as an informal term for the Atlantic Ocean, especially the North Atlantic Ocean, being the waters that separate North America and Europe.

Within information technology, and not least in my focus areas of data quality and master data management, there is a lot of exchange going on across the pond, as European companies are using North American technology and sometimes vice versa. European companies are also setting up operations in North America, and of course the other way around.

Some technologies work pretty much the same regardless of the country in which they are deployed. A database manager product is an example of that kind of technology. Other pieces of software must be heavily localized. An ERP application belongs to that category. Data quality and master data management tools and implementation practices are indeed also subject to diversity considerations.

When North American companies go to Europe, my gut feeling is that the overwhelming majority of them choose to start with a European or EMEA-wide headquarters in the British Isles – and that again mostly means in the London area.

There may be many reasons for that. However, I guess the fact that people in the British Isles don’t speak a strange language has a lot to do with it. What many North American companies with a headquarters in London often have to realize then is that this move only got them halfway over the pond.


Nonprofit Data Quality

25th September 2011

One of the sectors where I have worked a lot with data quality issues is nonprofit organizations, such as charities and other forms of membership-based organizations.

A general characteristic of such organizations is that they have databases with as many “customers” as huge global enterprises; however, the number of employee records is only a fraction of what those large companies have.

So the emphasis is often not on creating well-staffed data governance organizational structures but on implementing the best automation available in order to have optimal party master data management, where the parties involved are members and other roles played by individuals and companies with a common interest.

Many nonprofit organizations have several different fundraising activities going on at the same time. This means that real world individuals, households, organizations and their contacts are registered through different channels. The challenges of getting a “single view of customer” from the data streams created in these processes are discussed in the post Multi-Purpose Data Quality.

There are many nonprofit organizations working internationally. The often decentralized management structures in nonprofit organizations mean that ways of doing things will naturally differ between the countries where nonprofits operate. Differences in legislation and culture are also important. Some examples related to how to exploit master data are examined in the post Feasible Names and Addresses.

When it comes to creating business cases for data quality, nonprofits are of course basically no different from any other organization. The main goals are increased fundraising and lower administration costs. As said, the low number of employees often leads to using technology. The limited funds available often lead to using agile technology.


Lean MDM

17th August 2011

With a discipline such as master data management there will of course always be an agile or lean way of doing things.

What is lean MDM?

A document from 2008 called A LEAN APPROACH TO MASTER DATA MANAGEMENT by Duff Bailey examines the benefits of lean MDM.  

The document takes a view close to my own, saying that: “While there is little argument over what constitutes an individual person, many existing data models make the mistake of modeling “roles” (customer, employee, stock-holder, vendor contact, etc.) instead”.

As discussed in the article, similar points can be made about organization entities, location entities and product entities.

In conclusion Duff says that: “Because of their universality and their abstract nature, these core data models can be established quickly, without the need for lengthy review that normally accompanies an enterprise data model. Thereafter, the focus of the lean data management effort will be to grow the models and populate the repositories in support of specific business objectives”.
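
One way to picture the person-versus-role distinction is the small sketch below. It is my own illustration under assumed names (Person, Role, roles_of and the sample identifiers are hypothetical) and is not taken from Duff Bailey's document.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A part a person plays, e.g. customer or employee."""
    role_type: str
    reference: str  # hypothetical key, e.g. a customer or employee number


@dataclass
class Person:
    """One real-world individual, stored once regardless of roles."""
    person_id: int
    name: str
    roles: list[Role] = field(default_factory=list)


def roles_of(person: Person, role_type: str) -> list[Role]:
    """Return the roles of the given type that this person plays."""
    return [role for role in person.roles if role.role_type == role_type]


# The same individual is both a customer and an employee: one person, two roles.
jane = Person(person_id=1, name="Jane Doe")
jane.roles.append(Role("customer", "C-1001"))
jane.roles.append(Role("employee", "E-17"))

print(jane.name, [role.role_type for role in jane.roles])
```

Modeling the roles as separate entities instead would store Jane twice, once as a customer and once as an employee, with nothing tying the two records together.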

MDM in high gear

The fast time-to-value for lean MDM was also emphasized by MDM guru Aaron Zornes in a tweet yesterday:

 

The LeanMDM offering mentioned, from Omikron Data Quality (which is one of my employers), is described in the link (in German). A short summary of the text is that, among other things, you will get the following from lean MDM:

  • An increase in the corporate value of customer data
  • Short project times and fast results
  • Lower implementation costs through service-oriented architecture (SOA)

I have been involved in one of the implementations of the LeanMDM concept as described in this article (in English) about how the car rental giant Avis achieved lean MDM for the Scandinavian business.    


Who Is Not Using Data Quality Magic?

2nd August 2011

The other day the latest Gartner Magic Quadrant for Data Quality Tools was released.

If you are interested in knowing what it says, it’s normally possible to download a copy from the websites of the leading vendors.

Among the information in the paper you will find some estimated numbers of customers who have purchased the tools from the vendors included in the quadrant.

If you sum up these numbers, it is estimated that 16,540 organizations worldwide are customers of an included vendor.

So, if I matched that compiled customer list with the Dun & Bradstreet WorldBase holding at least 100 million active business entities worldwide, I would have a group of at least 99,983,460 companies that are not using magical data quality tools.

And that is without taking into account that some customers have more than one vendor and are thus counted more than once.
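
For the record, the back-of-the-envelope arithmetic can be written out as follows. The two input figures are the ones quoted above; the function name and any overlap figure passed to it are purely hypothetical, since the number of multi-vendor customers is not known.

```python
worldbase_entities = 100_000_000  # "at least 100 million" active business entities
quadrant_customers = 16_540       # sum of the estimated customer counts per vendor

non_users = worldbase_entities - quadrant_customers
print(f"{non_users:,}")


def non_users_with_overlap(multi_vendor_customers: int) -> int:
    """Non-users if that many organizations were counted twice in the sum."""
    distinct_customers = quadrant_customers - multi_vendor_customers
    return worldbase_entities - distinct_customers
```

Any overlap between the vendors' customer lists only makes the group of non-users larger.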

Anyway, what do all the others do then?

Well, of course the overwhelming number of companies will be too small to have any chance of investing in a data quality tool from a vendor that made it to the quadrant.

The quadrant also lists a range of other vendors of data quality tools, typically operating locally around the world. These vendors also have customers, probably more in number, though not of the size of the companies that choose a vendor in the quadrant.

A lot of data quality technology is also used by service providers who either use a tool from a data quality tool vendor or have made a homegrown solution. So a lot of companies benefit from such services when processing large numbers of data records to be standardized, deduplicated and enriched.

Then we must not forget that technology doesn’t solve all your data quality issues as stated by the founder of DataQualityPro Dylan Jones in a recent post on a data quality forum operated by the (according to Gartner) leading data quality tool vendor. The post is called Finding the Passion for Data Quality.

My take is that it’s totally true that data quality tools don’t solve most of your data quality issues, but the issues they do address, typically data profiling and data matching, are hard to solve without a tool. So there is still a huge market out there currently covered by the true leader in the data quality market: Laissez-Faire.


Proactive Data Governance at Work

28th July 2011

“Data governance is 80 % about people and processes and 20 % (if not less) about technology” is a common statement in the data management realm.

This blog post is about the 20 % (or less) technology part of data governance.

The term proactive data governance is often used to describe whether a given technology platform is able to support data governance in a good way.

So, what is proactive data governance technology?

Obviously it must be the opposite of reactive data governance technology which must be something about discovering completeness issues like in data profiling and fixing uniqueness issues like in data matching.

Proactive data governance technology must be implemented in data entry and other data capture functionality. The purpose of the technology is to assist people responsible for data capture in getting the data quality right from the start.

If we look at master data management (MDM) platforms we have two possible ways of getting data into the master data hub:

  • Data entry directly in the master data hub
  • Data integration by data feed from other systems such as CRM, SCM and ERP solutions and from external partners

In the first case the proactive data governance technology is a part of the MDM platform, often implemented as workflows with assistance, checks, controls and permission management. We see this most often related to product information management (PIM) and in business-to-business (B2B) customer master data management. Here the insertion of a master data entity like a product, a supplier or a B2B customer involves many different employees, each with responsibilities for a set of attributes.

The second case is most often seen in customer data integration (CDI) involving business-to-consumer (B2C) records, but certainly also applies to enriching product master data, supplier master data and B2B customer master data. Here the proactive data governance technology is implemented in the data import functionality or even in the systems of entry, best done as Service Oriented Architecture (SOA) components that are hooked into the master data hub as well.

It is a matter of taste whether we call such technology proactive data governance support or upstream prevention of data quality issues, and from what I have seen so far, it does work.
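
A very small sketch of what such a check at the point of capture could look like is shown below. It assumes a hub held as a list of dictionaries, made-up mandatory attributes and a generic string similarity from Python's difflib; none of these names or thresholds come from a specific MDM platform.

```python
from difflib import SequenceMatcher

REQUIRED = ("name", "street", "city")  # hypothetical mandatory attributes


def check_before_insert(record: dict, hub: list[dict], threshold: float = 0.85) -> list[str]:
    """Return issues to show the person capturing the data; an empty list means OK."""
    # Completeness: flag mandatory attributes that are missing or blank.
    issues = [f"missing {attr}" for attr in REQUIRED if not record.get(attr, "").strip()]
    # Uniqueness: flag existing hub entries with a very similar name.
    name = record.get("name", "").lower()
    for existing in hub:
        score = SequenceMatcher(None, name, existing["name"].lower()).ratio()
        if name and score >= threshold:
            issues.append(f"possible duplicate of {existing['name']!r}")
    return issues


# Hypothetical hub content and incoming records, for illustration only.
hub = [{"name": "Liliendahl Limited", "street": "1 Example Road", "city": "London"}]
incoming = {"name": "Liliendal Limited", "street": "", "city": "London"}

print(check_before_insert(incoming, hub))
```

The same function could sit behind a data entry form or inside an import job, which is the idea behind offering it as a shared service component rather than as part of one application.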


History of Data Quality

22nd June 2011

When did the first data quality issue occur? Wikipedia says in the data quality article section titled history that it began with the mainframe computer in the United States of America.

Fellow data quality blogger Steve Sarsfield made a blog post a few years ago called A Brief History of Data Quality where it is said “Believe it or not, the concept of data quality has been touted as important since the beginning of the relational database”.

However, a predominant sentiment in the data quality realm is that data quality is not about technology. It is about people. People are the culprits behind data quality flaws, and as the main part of the problem people should also be the overwhelming part, if not the only part, of the solution.

So I guess data quality challenges were introduced when people showed up in the real world. How and when that happened is a matter of debate, as discussed in the blog post Out of Africa.

As explained in the post Movable Types, the invention of movable type in printing some hundreds of years ago (the most important invention since someone invented the wheel for the first time) gave a big boost to knowledge sharing among people – and also a big boost to data and information quality issues.

But I think the saying “To err is human, but to really foul things up you need a computer” is valid. Consequently I also think you may need a computer to help with cleaning up the mess and to prevent the mess from happening again. End of (hi)story.    


Finding Finland

15th April 2011

This is the fourth post in a series of short blog posts focusing on data quality related to different countries around the world. I am not aiming at presenting a single version of the full truth but rather presenting a few random observations that I hope someone living in or with knowledge about the country is able to clarify in a comment.

Let’s start with Finnish

Finland is situated in the northeastern corner of Europe. The Finnish language, together with Estonian and, much further south in Europe, Hungarian, is totally different from the languages of the neighboring countries, which are Germanic or Slavic. Swedish is also an official language in Finland, and in some parts of Finland cities and streets have both (usually totally different) Finnish and Swedish names.

Galoshes

By far the largest company in Finland is the cell phone maker Nokia. Before the cell phone was invented Nokia made paper and galoshes – the old way of connecting people. From 2006 to 2008 Nokia also owned the data quality firm Identity Systems, which was then sold to Informatica. I guess Identity Systems, combined with the Gaelic Tiger firm Similarity Systems, makes up the data matching capabilities at Informatica.

Syslore

One of the remaining (relatively) larger independent data matching firms in the world is Syslore. Syslore is hiding in Finland.

Previous Data Quality World Tour blog posts:
