Extreme (Weather) Information Quality

This morning I had my scheduled train journey from London, UK to Manchester, UK cancelled.

It’s not that I wasn’t warned. The British press has been hysterical in recent days because temperatures were going to drop below freezing and some snowfall was expected. For example, the BBC had a subject matter expert in the studio showing how to pack the trunk of your car with stuff suitable for a trip across the North Pole.

Anyway, encouraged by the online status showing the train was set to go, I made my way to Euston Station, where I was delighted to see the train announced for on-time departure on the screen there. Only to be very disappointed by the message, 10 minutes after the scheduled departure, saying that the service was cancelled “due to the severe weather conditions”.

Well, well, well. The temperature is above freezing this lovely Sunday morning. There is practically no wind and only some watery remains of last night’s snowfall on the ground. With that interpretation of the raw data, I guess you couldn’t get around in Scandinavia for a considerable part of the year.

But that is how it is when turning raw data into information: the same raw data can yield very different results indeed.

I guess it is good business for Virgin Trains not to be prepared for a little bit of snow when operating in England, thus making the first sign of the white fluffy stuff from above count as “severe weather conditions”.

My next business analysis with Virgin Trains will be targeting the refund procedure. I hope the customer experience will be just fine.


Multi-Occupancy

The fact that many people don’t live in a single family house but in a flat, sharing the same building number on a street with people living in other flats in the same building, is a common challenge in data quality and data matching.

The same challenge applies to companies sharing the same building number with other companies, not to mention when companies and households are in the same building. So this is a common party master data issue.
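
As a minimal sketch of why the unit level matters in data matching, assuming a hypothetical normalize_address helper and deliberately simplified address strings, two records sharing a building number but differing on the flat must not be merged into one party:

```python
import re

# Hypothetical, deliberately simplified address normalizer: splits a
# free-text address into a building-level part and a unit-level part.
# Real address parsing needs country-specific reference data.
UNIT_PATTERN = re.compile(r"\b(flat|apt|unit)\s*(\w+)\b", re.IGNORECASE)

def normalize_address(raw: str) -> dict:
    match = UNIT_PATTERN.search(raw)
    unit = match.group(2).upper() if match else None
    building = UNIT_PATTERN.sub("", raw).strip(" ,").upper()
    return {"building": building, "unit": unit}

def same_occupancy(addr_a: str, addr_b: str) -> bool:
    a, b = normalize_address(addr_a), normalize_address(addr_b)
    # Same building number is not enough: at a multi-occupancy
    # address the unit must match too before two records can merge.
    return a["building"] == b["building"] and a["unit"] == b["unit"]

print(same_occupancy("Flat 2, 10 High Street", "Flat 7, 10 High Street"))  # False
print(same_occupancy("Flat 2, 10 High Street", "flat 2 10 High Street"))   # True
```

The point of the sketch is only that a building-level match is a necessary but not sufficient condition for a single customer view at a multi-occupancy address.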

Address verification and geocoding are seen as important methods for achieving data quality improvement related to the top data quality pain everywhere: the quality of party master data, aiming at getting a single customer view.

Multi-occupancy is a pain in the (you know) when getting there.

My pain

I have had some personal experiences living at multi-occupancy addresses lately.

One and a half years ago I was living a painless life in a single family house in a Copenhagen suburb.

Then I moved to a flat closer to downtown Copenhagen, as mentioned in the post Down the Street.

The tradition in Denmark is to send letters, make deliveries and register master data using a common format for units within a building, and to have separate mailboxes with a flat ID and names for each flat. I have received most of my post since then and gotten all deliveries I’m aware of.

Then I moved to a flat in London. Here the flats in my building have numbers. But the postman delivers the letters in one batch at the street door, and there are no names on the doorbells in front of the door.

So now I sense I don’t get many letters, and today I had to order the same stuff thrice from amazon.co.uk, because I haven’t received the first two packages, despite their state of the art, online accessible package tracking system telling me that delivery was successful.

Master data pains unresolved

Address reference data at the building number level and related geocodes are becoming commonly available in many places around the world these days.

But having reference data and real world aligned location and related party master data at the unit level is still a challenge in most places. Therefore we are still struggling with using address verification and geocoding for a single customer view where a given building number has more than a single occupancy.


Nationally International

I am right now in the process of moving most of my business from the Kingdom of Denmark to the United Kingdom.

During that process I have become a regular customer of the Gatwick Express, the (sometimes) fast train going from London’s second largest airport to central London.

When buying tickets online they require you to enter a billing address. Here you can choose between entering a UK address or an international address.

If you enter a UK address the site takes advantage of the UK postal code system: you just have to enter a postcode, which is very granular in the UK, and a house number, and then the system will know your address.
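
A minimal sketch of how such a postcode-driven lookup might work, assuming a hypothetical in-memory reference table with made-up addresses; a real site would query the Royal Mail Postcode Address File (PAF) or a similar licensed product:

```python
# Hypothetical reference data: each UK postcode maps house numbers to
# full delivery addresses. The entries here are illustrative only.
POSTCODE_REFERENCE = {
    "SW1A 2AA": {"10": "10 Example Street, London, SW1A 2AA"},
    "M1 1AE": {"12": "12 Sample Road, Manchester, M1 1AE"},
}

def lookup_uk_address(postcode: str, house_number: str):
    # Normalize the postcode: uppercase with a single internal space.
    cleaned = " ".join(postcode.upper().split())
    houses = POSTCODE_REFERENCE.get(cleaned, {})
    return houses.get(house_number)  # full address, or None if unknown

print(lookup_uk_address("sw1a 2aa", "10"))
```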

Alternatively you can choose to enter an international address. In that case you will get a form with more fields to fill in. But, in order not to be too international, the form still has the UK way of formatting an address.

Also, the default country is United Kingdom, which I guess is the only value that should not be applicable for this form.


The Pond

The term “The Pond” is often used as an informal term for the Atlantic Ocean, especially the North Atlantic Ocean, being the waters that separate North America and Europe.

Within information technology, and not least my focus areas of data quality and master data management, there is a lot of exchange going on over the pond, as European companies are using North American technology and sometimes vice versa. Also, European companies are setting up operations in North America, and of course the other way around too.

Some technologies work pretty much the same regardless of the country in which they are deployed. A database management product is an example of that kind of technology. Other pieces of software must be heavily localized. An ERP application belongs to that category. Data quality and master data management tools and implementation practices are indeed also subject to diversity considerations.

When North American companies go to Europe my gut feeling is that an overwhelming part of them choose to start with a European or EMEA wide headquarters on the British Isles – and that again means mostly in the London area.

The reasons for that may be many. However, I guess the fact that people on the British Isles don’t speak a strange language has a lot to do with it. What many North American companies with a headquarters in London then have to realize is that this move only got them halfway over the pond.


The Location Domain

When talking master data management we usually divide the discipline into domains, where the two most prominent domains are:

  • Customer, or rather party, master data management
  • Product, sometimes also named “things”, master data management

One of the most frequently mentioned additional domains is location.

But despite locations being all around us, we seldom see a business initiative aimed at enterprise wide location data management under a slogan of having a 360 degree view of locations. Most often locations are seen as a subset of either the party master data or, in some cases, the product master data.

Industry diversity

The need for having locations as a focus area varies between industries.

In some industries like public transit, where I have been working a lot, locations are implicit in the delivered services. Travel and hospitality is another example of a tight connection between the product and a location. Also some insurance products have a location element. And do I have to mention real estate: Location, Location, Location.

In other industries the location has a more moderate relation to the product domain. There may be some considerations around plant and warehouse locations, but that’s usually not high volume and complex stuff.  

Locations as a main factor in exploiting demographic stereotypes are important in retailing and other business-to-consumer (B2C) activities. When doing B2C you often want to see your customer as the household where the location is a main, but treacherous, factor in doing so. We had a discussion on the house-holding dilemma in the LinkedIn Data Matching group recently.

Whenever you, or a partner of yours, are delivering physical goods or a physical letter of any kind to a customer, it’s crucial to have high quality location master data. The impact of not having that is of course dependent on the volume of deliveries.   

Globalization

If you ask me about London, I will instinctively think about the London in England. But there is a pretty big London in Canada too, that would be top of mind to other people. And there are other smaller Londons around the world.

Master data with location attributes does increasingly come in populations covering more than one country. It’s not that ambiguous place names don’t exist in single country sets; ambiguous place names were the main driver behind many countries introducing a postal code system. However, the British, and the Canadians, invented a system including letters, as opposed to most other systems, which only have numbers, typically with an embedded geographic hierarchy.
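
As a small illustration of that diversity, here is a minimal sketch with simplified postcode patterns per country; the real rules have more edge cases than these illustrative regular expressions:

```python
import re

# Simplified postcode patterns per country; real-world validation
# rules have more edge cases than these illustrative expressions.
POSTCODE_PATTERNS = {
    "GB": re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$"),  # e.g. SW1A 1AA
    "CA": re.compile(r"^[A-Z]\d[A-Z] ?\d[A-Z]\d$"),           # e.g. K1A 0B1
    "DK": re.compile(r"^\d{4}$"),                             # e.g. 2100
    "US": re.compile(r"^\d{5}(-\d{4})?$"),                    # e.g. 90210
}

def is_valid_postcode(country: str, postcode: str) -> bool:
    pattern = POSTCODE_PATTERNS.get(country)
    return bool(pattern and pattern.match(postcode.upper().strip()))

print(is_valid_postcode("GB", "SW1A 1AA"))  # True
print(is_valid_postcode("DK", "SW1A 1AA"))  # False
```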

Apart from the different standards used around the world, the possibilities for exploiting external reference data vary greatly concerning data quality dimensions such as timeliness, consistency, completeness, conformity – and price.

Handling location data from many countries at the same time ruins many best practices for handling location data that have worked well for a single country.

Geocoding

Instead of identifying locations in a textual way by having country codes, state/province abbreviations, postal codes and/or city names, street names and types or blocks, and house numbers and names, it has become increasingly popular to use geocoding as a supplement or even an alternative.

There are different types of geocodes out there suitable for different purposes. Examples are:

  • Latitude and longitude, picturing a round world
  • UTM X,Y coordinates, picturing peels of the world
  • WGS84 X,Y coordinates, picturing a world as flat as your computer screen

While geocoding has a lot to offer in identification and global standardization, we of course have a gap between geocodes and everyday language. If you want to learn more then come and visit me at N55°38'47", E12°32'58".
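
As a minimal sketch of putting the latitude and longitude type of geocode to work, here is the standard haversine formula, assuming a spherical Earth with mean radius 6371 km (a simplification of real geodesy):

```python
import math

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two lat/lon points,
    assuming a spherical Earth with mean radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371 * math.asin(math.sqrt(a))

# Approximate distance from central Copenhagen to central London.
print(round(haversine_km(55.68, 12.57, 51.51, -0.13)))  # roughly 956 km
```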


Managing Client On-Boarding Data

This year I will be joining FIMA: Europe’s Premier Financial Reference Data Management Conference for Data Management Professionals. The conference is held in London from 8th to 10th November.

I will present “Diversities In Using External Registries In A Globalised World” and take part in the panel discussion “Overcoming Key Challenges In Managing Client On-Boarding Data: Opportunities & Efficiency Ideas”.

As said in the panel discussion introduction: The industry clearly needs to normalise (or is it normalize?) regional differences and establish global standards.

The concept of using external reference data in order to improve data quality within master data management has been a favorite topic of mine for a long time.

I’m not saying that external reference data is a single source of truth. Clearly external reference data may have data quality issues as exemplified in my previous blog post called Troubled Bridge Over Water.

However, I think there is a clear trend toward encompassing external sources, increasingly found in the cloud, to take a shortcut in keeping up with data quality. I call this Data Quality 3.0.

The Achilles heel, though, has always been how to smoothly integrate external data into data entry functionality and other data capture processes and, not to forget, how to ensure ongoing maintenance in order to avoid an otherwise inevitable erosion of data quality.

Lately I have worked with a concept called instant Data Quality. The idea is to make simple yet powerful functionality that helps with hooking up to many external sources at the same time when on-boarding clients, and that makes continuous maintenance possible.
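
A minimal sketch of the idea, assuming hypothetical lookup functions for the external sources and a naive merge; the real concept of course involves much more around matching, survivorship and privacy:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical lookup functions, one per external source; real
# implementations would call e.g. a business registry or an address
# verification service in the cloud.
def lookup_business_registry(name: str) -> dict:
    return {"legal_name": name.upper(), "source": "business_registry"}

def lookup_address_service(name: str) -> dict:
    return {"verified_address": "10 EXAMPLE STREET", "source": "address_service"}

SOURCES = [lookup_business_registry, lookup_address_service]

def onboard_client(name: str) -> dict:
    # Query all external sources at the same time, then merge the
    # attributes into one draft master data record for review.
    record: dict = {"entered_name": name}
    with ThreadPoolExecutor() as pool:
        for result in pool.map(lambda fn: fn(name), SOURCES):
            record.update({k: v for k, v in result.items() if k != "source"})
    return record

print(onboard_client("Acme Ltd"))
```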

One aspect of such a concept is how to exploit the different opportunities available in each country, as public administrative practices and privacy norms vary a lot around the world.

I’m looking forward to presenting, discussing these challenges and getting a lot of feedback.


A Data Quality Appliance?

Today it was announced that IBM is to acquire Netezza, a data warehouse appliance vendor.

5 years ago, I guess, the interest in data warehouse appliances was very sparse. I guess this because I attended a session held by Netezza at the 2005 London Information Management conference. We were 3 people in the room: the presenter, a truly interested delegate and me. I was basically in the room because I was the next speaker there and wanted to see how things worked out. For the record: it was a good session, and I learned a lot about appliances.

Probably that is why I noticed a piece from 2007 where Philip Howard of Bloor wrote about The scope for appliances. In this article Philip Howard also suggested other types of appliances, for example a data quality (data matching) appliance.

I have been around some implementations where we could use the power of an appliance when we have to match a lot of rows. The Achilles’ heel in data matching is candidate selection, and often you have to restrict your methods in order to maintain reasonable performance.
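
As a minimal sketch of candidate selection, assuming a simple blocking key made from the first letters of the name plus the postcode; the classic trade-off is that a coarser key finds more candidate pairs but costs more comparisons:

```python
from collections import defaultdict
from itertools import combinations

# Toy records; real data matching runs over millions of rows, which
# is where appliance-style hardware acceleration would help.
records = [
    {"id": 1, "name": "Smith & Co", "postcode": "SW1A 1AA"},
    {"id": 2, "name": "Smith and Company", "postcode": "SW1A 1AA"},
    {"id": 3, "name": "Jones Ltd", "postcode": "M1 1AE"},
]

def blocking_key(rec: dict) -> str:
    # Candidate selection: only records sharing this key are compared,
    # so pairs with a typo in the key are never even considered.
    return rec["name"][:3].upper() + "|" + rec["postcode"]

blocks = defaultdict(list)
for rec in records:
    blocks[blocking_key(rec)].append(rec)

candidate_pairs = [
    (a["id"], b["id"])
    for block in blocks.values()
    for a, b in combinations(block, 2)
]
print(candidate_pairs)  # [(1, 2)]: only records in the same block are compared
```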

But I wonder whether I will ever see an on-premise data quality (data matching) appliance, or whether it will be placed in the cloud. Or maybe there already is one out there? If so, please tell me about it.


My Ash Cloud Prediction

The Master Data Management Summit Europe 2010 starts tomorrow. I have attended the IRM events in London several times (and also spoken there once). This year I didn’t plan to go to London in April because I predicted the no-fly havoc in Northern Europe that would follow the Icelandic volcanic eruption, given the wind direction. Not?