Indulgent Moderator or Ruthless Terminator?

I am the founder/moderator of two small niche LinkedIn groups in the data quality and Master Data Management (MDM) realm:

As a moderator I feel responsible for keeping the discussions in the group on target.

I guess my challenges in doing so resemble what nearly every other moderator of LinkedIn groups is faced with.

The postings that keep creating trouble are related to:

  • Jobs
  • Promotions

LinkedIn does have a facility to place entries into these two alternative tabs. But people seldom do that voluntarily.

Jobs

In fact I’m pleased when a job is posted in one of the groups. But I also know that many people don’t like job postings coming up among the “normal” discussions in the groups.

I’m not so naive that I think recruiters forget to post as a job or don’t know how to do it. Many recruiters don’t respect the rules even if reminded. And some recruiters keep on entering the same job over and over again.

Therefore I have to mark recruiters, who twice “forget”, as subject to indulgent moderation. As said, I like job postings, so until now I haven’t practiced ruthless termination apart from deleting double entries – but that is also a destination of data matching anyway.
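Spotting a recruiter who enters the same job over and over again is a small data matching exercise in itself. Below is a minimal sketch, assuming a posting can be reduced to a poster and a title; the `Posting` class, the sample postings and the crude match key are all made up for illustration and are not how LinkedIn works.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    poster: str
    title: str


def match_key(posting: Posting) -> tuple[str, str]:
    """Build a crude match key: lower-cased poster plus the title reduced to words."""
    words = re.findall(r"[a-z0-9]+", posting.title.lower())
    return posting.poster.strip().lower(), " ".join(words)


def split_duplicates(postings: list[Posting]) -> tuple[list[Posting], list[Posting]]:
    """Return (kept, duplicates), keeping the first posting seen per match key."""
    seen: set[tuple[str, str]] = set()
    kept: list[Posting] = []
    duplicates: list[Posting] = []
    for posting in postings:
        key = match_key(posting)
        if key in seen:
            duplicates.append(posting)
        else:
            seen.add(key)
            kept.append(posting)
    return kept, duplicates


# Hypothetical sample data
postings = [
    Posting("Recruiter A", "MDM Architect - London"),
    Posting("recruiter a", "MDM  architect, London!"),
    Posting("Recruiter B", "Data Quality Analyst"),
]
kept, duplicates = split_duplicates(postings)
print(len(kept), len(duplicates))  # 2 1
```

Exact matching on a normalized key only catches near-verbatim repeats; a reworded posting of the same job would need fuzzy matching.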

Promotions

Given the relatively small number of members in the groups in question, and recognising that most participants are tool vendors and service providers, I find entries with promotional content refreshing and informative, though I am most pleased when it’s done with limited marketing triviality.

My indulgence may be explained by the fact that I’m connected with tool makers and service providers myself. So these promotions are great ready-made competitor monitoring.

However, my indulgence has its limits when it comes to off topic promotion.

A special case here is outsourcing promotions. I find it peculiar that the people practicing this trade don’t target the message to the group where it is posted. It shouldn’t be too hard to find an angle on data matching or Multi-Domain MDM for your services. But I find that most outsourcing people copy-paste their usual stuff.

So, in this area I mostly am the ruthless terminator. And there is seldom any hasta la vista, baby.


Multi-Occupancy

The fact that many people don’t live in a single family house, but in a flat sharing a building number on a street with people living in other flats in the same building, is a common challenge in data quality and data matching.

The same challenge also applies to companies sharing the same building number with other companies, not to mention when companies and households are in the same building. So this is a common party master data issue.

Address verification and geocoding are seen as important methods for improving data quality where the pain is greatest almost everywhere: the quality of party master data, with the aim of getting a single customer view.

Multi-occupancy is a pain in the (you know) getting there.

My pain

I have had some personal experiences living at multi-occupancy addresses lately.

One and a half years ago I was living a painless life in a single family house in a Copenhagen suburb.

Then I moved closer to downtown Copenhagen to a flat, as mentioned in the post Down the Street.

The tradition in Denmark is to send letters and make deliveries and register master data with a common format of units within a building and having separate mailboxes with flat ID and names for each flat. I have received most of my post since then and got all deliveries I’m aware of.

Then I moved to a flat in London. Here the flats in my building have numbers. But the postman delivers the letters in one batch through the street door, and there are no names on the doorbells in front of the door.

So now I sense I don’t get many letters, and today I had to order the same stuff thrice from amazon.co.uk, because I haven’t received the first two packages despite their state of the art online package tracking systems telling me that delivery was successful.

Master data pains unresolved

Address reference data at building number level and related geocodes are becoming commonly available in many places these days.

But having reference data and real world aligned location and related party master data at the unit level is still a challenge most places. Therefore we are still struggling with using address verification and geocoding for single customer view where a given building number has more than a single occupancy.
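To illustrate why a building number alone is not enough for a single customer view, here is a minimal sketch that flags building-level addresses shared by more than one party. The records, the field layout and the `building_key` helper are hypothetical; real address verification would work against proper reference data.

```python
from collections import defaultdict

# Hypothetical party records: (name, street, building number, unit)
records = [
    ("Alice Smith", "High Street", "12", "Flat 1"),
    ("Bob Jones", "High Street", "12", "Flat 2"),
    ("Acme Ltd", "High Street", "12", ""),
    ("Carol White", "Mill Lane", "3", ""),
]


def building_key(street, number):
    """Normalize street and building number into a building-level match key."""
    return (street.strip().lower(), number.strip().lower())


def occupancies_by_building(rows):
    """Group (name, unit) pairs under their building-level key."""
    groups = defaultdict(list)
    for name, street, number, unit in rows:
        groups[building_key(street, number)].append((name, unit))
    return groups


def multi_occupancy(rows):
    """Return only the buildings that hold more than one party."""
    return {
        key: occupants
        for key, occupants in occupancies_by_building(rows).items()
        if len(occupants) > 1
    }


flagged = multi_occupancy(records)
for key, occupants in flagged.items():
    print(key, occupants)
```

Any matching rule that treats the building key as the household or company location would merge the two flats and the company in this sample, which is exactly the over-matching risk described above.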


Reference Data at Work in the Cloud

One of the product development programs I’m involved in is about exploiting rich external reference data and using these data in order to get data quality right the first time and being able to maintain optimal data quality over time.

The product is called instant Data Quality (abbreviated as iDQ ™). I have briefly described the concept in an earlier post called instant Data Quality.

iDQ ™ combines two concepts:

  • Software as a Service
  • Data as a Service

While most similar solutions are bundled with one specific data provider, the iDQ ™ concept embraces a range of data sources. The current scope is around customer master data, where iDQ ™ may include Business-to-Business (B2B) directories, Business-to-Consumer (B2C) directories, real estate directories, Postal Address Files and even social media network data from external sources, as well as internal master data, all presented at the same time in a compact mash-up.

The product has already gained substantial success in my home country Denmark, leading to the formation of a company solely working with development and sales of iDQ ™.

The results iDQ ™ customers gain may seem simple but are the core advantages of better data quality most enterprises are looking for, as stated by one of Denmark’s largest companies:

“For DONG Energy iDQ ™ is a simple and easy solution when searching for master data on individual customers. We have 1,000,000 individual customers. They typically relocate a few times during the time they are customers of us. We use iDQ ™ to find these customers so we can send the final accounts to the new address. iDQ ™ also provides better master data because here we have an opportunity to get names and addresses correctly spelled.

iDQ ™ saves time because we can search many databases at the time. Earlier we had to search several different databases before we found the right master data on the customer.”

Please find more testimonials here.

I hope to be able to link to testimonials in more languages in the future.


What to do in 2012

The time between Christmas and New Year is a good time to think about whether you are going to do the right things next year. In doing so, you will have to look back at the current year and see how you can develop from there.

In my professional life as a data quality and master data management practitioner my 2011 to do list included these three main activities:

  • Working with Multi-Domain Master Data Quality
  • Exploiting rich external reference data sources in the cloud
  • Doing downstream data cleansing

In a press release from May 2011 Gartner (the analyst firm) Highlights Three Trends That Will Shape the Master Data Management Market. These are:

  • Growing Demand for Multidomain MDM Software
  • Rising Adoption of MDM in the Cloud
  • Increasing Links Between MDM and Social Networks

It looks like I was working in the right space for the first two things but stayed in the past regarding the third activity being downstream data cleansing.

The third thing to embrace in the future, social MDM we may call it, has been an area of interest for me the last couple of years, and actually some downstream data cleansing projects have touched on making master data useful for including social media networks in the loop.

I’m not sure if 2012 will be a breakthrough for social MDM, but I think there will be some exciting opportunities out there for paving the road for social MDM.


The Pond

The term “The Pond” is often used as an informal term for the Atlantic Ocean, especially the North Atlantic Ocean, being the waters that separate North America and Europe.

Within information technology, and not least my focus areas being data quality and master data management, there is a lot of exchange going on over the pond as European companies are using North American technology and sometimes vice versa. Also European companies are setting up operations in North America and of course also the other way around.

Some technologies work pretty much the same regardless of the country in which they are deployed. A database manager product is an example of that kind of technology. Other pieces of software must be heavily localized. An ERP application belongs to that category. Data quality and master data management tools and implementation practice are indeed also subject to diversity considerations.

When North American companies go to Europe my gut feeling is that an overwhelming part of them choose to start with a European or EMEA wide headquarters on the British Isles – and that again means mostly in the London area.

The reasons for that may be many. However I guess that the fact that people on the British Isles don’t speak a strange language counts for a lot. What many North American companies with a headquarters in London often have to realize then is that this move only got them half way over the pond.


Nonprofit Data Quality

One of the industries where I have worked a lot with data quality issues is nonprofit organizations, such as charities and other forms of membership based organizations.

A general characteristic of such organizations is that they have databases with as many “customers” as huge global enterprises; however the number of employee records is only a fraction compared to those large companies.

So the emphasis is often not at creating well manned data governance organizational structures but implementing the best automation available in order to have optimal party master data management, where the parties involved are members and other roles played by individuals and companies with a common interest.

Many nonprofit organizations have several different fundraising activities going on at the same time. This means that real world individuals, households, organizations and their contacts are registered through different channels. The challenges of getting a “single view of customer” from the data streams created in these processes are discussed in the post Multi-Purpose Data Quality.

There are many nonprofit organizations working internationally. The often decentralized management structures in nonprofit organizations mean that the way of doing things will naturally be different between countries where nonprofits are operating. Also the differences in legislation and culture are important. Some examples related to how to exploit master data are examined in the post Feasible Names and Addresses.

When it comes to creating business cases for data quality nonprofits are basically of course not different from any other organization. The main goals are increased fundraising and lowering administration costs. As said, the low number of employees often leads to using technology. The low amount of money available often leads to using agile technology.


Party, Product, Place. Period.

In a recent post here on this blog the Master Data Management domain usually called locations was examined and followed by excellent comments.

Also in the DAMA International LinkedIn group there was a great discussion around the location domain.

The comments touched two subjects:

  • Are locations just geographic locations or can we deal with “digital locations” such as eMail addresses, phone numbers, websites, go-to-meeting ID’s, social network ID’s and so on as locations as well?
  • How do we model the relations between parties, products and locations?

Sometimes I like to use the word places instead of locations as we then have a P-trinity of parties, products and places.

I’m not sure if places have a stronger semantic link to geography than locations have. Anyway, my thoughts on the location domain were merely connected to geography. The digital locations mentioned are, in my eyes, more related to parties and not so much to products. The same is true for another good old substitute for a location or address, namely a mailbox (like “Postbox 1234”), which is a valid notion for the destination of a letter or small package, and often seen in database columns otherwise filled with geographic locations.

So, sticking to places being physical, geographic locations: How do we model parties, products and places?

First of all it’s important that we are able to model different concepts within each domain in one single way. A very common situation in many enterprise data landscapes is that different forms of parties exist with different models, like a model for customers, a model for suppliers, a model for employees and other models for other business partner roles.

The association between a party entity and a location entity is in most cases a time dependent relation like this consumer was billed on this address in this period. The relation between the party and the product is the good old basic data model, that we invoiced this and this product on that date. The product and place relation is very industry specific. One example will be that an on-site service contract applies to this address in this period.

Time, often handled as a period, will indeed add a fourth P to the P-trinity of party, product and place.
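The relations described above can be sketched as a small data model. This is only an illustration under my own assumptions: all class names, roles and sample values are hypothetical, and an open-ended period is represented by a missing end date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Party:
    party_id: int
    name: str


@dataclass(frozen=True)
class Place:
    place_id: int
    address: str


@dataclass(frozen=True)
class Product:
    product_id: int
    name: str


@dataclass(frozen=True)
class Period:
    start: date
    end: Optional[date] = None  # None means the period is still open

    def covers(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


@dataclass(frozen=True)
class PartyPlace:
    """Time dependent relation: this party used this place in this role."""
    party: Party
    place: Place
    role: str
    period: Period


@dataclass(frozen=True)
class Invoice:
    """Party and product relation: we invoiced this product on that date."""
    party: Party
    product: Product
    invoiced_on: date


@dataclass(frozen=True)
class ProductPlace:
    """Product and place relation, e.g. an on-site service contract."""
    product: Product
    place: Place
    period: Period


def place_for(relations, party, role, day):
    """Find the place a party used in a given role on a given day, if any."""
    for relation in relations:
        if relation.party == party and relation.role == role and relation.period.covers(day):
            return relation.place
    return None


# Hypothetical sample data
consumer = Party(1, "Jane Doe")
old_home = Place(10, "Old Road 1")
new_home = Place(11, "New Street 2")
relations = [
    PartyPlace(consumer, old_home, "billing", Period(date(2010, 1, 1), date(2011, 6, 30))),
    PartyPlace(consumer, new_home, "billing", Period(date(2011, 7, 1))),
]
print(place_for(relations, consumer, "billing", date(2011, 3, 1)).address)  # Old Road 1
```

Note that a single `Party` class serves customers, suppliers and employees alike; the role lives on the relation, not on the party.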


Citizen Master Data Management

Citizen Master Data Management in the public sector is the equivalent of Customer Master Data Management in the private sector.

Where are we?

As private organizations find different solutions to how to manage customer master data, governments around the world also have found their particular solution for managing citizen master data.

Most descriptions of data management originate in the United States, and so do many examples and issues related to citizen master data management. One example is this blog post from IBM Initiate called The End of the Social Security Number?

As mentioned in the post there are different administrative practices around the world where governments may learn from experiences with alternative solutions in other countries.

During last year’s discussion in Canada about the census form I had the chance to write a guest blog post on a Canadian blog about How Denmark does it.

The way of the world does change. One example is the program in India called Aadhaar aiming at providing a unique national ID for the over one billion people living in India.

When to register?

The question about when a citizen has to be included in a citizen master data registry of course depends on the purpose of the registry. If the single purpose for example is driving license administration it will depend on when a citizen may obtain a driving license and that will exclude citizens under a certain age depending on the rules in place. The same applies to an electoral roll.

In my country we have an all-purpose citizen master data hub, which today means that a newborn is registered and provided with a unique Citizen ID within seconds.

Similar considerations apply to immigration and cross border employment.

What to store?

Citizen master data registries typically hold attributes such as an identifier, name, address and status information.

As new technologies mature, governments of course consider whether such technologies may be feasible and may add benefits as part of the master data stored about citizens.

Using biometrics is a controversial topic here. The pros and cons were discussed, based on the cancelled program in the United Kingdom, in the post Citizen ID and Biometrics.

Who will share?

Privacy considerations are paramount in most discussions around citizen master data hubs.

Even if you have an all-purpose citizen registry there will be laws limiting how the public sector may exploit data identified with the registry and the identifier in use.

On the other hand, in some countries even private sector organizations may benefit from such a master data hub.

An example from Sweden is shown here in the post No Privacy Customer Onboarding.


Some Voter Musings

Tomorrow there is a general election in my home country Denmark.

Voter registration

There are different systems of voter registration around the world.

In some countries there are electoral rolls being data silos of citizen master data, more or less integrated with other citizen master data silos for other purposes such as driving license administration, social security and taxation.

In Denmark we have an all-purpose single master data hub for citizens. When we have to vote, the ballots are extracted from the hub based on your age (from 18 on election day) and citizen status (excluding citizens of other countries living or working here).
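The extraction rule can be sketched as a simple filter on the hub. This is a minimal sketch and not the actual Danish system: the `Citizen` class, the sample citizens and their IDs are hypothetical, and the election day is just an example date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Citizen:
    citizen_id: str   # hypothetical identifier, not a real Citizen ID format
    name: str
    born: date
    citizenship: str  # ISO 3166-1 alpha-2 country code


def age_on(born: date, day: date) -> int:
    """Age in whole years on the given day."""
    years = day.year - born.year
    if (day.month, day.day) < (born.month, born.day):
        years -= 1
    return years


def electoral_roll(hub, election_day, voting_age=18, country="DK"):
    """Extract the citizens who are old enough and hold the right citizenship."""
    return [
        citizen
        for citizen in hub
        if citizen.citizenship == country
        and age_on(citizen.born, election_day) >= voting_age
    ]


# Hypothetical sample data
election_day = date(2011, 9, 15)
hub = [
    Citizen("0001", "Turns 18 on election day", date(1993, 9, 15), "DK"),
    Citizen("0002", "One day too young", date(1993, 9, 16), "DK"),
    Citizen("0003", "Swedish citizen living here", date(1970, 1, 1), "SE"),
]
roll = electoral_roll(hub, election_day)
print([citizen.citizen_id for citizen in roll])  # ['0001']
```

With a single hub the roll is a query run at election time, not a separate silo to be kept in sync.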

The political scope

The voter’s role is to select members for the parliament. Then the parliament will select a prime minister.

One of the two most likely candidates for next prime minister is the current one with the nickname “Little Lars”, who came to power when the former one became Secretary General of NATO and moved to the HQ in Brussels. Lars is head of the political party called Left (Venstre), which is a right wing party. He is going to defend the welfare state, including universal healthcare and free college.

His main opponent has the nickname “Gucci Helle”. She is leading the left bloc. She is going to defend the welfare state, including universal healthcare and free college.

Head of state

As voters we are not trusted to select the head of state. The queen was born to be queen, and her eldest son will be the next king. On the other hand, the members of the Royal Family are not allowed to vote in the election.  This is the exception that confirms the rule.


The Location Domain

When talking master data management we usually divide the discipline into domains, where the two most prominent domains are:

  • Customer, or rather party, master data management
  • Product, sometimes also named “things”, master data management

One of the most frequently mentioned additional domains is locations.

But even though locations are all around, we seldom see a business initiative aimed at enterprise wide location data management under a slogan of having a 360 degree view of locations. Most often locations are seen as a subset of either the party master data or in some cases the product master data.

Industry diversity

The need for having locations as a focus area varies between industries.

In some industries like public transit, where I have been working a lot, locations are implicit in the delivered services. Travel and hospitality is another example of a tight connection between the product and a location. Also some insurance products have a location element. And do I have to mention real estate: Location, Location, Location.

In other industries the location has a more moderate relation to the product domain. There may be some considerations around plant and warehouse locations, but that’s usually not high volume and complex stuff.  

Locations as a main factor in exploiting demographic stereotypes are important in retailing and other business-to-consumer (B2C) activities. When doing B2C you often want to see your customer as the household where the location is a main, but treacherous, factor in doing so. We had a discussion on the house-holding dilemma in the LinkedIn Data Matching group recently.

Whenever you, or a partner of yours, are delivering physical goods or a physical letter of any kind to a customer, it’s crucial to have high quality location master data. The impact of not having that is of course dependent on the volume of deliveries.   

Globalization

If you ask me about London, I will instinctively think about the London in England. But there is a pretty big London in Canada too, that would be top of mind to other people. And there are other smaller Londons around the world.

Master data with location attributes increasingly comes in populations covering more than one country. It’s not that ambiguous place names don’t exist in single country sets. Ambiguous place names were the main reason why many countries have a postal code system. However the British, and the Canadians, invented a system including letters, in contrast to most other systems, which only have numbers, typically with an embedded geographic hierarchy.
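To show how the postal code formats differ, here is a minimal sketch of a per-country format check. The patterns are deliberately simplified assumptions of my own, good enough to tell the shapes apart but not to replace verification against a Postal Address File; `POSTAL_PATTERNS` and `looks_valid` are made-up names.

```python
import re

# Simplified, illustrative patterns keyed by ISO country code
POSTAL_PATTERNS = {
    "GB": re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}"),  # e.g. SW1A 1AA
    "CA": re.compile(r"[A-Z]\d[A-Z] ?\d[A-Z]\d"),            # e.g. N6A 3K7
    "DK": re.compile(r"\d{4}"),                              # e.g. 2100
    "US": re.compile(r"\d{5}(-\d{4})?"),                     # e.g. 12345-6789
}


def looks_valid(country, code):
    """Return True/False for a known country, or None when no pattern is known."""
    pattern = POSTAL_PATTERNS.get(country)
    if pattern is None:
        return None
    return pattern.fullmatch(code.strip().upper()) is not None


print(looks_valid("GB", "SW1A 1AA"))  # True
print(looks_valid("CA", "N6A 3K7"))   # True
print(looks_valid("GB", "N6A 3K7"))   # False
print(looks_valid("FR", "75001"))     # None
```

The check only says whether a code has the right shape for the country given; it cannot tell you which London you are in if the country is missing or wrong.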

Apart from the different standards in use, the possibilities for exploiting external reference data are very different concerning data quality dimensions such as timeliness, consistency, completeness, conformity – and price.

Handling location data from many countries at the same time ruins many best practices of handling location data that have worked for handling location for a single country.

Geocoding

Instead of identifying locations in a textual way by having country codes, state/province abbreviations, postal codes and/or city names, street names and types or blocks and house numbers and names, it has become increasingly popular to use geocoding as a supplement or even an alternative.

There are different types of geocodes out there suitable for different purposes. Examples are:

  • Latitude and longitude picturing a round world,
  • UTM X, Y coordinates picturing peels of the world,
  • Web Mercator X, Y coordinates (based on WGS84) picturing a world as flat as your computer screen.

While geocoding has a lot to offer in identification and global standardization, we of course have a gap between geocodes and everyday language. If you want to learn more then come and visit me at N55°38’47”, E12°32’58”.
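For the curious, here is a minimal sketch of moving between two of these representations. It assumes the coordinates above read as 55 degrees, 38 minutes, 47 seconds north and 12 degrees, 32 minutes, 58 seconds east, and that the flat-screen coordinates are spherical Web Mercator; the function names are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used by spherical Web Mercator


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes and seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


def to_web_mercator(lat, lon):
    """Project latitude/longitude in decimal degrees to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


lat = dms_to_decimal(55, 38, 47, "N")
lon = dms_to_decimal(12, 32, 58, "E")
print(round(lat, 5), round(lon, 5))  # 55.64639 12.54944

x, y = to_web_mercator(lat, lon)
print(round(x), round(y))
```

None of the resulting numbers is any closer to everyday language than the original, which is rather the point.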
