Small Business Owners

1st February 2012

A challenge I encounter over and over again within Data Matching and customer Master Data Management is what to do with small business owners.

Examples of small business owners are:

  • Farmers
  • Healthcare professionals with their own clinic
  • Owners of small family-run shops
  • Modest membership organisation administrators
  • Local hospitality providers such as Basil Fawlty of Fawlty Towers
  • Independent Data Quality consultants such as myself

When handling customer master data we often like to divide it into Business-to-consumer (B2C) and Business-to-business (B2B). We may have different source systems, different data models and different data owners and data stewards for each of the two divisions.

But small business owners usually belong to both divisions. In some transactions they act as private persons (B2C) and in other transactions they act as a business contact (B2B). If you want to know your customer, have a single customer view, engage in social media and all that jazz, you must have a unique view of the person, the business and the household.

In several industries small business owners, the business and the household are a special target group with unique product requirements. This is true for industries such as banking, insurance, telco, real estate and law.

So here are plenty of business cases for multi-domain Master Data Management embracing customer master data and product master data.

The capability to handle a single customer view of small business owners is in my experience very poorly fulfilled in Data Quality and Master Data Management solutions around. Here is certainly room for improvement and entrepreneurship.



Indulgent Moderator or Ruthless Terminator?

30th January 2012

I am the founder/moderator of two small niche LinkedIn groups in the data quality and Master Data Management (MDM) realm: Data Matching and Multi-Domain MDM.

As a moderator I feel responsible for keeping the discussions in the group on target.

I guess my challenges in doing so resemble what nearly every other moderator of LinkedIn groups is faced with.

The postings that keep creating trouble are related to:

  • Jobs
  • Promotions

LinkedIn does have a facility to place entries into these two alternative tabs. But people seldom do that voluntarily.

Jobs

In fact I’m pleased when a job is posted in one of the groups. But I also know that many people don’t like job postings coming up among the “normal” discussions in the groups.

I’m not so naive that I think recruiters forget to post as a job or don’t know how to do it. Many recruiters don’t respect the rules even if reminded. And some recruiters keep on entering the same job over and over again.

Therefore I have to mark recruiters, who twice “forget”, as subject to indulgent moderation. As said, I like job postings, so until now I haven’t practiced ruthless termination apart from deleting double entries – but that is also a destination of data matching anyway.

Promotions

With the relatively small number of members in the groups in question, and recognising that most participants are tool vendors and service providers, I find entries with promotional content refreshing and informative, though I am most pleased when it’s done with limited marketing triviality.

My indulgence may be explained by the fact that I’m interconnected with tool makers and service providers myself. So these promotions are great ready-made competitor monitoring.

However, my indulgence has its limits when it comes to off topic promotion.

A special case here is outsourcing promotions. I find it peculiar that the people practicing this trade don’t target the message to the group where it is posted. It shouldn’t be too hard to find an angle on data matching or Multi-Domain MDM for your services. But I find that most outsourcing people copy-paste their usual stuff.

So, in this area I mostly am the ruthless terminator. And there is seldom any hasta la vista, baby.



What to do in 2012

28th December 2011

The time between Christmas and New Year is a good time to think about whether you are going to do the right things next year. In doing so, you will have to look back at the current year and see how you can develop from there.

In my professional life as a data quality and master data management practitioner my 2011 to do list included these three main activities:

  • Working with Multi-Domain Master Data Quality
  • Exploiting rich external reference data sources in the cloud
  • Doing downstream data cleansing

In a press release from May 2011, Gartner (the analyst firm) “Highlights Three Trends That Will Shape the Master Data Management Market”. These are:

  • Growing Demand for Multidomain MDM Software
  • Rising Adoption of MDM in the Cloud
  • Increasing Links Between MDM and Social Networks

It looks like I was working in the right space for the first two things but stayed in the past regarding the third activity being downstream data cleansing.

The third thing to embrace in the future, social MDM we may call it, has been an area of interest for me for the last couple of years, and actually some downstream data cleansing projects have touched on making master data useful for including social media networks in the loop.

I’m not sure if 2012 will be a breakthrough for social MDM, but I think there will be some exciting opportunities out there for paving the road for social MDM.



Party, Product, Place. Period.

18th September 2011

In a recent post here on this blog the Master Data Management domain usually called locations was examined and followed by excellent comments.

Also in the DAMA International LinkedIn group there was a great discussion around the location domain.

The comments touched two subjects:

  • Are locations just geographic locations or can we deal with “digital locations” such as eMail addresses, phone numbers, websites, go-to-meeting ID’s, social network ID’s and so on as locations as well?
  • How do we model the relations between parties, products and locations?

Sometimes I like to use the word places instead of locations as we then have a P-trinity of parties, products and places.

I’m not sure if places have a stronger semantic link to geography than locations have. Anyway, my thoughts on the location domain were merely connected to geography. The digital locations mentioned are, in my eyes, more related to parties and not so much to products. The same is true for another good old substitute for a location or address, the mailbox (like “Postbox 1234”), which is a valid notion for the destination of a letter or small package, and is often seen in database columns otherwise filled with geographic locations.

So, sticking to places being physical, geographic locations: How do we model parties, products and places?

First of all it’s important that we are able to model different concepts within each domain in one single way. A very common situation in many enterprise data landscapes is that different forms of parties exist with different models, like a model for customers, a model for suppliers, a model for employees and other models for other business partner roles.  

The association between a party entity and a location entity is in most cases a time dependent relation like this consumer was billed on this address in this period. The relation between the party and the product is the good old basic data model, that we invoiced this and this product on that date. The product and place relation is very industry specific. One example will be that an on-site service contract applies to this address in this period.
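The three relations above can be sketched as a small data model. This is only an illustration, assuming one generic party entity and a period with an optional end date; all class, field and identifier names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Period:
    start: date
    end: Optional[date] = None  # None means the relation is still open

    def covers(self, day: date) -> bool:
        return self.start <= day and (self.end is None or day <= self.end)


@dataclass(frozen=True)
class PartyPlace:
    """Time dependent relation, e.g. a consumer billed on an address."""
    party_id: str
    place_id: str
    role: str
    period: Period


@dataclass(frozen=True)
class PartyProduct:
    """The basic relation: this product was invoiced to this party on a date."""
    party_id: str
    product_id: str
    invoiced_on: date


@dataclass(frozen=True)
class ProductPlace:
    """Industry specific relation, e.g. an on-site service contract."""
    product_id: str
    place_id: str
    role: str
    period: Period


def places_for_party(relations, party_id, role, day):
    """Return the places a party had in a given role on a given day."""
    return [r.place_id for r in relations
            if r.party_id == party_id and r.role == role and r.period.covers(day)]


# Hypothetical sample data: the same party billed on two addresses over time.
billing = [
    PartyPlace("party-1", "place-A", "billing",
               Period(date(2010, 1, 1), date(2010, 12, 31))),
    PartyPlace("party-1", "place-B", "billing", Period(date(2011, 1, 1))),
]
```

Asking for the billing place of "party-1" on a day in 2010 gives "place-A", while a day in 2011 gives "place-B", because the period is part of the relation and not of the party or the place.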

Time, often handled as a period, will indeed add a fourth P to the P-trinity of party, product and place.



The Location Domain

10th September 2011

When talking master data management we usually divide the discipline into domains, where the two most prominent domains are:

  • Customer, or rather party, master data management
  • Product, sometimes also named “things”, master data management

One of the most frequently mentioned additional domains is locations.

But even though locations are all around, we seldom see a business initiative aimed at enterprise wide location data management under a slogan of having a 360 degree view of locations. Most often locations are seen as a subset of either the party master data or in some cases the product master data.

Industry diversity

The need for having locations as focus area varies between industries.

In some industries like public transit, where I have been working a lot, locations are implicit in the delivered services. Travel and hospitality is another example of a tight connection between the product and a location. Also some insurance products have a location element. And do I have to mention real estate: Location, Location, Location.

In other industries the location has a more moderate relation to the product domain. There may be some considerations around plant and warehouse locations, but that’s usually not high volume and complex stuff.  

Locations as a main factor in exploiting demographic stereotypes are important in retailing and other business-to-consumer (B2C) activities. When doing B2C you often want to see your customer as the household where the location is a main, but treacherous, factor in doing so. We had a discussion on the house-holding dilemma in the LinkedIn Data Matching group recently.

Whenever you, or a partner of yours, are delivering physical goods or a physical letter of any kind to a customer, it’s crucial to have high quality location master data. The impact of not having that is of course dependent on the volume of deliveries.   

Globalization

If you ask me about London, I will instinctively think about the London in England. But there is a pretty big London in Canada too, that would be top of mind to other people. And there are other smaller Londons around the world.

Master data with location attributes increasingly comes in populations covering more than one country. It’s not that ambiguous place names don’t exist in single country sets. Ambiguous place names were the main reason why many countries have a postal code system. However, the British and the Canadians invented systems that include letters, as opposed to most other systems, which only have numbers, typically with an embedded geographic hierarchy.

Apart from the different standards in use, the possibilities for exploiting external reference data are very different concerning data quality dimensions such as timeliness, consistency, completeness, conformity – and price.

Handling location data from many countries at the same time ruins many best practices that have worked for handling location data for a single country.

Geocoding

Instead of identifying locations in a textual way by having country codes, state/province abbreviations, postal codes and/or city names, street names and types or blocks and house numbers and names, it has become increasingly popular to use geocoding as a supplement or even an alternative.

There are different types of geocodes out there suitable for different purposes. Examples are:

  • Latitude and longitude picturing a round world
  • UTM X, Y coordinates picturing peels of the world
  • Web Mercator X, Y coordinates (based on WGS84) picturing a world as flat as your computer screen

While geocoding has a lot to offer in identification and global standardization, we of course have a gap between geocodes and everyday language. If you want to learn more then come and visit me at N55°38'47", E12°32'58".
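A position given in degrees, minutes and seconds, like the one above, is usually converted to decimal degrees before it is stored or compared. Here is a minimal sketch of that arithmetic, assuming the notation hemisphere letter, degrees°, minutes', seconds"; the function name is hypothetical.

```python
def dms_to_decimal(text: str) -> float:
    """Convert e.g. N55°38'47" to signed decimal degrees."""
    hemisphere, rest = text[0], text[1:]
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere in {text!r}")
    degrees, rest = rest.split("°")
    minutes, rest = rest.split("'")
    seconds = rest.rstrip('"')
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and west are negative by convention.
    return -value if hemisphere in ("S", "W") else value


latitude = dms_to_decimal("N55°38'47\"")
longitude = dms_to_decimal("E12°32'58\"")
```

The decimal form is what most geocoding services and spatial databases expect, while the degrees, minutes and seconds form is closer to what people write down.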



The Database versus the Hub

4th September 2011

In the LinkedIn Multi-Domain MDM group we have an ongoing discussion about why you need a master data hub when you have already got some workflow, UI and a database.

I have been involved in several master data quality improvement programs without having the opportunity of storing the results in a genuine MDM solution, for example as described in the post Lean MDM. And of course this may very well result in a success story.

However, there are some architectural reasons why many more organizations than those who are using an MDM hub today may find benefits in sooner or later having a master data hub.

Hierarchical Completeness

If we start with product master data, the main issue with storing it is the diversity in the requirements for which attributes are needed and when they are needed, depending on the categorization of the products involved.

Typically you will have hundreds or thousands of different attributes where some are crucial for one kind of product and absolutely ridiculous for another kind of product.

Modeling a single product table with thousands of attributes is not a good database practice, and pre-modeling tables for each conceivable categorization is very inflexible.

Setting up mandatory fields on database level for product master data tables is asking for data quality issues, as you can’t avoid either over-killing or under-killing.

Also, product master data entities are seldom created in one single insertion, but are inserted and updated by several different employees, each responsible for a set of attributes, until the entity is ready to be approved as a whole.

A master data hub, not least those born in the product domain, is built for those realities.

The party domain has hierarchical issues too. One example is whether a state/province is mandatory on an address, which depends on the country in question.
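The idea of completeness depending on the place in a hierarchy can be sketched as follows. This is a simplified illustration and not how any particular hub does it; the categories, attribute names and the rule table are all hypothetical.

```python
# Hypothetical rule table: required attributes per node in a category
# hierarchy. A product inherits the requirements of all its ancestors.
REQUIRED_BY_CATEGORY = {
    "product": ["name"],
    "product/clothing": ["size", "colour"],
    "product/electronics": ["voltage"],
}


def required_attributes(category: str) -> list:
    """Collect required attributes from the root down to the category."""
    required = []
    parts = category.split("/")
    for depth in range(1, len(parts) + 1):
        node = "/".join(parts[:depth])
        required.extend(REQUIRED_BY_CATEGORY.get(node, []))
    return required


def missing_attributes(record: dict) -> list:
    """Return the required attributes that are still empty on a record."""
    return [attribute
            for attribute in required_attributes(record["category"])
            if record.get(attribute) in (None, "")]


# A record that is still being enriched by different employees.
draft = {"category": "product/clothing", "name": "T-shirt", "size": "M"}
still_missing = missing_attributes(draft)
```

Because the check is a function of the category and not a mandatory column in the table, a half-finished record can be saved and only blocked at approval time. The same pattern would apply to a state/province rule keyed on country.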

Single Business Partner View

I like the term “single business partner view” as a higher vision for the more common “single customer view”, as we have the same architectural requirements for supplier master data, employee master data and other master data concerning business partners as we have for the of course extremely important customer master data.

The uniqueness dimension of data quality has a really hard time in common database managers. Having duplicate customer, supplier and employee master data records is the most frequent data quality issue around.

In this sense, a duplicate party is not a record with exactly the same fields filled and with exactly the same values spelled exactly the same way, as a database will see it. A duplicate is one record reflecting the same real world entity as another record, and a duplicate group is several records reflecting the same real world entity.

Even though some database managers have fuzzy capabilities, they are still very inadequate at finding these duplicates based on including several attributes at one time, and not least at finding duplicate groups.
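To make the notion of a duplicate group concrete, here is a rough sketch that compares several attributes at once and then collects matching records into groups. It uses the standard library's difflib as a stand-in for a real matching engine; the field names, the plain average and the 0.85 threshold are assumptions chosen for illustration only.

```python
from difflib import SequenceMatcher
from itertools import combinations

FIELDS = ("name", "street", "city")  # hypothetical attribute names


def similarity(a: str, b: str) -> float:
    """Fuzzy similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def is_duplicate(first: dict, second: dict, threshold: float = 0.85) -> bool:
    """Compare several attributes at one time using a plain average."""
    score = sum(similarity(first[f], second[f]) for f in FIELDS) / len(FIELDS)
    return score >= threshold


def duplicate_groups(records: list) -> list:
    """Return lists of record indexes that reflect the same entity."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise comparison; a match merges the two groups (union-find).
    for i, j in combinations(range(len(records)), 2):
        if is_duplicate(records[i], records[j]):
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return [members for members in groups.values() if len(members) > 1]


records = [
    {"name": "John Smith", "street": "12 Main Street", "city": "London"},
    {"name": "Jon Smith", "street": "12 Main Street", "city": "London"},
    {"name": "Mary Jones", "street": "5 High Road", "city": "Leeds"},
    {"name": "John Smyth", "street": "12 Main St", "city": "London"},
]
groups = duplicate_groups(records)
```

Comparing every pair does not scale to large party tables, which is one reason dedicated matching tools add blocking, weighting and tuning on top of this basic idea.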

Finding duplicates when inserting supposedly new entities into your customer list and other party master data containers is only the first challenge concerning uniqueness. Next you have to solve the so-called survivorship questions, that is, which values will survive unavoidable differences.
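A survivorship rule can be as simple as "the most recently updated non-empty value wins". The sketch below applies that one rule to a duplicate group; the rule itself, the field names and the sample values are assumptions for illustration, and real implementations typically combine several rules per attribute.

```python
from datetime import date


def survive(group: list, fields: list) -> dict:
    """Build one surviving record from a group of duplicates.

    Rule (an assumption for this sketch): for each field, take the
    non-empty value from the most recently updated record.
    """
    newest_first = sorted(group, key=lambda r: r["updated"], reverse=True)
    survivor = {}
    for field in fields:
        survivor[field] = next(
            (r[field] for r in newest_first if r.get(field)), None)
    return survivor


group = [
    {"name": "John Smith", "phone": "", "email": "john@example.com",
     "updated": date(2011, 3, 1)},
    {"name": "Jon Smith", "phone": "555-0100", "email": "",
     "updated": date(2011, 8, 15)},
]
golden = survive(group, ["name", "phone", "email"])
```

Note that the newest name here is "Jon Smith", which may or may not be the better spelling. That is exactly why survivorship is a question and not a given.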

Finally, the results to be stored may have several structural outcomes. Maybe a new insertion must be split into two entities belonging to two different hierarchy levels in your party master data universe.

A master data hub will have the capabilities to solve this complexity, some for customer master data only, some also for supplier master data combined with similar challenges with product master data and eventually also other party master data.

Domain Real World Awareness

Building hierarchies, filling incomplete attributes, consolidating duplicates and other forms of real world alignment are most often achieved by including external reference data.

There are many sources available for party master data, such as address directories, business directories and citizen information, depending on the countries in question.

With product master data, global data synchronization involving common product identifiers and product classifications is becoming very important when doing business the lean way.

Master data hubs know these sources of external reference data so you, once again, don’t have to reinvent the wheel.



Mutating Platforms or Intelligent Design

16th July 2011

How do we go from single-domain master data management to multi-domain master data management? Will it be through evolution of single-domain solutions or will it require a completely new intelligent design?

The MDM journey

My previous blog post was a book review of “Master Data Management in Practice” by Dalton Cervo and Mark Allen – or the full title of the book is in fact “Master Data Management in Practice: Achieving True Customer MDM”.

The customer domain has until now been the most frequent and proven domain for master data management and, as said in the book, the domain where most organizations start the MDM journey, in particular by doing what is usually called Customer Data Integration (CDI).

However some organizations do start with Product Information Management (PIM). This is mainly due to the magic numbers being the fact that some organizations have a higher number of products than customers in the database.

Sooner or later most organizations will continue the MDM journey by embracing more domains.

Achieving Multi-Domain MDM

John Owens made a blog post yesterday called “Data Quality: Dead Crows Kill Customers! Dead Crows also Kill Suppliers!” The post explains how some data structures are similar between sales and purchasing. For example a customer and a supplier are very similar as a party.

Customer Data Integration (CDI) has a central entity being the customer, which is a party. Product Information Management (PIM) has an important entity being a supplier, which is a party. The data structures and the workflows needed to Create, Read, Update and perhaps Delete these entities are very similar, not least in business-to-business (B2B) environments.

So, when you are going from PIM to CDI, you don’t have to start from scratch, not least in a B2B environment.

The trend in the master data management technology market is that many vendors are working their way from being a single domain vendor to being a multi-domain vendor – and some are promoting their new intelligent design embracing all domains from day one.

Some other vendors are breeding several platforms (often based on acquisition) from different domains into one brand, and some vendors are developing from a single domain into new domains.     

Each strategy has its pros and cons. It seems there will be plenty of philosophies to choose from when organizations are going to select the platform(s) to support the multi-domain MDM journey.



Psychographic Data Quality

5th July 2011

I have just read an article on Mashable by Jamie Beckland called The End of Demographics: How Marketers Are Going Deeper With Personal Data.

The article explains how new sources of available data make it possible for marketers to get a much closer look at potential customers and thereby go from delivering a broad message to a huge crowd to delivering a very targeted message to a small group of people with a high probability of getting a response. In short: Marketers are going from demographic marketing to psychographic marketing.

I believe this is true and ongoing (as I have also been involved in such activities).

The data quality issues we have always known in direct marketing are surely very similar in the psychographic marketing which is going on in the social media realm and in connection with eBusiness.

In my eyes, the concept of a single customer view is also a key to getting success in psychographic marketing.  

You are not delivering a targeted message if you are delivering two different messages to two user profiles belonging to the same real world individual.

Your message will be very frustrating if you treat someone as a prospective customer when that someone is already an existing customer, perhaps in another channel.

The effectiveness of psychographic marketing depends on a match between the psychographic variables, the behavioral variables and the demographic variables. As seen in the example in the Mashable article, a good old thing such as geocoding will be needed here.

An exciting thing in the rise of psychographic marketing is that it will add to the trend in data quality technology where it’s much more than simple name and address cleansing and deduplication. Rich location data will, despite the virtual playground, be even more important. The relations between customers and products as described in the post Customer Product Matrix Management will be further refined in psychographic marketing.



B2C versus B2B Data Quality

8th June 2011

The data quality issues in doing business with private consumers (business-to-consumer = B2C) and doing business with other businesses (business-to-business = B2B) have a lot of similar challenges but also differ in a lot of ways.

Some of my experiences (and thoughts) related to different master data domains are:

Customer master data

In B2C the number of customers, prospects and leads is usually high and characterized by relatively few interactions with each entity.  In B2B you usually have a relatively small number of customers with a high number of interactions.

One of the most automated activities in data quality improvement is matching master data records with information about customers. Many of the examples we see in marketing material, research documents, blog posts and so on are about matching in the B2C realm. This is natural since the high number of records, typically with a low attached value, calls for automation.

Data matching in the B2B realm is indeed more complex due to numerous challenges like less standardized names of companies and typically more options in what constitutes a single customer. The high value attached to each customer also makes the risk of mistakes a showstopper for too much automation.

So in B2B we see an increasing adoption of workflows that ensure data quality during data capture, often by exploiting external reference data, which in general is also more available for business entities.

Location master data

The location of B2C customers means a lot. Accurate and timely delivery addresses for everything from direct mails to bringing goods to the premises are essential. Location data are used for recognizing household relations, assigning demographic stereotypes and in many cases calculating fees of different kinds. I had a near disaster experience with a really bad address in my early career.

Even though location data for B2B activities theoretically is just as important, I have often seen that a little less precision is fit for purpose or anyway lower prioritized than more pressing issues.

Product master data

Theoretically there should be no difference between B2C and B2B here, but I guess there is in practice?

The most interesting aspect is probably the multi-domain aspect examining the relations between customers and products.   

I had some experiences some years ago with the B2B realm as described in the post What is Multi-Domain MDM?: 1,000 B2B customers buying 1,000 different finished products can be a quite complicated data quality operation.

Within the B2C realm the most predominant multi-domain data quality issues I have met are related to analytics. As discussed in the post Customer/Product Matrix Management it is about typifying your customers correctly and categorizing your products adequately at the same time.



Non-Obvious Entity Relationship Awareness

16th March 2011

In a recent post here on this blog it was discussed: What is Identity Resolution?

One angle was the interchangeable use of the terms “Identity Resolution” and “Entity Resolution”. There are three ways to see it: the terms are truly interchangeable, or “Identity Resolution” is more advanced than “Entity Resolution”, or (my suggestion) “Identity Resolution” is merely related to party master data, while “Entity Resolution” can be about all master data domains such as parties, locations and products.

Another term sometimes used in this realm is “Non-Obvious Relationship Awareness”. This term is also merely related to finding relationships between parties, for example individuals at a casino who seem to do better than the croupiers. Here’s a link to a (rather old) O’Reilly Radar post on Non-Obvious Relationship Awareness.

Going Multi-Domain

So “Non-Obvious Entity Relationship Awareness” could be about finding these hidden relationships in a multi-domain master data scope.

An example could be non-obvious relationships in a customer/product matrix.

The data supporting this discovery will actually not be found in the master data itself, but in transaction data probably being in an Enterprise Data Warehouse (EDW). But a multi-domain master data management platform will be needed to support the complex hierarchies and categorizations needed to make the discovery.   

One technical aspect of discovering such non-obvious relationships is how chains of keys are stored in the multi-domain master data hub.

Customer Master Data

The transactions or sums thereof in the data warehouse will have keys referencing customer accounts. These accounts can be stored in staging areas in the master data hub with references to a golden record for each individual or company in the real world. Depending on the identity resolution available, the golden records will have golden relations to each other as they form hierarchies of households, company family trees, contacts within companies and their movements between companies and so on.

My guess as described in the post Who is working where doing what? is that this will increasingly include social media data.

Product Master Data

Some of the same transactions or sums thereof in the data warehouse will have keys referencing products. These products will exist in the master data hub as members of various hierarchies with different categorizations.

My guess is that future developments in this field will further embrace not just your own products but also competitor products and market data available in the cloud all attached to your hierarchies and categorizations.   
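The chains of keys described above can be illustrated with a small sketch: transactions reference accounts and products, the hub maps accounts to golden records and households, and products to categories. All identifiers, mappings and amounts below are made up for the example.

```python
from collections import defaultdict

# Hypothetical key chains as they might be held in a master data hub.
ACCOUNT_TO_GOLDEN = {"acc-1": "person-1", "acc-2": "person-1", "acc-3": "person-2"}
GOLDEN_TO_HOUSEHOLD = {"person-1": "household-1", "person-2": "household-1"}
PRODUCT_TO_CATEGORY = {"sku-10": "insurance", "sku-11": "insurance", "sku-20": "banking"}

# Hypothetical transactions as they might sit in the data warehouse:
# (account key, product key, amount).
TRANSACTIONS = [
    ("acc-1", "sku-10", 100.0),
    ("acc-2", "sku-20", 50.0),
    ("acc-3", "sku-11", 25.0),
]


def household_category_matrix(transactions) -> dict:
    """Sum amounts per (household, product category) by following the keys."""
    matrix = defaultdict(float)
    for account_id, product_id, amount in transactions:
        golden_id = ACCOUNT_TO_GOLDEN[account_id]      # account -> golden record
        household_id = GOLDEN_TO_HOUSEHOLD[golden_id]  # golden record -> household
        category = PRODUCT_TO_CATEGORY[product_id]     # product -> category
        matrix[(household_id, category)] += amount
    return dict(matrix)


matrix = household_category_matrix(TRANSACTIONS)
```

Seen account by account, the three transactions look unrelated. Only after following the keys does it show that one household holds both insurance and banking products, which is the kind of non-obvious relationship a customer/product matrix is meant to reveal.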


