Multi-Purpose Data Quality

Say you are a charity fundraising organisation. For many years you have had a membership database, and recently you also introduced an eShop selling related accessories.

The membership database holds the following record (Name, Address, City, YearlyContribution):

  •  Margaret & John Smith, 1 Main Street, Anytown, 100 Euro

The eShop system has the following accounts (Name, Address, Place, PurchaseInAll):

  • Mrs Margaret Smith, 1 Main Str, Anytown, 12 Euro
  • Peggy Smith, 1 Main Street, Anytown, 218 Euro
  • Local Charity c/o Margaret Smith, 1 Main Str, Anytown, 334 Euro

Now the new management wants to double contributions from members and triple eShop turnover. Based on the recommendations from “The One Truth Consulting Company” you plan to do the following:

  • Establish a platform for 1-1 dialogue with your individual members and customers
  • Analyze member and customer behaviour and profiles in order to:
    • Support the 1-1 dialogue with existing members and customers
    • Find new members and customers who are like your best members and customers

As the new management intends to stay for many years ahead, the solution must not be a one-shot exercise but must be implemented as business process reengineering with a continuous focus on best-fit data governance, master data management and data (information) quality.

So, what are you going to do with your data so that they are fit for action with both the old purposes and the new purposes?
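The tiny dataset above already shows the core identity resolution problem: “Str” versus “Street”, “Peggy” as a pet form of “Margaret”, and a c/o line hiding the same person. A minimal sketch in Python of how the membership record could be matched against the eShop accounts (the normalization rules, field names and candidate rule are my own illustrative assumptions, not a real matching engine):

```python
import re

# Illustrative lookup tables; a real matching engine would use far
# richer, locale-aware name and address knowledge.
NICKNAMES = {"peggy": "margaret"}      # Peggy is a pet form of Margaret
ABBREVIATIONS = {"str": "street"}      # common address abbreviation

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, expand known abbreviations/nicknames."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    words = [ABBREVIATIONS.get(w, NICKNAMES.get(w, w)) for w in words]
    return " ".join(words)

member = {"name": "Margaret & John Smith",
          "address": "1 Main Street", "city": "Anytown"}

eshop_accounts = [
    {"name": "Mrs Margaret Smith", "address": "1 Main Str", "place": "Anytown"},
    {"name": "Peggy Smith", "address": "1 Main Street", "place": "Anytown"},
    {"name": "Local Charity c/o Margaret Smith", "address": "1 Main Str",
     "place": "Anytown"},
]

def possible_match(member, account) -> bool:
    """Crude candidate rule: same normalized address and place/city,
    plus at least one shared name token."""
    same_where = (normalize(member["address"]) == normalize(account["address"])
                  and normalize(member["city"]) == normalize(account["place"]))
    member_names = set(normalize(member["name"]).split())
    account_names = set(normalize(account["name"]).split())
    return same_where and bool(member_names & account_names)

matches = [a["name"] for a in eshop_accounts if possible_match(member, a)]
print(matches)  # all three eShop accounts are candidates for the household
```

With rules this crude, all three accounts come back as candidates, which is exactly the point: deciding whether they are one household, one person, or a person plus an organisation is a business decision, not just a string comparison.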

Recently I wrote some posts related to these challenges.

Any other comments on how to tackle these issues are welcome.


Follow Friday Master Data Hub

Social Networking needs Master Data Management.

A recurring event every Friday on Twitter is #FollowFriday, with the hashtag #FF, where people tweet about who to follow.

I do it too, and like everyone else I sometimes forget someone, and then (s)he gets angry and doesn’t #FF me, and that’s bad. Bad Data Management. Bad #mdm.

So now I have started building a Master Data Hub fit for the purpose of doing consistent #FF. I see other purposes for this as well, and as I recognize the advantages of combining data sources, I did some #datamatching with my LinkedIn connections to improve #dataquality through Identity Resolution.

This is as far as I have got for now (very convenient that WordPress lets me edit my blog posts):

@ReferenceData where http://www.linkedin.com/pub/carla-mangado/11/467/239 is Staff Writer

@KenOConnorData is http://www.linkedin.com/in/kenoconnor00

@ocdqblog is a blog where http://www.linkedin.com/in/jimharris is blogger-in-chief

@dataqualitypro is a community founded by http://www.linkedin.com/in/dylanjones

Dylan was a @Datanomic partner where @SteveTuck is http://www.linkedin.com/in/stevetuck

@InitiateSystems has a CTO = @wmmarty who is http://www.linkedin.com/pub/marty-moseley/0/57/43b

@VishAgashe is http://www.linkedin.com/in/vishagashe

@KeithMesser is http://www.linkedin.com/in/keithmesser running @GlobalMktgPros

@fionamacd is at @TrilliumSW as seen here http://www.linkedin.com/in/fionamacd

So is @stevesarsfield being http://www.linkedin.com/pub/steve-sarsfield/2/675/47a

Trillium is owned by Harte-Hanks where @MarkGoloboy also was http://www.linkedin.com/in/markgoloboy

@biknowledgebase is operated by http://www.linkedin.com/in/barryharmsen

@Dataexperts has a managing director who is http://www.linkedin.com/pub/gary-holland/1/101/135

@IDResolution (Infoglide) has several Data Matching members in http://www.linkedin.com/groups?gid=2107798 including http://www.linkedin.com/in/dougwood

@rdrijsen is http://www.linkedin.com/in/rdrijsen with possible duplicate http://www.linkedin.com/pub/resa-drijsen/1/389/58

@grahamrhind is http://www.linkedin.com/in/grahamrhind

@omathurin is http://www.linkedin.com/in/oliviermathurin

@zzubbuzz is probably http://www.linkedin.com/pub/charles-proctor/14/591/31

@CharlesBurleigh is http://www.linkedin.com/in/charlesburleigh

@wesharp is http://www.linkedin.com/in/williamesharp doing @dqchronicle

@decisionstats has an editor being http://www.linkedin.com/in/ajayohri

@jeric40 is my colleague at Omikron as shown here http://www.linkedin.com/in/janerikingvaldsen
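The hub sketched in the list above is essentially a set of master records keyed by Twitter handle, each linked to a LinkedIn profile, with suspected duplicates flagged for review. A minimal sketch using a few records from the list (the record structure is my own assumption):

```python
# Each handle is a master record linking to a LinkedIn profile URL;
# unresolved identities carry a possible_duplicates list.
ff_hub = {
    "@KenOConnorData": {"linkedin": "http://www.linkedin.com/in/kenoconnor00"},
    "@grahamrhind": {"linkedin": "http://www.linkedin.com/in/grahamrhind"},
    "@rdrijsen": {
        "linkedin": "http://www.linkedin.com/in/rdrijsen",
        "possible_duplicates": [
            "http://www.linkedin.com/pub/resa-drijsen/1/389/58"
        ],
    },
}

def follow_friday(hub):
    """Return every handle in the hub, so nobody is forgotten on #FF."""
    return sorted(hub)

def needs_review(hub):
    """Handles whose identity resolution is not yet settled."""
    return [h for h, rec in hub.items() if rec.get("possible_duplicates")]

print(" ".join(follow_friday(ff_hub)))
print(needs_review(ff_hub))
```

Generating the weekly #FF tweet from the hub instead of from memory is the whole trick: the list is complete by construction, and the review queue shows where the #datamatching still has work to do.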

Data Quality Milestones

I have a page on this blog with the heading “Data Quality 2.0”. The page is about what, in my opinion, the near future will bring in the data quality industry. In recent days there have been some comments on the topic. My current summing up on the subject is this:

The Data Quality X.X labels are merely maturity milestones, where:

Data Quality 0.0 may be seen as a Laissez-faire state where nothing is done.

Data Quality 1.0 may be seen as projects for improving downstream data quality, typically using batch cleansing with nationally oriented techniques, in order to make data fit for purpose.

Data Quality 2.0 may be seen as agile implementation of enterprise-wide and small business upstream data quality prevention, using multi-cultural combined techniques and exploiting cloud-based reference data, in order to maintain data fit for multiple purposes.

Fit for what purpose?

The goal of data quality improvement is often set as “fit for purpose”. The first purpose addressed will almost naturally be within the domain where the data in question are captured. Then you address other domains where the same data may also be used, but probably with other purposes, leading to additional or varying measures of fitness.

If an organisation identifies several domains where the same data are used, the normal approach will be to gather all purposes and then start aligning all the needs, finding the highest common denominators and so on. This may be a very cumbersome process, as you need to consider all the different dimensions of data quality: uniqueness, completeness, timeliness, validity, accuracy and consistency.
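Measuring each dimension for each purpose is part of what makes the alignment cumbersome. As a hedged sketch, two of the dimensions, completeness and uniqueness, might be profiled like this (the records and field names are made up for illustration):

```python
# Three hypothetical party records; one is missing its address.
records = [
    {"name": "Margaret Smith", "address": "1 Main Street", "city": "Anytown"},
    {"name": "Peggy Smith", "address": "1 Main Street", "city": "Anytown"},
    {"name": "John Smith", "address": None, "city": "Anytown"},
]

def completeness(records, field):
    """Share of records where the given field is populated."""
    filled = sum(1 for r in records if r.get(field))
    return filled / len(records)

def uniqueness(records, fields):
    """Share of distinct combinations of the given fields."""
    combos = {tuple(r.get(f) for f in fields) for r in records}
    return len(combos) / len(records)

print(completeness(records, "address"))          # 2 of 3 have an address
print(uniqueness(records, ("address", "city")))  # 2 distinct combinations of 3
```

Each purpose may weight these dimensions differently, which is why aligning many purposes quickly becomes harder than simply asking how well the records reflect the real-world parties.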

Another way will be to assume that if you gather many purposes, the total needs will almost certainly tend to be a reflection of the real-world objects to which the data refer.

So my thesis is that there is a break-even point when including more and more purposes, beyond which it will be less cumbersome to reflect the real-world object than to try to align all the known purposes.

Master data are often used in many different functions in an organisation, and not least party data – names and addresses – are known to be a focus area for data quality improvement. Here it is very obvious that real-world objects exist, and they are basically the same to every organisation.

Earlier this year I wrote an entry on Data Quality Pro about the possibilities of external party reference data: http://www.dataqualitypro.com/data-quality-home/external-reference-data-an-overview.html

In my previous post on this blog I noted that governments around the world are releasing data stores, which will surely add traction to the real-world approach to data quality improvement.

I will certainly touch on this subject in forthcoming posts on this blog.


Qualities in Data Architecture

Data architecture describes the structure of data used by a business and its applications by mapping the data artifacts to data qualities, applications, locations etc.

2000 years ago the Roman writer, architect and engineer Marcus Vitruvius Pollio wrote that a structure must exhibit the three qualities of firmitas, utilitas, venustas: that is, it must be strong or durable, useful, and beautiful.

I have worked with data quality for many years and have always been a bit disappointed about the lack of (at)traction around data quality issues. Perhaps the lack of attraction is because we focus so much on strength, durability and usefulness and too little on beauty – or at least attractiveness.

But how do the three qualities apply to data quality?

  • Firmitas, strength and durability, is connected to technology and how we tend to make our data reflect real-world objects as closely as possible in terms of uniqueness, completeness, timeliness, validity, accuracy and consistency.
  • Utilitas, usefulness, is connected to how we use data as information in business processes. Often “fit for purpose” is stated as a goal for data quality improvement – which makes it hard when multiple purposes exist in an organization.
  • Venustas – beauty or attractiveness – is connected to the mindset of people. Often we blame poor data quality on the people putting data into the data stores and direct initiatives that way, using a whip called data governance. But we will probably get more attraction from people if we make quality data more attractive – or at least show that they can be.

Compared to buildings, data quality is often the sewers beneath the old cathedrals and new opera houses – which may also explain the lack of attraction.

If you consider yourself a data quality professional – be it a tool maker, an expert, whatever – you have to get up from the sewers and make and show some attractive data in the halls of the fine buildings. You know how hard it is to make quality data – so do tell the success stories.
