Universal Pearls of Wisdom

When we look for what is really important and absolutely necessary to get data quality right, some sayings could be:

  • “Change management is a critical factor in ensuring long-term data quality success”.
  • “Focussing only on technology is doomed to fail”.
  • “You have to get buy-in from executive sponsors”.

These statements are, in my eyes, very true, and I guess everyone else will agree.

But I also notice that they are true for many other disciplines like MDM, BI, CRM, ERP, SOA, ITIL… you name it.

Also take the new SOA manifesto. I have tried to swap SOA (and the full words) with XYZ, and this is the result:

 XYZ Manifesto

XY is a paradigm that frames what you do. XYZ is a type of Z that results from applying XY. We have been applying XY to help organizations consistently deliver sustainable business value, with increased agility and cost effectiveness, in line with changing business needs. Through our work we have come to prioritize:

Business value over technical strategy

Strategic goals over project-specific benefits

Intrinsic interoperability over custom integration

Shared services over specific-purpose implementations

Flexibility over optimization

Evolutionary refinement over pursuit of initial perfection

That is, while we value the items on the right, we value the items on the left more.

I think a Data Quality manifesto, and several other manifestos, could come out very close to this.

But what I am looking for in Data Quality is the specific pearls of wisdom related to Data and Information Quality, while I of course value being reminded of the universal ones.

Man versus Computer

In a recent social network exchange, Jim Harris and Phil Simon discussed whether IT projects are like the board games Monopoly or Risk.

I notice that both these games are played with dice.

I remember that back in the early 80’s I got some programming training by constructing a Yahtzee game on a computer. The following parts were at my disposal:

  • Platform: IBM 8100 minicomputer
  • Language: COBOL compiler
  • User Interface: Screen with 80 characters in 24 rows

As the user interface design options were limited, the exciting part became the one player mode, where I had to teach (program) the computer which dice to save in a given situation, and to base that logic on patterns rather than on every possible combination.

While having some other people test man versus computer in the one player mode, I found out that I could actually construct a compact program that in the long run won more rounds than (ordinary) people.
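The idea of deciding on patterns rather than on every possible combination can be illustrated with a minimal sketch. This is not the original COBOL program; the function names `longest_run` and `dice_to_keep` and the three patterns chosen are assumptions made purely for illustration.

```python
from collections import Counter


def longest_run(values):
    """Return the longest run of consecutive numbers among the given values."""
    best, current = [], []
    for v in sorted(values):
        if current and v == current[-1] + 1:
            current.append(v)
        else:
            current = [v]
        if len(current) > len(best):
            best = list(current)
    return best


def dice_to_keep(dice):
    """Pick the dice to save from one roll using a few simple patterns.

    Hypothetical rules, checked in order:
    1. three or more of a kind: keep them
    2. four or more consecutive values: keep the run (straight)
    3. one or two pairs: keep the pairs
    4. otherwise: keep only the highest die
    """
    counts = Counter(dice)
    # Most frequent value; ties are broken in favour of the higher value.
    value, n = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
    if n >= 3:
        return [value] * n

    run = longest_run(set(dice))
    if len(run) >= 4:
        return run

    pairs = sorted(v for v, c in counts.items() if c == 2)
    if pairs:
        return [v for v in pairs for _ in range(2)]

    return [max(dice)]


print(dice_to_keep([3, 3, 3, 5, 1]))
print(dice_to_keep([1, 2, 3, 4, 6]))
print(dice_to_keep([2, 2, 5, 5, 6]))
```

A handful of rules like these cover all 7,776 possible rolls of five dice without listing any of them, which is what keeps such a program compact.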

Now, what about games without dice? Here we know that development has gone so far that even in chess the computer is now better than any human.

So, what about data quality? Is it man or computer who is best at solving the matter? A blog post from Robert Barker called “Avoiding False Positives: Analytics or Humans?” offers a view on this.

Also, seen from a time and cost perspective, the computer does have some advantages compared to humans.

But still we need humans to select which game to play. Throw the dice…

LinkedIn Group Statistics

I am currently a member of 40 LinkedIn groups, mostly targeted at Master Data Management, Data Quality and Data Matching.

As I have noticed that some groups cover the same topic, I wondered if they have the same members.

So I did a quick analysis.

With Master Data Management the largest groups seem to be:

Using the LinkedIn Profile Organizer I found that 907 people are members of both groups. This is not as many as I would have guessed.

With Data Quality the largest groups seem to be:

Using the LinkedIn Profile Organizer I found that 189 people are members of both groups. This is not as many as I would have guessed, despite the renaming of the last group.

As for Data Matching, I have founded the Data Matching group. The group has 235 members, of whom:

  • 77 are also members of the two large Master Data Management groups.
  • 80 are also members of the two large Data Quality groups.

This, too, is not as many as I would have guessed.
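Overlap counts like these come down to a set intersection over member lists. The sketch below assumes each group's members have been exported as a list of profile identifiers; the function names and the sample identifiers are made up for illustration and are not output from the LinkedIn Profile Organizer.

```python
def overlap(group_a, group_b):
    """Return the members that appear in both groups."""
    return set(group_a) & set(group_b)


def overlap_share(group_a, group_b):
    """Share of the smaller group's members who are also in the other group."""
    smaller = min(len(set(group_a)), len(set(group_b)))
    if smaller == 0:
        return 0.0
    return len(overlap(group_a, group_b)) / smaller


# Hypothetical member identifiers, for illustration only.
mdm_group = ["anna", "bo", "carl", "dora", "erik"]
dq_group = ["carl", "dora", "finn"]

print(sorted(overlap(mdm_group, dq_group)))
print(overlap_share(mdm_group, dq_group))
```

Dividing by the smaller group gives a feel for how much of that group is "covered" by the larger one, which is the comparison behind a remark like "not as many as I would have guessed".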

You may find many other similar groups on my LinkedIn profile – among them:

When to cleanse in Data Migration?

About a year ago I posted a question on DataMigrationPro about when the best time is to execute data cleansing in a migration project. Is it:

  • Before the migration?
  • During the migration?
  • After the migration?

In a recent excellent blog post, Dalton Cervo explains some of the considerations around this question in a particular MDM migration project.

As I am going to prepare a speech that includes this subject, I will be very pleased to receive additional considerations on this matter. Please comment here or on DataMigrationPro.

Follow Friday Master Data Hub

Social Networking needs Master Data Management.

A recurring event every Friday on Twitter is #FollowFriday, abbreviated #FF, where people on Twitter tweet about who to follow.

I do it too and, like everyone else, I sometimes forget someone, and then (s)he gets angry and doesn’t #FF me, and that’s bad. Bad Data Management. Bad #mdm.

So now I have started building a Master Data Hub fit for the purpose of doing consistent #FF. I do see other purposes for this as well, as I recognize the advantages of combining data sources, so I did a #datamatching with LinkedIn connections to improve #dataquality through Identity Resolution.

This is as far as I have come now (very convenient that WordPress lets me edit my blog posts):

@ReferenceData where http://www.linkedin.com/pub/carla-mangado/11/467/239 is Staff Writer

@KenOConnorData is http://www.linkedin.com/in/kenoconnor00

@ocdqblog is a blog where http://www.linkedin.com/in/jimharris is blogger-in-chief

@dataqualitypro is a community founded by http://www.linkedin.com/in/dylanjones

Dylan was a @Datanomic partner where @SteveTuck is http://www.linkedin.com/in/stevetuck

@InitiateSystems has a CTO = @wmmarty who is http://www.linkedin.com/pub/marty-moseley/0/57/43b

@VishAgashe is http://www.linkedin.com/in/vishagashe

@KeithMesser is http://www.linkedin.com/in/keithmesser running @GlobalMktgPros

@fionamacd is at @TrilliumSW as seen here http://www.linkedin.com/in/fionamacd

So is @stevesarsfield being http://www.linkedin.com/pub/steve-sarsfield/2/675/47a

Trillium is owned by Harte-Hanks where @MarkGoloboy also was http://www.linkedin.com/in/markgoloboy

@biknowledgebase is operated by http://www.linkedin.com/in/barryharmsen

@Dataexperts has a managing director who is http://www.linkedin.com/pub/gary-holland/1/101/135

@IDResolution (Infoglide) has several Data Matching members in http://www.linkedin.com/groups?gid=2107798 including http://www.linkedin.com/in/dougwood

@rdrijsen is http://www.linkedin.com/in/rdrijsen with possible duplicate http://www.linkedin.com/pub/resa-drijsen/1/389/58

@grahamrhind is http://www.linkedin.com/in/grahamrhind

@omathurin is http://www.linkedin.com/in/oliviermathurin

@zzubbuzz is probably http://www.linkedin.com/pub/charles-proctor/14/591/31

@CharlesBurleigh is http://www.linkedin.com/in/charlesburleigh

@wesharp is http://www.linkedin.com/in/williamesharp doing @dqchronicle

@decisionstats has an editor being http://www.linkedin.com/in/ajayohri

@jeric40 is my colleague at Omikron as shown here http://www.linkedin.com/in/janerikingvaldsen
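The matching step behind a list like this can be sketched as a simple string comparison between a Twitter handle and the last part of a LinkedIn profile URL. This is only an illustration under stated assumptions: the function names and the 0.6 threshold are made up, it only works for `/in/<name>` style URLs (not the `/pub/...` style), and a real identity resolution would compare far more attributes than names.

```python
from difflib import SequenceMatcher


def slug(url):
    """Last path segment of a profile URL, lower-cased."""
    return url.rstrip("/").rsplit("/", 1)[-1].lower()


def similarity(handle, url):
    """Similarity between 0 and 1 of a Twitter handle and a profile URL slug."""
    name = handle.lstrip("@").lower()
    return SequenceMatcher(None, name, slug(url)).ratio()


def best_match(handle, urls, threshold=0.6):
    """Return the most similar URL, or None if nothing reaches the threshold."""
    score, url = max((similarity(handle, u), u) for u in urls)
    return url if score >= threshold else None


profiles = [
    "http://www.linkedin.com/in/grahamrhind",
    "http://www.linkedin.com/in/oliviermathurin",
    "http://www.linkedin.com/in/vishagashe",
]

for handle in ["@grahamrhind", "@omathurin", "@zzubbuzz"]:
    print(handle, "->", best_match(handle, profiles))
```

Handles that bear no resemblance to the person's name fall below the threshold and stay unmatched, which is where manual review ("is probably") comes in.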

Alignment of business and IT

You may become a Data Quality professional by coming from either the business side or the technology side of practice. But more important, in my eyes, is the question of whether you have made serious attempts at, and succeeded in, understanding the side where you didn’t start.

Many blog posts about the data quality conundrum discuss the role of the business side versus the role of the technology side, and the two sides are given various weights in different contexts. It should not surprise a Data Quality professional that there is no absolutely true or absolutely false simple answer to such a question. Fortunately, I find that most discussions, when they are held, end up with the “peace on earth” sentiment:

  • Of course it’s the business requirements striving for business value that govern any initiative using technology in order to improve business performance
  • Of course the emergence (or discovery) of new technology may change the way you arrange business processes in order to gain competitive business performance

From that point of view I am looking forward to continued discussions of all the important issues around data and information quality improvement and prevention, such as, but not limited to:

  • What is the business value of better information quality
  • How to gather business requirements related to information quality in order to make data fit for purpose(s)
  • Who is needed to accomplish the data quality improvement tasks – probably people from business, IT and all those mixed ones (credit: Jim Harris of OCDQblog)
  • When is the data quality technology so mature that it will cope with issues in a way not seen before
  • Which different kinds of methodologies and techniques are best for different sorts of data quality challenges
  • Where on earth are the answers to all these questions

Data Quality Milestones

I have a page on this blog with the heading “Data Quality 2.0”. The page is about what, in my opinion, the near future will bring in the data quality industry. In recent days there have been some comments on the topic. My current summing up on the subject is this:

Data Quality X.X are merely maturity milestones where:

Data Quality 0.0 may be seen as a Laissez-faire state where nothing is done.

Data Quality 1.0 may be seen as projects for improving downstream data quality, typically using batch cleansing with nationally oriented techniques in order to make data fit for purpose.

Data Quality 2.0 may be seen as agile implementation of enterprise-wide and small business data quality upstream prevention, using multi-cultural combined techniques exploiting cloud-based reference data in order to maintain data fit for multiple purposes.

Data Quality and Common Sense

My favourite story is the fairytale “The Emperor’s new clothes” by Hans Christian Andersen.

In this tale an emperor hires two swindlers (aka consultants) who offer him the finest dress made from the most beautiful cloth. This cloth, they tell him, is invisible to anyone who is either stupid or unfit for his position. In fact there is no cloth at all, but no one (except, at the end, a little child) dares to say so.

The Data Quality discipline is tormented by belonging to both the business side and the technology side of practice. This means that we have to live with the buzzwords and the smartness coming from both the management consultants and the technology consultants and vendors – including myself.

So you really have to believe in a lot of things and terms said in order not to look stupid or unfit for your position.

A way to cope with this is to look behind all the fine terms and recognize that most things said and presented are just another way of expressing common sense. Some examples:

Business Process: What you do at work – e.g. selling some stuff and putting data about it into a database so it’s ready for invoicing.

Referential Integrity Error: When you sold something not in the database. You may pick another item from the current list.

Bad Change Management: When someone tells you to do it in another way. Now.

Organisational Resistance: When you find that way completely ridiculous because no one tells you why.

Fuzzy logic: This is about the common nature of most questions in life. Statements are not absolutely true or absolutely false but somewhere in between depending on the angle from where you observe.

Business Intelligence: When someone puts your data along with some other data into a new context visualised in a graph in order to replace human gut feeling.

Poor Enterprise Wide Data Quality: The invoicing went well. The decision made from the graph didn’t. 

Data Governance: Meetings and documents about what went wrong with the data and how we can do better.

My experience is that the most successful data quality improvements are made when they are guided by common sense and expressed as such. From there you may find great inspiration and practical skills and tools in each area of expertise.