There are several maturity models related to data quality out there. I have found a good collection in this document from NASCIO.
I guess the mother of all maturity models is the Capability Maturity Model (CMM), which originated in software development.
There is also a parody of that model called the Capability Immaturity Model (CIMM). Inspired by yesterday's article by Jill Dyché on Information Management, called Anti-Predictions for 2011, I have found that the CIMM is easily adapted into a data quality immaturity model, with levels from zero down to minus three, as follows:
0 : Negligent
The organization pays lip service, often with excessive fanfare, to implementing data quality processes, but lacks the will to carry through the necessary effort. Whereas level 1 assumes eventual success in producing and measuring quality data, level 0 organizations generally have no idea how horrible the quality of their data assets actually is.
-1 : Obstructive
Processes, however inappropriate and ineffective, are implemented with rigor and tend to obstruct work. Adherence to process is the measure of success in a level -1 organization. Any actual creation of quality data is incidental. The quality of any data is not assessed, presumably on the assumption that if the proper process was followed, high quality data is guaranteed.
-2 : Contemptuous
While processes exist, they are routinely ignored by the staff and those charged with overseeing the processes are regarded with hostility. Measurements are fudged to make the organization look good.
-3 : Undermining
Not content with faking their own performance, undermining departments within the organization routinely work to downplay and sabotage the efforts of rival departments. This is worst where company policy causes departments to compete for scarce resources, which are allocated to the loudest advocates.
A recurring subject for me, and for many others, is talking and writing about people, processes and technology: which element is most important, in what sequence they must be addressed and, my main concern, how they must be aligned.
As we practically always refer to the three elements in the same order, being people, processes and technology, there is certainly an implicit sequence.
If we look at maturity models related to data quality we will recognize that order too.
At the low maturity levels people are the most important aspect: they need the first and most attention, and they are the main enablers for starting to move up the levels.
Then at the middle levels processes are the main concern, as business process reengineering enables moving up the levels.
At the top levels, implemented technology is a main component in the description of what it means to be there.
An example of the growing role of technology is (not surprisingly of course) in the data governance maturity model from the data quality tool vendor DataFlux.
One thing is sure though: You can’t move your organization from the low level to the high level by buying a lot of technology.
It is an evolutionary journey. The technology part comes naturally, step by step, as technology takes over more and more of the work done by people, whether trivial or extremely complex, and becomes an increasingly integrated and automated part of the business processes.
I guess every data and information quality professional agrees that, when fighting bad data, upstream prevention is better than downstream cleansing.
Nevertheless, most work in fighting bad data quality is done as downstream cleansing, and not least the deployment of data quality tools takes place downstream, where tools outperform manual work in heavy-duty data profiling and data matching, as explained in the post Data Quality Tools Revealed.
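To give an idea of the heavy-duty work where tools beat manual effort, here is a minimal sketch of data profiling and data matching. All record values, function names and the similarity threshold are my own illustrative assumptions, not taken from any particular tool:

```python
from difflib import SequenceMatcher

# Illustrative customer records with typical quality problems:
# near-duplicate names and inconsistently formatted postal codes.
records = [
    {"name": "Acme Corporation", "postal_code": "1000"},
    {"name": "ACME Corp.", "postal_code": "DK-1000"},
    {"name": "Globex Inc", "postal_code": "90210"},
]

def normalize(name: str) -> str:
    """Crude normalization: lowercase, strip punctuation and legal forms."""
    cleaned = name.lower().replace(".", "").replace(",", "")
    for legal_form in ("corporation", "corp", "inc"):
        cleaned = cleaned.replace(legal_form, "")
    return " ".join(cleaned.split())

def match_candidates(records, threshold=0.8):
    """Data matching: flag record pairs whose normalized names are similar."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            a = normalize(records[i]["name"])
            b = normalize(records[j]["name"])
            score = SequenceMatcher(None, a, b).ratio()
            if score >= threshold:
                pairs.append((i, j, round(score, 2)))
    return pairs

def profile_formats(records, field):
    """Data profiling: summarize the value patterns found in a field."""
    patterns = {}
    for record in records:
        pattern = "".join("A" if c.isalpha() else "9" if c.isdigit() else c
                          for c in record[field])
        patterns[pattern] = patterns.get(pattern, 0) + 1
    return patterns
```

Even on three records the profile reveals three different postal code formats, and the matcher flags the two Acme variants as a duplicate candidate; real tools do the same kind of thing at a scale and precision no manual effort can match.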
In my experience the top 5 reasons for doing downstream cleansing are:
1) Upstream prevention wasn’t done
This is an obvious one. At the time you decide to do something about bad data quality the right way, by finding the root causes, improving business processes, affecting people's attitudes, building a data quality firewall and all that jazz, you still have to do something about the bad data already in the databases.
2) New purposes show up
Data quality is said to be about data being fit for purpose and meeting the business requirements. But new purposes will show up and new requirements will have to be met in an ever-changing business environment. Therefore you will have to deal with Unpredictable Inaccuracy.
3) Dealing with external born data
Upstream isn’t necessarily in your company, as data in many cases is entered Outside Your Jurisdiction.
4) A merger/acquisition strikes
When data from two organizations with different requirements and different levels of data governance maturity are to be merged, something has to be done. Some of the challenges are explained in the post Merging Customer Master Data.
5) Migration happens
Moving data from an old system to a new system is a good chance to do something about poor data quality and start all over the right way; oftentimes you can’t even migrate some data without improving the data quality. You only have to figure out when to cleanse in data migration.
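The upstream prevention mentioned in reason 1, a data quality firewall, can at its simplest be sketched as a set of validation rules applied at the point of entry, so bad records never reach the database in the first place. All field names and rules below are illustrative assumptions:

```python
import re

# Illustrative validation rules for a data quality firewall:
# each rule rejects a record at the point of entry instead of
# letting bad data in for later downstream cleansing.
RULES = {
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "postal_code": lambda v: v.strip() != "",
    "name": lambda v: len(v.strip()) >= 2,
}

def firewall_check(record: dict) -> list:
    """Return the list of violated rules; an empty list means the record may pass."""
    violations = []
    for field, rule in RULES.items():
        value = record.get(field, "")
        if not rule(value):
            violations.append(field)
    return violations
```

A record like `{"email": "not-an-email", "postal_code": "", "name": "A"}` would be rejected with all three violations, while a clean record passes with an empty list. Real firewalls would of course add reference data lookups and deduplication against existing records, but the principle is the same.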
I have a page on this blog with the heading “Data Quality 2.0”. The page is about what, in my opinion, the near future will bring in the data quality industry. In recent days there have been some comments on the topic. My current summing up on the subject is this:
The Data Quality X.X versions are merely maturity milestones, where:
Data Quality 0.0 may be seen as a Laissez-faire state where nothing is done.
Data Quality 1.0 may be seen as projects for improving downstream data quality, typically using batch cleansing with nationally oriented techniques, in order to make data fit for purpose.
Data Quality 2.0 may be seen as agile implementation of enterprise-wide and small-business upstream data quality prevention, using multi-cultural combined techniques and exploiting cloud-based reference data, in order to maintain data fit for multiple purposes.