Data Quality tools mainly support you with automation of Data Profiling and Data Matching.
There has to be something called “Data Quality 2.0”. The term is already being floated here and there. My thoughts about what it includes are as follows:
Exploiting rich external reference data
Using external reference data in data quality improvement is not new at all. But we will see a lot of new sources in the cloud and governments around the world are planning to release huge amounts of public sector data. More on this in the post: Government says so.
Sharing data is key to a single version of the truth. The rise of social networking is also a new and exciting source of shared data.
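As a minimal sketch of what exploiting external reference data can look like in practice, the snippet below fills gaps in an internal record from an external registry. The registry contents and the `company_no` key are hypothetical stand-ins for e.g. a public sector business register:

```python
# Hypothetical external reference data, e.g. a public business registry.
reference_registry = {
    "DK12345678": {"name": "Example ApS", "city": "Copenhagen"},
    "GB87654321": {"name": "Example Ltd", "city": "London"},
}

def enrich(record: dict) -> dict:
    """Fill missing fields from the reference registry, keyed on company number.

    Existing values in the record are kept; only gaps are filled.
    """
    official = reference_registry.get(record.get("company_no"))
    if official:
        for field, value in official.items():
            record.setdefault(field, value)
    return record

row = {"company_no": "DK12345678", "name": "Example"}
print(enrich(row))
```

Note the design choice: the external source fills gaps but never overwrites values the organisation already holds, which keeps the internal data the system of record.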
While working with data quality tools and services for many years I have found that many tools and services are very national. So you might discover that a tool or service will work wonders with data from one country, but be quite ordinary or in fact useless with data from another country.
As globalization moves forward this challenge becomes more and more important. Enterprises tend to standardize worldwide on tools and services, shared service centres take care of data covering many countries, and so on. When an employee works with data from another country, he often wrongly applies his local standards to that data and thereby challenges the data quality more than seen before.
It will be a must that data quality tools of the future work globally.
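A tool that works globally has to dispatch validation to country-specific rules rather than silently apply one country's standards everywhere. The sketch below illustrates the idea with postal codes; the regex patterns are deliberately simplified assumptions, not complete national standards:

```python
import re

# Simplified, illustrative postal code rules per country (not exhaustive).
POSTAL_CODE_RULES = {
    "DK": re.compile(r"^\d{4}$"),                              # Denmark: 4 digits
    "GB": re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$"),   # UK: outward + inward code
    "US": re.compile(r"^\d{5}(-\d{4})?$"),                     # US: ZIP or ZIP+4
}

def postal_code_valid(country: str, code: str) -> bool:
    """Validate a postal code against the rule for its own country."""
    rule = POSTAL_CODE_RULES.get(country)
    if rule is None:
        # Unknown country: pass through rather than wrongly reject.
        return True
    return bool(rule.fullmatch(code.strip().upper()))
```

The point is the dispatch on country, not the patterns themselves: a Danish code like `2100` is valid in Denmark but would be wrongly rejected if UK rules were applied to it.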
In my opinion the rise of Service Oriented Architecture is a golden opportunity to finally get benefits from data quality tools that we haven’t been able to achieve with the technology and approaches seen until now, as described in the post: Service Oriented Data Quality.
One of the really great advantages of SOA components is support for multi-purpose information quality as discussed in: Fit for what purpose?. This is where Data Quality 2.0 meets MDM 2.0. One of the often forgotten features here is upstream prevention by error-tolerant search.
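Upstream prevention by error-tolerant search means catching a likely duplicate at data entry, before it is inserted. A minimal sketch using the similarity matching in Python's standard `difflib` module (the party names and the 0.8 threshold are assumptions for illustration):

```python
import difflib

# Existing party names already in the master data store (illustrative).
existing_names = ["Acme Corporation", "Globex Inc", "Initech"]

def close_matches(candidate: str, threshold: float = 0.8) -> list:
    """Return existing names similar enough to be likely duplicates.

    Called at data entry time, so a near-match can be shown to the user
    before a duplicate record is ever created.
    """
    return difflib.get_close_matches(candidate, existing_names,
                                     n=3, cutoff=threshold)

print(close_matches("Acme Corpration"))  # misspelling still finds the existing record
```

In an SOA setup this check would sit in a shared service, so every consuming application gets the same duplicate prevention for free.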
Small business support
The real world is crowded with organisations where bureaucratic data owner, steward and custodian hierarchies, comprehensive data governance policies and excessive technology implementations make no sense (and yield no ROI).
Nevertheless many modest organisations do store – and in my experience also use – huge amounts of data. So we need more agile methods and tools to cover the needs of these organisations. Check this post: The Statue of Liberty versus The Little Mermaid.
Human like technology
User-friendly, configurable software that is able to think and make decisions like – or better than – humans is becoming more and more realistic, also in the field of data quality. Add to that the classic advantages of software compared to humans: speed and repeatability. In the end technology always prevails – and we will build some beautiful data stores around it.