Paraphrasing Tim Berners-Lee:
“People may ask what Data Quality 3.0 is. I think what is looking misty on Web 2.0 and Data Quality 2.0 will eventually melt into a semantic Web integrated across a huge space of data where you’ll have access to an unbelievable data resource.”
Another way of putting it is as a micro-manifesto:
“While we value that data are of high quality if they are fit for the intended purpose of use, we value more that data correctly represent the real-world construct to which they refer, in order to be fit for multiple current and future purposes.”
My thesis is that, as more and more purposes are included, there is a break-even point beyond which it is less cumbersome to reflect the real-world object than to try to align all known purposes.
You may divide the data held by an enterprise into three pots:
- Global data that is not unique to operations in your enterprise but is shared with other enterprises in the same industry (e.g. product reference data) and eventually with the whole world (e.g. business partner data and location data). Here “shared data in the cloud” will make your “single version of the truth” easier to reach and closer to the real world.
- Bilateral data concerning business partner transactions and related master data. If, for example, you buy a spare part, you can also “share the describing data”, making your “single version of the truth” easier to reach and more accurate.
- Private data that is unique to operations in your enterprise. This may be a “single version of the truth” that you find superior to what others have found, data supporting internal business rules that make your company more competitive and data referring to internal events.
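As a rough illustration of the three pots above, here is a minimal Python sketch that sorts data sets by who they are shared with. The names `Pot`, `DataSet`, `classify` and the two sharing flags are hypothetical and only serve the example; a real enterprise would need far richer criteria.

```python
from dataclasses import dataclass
from enum import Enum


class Pot(Enum):
    GLOBAL = "global"        # shared across an industry or the whole world
    BILATERAL = "bilateral"  # shared with a specific business partner
    PRIVATE = "private"      # unique to operations in your enterprise


@dataclass
class DataSet:
    name: str
    shared_with_industry: bool = False  # hypothetical flag for this sketch
    shared_with_partner: bool = False   # hypothetical flag for this sketch


def classify(data_set: DataSet) -> Pot:
    """Assign a data set to one pot; the widest sharing scope wins."""
    if data_set.shared_with_industry:
        return Pot.GLOBAL
    if data_set.shared_with_partner:
        return Pot.BILATERAL
    return Pot.PRIVATE


pots = {
    ds.name: classify(ds)
    for ds in [
        DataSet("product reference data", shared_with_industry=True),
        DataSet("spare part description", shared_with_partner=True),
        DataSet("internal business rules"),
    ]
}
```

The point of the sketch is only that the pot follows from the sharing scope of the data, not from the system it happens to be stored in.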
Data Management in the near future will, in my eyes, be closely related to the emerging Web 3.0:
- Business Intelligence – and Data Science – will embrace internal (private) data and external (public) data in the cloud
- Data Warehouses – and Data Lakes – will link internal (private) data and external (public) data in the cloud
- Master Data Management will align internal (private) data with external (public) data in the cloud
- Data Quality Tools will profile internal (private) data and match internal (private) data with external (public) data in the cloud
- Data Governance may be a lot about balancing the use of internal (private) data and external (public) data – and internal and external business rules
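To make the matching of internal (private) data with external (public) data a bit more concrete, here is a minimal sketch using only the Python standard library. The function names, the sample company names and the 0.85 threshold are assumptions made for illustration. A real data quality tool would use far more robust matching and an actual external reference source.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity between two names after normalization, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_to_reference(internal, reference, threshold=0.85):
    """Map each internal name to its best reference entry, or None if below the threshold."""
    matches = {}
    for name in internal:
        best = max(reference, key=lambda ref: similarity(name, ref), default=None)
        if best is not None and similarity(name, best) >= threshold:
            matches[name] = best
        else:
            matches[name] = None  # no confident match; leave for manual review
    return matches


# Illustrative, made-up records: internal (private) names vs. external (public) reference names.
internal_names = ["ACME  Corporation", "Globex Corp", "Unknown Trading"]
reference_names = ["Acme Corporation", "Globex Corporation", "Initech"]

result = match_to_reference(internal_names, reference_names)
```

Records that come back as `None` are the ones where internal data cannot be confidently aligned with the external reference, which is where data governance has to decide how to proceed.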
Learn about some Data Quality 3.0 services here:
- The iDQ(tm) (instant Data Quality) service for sharing big reference data for the benefit of customer and other party master data.
- The Product Data Lake for sharing public and bilateral data within business ecosystems for the benefit of product master data.