Working with data governance and data quality can be a very backward-looking quest. It often revolves around avoiding a repeat of a recent data disaster, or catching up on the organizational issues, process orchestration and new technology implementations needed to better support current business objectives with current data types.
This may be hard enough. But you must also be prepared for the future.
The growth of data available to support your business is a challenge today. Your competitors are taking advantage of new data sources and exploiting known data sources better while you sleep. New competitors emerge with business ideas based on new ways of using data.
The approach to including new data sources, data entities, data attributes and digital assets must be part of your data governance framework and data quality capability. If you are not prepared for this, your current data quality will be challenged not only by the decay of current data elements, but also by insufficiently governed new data elements, or by a lack of business agility because you cannot include new data sources and elements in a safe way.
Some essentials for being prepared to include new kinds of data are:
- A living business glossary that facilitates a shared understanding of new data elements within your organization, including how they relate to or replace current data elements.
- Configurable data quality measurement facilities, data profiling functionality and data matching tools, so that onboarding a new data element doesn't require a new data quality project.
- Self-service and automation as the norm for data capture and data consumption. Self-service must be governed both internally within your organization and externally, as explained in the post Data Governance in the Self-Service Age.
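The configurable measurement point in the list above can be sketched in a few lines: instead of writing a new project per data element, new elements are onboarded by registering a declarative rule in a shared registry. This is only an illustration; the field names and rules below are hypothetical, not from any particular tool.

```python
# Minimal sketch of configurable data quality measurement: onboarding a
# new data element means registering one rule, not starting a project.
# All field names and rules are hypothetical examples.
import re

RULES = {
    "customer_id": lambda v: re.fullmatch(r"C\d{6}", str(v)) is not None,
    "email":       lambda v: "@" in str(v),
    "country":     lambda v: v in {"DK", "SE", "NO", "DE"},
}

def profile(records):
    """Return per-field completeness and validity percentages."""
    report = {}
    n = len(records) or 1
    for field, rule in RULES.items():
        present = [r[field] for r in records if r.get(field) not in (None, "")]
        valid = [v for v in present if rule(v)]
        report[field] = {
            "completeness": round(100 * len(present) / n, 1),
            "validity": round(100 * len(valid) / n, 1),
        }
    return report

records = [
    {"customer_id": "C123456", "email": "ann@example.com", "country": "DK"},
    {"customer_id": "X999", "email": None, "country": "SE"},
]
print(profile(records))
```

Adding a fourth data element then only means adding one entry to `RULES`; the profiling and reporting logic stays untouched, which is the kind of agility the bullet point argues for.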
You raise intriguing points, but I feel we need to talk about another facet of data and its future. Of course, the inclusion and rationalization of data sources will certainly help in being prepared for the future, but how do we get ready for the sheer volume of data being produced, which will only increase in the coming days?
In the last few years, we have moved from talking about data in terms of megabytes to petabytes. The scope of data volume itself has shifted drastically. What do you think organizations and individuals can do to 'be prepared' for this?
Another concern has been the incompleteness of data; in other words, not enough data is available that can be qualified as rational data for matching.