This Monday I took part in a tweetjam organized by the open source data integration vendor Talend.
One of the questions discussed was: Are free and open sources of reference data becoming more important in your projects?
When talking about “free and open”, not least in the open source realm, we can’t avoid talking about “free for a fee”. Some sources of open data, like Geonames, are free as in “free beer”. Other data comes with a fee. In my home country, Denmark, we have had some discussions about why the government likes to put a fee on data that is collected on a mandatory basis, and I have observed similar considerations in our close neighbor Sweden. (By the way: the picture of a bridge that Talend uses a lot, such as at the top of the home page here, looks like the bridge between Denmark and Sweden.)
One challenge I have met over and over again when using free (maybe for a fee) and open data in data integration and data quality improvement is the cost of conformity. With open government data there may, apart from the pricing, be many differences between countries in formats, coverage and so on. I think there is great potential in delivering conformed data from many different sources for specific purposes.
I agree, Henrik. Free (or open) reference data is the future of data integration and data quality. We’re seeing just the start of it now.
It’s interesting to me that the lack of conformity is somewhat based on culture. Something to ponder is whether the differences between data from Denmark and Sweden are a reflection of those cultures. Cultural nuances are some of the most difficult data quality issues to reconcile, unless you have specific knowledge of the culture.
It’s comforting that there will be plenty of work for us to do in building conformity. 🙂
Thanks, Steve. True, but I am often amazed at how governments of otherwise close cultures have reached very different solutions to the same issue.