The last couple of days I have been part of a so-called Innovation Camp on how to exploit open public sector data in the private sector. In one of the inspirational keynotes, Professor Birgitte Andersen of the Big Innovation Centre used the term “A Digital Sharing Revolution” to describe the trend of increasingly sharing data within the public sector, between the public and private sectors, and within the private sector.
During the two days a lot of ideas for how to exploit open public sector data within the private sector were put on the table. I was lucky enough to win a SmartWatch as part of the group with the winning concept: a service for identifying buildings with potential for energy-saving improvements. This service will benefit large enterprises such as building material manufacturers (and indeed energy suppliers), local small and midsize businesses, house owners, and society as a whole in fulfilling climate change prevention goals.
At iDQ we see great potential in using such a service in conjunction with our current offerings for exploiting both open public sector data and other external big reference data sources. Of course, there is a dilemma for enterprises in the private sector in using the same data provided by the same services as their competitors. However, there are still plenty of possibilities for standing out from the crowd in how data and services are actually used in the way of doing business, concentrating on that rather than reinventing the wheel in the way data is collected.
It is spring in Europe, and the good news in Europe this week is that from December next year we will finally see the end of paying exorbitant fees for mobile data access outside WiFi when in another EU country, as told by the BBC here. As a person travelling a lot between EU countries this is, though years too late, fantastic news.
Being too late was unfortunately also the case examined in the article Sale of postcodes data was a ‘mistake’ say Committee – in News from UK Parliament. When the UK Royal Mail was privatised last year, the address directory, known as the PAF file, was part of the deal. It would have been a substantially better deal for society as a whole if the address data had been set free. This calculation is backed up by figures from experiences in Denmark, as reported in the post The Value of Free Address Data.
Next week I’m looking forward to being part of an innovation camp arranged by the Danish authorities as a step in an initiative to exploit open public sector data in the private sector. Here public data owners, IT students, enterprise data consumers, and IT tool and service vendors, including iDQ A/S, will meet openly and challenge each other in developing the most powerful ideas for new ways to create valuable knowledge based on open public sector data.
Exploiting external data is an essential part of party master data management as told in the post Third-Party Data and MDM.
External data supports data quality improvement and defect prevention for party master data by:
- Ensuring accuracy of party master data entities, ideally at the point of entry, but sometimes also by later data enrichment
- Exploring relationships between master data entities and thereby enhancing the completeness of party master data
- Keeping up the timeliness of party master data by absorbing external events into master data repositories
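As a minimal sketch of the first of these points, point-of-entry verification and enrichment against an external reference source might look like this. The reference data, field names and function are purely illustrative, not any particular product or directory:

```python
# Hypothetical sketch: validating and enriching a party record at the point
# of entry against an external reference source (e.g. an address directory).
# The reference data below is illustrative only.

REFERENCE_ADDRESSES = {
    "10 Downing Street, London": {"postcode": "SW1A 2AA", "country": "GB"},
    "Town Hall Square 1, Copenhagen": {"postcode": "1550", "country": "DK"},
}

def enrich_at_entry(party):
    """Return the party record enriched from reference data, flagging
    records whose address is unknown for later manual review."""
    record = dict(party)
    match = REFERENCE_ADDRESSES.get(record.get("address", ""))
    if match:
        record.update(match)                 # enrichment: add postcode, country
        record["address_verified"] = True
    else:
        record["address_verified"] = False   # candidate for data stewardship
    return record

clean = enrich_at_entry({"name": "Jane Doe", "address": "10 Downing Street, London"})
dirty = enrich_at_entry({"name": "John Doe", "address": "Nowhere 99"})
```

The same lookup could of course run as later batch enrichment; doing it at entry simply prevents the defect from landing in the repository at all.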
External events around party master data include, for example, relocation and death. Updating with some of these events may be done automatically, while other events require manual intervention.
Right now I’m working with data stewardship functionality in the instant Data Quality MDM Edition, where the relocation event, the deceased event, and other important events in party master data life-cycle management are supported as part of an MDM service.
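A minimal sketch of how such events might be absorbed, assuming a simple policy where relocations are trusted and deceased notifications go to a data steward. All names and rules here are illustrative, not how any actual MDM service works:

```python
# Hypothetical sketch: routing external party life-cycle events into a
# master data repository. Which events are safe to apply automatically is
# a policy decision; the sets below are illustrative only.

AUTO_APPLY = {"relocation"}      # e.g. trusted national address updates
MANUAL_REVIEW = {"deceased"}     # sensitive events go to a data steward

def route_event(event, repository, review_queue):
    """Apply trusted events directly; queue sensitive ones for stewardship."""
    party_id = event["party_id"]
    if event["type"] in AUTO_APPLY:
        repository.setdefault(party_id, {}).update(event["payload"])
        return "applied"
    review_queue.append(event)   # manual intervention required
    return "queued"

repo, queue = {}, []
status1 = route_event(
    {"party_id": 1, "type": "relocation", "payload": {"address": "New Street 2"}},
    repo, queue)
status2 = route_event(
    {"party_id": 2, "type": "deceased", "payload": {"deceased": True}},
    repo, queue)
```

The interesting design question is where to draw the line between the two sets, which depends on how much you trust each external source.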
Today it has been announced that the European Union will regulate the use of the term “big data”.
“The volume of misuse of the term big data has gone way over what is acceptable,” says an EU spokesperson. Therefore the Commission will initiate a snap roadmap for legislation so that every use of the term big data has to be approved by the authorities beforehand.
A variety of ways to declare that your use of the term big data has been approved will be put into force for the different languages used within the Union. So far France has announced that “big data appellation d’originalité contrôlée” will be used there.
Velocity is the word that best describes the planned process for clamping down on the misuse of the term big data. As soon as 2020, every member state must have started the legislation process, and no later than 2025 the rules must be implemented in national laws. However, there is a great deal of skepticism about whether things can move that fast.
When I changed my laptop a few months ago, it was the easiest migration to a new computer ever.
Basically I just had to connect to all the cloud services I had been using before, and for many services the path was to connect to Google+, Twitter and Facebook and then connect to many other services via these connections.
This was a personal win.
Most of the teams I am working with share their data with me in the cloud. Unlike in the bad old days, I do not have to call and ask for progress on this and that. I can check the status myself and even get notifications on my phablet when a colleague completes a task.
This is a shared win.
Within my profession, data quality improvement and Master Data Management (MDM), sharing data is going to be a winning path too, as told in the post Sharing is the Future of MDM.
There are several ways of sharing master data like using commercial third party data, digging into open government data, having your own data locker and relying on social collaboration. These options are examined in the post Ways of Sharing Master Data.
Identity resolution is a hot potato when we look into how we can exploit big data and, within that frame, not least social data.
Some of the most frequently mentioned use cases for big data analytics revolve around listening to social data streams and combining them with traditional sources within customer intelligence. In order to do that, we need to know who is talking out there, and that must be done by using identity resolution features encompassing social networks.
The first challenge is what we are able to do: how we can technically expand our data matching capabilities to use profile data and other clues from social media. This subject was discussed in a recent post on DataQualityPro called How to Exploit Big Data and Maintain Data Quality, an interview with Dave Borean of InfoTrellis. Here the InfoTrellis “contextual entity resolution” approach was mentioned by Dave.
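As a toy illustration of this technical side, and emphatically not the InfoTrellis approach itself, a simple similarity score between a social profile and known customer records could be built with the standard library. The customer data, field names and threshold are hypothetical:

```python
# Hypothetical sketch: scoring how well a social profile matches known
# customer records using simple string similarity (difflib, standard
# library). Real contextual entity resolution uses far richer signals.
from difflib import SequenceMatcher

CUSTOMERS = [
    {"id": 1, "name": "Henrik Sorensen", "city": "Copenhagen"},
    {"id": 2, "name": "Jane Smith", "city": "London"},
]

def best_match(profile, customers, threshold=0.6):
    """Return (customer, score) for the best candidate, or (None, score)."""
    def score(customer):
        name_sim = SequenceMatcher(None, profile["name"].lower(),
                                   customer["name"].lower()).ratio()
        # A small bonus when the stated location agrees with the city on file.
        city_bonus = 0.1 if profile.get("location", "").lower() == customer["city"].lower() else 0.0
        return name_sim + city_bonus
    candidate = max(customers, key=score)
    s = score(candidate)
    return (candidate, s) if s >= threshold else (None, s)

match, s = best_match({"name": "Henrik R. Sorensen", "location": "Copenhagen"}, CUSTOMERS)
```

Even this toy version shows why thresholds matter: set too low, you merge strangers; set too high, you miss the customer who is actually talking about you.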
The second challenge is what we are allowed to do. Social networks have a natural interest in protecting members’ privacy, and they also have a commercial interest in doing so. The degree of privacy protection varies between social networks. Twitter is quite open but on the other hand holds very little usable material for identity resolution, and making sense of the streams is an issue. Networks such as Facebook and LinkedIn are, for good reasons, not so easy to exploit due to the (changing) game rules applied.
As said in my interview on DataQualityPro called What are the Benefits of Social MDM: It is a kind of a goldmine in a minefield.
Traditionally, data governance has been about the people and process side of data management. However, we now see tools marketed as data governance tools, either as a pure-play tool for data governance or as part of a wider data management suite, as told in the post Who needs a data governance tool?
The post refers to a report by Sunil Soares. In this report data governance tools are seen as tools related to six areas within enterprise data management: Data discovery, data quality, business glossary, metadata, information policy management and reference data management.
While IBM has tools for everything, according to the report it does not seem like a single tool cures it all – yet.
But will we go there? If we need tools at all, do we need an all-cure snake oil tool for data governance? Or will we be better off with different lubricants for data discovery, data quality, business glossary, metadata, information policy management and reference data management?