This post is part of a good-natured contest (i.e., a blog-bout) with two other bloggers: Charles Blyth and Jim Harris. Our contest is a Blogging Olympics of sorts, with Great Britain, the United States and Denmark competing for the Gold, Silver, and Bronze medals in an event we are calling “Three Single Versions of a Shared Version of the Truth.”
Please take the time to read all three posts and then vote for who you think has won the debate (see poll below). Thanks!
According to Wikipedia, data may be of high quality in two alternative ways:
- Either they are fit for their intended uses
- Or they correctly represent the real-world construct to which they refer
In my eyes the term “single version of the truth” relates best to the real-world definition of high-quality data, while “shared version of the truth” relates best to the hard work of making data fit for multiple intended uses of shared data in the enterprise.
My thesis is that, as more and more purposes are included, there is a break-even point beyond which it is less cumbersome to reflect the real-world object than to try to align all known purposes.
The map analogy
In search of this truth we will go on a little journey around the world.
For a journey we need a map.
Traditionally we have the challenge that the real world, the planet Earth, is round (3 dimensions) but a map shows a flat world (2 dimensions). If a map shows a limited part of the world the difference doesn’t matter that much. This is similar to fitting the purpose of use in a single business unit.
If the map shows the whole world we may have all kinds of different projections offering different kinds of views on the world, each with its advantages and disadvantages. A classic world map is the rectangle where Alaska, Canada, Greenland, Svalbard, Siberia and Antarctica are presented much larger than in the real world when compared to regions closer to the equator. This is similar to the problems in fulfilling multiple uses embracing all business units in an enterprise.
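To put a rough number on that distortion, here is a minimal sketch. It assumes the classic rectangle is the Mercator projection (the post does not name the projection), where lengths are stretched by 1 / cos(latitude) in both directions, so areas are inflated by the square of that factor. The function name and the sample latitudes are illustrative and approximate.

```python
import math


def mercator_area_inflation(latitude_deg: float) -> float:
    """Factor by which a Mercator map inflates areas at a given latitude.

    Assumption: the map is a Mercator projection. It stretches both the
    east-west and the north-south direction by 1 / cos(latitude), so the
    area factor is 1 / cos(latitude) squared.
    """
    return 1.0 / math.cos(math.radians(latitude_deg)) ** 2


# Approximate latitudes, chosen only for illustration.
samples = [
    ("Equator", 0.0),
    ("Copenhagen", 55.7),
    ("Central Greenland", 72.0),
]

for place, latitude in samples:
    factor = mercator_area_inflation(latitude)
    print(f"{place:18s} area shown about {factor:.1f} times too large")
```

The single business unit is the map of a small region, where the factor is nearly constant and can be ignored. The enterprise is the world map, where every region lives with a different factor.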
Today we have new technology coming to the rescue. If you go into Google Earth the world indeed looks round, and you may have any high-altitude view of an apparently round world. If you go closer the map tends to be more and more flat. My guess is that the solutions to the multiple-uses conundrum will be offered from the cloud.
Exploiting rich external reference data
But Google Earth offers more than powerful technology. The maps are connected with rich information on places, streets, companies and so on obtained from multiple sources – and also some crowdsourced photos not always placed with accuracy. Even if external reference data is not “the truth”, this data, if used by more and more users (one instance, multiple tenants), will tend to be closer to “the truth” than any data collected and maintained solely in a single enterprise.
Shared data makes fit-for-purpose information
You may divide the data held by an enterprise into 3 pots:
- Global data that is not unique to operations in your enterprise but shared with other enterprises in the same industry (e.g. product reference data) and eventually the whole world (e.g. business partner data and location data). Here “shared data in the cloud” will make your “single version of the truth” easier and closer to the real world.
- Bilateral data concerning business partner transactions and related master data. If, for example, you buy a spare part, then also “share the describing data”, making your “single version of the truth” easier and more accurate.
- Private data that is unique to operations in your enterprise. This may be a “single version of the truth” that you find superior to what others have found, data supporting internal business rules that make your company more competitive and data referring to internal events.
While private data, followed by bilateral data, makes up the largest amount of data held by an enterprise, it is often the data that could be global that has the most obvious data quality issues, such as duplicated, missing, incorrect and outdated party master data.
Here “a global or bilateral shared version of the truth” helps in approaching “a single version of the truth” to be shared in your enterprise. This way accurate raw data may be consumed as valuable information in a given context as soon as it is needed.
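As a minimal sketch of how a shared version of the truth could help with duplicated and outdated party master data, the example below groups internal records by a public identifier and takes the name from an external reference source. Everything in it is an assumption made for illustration: the `PartyRecord` fields, the `consolidate` function, the format of the public IDs and the company names are made up and do not refer to any real registry or product.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class PartyRecord:
    local_id: str               # key used inside the enterprise
    public_id: Optional[str]    # hypothetical shared identifier, may be missing
    name: str                   # name as entered locally, possibly outdated


def consolidate(
    records: List[PartyRecord], reference: Dict[str, str]
) -> Tuple[Dict[str, dict], List[PartyRecord]]:
    """Group local records by public ID and take the name from the reference.

    `reference` stands in for shared external reference data and maps a
    public ID to the currently registered name. Records without a known
    public ID are returned separately for manual review.
    """
    golden: Dict[str, dict] = {}
    unmatched: List[PartyRecord] = []
    for record in records:
        if record.public_id is None or record.public_id not in reference:
            unmatched.append(record)
            continue
        entry = golden.setdefault(
            record.public_id,
            {"name": reference[record.public_id], "local_ids": []},
        )
        entry["local_ids"].append(record.local_id)
    return golden, unmatched


# Made-up sample data: two local records describe the same party.
reference_data = {"DK-0001": "Example Spare Parts A/S"}
local_records = [
    PartyRecord("C-17", "DK-0001", "Example Spareparts"),
    PartyRecord("C-42", "DK-0001", "Example Spare Parts AS"),
    PartyRecord("C-99", None, "Unknown Trading"),
]

golden_records, needs_review = consolidate(local_records, reference_data)
print(golden_records)
print(needs_review)
```

The two differently spelled records end up under one public ID with the reference name, while the record without a public ID is set aside rather than guessed at. Matching on names alone would be a different and harder technique than the one shown here.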
Call to action
If you have not done so already, please take the time to read the posts from fellow bloggers Charles Blyth and Jim Harris and then vote for who you think has won the debate. A link to the same poll is provided on all three blogs. Therefore, wherever you choose to cast your vote, you will be able to view an accurate tally of the current totals.
The poll will remain open for one week, closing at midnight on 19th November so that the “medal ceremony” can be conducted via Twitter on Friday, 20th November. Additionally, please share your thoughts and perspectives on this debate by posting a comment below. Your comment may be copied (with full attribution) into the comments section of all of the blogs involved in this debate.
Excellent entry in the debate, Henrik,
Your map analogy is remarkable, especially the clever cloud computing pun. I also like your definition of external reference data as (my paraphrasing) a “crowdsourced version of the truth.”
Since external reference data is both essential to the enterprise and not under its control, a “shared version of the truth” is an unavoidable reality.
I also really like how you have framed a “single version of the truth” to be defined as (again, my paraphrasing) the enterprise’s competitive differentiation and advantage – in other words, the truth that unites the entire organization as a single enterprise for collaborative success.
I will honestly admit that I voted for you.
Thanks a lot Jim. Your paraphrasing is unique.
Great view on this shared topic. As I have said, I’m really enjoying the different takes that we have come up with.
As I mentioned on Jim’s post, I think there is further discussion required on his definition of “subjective information quality standards”, my “contextual single version of the truth” and your “making data fit for multiple intended uses of shared data in the enterprise”. I see a lot of similarities here worth discussing.
Well done on getting Jim’s vote, I would have voted for you if I had not voted for myself already 🙂
Thanks Charles, I am certain we will keep on debating for years to come. I will hold my vote until the 19th – actually votes, I have several computers at my disposal 😉
Great post Henrik, really enjoying the bout.
I think the key for me here is accessibility and maintainability. The reason crowdsourcing works particularly well in the cloud etc. is because we’re all connected to a single fact that we can update. Google hold the master, and you’re right, the more people who use the service and update then the better chance we have of creating an accurate source of truth.
We see this a lot with event reporting and Twitter now. Stories can break quickly but they’re not always true. It’s only when a “cloud” of similar messages comes through that we can start to validate the results. In the past we would simply wait for a trusted agent, i.e. a reporter, to report truthfully on the facts.
This is the danger of crowdsourcing: it can be manipulated, but I agree it has merits.
However, the biggest challenge of sharing is providing a simple form of maintainability. Often shared data has the accessibility factor but lacks the maintainability aspect so shared actually means distributed inconsistent data.
Just providing an alternative viewpoint, my vote is firmly under wraps so I’m not necessarily disagreeing with you!
Thanks Dylan. I agree about maintainability. Having Master Data stored with public IDs, so that it is maintainable and easily deduplicated and matched, is very widespread in Scandinavia. Much of the work I do is bringing Master Data from a stage without maintainability to a stage with maintainability.
Henrik, Charles and Jim,
Thanks for a particularly interesting and enjoyable event. This topic will surely generate useful debate for some time. Here’s what occurred to me as I read your entries:
What if we changed “version” to “vision”? Would we be more comfortable with a “single vision of the truth” than we are with a “single version”? In this context, “vision” implies that, whether or not there exists an absolute truth, we acknowledge that we’re human, and that we succeed together to the extent that we agree on what’s true.
Great comment Dean!
A “Single Vision of the Truth?”
I definitely like that as a term more than a “Single Version of the Truth.”
Thanks for contributing to the discussion.
Thanks Dean and Jim.
Agree with Jim – great input Dean.
Anyone for “shared vision of the truth”?
As I mentioned, your comment has got me thinking, which is always a dangerous thing …
My only thought is that the word ‘vision’ could imply a future state or desired state, as in “I have a vision”.
As I said, you’ve got me thinking!
Thanks for joining in the discussion.