In here, the authors emphasizes on the need to adapt traditional IT functions to the opportunities and challenges of emerging technologies that embraces business ecosystems. I fully support that sentiment.
In my eyes, some of the emerging technologies we see are in large misunderstood as something meant for being behind the corporate walls. My favorite example is the data lake concept. I do not think a data lake will be an often seen success solely within a single company as explained in the post Data Lakes in Business Ecosystems.
The raise of technology for business ecosystems will also affect the data management roles we know today. For example, a data steward will be a lot more focused towards external data than before as elaborated in the post The Future of Data Stewardship.
Encompassing business ecosystems in data management is of course a huge challenge we have to face while most enterprises still have not reached an acceptable maturity when it comes internal data and information governance. However, letting the outside in will also help in getting data and information right as told in the post Data Sharing Is The Answer To A Single Version Of The Truth.
Whether the data lake concept is a good idea or not is discussed very intensively in the data management social media community.
The fear, and actual observations made, is that that a data lake will become a data dump. No one knows what is in there, where it came from, who is going to clean up the mess and eventually have a grip on how it should be handled in the future – if there is a future for the data lake concept.
Please folks. We have some concepts from the small data world that we must apply. Here are three of the important ones:
In short, metadata is data about data. Even though the great thing about a data lake is that the structure and all purposes of the data does not have to be cut in stone beforehand, at least all data that is delivered to a data lake must be described. An example of such an implementation is examined in the post Sharing Metadata.
You must also have the means to tag who delivered the data. If your data lake is within a business ecosystem, this should include the legal entity that has provided the data as told in the post Using a Business Entity Identifier from Day One.
Above all, you must have a framework to govern ownership (Responsibility, Accountability, Consultancy and who must be Informed), policies and standards and other stuff we know from a data governance framework. If the data lake expand across organizations by incorporating second party and third party data, we need a cross company data governance framework as for example highlighted on Product Data Lake Documentation and Data Governance.
Within Product Information Management (PIM) there is a growing awareness about that sharing product information between trading partners is a very important issue.
So, how do we do that? We could do that, on a global scale, by using:
2,345,678 customer data portals
901,234 supplier data portals
Spreadsheets is the most common mean to exchange product information between trading partners today. The typical scenario is that a receiver of product information, being a downstream distributor, retailer or large end user, will have a spreadsheet for each product group that is sent to be filled by each supplier each time a new range of products is to be on-boarded (and potentially each time you need a new piece of information). As a provider of product information, being a manufacturer or upstream distributor, you will receive a different spreadsheet to be filled from each trading partner each time you are to deliver a new range of products (and potentially each time they need a new piece of information).
Customer data portals is a concept a provider of product information may have, plan to have or dream about. The idea is that each downstream trading partner can go to your customer data portal, structured in your way and format, when they need product information from you. Your trading partner will then only have to deal with your customer data portal – and the 1,234 other customer data portals in their supplier range.
Supplier data portals is a concept a receiver of product information may have, plan to have or dream about. The idea is that each upstream trading partner can go to your supplier data portal, structured in your way and format, when they have to deliver product information to you. Your trading partner will then only have to deal with your supplier data portal – and the 567 other supplier data portals in their business-to-business customer range.
Product Data Lake is the sound alternative to the above options. Hailstorms of spreadsheets does not work. If everyone has either a passive customer data portal or a passive supplier data portal, no one will exchange anything. The solution is that you as a provider of product information will push your data in your structure and format into Product Data Lake each time you have a new product or a new piece of product information. As a receiver you will set up pull requests, that will give you data in your structure and format each time you have a new range of products, need a new piece of information or each time your trading partner has a new piece of information.
Since then the use of the term blockchain has been used more and more in general and related to Master Data Management (MDM). As you know, we love new fancy terms in our else boring industry.
However, there are good reasons to consider using the blockchain approach when it comes to master data. A blockchain approach can be coined as centralized consensus, which can be seen as opposite to centralized registry. After the MDM discipline has been around for more than a decade, most practitioners agree that the single source of truth is not practically achievable within a given organization of a certain size. Moreover, in the age of business ecosystems, it will be even harder to achieve that between trading partners.
This way of thinking is at the backbone of the MDM venture called Product Data Lake I’m working with right now. Yes, we love buzzwords. As if cloud computing, social network thinking, big data architecture and preparing for Internet of Things wasn’t enough, we can add blockchain approach as a predicate too.
In Product Data Lake this approach is used to establish consensus about the information and digital assets related to a given product and each instance of that product (physical asset or thing) where it makes sense. If you are interested in how that develops, why not follow Product Data Lake on LinkedIn.
One of the most promising aspects of digitalization is sharing information in business ecosystems. In the Master Data Management (MDM) realm, we will in my eyes see a dramatic increase in sharing product information between trading partners as touched in the post Data Quality 3.0 as a stepping-stone on the path to Industry 4.0.
Standardization (or standardisation)
A challenge in doing that is how we link the different ways of handling product information within each organization in business ecosystems. While everyone agrees that a common standard is the best answer we must on the other hand accept, that using a common standard for every kind of product and every piece of information needed is quite utopic. We haven’t even a common uniquely spelled term in English.
Also, we must foresee that one organization will mature in a different pace than another organisation in the same business ecosystem.
Product Data Lake
These observations are the reasons behind the launch of Product Data Lake. In Product Data Lake we encompass the use of (in prioritized order):
The same standard in the same version
The same standard in different versions
In order to link the product information and the formats and structures at two trading partners, we support the following approaches:
Ambassadorship, which is a role taken by a product information professional, who collaborates with the upstream and downstream trading partner in linking the product information. Read more about becoming a Product Data Lake ambassador here.
Upstream responsibility. Here the upstream trading partner makes the linking in Product Data Lake.
Downstream responsibility. Here the downstream trading partner makes the linking in Product Data Lake.
Regardless of the mix of the above approaches, you will need a cross company data governance framework to control the standards used and the rules that applies to the exchange of product information with your trading partners. Product Data Lake have established a partnership with one of the most recommended authorities in data governance: Nicola Askham – the Data Governance Coach.
In short, metadata is data about data. Handling metadata is an important facet of data management including in data governance, data quality management and Master Data Management (MDM). When it comes to the new trends in data management as big data and handling data in data lakes, the importance of metadata management will in my eyes become even more obvious.
In a current venture (Product Data Lake) we are working on building in metadata management for business ecosystems, meaning that trading partners can share product information either using the same metadata or linking their different metadata.
Using international, national and industry standards for product information will be the perfect solution within business ecosystem sharing of metadata and indeed this is the preferred option we support. However, there are many competing standards for product information and they come in developing versions, so having everyone on the same page at the same time is quite utopic.
Add to that everyone do not speak English – and even not the same variant of English. Metadata originates and should exist in the languages that is used in trading partnerships.
Product attributes can be tagged with an attribute type telling about what standard (if any) in terms of product identification, product classification or product feature it adheres to. More about that in the post Connecting Product Information.
Attribute short and long descriptions can be represented in different languages.
Trading partners can link their product attributes and have visibility in the Product Data Lake of the standards and descriptions used in the different languages they exist.
I will very much welcome your input to this quest and if you want to be involved please do not hesitate to be in touch with me here or on Xing, Viadeo or LinkedIn.
One of my current engagements is within jewelry – or is it jewellery? The use of these two respectively US English and British English words is a constant data quality issue, when we try to standardize – or is it standardise? – to a common set of reference data and a business glossary within an international organization – or is it organisation?
Looking for international standards often does not solve the case. For example, a shop that sells this kind of bijouterie, may be classified with a SIC code being “Jewelry store” or a NACE code being “Retail sale of watches and jewellery in specialised stores”.
A pearl is a popular gemstone. Natural pearls, meaning they have occurred spontaneously in the wild, are very rare. Instead, most are farmed in fresh water and therefore by regulation used in many countries must be referred to as cultured freshwater pearls.
My pearls of wisdom respectively cultured freshwater pearls of wisdom for building a business glossary and finding the common accepted wording for reference data to be used within your company will be:
Start looking at international standards and pick what makes sense for your organization. If you can live with only that, you are lucky.
If not, grow the rest of the content for your business glossary and reference data by imitating the international or national standards for your industry, and use your own better wording and additions that makes the most sense across your company.
And oh, I know that pearls of wisdom are often used to imply the opposite of wisdom 🙂