The Product Data Lake is a cloud service for sharing product data in the eco-systems of manufacturers, distributors, retailers and end users of product information.
As an upstream provider of product data, being a manufacturer or upstream distributor, you have these requirements:
- When you introduce new products to the market, you want to make the related product data and digital assets available to your downstream partners in a uniform way
- When you win a new downstream partner you want the means to immediately and professionally provide product data and digital assets for the agreed range
- When you add new products to an existing agreement with a downstream partner, you want to be able to provide product data and digital assets instantly and effortlessly
- When you update your product data and related digital assets, you want a fast and seamless way of pushing it to your downstream partners
- When you introduce a new product data attribute or digital asset type, you want a fast and seamless way of pushing it to your downstream partners.
The Product Data Lake facilitates these requirements by letting you push your product data into the lake in your in-house structure, which may or may not be fully or partly compliant with an international standard.
As an upstream provider, you may want to push product data and digital assets from several different internal sources.
The Product Data Lake tackles this requirement by letting you operate several upload profiles.
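As a rough illustration only, pushing data from several internal sources through separate upload profiles could look something like the sketch below. The names `UploadProfile`, `Lake` and `push`, as well as the sample attributes, are hypothetical and are not the actual Product Data Lake interface.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UploadProfile:
    """Hypothetical: one internal source (for example ERP or DAM) of a provider."""
    provider: str
    source: str


@dataclass
class Lake:
    """Hypothetical in-memory stand-in for the lake: records are kept as pushed."""
    records: list = field(default_factory=list)

    def push(self, profile: UploadProfile, record: dict) -> None:
        # No transformation on the way in: the in-house attribute names are kept.
        self.records.append({"profile": profile, "data": dict(record)})


lake = Lake()
erp = UploadProfile(provider="Acme", source="ERP")
dam = UploadProfile(provider="Acme", source="DAM")

# The same product described by two internal sources, each in its own structure.
lake.push(erp, {"ItemNo": "A-100", "Wgt_kg": 1.2})
lake.push(dam, {"ItemNo": "A-100", "image_url": "https://example.com/a-100.jpg"})
```

The point of the sketch is that each source keeps its own structure and is only tagged with the profile it came from.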
As a downstream receiver of product data, being a downstream distributor, retailer or end user, you have these requirements:
- When you engage with a new upstream partner, you want the means to quickly and seamlessly link and transform product data and digital assets for the agreed range from the upstream partner
- When you add new products to an existing agreement with an upstream partner, you want to be able to link and transform product data and digital assets in a fast and seamless way
- When your upstream partners update their product data and related digital assets, you want to be able to receive the updated product data and digital assets instantly and effortlessly
- When you introduce a new product data attribute or digital asset type, you want a fast and seamless way of pulling it from your upstream partners
- If you have a backlog of product data and digital asset collection with your upstream partners, you want a fast and cost-effective approach to backfill the gap.
The Product Data Lake facilitates these requirements by letting you pull your product data from the lake in your in-house structure, which may or may not be fully or partly compliant with an international standard.
In the Product Data Lake, you can take the role of being an upstream provider and a downstream receiver at the same time by being a midstream subscriber to the Product Data Lake. Thus, Product Data Lake covers the whole supply chain from manufacturing to retail and even the requirements of B2B (Business-to-Business) end users.
The Product Data Lake applies the data lake concept from big data: the transformation and linking of data between many structures is done when the data is to be consumed for the first time. The goal is a workload that resembles an iceberg, where 10 % of the ice is above water and 90 % is below. In the Product Data Lake, manually setting up the links and transformation rules should be 10 % of the work, and the remaining 90 % will be automated in the exchange zones between trading partners.
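To make the idea of transforming at consumption time a bit more concrete, here is a minimal sketch. It assumes a simple dictionary of link and transformation rules; the attribute names, the `rules` layout and the `pull` function are invented for illustration and say nothing about how the service is actually built.

```python
# Records as pushed by an upstream provider, in its own structure (made-up data).
upstream_records = [
    {"ItemNo": "A-100", "Wgt_kg": 1.2, "Colour": "red"},
    {"ItemNo": "A-200", "Wgt_kg": 0.4, "Colour": "blue"},
]

# Link and transformation rules, set up once by a trading partner (the manual part).
# Each downstream attribute maps to an upstream attribute plus a converter.
rules = {
    "sku": ("ItemNo", str),
    "weight_grams": ("Wgt_kg", lambda kg: round(kg * 1000)),
    "color": ("Colour", str.upper),
}


def pull(records, rules):
    """Apply the rules when data is read; the stored records stay untouched."""
    return [
        {target: convert(record[source]) for target, (source, convert) in rules.items()}
        for record in records
    ]


# The automated part: every later pull reuses the same rules.
downstream_view = pull(upstream_records, rules)
```

Once the rules exist, new and updated products from the same partner flow through without further manual work, which is the part of the iceberg below the water.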
Hi Lilian, what’s the role of the GS1 Standards and the GDSN? Isn’t it redundant with this concept, as an upstream provider should be able to publish his data many times directly through the GDSN? I am impatiently waiting for the MR3 to see if it’s going to improve the efficiency of the GS1 standards and the GDSN. Best regards. Naadim
Thanks a lot for commenting, Naadim. Yes, the Product Data Lake shares the same vision as GDSN. The difference is the same as the difference between a data warehouse and a data lake. In a data warehouse, you put a lot of effort into building the right structure before loading data into the data warehouse. The downside of this approach is pace. I think it is fair to say that the penetration of GDSN has been slow. In the data lake you load the data in the structure it has and then do the transformation and linking when it is needed for consumption of the data – perhaps in different ways for different purposes. The Product Data Lake uses the same concept and is thus able to serve many more trading partners with a higher volume of data, in a far greater variety of industries, at a much higher velocity – to use the famous three Vs of big data.
Thanks for your clarification, which makes sense, as I see this concept being developed through “new” SaaS solutions providing data, mainly and often, to the ecommerce channel of retailers or to pure ecommerce players (GDSN could be the cherry on the cake, sometimes). I also see that upstream providers face the problem of the governance of the attribute (different vision and understanding of the attribute by the retailers). So they need to rework their data for the same attribute according to the channel or the retailer… to be followed.
Hey, sorry I’m slow getting to this. As someone who works for a retailer and presumably would be a downstream consumer, I have a couple questions:
1. Who would be responsible for understanding the data models, lineage in import mappings?
2. Do you see this as an extension of B2B EDI communications or more as an extension of catalogue systems?
I’m also curious if you’re putting this forward as an idea or if it’s a business concept that you’re building?
Hi Andy. Thanks for joining in. Comments and questions are welcome at any time in my blog post life cycle 🙂
1) The concept is built on partnerships between trading partners and the mapping can be done by either the upstream provider or the downstream receiver. To help with that there will be some sub concepts:
A) Ambassadors who are system integrators or other Product Information Management professionals who are super users of the Product Data Lake helping their clients with connecting with trading partners.
B) Cooperation with and support for the various general and industry specific product classification organisations. This helps where trading partners follow the same classification system.
C) A cross company data governance framework with specific artefacts for sharing product data.
2) It is mostly an extension to catalogue systems but with the same kind of operation as EDI communication.
This started as an idea after working with several organisations, all with huge challenges around sharing product data with trading partners. Then it developed into a business concept and now finally, after seeing very slow momentum on this in the tool and service offering market, I will launch a generally available service for everyone in August 2016.