1,000 Blog Posts and More to Come

I just realized that this post will be number 1,000 published on this blog. So, rather than saying something new, let me recap a little of what it has all been about in the nearly 10 years of running a blog on some nerdy stuff.

Data quality has been the main theme. When writing about data quality, you cannot avoid touching Master Data Management (MDM). In fact, the most applied category on this site, with 464 entries and counting, is Master Data.

The second most applied category on this blog is, with 219 entries, Data Architecture.

The most applied data quality activity around is data matching. As this is also where I started my data quality venture, there have been 192 posts about Data Matching.

The newest category relates to Product Information Management (PIM) and is, with 20 posts at the moment, about Product Data Syndication.

Even though data quality is a serious subject, you must not forget to have fun. 66 posts, including a yearly April Fools post, have been categorized as Supposed to be a Joke.

Thanks to all who read this blog, and not least to all who from time to time take the time to comment, like and share.

B2C vs B2B in Product Information Management

The difference between doing Business-to-Consumer (B2C) and Business-to-Business (B2B) business shows in many IT-enabled disciplines.

When it comes to Product Information Management (PIM), this is true as well. As PIM has become essential with the rise of eCommerce, some of the differences are inherited from the eCommerce discipline. The topic is discussed in a post on the Shopify blog by Ross Simmonds called B2B vs B2C Ecommerce: What’s The Difference?

Some significant observations to take into the PIM realm are that for B2B, compared to B2C:

  • The audience is (on average) narrower
  • The price is (on average) higher
  • The decision process is (on average) more thoughtful

How these circumstances affect the difference for PIM was exemplified here on the blog in the post Work Clothes versus Fashion: A Product Information Perspective.

To sum up the differences, I would say that some of the technology you need, for example PIM solutions, is basically the same, but the data going into these solutions must be more elaborate and stringent for B2B. This means that for B2B, compared to B2C, you (on average) need:

  • More complete and more consistent attributes (specifications, features, properties) for each product and these should be more tailored to each product group.
  • More complete and consistent product relations (accessories, replacements, spare parts) for each product.
  • More complete and consistent digital assets (images, line drawings, certificates) for each product.
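These completeness requirements can be monitored with simple data quality checks. Below is a minimal Python sketch; the product groups and required attribute names are hypothetical examples of my own, not taken from any standard:

```python
# A minimal sketch of an attribute-completeness check for product records.
# The per-product-group attribute requirements below are made-up examples.

REQUIRED_ATTRIBUTES = {
    "work_clothes": ["material", "size", "safety_class", "wash_instructions"],
    "power_tools": ["voltage", "weight_kg", "noise_level_db"],
}

def completeness(product: dict, group: str) -> float:
    """Share of the group's required attributes that are actually filled."""
    required = REQUIRED_ATTRIBUTES[group]
    filled = sum(1 for attr in required if product.get(attr) not in (None, ""))
    return filled / len(required)

jacket = {"material": "cotton", "size": "L", "safety_class": None}
print(round(completeness(jacket, "work_clothes"), 2))  # 0.5
```

A B2B catalog would typically demand a completeness score near 1.0 per product group, while a B2C fashion catalog may get by with less.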

How to achieve that involves deep collaboration in the supply chains of manufacturers, distributors and merchants. The solutions for that were examined in the post The Long Tail of Product Data Synchronization.

The Long Tail of Product Data Synchronization

When discussing Product Data Lake with peers and interested parties, some alternatives are often brought up, namely EDI and GDSN.

So, what is the difference between those services and Product Data Lake?

Electronic Data Interchange (EDI) is the concept of businesses electronically communicating information that was traditionally communicated on paper, such as purchase orders and invoices. EDI also has a product catalog functionality encompassing:

  • Seller name and contact information
  • Terms of sale information, including discounts available
  • Item identification and description
  • Item physical details including type of packaging
  • Item pricing information including quantity and unit of measure
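As an illustration only, the catalog fields listed above could be modelled as a typed record like this; the field names are my own and are not taken from any specific EDI message standard:

```python
# A sketch of the EDI product catalog fields above as a typed record.
# Field names are illustrative, not from any particular EDI standard.
from dataclasses import dataclass

@dataclass
class CatalogItem:
    seller_name: str
    seller_contact: str
    terms_of_sale: str          # including available discounts
    item_id: str                # e.g. a GTIN
    description: str
    packaging_type: str         # physical details
    price: float
    quantity: int
    unit_of_measure: str

item = CatalogItem(
    seller_name="Acme Tools",
    seller_contact="sales@acme.example",
    terms_of_sale="2% discount, net 30",
    item_id="05701234567890",
    description="Cordless drill, 18 V",
    packaging_type="carton",
    price=129.95,
    quantity=1,
    unit_of_measure="EA",
)
```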

The Global Data Synchronization Network (GDSN) is an internet-based, interconnected network of interoperable data pools and a global registry, the GS1 Global Registry, which enables companies around the globe to exchange standardised and synchronised supply chain data with their trading partners using a standardised Global Product Classification (GPC).

This service focuses on retail, healthcare, food-service and transport / logistics. In some geographies, GS1 is also targeting DIY (do-it-yourself) building materials and tools for consumers.

Product Data Lake is a cloud service for sharing product information (product data syndication) in the business ecosystems of manufacturers, distributors / wholesalers, merchants, marketplaces and large end users of product information.

Our vision is that Product Data Lake will be the process-driven key service for exchanging any sort of product information within business ecosystems all over the world, with the aim of optimally assisting self-service purchasing, both B2C and B2B, of every kind of product.

In that way, Product Data Lake is the long tail of product data synchronization supplementing EDI and GDSN for a long range of product groups, product attributes, digital assets, product relationships and product classification systems:

Find out more in the Product Data Lake Overview.

Are You Still Scared by the Data Lake?

4 years ago, a post on this blog was called The Scary Data Lake. The post was about the fear that the then new data lake concept would lead to data swamps with horrific data quality, data dumps no one would ever use, data cesspools full of badly governed data and data sumps that would never be part of the business processes.

For sure, there have been mistakes with data lakes. But it seems that the data lake concept has matured and that the understanding of what a data lake is good for is increasing. The data lake concept has even grown out of the analytical world and into more operational cases, as told in the post Welcome to Another Data Lake for Data Sharing.

One of the things we have learned is to apply well-known data management principles to data lakes too. This encompasses metadata management, data lineage capabilities and data governance, as reported in the post Three Must Haves for your Data Lake.
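To make the lineage point concrete, here is a toy Python sketch of wrapping a record with minimal lineage metadata before landing it in a lake; the field names are illustrative and not from any particular product:

```python
# A toy sketch of attaching lineage metadata when landing a record in a
# data lake; field names are illustrative examples only.
import datetime

def land_record(payload: dict, source_system: str) -> dict:
    """Wrap a raw payload with minimal lineage metadata before storage."""
    return {
        "data": payload,
        "lineage": {
            "source": source_system,
            "ingested_at": datetime.datetime.now(
                datetime.timezone.utc
            ).isoformat(),
        },
    }

record = land_record({"sku": "P-100", "name": "Cordless drill"},
                     source_system="erp")
print(record["lineage"]["source"])  # erp
```

Keeping such metadata next to the raw data is what lets a lake stay navigable instead of turning into a swamp.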


How Wholesalers and Dealers of Building Materials can Improve Product Information Efficiency

Building materials is a very diverse product group. As a wholesaler or dealer, you will have to manage many different attributes and digital assets depending on which product classification we are talking about.

Getting this data from a diverse crowd of suppliers is a hard job. You may have a spreadsheet for each product group where you require data from your suppliers, but this means a lot of follow-up and work in putting the data into your system. You may have a supplier portal, but suppliers are probably reluctant to use it, because they cannot deal with hundreds of different supplier portals from you and all the other wholesalers and dealers, possibly across many countries. In the same way, you are not happy if you must fetch data from hundreds of different customer portals provided by manufacturers and other brand owners.

This also means that even if you can handle the logistics, you must limit your regular assortment of products and therefore often deal with special ad hoc products when they are needed to form the complete range of products asked for by your customers for a given building project. Handling “specials” is a huge burden, and the data gathering must usually be repeated if the product turns up again.

At Product Data Lake we have developed a solution to these challenges. It is a cloud service where your suppliers can provide product information in their way and you can pull the information in the way that fits your taxonomy, structure and format.

Learn about and follow the solution on our Product Data Pull LinkedIn page.

If you are interested, please get in contact for more information.

Linked Product Data Quality

Some years ago the theme of Linked Data Quality was examined here on the blog.

As stated in the post a lot of product data is already out there waiting to be found, categorized, matched and linked.

Doing this is at the core of the Product Data Lake venture I am involved with. What we aim to do is link product information stored using different taxonomies at trading partners, preferably by referencing international and industry standards such as eCl@ss, ETIM, UNSPSC, Harmonized System, GPC and more.

Our approach is not to reinvent the wheel, but to collaborate with partners in the industry. These include:

  • Experts within a type of product such as building materials and sub-sectors of that industry, machinery, chemicals, automotive, furniture and home-ware, electronics, work clothes, fashion, books and other printed materials, food and beverage, pharmaceuticals and medical devices. You may also be a specialist in certain standards for product data. As an ambassador, you will link the taxonomies in use at two trading partners or within a larger business ecosystem.
  • Product data cleansing specialists who have proven track records in optimizing product master data and product information. As an ambassador you will prepare the product data portfolio at a trading partner and extend the service to other trading partners or within a larger business ecosystem.
  • System integrators who can integrate product data syndication flows into Product Information Management (PIM) and other solutions at trading partners and consult on the surrounding data quality and data governance issues. As an ambassador, you will enable the digital flow of product information between two trading partners or within a larger business ecosystem.
  • Tool vendors who can offer in-house Product Information Management (PIM) / Master Data Management (MDM) solutions or similar solutions in the ERP and Supply Chain Management (SCM) sphere. As an ambassador, you will be able to provide, supplement or replace customer data portals at manufacturers and supplier data portals at merchants, and thus offer truly automated and interactive product data syndication functionality.
  • Technology providers with data governance solutions, data quality management solutions and Artificial Intelligence (AI) / machine learning capacities for classifying and linking product information to support the activities made by ambassadors and subscribers.
  • Reservoirs, as Product Data Lake is a unique opportunity for service providers with product data portfolios (data pools and data portals) to utilize modern data management technology and offer a comprehensive way of collecting and distributing product data within the business processes used by subscribers.

See more on the Product Data Link site, on the Product Data Link showcase page on LinkedIn or get in contact right away.

Become a Product Data Lake ambassador!


Welcome to Another Data Lake for Data Sharing

A couple of weeks ago Microsoft, Adobe and SAP announced their Open Data Initiative. While this, as far as we know, is only a statement for now, it has of course attracted some interest because three giants in the IT industry have agreed on something, mostly interpreted as agreeing to oppose Salesforce.com.

Forming a business ecosystem among players in the market is not new. However, what we usually see is that a group of companies agrees on a standard and then each of them puts a product or service that adheres to that standard on the market. The standard then caters for the interoperability between the products and services.

In this case it seems to be something different. The product or service is operated by Microsoft on their Azure platform. There will be some form of a common data model. But it is a data lake, meaning that we should expect that data can be provided in any structure and format and that data can be consumed in any structure and format.

In all humbleness, this concept is the same as the one that is behind Product Data Lake.

The Open Data Initiative from Microsoft, Adobe and SAP focuses on customer data and seems to be about enterprise-wide customer data. While it could technically also support ecosystem-wide customer data, privacy concerns and compliance issues will restrict that scope in many cases.

At Product Data Lake, we do the same for product data. Only here, the scope is business ecosystem wide as the big pain with product data is the flow between trading partners as examined here.


Digitalization has Put Data in the Forefront

20 years ago, when I started working as a contractor and entrepreneur in the data management space, data was not on the top agenda at many enterprises. Fortunately, that has changed.

An example is displayed by Schneider Electric CEO Jean-Pascal Tricoire in his recent blog post on how digitization and data can enable companies to be more sustainable. You can read it on the Schneider Electric Blog in the post 3 Myths About Sustainability and Business.

Manufacturers in the building material sector naturally emphasize sustainability. In his post, Jean-Pascal Tricoire says: “The digital revolution helps answering several of the major sustainability challenges, dispelling some of the lingering myths regarding sustainability and business growth”.

One of three myths dispelled is: Sustainability data is still too costly and time-consuming to manage.

From my work with Master Data Management (MDM) and Product Information Management (PIM) at manufacturers and merchants in the building material sector I know that managing the basic product data, trading data and customer self-service ready product data is hard enough. Taking on sustainability data will only make that harder. So, we need to be smarter in our product data management. Smart and sustainable homes and smart sustainable cities need smart product data management.

In his post Jean-Pascal Tricoire mentions that Schneider Electric has worked with other enterprises in their ecosystem in order to be smarter about product data related to sustainability. In my eyes the business ecosystem theme is key in the product data smartness quest as pondered in the post about How Manufacturers of Building Materials Can Improve Product Information Efficiency.


It is time to apply AI to MDM and PIM

The intersection between Artificial Intelligence (AI) and Master Data Management (MDM) – and the associated discipline Product Information Management (PIM) – is an emerging topic.

A use case close to me

In my work setting up a service called Product Data Lake, the inclusion of AI has become an important topic. The aim of this service is to translate between the different taxonomies in use at trading partners, for example when a manufacturer shares its product information with a merchant.

In some cases the manufacturer, the provider of product information, may use the same standard for product information as the merchant. This may be deep standards such as eCl@ss and ETIM, or pure product classification standards such as UNSPSC. In this case we can apply deterministic matching of the classifications and the attributes (also called properties or features).
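As a minimal illustration of such deterministic matching, assuming a made-up product catalog, UNSPSC-style codes and a hypothetical merchant taxonomy (none of the codes or names below are real):

```python
# A minimal sketch of deterministic matching when both trading partners
# reference the same classification standard. All codes are made up.

supplier_products = {"P-100": {"unspsc": "27112800"}}  # supplier's catalog
merchant_taxonomy = {"27112800": "Power drills"}       # merchant's categories

def match_deterministic(products: dict, taxonomy: dict) -> dict:
    """Link products to merchant categories by exact classification code."""
    return {
        pid: taxonomy[attrs["unspsc"]]
        for pid, attrs in products.items()
        if attrs["unspsc"] in taxonomy
    }

print(match_deterministic(supplier_products, merchant_taxonomy))
# {'P-100': 'Power drills'}
```

When the codes match exactly, no human judgment is needed; the hard cases begin where the codes diverge.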


However, most often there are uncovered areas even when two trading partners share the same standard. And then again, the most frequent situation is that the two trading partners are using different standards.

In that case, we will initially use humans to do the linking. Our data governance framework for that includes upstream (manufacturer) responsibility, downstream (merchant) responsibility and our ambassador concept.

As always, too much human interaction is costly, time-consuming and error-prone. Therefore, we are eagerly training our machines to do this work in a cost-effective way, within a much shorter time frame and with a repeatable and consistent outcome, to the benefit of the participating manufacturers, merchants and other enterprises involved in exchanging products and the related product information.
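To hint at what such machine assistance might look like, here is a deliberately simple Python sketch that links attribute names by string similarity using only the standard library; the attribute names are invented, and a production service would use trained models rather than difflib:

```python
# A toy sketch of assisting taxonomy linking with string similarity.
# Attribute names are invented; real linking would use trained models.
from difflib import SequenceMatcher

manufacturer_attrs = ["Colour", "Net weight", "Voltage rating"]
merchant_attrs = ["color", "weight_net", "rated_voltage"]

def best_match(attr: str, candidates: list) -> str:
    """Return the candidate attribute name most similar to attr."""
    def norm(s: str) -> str:
        return s.lower().replace("_", " ")
    return max(
        candidates,
        key=lambda c: SequenceMatcher(None, norm(attr), norm(c)).ratio(),
    )

for a in manufacturer_attrs:
    print(a, "->", best_match(a, merchant_attrs))
```

Suggestions like these would still be confirmed by the upstream and downstream responsible parties, so the repeatable machine work and the human governance complement each other.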

Learning from others

This week I participated in a workshop on exchanging experiences and proving use cases for AI and MDM. The above-mentioned use case was one of several examined there. And for sure, there is a basis for applying AI with substantial benefits for the enterprises that get this. The workshop was arranged by Camelot Management Consultants within their Global Community for Artificial Intelligence in MDM.

Share or be left out of business

Enterprises are increasingly going to be part of business ecosystems where collaboration between legal entities not belonging to the same company family tree will be the norm.

This trend is driven by digital transformation, as no enterprise can possibly master all the disciplines needed in applying a digital platform to traditional ways of doing business.

Enterprises are basically selfish. This is also true when it comes to Master Data Management (MDM). Most master data initiatives today revolve around aligning internal silos of master data and the surrounding processes to fit the business objectives of the enterprise as a whole. And that is hard enough.

However, in the future that will not be enough. You must also be able to share master data in the business ecosystems where your enterprise belongs. The enterprises that, in a broad sense, get this first will survive. The laggards are in danger of being left out of business.

This is the raison d’être of Master Data Share.

Master Data Share or be OOB