Big data and PIM: A match made in space

Product Information Management (PIM) has in recent years emerged as an important technology-enabled discipline for every company taking part in a supply chain. These companies are manufacturers, distributors, retailers and large end users of tangible products, all requiring a drastically increased variety of product data for ecommerce and other self-service based ways of doing business.

At the same time we have seen the rise of big data. Now, if you look at each company on its own, product data handled by PIM platforms perhaps does not count as big data. Sure, the variety is a huge challenge and the reason for being for PIM solutions, as they handle this variety better than traditional Master Data Management (MDM) solutions and ERP solutions.

The variety is about very different requirements in data quality dimensions based on where a given product sits in the product hierarchy. Completeness has to be measured at the concrete levels in the hierarchy, as a given attribute may be mandatory for one product but absolutely ridiculous for another. An example is voltage, which matters for a power tool but not for a hammer. With consistency, there may be attributes with common standards (for example product name), but many attributes will have specific standards for a given branch in the hierarchy.
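A minimal sketch of measuring completeness per branch of the hierarchy could look like this. The category names, attribute names and the `REQUIRED_ATTRIBUTES` table are made up for illustration; a real PIM platform will have its own way of defining mandatory attributes.

```python
# Hypothetical rules: which attributes are mandatory in which branch
# of the product hierarchy. Voltage is required for power tools only.
REQUIRED_ATTRIBUTES = {
    "power tools": {"product name", "voltage", "weight"},
    "hand tools": {"product name", "weight"},
}


def completeness(product: dict) -> float:
    """Return the share of the branch's mandatory attributes that have a value."""
    required = REQUIRED_ATTRIBUTES[product["category"]]
    attributes = product["attributes"]
    filled = {name for name in required if attributes.get(name) not in (None, "")}
    return len(filled) / len(required)


drill = {
    "category": "power tools",
    "attributes": {"product name": "Cordless drill", "voltage": "18 V"},
}
hammer = {
    "category": "hand tools",
    "attributes": {"product name": "Claw hammer", "weight": "450 g"},
}

print(completeness(drill))   # two of three mandatory attributes are filled
print(completeness(hammer))  # no voltage needed, so the hammer is complete
```

The point of the sketch is that the hammer is not penalized for lacking a voltage, while the drill is penalized for lacking a weight.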

Product information also encompasses digital assets: PDF files with product sheets, line drawings and lots of other stuff, product images and an increasing number of videos with installation instructions and other content. The volume is then already quite big.

[Image: "Image coming soon" placeholder] A missing product image is a sign of a broken product data business process.

Volume and velocity really come into play when we look at eco-systems of manufacturers, distributors and retailers. The total flow of product data can then be described with the common characteristics of big data: volume, velocity and variety. Even if you look at it for a given company and its first degree of separation with trading partners, we are talking about big data, with an overwhelming throughput of new product links between trading partners and updates to product information that are, or at least should have been, exchanged.

Within big data we have the concept of a data lake. A key success factor of a data lake solution is minimizing the use of spreadsheets. In the same way, we can use a data lake, sitting in the exchange zone between trading partners, for product information as elaborated further in the post Gravitational Collapse in the PIM Space.

Gravitational Collapse in the PIM Space

The previous post on this blog was called Gravitational Waves in the MDM World. Building further on space science, I would like to use the concept of gravitational collapse, which is the process that happens when a star or other space object is born. In this process, a myriad of smaller objects are gathered into a denser object.

PIM (Product Information Management) is part of the larger MDM (Master Data Management) world. The PIM solutions offered today serve very well the requirements for organizing and supporting the handling of product information inside each organization.

However, there is an instability when observing two trading partners. Today, the most common means of sharing product data is to exchange one or several spreadsheets with product identification and product attributes (sometimes also called properties or features). Such spreadsheets may also contain links to digital assets such as product images, line drawing documents, installation videos and other rich media stuff.

As an upstream provider of product data, being a manufacturer or upstream distributor, you have these requirements:

  • When you introduce new products to the market, you want to make the related product data and digital assets available to your downstream partners in a uniform way
  • When you win a new downstream partner, you want the means to immediately and professionally provide product data and digital assets for the agreed range
  • When you add new products to an existing agreement with a downstream partner, you want to be able to provide product data and digital assets instantly and effortlessly
  • When you update your product data and related digital assets, you want a fast and seamless way of pushing them to your downstream partners
  • When you introduce a new product data attribute or digital asset type, you want a fast and seamless way of pushing it to your downstream partners
  • You may want to push product data and digital assets from several different internal sources

As a downstream receiver of product data, being a downstream distributor, retailer or end user, you have these requirements:

  • When you engage with a new upstream partner, you want the means to quickly and seamlessly link and transform product data and digital assets for the agreed range from the upstream partner
  • When you add new products to an existing agreement with an upstream partner, you want to be able to link and transform product data and digital assets in a fast and seamless way
  • When your upstream partners update their product data and related digital assets, you want to be able to receive the updated product data and digital assets instantly and effortlessly
  • When you choose to use a new product data attribute or digital asset type, you want a fast and seamless way of pulling it from your upstream partners
  • If you have a backlog of product data and digital asset collection with your upstream partners, you want a fast and cost-effective approach to backfill the gap

Fulfilling this by exchanging spreadsheets (and other peer-to-peer solutions) across the eco-system of trading partners is a chaotic mess.

If you look at it from upstream, as a manufacturer or upstream distributor, the challenge is that you probably have hundreds of downstream receivers of product information. Each one requires their own form of spreadsheet or other interface. They may even ask you to use their specific supplier portal, meaning hundreds of different learning exercises on your side.

As a downstream receiver of product information, being a downstream distributor, retailer or end user, you have the opposite challenge. You probably have hundreds of upstream providers. If you go for having your own supplier portal, you need to teach each of your suppliers and you carry the software license and other burdens.

There is a need for a service that sits between the upstream and downstream trading partners. This service should help the upstream trading partners, being manufacturers and upstream distributors, with sharing product data with many different downstream trading partners. It should also eliminate or reduce the downstream trading partners' need for implementing and maintaining supplier portals.
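The arithmetic behind the mess can be sketched in a few lines. This assumes the worst case where every provider trades with every receiver and each pair needs its own spreadsheet format or portal. The numbers and function names are illustrative only.

```python
def peer_to_peer_interfaces(providers: int, receivers: int) -> int:
    """Worst case: one spreadsheet format or portal per provider/receiver pair."""
    return providers * receivers


def hub_interfaces(providers: int, receivers: int) -> int:
    """With a service in between, each partner maintains one connection to it."""
    return providers + receivers


providers, receivers = 200, 300  # illustrative numbers, not taken from the post

print(peer_to_peer_interfaces(providers, receivers))  # 60000
print(hub_interfaces(providers, receivers))           # 500
```

Real eco-systems are sparser than this worst case, but the difference between multiplying and adding is what makes a service in the middle attractive.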

In the end such a service will collapse the doomed galaxy of spreadsheets into an agile, process-driven service for sharing product data, called the Product Data Lake.

The Evolution of MDM

Master Data Management (MDM) is a bit more than 10 years old, as told in the post from last year called Happy 10 Years Birthday MDM Solutions. MDM has developed from the two disciplines called Customer Data Integration (CDI) and Product Information Management (PIM). For example, the MDM Institute was originally called The Customer Data Integration Institute and still has this website: http://www.tcdii.com/.

Today Multi-Domain MDM is about managing customer, or rather party, master data together with product master data and other master data domains as visualized in the post A Master Data Mind Map.

You may argue that PIM (Product Information Management) is not the same as Product MDM. This question was examined in the post PIM, Product MDM and Multi-Domain MDM. In my eyes the benefits of keeping PIM as part of Multi-Domain MDM are bigger than the benefits of separating PIM and MDM. It is about expanding MDM across the sell-side and the buy-side of the business eventually by enabling wide use of customer self-service and supplier self-service.

The external self-service theme will in my eyes be at the centre of where MDM is going in the future. Going down that path will have consequences for how we see data governance, as discussed in the post Data Governance in the Self-Service Age. Another aspect of how MDM is going to be seen from the outside in is the increased use of third party reference data and the link between big data and MDM, as touched on in the post Adding 180 Degrees to MDM.

Besides Multi-Domain MDM and the links between MDM and big data, a much mentioned future trend in MDM is doing MDM in the cloud. The latter is in my eyes a natural consequence of the external self-service themes and the increased use of third party reference data.

If you happen to be around Copenhagen in late January, I can offer you the full story at a late afternoon event taking place in the trendy meatpacking district and arranged by the local IT frontrunner company ChangeGroup. The event is called Master Data Management: Before, now and in the future.

The Future of Master Data Management

Back in 2011 Gartner, the analyst firm, predicted that these three things would shape the Master Data Management (MDM) market:

  • Multi-Domain MDM
  • MDM in the Cloud
  • MDM and Social Networks

The third point was in 2012, after the rise of big data, rephrased to MDM and Big Data, as reported in the post called The Big MDM Trend.

In my experience all three themes are still valid, with slow but steady uptake.

But have any new trends shown up in the past years?

In a 2015 post called “Master Data Management Merger Tardis and The Future of MDM” Ramon Chen of Reltio puts forward some new possibilities to be discussed, among those Machine Learning & Cognitive computing. I agree with Ramon on this theme, though these topics have been around in general for decades without really breaking through. But we need more of this in MDM for sure.

My own favourite MDM trend is a shift from focussing on internally captured master data to collaboration with external business partners as explained in the post MDM 3.0 Musings.

In that quest, I am looking forward to my next speaking session, which will be in Helsinki, Finland on the 8th December. There is an interview on that with yours truly available on the Talentum Master Data Management 2015 site.

CDI, PIM, MDM and Beyond

The TLAs (Three Letter Acronyms) in the title of this blog post stand for:

  • Customer Data Integration
  • Product Information Management
  • Master Data Management

CDI and PIM are commonly seen as predecessors to MDM. For example, the MDM Institute was originally called The Customer Data Integration Institute and still has this website: http://www.tcdii.com/.

Today Multi-Domain MDM is about managing customer, or rather party, master data together with product master data and other master data domains as visualized in the post A Master Data Mind Map. Some of the most frequent other master domains are location master data and asset master data, where the latter one was explored in the post Where is the Asset? A less frequent master data domain is The Calendar MDM Domain.

You may argue that PIM (Product Information Management) is not the same as Product MDM. This question was examined in the post PIM, Product MDM and Multi-Domain MDM. In my eyes the benefits of keeping PIM as part of Multi-Domain MDM are bigger than the benefits of separating PIM and MDM. It is about expanding MDM across the sell-side and the buy-side of the business eventually by enabling wide use of customer self-service and supplier self-service.

The external self-service theme will in my eyes be at the centre of where MDM is going in the future. Going down that path will have consequences for how we see data governance, as discussed in the post Data Governance in the Self-Service Age. Another aspect of how MDM is going to be seen from the outside in is the increased use of third party reference data and the link between big data and MDM, as touched on in the post Adding 180 Degrees to MDM.

Besides Multi-Domain MDM and the links between MDM and big data, a much mentioned future trend in MDM is doing MDM in the cloud. The latter is in my eyes a natural consequence of the external self-service themes and the increased use of third party reference data. Together with the general benefits of the SaaS (Software as a Service) and DaaS (Data as a Service) concepts, these will make MDM morph into something like MDaaS (Master Data as a Service). That idea is at least nearly ten years old by the way, as seen in this BeyeNetwork article by Dan E Linstedt.

IDQ vs iDQ™

The previous post on this blog was called Informatica without Data Quality? This post digs into the messaging around the recent takeover of Informatica and the future for the data quality components in the Informatica toolbox.

In the comments Julien Peltier and Richard Branch discuss the cloud emphasis in the messaging from the new Informatica owners and especially the future of Master Data Management (MDM) in the cloud.

My best experience with MDM in the cloud is with a service called iDQ™, a service that shares its TLA (Three Letter Acronym) with Informatica Data Quality by the way. The former stands for instant Data Quality. This is a service that revolves around turning your MDM inside-out, as most recently touched on in this blog in the post The Pros and Cons of MDM 3.0.

iDQ™ specifically deals with customer (or rather party) master data, how to get this kind of master data right the first time and how to avoid duplicates as explored in the post The Good, Better and Best Way of Avoiding Duplicates.
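To illustrate the general idea of avoiding duplicates at the point of entry, here is a minimal sketch that checks a new party record against existing ones before it is stored. It is not how iDQ™ works internally, which this post does not describe. The record layout, the `find_possible_duplicates` function and the 0.85 threshold are assumptions made for the example, and the matching uses Python's standard `difflib` rather than any dedicated matching engine.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0 of two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_possible_duplicates(new_party, existing, threshold=0.85):
    """Return existing records whose name and address both resemble the new one."""
    return [
        record
        for record in existing
        if similarity(new_party["name"], record["name"]) >= threshold
        and similarity(new_party["address"], record["address"]) >= threshold
    ]


existing_parties = [
    {"name": "Acme Trading Ltd", "address": "1 Main Street, Copenhagen"},
    {"name": "Nordic Tools A/S", "address": "22 Harbour Road, Aarhus"},
]
candidate = {"name": "ACME Trading Ltd.", "address": "1 Main Street,  Copenhagen"}

# Show the likely duplicates to the person entering the data before saving.
matches = find_possible_duplicates(candidate, existing_parties)
print(matches)
```

Catching the match while the record is being typed is what separates this from consolidating duplicates downstream after the fact.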

The Pros and Cons of MDM 3.0

A recent post on this blog was called Three Stages of MDM Maturity. This post ponders the need to extend your Master Data Management (MDM) solution to external business partners and take more advantage of third party data providers. We may call this MDM 3.0.

In a comment on LinkedIn Bernard PERRINEAU says:

[Image: MDM 3.0 Pros and Cons]

Starting with the most often mentioned point against extending your MDM solution to the outside: Vipul Aroh of Verdantis rightfully mentions, in a comment to the post, a widespread hesitancy around doing so. I think/hope this hesitancy is the same as the hesitancy we saw when Salesforce.com first emerged. Many people didn’t foresee a great future for Salesforce.com, because putting your customer base into the cloud was seen as a huge risk. But eventually the operational advantages have in most cases trumped the perceived risks.

Ironically the existence of CRM systems, in the cloud or not, is a hindrance for MDM solutions to be the system of entry, or to support data entry, for the customer master data domain. I remember when talking to an MDM vendor CEO about putting such features for customer data entry into an MDM solution, his reply was something like: “Clients don’t want that, they want to consolidate downstream”. I think it is a pity that “clients want” to automate the mess and that MDM and other vendors want to help them with that.

That said, there are IT system landscape circumstances to be overcome in order to put your MDM solution to the forefront.

But when doing that, and even when starting to do that, the advantages are plentiful. A story about the start of such a journey for customer master data is shared in the post instant Data Quality at Work. This approach is examined more in the post instant Single Customer View. To summarize, you will both get data quality right the first time and save time (and time is money) in the data collection stage.

When it comes to product master data, I think everyone working in that field acknowledges the insanity in how the same data are retyped, or messed around with in spreadsheets, between manufacturers, distributors, retailers and end users. Some approaches to overcome this are explored in the post Sharing Product Master Data. Each of these approaches has its pros and cons.

The rise of big data also points in the direction of having your MDM solution exposed to the outside, as touched on in the post Adding 180 Degrees to MDM.

Happy 10 Years Birthday MDM Solutions

Every year Information Difference publishes a report about the Master Data Management (MDM) Landscape. This year’s report celebrates the 10th year of MDM solutions around. Of course, the MDM industry didn’t start on a certain date 10 years ago, but around then MDM became commonly accepted as the term for a branch of IT solutions within data management and, in my eyes, as a much needed spinoff of the data quality discipline.

A birthday is a good occasion to look ahead. The Information Difference report takes on some of the trends in the MDM solutions around, namely that:

  • Most MDM vendors today claim to be multi-domain MDM providers, but they are certainly at different stages, coming from different places
  • Providing MDM in the cloud is slowly but steadily being adopted
  • Integrating big data into MDM solutions has, in my words, reached the marketing and R&D departments at the MDM vendors and will someday also reach the professional service and accounting folks there

Read the MDM landscape Q2 2014 report from Information Difference here.

Winning by Sharing Data

When I changed my laptop a few months ago, it was the easiest migration to a new computer ever.

Basically I just had to connect to all the services in the cloud I had been using before. For many services the path was to get connected to Google+, Twitter and Facebook and then connect to many other services via these connections.

This was a personal win.

Most of the teams I am working with are sharing their data with me in the cloud. Unlike in the bad old days, I do not have to call and ask for progress on this and that. I can check the status myself and even get notifications on my phablet when a colleague completes a task.

This is a shared win.

Within my profession, being data quality improvement and Master Data Management (MDM), sharing data is going to be a winning path too, as told in the post Sharing is the Future of MDM.

There are several ways of sharing master data like using commercial third party data, digging into open government data, having your own data locker and relying on social collaboration. These options are examined in the post Ways of Sharing Master Data.

Another Facet of MDM: Master Relationship Management

When talking about Master Data Management (MDM) we deal with something that could maybe be better coined as Master Entity Management. As a good old (logical or not) data model in the relational database world also has relations between entities, there must of course then also be something called Master Relationship Management. And indeed there is, as mentioned by Aaron Zornes in the interview called MDM and Next-Generation Data Sources on Information Management.

As touched on by Aaron Zornes, the solution to handling relations in the future may come from outside the relational database world in the form of graph databases. This was also discussed in the post Will Graph Databases become Common in MDM?

An often mentioned driver for looking much more into relationships is the promise of finding customer, and other, insights in social data based on the match between traditional master entity data and social network profiles. Handling these relations is an important facet of social MDM, an often mentioned subject on this blog.

Building the relations doesn’t stop with party master entities. There are valuable relations to location master entities and, not least, crucial relations between party master entities and product master entities, as told in the post Customer Product Matrix Management.
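The shape of such relations across party, location and product master entities can be sketched as a tiny in-memory graph. This is not a graph database, only an illustration of entities as nodes and relationships as typed edges. The class name, the relation names and the sample entities are all made up for the example.

```python
from collections import defaultdict


class MasterGraph:
    """Tiny in-memory graph: master entities are nodes, relationships are typed edges."""

    def __init__(self):
        self._edges = defaultdict(list)

    def relate(self, source, relation, target):
        # Store the edge in both directions so it can be followed from either end.
        self._edges[source].append((relation, target))
        self._edges[target].append((relation, source))

    def related(self, entity, relation):
        """Return the entities connected to `entity` by the given relation type."""
        return [other for rel, other in self._edges[entity] if rel == relation]


graph = MasterGraph()
graph.relate(("party", "Alice"), "lives_at", ("location", "1 Main Street"))
graph.relate(("party", "Bob"), "lives_at", ("location", "1 Main Street"))
graph.relate(("party", "Alice"), "bought", ("product", "Cordless drill"))

# Which parties share the address, and which parties bought the drill?
print(graph.related(("location", "1 Main Street"), "lives_at"))
print(graph.related(("product", "Cordless drill"), "bought"))
```

Following an edge from a location or a product back to the parties is the kind of question that gets awkward with joins and is natural in a graph.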

So Master Relationship Management fits very well with the current main trends in the MDM world, namely embracing big data, not least social data, and encompassing multi-domain MDM. The third main trend, MDM in the cloud, also fits. It’s not that we can’t explore all the relations out there from on-premise solutions; it’s just that there is a better relationship in doing so in the cloud.
