Is that piece of data wrong or right? This may very well be a question about in what language we are talking about.
In an earlier double post on this blog I had a small quiz about the name of the Pope in the Catholic church. The point was that all possible answers were right as explained in post When Bad Data Quality isn’t Bad Data. The thing is that the Pope over the wold has local variants over the English name Francis. François in French, Franziskus in German, Francesco in Italian, Francisco in Spanish Franciszek in Polish, Frans in Danish and Norwegian and so on.
In today’s globalized, or should I say globalised, world, it is important that our data can be represented in different languages and that the systems we use to handle the data is built for that. The user interface may be in a certain flavor/flavour of English only, but the data model must cater for storing and presenting data in multiple languages and even variants of languages as English in its many forms. Add to that the capability of handling other characters than Latin in other script systems than alphabets as examined in the post called Script Systems.
This challenge is very close to me right when we are building a service for sharing product information in business ecosystems. So will the Product Data Lake be multilingual? Mais oui! Natürlich. Jo da.
PS: The Product Data Lake will actually help with collecting product information in multiple languages through the supply chains of product manufacturers, distributors, retailers and end users.