Happy Easter

If you are in a country with Western Christian roots, this weekend is Easter weekend. Countries with Eastern Christian roots celebrate it next weekend.

Many countries (and states or provinces within them) have holidays around Easter. For many, Easter Monday is a day off. Some had Good Friday as a non-working day, and a few countries even had Maundy Thursday as a non-productive day for most people.

The just-passed Maundy Thursday was the day of the Last Supper. The famous Last Supper painting by Leonardo da Vinci has in my eyes, as told in a post from last year, something in common with Data Quality Evangelism.

Happy Easter.


Yin and Yang Data Quality

The old Chinese concept of yin and yang, or simply yīnyáng, is used to describe how polar opposites or seemingly contrary forces are interconnected and interdependent in the natural world. The concept is probably best known in its materialized form as sweet and sour sauce.

Lately we had a debate in the data quality community on social media about whether data quality is a journey or a destination, nicely summarized by Jim Harris in the post Quo Vadimus. I guess the prevailing sentiment is that it is kind of both a journey and a destination.

We also have the good old question of whether data are of high quality if they are “fit for the purpose of use” or “aligned with the real world”. Sometimes these benchmarks pull in opposite directions, and we would like to fulfill both goals at the same time.

The Data Quality discipline is tormented by belonging to both the business side and the technology side of practice. These sides are often regarded as contrary, but in my experience we get the best sauce by having both sides represented.

And oh yes, do we actually have to call it by one of two diametrically different terms, Data Quality or Information Quality? Bon appétit.


Informatics for adding value to information

Recently the Global Agenda Council on Emerging Technologies within the World Economic Forum published a list of the top 10 emerging technologies for 2012. According to this list, the technology with the greatest potential to provide solutions to global challenges is informatics for adding value to information.

As said in the summary: “The quantity of information now available to individuals and organizations is unprecedented in human history, and the rate of information generation continues to grow exponentially. Yet, the sheer volume of information is in danger of creating more noise than value, and as a result limiting its effective use. Innovations in how information is organized, mined and processed hold the key to filtering out the noise and using the growing wealth of global information to address emerging challenges.”

Big data all over

Surely “big data” is the buzzword within data management these days, and looking for extreme data quality will be paramount.

Filtering out the noise and using the growing wealth of global information will help a lot in our endeavor to make a better world and to do better business.

In my focus area, master data management, we also have to filter out the noise and exploit the growing wealth of information related to what we may call Big Master Data.

Big external reference data

The growth of master data collections is also seen in collections of external reference data.

For example, the Dun & Bradstreet Worldbase, which holds business entities from around the world, has lately grown quickly from 100 million entities to over 200 million entities. Most of the growth has been due to better coverage outside North America and Western Europe, with the BRIC countries coming in fast. A smaller world resulting in bigger data.

Also one of the BRICS, India, is on the way with a huge project for uniquely identifying and holding information about every citizen – that’s over a billion. The project is called Aadhaar.

When we extend such external registries to social networking services by doing Social MDM, we are dealing with a very fast-growing number of profiles on Facebook, LinkedIn and other services.

Surely we need informatics for adding the value of big external reference data into our daily master data collections.


Extreme (Weather) Information Quality

This morning I had my scheduled train journey from London, UK to Manchester, UK cancelled.

It’s not that I wasn’t warned. The British press has been hysterical over the last few days because the temperature was going to be below freezing and some snowfall was expected. For example, the BBC had a subject matter expert in the studio showing how to pack the trunk of your car with stuff suitable for a trip across the North Pole.

Anyway, encouraged by the online status showing that the train was set to go, I made my way to Euston Station, where I was delighted to see the train announced on the screen for an on-time departure. Only to be very disappointed by the message, 10 minutes after scheduled departure, saying that the service was cancelled “due to the severe weather conditions”.

Well, well, well. The temperature is above freezing this lovely Sunday morning. There is practically no wind and only some watery remains of last night’s snowfall on the ground. With that interpretation of the raw data I guess you couldn’t get around in Scandinavia for a considerable part of the year.

But that is how it is when making raw data into information. Different results indeed.

I guess it is good business for Virgin Trains not to be prepared for a little bit of snow when operating in England, thus making the first sign of the white fluffy stuff from above count as “severe weather conditions”.

My next business analysis with Virgin Trains will target the refund procedure. I hope the customer experience will be just fine.


Painted Data Quality

In a recent blog post called Plato’s Data, Jim Harris reminds us that data isn’t the real world but only an illusion of reality.

This makes me think about to what degree the data quality discipline is an exact science or merely an art. Surely there is a large element of art in some activities within data quality improvement; I also participated in a radio show on Jim’s blog discussing The Art of Data Matching.

One kind of (real) art is painting. Within painting, good art may mean that a painting reflects the real world as precisely as possible. But good art may certainly also mean that the painting, like a surrealistic painting, doesn’t look like the real world but makes you think.

With today’s technology you might also ask why you should bother making a painting that looks like the real world if you can simply take a photo.

However, with many good (famous) photos there is usually a controversy about whether the photo was staged. An example is Raising the Flag on Iwo Jima, which also made it onto a stamp.

For the record: The photo is believed not to have been staged by the photographer, but it was the second raising of the flag, where a smaller flag was replaced by a more impressive one. There was no hard fighting for the mountain top where the flag was raised. The fierce fighting on the island was down in the caves.

My 3 cents….


Klout Data Quality

Today it was announced that yet another social media service has passed the 100 million mark, as now 100 Million People have Klout.

Klout is a service that measures your online influence based on your activity on Twitter, LinkedIn, Facebook and so on. The main measure is a score between 1 and 100.

 

Like many others, I have from time to time been tempted to have a narcissistic look at my profile. I haven’t recorded it, but it seems to me that some of the other attributes on Klout change a lot. Or maybe it’s just me who is moving around in the social media realm in all directions.

Today my Klout style is being a “broadcaster”. And that may be right, as I’m re-tweeting a lot of links. But I’m sure I was a “specialist” the last time I checked, and that is in the opposite corner of the style quadrant. Well, never mind, every description of the styles is positive.

Klout also has beliefs about which topics you are influential in. One of my top 10 topics is “magic”. I think I must be more careful about tweeting about “data quality magic”. Another topic of mine is “Tripoli”. That’s right too; I did make one tweet about Tripoli that ended up as an information quality trainwreck.

Unfortunately I’m not influential about data quality or MDM at all. I’ll have to work on that.


Oranges, Apples and Pears go Bananas

My post yesterday about Data Quality Evangelism included the fruit oranges, and a comment from Jim Harris added apples to the analogies by using the idiom about comparing apples and oranges.

There are a lot of linguistic musings around the words apples and oranges.

In many languages we use a similar idiom about comparing apples and pears. But it may depend on geography: in European French it is apples and pears, but in Quebec French it is apples and oranges.

In some Germanic languages the word for the fruit orange can be translated as “Chinese apple”. For example, the Dutch word is “sinaasappel” and the Danish/Norwegian word is “appelsin”. In Germany it is “Apfelsine” in the North and “Orange” in the South. The linguistic line across Germany is by the way called the apple-line, but for the opposite reason.

In English a “Chinese apple” is a pomegranate.

The word orange has two meanings in English: a fruit and a color (as they write in American English) or colour (as they write in British English).

The two meanings make Google Translate go bananas. When Google translates between languages it does it via English. So if I translate “appelsin” from Danish to Dutch I don’t get “sinaasappel”. Instead I get “oranje”, the Dutch national color.
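The effect can be imitated with a toy pivot translation. This is only a sketch: the three small dictionaries and the function names are invented for illustration and say nothing about how any real translation service works internally. The English table lists the color sense first, as a stand-in for whichever sense a system happens to prefer.

```python
# Toy dictionaries, invented for illustration only.
DANISH_TO_ENGLISH = {"appelsin": "orange", "banan": "banana"}

# English "orange" has two senses, so the toy table lists two Dutch candidates.
ENGLISH_TO_DUTCH = {
    "orange": ["oranje", "sinaasappel"],  # color sense first, fruit sense second
    "banana": ["banaan"],
}

# A direct table has no ambiguity: Danish "appelsin" only means the fruit.
DANISH_TO_DUTCH = {"appelsin": "sinaasappel", "banan": "banaan"}


def translate_via_english(word: str) -> str:
    """Translate Danish to Dutch through English, keeping only the first sense."""
    english = DANISH_TO_ENGLISH[word]
    return ENGLISH_TO_DUTCH[english][0]


def translate_directly(word: str) -> str:
    """Translate Danish to Dutch without a pivot language."""
    return DANISH_TO_DUTCH[word]


for danish_word in ("appelsin", "banan"):
    print(danish_word, translate_via_english(danish_word), translate_directly(danish_word))
```

The unambiguous word survives the detour through English; the ambiguous one loses its meaning at the pivot step.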

No wonder Data Quality Evangelism most often isn’t fruitful.


Hit by an Outlier

Yesterday something weird happened on this blog. Usually I’m pleased to have between 100 and 250 so-called page views on workdays. But yesterday there were 751. This Saturday morning everything is back to normal again:

I have no clue about who visited and why. I didn’t write anything very clever yesterday. Most views were on the home page. The count of referrers indicates a quiet day in the office:

 

Also the search terms counter doesn’t help:

Well, I guess I just have to consider this an outlier, being an observation that appears to deviate markedly from other members of the sample in which it occurs.

That is anyway my gut feeling without performing Grubbs’ test for outliers.
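For the curious, the first half of that test is easy to sketch. The code below computes only the Grubbs statistic G, the largest absolute deviation from the sample mean measured in sample standard deviations. The daily figures are hypothetical, chosen to sit in the usual 100 to 250 range plus the one day with 751, since the real daily counts are not recorded here. A full test would go on to compare G with a critical value derived from the t-distribution, and it assumes roughly normally distributed data; both of those parts are left out.

```python
from statistics import mean, stdev


def grubbs_statistic(values):
    """Return (G, suspect): the largest absolute deviation from the mean,
    in units of the sample standard deviation, and the value producing it."""
    sample_mean = mean(values)
    sample_sd = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    suspect = max(values, key=lambda v: abs(v - sample_mean))
    return abs(suspect - sample_mean) / sample_sd, suspect


# Hypothetical page views for seven workdays; only the 751 comes from the post.
page_views = [180, 210, 150, 240, 120, 200, 751]

g, suspect = grubbs_statistic(page_views)
print(f"suspect value: {suspect}, G = {g:.2f}")
```

With a sample this small, G can never exceed (n - 1) / sqrt(n), which is about 2.27 for seven observations, so a value close to that ceiling is about as outlying as a single observation can get.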


Some Flyover Information

My Follow Friday World Tour stop today was at some Flyover States, that is, states in the United States that bicoastal people only see from above when flying over them going from coast to coast.

If I were to fly from (A) Copenhagen to (B) Los Angeles, I would think, by looking at a traditional flat world map, that the flight would also pass over these inland states.

But the world isn’t flat. The shortest route for an east to west flight will tend to follow the so-called great circle, which swings much further north.

On a route like this one the great circle swings so far north that it crosses the Arctic, which makes it a polar route, the shortcut in the real round world. Actually the Copenhagen (CPH) to Los Angeles (LAX) connection established in 1954 was the world’s first commercial polar route.
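How far north the shortest path goes can be checked with a little spherical geometry. The sketch below treats the Earth as a perfect sphere and uses approximate airport coordinates that I have filled in myself, so the result is a rough figure rather than a flight plan. It computes the great-circle distance and the point halfway along the route.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation

# Approximate airport coordinates (latitude, longitude in degrees); assumed values.
COPENHAGEN = (55.62, 12.66)
LOS_ANGELES = (33.94, -118.41)


def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude to a 3D unit vector on the sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))


def great_circle_km(a, b):
    """Shortest distance over the sphere between two (lat, lon) points."""
    dot = sum(x * y for x, y in zip(to_unit_vector(*a), to_unit_vector(*b)))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    return EARTH_RADIUS_KM * math.acos(dot)


def midpoint(a, b):
    """Point halfway along the great circle (not valid for antipodal points)."""
    sx, sy, sz = (x + y for x, y in zip(to_unit_vector(*a), to_unit_vector(*b)))
    norm = math.sqrt(sx * sx + sy * sy + sz * sz)
    return math.degrees(math.asin(sz / norm)), math.degrees(math.atan2(sy, sx))


distance_km = great_circle_km(COPENHAGEN, LOS_ANGELES)
mid_lat, mid_lon = midpoint(COPENHAGEN, LOS_ANGELES)
print(f"distance: {distance_km:.0f} km, midpoint: {mid_lat:.1f} N, {abs(mid_lon):.1f} W")
```

The midpoint lies well north of both airports, far from the inland states that a straight line on a flat map would suggest.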

I find great analogies between looking at a map and solving data and information quality issues, as in the post Sharing data is key to a single version of the truth, which was a blog-bout with a UK guy and a Flyover guy.


World Population Excluding Greenland?

According to a newly published paper called The population of the world (2011) we are now 6,987 million citizens on the planet Earth.

However, something makes me wonder if they counted Greenland. It’s not that inclusion or exclusion of the 57,564 Greenlanders will rock the figure, but I think we should all be in there.
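Just how little the figure would rock is quick to work out. The sketch below takes the 6,987 million at face value as an exact count, which is an assumption, since the published number is itself rounded to millions.

```python
WORLD_POPULATION = 6_987_000_000  # 6,987 million, treated as exact for illustration
GREENLAND_POPULATION = 57_564


def in_millions(count):
    """Round a head count to whole millions, the unit used for the world total."""
    return round(count / 1_000_000)


share_percent = GREENLAND_POPULATION / WORLD_POPULATION * 100
without_greenland = in_millions(WORLD_POPULATION)
with_greenland = in_millions(WORLD_POPULATION + GREENLAND_POPULATION)

print(f"Greenland share of world population: {share_percent:.5f} %")
print(f"world total in millions: {without_greenland} without, {with_greenland} with")
```

At this level of rounding the total alone cannot tell us whether Greenland was counted; that can only be settled by looking at the regional breakdown.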

Greenland does cover a great deal of area on a world map as the big white island on top of the world, not least when the projection makes areas close to the poles bigger than on a globe.
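The size of that effect can be estimated for one common case. The sketch below assumes the map uses the Mercator projection, which may not be the projection of any particular map, and uses 72 degrees north as a rough mid-Greenland latitude chosen for illustration. On a Mercator map, lengths at a given latitude are stretched by 1 / cos(latitude) in both directions, so areas are inflated by the square of that factor.

```python
import math


def mercator_area_inflation(latitude_deg):
    """Factor by which a small area at this latitude is enlarged on a
    Mercator map, relative to the same area at the equator."""
    return 1.0 / math.cos(math.radians(latitude_deg)) ** 2


# 72 degrees is an assumed, roughly central latitude for Greenland.
for latitude in (0, 30, 60, 72):
    print(f"{latitude:>2} degrees: areas drawn {mercator_area_inflation(latitude):.1f} times too large")
```

Under these assumptions, central Greenland is drawn roughly ten times larger than an equally sized area at the equator.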

But is Greenland visible in the population statistics at all?

First I looked for Greenland in North America where Greenland belongs in a geophysical context.

 

Not there.

Then I looked for Greenland in Northern Europe where Greenland belongs in a political context.

 

Not there – or maybe there as part of (the Kingdom of) Denmark?

The population of Denmark is stated as 5.6 million citizens.

If I look up the Kingdom of Denmark on Wikipedia we have these numbers:

It’s a close call. If we round the numbers, the 5.6 million citizens excludes the North Atlantic dependencies, and Greenland, like the Faroe Islands, isn’t anywhere else. And anyway the area clearly suggests that Greenland isn’t included as part of Denmark. So it could be a case of rounding or a case of timeliness – or most probably a case of incompleteness.

Maybe we have passed 7 billion people on earth already if someone else (also) is missing in the statistics.
