Coma, Wetsuit and Dedoop

The sehr geehrte Damen und Herren (esteemed ladies and gentlemen) at Universität Leipzig (Leipzig University) are doing a lot of research in the data management realm and put some good effort into naming the stuff.

Here are some of the inventions:

COMA is a system for flexible Combination Of schema Matching Approaches. Let’s hope the thing is still alive.

WETSUIT (Web EnTity Search and fUsIon Tool) is a new powerful mashup tool – and what a nice seven-letter abbreviation that doesn’t stick only to the first letters.

Dedoop (Deduplication with Hadoop) is a prototype for entity matching for big data. Big phonetic Dedupe will be around of course.

Well, you should expect fuzzy abbreviations from this city, as Leipzig means “settlement where the linden trees stand”.

Who Killed Big Data?

Please, no big data bullsh…

I guess everyone is sick and tired of seeing the term “big data” attached to just about everything larger than 1 kilobyte.

But who is responsible? Who do we hold accountable for overusing the term big data? Who killed big data?

Was it first and foremost the vendors who made the kill? A recent blog post called “Big Data is Dead. What’s Next?” by John De Goes suggests that the vendors are to blame for stabbing big data from behind.

Could it be the analysts? I have, as mentioned in the post The Big MDM Trend, seen how Gartner (the analyst firm) have put big data forward in the shouting gallery in order to explain something already explained with other terms.

Big data has often been personified by the data scientist. So maybe it was a Californian girl called Jill Dyché who caused the extinction of the data scientist and thereby big data. She wrote the blog post called Why I Wouldn’t Have Sex with a Data Scientist.

What do you think? Who killed big data?

The Dangers of being a Global Shopper

The global shopper is a multi-channel beast.

A global shopper may be a tourist or a business traveler buying goods in exciting cities around the world, in shops most probably operated by the very same brands that occupy his local high street. The global shopper may also do his business from his living room by shopping online on sites with strange foreign privacy rules and unusual registration forms.

Being a global shopper is risky business.

For example, it’s unbelievable that Oxford Street in London wasn’t made into a pedestrian street a long time ago like any other respectable high street in a major city. But no, global shoppers on Oxford Street are constantly in danger of being hit by a red double-decker bus when crossing the street for a good bargain while looking to the right wrong side.

And how about shoe sizes? Measuring systems and standards around the world are a jungle, and as a global shopper you will in 8 ½ out of 10 trials pick the wrong number 42.

Going online isn’t any better.

When registering your home address on a foreign site you are on very slippery ground.

If the site is from the United States, and you are not, you have to choose to live in one of 50 different states that mean nothing to you. There is no way around it. My favorite state is then Alaska, which is usually at the top of the list.

Having a postal code with letters in it can be a no go. Not having a postal code is much like not existing at all.

But don’t give up. As a global shopper you will be able to find sites online with absolutely no clue about what an address looks like. The only question, of course, is whether you will actually get your goods or have to settle for the credit card withdrawal only.

Is the Holiday Season called Christmas Time or Yuletide?

In English we have these two different terms for the coming holiday season: Christmas Time or yuletide. Christmas Time has a religious touch while yuletide is old English and resembles the term juletid still used in Scandinavia. Also notice that Christmas Time is two words (unless written as Christmastime) while yuletide is a compound word, as is common in Germanic languages. And oh, Christmas Time must be written with upper case first letters while yuletide doesn’t have to be (unless maybe in a blog post title). I still struggle a lot with English grammar.

The holiday season may be seen as a religious celebration or, which I think has become the prevailing view, a special occasion for business. Yuletide is a time of high activity in Business-to-Consumer (B2C), both for brick and mortar shops and for eCommerce, while Christmas Time is almost a standstill for Business-to-Business (B2B), as no one is able to make any decisions because it is the holiday season.

By the way: The only thing I wish for xmas is that people start to standardize on the terms used for the same concept. Not least at Christmastide it is so disturbing when we don’t have any form of standardisation.

Multi-Domain MDM, Santa Style

What would a Multi-Domain Master Data Management (MDM) solution look like at Santa Claus’s organization?

I think it may look like this:

Santa’s MDM solution covers all 4 classic domains:

  • Party
  • Product
  • Location
  • Calendar

Party

A main business improvement achieved through Santa’s MDM solution is better Nice or Naughty management. The old CRM system didn’t have a dedicated field for Nice or Naughty assignment, so this information was found in many different fields used during the years including as part of a street address or as a “send Christmas card” check mark. Today Santa handles Nice and Naughty information including historical tracking as a kid may be Nice one year but Naughty the next. This also helps with predictive analysis for future present demand. Ho ho ho.
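The dedicated Nice or Naughty field with historical tracking could be sketched roughly as below. This is only a minimal illustration, not Santa’s actual data model: the names `Behaviour`, `Kid`, `assess`, `status` and `nice_ratio` are all made up for the example, and the “predictive analysis” is nothing more than the share of Nice years.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Behaviour(Enum):
    """Hypothetical dedicated Nice or Naughty value."""
    NICE = "nice"
    NAUGHTY = "naughty"


@dataclass
class Kid:
    """Hypothetical party record with one assessment per year."""
    name: str
    # Year -> assessment, instead of notes hidden in address fields.
    history: Dict[int, Behaviour] = field(default_factory=dict)

    def assess(self, year: int, behaviour: Behaviour) -> None:
        """Record (or correct) the assessment for a given year."""
        self.history[year] = behaviour

    def status(self, year: int) -> Optional[Behaviour]:
        """Return the assessment for a year, or None if there is none."""
        return self.history.get(year)

    def nice_ratio(self) -> float:
        """Share of assessed years that were Nice (0.0 if no history)."""
        if not self.history:
            return 0.0
        nice = sum(1 for b in self.history.values() if b is Behaviour.NICE)
        return nice / len(self.history)
```

Keeping the assessment keyed by year means a kid who was Nice one year and Naughty the next keeps both entries, which is what the historical tracking needs.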

Party master data management at Santa’s also includes keeping track of all the business partners such as manufacturers of toys and other stuff, the shopping malls where Santa has to sit in December and so on. A given legal entity may have different roles in different business processes. For example, a reindeer insurance company may also require Santa’s presence at the company’s Christmas tree family party.

Product

Product Information Management (PIM) has always been a complex operation at Santa’s. In Wish List Fulfillment (Wishful) you may have kids wishing for the same thing with different wording. The new MDM solution’s flexible hierarchy management features help a lot when the wishes are matched with specifications obtained by the purchase elves. At Santa’s they increasingly work with the suppliers in sharing complete and timely product descriptions and specifications.

Location

Handling location information relates to the different locations where Santa is supposed to live, be that at the North Pole, in Greenland, in Lapland or according to any other beliefs as discussed in the post Notes about the North Pole.

Also, when it comes to knowing where to deliver all the presents, Santa has realized that maintaining an address as part of the record for each boy and girl isn’t the best way. Today each boy and girl record has a relation, with a start and end date, to a location entity where location specific information, including precise chimney positions, is kept.
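A rough sketch of that party to location relation is shown below. It is an illustration under assumptions, not the real thing: the class names `Location`, `Residence` and `Child`, the field `chimney_position` and the method `location_on` are all invented here, and an open end date is simply modelled as `None`.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Location:
    """Hypothetical location entity holding location specific details."""
    address: str
    chimney_position: str


@dataclass
class Residence:
    """Hypothetical relation between a child and a location over time."""
    location: Location
    start: date
    end: Optional[date] = None  # None means the child still lives there

    def covers(self, day: date) -> bool:
        """True if the given day falls within this residence period."""
        return self.start <= day and (self.end is None or day <= self.end)


@dataclass
class Child:
    """Hypothetical party record without an embedded address."""
    name: str
    residences: List[Residence]

    def location_on(self, day: date) -> Optional[Location]:
        """Return the location valid on the given day, if any."""
        for residence in self.residences:
            if residence.covers(day):
                return residence.location
        return None
```

With the address moved out of the child record, a move only adds a new relation with a new start date, and the chimney position is maintained once per location rather than once per kid.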

Calendar

Christmas present delivery timing is crucial for Santa. In some countries Christmas morning the 25th December is the right time for the stuff to be there. In other countries Christmas evening the 24th December is the right time. Add to that doing present delivery across all time zones. Ho ho ho.

The MDM implementation at Santa’s has indeed helped a lot with Santa Quality. But it is an ongoing journey.

Right now Santa is looking for a smart information management firm to help with defining which time zone the North Pole belongs to.

Anyone out there?

Fighting Identity Fraud with Identity Fraud

I have earlier had issues with SEO agencies posting comments on this blog in their quest to help data quality tool vendors in getting better search rank for data quality related terms. Example here.

This happened again today with a recent post called Addressing Digital Identity.

I find it quite funny that the SEO guy is talking about fighting identity fraud while posting a comment under a name that I bet is not his/her real name:

[Image: InfoGlide SEO scam]

Who is accountable for melting ice?

When working with data quality issues some of the big questions are: How bad is it? Is it getting worse? Can we do something about it? Who should do something about it?

These questions are basically the same as those around the changing climate on this planet including rising sea levels.

This morning I read an article on BBC News reporting that several scientific teams have joined forces in an attempt to quantify exactly how things stand with rising sea levels. The short answer is that the sea level now is 11.1 millimeters (7⁄16 of an inch) higher than in 1992.

The sea is rising because of melting ice, primarily in Antarctica and Greenland, as seen below:

[Image: ice sheet contribution]

So I think it’s high time to ask the people of Antarctica and not least the people of Greenland to do something serious about their ice melting and flooding innocent people in the rest of the world.

The Three Big M’s of Data Quality

Most organizations have a lot of data quality issues, and there is a wealth of possible solutions to deal with these challenges.

What you usually do is categorize the problems into three different types of best resolutions:

Mañana

You could go ahead with solving the data quality problems today but probably you have better and more important things to do right now.

Your organization may have a global SAP rollout going on or other resource demanding implementations. Therefore it is most wise to deal with the data quality issues when everything is running smoothly.

Mission impossible

Maybe a resolution has been tried before and didn’t work. The chances that alternative people management, a different orchestration of processes and developments in available technology will change that are very slim.

May the force be with you

Many problems solve themselves over time or hopefully don’t get noticed by anyone. If things get ugly you always have your lightsaber.

Sometimes Google Translate is a Foolish Friendship

This morning I stumbled upon an article in a Norwegian online newspaper. A rather unlikely incident actually happened to a driver, as he avoided hitting an elk on the road, but then ran into a bear.

The original text in Norwegian is here:

As I wanted to see how that would be in English, I hit the Google Translate button:

In the headline the two animals are translated from “elg” to “elk” and from “bjørn” to “bear”. Very well.

But in the subtitle the two words are translated differently. Now “elg” is “moose” and “bjørn” is “disservice”.

Hmmm…

Not sure why elk is replaced with moose. The two words are used synonymously. As I understand it, it must have been a moose, which in Europe is called an elk. Wikipedia has the details here.

But how did the bear become a disservice? Well, I guess it relates to an old fable called “The Bear and the Gardener” or the variant “The Hermit and the Bear”. Here a human becomes friends with a bear. While the man takes a nap, the bear helps drive off the flies, but eventually crushes the man’s head in doing so. The moral is that you should not make foolish friendships.

In Danish/Norwegian such a well-meant but very bad attempt to help is a “bear’s service” (bjørnetjeneste), also known in German as a Bärendienst. In this case Google Translate itself became just such a disservice.

Olympic Darlings and Big Data Experts

The Olympic Games produce two kinds of darlings.

One kind is the big winners such as Usain Bolt and Michael Phelps.

The other kind is the big losers. As reported in the post Olympic Moments the 1988 Winter Games had the Brit “Eddie the Eagle” in ski jumping. The 2000 Sydney Summer Games had the swimmer Eric “The Eel” Moussambani. The 2012 London Summer Games now has Hamadou Djibo Issaka in rowing.

The ski jumper Eddie the Eagle came from a country that hates snow and comes to a full stop at the first sight of the white fluffy stuff from above. The rower Hamadou Djibo Issaka comes from Niger, a country almost only covered by desert.

Such bravery in competing way out of your comfort zone naturally brings me to the subject of big data experts.

A while ago I noticed a tweet by Neil Raden:

Oh yes. It’s amazing how many big data experts we have seen emerging in the short life of the big data buzz.
