Word Quality

One of the top pieces of blogging advice is to be careful about your spelling and grammar, and you might say that this should be even more important on a data quality blog.

Unfortunately I have to admit that I’m not particularly good at that.

Perhaps I’m somewhat excused because I’m blogging in English and English isn’t my mother tongue. When I write articles and other stuff in English for companies I work for, there is always someone with English skills to catch my mistakes. But when I’m blogging, I’m on my own.

I do strive to get it right. I always write my texts in a word processor with English spell check and grammar check on. But there are a lot of mistakes that aren’t caught by the spell checker, such as using a wrong word, forgetting a word and not concatenating words that should (or might) be a compound word.

Many times I also try to google the terms I’m using. It’s a helpful trick, but sometimes you are misled by hitting other people’s mistakes.

Occasionally folks are kind enough to help me by suggesting that I use another word instead of some rare word I have found in an English dictionary.

So, not least to the subscribers to this blog, who get my first takes: please forgive my occasional bad spelling, grammar and odd words. I’m constantly thinking about continuous word quality improvement.

It’s Hard to Be a Data Geek

Sometimes I, along with other folks in my social network circles and groups, describe myself as a data geek.

Another non-anonymous data geek, Rich Murnane, recently started a series of excellent cartoons on his blog about DataGeek’s first days on a new job. Hard work indeed.

Then the data-geeky corporate Twitter account of IBM Initiate made a twittpoll asking: Do you consider yourself a data geek or a management geek?

It’s a hard question, because you know that a lot of what leads to better data is really about better management, and it’s much more admirable to be a management geek than a poor data geek.

Anyway, I stood firm and admitted that I am a data geek, because the world has always been crowded with management consultants who pay little attention to the needs of the data. Someone has to take care of the data. It’s hard, but it’s worth it.

More Social Master Data Management

Yesterday my American cyberspace friend Jim Harris was kind enough to send me an invitation to Google+ – the new social network service you must hook into. Thanks Jim, now I have had to fill in yet another profile, upload the same picture as always and start networking from scratch once again 🙂

Like many people, I have several profiles in different social network services such as Twitter, Facebook and LinkedIn. As I also do business with German-speaking countries, I also use XING as an alternative to LinkedIn, as told in the post LinkedIn and the other Thing.

In a comment to that post my Austria-based French connection Olivier Mathurin noted: “Disconnected duplicated siloed professional profiles, mmm…”

A post on this blog from a year ago, called Social Master Data Management, discussed how social CRM will add new sources from social networks to the external reference data sources we already know from old-time CRM.

With all the different faces everyone is wearing in the social media realm, this isn’t going to be easy, and one may consider whether social master data management is a wrong path given the individual nature and built-in privacy of social networking services.

Well, Gartner (the analyst firm) says that increasing links between MDM and social networks is one of the Three Trends That Will Shape the Master Data Management Market.

So, acknowledging that Gartner predictions are self-fulfilling, you had better get moving into LinkedIn, Xing, Viadeo, Twitter, Facebook, (forget MySpace), Google+ and whatever comes next.

Psychographic Data Quality

I have just read an article on Mashable by Jamie Beckland called The End of Demographics: How Marketers Are Going Deeper With Personal Data.

The article explains how new sources of available data make it possible for marketers to get a much closer look at potential customers and thereby go from delivering a broad message to a huge crowd to delivering a very targeted message to a small group of people with a high probability of getting a response. In short: Marketers are going from demographic marketing to psychographic marketing.

I believe this is true and ongoing (as I have also been involved in such activities).

The data quality issues we have always known in direct marketing are surely very similar in the psychographic marketing that is going on in the social media realm and in connection with eBusiness.

In my eyes, the concept of a single customer view is also key to success in psychographic marketing.

You are not delivering a targeted message if you are delivering two different messages to two user profiles belonging to the same real world individual.

Your message will be very frustrating if you treat someone as a prospective customer when that someone is already an existing customer, perhaps in another channel.
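
To make the two situations above a bit more concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that profiles from different channels can be tied together by a normalized e-mail address; the field names (`channel`, `email`, `status`) and the function names are made up for this example, and real-world identity resolution needs far more than this.

```python
from collections import defaultdict


def match_key(profile: dict) -> str:
    """Very naive match key (an assumption): the trimmed, lower-cased e-mail."""
    return profile["email"].strip().lower()


def single_customer_view(profiles: list[dict]) -> dict[str, list[dict]]:
    """Group user profiles that appear to belong to the same individual."""
    groups = defaultdict(list)
    for profile in profiles:
        groups[match_key(profile)].append(profile)
    return dict(groups)


# Hypothetical sample data: the same person shows up in two channels.
profiles = [
    {"channel": "web shop", "email": "Anna@Example.com ", "status": "customer"},
    {"channel": "social", "email": "anna@example.com", "status": "prospect"},
    {"channel": "social", "email": "bob@example.com", "status": "prospect"},
]

for key, group in single_customer_view(profiles).items():
    statuses = {p["status"] for p in group}
    # Flag individuals who are both an existing customer and a "prospect".
    if {"customer", "prospect"} <= statuses:
        print(f"{key}: already a customer, so skip the prospect message")
```

The point is only that the grouping has to happen before the targeting; the choice of match key is where all the real data quality work sits.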

The effectiveness of psychographic marketing depends on a match between the psychographic variables, the behavioral variables and the demographic variables. As seen in the example in the Mashable article, a good old thing like geocoding will be needed here.

An exciting thing in the rise of psychographic marketing is that it will add to the trend in data quality technology where it’s about much more than simple name and address cleansing and deduplication. Despite the virtual playground, rich location data will become even more important. The relations between customers and products, as described in the post Customer Product Matrix Management, will be further refined in psychographic marketing.

Timing Your Social Media Activity

When engaging in social media I often consider what time and what day to publish a new blog post, tweet about it and promote it on LinkedIn.

My audience is roughly distributed as 40 % Americas (almost all in North America), 40 % EMEA (almost all in Europe) and 20 % Asia and the Pacific.

That makes it pretty much an around-the-clock audience with a peak in page views and comments between UTC 14:00 and 17:00, which is when the working day in Europe is still on and the Americans are waking up. However, that doesn’t necessarily mean that I should publish in the peak hours. In fact I haven’t been able to measure that the time of publishing affects the number of page views and comments. So I’ll keep on publishing at the time I have anything to say and have the time to write about it.

If I look at the days of the week, working days are twice as busy as weekend days, with Monday as the best day, probably because people who don’t do social media at weekends are catching up. So I’ll keep on publishing on the days I have anything to say and feel inspired to write about it.

Do You Have an Official SnoopBook Account?

I have earlier written about how Facebook resembles a typical Business-to-Consumer customer table in the post Out of Facebook.

Like any customer table, the Facebook member table will suffer from a number of different data quality issues like:

  • Some individuals are signed up more than once using different profiles.
  • Some individuals who created a profile are not among us anymore.
  • Some profiles are not an individual person, but a company or other form of establishment.

One type of the latter seems to be government and other authorities who want to snoop into your daily whereabouts in order to see if you are paying the taxes you should and not receiving welfare services you shouldn’t.

Recently I read a story about a British woman who got jailed on such an account. Link here.

It was not said whether the authorities used a special account for the investigation or whether the civil servants’ personal accounts were used.

This morning I read an article (in Danish) about the Danish tax authority’s activities in this field. They have realized that they have illegally used personal accounts for such activities, but have stopped that now. However, they will now create an account for the organization to be used for snooping.

#MDM is dead, long live #XXX

When tweeting about Master Data Management (MDM) it has been customary to use the #MDM hashtag.

However, you have sometimes seen other subjects tagged with #MDM, often in languages other than English, for example “Matin de Merde”.

But now #MDM has been completely taken over by the Tourism Queensland (Australia) Million Dollar Memo campaign.

So, Master Data Management tweeps: Do we have to find a new hashtag?

Is #MasterDataManagement too long?

Other suggestions?

Data Quality and Data Visualization

This is a self-centric blog post about data quality and data visualization.

The figure to the right is a statistic about who viewed my profile in a certain period on LinkedIn.

Looking at that makes me think about a couple of data quality and data visualization issues especially linked to visualization of data on a world map.

Hidden value

Fortunately there is both a map and some numbers below, because the map is too small to show from where I have the most views: My very small home country Denmark.

Misleading proportions

I have no views from the grey countries. So I should certainly concentrate on Greenland (the big grey land at the top of the map) to get more viewers, right?

Well, the Mercator projection makes areas close to the poles, like Greenland, look much bigger than in the real world. Greenland is a big island, but in fact less than 1/3 the size of Australia (the almost as big light blue land in the down under right corner) – and Greenland has only about 1/400 of the population of Australia.
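
To put rough numbers on the distortion: on a Mercator map, areas at latitude φ are drawn larger than areas at the equator by a factor of 1/cos²φ. The small Python sketch below applies that factor to one representative latitude per country. The latitudes (about 72°N for Greenland and 25°S for Australia) are my own rough assumptions, not exact centroids, and the function name is made up for this example.

```python
import math


def mercator_area_inflation(latitude_deg: float) -> float:
    """Factor by which Mercator inflates areas at a latitude, relative to the equator."""
    phi = math.radians(latitude_deg)
    return 1.0 / math.cos(phi) ** 2


# Representative latitudes are rough assumptions for illustration only.
greenland = mercator_area_inflation(72.0)
australia = mercator_area_inflation(-25.0)

print(f"Greenland: areas drawn about {greenland:.1f} times too big")
print(f"Australia: areas drawn about {australia:.1f} times too big")
print(f"Greenland is inflated about {greenland / australia:.1f} times more than Australia")
```

With these assumed latitudes the factor is around ten for Greenland and only a little above one for Australia, which is why the two look almost equally big on the map.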

Cultural dependency

My blogging and LinkedIn activities are in English due to the moderate population of Denmark. Therefore, and because the spread of LinkedIn is biased towards the English-speaking world, it’s no surprise that most viewers are from English-speaking countries.

We Will Become More Open

Yesterday I read a post called Taking Stock Of DQ Predictions For 2011 by Clarke Patterson of Informatica Corporation. Informatica is a well-established vendor within data integration, data quality and master data management. The post is based on a post called Six Data Management Predictions for 2011 by Steve Sarsfield of Talend. Talend is an open source vendor within data integration, data quality and master data management.

One of the six predictions for 2011 is: Data will become more open.

Steve’s (open source based) take on this is:

“In the old days good quality reference data was an asset kept in the corporate lockbox. If you had a good reference table for common misspellings of parts, cities, or names for example, the mind set was to keep it close and away from falling into the wrong hands.  The data might have been sold for profit or simply not available.  Today, there really is no “wrong hands”.  Governments and corporations alike are seeing the societal benefits of sharing information. More reference data is there for the taking on the internet from sites like data.gov and geonames.org.  That trend will continue in 2011.  Perhaps we’ll even see some of the bigger players make announcements as to the availability of their data. Are you listening Google?”

Clarke’s (proprietary software based) take is as follows:

“As data becomes more open, data quality tools will need to be able to handle data from a greater number of sources used for a broader number of purposes.  Gone are the days of single domain data manipulation.  To excel in this new, open market, you’ll need a data quality tool that can profile, cleanse and monitor data regardless of domain, that is also locale-aware and has pre-built rules and reference data.”

I agree with both views, which by the way each sit on one of The Two Sides To The IT Coin – Data Centric IT vs Process Centric IT, as explained by Robin Bloor in another recent post on the blog of data integration vendor Pervasive Software.

Steve’s and Clarke’s perspectives are also close to me, as my 2011 to-do list includes:

  • Involvement in a solution called iDQ (instant Data Quality). The solution is about how we can help system users doing data entry by adding some easy-to-use technology that explores the cloud for relevant data related to the entry being done.
  • Helping to enhance a hot MDM hub solution with further data quality and multi-domain capabilities.

Diversity in Data Quality in 2010

Diversity in data quality is a favorite topic of mine and diversity has been my theme word in social media engagement this year.

Fortunately I’m not alone. Others have been writing about diversity in data quality in the past year. Here are some of the contributions I remember:

The Dutch data quality tool vendor Human Inference has a blog called Data Value Talk. Here several posts are about diversity in data quality including the post World Languages Day – Linguistic diversity rules in Switserland!

Another blog based in the Netherlands is from Graham Rhind. Graham (a Brit stranded in Amsterdam) is an expert in international issues with data quality and one of his blog posts this year is called Robert the Carrot.

The MDM Vendor IBM Initiate has a lively blog about Master Data Management and Data Quality. One of the posts this year was an introduction to a webinar. The post by Scott Schumacher (in which I’m proud to be mentioned) is called Join Us to Demystify Multi-Cultural Name Matching.

Rich Murnane posted a funny but instructive video with Derek Sivers about Japanese addresses called What is the name of that block? (Again, thanks Rich for the mention).

In the eLearningCurve free webinar series there was a very educational session with Kathy Hunter called Overcoming the Challenges of Global Data.  There is also an interview with Kathy Hunter on the DataQualityPro site.

I also remember we debated the state of the art of data quality tools when it comes to international data in the post by Jim Harris called OOBE-DQ, Where Are You? As Jim mentions in his later post called Do you believe in Magic (Quadrants)?: “It must be noted that many vendors (including the “market leaders”) continue to struggle with their International OOBE-DQ”.

I guess that international capabilities in data quality tools and party master data management solutions will be on the agenda in 2011 as well.
