What does Twitter Know?

We all know the pain of receiving e-mails with offers that are totally beside what you need.

Now Twitter has joined this spamming habit, which is a bit surprising. With all the talk about big data and what it can do for prospect and customer insight, you would think that Twitter knows something about you.

Well, apparently not.

I operate two Twitter accounts. One, named @hlsdk, is used for my general interaction with the data management community, and one, named @ProductDataLake, is used for a start-up service called Product Data Lake.

For both accounts, I am flooded with e-mails from Twitter about increasing my Holiday sales by using their ad services.


Strange, because:

  • My business is not Business-to-Consumer (B2C), which is about selling stuff to consumers and where the coming season is a high peak in the Western World. My business is Business-to-Business (B2B), where the coming season is a standstill for sales in the Western World.
  • In my part of the Western World we don’t use the term Holidays for the coming season. We (still) call it Christmas as told in the post Is the Holiday Season called Christmas Time or Yuletide?
  • In my home country, Denmark, you are not allowed to e-mail offers to businesses unless they have actually asked for them. Not sure if Twitter is on the right side of the law here.

Winning by Sharing Data

When I changed my laptop a few months ago, it was the easiest migration to a new computer ever.

Basically, I just had to connect to all the cloud services I had been using before. For many services, the path was to get connected to Google+, Twitter and Facebook and then connect to many other services via these connections.

ShareThis was a personal win.

Most of the teams I am working with are sharing their data with me in the cloud. Unlike in the bad old days, I do not have to call and ask for progress on this and that. I can check the status myself and even get notifications on my phablet when a colleague completes a task.

ShareThis is a shared win.

Within my profession, data quality improvement and Master Data Management (MDM), sharing data is going to be a winning path too, as told in the post Sharing is the Future of MDM.

There are several ways of sharing master data, like using commercial third-party data, digging into open government data, having your own data locker and relying on social collaboration. These options are examined in the post Ways of Sharing Master Data.


Identity Resolution and Social Data


Identity resolution is a hot potato when we look into how we can exploit big data and, within that frame, not least social data.

Some of the most frequently mentioned use cases for big data analytics revolve around listening to social data streams and combining that with traditional sources within customer intelligence. In order to do that, we need to know who is talking out there, and that must be done by using identity resolution features encompassing social networks.

The first challenge is what we are able to do: how we can technically expand our data matching capabilities to use profile data and other clues from social media. This subject was discussed in a recent post on DataQualityPro called How to Exploit Big Data and Maintain Data Quality, interview with Dave Borean of InfoTrellis. In it, Dave mentioned the InfoTrellis “contextual entity resolution” approach.

The second challenge is what we are allowed to do. Social networks have a natural interest in protecting members’ privacy, and they also have a commercial interest in doing so. The degree of privacy protection varies between social networks. Twitter is quite open but, on the other hand, holds very little that is usable for identity resolution, and making sense of the streams is an issue as well. Networks such as Facebook and LinkedIn are, for good reasons, not so easy to exploit due to the (changing) game rules applied.

As said in my interview on DataQualityPro called What are the Benefits of Social MDM: It is a kind of a goldmine in a minefield.


Please Retweet

Many moons ago I wondered how my social influence is measured as told in the post Klout Data Quality.

Since then my Klout has dropped a bit from 59 to 57. It does not ruin my day, but I wonder why. A thing that strikes me is where I get my Klout from. It seems Twitter is the place, as it accounts for 73 % of my Klout. LinkedIn is only 8 %. Personally, I would give them the opposite importance.


Recently I noticed I was included in a list called Top 200 Thought Leaders in Bigdata Analytics. Honorable maybe. However, I am afraid it merely is a count of how many #Bigdata tags I have used on Twitter relative to others.

What matters to me in social influence, namely readers and comments on this blog, seems to be out of scope for Klout.

What about you? Do you have the right Klout? Is it measured the right way?


Everyday Year 2000 Problems

14 years ago these were busy times for computer professionals, including yours truly, because of the upcoming year 2000 apocalypse. The handling of the problem indeed had elements of hysteria, but all in all it was a joint effort by heaps of IT people in meeting a non-postponable deadline around fixing date fields that were too short.

Data entry and data storage fields that are too short, have an inadequate format or are missing are frequent data quality issues. Some everyday issues are:

Too short name fields

Names can be very long. But even a moderately lengthy name such as Henrik Liliendahl Sørensen can be a problem here and there. Not least on Twitter, where the 20-character name field corresponds very well to the 140-character message length and forces many of us to shorten our name. I found a workaround from a fellow Sørensen in the post Getting around the real name length limit in Twitter. Not sure if I’m prepared to take the risk.
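
A minimal sketch of how such a limit can be checked at data entry. The function names are hypothetical, and the 20-character limit is simply the Twitter name field length mentioned above:

```python
def fits_field(value: str, max_chars: int) -> bool:
    """Return True if the value fits in a field of max_chars characters."""
    return len(value) <= max_chars


def overflow(value: str, max_chars: int) -> int:
    """Return how many characters would have to be cut to fit the field."""
    return max(0, len(value) - max_chars)


full_name = "Henrik Liliendahl Sørensen"
print(fits_field(full_name, 20))  # the full name does not fit
print(overflow(full_name, 20))    # number of characters too many
```

Counting characters rather than bytes is a deliberate choice here, since a letter like ø takes more than one byte in UTF-8.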

Too short and restricted postal code fields

When working with IT solutions in Denmark, you see a lot of postal code fields defined as 4 digits. That works fine with Danish addresses but is a real showstopper when you deal with neighboring Swedish and German 5-digit postal codes, and not least postal codes with letters from the Netherlands and the United Kingdom and most other postal codes from around the world.
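
The difference between a 4-digit field and a country-aware check can be sketched as follows. This is only an illustration under simplifying assumptions: the patterns are deliberately simplified, the function names are hypothetical, and countries without a pattern (the United Kingdom in this sketch) are accepted as long as something is filled in:

```python
import re

# Simplified, illustrative patterns; real postal code rules have more exceptions.
POSTAL_CODE_PATTERNS = {
    "DK": re.compile(r"\d{4}"),            # Denmark: 4 digits
    "SE": re.compile(r"\d{3} ?\d{2}"),     # Sweden: 5 digits, often written "123 45"
    "DE": re.compile(r"\d{5}"),            # Germany: 5 digits
    "NL": re.compile(r"\d{4} ?[A-Z]{2}"),  # Netherlands: 4 digits plus 2 letters
}


def fits_four_digit_field(postal_code: str) -> bool:
    """Mimic a legacy field defined as exactly 4 digits."""
    return re.fullmatch(r"\d{4}", postal_code.strip()) is not None


def is_valid_postal_code(country_code: str, postal_code: str) -> bool:
    """Check a postal code against the pattern for its country.

    Countries without a known pattern are accepted when the value is
    non-empty, so the field does not block addresses it knows nothing about.
    """
    value = postal_code.strip().upper()
    pattern = POSTAL_CODE_PATTERNS.get(country_code.strip().upper())
    if pattern is None:
        return bool(value)
    return pattern.fullmatch(value) is not None
```

With this sketch, a Danish code like 2100 passes both checks, while a German code like 10115 or a Dutch code like 1012 AB is rejected by the 4-digit field but accepted by the country-aware check.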

Missing placeholder for social identities

The rise of social media has been incredible during the last years. However, IT systems are lagging behind in support for this. Most systems don’t have a place where you can fill in a social handle. Recently James Taylor wrote the blog post Getting a handle on social MDM, in which James describes a workaround in an IBM MDM solution. Indeed, we need ways to link the old systems of record with the new systems of engagement.
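
As an illustration of what such a placeholder could look like, here is a minimal sketch of a party record with room for social handles. The class and field names are hypothetical and are not taken from any particular MDM product:

```python
from dataclasses import dataclass, field


@dataclass
class PartyRecord:
    """A hypothetical master data record with room for social identities."""

    name: str
    email: str = ""
    # Maps a network name (for example "twitter") to the handle on that network.
    social_handles: dict[str, str] = field(default_factory=dict)

    def add_handle(self, network: str, handle: str) -> None:
        """Store a handle under a lower-cased network name, without a leading '@'."""
        self.social_handles[network.strip().lower()] = handle.strip().lstrip("@")


record = PartyRecord(name="Henrik Liliendahl Sørensen")
record.add_handle("Twitter", "@hlsdk")
print(record.social_handles)
```

Keeping the handles in a mapping rather than in one column per network means a new network does not require a change to the record layout.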


Introducing the Famous Person Quote Checker

As reported in the post Crap, Damned Crap, and Big Data, there are data quality issues with big data.

The mentioned issue is about the use of quotes in social data: a famous person apparently said something apparently clever, and the one who makes an update with the quote gets an unusually large amount of likes, retweets, +1s and other forms of recognition.

But many quotes weren’t actually said by that famous person. Maybe the quote was said by someone else, and in many cases there is no evidence that the famous person said it. Some quotes, like the Einstein quote in the Crap post, actually contradict what that person apparently also has said.

I have worked a lot with data entry functionality that checks data quality: whether a certain address actually exists, whether a typed-in phone number is valid or whether an e-mail address will bounce. So I think it’s time to make a quote checker to be plugged in on LinkedIn, Twitter, Facebook, Google Plus and other social networks.

So anyone else out there who wants to join the project – or has it already been said by someone else?


On Washing Rental Cars and Shared Data

Recently a tweet from Doug Laney of Gartner has been retweeted a lot:

Rented Car

Like most analogies, it may or may not fit, depending on the perspective. Actually, rental cars are probably some of the most washed cars, as the rental company washes and cleans the car between every rental.

In the same way as rental cars usually are quite clean, I have also found that sharing data is a powerful way to have clean data, as told on the page about Data Quality 3.0. This is also the grounding concept behind the instant Data Quality solution I’m working with, where we have just released our iDQ™ MDM Edition.
