Referrers

In the post Picture This I wrote about how search terms are one way people get to my blog.

Another way is being referred from other sources. WordPress, my blog service, recently improved its statistics so that referring sources are consolidated, which gives much more meaningful information about your referrers.

My current all-time statistics look like this:

At the time the total number of pageviews was 46,263.

LinkedIn seems to be my main supplier of readers. I am regularly sharing my posts as status updates and as news items in different LinkedIn groups.

But I do think the figures for Twitter are misleading, as they are counted based on where the tweets and re-tweets are read. Twitter is probably only the Twitter site. Hootsuite is another way of reading and clicking on links to a blog in a tweet. People who read and click via TweetDeck are, as I understand it, not counted as a referring source, as TweetDeck is a desktop application.

Though I write in English, I do from time to time post user blogs and comments with links on Danish-language sources such as the local Computerworld and another online IT news site called Version2.

When someone, who in my case is mainly Rich Murnane I think, StumblesUpon a blog post, you sometimes get a lot of pageviews within an hour or so.

Besides that, Jim Harris’s blog, OCDQ Blog, is a constant source of referrals, either due to Jim’s kind links to my blog posts or my self-promoting links in my comments on Jim’s blog posts.
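As a rough illustration of what consolidating referrers means, here is a minimal Python sketch. It is not how WordPress does it; the `SOURCES` mapping, the function names and the example URLs are all assumptions made up for this example. It also shows why clicks from a desktop application end up uncounted: they arrive without any referrer at all.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical mapping from a domain to a consolidated source label.
SOURCES = {
    "linkedin.com": "LinkedIn",
    "twitter.com": "Twitter",
    "hootsuite.com": "Hootsuite",
    "stumbleupon.com": "StumbleUpon",
}


def consolidate(referrer_url):
    """Return the consolidated source label for one referrer URL."""
    host = (urlparse(referrer_url).hostname or "").lower()
    # Match the domain itself or any of its subdomains (www., m., ...).
    for domain, label in SOURCES.items():
        if host == domain or host.endswith("." + domain):
            return label
    # Unknown hosts are kept as they are; a missing referrer
    # (for example a click from a desktop client) has no host.
    return host or "(unknown)"


def count_sources(referrer_urls):
    """Count pageviews per consolidated source."""
    return Counter(consolidate(url) for url in referrer_urls)


if __name__ == "__main__":
    # Made-up referrer URLs, only for illustration.
    log = [
        "http://www.linkedin.com/groups/example",
        "http://linkedin.com/news/example",
        "http://twitter.com/example/status/1",
        "",
    ]
    print(count_sources(log))
```

With a mapping like this, several raw referrer URLs from the same service collapse into one line in the statistics.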

Free and Open Sources of Reference Data

This Monday I mingled in a tweetjam organized by the open source data integration vendor Talend.

One of the questions discussed was: Are free and open sources of reference data becoming more important in your projects?

When talking “free and open”, not least in the open source realm, we can’t avoid talking about “free for a fee”. Some sources of open data like Geonames are free as in “free beer”. Other data comes with a fee. In my home country Denmark we have had some discussions about the reasoning behind the government putting a fee on data that is mandatorily collected, and I have observed similar considerations in our close neighbor country Sweden. (By the way: the picture of a bridge that Talend uses a lot, for instance at the top of its home page, looks like the bridge between Denmark and Sweden.)

One challenge I have met over and over again in using free (maybe for a fee) and open data in data integration and data quality improvement is the cost of conformity. When using open government data there may, apart from the pricing, be a lot of differences between the countries in formats, coverage and so on. I think there is a great potential in delivering conformed data from many different sources for specific purposes.

Script Systems

This Friday my blog post was called Follow Friday Diversity. Hoping for more evenly distributed worldwide interaction, I wonder whether writing in English with Roman (Latin) characters is enough.

Take a look at the diversity in script systems around the world:

Alphabets

In an alphabet, each letter corresponds to a sound. These are also referred to as phonographic scripts. Examples of Alphabets: Roman (Latin); Cyrillic; Greek

Abjads

Abjads consist exclusively of consonants. Vowels are omitted from most words, because they are obvious for native speakers, and are simply inserted when speaking. In addition, Abjads are normally written from right to left. Examples of Abjads: Hebrew; Arabic

Abugidas

Abugidas are characteristic for scripts in India and Ethiopia. In this style, only the consonants are normally written, and standard vowels are assumed. If a different vowel is required, it is indicated with a special mark. Abugidas form an intermediate level between alphabetic and syllabic scripts. Examples of Abugidas: Hindi (Devanagari); Singhalese

Syllabic Scripts

Like alphabets, syllabic scripts are another type of phonographic script. In a syllabic script, each character stands for a syllable. Examples of Syllabic Scripts: Japanese (Hiragana, Katakana); Cherokee

Symbol Scripts

In symbolic scripts, each character is an ideogram standing for a complete word. Compound terms or concepts are composed of multiple symbols. Symbolic scripts are also called logographic scripts. Examples of Symbolic Scripts: Chinese; Japanese (Kanji)

Source: Worldmatch® Comparing International Data by Omikron Data Quality – full version here.
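When working with international data it can help to know which kind of script a given value is written in. The following Python sketch is my own illustration and not part of the Omikron material. It guesses the script from the first word of each character's Unicode name, which is a simple heuristic and not the official Unicode Script property (the Python standard library does not expose that). The `SCRIPT_TYPES` table and the function names are assumptions for this example.

```python
import unicodedata
from collections import Counter

# Script type per Unicode name prefix, following the grouping in the text above.
SCRIPT_TYPES = {
    "LATIN": "alphabet",
    "CYRILLIC": "alphabet",
    "GREEK": "alphabet",
    "HEBREW": "abjad",
    "ARABIC": "abjad",
    "DEVANAGARI": "abugida",
    "SINHALA": "abugida",
    "ETHIOPIC": "abugida",
    "HIRAGANA": "syllabic",
    "KATAKANA": "syllabic",
    "CHEROKEE": "syllabic",
    "CJK": "logographic",
}


def script_of(char):
    """Return the script prefix of one character, or None if it is not in the table."""
    name = unicodedata.name(char, "")
    prefix = name.split(" ")[0] if name else ""
    return prefix if prefix in SCRIPT_TYPES else None


def dominant_script(text):
    """Return (script, script type) for the most frequent known script in text."""
    counts = Counter(s for s in map(script_of, text) if s)
    if not counts:
        return None
    script = counts.most_common(1)[0][0]
    return script, SCRIPT_TYPES[script]


if __name__ == "__main__":
    for sample in ["Copenhagen", "Москва", "שלום", "हिन्दी", "ひらがな", "中文"]:
        print(sample, dominant_script(sample))
```

Digits, punctuation and spaces are ignored. Mixed text, such as Japanese combining Kanji with Hiragana, is reported by its most frequent script only, so a real solution would need something more fine-grained.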

Follow Friday Diversity

Every Friday on Twitter people are recommending other tweeps to follow using the #FollowFriday (or simply #FF) hashtag.

So do I.

Below please find my follow Friday recommendations grouped by global region:

 

Canada: @carrni @datamartist @sheezaredhead @andrewsinfotech @aniagl @DQamateur @bivcons @projmgr @DQStudent @datachick

United States: @GarnieBolling @stevesarsfield @UtopiaInc @bbreidenbach @fionamacd @RobertsPaige @BIMarcom @IDResolution @FirstSanFranMDM @dan_power @merv @NISSSAMSI @jilldyche @howarddresner @GartnerTedF @RobPaller @marc_hurst @dcervo @datamentors @VishAgashe @IBMInitiate @RamonChen @JackieMRoberts @philsimon @Nick_Giuliano @DataInfoCom @juliebhunt @Futureratti @dqchronicle @jonrcrowell @elc @Experian_QAS @paulboal @im4infomgt @WinstonChen @ocdqblog @KeithMesser @murnane @BrendaSomich @alanmstein @JGoldfed @jaimefitzgerald @tedlouie @bslarkin

Venezuela: @pigbar

Ireland: @daraghobrien @KenOConnorData @MapMyBusiness; United Kingdom: @SteveTuck @VeeMediaFactory @mktginsightguy @Daryl70 @Teresacottam @AnishRaivadera @ExperianQAS_UK @DataQualityPro @SarahBurnett @faropress @jschwa1 @mikeferguson1 @jtonline @Master_OBASHI @Nicola_Askham; France: @DataChannel @mydatanews @jmichel_franco @ydemontcheuil; Switzerland: @alexej_freund @openmethodology; Austria: @omathurin; Germany: @stiebke @dwhp @dakoller @marketingBOERSE; Belgium: @guypardon; Netherlands: @harri00413 @GrahamRhind; Denmark: @jeric40 @eobjects @StiboSystems; Norway: @Orvei; Sweden: @MrPerOlsson @DarioBezzina; Finland: @JoukoSalonen; Lithuania: @googlea; Italy: @Stray__Cat

Algeria: @aboussaidi; South Africa: @MarkGStacey

Pakistan: @monisiqbal; India: @MDMAnswers @twitrvenky @ashwinmaslekar; Indonesia: @VaiaTweets

Australia: @emx5 @vmcburney; New Zealand: @JohnIMM @Intelligentform

It’s my hope that in the future I will be able to interact even more diversely.

Can Anybody Hear Me?

Blogging and evangelizing about data quality is a fairly lonely trade.

Hopefully that is not because it is a bad cause, and I don’t think it is. This week I also followed another good cause that is not getting much attention.

After the disaster with the oil spill in the Gulf of Mexico, Greenpeace launched an operation aimed at drawing attention to the probably even more dangerous deep water drilling in the fragile Arctic environment.

The ship Esperanza sailed to Baffin Bay and launched inflatables with 4 climbers, who hung under an oil rig for 40 hours in the blistering cold wind while practically no one cared.

Oh yes, there was live tweeting from the ship on the @gp_espy account – followed by 1,500 tweeps world-wide, including yours truly.

Surely, a few articles were written by the press – mainly in Britain, where the drilling company Cairn Energy is based, and in Denmark, because the waters belong to Greenland/Kingdom of Denmark.

But I guess Greenpeace must be pretty disappointed with the overall attention. I guess they chose the wrong right place (platform you might say). Not much press in the Baffin Bay.   

And hey, I guess I chose the wrong time for publishing this post (based on my reader demographics as I know them). No one is online in the Pacific now, it’s early Saturday morning in Europe and it’s the night before a 3-day weekend in the United States.

Out of Facebook

Some time ago it was announced that Facebook signed up member number 500,000,000.

If you are working with customer data management you will know that this doesn’t mean that 500,000,000 distinct individuals are using Facebook. Like any customer table the Facebook member table will suffer from a number of different data quality issues like:

  • Some individuals are signed up more than once using different profiles.
  • Some profiles are not an individual person, but a company or other form of establishment.
  • Some individuals who created a profile are not among us anymore.

Nevertheless the Facebook member table is a formidable collection of external reference data representing the real world objects that many companies are trying to master when doing business-2-consumer activities.

For those companies who are doing business-2-business activities a similar representation of real world objects will be the +70,000,000 profiles on LinkedIn plus profiles in other social business networks around the world which may act as external reference data for the business contacts in the master data hubs, CRM systems and so on.

Customer Master Data sources will expand to embrace:

  • Traditional data entry from field work like a sales representative entering prospect and customer master data as part of Sales Force Automation.
  • Data feed and data integration with traditional external reference data like using a business directory. Such integration will increasingly take place in the cloud and the trend of governments releasing public sector data will add tremendously to this activity.
  • Self registration by prospects and customers via webforms.
  • Social media master data captured during social CRM and probably harvested in more and more structured ways as a new wave of exploiting external reference data.

Doing “Social Master Data Management” will become an integrated part of customer master data management offering both opportunities for approaching a “single version of the truth” and some challenges in doing so.

Of course privacy is a big issue. Norms vary between countries, so do the legal rules. Norms vary between individuals and by the individuals as a private person and a business contact. Norms vary between industries and from company to company.

But the fact that 500,000,000 profiles have been created on Facebook in a very few years by people from all over the world shows that people are willing to share and that much information can be collected in the cloud. However, no one wants to be spammed as a result of sharing, and indeed there have been some controversies around how data in Facebook is handled.

Anyway, I have no doubt that we will see fewer data entry clerks entering the same information in each company’s separate customer tables and that we will increasingly share our own master data attributes in the cloud.

Follow Friday Data Quality

Every Friday on Twitter people are recommending other tweeps to follow using the #FollowFriday (or simply #FF) hash tag.

My username on twitter is @hlsdk.

Sometimes I notice tweeps I follow are recommending the username @hldsk or @hsldk or other usernames with my five letters swapped.

Maybe they meant me but misspelled the username. Or maybe they meant someone else with a username close to mine?

As the other usernames weren’t taken I have taken the liberty to create some duplicate (shame on me) profiles and have a bit of (nerdish) fun with it:
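For the nerdish among us: swapped neighboring letters are a classic kind of typo, and the optimal string alignment distance (a restricted form of Damerau-Levenshtein distance) counts such a swap as a single edit. Here is a minimal Python sketch of that measure. The function names and the threshold of 1 are my own choices for illustration.

```python
def osa_distance(a, b):
    """Optimal string alignment distance: insertions, deletions,
    substitutions and swaps of two adjacent characters each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Two adjacent characters swapped.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def near_misses(username, candidates, max_distance=1):
    """Return candidates that differ from username by 1..max_distance edits."""
    return [c for c in candidates if 0 < osa_distance(username, c) <= max_distance]


if __name__ == "__main__":
    print(near_misses("hlsdk", ["hldsk", "hsldk", "murnane", "hlsdk"]))
```

Both @hldsk and @hsldk are exactly one adjacent swap away from @hlsdk, so a check like this would flag them as likely misspellings rather than unrelated usernames.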

@hsldk

For this profile I have chosen an image of the Swedish Chef from The Muppet Show. To make the Swedish connection real, the location on the profile is set as “Oresund Region”, which is the binational metropolitan area around the Danish capital Copenhagen and the 3rd largest Swedish city Malmoe, as explained in the post The Perfect Wrong Answer.

@hldsk

For this profile I have chosen an image of a gorilla originally used in the post Gorilla Data Quality.

This Friday @hldsk was recommended thrice.

But I think only by two real life individuals: Joanne Wright from Vee Media and Phil Simon who also tweets as his new (one-man-band I guess) publishing company.

What’s the point?

Well, one of my main activities in business is hunting duplicates in party master databases.

What I sometimes find is that duplicates (several rows representing the same real world entity) have been entered for a good reason in order to fulfill the immediate purpose of use.

The thing with Phil and his one-man-band company is explained further in the post So, What About SOHO Homes.

By the way, Phil is going to publish a book called The New Small. It’s about: How a New Breed of Small Businesses is Harnessing the Power of Emerging Technologies.

Social Master Data Management

The term “Social CRM” has been around for a while. Just as traditional CRM (Customer Relationship Management) is heavily dependent on proper MDM (Master Data Management), enterprise-wide social CRM will depend on a proper social MDM element in order to be a success.

The challenge in social MDM will be that we are not going to replace some data sources for MDM, but we are actually going to add some more sources and handle the integration of these sources with the sources for traditional CRM and MDM and other new sources coming from the cloud.

Customer Master Data sources will expand to embrace:

  • Traditional data entry from field work like a sales representative entering prospect and customer master data as part of Sales Force Automation.
  • Data feed and data integration with external reference data like using a business directory. Such integration will increasingly take place in the cloud and the trend of governments releasing public sector data will add tremendously to this activity.
  • Self registration by prospects and customers via webforms.
  • Social media master data captured during social CRM and probably harvested in more and more structured ways.

Social media master data are found as profiles in services such as Facebook, mainly for business-to-consumer activities, LinkedIn, mainly for business-to-business activities, and Twitter somewhere in between. These are only some prominent examples of such services. While LinkedIn may be dominant for professional use in English-speaking countries and in countries where English is widely spoken, such as Scandinavia and the Netherlands, other regions are far less penetrated by LinkedIn. For example, in German-speaking countries the similar network service called Xing is much more crowded. So, when embracing global business you will have to acknowledge the diversity found in social network services.

A good way to integrate all these sources in business processes is using mashups. An example will be a mashup for entering customer data. If you are entering a business entity you may want to know:

  • What is already known in internal databases about that entity – either via a centralized MDM hub or throughout disparate databases?
  • Is the visit address correct according to public sector data?
  • How is the business account related to other business entities learned from a business directory?
  • Do we recognize the business contact in social networks – maybe we did have contact before in another relation?

If you are entering a consumer entity you may want to know:

  • Does that person already exist in our internal databases – as an individual and as a household?
  • What do we know about the residence address from public sector data?
  • Can we obtain additional data from phone book directories, nixie lists and whatever else is available, affordable and legal in the country in question?
  • How do we connect in social media?
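To make the mashup idea a bit more concrete, here is a minimal Python sketch of a data entry lookup that asks several sources about the same entity and collects the answers side by side. Everything in it is hypothetical: the source names, the stand-in lookup functions and the sample data only illustrate the pattern, not any real service.

```python
# Hypothetical in-memory stand-ins for real services.
INTERNAL_CUSTOMERS = {"acme ltd": {"customer_id": 42}}
ADDRESS_REGISTER = {"1 Example Street": True}


def internal_lookup(entity):
    """What do we already know in internal databases?"""
    return INTERNAL_CUSTOMERS.get(entity["name"].lower())


def address_check(entity):
    """Is the address known in (a stand-in for) public sector data?"""
    return {"address_valid": ADDRESS_REGISTER.get(entity["address"], False)}


def social_lookup(entity):
    """Stand-in for a social network lookup that is currently failing."""
    raise ConnectionError("social network API unavailable")


def mashup_lookup(entity, sources):
    """Ask every source about the entity and collect the answers by source name."""
    results = {}
    for name, lookup in sources.items():
        try:
            results[name] = lookup(entity)
        except Exception as error:
            # One failing source should not block the data entry.
            results[name] = {"error": str(error)}
    return results


if __name__ == "__main__":
    entity = {"name": "Acme Ltd", "address": "1 Example Street"}
    sources = {
        "internal": internal_lookup,
        "address": address_check,
        "social": social_lookup,
    }
    print(mashup_lookup(entity, sources))
```

The point of the design is that each source stays independent: a new source, such as a business directory, is just another entry in the `sources` dictionary.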

Of course privacy is a big issue. Norms vary between countries, so do the legal rules. Norms vary between individuals and by the individuals as a private person and a business contact. Norms vary between industries and from company to company.

If aligning people, processes and technology didn’t matter before, it will when dealing with social master data management.

What’s in a Blog Post Title?

I don’t know about you. But I am a slave to numbers and statistics and can’t help following my WordPress statistics telling me about pageviews – not least pageviews per post.

There are huge differences in the number of visitors who view the different posts. The post with the highest number of views on my blog has more than 2,500 views and the post with the lowest number has only 15 views.

To be honest, the ones with over 500 views are mainly visited due to some image search circumstances explained here, so views actually related to data quality vary between 15 and approximately 500. That’s still a huge difference.

I have still to find out precisely what makes the difference.

It can’t be the content, can it? Basically people don’t know the content before opening.

No doubt the time of posting matters, not to mention the time of announcing the post on sites such as Twitter and LinkedIn. On Twitter the re-tweet action is important, I have noticed. And of course re-tweet action relies on timing and on the first readers finding the content worth a re-tweet.

There is surely also a relation between the number of comments and the number of views. I see that in my numbers.

Obviously the title of the blog post must be important. But from my numbers I can’t figure out how, except for the observation that a technical title seems to rule over philosophical stuff, as discussed here last year on DataQualityPro.

So, the title of this post is not the preface to explaining it all but a genuine question to you who for some reason came by: What’s in a Blog Post Title?

Why do you watch it?

Statler and Waldorf are a pair of Muppet characters. They are two ornery, disagreeable old men. Despite constantly complaining about the show and how terrible some acts were, they would always be back the following week in the best seats in the house. At the end of one episode, they looked at the camera and asked: “Why do you watch it?”.

This is a bit like blogging about data quality, isn’t it? Always describing how bad data is everywhere. Bashing executives who don’t get it. Telling about all the hard obstacles ahead. Explaining you don’t have to boil the ocean but might get success by settling for warming up a nice little drop of water.

Despite really wanting to tell a lot of success stories, being the funny Fozzie Bear on the stage, well, I am afraid I have also been spending most of my time on the balcony with Statler and Waldorf.

So, from this day forward: More success stories.

This is the start of a series of 1.3 blog posts…. No, just kidding.
