Man versus Computer

In a recent exchange on social networks, Jim Harris and Phil Simon discussed whether IT projects are more like the board game Monopoly or like Risk.

I notice that both these games are played with dice.

I remember that back in the early 80s I got some programming training by constructing a Yahtzee game on a computer. The following parts were at my disposal:

  • Platform: IBM 8100 minicomputer
  • Language: COBOL compiler
  • User Interface: Screen with 80 characters in 24 rows

As the user interface design options were limited, the exciting part became the one-player mode, where I had to teach (program) the computer which dice to save in a given situation – and base that logic on patterns rather than on every possible combination.
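
The original COBOL is long gone, but a minimal sketch of that pattern idea, assuming a simple "hold the most frequent face" rule, could look like this in a modern language:

```python
from collections import Counter

def dice_to_keep(dice):
    """Keep the most frequent face and reroll the rest, aiming for
    n-of-a-kind. A fuller strategy would add patterns for straights
    and full houses rather than enumerating every combination."""
    counts = Counter(dice)
    face, _ = counts.most_common(1)[0]
    return [d for d in dice if d == face]

print(dice_to_keep([3, 3, 5, 2, 3]))  # -> [3, 3, 3]
```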

While having some other people test the man-versus-computer play in the one-player mode, I found out that I could actually construct a compact program that in the long run won more rounds than (ordinary) people did.

Now, what about games without dice? Here we know that there has been a development even around chess, where the computer is now the better player compared to any human.

So, what about data quality? Is it man or computer who is best at solving the matter? A blog post from Robert Barker called “Avoiding False Positives: Analytics or Humans?” offers a view on that question.

Also, seen from a time and cost perspective, the computer does have some advantages compared to humans.

But still we need humans to select which game is to be played. Throw the dice…


8 thoughts on “Man versus Computer”

  1. Jim Harris 15th October 2009 / 15:58

Excellent post, Henrik,

    In data quality, I definitely vote for “Man” over “Computer.”

    Risking (pardon the pun) the mixture of metaphors, I have blogged about how “There are no Magic Beans for Data Quality”:

    http://www.ocdqblog.com/home/there-are-no-magic-beans-for-data-quality.html

    And in Phil Simon’s recent post “Kranzberg’s Six Laws of Technology”:

    http://philsimonblog.com/2009/09/30/kranzberg_six/

On Law #6, “Technology is a very human activity,” I commented that many talk about how “people, process, technology” are all important for successful initiatives, but that without people, process and technology are useless.

    Although incredible advancements continue, technology alone cannot provide the solution.

    Best Regards…

    Jim

  2. Vish Agashe 15th October 2009 / 23:30

    Henrik,

Excellent post as always. I would say that it is the human who makes the ultimate decision, but as you say in the article … we have to make wise decisions about using computers/technology wherever appropriate to reduce the cost and to increase the scalability of humans. Use humans to make the final/critical decisions, but let the machine do most of the grunt work based on patterns and algorithms.

    Regards

    Vish

  3. Francisco Correia 18th October 2009 / 23:08

In Portuguese, dice and data share the same word: dados. But dice are more trustworthy …

  4. Henrik Liliendahl Sørensen 19th October 2009 / 06:25

    Jim, thanks. Your post about Magic Beans – and its commendable comments – is really worth reading. Perhaps we should have a law that every data quality vendor offering must have this blog post text included. A bit – but not completely – like warnings on cigarettes.

Thanks Vish, your perception is so close to mine. My mission is to make and configure technology that accurately helps people with the hard and repetitive work in data quality and makes room for people to take on greater challenges.

    Francisco – what a quote!! So lucky I now have that one on my blog. Made my day. Thanks.

  5. kenoconnordataconsultant 19th October 2009 / 13:18

    Henrik,

This is an important debate, thank you for starting it. Jim’s Magic Beans post, and the comments, are very informative.

Like you, I agree with Vish Agashe: “Use humans to make the final/critical decision but let the machine do most of the grunt work based on patterns and algorithms.”

I have worked on the development of Anti-Money Laundering (AML) systems. AML systems perform Financial Transaction Monitoring. They could not function without analytics. They monitor Transaction Activity on millions of accounts. The purpose of the analytics is to identify “Transaction Activity that is unusual when compared to an account holder’s peers”. The AML system alerts a human to study the unusual transaction activity. The human then seeks to “explain away” the unusual activity as ‘normal’, e.g. a once-off sale of an asset. If the human cannot find a good reason for the unusual transaction activity, they report it to the authorities as “Suspicious”.
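
    As a much simplified illustration of the peer comparison at the heart of such monitoring (the three-sigma test and the amounts below are assumptions for the sketch, not any real AML product’s logic):

```python
import statistics

def is_unusual(amount, peer_amounts, sigmas=3.0):
    """Flag an amount that deviates from the peer group's mean by
    more than `sigmas` standard deviations. Purely illustrative."""
    mean = statistics.mean(peer_amounts)
    stdev = statistics.stdev(peer_amounts)
    return abs(amount - mean) > sigmas * stdev

peers = [900, 1100, 950, 1200, 1000, 1050]
print(is_unusual(50_000, peers))  # -> True: alert a human to review
print(is_unusual(1_020, peers))   # -> False: normal peer behaviour
```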

    In my opinion, AML systems provide a good example of the pragmatic combining of analytics and humans – for the good of society.

    Having said the above, the best AML system in the world cannot provide meaningful AML alerts if the quality of the underlying data does not permit the identification of peer groups (for example).

This brings us right back to your question: “So, what about data quality? Is it man or computer who is best at solving the matter?”

    I believe we face two distinct data quality challenges:
1. How to “stop the rot” to prevent more Garbage (poor quality data) entering systems.
    2. How to clean up the existing Garbage.

Both solutions need to apply the same ‘business rules’. I believe Man is best at devising and proving the solutions. Once tried and proven, Computer is best at allowing the solution to be applied at scale.
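
    A hypothetical sketch of that division of labour: one rule definition, applied both at the point of entry and in batch cleanup of the existing inventory (the country-code rule below is invented for illustration):

```python
# One rule definition, two applications: "stop the rot" at entry
# and "clean up the garbage" across the existing inventory.
VALID_COUNTRIES = {"DK", "SE", "NO", "DE", "GB"}  # illustrative rule

def is_valid(record):
    return record.get("country") in VALID_COUNTRIES

def validate_on_entry(record):
    """Prevention: reject poor quality data before it enters the system."""
    if not is_valid(record):
        raise ValueError(f"Rejected at entry: {record}")
    return record

def find_garbage(existing_records):
    """Cleanup: apply the same rule at scale to the current inventory."""
    return [r for r in existing_records if not is_valid(r)]

print(find_garbage([{"country": "DK"}, {"country": "XX"}]))  # -> [{'country': 'XX'}]
```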

    Rgds Ken

  6. Henrik Liliendahl Sørensen 19th October 2009 / 14:30

Thanks Ken. When you talk about large volumes of transaction data that have to be analyzed, identified and measured based on master data, I am working with exactly the same kind of challenges in a project within public transportation. Solving the data quality issues is needed before any meaningful decisions can be made upon these data. This couldn’t be done without computers doing the hard work.

My approach in such a project is not for man to settle the rules and controls first and then apply the technology. It’s an iterative process where we start with the known main pain, put the computer to work, evaluate the results, consider the options, probably feed some more knowledge into the computer, and then go around the circle once again.

  7. kenoconnordataconsultant 20th October 2009 / 10:27

    Henrik, if I understand you correctly, your iterative process is something like:

1. Use what we know about the business rules – e.g. we expect that datafield1 should contain values X, Y, Z

    2. Perform Data Profiling to find out what datafield1 actually contains.

    3. Update the business rules to incorporate the new knowledge about ‘valid exceptions’ etc.

    Repeat
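
    In Python-ish terms, one pass of that loop might look something like this (datafield1 and the X, Y, Z values follow the example above; the records are invented for the sketch):

```python
from collections import Counter

def profile_field(records, field, expected):
    """Step 2: compare what the field actually contains against the
    expected value set and surface the surprises for human review."""
    actual = Counter(r.get(field) for r in records)
    return {value: n for value, n in actual.items() if value not in expected}

# Step 1: we expect datafield1 to contain X, Y or Z.
expected = {"X", "Y", "Z"}
records = [{"datafield1": v} for v in ("X", "Y", "Q", "Z", "Q", None)]
print(profile_field(records, "datafield1", expected))  # -> {'Q': 2, None: 1}
# Step 3: a human decides whether 'Q' is a valid exception; if so,
# add it to `expected` and repeat on the next iteration.
```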

    Rgds Ken

  8. Henrik Liliendahl Sørensen 20th October 2009 / 15:34

    Ken, yes – and

    • The tasks (in further iterations) you will have the computer doing also include standardization, correction, matching, linking and enriching
    • You may use the computer for settling the obvious situations (based on business rules) and flagging the dubious ones for human intervention (see the sketch after this list)

    The iteration goes for:

    • Batch processing of the initial data inventory, where you discover the bulk of the challenges
    • Ongoing prevention, where you may act on new challenges
    • Often, newly invented rules may require batch reprocessing of the current data inventory
    • Even new technology (or reference data) may be applied to issues not solvable before
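
    As a sketch of the settle-or-flag idea from the list above (the match scores and thresholds are illustrative assumptions, not values from any particular tool):

```python
def route_match(score, auto_threshold=0.95, review_threshold=0.70):
    """Settle the obvious automatically; flag the dubious middle band
    for human intervention. Thresholds are illustrative only."""
    if score >= auto_threshold:
        return "auto-merge"
    if score >= review_threshold:
        return "human review"
    return "auto-reject"

for score in (0.98, 0.80, 0.40):
    print(score, route_match(score))  # -> auto-merge, human review, auto-reject
```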
