Extreme Data Quality

This post is inspired by a blog post called Extreme Data by Mike Pilcher. Mike is COO at SAND, a leading provider of columnar database technology.

The post centers on a Gartner approach to extreme data. While the concept of “Big Data” focuses on the volume of data, the concept of “Extreme Data” also takes into account the velocity and the variety of data.

So how do we handle data quality when data is extreme: of great variety, moving at high velocity and arriving in huge volumes? Will we be able to chase down every root cause of possible poor data quality in extreme data and prevent the issues upstream, or will we have to accept the reality of cleansing data downstream at the time of consumption?

We might add the rise of extreme data as a sixth reason to the current Top 5 Reasons for Downstream Cleansing.
