I guess every data and information quality professional agrees that, when fighting bad data, upstream prevention is better than downstream cleansing.
Nevertheless, most of the work in fighting bad data quality is done as downstream cleansing. Not least, data quality tools are mostly deployed downstream, where they outperform manual work in heavy-duty data profiling and data matching, as explained in the post Data Quality Tools Revealed.
In my experience the top 5 reasons for doing downstream cleansing are:
1) Upstream prevention wasn’t done
This is an obvious one. By the time you decide to do something about bad data quality the right way (finding the root causes, improving business processes, affecting people’s attitudes, building a data quality firewall and all that jazz), you still have to do something about the bad data already in the databases.
2) New purposes show up
Data quality is said to be about data being fit for purpose and meeting the business requirements. But new purposes will show up, and new requirements will have to be met in an ever-changing business environment. Therefore, you will have to deal with Unpredictable Inaccuracy.
3) Dealing with externally born data
Upstream isn’t necessarily within your company, as data in many cases is entered Outside Your Jurisdiction.
4) A merger/acquisition strikes
When data from two organizations that have had different requirements and different levels of data governance maturity is to be merged, something has to be done. Some of the challenges are explained in the post Merging Customer Master Data.
5) Migration happens
Moving data from an old system to a new system is a good chance to do something about poor data quality and start all over the right way. Oftentimes you can’t even migrate some of the data without improving its quality first. You only have to figure out when to cleanse in data migration.