Besides being a memoir by Karen Blixen (or her literary double, Isak Dinesen), Out-of-Africa is a hypothesis about the origin of modern humans (Homo sapiens). Of course there is a competing scientific hypothesis, called the Multiregional Origin of Modern Humans. Besides that, there are of course religious beliefs.
The Out-of-Africa hypothesis suggests that modern humans emerged in Africa 150,000 years ago or so. A small group migrated to Eurasia about 60,000 years ago. Some made it across the Bering Strait to America, maybe 40,000 years ago or maybe 15,000 years ago. The Vikings said hello to the Native Americans 1,000 years ago, but cross-Atlantic movement first gained pace some 500 years ago, when Columbus discovered America yet again.
Half a year ago (or so) I wrote a blog post called Create Table Homo_Sapiens. The follow-up comments added to the nerdish angle by discussing subjects such as mutating tables versus intelligent design and MAX(GEEK) counting.
On the serious side, though, the comments also touched on the intended subject: making data models reflect real-world individuals.
Tables with persons are the most common entity type found in databases. As in the Out-of-Africa hypothesis, they could all have shared one simple, global structural origin. But that is not the way of the world. Some of the basic differences in how the person entity is modeled are:
- Cultural diversity: Names, addresses, national IDs and other basic attributes are formatted differently country by country, and to some degree within countries. Most data models with a person entity are built on the format(s) of the country where they are designed.
- Intended purpose of use: Person master data is often stored in tables made for specific purposes, like a customer table, a subscriber table, a contact table and so on. Therefore the data identifying the individual is directly linked with attributes describing a specific role of that individual.
- “Impersonal” use: Person data is often stored in the same table as other party master types, such as business entities, projects, households et cetera.
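To make the second point concrete, here is a minimal sketch of the alternative: keeping the individual in one place and attaching roles to it, instead of burying the person inside a customer or subscriber table. Everything in it is an assumption made for illustration (the `Person` and `Role` classes, their field names, the sample values); it is not taken from any particular system, and it only nods at the cultural diversity point by storing a country code next to the free-form name and address.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Person:
    """The individual, independent of any role (hypothetical model)."""
    person_id: int
    full_name: str        # stored as given; no first/last name split is assumed
    country_code: str     # e.g. "DK"; hints at how name, address and ID are formatted
    national_id: Optional[str] = None
    address_lines: list[str] = field(default_factory=list)


@dataclass
class Role:
    """A purpose-specific view of a person, such as customer or subscriber."""
    person_id: int
    role_type: str
    attributes: dict[str, str] = field(default_factory=dict)


def roles_of(person: Person, roles: list[Role]) -> list[str]:
    """Return the sorted role types attached to one person."""
    return sorted(r.role_type for r in roles if r.person_id == person.person_id)


# Made-up sample data: one individual playing two roles.
karen = Person(1, "Karen Blixen", "DK", address_lines=["Example Street 1"])
roles = [
    Role(1, "subscriber", {"plan": "monthly"}),
    Role(1, "customer", {"segment": "retail"}),
    Role(2, "contact"),
]

print(roles_of(karen, roles))  # ['customer', 'subscriber']
```

The design choice is simply that the same real-world individual is described once, however many purposes the data serves. Role-specific attributes live with the role, not with the person.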
Many, many data quality struggles around the world are caused by how we have modeled real-world individuals, old world and new world alike.
I saw recently on a PBS TV special that you could do a DNA test to discover the tribe from which you originated. I guess that profiling is our DNA test in a way. It’s a way for us to do forensics on the data and figure out its origin. Good stuff, Henrik.
Thanks Steve. I have a feeling that data quality “science” has an opportunity to evolve in the direction of DNA science. We have a lot to learn.
To extend your metaphor even further, Henrik, people have two types of DNA in their bodies: nuclear and mitochondrial. The same is true not just for individual persons, but for all entities: they have a real-world aspect and a digital one.
The key (pardon the pun) to solid data modelling is not to lose sight of that very simple but critical fact.
The other aspect that this dual nature of information management supports is one you use in your graphic: triangulation of coordinates to form maps. Your graphic contains an infinite number of data points, of which you have used a handful. If you wanted to add any new data point, you would only need to identify its lat/long label to locate it.
All done!
Thanks John. Good point about a real-world aspect and a digital one.