Data is a (bad) representation of reality. So data science will save the world.

Regardless of your worldview, I think there is near-universal agreement that humanity is facing a bristling array of existential threats. Here are a few obvious ones:

In fact, it seems to me that the chance of none of these threats materializing in the next 100 years is damn slim… unless we collectively get a lot smarter about detection, decision, and action.

But I think we may be able to save ourselves, because we have lots of data and (to a lesser degree) data scientists.

 Data

Most data is an abstract representation of the world, whether it is the decoded genome of a disease, the untold reams of data that represent possible future states of the world in large-scale simulations, or measurements of honeybee populations. Unfortunately, data is a very thin representation of reality (which is to say: most dimensions of reality are not captured in most datasets), and the data itself is usually siloed, dirty, and riddled with errors.

 Data Scientists

This is where data scientists come in. They clean, correct, and unify data, and then reconstitute it to create as complete a representation of reality as they possibly can. Then they use it to detect disease outbreaks (and maybe, in the future, to create novel vaccines on the fly), to run simulations that help us make decisions to avoid disastrous wartime and economic scenarios, and to aid (natural) scientists in understanding and taking action to heal our environment.

Data and data scientists alone will not suffice. We face monumentally complex problems that will require expertise and hard work from much of humanity to solve. But by interfacing between data (a representation of reality that computers can manipulate) and the real world, data scientists are uniquely positioned to contribute substantially to most, if not all, of humanity's biggest challenges.

My optimism is only slightly tempered by the fact that there are far too few data scientists in the world today. So… have you considered a career in data science? It’s lucrative, it’s sexy, and it’ll save the world.

 
