Personalized Medicine, Big Data, and Computing Fundamentals

Personalized medicine promises to improve outcomes across the field of medicine.  Whether it is vaccines, aspirin for heart attacks, or chemotherapy, many medical interventions help some people while doing little for others…and sometimes doing serious harm.  If we could identify who would be helped and who would be harmed before treating them, many lives could be saved.

The tools of modern molecular biology are providing the potential basis for personalized medicine.  Genomics, proteomics, and other approaches provide a deluge of data for analysis.  If this “big data” can be appropriately analyzed, surely we can figure out what the effects of a potential treatment will be on an individual in advance…right?

I don’t think it will be that simple.

One of the fundamental results of computer science is that there are certain kinds of programs that we simply cannot write, at least if we expect them to perform correctly 100% of the time.  A particularly famous and seemingly simple version of this result is the halting problem.  Put simply, there is no general procedure that can tell, for every possible program, whether that program will ever stop running (halt).  You can of course run it and if it stops, you know that it halts.  But if you run the program and it keeps going, you can’t be sure whether it will stop eventually.  Of course, it is possible to solve this “decision problem” in many special cases; we know, however, that a general solution is impossible.
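The asymmetry between "it stopped" and "it hasn't stopped yet" can be made concrete with a small Python sketch.  The setup here is my own illustration, not a standard API: a "program" is modeled as a generator function in which each yield counts as one step, and run_with_budget, countdown, slow_countdown, and loop_forever are hypothetical names.

```python
from typing import Callable, Iterator


def run_with_budget(program: Callable[[], Iterator[None]], max_steps: int) -> str:
    """Run `program` for at most `max_steps` steps.

    Returns "halts" if the program finished within the budget.
    Returns "unknown" otherwise: it may loop forever, or it may
    simply need more steps than we were willing to wait for.
    """
    steps = program()
    for _ in range(max_steps):
        try:
            next(steps)  # execute one step of the program
        except StopIteration:
            return "halts"
    return "unknown"


def countdown() -> Iterator[None]:
    """Halts after 10 steps."""
    n = 10
    while n > 0:
        n -= 1
        yield


def slow_countdown() -> Iterator[None]:
    """Halts too, but only after 10,000 steps."""
    n = 10_000
    while n > 0:
        n -= 1
        yield


def loop_forever() -> Iterator[None]:
    """Never halts."""
    while True:
        yield


if __name__ == "__main__":
    for prog in (countdown, slow_countdown, loop_forever):
        print(prog.__name__, run_with_budget(prog, max_steps=100))
```

With a budget of 100 steps, slow_countdown and loop_forever get the same answer, "unknown", even though one of them does halt.  Raising the budget settles slow_countdown, but no finite budget ever turns "unknown" into "never halts".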

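The halting problem can be reduced to a surprising number of other problems, meaning that those problems are at least as hard as the halting problem and so cannot be solved in general either.  One such problem is determining arbitrary characteristics of a program’s behavior, e.g., whether a program will ever print “hello” or the works of William Shakespeare.  (Rice’s theorem makes this precise: every non-trivial property of what a program computes is undecidable.)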
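Here is a rough Python sketch of why a perfect "does it print hello?" checker would also answer the halting question.  Everything in it is illustrative: wrap, halts, and the prints_hello parameter are hypothetical names, "halting" is taken to mean returning normally, and the whole point is that no correct, always-terminating prints_hello can exist.

```python
import contextlib
import io
from typing import Callable

Program = Callable[[], None]


def wrap(program: Program) -> Program:
    """Build a program that prints "hello" if and only if `program` halts."""

    def wrapped() -> None:
        # Silence the original program, so that any "hello" we see
        # can only come from the final print below.
        with contextlib.redirect_stdout(io.StringIO()):
            program()
        print("hello")  # reached only if program() returned

    return wrapped


def halts(program: Program, prints_hello: Callable[[Program], bool]) -> bool:
    """Decide halting, given a (hypothetical) decider for "prints hello"."""
    return prints_hello(wrap(program))


def prints_hello_by_running(program: Program) -> bool:
    """Stand-in for the impossible decider: it just runs the program.

    It is only safe on programs that halt; on a non-halting program
    it would never return, which is exactly the problem.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        program()
    return "hello" in buffer.getvalue()


if __name__ == "__main__":
    print(halts(lambda: None, prints_hello_by_running))
```

If someone handed us a prints_hello that always returned the right answer, halts would always return the right answer too.  Since we know halts cannot exist, neither can that prints_hello.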

For personalized medicine to work, we have to be able to analyze data about a person and decide what effect a given medical intervention – an operation, a drug, a lifestyle change – will have on that person before actually doing the intervention.  In other words, solving personalized medicine is equivalent to determining an arbitrary property of a person.  Substitute “program” for “person”, and we have something equivalent to the halting problem.  Uh oh.

What are the implications of this insight?  Well, first, we should accept that we’ll never be able to perfectly predict what will happen when we give anyone a pill.  We’re all different, and that uniqueness is irreducible.

A perhaps more useful insight, though, is that true personalized medicine will come when we can meaningfully simulate the physiology of an individual and/or when we can monitor how our bodies work in real time.  In computer terms, we have to move beyond static analysis to dynamic analysis.

Big data in medicine will give us insights that may allow for a limited form of personalized medicine; however, sample size limits and the massive diversity of our bodies make me suspect that any gains will be incremental and very limited in scope.  But a biological debugger that let us go step by step through a detailed simulation of a biological process?  That would be a game changer.  It is also a long way away.