Caleb Stanford Blog


Big Data Health Monitoring Should Be Mainstream Already


research health

The healthcare system in the US is often rightfully criticized for the cost to patients: insurance costs, “out-of-network” costs, and so on. But in my own interactions with healthcare over the last year or two, I’ve been equally frustrated by something else: medical data is not being sufficiently gathered and utilized. In the age of big data, medicine seems to be many years behind:

  1. A large amount of medical data is generated in routine healthcare, but outside of scientific studies it is only used on a per-patient basis.

  2. An even larger amount of medical data could be easily gathered, through patient-recorded logs and wearable monitoring devices, but is not being gathered or utilized.

This is sad to me, because if we’ve learned anything in the last 10 years, it’s that data is extremely valuable when used in aggregate, at maximum scale, to update models and make predictions. In particular, we are probably missing out on more sophisticated, more accurate, and more proactive health monitoring.

I’m not suggesting that computers should replace humans as doctors; at least, not anytime soon. Although algorithms can analyze data faster and at a larger scale than humans, they probably can’t replace doctor expertise in the short term, especially with interactive tasks (like deciding what questions to ask in response to patient symptoms, or determining what data would be most valuable to collect).

What I am suggesting is that we (1) gather more data, and (2) make inferences at a larger scale from that data, which can then be applied in a more individualized fashion. It was encouraging to read that at least one researcher is taking this approach.

Can you be more specific about “medical data”?

Medical data points are taken constantly, even under the current system. If you go in for a doctor visit, several data points are gathered: the problem or symptoms you have (or lack thereof), as well as basic vital signs (pulse, blood pressure) and sometimes blood tests, urine tests, etc. Additionally, many people take measurements at home, e.g. with fitness trackers, sleep trackers, and blood pressure monitors; and people often notice symptoms and record them (e.g. stomach ache today, chest pain or back pain, feeling lousy in general, etc.). Many people would also be happy to record and track more data. I personally keep a log of minor health problems, just in case some pattern emerges, I suppose.
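To make "medical data" a little more concrete, here is a minimal sketch of how clinic measurements, wearable readings, and self-reported symptoms could all sit in one per-person log. Everything in it is a hypothetical illustration: the `HealthRecord` type, its field names, the `by_kind` helper, and the sample values are assumptions made for this example, not an existing format or real data.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class HealthRecord:
    """One data point in a personal health log (hypothetical schema)."""
    day: date
    source: str                    # e.g. "clinic", "wearable", "self-report"
    kind: str                      # e.g. "pulse", "systolic_bp", "symptom"
    value: Optional[float] = None  # numeric reading, if there is one
    note: str = ""                 # free text, e.g. "stomach ache"


def by_kind(records: List[HealthRecord], kind: str) -> List[HealthRecord]:
    """Return all records of one kind, oldest first."""
    return sorted((r for r in records if r.kind == kind), key=lambda r: r.day)


# Made-up entries mixing the three sources mentioned above.
log = [
    HealthRecord(date(2020, 3, 2), "self-report", "symptom", note="stomach ache"),
    HealthRecord(date(2020, 1, 15), "clinic", "pulse", 62.0),
    HealthRecord(date(2020, 2, 20), "wearable", "pulse", 66.0),
]

pulse_history = by_kind(log, "pulse")
```

The point of the sketch is only that readings from very different sources can share one simple shape, which is what would make it possible to look at them together later.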

How is the way we currently use medical data insufficient?

It seems to me that current medical data is not used in aggregate to update medical models and improve future predictions. In fact, my doctors have not looked at my individual data points at all. Instead, you go in for a sick visit and they listen briefly to your array of symptoms, or you go in to get help with a specific thing like sleep, or you go in for a yearly checkup and they review any problems or questions you have. Then, the doctor (a human) makes a prediction or diagnosis about a specific issue, but does not compare it with your past data in aggregate, and has often forgotten about other minor or major problems you may have experienced. Additionally, many minor issues are ignored if they cannot be easily diagnosed. Finally, doctors do not apply the information about your case to future cases; thus, revisions to standard practice happen only over a longer period of time, through medical research and controlled studies.

The result of this limited use of health data is that while doctors are good at diagnosing specific sicknesses and medical problems, they are not good at monitoring “healthy” individuals overall to make sure they stay healthy and are not at risk for future problems. Such monitoring would require a much more subtle understanding of our medical data, an understanding that is evidently not possible using the current approach.
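As a rough illustration of what the simplest form of such monitoring might look like, the sketch below compares a new reading against a person's own past readings and flags it if it is far from their usual range. This is a toy under stated assumptions, not a medical method: the `flag_unusual` function, the two-standard-deviation threshold, and the sample pulse values are all made up for this example.

```python
from statistics import mean, stdev
from typing import Sequence


def flag_unusual(history: Sequence[float], latest: float,
                 threshold: float = 2.0) -> bool:
    """Return True if `latest` is more than `threshold` sample standard
    deviations away from the mean of this person's own `history`.

    The threshold of 2.0 is an arbitrary illustrative choice.
    """
    if len(history) < 2:
        return False  # not enough past data to define a baseline
    spread = stdev(history)
    if spread == 0:
        # All past readings identical: any different value stands out.
        return latest != history[0]
    return abs(latest - mean(history)) / spread > threshold


# Made-up resting pulse readings for one person.
resting_pulse = [61, 63, 62, 60, 64, 62]

print(flag_unusual(resting_pulse, 63))  # within this person's usual range
print(flag_unusual(resting_pulse, 75))  # far outside it
```

A reading of 75 might be unremarkable for the population as a whole, yet it is far from this particular person's baseline. That is the kind of individualized signal a per-visit snapshot misses.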

But is there any guarantee that more aggressive data mining would be successful? Would models and predictions actually improve?

No strict guarantee, no. But from my perspective there is good reason to believe the answer to the second question is “yes”: the recent breakthroughs in big data and machine learning have shown that with enough data and computational resources, it’s often possible to outperform humans on concrete tasks (e.g. predicting future data from past data). These methods are continuing to improve and are being applied to various domains, not just computer science problems. So I am confident that machine learning algorithms will get better, and I see no reason why this would not extend to medical diagnosis and monitoring. However, there is one important precondition: machine learning algorithms require a lot of data to perform well.

Right now, no one has access to that kind of data. But the data exists and much of it is already collected, or else easy to collect. Perhaps in the near future, we can utilize it?