Sunday, June 2, 2013

On the DSM-5, and the Nature of Mental Illness

The field of mental health care is abuzz with the American Psychiatric Association's release of the newest Diagnostic and Statistical Manual of Mental Disorders, known as DSM-5. Insofar as the DSM is the diagnostic bible of mental health problems, and plays a hugely important role in guiding diagnosis, treatment, and provider payment, every new edition is a big deal. Invariably, many clinicians find themselves apprehensive and upset over the changes each new version produces, and DSM-5 is no exception.  Over a thousand clinical psychologists have signed a petition protesting some of the changes to the DSM's diagnostic categories, and the timing of DSM-5's release was unfortunate in light of a blog post by Thomas Insel, director of the National Institute of Mental Health, which declared that the NIMH will move away from using the DSM classification in its research program. The rationale for this shift is that researchers need to focus on understanding the biological processes underlying mental health problems, as opposed to a purely symptom-based understanding of these conditions.

On one hand, the NIMH is right to abandon the DSM model. The DSM classifications aren't based on a clear theory of any biological illness, but rather on sets of symptoms that seem to co-occur regularly (and even here, the process by which the DSM authors arrive at their categories is opaque at best). This approach has a couple of big problems. For one, whether a certain behavior or set of behaviors is considered pathological is heavily dependent on the social context in which that behavior occurs. As many have pointed out, early editions of the DSM established formal diagnostic categories for homosexuality and female hysteria. Social conventions have changed, but the "symptoms" themselves haven't changed at all; the same behaviors just aren't viewed as illnesses in need of medical treatment. What's more, many diagnoses in the current DSM could be "cured" by simply changing the life circumstances around the individual in question. Take alcohol abuse: according to the DSM, this diagnosis applies if an individual's drinking creates problems in his or her personal life (e.g., personal relationships, ability to work, and so on). Yet doesn't this imply that independently wealthy people with no family or friends can drink as much as they like without ever "abusing" alcohol? The other big problem with defining mental illness by symptoms is that it becomes impossible to understand the effects of any treatment. Insofar as symptoms of many common problems (e.g., major depression, anxiety disorders) are known to wax and wane, and spontaneous remission is not uncommon, there's really no way of knowing what effects you can attribute to treatment. (Yes, controlled trials are used to evaluate anti-depressant treatments, but the duration of these trials is typically only weeks, and longer-term data rarely include control treatments.)

On the other hand, there's one small problem with the NIMH's plan to move toward a more "scientific" framework for defining mental illness: the field has essentially no idea what biological abnormalities correspond to symptoms of disease. Decades of research into brain anatomy and genetics have failed to offer much insight into the physical mechanisms that produce psychopathology; more alarmingly, knowing how drugs like SSRIs and anxiolytics work hasn't taught psychiatry anything about where the treated disorders come from. If you know your Descartes, of course, you might be surprised that we're still debating whether the mind can be reduced to the physical organs. Either way, the failure of research to find the roots of mental illness in the brain suggests the possibility that the model the NIMH is pursuing isn't very good. Indeed, if mental illness is solely a product of some dysfunction in the brain, then how come non-pharmacologic treatments like cognitive-behavioral therapy (to say nothing of placebos) seem to consistently work?

So, if we don't have a consistent, context-free definition of psychopathology, and we don't have any clear understanding of the disease process that produces it, doesn't this imply that we have no idea what the hell mental illness is? Over the years, a handful of voices, like the libertarian psychiatrist Thomas Szasz, have suggested that mental health problems are largely a myth. On this view, while it's obvious that many people struggle with troubling thoughts and perceptions that make it harder for them to negotiate their daily lives, conceiving of these troubles as a medical problem in need of treatment is just a narrative. And it's not the only possible narrative one can choose, with the obvious implication that we should be very careful in the steps we take to "treat" these struggles. That said, don't expect the DSM framework to go anywhere soon. For one thing, it guides the diagnostic codes that clinicians need to bill payers for the care they provide; since clinicians can't be paid without those codes, don't expect them to give the codes up easily. And second, all the populations in comparative-effectiveness studies of mental health care are defined using DSM criteria, and the studies usually measure effects in terms of symptom remission. Given the current push to make care in the United States "evidence-based", it's unlikely that policy-makers are going to want to cast all these studies aside. So, warts and all, the DSM is probably going to be with us a while.

Friday, May 24, 2013

Why We Still Don't Know if Medicaid Affects Health Status

The following is a piece I wrote recently for the tHEORetically Speaking blog, wherein I go over the rationale for not taking p-values too seriously. Thanks as always to Patti and her staff for their help, and for their willingness to let me share my work on their site.

If you’re reading this, chances are you’ve heard about the Oregon Medicaid study recently published in the New England Journal of Medicine. In case you’ve been on vacation, however, the results were, at best, a disappointment to advocates of Medicaid: a large, multi-year randomized controlled trial failed to show that coverage through the program produced meaningful health benefits, relative to remaining uninsured. As landmark publications of policy interventions are pretty rare, and insofar as the ACA health reform is counting on a dramatic expansion of Medicaid to achieve its goal of universal coverage, the paper has predictably set off a spirited debate within the health policy community. Those on the right are pointing to the findings in calling for a halt to the ACA Medicaid expansion. Among mainstream analysts, we’ve seen a call for caution in interpreting these results (though similar restraint was not advocated by the same crowd when more positive interim results were published in 2011). As in the case of other studies published in areas of contentious debate (see also: screening mammography), having a high-quality trial testing the question of interest, and a p-value associated with that test, is not enough to clarify the matter. I’ve written at length about the Oregon study already, but I’ll be honest: I don’t much care about the study’s findings, for the simple reason that there’s no way a single study can offer a conclusive answer to a research question.

One reason for this is what you might call the random-effects problem. If you ran the Oregon study again, the influences of random error suggest you’d end up with another set of estimates; run it again in a different sample of people, and you’d get still another set of findings. If you replicated the study a bunch of times, eventually you’d start to see the results converge on the “true” effect. As such, findings from single studies are best thought of as single draws from a distribution of possible results. This, of course, implies that it’s really hard to conclude anything from empirical studies, and that you need an enormous amount of data to say anything with confidence, but . . . that’s actually completely the case. When you estimate a treatment effect from a trial, there’s no real way of knowing whether that estimate is close to the mean of that theoretical distribution of results, or whether it’s an outlier. (Note: if you’re familiar with the work of John Ioannidis, this won’t surprise you much.)
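The idea of a study as a single draw from a distribution of possible results can be sketched with a small simulation. This is only an illustration under stated assumptions: the "true" effect of 0.1, the noise level, the sample size, and the function names `run_study` and `replicate` are all hypothetical and have nothing to do with the actual Oregon data.

```python
import random
import statistics

def run_study(true_effect, noise_sd, n_per_arm, rng):
    """Simulate one two-arm trial and return its estimated treatment effect."""
    control = [rng.gauss(0.0, noise_sd) for _ in range(n_per_arm)]
    treated = [rng.gauss(true_effect, noise_sd) for _ in range(n_per_arm)]
    return statistics.mean(treated) - statistics.mean(control)

def replicate(n_studies, true_effect=0.1, noise_sd=1.0, n_per_arm=500, seed=0):
    """Run the same hypothetical study many times with fresh random error."""
    rng = random.Random(seed)
    return [run_study(true_effect, noise_sd, n_per_arm, rng)
            for _ in range(n_studies)]

estimates = replicate(200)

# Any one study can land well above or below the true effect of 0.1 ...
print("first five single-study estimates:",
      [round(e, 3) for e in estimates[:5]])
print("smallest and largest estimate:",
      round(min(estimates), 3), round(max(estimates), 3))
# ... while the average over many replications settles near it.
print("mean over all replications:", round(statistics.mean(estimates), 3))
```

Looking at one estimate in isolation, there is nothing in the number itself that says whether it sits near the middle of that spread or out in the tail.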

Many policy researchers, of course, don’t really care about this; they care less about whether the effect estimate is perfectly accurate than about whether it’s “significant”. The debate over whether or not Medicaid has downstream effects on health has had an either/or flavor, and we’ve been told over and over that the ACA’s Medicare pilots and comparative effectiveness research will tell us “what works” in medical care. In practice, concluding that something “works” from empirical research involves looking at the p-value and seeing whether the effect is “significant” (i.e., whether the p-value is small enough to conclude that the observed effect is unlikely to be due to random influences). But does a “significant” test result really tell you that?

I know this runs counter to everything smart people tell you about statistics, but bear with me. The process of inferring “significance” starts with a null hypothesis H, which states that the groups don’t differ. If the null hypothesis is true, then the probability of observing no difference between the groups is high (with “equality between groups” E defined via reference to a statistical distribution suggesting the threshold at which an observed difference is unlikely to occur by chance). Then we do our study, and observe a difference between groups that’s larger than our “likely” threshold (i.e., not-E). The conclusion most researchers will draw is: “then H is probably not true”.
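The decision rule described here can be written out as a short sketch. It assumes a large-sample two-sided z-test at a 5% threshold, which is one common choice and not necessarily what any particular study used; the function name `two_sample_z_test` and the toy data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def two_sample_z_test(group_a, group_b, alpha=0.05):
    """Large-sample z-test of the null hypothesis H: the group means are equal."""
    diff = mean(group_a) - mean(group_b)
    se = sqrt(stdev(group_a) ** 2 / len(group_a)
              + stdev(group_b) ** 2 / len(group_b))
    z = diff / se
    # Two-sided p-value: chance of a difference at least this large if H holds.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # The threshold separating "E" (difference within chance) from "not-E".
    threshold = NormalDist().inv_cdf(1 - alpha / 2)
    return {"diff": diff, "z": z, "p_value": p_value,
            "reject_null": abs(z) > threshold}

# Made-up outcome scores, purely for illustration.
baseline = [float(i % 10) for i in range(200)]
shifted = [x + 1.0 for x in baseline]

same = two_sample_z_test(baseline, baseline)      # observes E
different = two_sample_z_test(baseline, shifted)  # observes not-E
print(same["reject_null"], round(same["p_value"], 3))
print(different["reject_null"], round(different["z"], 2))
```

Note that `reject_null` only records which side of the threshold the observed difference fell on. It says nothing directly about whether H is true.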

One obvious problem here is that H isn’t a random variable (hypotheses are either true or they aren’t), so discussing it in terms of probabilities is nonsense. But even if we toughen up the inference from our hypothesis test (i.e., “if we see not-E, we assume that H is false”), our conclusion still doesn’t really follow. In many ways, hypothesis testing is a sort of game that researchers agree to play, wherein observing a difference believed to be unlikely based on sampling error alone leads to the conclusion that the null is false. Yet proving a conjecture requires a lot more than this. The conjecture either needs to be impossible to dispute without contradicting yourself, or you need to be able to demonstrate it inductively by observing it over samples of homogeneous subjects. Clearly, the first approach rarely works in health services research, since few claims about services or policy are necessarily true. To prove a contention via the second approach, you’d need to go through all the subjects in each treatment group and show that the difference between groups is maintained throughout the sample. Even this is only a first step (generalizing to different subject samples is one problem, as is the fact that people adapt to policy interventions over time, meaning the underlying relationships aren’t constant), but the bigger point is that a conclusion drawn from a comparison of mean differences between groups doesn’t come close to either proving or refuting the hypothesis. Put simply, the information from a hypothesis test is not sufficient for proving the truth or falsity of H. And as such, we still don’t know whether or not Medicaid has any effect on health status.

When I write things like the above, I’m sometimes accused of being anti-research or anti-science. Nothing could be further from the truth. What I’m opposed to is the misuse of science. Had the Oregon study found significant effects on health status, I’d feel the same way about the trial. Insofar as physicians, patients and policymakers all carry biases into their work, and most people tend to generalize wildly from their own experiences, repeated observation over time can perform a valuable service in helping us to understand health care delivery in a more objective way. But this is very different from using limited sets of observations to leap to broad conclusions, and from asserting the truth or falsehood of theories without doing anything resembling a rigorous proof. If health services researchers want the responsibility of their work being used to guide medical practice, they really need to start stepping up their game.

Thursday, May 23, 2013

This Day in Mistrusting Evidence-Based Medicine: The NLST

I realize I spend a lot of time criticizing evidence-based medicine (see here, here, and here for examples), but every time I start to get tired of doing so, I see something new that simply can't pass without comment. Well, once again, the scientific community doesn't disappoint: the New England Journal of Medicine published new results from the National Lung Screening Trial this week. The NLST, if you haven't heard of it, is a large trial meant to evaluate low-dose CT scanning as a screening tool for lung cancer. The headline results published this week were as follows: with a sensitivity of 93.8% and a specificity of 73.4%, CT had three times as many positives as chest radiography (which had a sensitivity of 73.5% and a specificity of 91.3%), and detected twice as many stage 1 lung cancers.

Imagine you're the director of a large medical group: what would you do with this information? Let's say, for the sake of argument, that your patient population contains 10,000 people like the subjects in the NLST (i.e., asymptomatic current or former smokers at least 55 years of age), and you want to know what would happen if you screened them using either CT or radiography. Well, based on the NLST, you'd expect about 90 of those 10,000 people to have lung cancer (i.e., 482 NLST patients, or 0.9%, had the disease), implying that 9,910 wouldn't. The sensitivity estimates suggest that CT would correctly identify 85 lung cancers, compared to only 66 that would be detected by radiography. On the other hand, the specificities imply that CT screening would lead to 2,636 false positives, against only 862 resulting from radiography. Put another way: in order to detect 19 more cases of lung cancer, we'd have to do unnecessary diagnostic workups on 1,774 more people. Framed in this manner, it's not clear that either scan is all that great: we see 13 false positives for every true positive when we screen via radiography, versus 31 false positives for every true positive identified by CT.
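The back-of-the-envelope arithmetic above can be sketched in a few lines of Python. This is a minimal sketch, not the NLST authors' analysis: the function name `screening_outcomes` and its parameters are made up for illustration, and the inputs are the rounded figures quoted in the text (10,000 patients, 0.9% prevalence, and the reported sensitivities and specificities). With these rounded inputs the CT true-positive count comes out nearer 84 than 85, so small differences from the figures in the text are presumably down to rounding.

```python
def screening_outcomes(population, prevalence, sensitivity, specificity):
    """Expected true and false positives when screening a whole population."""
    diseased = population * prevalence
    healthy = population - diseased
    true_pos = diseased * sensitivity          # cancers the scan catches
    false_pos = healthy * (1 - specificity)    # healthy people flagged anyway
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "fp_per_tp": false_pos / true_pos,
    }

ct = screening_outcomes(10_000, 0.009, sensitivity=0.938, specificity=0.734)
xray = screening_outcomes(10_000, 0.009, sensitivity=0.735, specificity=0.913)

for name, result in (("CT", ct), ("radiography", xray)):
    print(f"{name}: {result['true_pos']:.0f} true positives, "
          f"{result['false_pos']:.0f} false positives, "
          f"{result['fp_per_tp']:.0f} false positives per true positive")

# Extra workups incurred, and extra cancers found, by choosing CT.
print("extra false positives with CT:",
      round(ct["false_pos"] - xray["false_pos"]))
print("extra cancers detected with CT:",
      round(ct["true_pos"] - xray["true_pos"]))
```

Swapping in a different prevalence shows how quickly the ratio of false to true positives worsens as the screened population gets healthier.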

The obvious question to answer here: how many false positives are worth dealing with to catch a case of lung cancer early? Unfortunately, diagnostic workup procedures for suspected lung tumors are not a trivial matter, and your medical group would expect to spend a lot of money on them and expose your patients to considerable risk in the process. One diagnostic approach involves needle biopsy, but aspirating tissue out of the lesion for the biopsy is tricky, and punctures the lung in 10-15% of cases. (In your medical group, that would translate into between 272 and 408 punctured lungs under CT screening, or between 93 and 139 punctured lungs under radiography.) Bronchoscopy provides an alternative to needle biopsy, but requires the patient to be sedated, and collapses the lung in 1-2% of cases. Peripheral lesions may also require thoracotomy for proper diagnosis; this is a major surgery, in which the ribs are moved aside for direct access to the lung tissue. If a patient actually has lung cancer, this is all generally viewed as necessary to prevent the spread of a very deadly illness. If the patient doesn't actually, you know, have lung cancer at all, this is a tremendous waste of resources at best and a source of considerable patient harm at worst.
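The punctured-lung figures in the parenthetical can be reproduced with a short sketch. It assumes, as that calculation implicitly does, that every positive screen (true or false) goes on to a needle biopsy, and it takes the 10-15% puncture rate and the positive counts from the text at face value. The function name `expected_complications` is hypothetical.

```python
def expected_complications(true_pos, false_pos, rate_low, rate_high):
    """Range of expected complications if every positive screen is worked up."""
    workups = true_pos + false_pos
    return round(workups * rate_low), round(workups * rate_high)

# Positive counts per 10,000 screened, as quoted in the text.
ct_low, ct_high = expected_complications(85, 2636, 0.10, 0.15)
xr_low, xr_high = expected_complications(66, 862, 0.10, 0.15)

print(f"CT screening: {ct_low} to {ct_high} punctured lungs")
print(f"Radiography:  {xr_low} to {xr_high} punctured lungs")
```

In practice not every positive would be biopsied, so these are upper-end figures under that simplifying assumption.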

And unfortunately, the NLST can't really tell us if it's all worthwhile, because these results tell us nothing about the incremental effect of CT screening on patient survival, which is what we care about. The awful fact about tumors of the lung is that they're extremely aggressive, and as such it's debatable how much can be gained through early detection. Additionally, research on screening interventions frequently overlooks the issue of lead time; strictly speaking, any length of time that a patient would survive with an undiagnosed tumor can't be considered part of the benefit of screening. So, assuming that the survival benefit is simply life expectancy minus the patient's age at detection is almost certainly wrong.
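The lead-time point can be made concrete with a toy calculation. The ages below are entirely hypothetical and chosen only to show the mechanics; the function name `apparent_and_real_benefit` is likewise made up.

```python
def apparent_and_real_benefit(age_screen_dx, age_symptom_dx,
                              age_death_screened, age_death_unscreened):
    """Split the apparent survival gain from screening into lead time and real gain."""
    survival_screened = age_death_screened - age_screen_dx
    survival_unscreened = age_death_unscreened - age_symptom_dx
    apparent_gain = survival_screened - survival_unscreened
    lead_time = age_symptom_dx - age_screen_dx   # earlier diagnosis only
    real_gain = age_death_screened - age_death_unscreened
    return apparent_gain, lead_time, real_gain

# Hypothetical patient: screening finds the tumor at 62, symptoms would have
# led to diagnosis at 64, and death occurs at 66 either way.
apparent, lead, real = apparent_and_real_benefit(62, 64, 66, 66)
print(f"apparent gain: {apparent} years, lead time: {lead} years, "
      f"real gain: {real} years")
```

In this made-up case, survival measured from diagnosis doubles from two years to four, yet the patient does not live a day longer. The apparent gain is always the lead time plus whatever real extension of life the earlier treatment buys.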

I've said it before and I'll continue to do so: trial data like these are useful for framing a discussion between the patient and his/her physician about a proper course of treatment. And that's it. Unfortunately, as I've just done, you may have to do the work of sifting through the data yourself if you want a serious presentation of the risks and the benefits. You can't rely on study authors to do it for you.

Thursday, May 2, 2013

Decoding Oregon: Why Doesn't Medicaid Improve Health?

The health policy world is abuzz this week with the publication of a major study in the New England Journal of Medicine. Researchers randomly assigned some 20,000 uninsured Oregon residents to receive either an invitation to enroll in Medicaid or no invitation, and prospectively followed them for two years. The just-published findings suggested that, in general, Medicaid coverage made no difference in the health status of those who received it, as measured by cholesterol, control of diabetes, or use of medications for hypertension or dyslipidemia. Predictably, analysts on the right have seized on the study as evidence of the bankrupt thinking behind the ACA's Medicaid expansion. In turn, the pro-Medicaid health policy community has dusted off its seldom-used critical thinking skills to explain why some positive secondary findings are actually the real story and the non-significant health effects are actually meaningful. Reason has complete coverage of the latter, but for the most contorted interpretation I'd recommend these two posts at the Incidental Economist website. While the political footballing can be entertainment in its own right, one eventually has to ask: which side is right about Medicaid?

So, did Medicaid improve anyone's health or not? If you believe the authors of the study and the analyses they performed, no; if you believe the political partisans, yes, because the non-significant improvements in health status are actually important. Here's the issue, though: when you do any empirical study, one critical challenge involves differentiating true effects from random effects (i.e., the signal vs. noise problem). We test hypotheses the way we do in order to argue that effects of a particular magnitude are unlikely to have occurred by chance. (There's a seldom-recognized problem with this, but that's another discussion.) The problem with the interpretation offered by the Incidental Economist folks is that it effectively assumes that the observed differences are all signal and no noise. Given what goes on in multi-year, community-based studies like this (i.e., subject attrition,  unmeasured patient characteristics and environmental confounders, effects of undiagnosed health problems, etc.), it's laughable to suggest that random noise isn't contributing substantially to any observed effects. Moreover, if you're going to decide that statistical significance is an obstacle to confirming your bias, at least toss it aside before the fact: choosing your criterion for accepting or rejecting hypotheses post hoc is the worst kind of lazy science. I could decide that I don't buy the significant secondary results, and declare that the level of alpha representing significance is 0.00000001 so that I could ignore those findings. While this might sound contrarian, it's every bit as sound as ignoring the p-values after the fact.

But the Medicaid haters aren't entirely right either. Remember, every study is a single set of observations taken from a single set of subjects; conducting the same study again will get you different measurements, since (again) part of the observed effect is due to random factors. Conducting the study in a different population will get you other sets of different measurements. The point being that it's most appropriate to think of a given study's findings as an observation drawn from a distribution of possible results. Granted, this implies that no study can ever be taken as conclusive evidence of anything; but if you're familiar with the critiques of logical positivism, or the work of John Ioannidis, you probably knew that already. So, if you dislike the ACA's Medicaid expansion, enjoy this week, but try not to forget that the Oregon study was more of a lucky draw than genuine support for your beliefs.

To me, though, the Oregon study was asking the wrong question, and was probably a waste of time and taxpayer money. To put it simply, why would you expect Medicaid coverage to lead to improved health? The data on the subject are more ambivalent than the public debate would lead you to think, and the RAND Health Insurance Experiment, still considered the gold standard of research on insurance, found essentially no effect of insurance arrangements on health status. And honestly, health insurance coverage isn't a medical intervention; why would we expect it to have health benefits? The conventional thinking is that coverage makes it possible to utilize necessary services, which then produce health benefits. But this reminds me of James Altucher's notion of the "conspiracy number": an awful lot of things have to happen for the conventional wisdom to play out. The patient needs to have a health problem that's amenable to medical care, or an undiagnosed illness that screening can identify; he or she needs to be sufficiently motivated to go to the doctor; he or she needs an accessible provider; that provider needs to be willing to accept their insurance as payment; that provider needs to have time to see them; the services the provider delivers have to be of sufficient quality; the patient needs to comply with prescribed treatment; and the treatment itself has to actually work. Only if all these conditions are met will insurance coverage lead to better health.
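To see why a long chain of necessary conditions matters, here is a rough sketch. The probabilities are purely hypothetical placeholders (a generous 80% for each of the eight conditions listed above), and treating the conditions as independent is a simplifying assumption, not an empirical claim.

```python
from math import prod

# Hypothetical chance that each condition in the chain is met.
conditions = {
    "has a treatable or detectable problem": 0.8,
    "motivated to see a doctor": 0.8,
    "accessible provider": 0.8,
    "provider accepts the insurance": 0.8,
    "provider has time": 0.8,
    "care is of sufficient quality": 0.8,
    "patient complies with treatment": 0.8,
    "treatment actually works": 0.8,
}

# Under independence, the chance that all conditions hold is the product.
p_all = prod(conditions.values())
print(f"chance every link in the chain holds: {p_all:.1%}")
```

Even with each link assumed to be quite likely on its own, the chance that all of them hold together is small, which is the intuition behind the "conspiracy number".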

Of course, most policy analysts end up disappointed by results like Oregon, because they don't think the big picture through this way. In their world, teasing out the true effect of insurance on health is a daunting analytic task. But it doesn't have to be. Remember,  health insurance is just a mechanism for letting people pay for medical care with someone else's money. As such, it should create greater consumption of health services, and lo and behold, the data tell us just that. Even the Oregon study found higher rates of preventive service use, depression screening, and diagnosis and treatment of diabetes. I would ask why Oregon needed to spend scarce Medicaid dollars to demonstrate something that follows from simple logic, but what's done is done. One might hope that this study will discourage the health policy community from viewing insurance coverage as a magical, life-giving force and spur them to acknowledge the extraordinary complexity involved in producing good health outcomes. But I won't hold my breath.

Wednesday, April 24, 2013

The Case Against Statins

Most of the time, I use this blog to call attention to the ways in which weak logic and wishful thinking lead to health policy that is at best unhelpful, and at worst creates problems far worse than what prompted it in the first place. But my interests also carry into medical practice, and I believe there is value in applying deductive thinking here as well. In the era of comparative effectiveness research, non-physician technocrats exert a greater role than ever before in shaping medical practice. Given their abysmal track record, they may benefit from a bit of assistance in staying pointed in the right direction. Or at least away from the wrong direction.

One area in need of attention concerns the use of statin drugs. In case you're not up on the current state of medicine, or you haven't sat through a commercial break on television in a while, statins are an extremely popular class of medications used to prevent or treat acute coronary syndromes by lowering the volume of cholesterol in the blood. For pharmaceutical firms, statins have been a godsend: the population for whom their use is indicated is very large, and the Phase III trials evaluating them have been so compelling that payers have been more than willing to cover them broadly. The result has been massive industry profits, and an increasing desire in some quarters to expand the use of statins to more and more populations. Prescribing these drugs to disease-free patients, even some without elevated LDL, has been standard care for a while. More recently, manufacturers have suggested a need for statins in pediatric populations, and others have even suggested adding them to the water supply to promote the public's health (here's a link, if you think I'm joking).

Between the billions in manufacturer profits and the positive reviews of the data, not to mention the effects (placebo and otherwise) experienced by patients, you might wonder what the problem is. Isn't everyone happy? Well, for starters, no, not everyone is happy: the side effects associated with statins (muscle pain and weakness, liver damage, neurological and digestive problems, and increased blood sugar) are not much fun. But I actually have a different concern: it seems to me that medical decision-makers have been so focused on the question "do statins work?" that they've forgotten to ask the question "what do statins do?"

The basic effect of statins is to inhibit the activity of HMG-CoA reductase, an enzyme critical to the synthesis of cholesterol in the liver. According to the conventional wisdom, this leads to lower levels of LDL and total cholesterol in the blood, and reduces the chances of forming atherosclerotic plaques; the fewer the plaques, the lower the risk of death from acute cardiovascular disease. And you can certainly interpret the data as supporting this theory. A glowing review by the Cochrane Collaboration found that statins (in general) produced significantly lower LDL and TC, as well as a significant reduction in risks of major vascular events and all-cause mortality. Pretty great, right?

There's just one big problem with this tidy little story: there's actually no evidence that higher cholesterol is associated with increased risk of mortality. This shouldn't be all that surprising; cholesterol is a critical building block in all animal life, and the primary reason you have a central nervous system, so it seems counterintuitive to view it as a bad thing. And the data don't argue with your intuition; the Framingham Heart Study failed to find a positive association between cholesterol and mortality risk. That's right: the mammoth study whose data are used to benchmark cardiovascular disease risks all over the civilized world didn't find that lower cholesterol was protective against mortality. This would seem to beg for a follow-up question: if there's no evidence that lowering cholesterol protects against cardiovascular disease-related death, why do we see the effects we see in the statin trials? Of course, for much of the evidence-based community, the "why" is irrelevant if we know that something "works". But I have to ask: what are the consequences if the conventional theory isn't correct?

Let's go back to the basics for a moment, and ask what follows predictably from inhibiting the liver's production of cholesterol, particularly LDL. Most immediately: lower levels of LDL in the blood, and higher levels of sugar, as the liver is unable to convert carbohydrates into cholesterol as it would otherwise. If we assume that the lost LDL was just "extra", and wasn't doing anything positive before you started the statin, then it's all good; another drug can be used to treat the high blood sugar. But how do we know we won't miss that LDL? LDL does lots of things in the body, but one of its most critical functions is the transport of fats, glucose, and other nutrients to the body's tissues for use as fuel. In a perfect world, the "spent" LDL particles would deliver their goods and be promptly recycled back through the liver. In the real world, though, the presence of sugars in the blood compromises these particles, and they get stuck in the arterial wall, where macrophages break them down and recycle their contents into HDL. You might recognize this process as atherosclerosis, which the administration of statins ostensibly prevents. This begs a question, though: with less cholesterol in the blood, how are the muscles in my body getting fuel to function?

The likely answer offers an alternative explanation for the apparent benefit of statins to the heart. Starved of LDL-borne fats as an energy source, research suggests that muscle tissues turn their attention to the elevated levels of sugar in the blood, converting it anaerobically into lactate. While lactate requires a ton of energy to produce, it's pretty good for your heart, as a soluble, non-glycating source of energy. This may be where the reduction in atherosclerotic events comes from. Note also that it's perfectly consistent with the data we have on statins (i.e., lower LDL and lower cardiovascular mortality), and also with the Framingham data.

Unfortunately, the muscles have to work like crazy producing lactate, and the body is still deprived of cholesterol. And over the long term, the shortage of cholesterol and the abundance of sugars in the blood suggests that the cell walls of the muscles become increasingly vulnerable to oxidation, and the process of synthesizing lactate leaves them open to damage from glycation. This is consistent with the muscle weakness reported frequently by statin users, and associated breakdown of muscle tissues is consistent with more serious side effects like nerve damage and rhabdomyolysis. It also suggests that calcification of the cell walls can occur, which is consistent with the heart failure also seen with statins. Not to mention that the brain needs enormous amounts of cholesterol to function correctly, and long-term statin use would suggest neurological problems. And finally, if the muscles fail to keep up with lactate production, this means the heart and liver have to metabolize blood sugar without the benefit of membrane cholesterol. In other words, the benefit to your heart may come at the cost of physical and mental frailty, and at the end your heart may be left worse off than it was before.

The foregoing, of course, applies only to the vast majority; those with genetic hypercholesterolemia can obviously benefit from using a statin. But that caveat raises an obvious question: why doesn't the conversation about statin use start with asking why cholesterol is elevated? (Note: I'm not a physician, and this isn't qualified medical advice.) The solution could be as simple as a shift away from a carb-heavy diet. The question could also point to an inflammatory process being triggered by another illness. What it shouldn't point to is a lazy, one-size-fits-all prescription for statins. Like many, many other decisions in health care, the choice to prescribe statins should be preceded by a sober discussion of the risks and benefits with the patient. Some patients will be inclined to trust the traditional care model and the trial data; some won't. But what shouldn't be done is plowing ahead before the mechanisms of statins are well and truly understood.

Saturday, April 20, 2013

The Problem with Preventive Medicine

The following is a piece I wrote a few days ago for the tHEORetically Speaking blog. Thanks as always to Patti and her team for allowing me to share my work through their fine website.

The passage of the ACA in 2010 has to feel like a high-water mark for proponents of preventive medicine. The Obama health reform essentially makes free preventive care the law of the land, with insurers required to completely cover the cost of a laundry list of interventions intended to avert dreaded chronic illnesses like diabetes, cancer, and cardiovascular disease. The rationale for this policy, often repeated, is that the chronically ill account for 75% of health care spending. Manage their illnesses more effectively through evidence-based treatments and lower-cost non-specialist providers, the argument goes, and you’ll solve health care’s cost crisis, as well as creating a healthier population. As such, by 2014 most insurers will be required to pay for utilization of services like blood pressure, cholesterol, and depression screening, routine cancer screening, obesity and tobacco cessation counseling, and a number of services specific to women and children. The expectation being that a ton of costly chronic illness will never occur as a result.

Unfortunately, there are a few problems with this concept. For one, it directly contradicts the mandate for evidence-based care that most of these folks support: there’s almost no evidence suggesting that these screening interventions have any benefit to patients. A common refrain to this is that engaging patients via preventive screening makes them more conscious of the need for self-management. But this is also unproven: a decade or so ago, “disease management” was a fad that fell short of expectations (as most health policy fads do). In many ways, the push for preventive care is of the “we don’t know what to do, but we have to do something” variety.

But preventive medicine’s problems with comparative effectiveness don’t end there. One of the premises of evidence-based care (like, one of the really, really basic premises) is to avoid providing services with no value to patients. It’s why we care about patient-reported outcomes and using resources efficiently. But providing large volumes of care to asymptomatic, average-risk populations is pretty much guaranteed to waste a ton of resources. If you insist, for example, on regularly screening women for breast cancer without any indication of elevated risk, what you’re mostly doing is inventing a complicated way to keep radiologists busy. Yes, you’ll detect some disease (not all of it will be cancer, of course), and you’ll likely prevent premature death in at least some women. Of course, you’ll also overdiagnose a lot of lesions that will never become malignant, and turn up a ton of false positives. And, importantly, by removing cost considerations from the equation, you’ll be subtly discouraging women from weighing those risks against potential benefits. Interestingly, prevention advocates don’t run from this implication of their work, often embracing the number-needed-to-screen (NNS) statistic in their arguments. Yet implicit in an NNS of, say, 600 is that for every person who benefits, 599 receive a worthless intervention. Acute treatments aren’t perfect on number-needed-to-treat either, but they do far better than prevention does, since the populations are smaller.

Unfortunately, the cost rationale for prevention is almost certainly overblown as well. For one thing, the 75% figure is absurd; it includes all the costs of care for people defined as chronically ill, not just the costs of their hospitalizations for uncontrolled disease. For another, the types of analyses that usually establish “cost-effectiveness” don’t really work without evidence of effectiveness. And take it from me: I used to work for a large medical group whose screening recommendations were frequently criticized as heartlessly conservative, and I can tell you that screening programs for large numbers of people are staggeringly expensive. And the patients in our medical group had to cover part of the cost. It’s far from a certainty that providing a wide array of services to a broadly defined population, whose marginal cost of consuming those services is zero, is going to be less costly than acute treatment of a much smaller number of chronically ill patients. Not to mention that preventing some costly conditions earlier on may leave people vulnerable to other costly conditions in later life. I’m thinking specifically of the recent RAND study in the New England Journal of Medicine, which estimated the economic burden of dementia at somewhere between “astronomical” and “economically crippling”. As far as I know, there’s no screening program that can forestall Alzheimer’s disease in early life.

And honestly, I’m surprised by how rarely someone points out that the cost-saving justification for medical interventions is complete nonsense. Since when is “being able to pay for itself” a criterion of value in medical care? Let’s say we have two treatments, A and B; A costs nothing and has no clinical benefits, but receiving it prevents a patient from receiving B, which costs money but treats the patient’s illness effectively. This line of thinking would say that providing A is preferable, since it’s “saved” us the cost of B. This sounds dumb, of course, but that’s because we provide medical care to relieve illness and suffering, not for the purpose of saving money. If the latter is your goal, frankly, you’re better off providing no services at all. If, on the other hand, you recognize the patient’s relief as something of value, simple economics would suggest that you’ll have to exchange something of value to obtain it.

None of this is to say that preventive care is pointless. With an appropriately engaged patient and a clear discussion of their risks and benefits, these interventions can clearly help people to live longer, healthier lives. But consideration of the finer points of the doctor-patient relationship is a far cry from mandating that third parties provide an unlimited quantity of services to unlimited numbers of people at no cost. If the end result of this latest effort at health reform is to massively subsidize the provision of services with no real benefit, it’ll be hard to look back on it as a success.

Friday, April 19, 2013

What Evidence-Based Medicine Taught Me About Terrorism

Like most everyone else in the United States, I've been preoccupied this week by the shocking events in Boston. As I noted earlier, I lived and worked in the Boston area from 2008 through 2010, and still have many friends and colleagues in the area. Needless to say, the tragic bombings on Monday and today's deadly manhunt and lockdown were tough to watch. I've stood in the Marathon crowds on Patriots' Day before, and much of what's happened in the past 24 hours has taken place in and around my old neighborhood. An MIT police officer was killed last night about six blocks from where I used to live, and in some of the surreal online photos showing the empty streets in Cambridge, I can see my old apartment. And to me, today's total disregard of the 4th Amendment is pretty troubling (I thought the jack-booted guys barging into your home only existed in the fevered imaginations of right-wing conspiracy nuts).

Yet much as I wish none of this had happened, I can't shake my tendency to think critically about the response to these events from policymakers and law enforcement. To me, there's nothing more dreadfully tasteless than exploiting a tragic event to score political points. Which is why I felt the need to comment when I saw this item from Stewart Baker, a former Homeland Security official. Essentially, Baker is arguing that the Boston attacks prove that groups like the ACLU and the Electronic Frontier Foundation are wrong about widespread camera surveillance and the Internet spying bill known as CISPA. The logic here, presumably, is that widespread surveillance by law enforcement can aid in preventing terror attacks. Since data are meaningless without context, I'm assuming Mr. Baker believes this would be achieved through the same kind of "data mining" that's already getting tiresome in the health care world.

This line of thinking calls to mind an experience of mine from the world of evidence-based medicine. A few years ago, I worked with a large medical group on an evaluation of a prenatal genetic testing technology; to protect the identities of all involved, let's say it was a test for Syndrome X. Syndrome X is an awful condition that pretty much guarantees a short and horribly painful life for anyone born with it. The idea behind the test is to identify the genetic abnormality in utero, and advise the mother to terminate the pregnancy.

Once trials evaluating the test were published, the medical group's genetic counselors became very interested: the test had a sensitivity around 95% and a specificity around 75%. In a vacuum, these are pretty solid data; unfortunately, the test doesn't exist in a vacuum. Like many congenital diseases, Syndrome X is very rare, implying that any broadly defined population will have lots of true negatives and a handful of true positives. Let's say the annual incidence is 1 in 10,000, and in a given year your population is so large that you'll test 1,000,000 pregnancies. That means 100 affected pregnancies, and the sensitivity says you'll correctly identify 95 of them. Unfortunately, the specificity suggests you'll end up erroneously advising 249,975 women (25% of the 999,900 unaffected pregnancies) to terminate a perfectly healthy pregnancy. Most diagnostic tests employ a threshold that can be adjusted for greater specificity at the expense of sensitivity (i.e., fewer false- and true-positive results), but the bigger problem here is that Syndrome X is so rare. When the consequences of a false positive are this undesirable, adjusting the test threshold to avoid them may wipe out its sensitivity altogether.
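
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the Syndrome X numbers above. The function name `screening_outcomes` and its output keys are my own labels for illustration, not anything from the trials or the medical group's evaluation; the inputs are the hypothetical figures from the paragraph.

```python
from fractions import Fraction


def screening_outcomes(population, incidence, sensitivity, specificity):
    """Split a screened population into the four confusion-matrix cells.

    Hypothetical helper for illustration; Fractions keep the counts exact.
    """
    affected = population * incidence
    unaffected = population - affected
    true_pos = affected * sensitivity            # affected and flagged
    false_pos = unaffected * (1 - specificity)   # healthy but flagged
    return {
        "true_pos": true_pos,
        "false_neg": affected - true_pos,
        "false_pos": false_pos,
        "true_neg": unaffected - false_pos,
        # Share of positive results that are actually correct
        "ppv": true_pos / (true_pos + false_pos),
    }


# Figures from the Syndrome X example: 1,000,000 pregnancies tested,
# incidence of 1 in 10,000, sensitivity 95%, specificity 75%.
result = screening_outcomes(
    population=1_000_000,
    incidence=Fraction(1, 10_000),
    sensitivity=Fraction(95, 100),
    specificity=Fraction(75, 100),
)

print(f"True positives:  {int(result['true_pos']):,}")   # 95
print(f"False positives: {int(result['false_pos']):,}")  # 249,975
print(f"Chance a positive result is real: {float(result['ppv']):.4%}")
```

Under these assumptions, well under one positive result in a thousand is a true case, which is the whole problem with screening for rare conditions in a nutshell.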

As with Syndrome X, the math behind terrorism is important to keep in mind. Though it's easy to lose perspective in weeks like this, the actual incidence of acts of terror is still vanishingly small (thankfully). And this needs to be factored into any discussion of surveillance and the mining of data to pre-empt those acts. It really can be thought of in terms of sensitivity and specificity. Given the terrible drawbacks of false positives (effectively ruining the lives of innocent people falsely accused, and wasting the time, the attention, and the resources of law enforcement in the process), my preference is for a specificity as close to 100% as possible. This might make it next to impossible for mining algorithms to anticipate attacks, but such is the trade-off when living in a civilized society. The question is pretty simple: how many innocent people should have their lives ruined by a terrorism investigation so that one actual terrorist can be caught? And what do we do if that number is really, really high?
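
To put a rough number on that last question, here is a small Python sketch that computes false positives per true positive at different specificities. To be clear about the assumptions: I have no real data on terrorism or on any surveillance algorithm, so this reuses the hypothetical Syndrome X incidence (1 in 10,000) and sensitivity (95%), and it holds sensitivity fixed as specificity rises, which is generous, since in practice raising one usually lowers the other. The function name `false_alarms_per_hit` is made up for illustration.

```python
from fractions import Fraction


def false_alarms_per_hit(incidence, sensitivity, specificity):
    """Expected false positives for every true positive (hypothetical helper)."""
    true_pos_rate = incidence * sensitivity
    false_pos_rate = (1 - incidence) * (1 - specificity)
    return false_pos_rate / true_pos_rate


# Assumed, illustrative inputs borrowed from the Syndrome X example.
INCIDENCE = Fraction(1, 10_000)
SENSITIVITY = Fraction(95, 100)

ratios = {}
for spec in (Fraction(75, 100), Fraction(99, 100), Fraction(999, 1000), Fraction(1)):
    ratios[spec] = false_alarms_per_hit(INCIDENCE, SENSITIVITY, spec)
    print(f"specificity {float(spec):.1%}: "
          f"{float(ratios[spec]):,.1f} false positives per true positive")
```

Even at 99.9% specificity, these made-up inputs still produce roughly ten false alarms for every real hit. With an event far rarer than 1 in 10,000, the ratio only gets worse.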