Big Data or Big Brother? The Design Flaw Threatening Uncle Sam's Big Data Healthcare Plan--And How To Fix It

A big front-page story, and a scary headline, in the April 16 Washington Post: “Patient-records plan faces ethical pitfalls.” “Ethical pitfalls”? Yikes! Alert the privacy police! That is, the enforcers on the right, as well as the left!

Could our personal medical records fall into the hands of unethical employers, who will use them to discriminate? Or could our records fall into the hands of the Obamacare powercrats, who will use them to manipulate? Could institutional oppression, and the black helicopters, be far behind?

The subhead of the Post article, appearing above the fold on the hard-copy of the paper, was more reassuring: “Huge data project offers a research bonanza if issues are resolved.” (Interestingly, the online headline was different–and much more positive: “Scientists embark on unprecedented effort to connect millions of patient medical records.”)

The potential health benefits of big data, as we shall see, are enormous. So it’s unfortunate that a basic design flaw in the federal government’s approach to big data threatens to put a public-opinion roadblock in front of the whole enterprise.

Yet in the meantime, that header in the Post, citing “ethical pitfalls,” could be enough to energize the opposition of the Tea Party, the ACLU, and the trial lawyers–a trans-ideological trifecta. And while there’s nothing wrong with eternal vigilance, it would be a shame if the huge health value of new discoveries were to be shut down by misinformation. After all, big data in medicine is a thrilling prospect: the fusing of traditional medical research and the new power of Moore’s Law. Such “fusion power” ought not to be lost in a tangle of suspicion, regulation, and litigation.

The subject of the Post story is the Patient-Centered Outcomes Research Institute (PCORI), a federal agency created in 2010 by the Affordable Care Act, aka, Obamacare. To be sure, those governmental origins will make PCORI deeply suspect in the eyes of many Americans; folks will not be reassured to read its “mission and vision” statement, which declares that it “helps people make informed healthcare decisions, and improves healthcare delivery and outcomes, by producing and promoting high integrity, evidence-based information that comes from research guided by patients, caregivers and the broader healthcare community.” Yes, those smooth words all sound great; indeed, one might think they’ve been filtered through a focus group–you know, sort of like the notorious spinline, “If you like your plan, you can keep your plan.”

The heart of PCORI is “comparative effectiveness research” (CER). It’s an idea tracing its roots back to the first medical case studies–what worked, and what didn’t. Since then, CER has morphed its way through the bureaucracy under various guises, including such familiar phrases as “managed care,” and such unfamiliar concepts as “quality of life years.” Yet today, in good federal fashion, PCORI has ramped up the stakes on CER; in the last four years, the agency has approved 279 grant awards, totalling $464 million.

So yes, that front-page Post headline, warning of “ethical pitfalls,” was plenty hot–but the story itself was pleasantly cool and neutral. As Post reporter Ariana Eunjung Cha explained,

Government-funded scientists have begun collecting and connecting together terabytes of patient medical records in what may be one of the most radical projects in health care ever attempted. The data — from every patient treated at one of New York’s major hospital centers over the past few years — include some of the most intimate details of a life. Vital signs. Diagnoses and conditions. Results of blood tests, X-rays, MRI scans. Surgeries. Insurance claims. And in some cases, links to genetic samples.

Yet whether it’s described in terms hot or cool, that’s a tellingly comprehensive chronicle of human lives and misery, all to be piled up in Uncle Sam’s vault. And there’s more:

The effort is being duplicated at 10 other sites across the country using data from hospitals, academic research centers, community health clinics, insurers and other sources. If all goes well, by September 2015 they will be linked together to create a giant repository of medical information from 26 million to 30 million Americans.

So that’s the PCORI story, according to the Post. Now for the reaction. One of the first comments on the article sniped, “Gee, not a single reference to HIPAA and individual privacy rights.” HIPAA, of course, is the Health Insurance Portability and Accountability Act of 1996, which seeks to protect the privacy of “individually identifiable health information.” In fact, reporter Cha used the word “privacy” five times in her piece; evidently, though, to the critics, that wasn’t enough.

HIPAA is indeed a big issue. Medical records are private for a reason–actually, lots of good reasons. If health records fall into the wrong hands, the result can be embarrassment, loss of insurance, even loss of employment. And, more broadly, one’s private stuff is nobody else’s business.

However, if medical data is valuable to the individual, it is also valuable to society, now and in the future. After all, knowledge is cumulative: The case-history of one patient–whether it ends happily or not–can be profoundly consequential for future cases. Doctors learn from experience, as well as experiments; it’s vital to medical science that such learning be preserved and propagated.

Just this week, for example, we have been reminded of the disastrous consequences of inadequate data: We learned that perhaps one in twenty patients at outpatient clinics is misdiagnosed. We can ascribe these failures to either carelessness or incompetence, and yet it would be nice if we could then prescribe a solid regimen of big data; that is, better information could facilitate better diagnostics–for example, providing doctors with a computerized protocol to work from. It takes away nothing from the physician’s art to have a reference checklist, and it might greatly aid the patient’s peace of mind.

Thus the complicated challenge: Protect personal privacy and protect the interests of the rest of us. Help the patient, as well as help therapeutic science.

Fortunately, there’s a way to do all these things: Big data can be collected and managed in a “HIPAA-compliant” manner if it is “de-identified,” that is, stripped of its linkage to any specific person. Can this be done? Yes. The Federal Aviation Administration, for example, has been gathering de-identified data for decades through the Aviation Safety Reporting System, which NASA administers on its behalf. The result has been a spectacular success; effective safety protocols are developed, nobody gets sued, and, most importantly, the program works–the death rate in passenger aviation accidents has plummeted. Today, it’s 99 percent safer to fly than it was just four decades ago. So yes, big data, properly applied, can work wonders.
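
For the technically curious, here is a minimal sketch of what “de-identifying” a record might look like in code. It is an illustration under stated assumptions, not a description of how PCORI or any agency actually does it: the field names (name, ssn, patient_id, birth_date) and the deidentify helper are hypothetical, and real HIPAA de-identification involves far more than this. The idea is simply to drop the direct identifiers, swap the patient ID for a keyed hash so that one patient’s records can still be linked to each other, and coarsen the birth date to a year.

```python
import hashlib
import hmac

# Hypothetical field names, chosen for illustration only.
DIRECT_IDENTIFIERS = {"name", "ssn", "address", "phone"}


def deidentify(record, secret_key):
    """Return a copy of `record` with direct identifiers stripped (a sketch)."""
    # Drop the direct identifiers and the raw patient ID.
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "patient_id"
    }
    # Replace the patient ID with a keyed hash (HMAC-SHA256), so that records
    # from the same patient still match each other without exposing the ID.
    cleaned["pseudonym"] = hmac.new(
        secret_key, str(record["patient_id"]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Coarsen the birth date (assumed to be "YYYY-MM-DD") to the year alone.
    if "birth_date" in cleaned:
        cleaned["birth_year"] = cleaned.pop("birth_date")[:4]
    return cleaned


# A made-up record, for demonstration only.
record = {
    "patient_id": 1234,
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "birth_date": "1950-06-01",
    "diagnosis": "hypertension",
}
clean = deidentify(record, b"demo-key")
```

In this sketch the diagnosis survives, the name and Social Security number do not, and whoever holds the secret key is the only party able to regenerate the pseudonym from a patient ID. Whether that is enough to prevent re-identification in practice is a separate question, and a hard one.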

Of course, trust is a big factor in the workability of any system–and today, there’s not much trust, anywhere, in the healthcare system. Liberals fear the insurance companies, conservatives fear the government, and nobody, left or right, likes rationing.

So what to do? How to restore trust? At a minimum, it’s going to take another election or two, so that the public can begin to regain confidence in the ability of our leaders actually to lead the system.

Yet even as big data as a cost-management tool has spilled into politics, the stakes of big data have grown even larger. Today, big data is more than just a tool for cost-cutting; it is also a tool for cure-finding.

Not surprisingly, this new cure approach has its origins in data-drenched Silicon Valley. Four years ago, for example, we learned that Sergey Brin, co-founder of Google, is attempting to build a database of every Parkinson’s sufferer in the world. The hope for Brin–whose family suffers from the disease–is that he can figure out the linkages between all the Parkinson’s cases and use that information to leapfrog the process of scientific discovery. As Wired magazine explained,

Brin is proposing to bypass centuries of scientific epistemology in favor of a more Googley kind of science. He wants to collect data first, then hypothesize, and then find the patterns that lead to answers. And he has the money and the algorithms to do it.

Indeed, even before Brin, medical visionary Michael Milken was seeking ways to use research data to isolate every possible variable and identify every possible solution. The Milken Institute’s FasterCures provides a helpful explanation of big data’s potential to improve health in its Consortia-pedia program.

In the words of a 2013 report from McKinsey & Company, “While health-care costs may be paramount in big data’s rise, clinical trends also play a role.” That is, medical treatment and research are now pushing big data forward. The report continues:

Physicians have traditionally used their judgment when making treatment decisions, but in the last few years there has been a move toward evidence-based medicine, which involves systematically reviewing clinical data and making treatment decisions based on the best available information. Aggregating individual data sets into big-data algorithms often provides the most robust evidence, since nuances in subpopulations (such as the presence of patients with gluten allergies) may be so rare that they are not readily apparent in small samples.

Using big data, researchers can identify environmental issues and zero in on public health outbreaks. They can also make linkages between diseases and the multitudinous variables of genetics and the genome.

Big data is for real. It’s been estimated that more healthcare data has been generated in the last five years than in all of previous human history. This proliferation of data is not necessarily the same thing as wisdom, to be sure, but wisdom is nevertheless to be gleaned from that data.

In other words, way beyond the goal of mere cost-control, the ultimate promise of big data is better and longer life. And yes, PCORI fits in here–or at least it could.

Yet there’s a problem with PCORI; actually, two problems:

First, PCORI’s origins are in cost-cutting, in the more “cost-effective” bureaucratic management of healthcare. After all, President Obama campaigned for the White House on a promise to reduce the typical family’s healthcare outlays by a third; the idea of “bending the cost curve” is deep within the Democrats’ political DNA.

And yet the reality is that people want more healthcare, not less: The Kaiser Family Foundation finds that by a more than 4:1 margin, ordinary Americans feel that they are getting too little treatment, even as the policy elite believes that Americans are getting too much. So there’s the basic split, right there: The masses on one side, the elitists on the other.

As a result, when many Americans see any part of Obamacare–including PCORI–coming their way, they will see the r-word, “rationing.” Indeed, some will see two words: “death panels.”

So sure, PCORI is a tool, but every tool has a dual use–a good use and a bad use. Knowledge is important, but intentions are important, too. As the late Jack Kemp liked to say of poor people, “They don’t care that you know till they know that you care.” So today, if the American people as a whole perceive PCORI–and the Obamafied healthcare system as a whole–as an expression of officialdom’s desire for cost-cutting, they will fear it, and certainly not trust it.

So part of the PCORI problem is optics, and part is reality. If people don’t trust the system–that is, if they think the system is more interested in saving money than it is in saving their lives–then they won’t be interested in hearing even the most reasoned arguments on behalf of PCORI, CER, or any other bit of trendy jargon.

Second, for all its intellectual ambitions, PCORI is actually, paradoxically, too small. As the Post article noted, it hopes to build a database of up to 30 million people–but that’s less than a tenth of this country of 318 million people. In relative context, PCORI isn’t a big-data operation; it’s a medium-data operation.

Yet before PCORI gets anywhere near that smallish target of 30 million people, it will have to overcome the fear–a fear sure to be fanned by the headline of the Post article–that the PCORI data-collection operation will lead to the stigmatizing, even ghettoizing, of those 30 million people. After all, PCORI is gathering data on patients–on sick people. And so all of us have a right to ask: What will become of this data? Will it be gathered up and used against me?

As veteran medical observer Jeremy Shane explains, “No one wants data from their weakest physical moment splayed all over the Internet–or at least that’s what they fear will happen.”

To be sure, at one level, it makes perfect sense for PCORI to gather information on patients; one has to start somewhere. Yet at another level, it doesn’t make sense: PCORI is thinking too small.

In healthcare, big data is not just about sick people–it’s about all people. The essence of big data is that one no longer needs to settle for a mere sample of the population; one can study the whole of the population. With big data, there’s no need for an educated guess based on a snapshot; the picture can be of the whole panorama–and then computers can drill down to find the exact right answer. So say goodbye to intuition and impression; say hello to data-crunching totality.

Shane is a believer in the power of big data. As he notes,

We need to remind people that if we have the data, suitably anonymized, we can accelerate cures. That way the offer to the public–the whole public–can be this: Give us your data and we will be able to help you get better faster if you get sick. The big-data program doesn’t need to be mandatory, but it must be inspiring. If it isn’t inspiring, it will not succeed.

Part of that inspiration, Shane continues, would be the shared feeling that data collection is a national civic enterprise. That is, true leaders would communicate to the public that here’s a chance for everyone to make a contribution for the betterment of all. Not everyone will harken to such an idealistic message, of course–but most will.

Shane wants to make it as easy as possible to do the right thing. As he puts it, “The political case for collecting data from people when they are sick would prove to be much more compelling if such data were also collected from people when they are healthy.” That is, no singling out of the sick; sharing data is routine, not exceptional. If we see ourselves as all-in-this-together, then there’s no need to fear discrimination.

And now we must ask ourselves: Is the Obama administration thinking about healthcare in an inspiring manner? Is its track record such that we are inspired to join up? To pitch in?

But wait a second, one might say: If we gather everyone’s data, aren’t we potentially violating everyone’s privacy? If it’s a bad idea to nose around in sick people’s affairs, doesn’t it make it worse to nose around in healthy people’s affairs? After all, the healthy aren’t even involved in the specific medical case–why should they get roped in?

Moreover, in the wake of a myriad of privacy violations–topped off by the disclosures of Edward Snowden–is this really a good time to ask Americans to trust a national data system? People can indeed be forgiven for being more than a little mistrustful.

In such a difficult situation, there are two principal responses:

The first response is a retreat, in the name of privacy, into fatalism, even nihilism–that is, rejecting the idea of big-data problem-solving in the healthcare sector. And so it’s easy to imagine, for example, the PCORI effort dissolving into litigation and paranoia. Such a retreat might seem tempting, even inevitable, but it is not an answer for the health needs of America.

The second, positive, response is a call for leadership. That is, leadership which can transform the current system and inspire large-spirited civic generosity. Yes, that’s a tall order, seemingly beyond the reach of the current political class. But in the most literal sense, such transformational and inspirational leadership is the answer to our health challenges, because such leadership–and only such leadership–can unleash big data to solve otherwise intractable problems.

In the absence of an inspirational vision, it’s easy to see the PCORI idea smashing into a rock of opposition–or make that rocks of opposition, from the left, from the right, and from the tort bar.

Indeed, in the wake of that Post story, it’s not hard to see activists and protestors zeroing in on PCORI: “It’s a creepy DC snooping operation, prying on those who already have enough problems!” “It’s a government rationing tool, a way for the bureaucrats to cull the population!” “It’s an abuse of power, causing irreparable harm to my client that only monetary damages–and contingency fees for me–can begin to make right!”

Returning to that PCORI Post article, what’s perhaps most striking is the absence of national, authoritative political leadership. Yes, President Obama is mentioned, but no other political figures. In other words, no leader is willing to stand up and say, “I have a vision of better treatments and cures. And I see science and big data as vital parts of that vision. So now let’s work together both to protect people and to advance science.”

Will that happen? Who knows? All we know for sure is that it hasn’t happened yet. But if we don’t see that sort of leadership forthcoming, our health will be worse, and our lives will be shorter.