Google Wants Guinea Pigs for a New Medical Study. Here’s Why I’d Volunteer.

With health data, the default should be to look for safe ways to share.

After six months of covering the merger of technology and medicine for Re/code, I’ve come to believe one thing very strongly: The next great insights into health and disease, and the resulting breakthroughs in diagnostics and treatments, are likely to emerge at the intersection of these disciplines.

I’ve become convinced because nearly every researcher I speak with drives home this point in words and deeds, with advanced medical research increasingly relying on genomic sequencing and other forms of big-data analysis.

But it also simply makes sense: These tools and techniques are allowing scientists to understand biology at a more basic and fundamental level than has ever been possible in the past. They’re steadily unlocking the programming code of life itself.

These approaches are especially promising for devising personalized cancer treatments, based on the specific mutations within a person’s particular tumor.

This realization has shifted my thinking on online privacy. I’ve been a frequent critic of the policies and blunders of various Internet players, and will always believe that we should be thoughtful and deliberate about how we manage personal data in this Information Age.

As I pointed out recently:

Supposedly “de-identified” data has proven to be anything but on several notable occasions in the past (including here, here and here). And electronic medical records have been compromised already.

But in the context of health care, I’ve come to believe that we need good and specific reasons to cling to our data. The default should be to look for safe ways to share.

We can’t afford to mindlessly indulge our abstract fears about privacy, and generalized resentment of big-tech businesses, when there is so much to be gained for society.

Health data is, as one researcher put it to me, the “grist for the mill” — and as it is, far too much of it is locked away in paper filing cabinets of clinics, isolated by well-meaning but out-of-date laws, or jealously guarded by corporations.

This all came to mind late last week, when Google revealed plans to conduct a “Baseline Study” to “establish a basic understanding of a healthy physiology at this most fundamental level.”

The Mountain View, Calif., technology giant’s research division plans to begin with a small pilot program surveying 175 healthy people, then will collaborate with researchers at Duke and Stanford on a far broader study.

Participants will provide blood, saliva and other samples, and will undergo full genomic sequencing and other tests. Google will analyze the data using its sophisticated algorithms and powerful computer network.

“This could become a reference tool that could inspire even more research studies,” Google said in a press release. “And in the long run, we hope this could be a small contribution toward helping the medical profession find new, proactive ways to keep us healthy.”

The company stresses that the effort is strictly for science, and says it’s taking pains to protect patient confidentiality. The study will be overseen by an institutional review board, samples will be collected by the health institutions, and the data will be given to Google only once the names and Social Security numbers have been scrubbed.

But one point did initially give me pause: The information handed over will include full genome sequences of individual participants.

Curious about the implications of that, I contacted Hank Greely, a Stanford law professor focused on the ethical and legal issues associated with biomedical technologies. He said that a full genome sequence, the three billion DNA base pairs that make you you and me me, can only be anonymous if you define “anonymity” in a narrow way.

“I’m not saying people shouldn’t sign up for this,” he said in an interview. “But they need to know going into it that nobody can honestly promise you anonymity or confidentiality.”

That’s because once someone has the sequence, they could theoretically match it up to anywhere else that data lives — for example, on heredity sites like Ancestry.com, 23andMe and Family Tree DNA. In fact, several dozen adoptees reportedly used DNA tests to figure out the likely surname of their biological fathers on the latter site, the BBC reported in 2008.

As these tests become cheaper and more popular (the cost of full genome sequencing has plummeted a millionfold in the last decade, and simpler SNP tests already cost less than $100), there are likely to be more places where this sort of data is available.

The privacy issue was a persistent theme in the coverage of the Baseline Study last week, and that’s probably a good thing. It’s always a fair question to ask.

But with that all said, after several exchanges with Google, the risks in this very specific circumstance seem tiny to me.

The company isn’t hosting this data publicly, so the only worrisome scenarios are that someone hacks into it or that a rogue Google X employee decides to abuse it, for reasons that would be difficult to fathom.

Down the road, if the study produces useful insights, Google may share some information with outside researchers, but only those working on formal studies also approved by institutional review boards. It won’t ever hand it out to the public, the company says.

I have two tests that I try to apply when thinking about appropriate privacy boundaries: Do consumers have choice, and do they have transparency?

In this case, the answer appears to be “yes” to both. The study is purely voluntary; no one will be compelled to participate. And I’m assured that the consent form explicitly describes the possible risks associated with sharing genomic data, precisely as Greely advocates.

Given these precautions, I’m prepared to say that I’d be comfortable participating in this study — at least if I qualified as healthy, which, unfortunately, I probably don’t.

There’s no way of knowing whether Google’s study will actually produce any genuine scientific leaps, but there’s every reason to believe that one analysis of this sort soon will.

This article originally appeared on Recode.net.
