
As part of an ongoing effort begun in 2008, an NIH-funded research team collects and manages data for the Large National Survey, wherein consented participants agree to allow a public use data set to be made available to researchers. Data elements in the survey include preferences and self-reported health status. Participants also authorize the release of their Medicare claims data and linkage to public data sets for future research. The informed consent authorizes NIH to publicly post data sets deidentified in accordance with the HIPAA Safe Harbor standard.
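As an illustrative sketch only (not a compliance tool), two of the Safe Harbor generalizations relevant to this scenario are truncating ZIP codes to their first three digits and aggregating ages over 89 into a single category. The field names below are invented for illustration.

```python
def safe_harbor_generalize(record):
    """Return a copy of `record` with ZIP and age generalized per Safe Harbor."""
    out = dict(record)
    # Safe Harbor: retain only the initial three digits of the ZIP code.
    out["zip"] = record["zip"][:3] + "**"
    # Safe Harbor: ages 90 and over are aggregated into one "90+" category.
    out["age"] = "90+" if record["age"] >= 90 else record["age"]
    return out

print(safe_harbor_generalize({"zip": "02138", "age": 93, "status": "good"}))
# -> {'zip': '021**', 'age': '90+', 'status': 'good'}
```

Note that the full Safe Harbor standard removes 18 categories of identifiers and carries further conditions (e.g., on small geographic units); this sketch shows only the flavor of the generalization step.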

In 2013, the CDC funds Max Researcher to use the NIH-funded data set for exploratory research. Max integrates the restricted data set with publicly available geocoded data sets from the CDC and the Census Bureau describing socio-economic and environmental health risk factors, aiming to predict the best treatments for chronic disease with greater patient-specific and subgroup-level accuracy.

Three years into Max’s exploratory research grant (2016), Harvey Hacker at Computer Science University demonstrates that he can apply linear programming methods to uniquely identify two of the individuals in the NIH-funded data set by combining it with voter registration records and the same geocoded data sets Max Researcher is using. These two individuals represent 0.01% of the Large National Survey population. Harvey alerts the Large National Survey before publishing his findings in Computer Science Journal and as a New York Times op-ed.
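The attack Harvey demonstrates is a form of record linkage: joining a "deidentified" release against a public record (such as voter rolls) on shared quasi-identifiers. The toy sketch below uses invented records and field names to show how a combination of attributes that is unique in both data sets reidentifies a survey row; Harvey's actual linear programming method is not reproduced here.

```python
from collections import Counter

# Invented "deidentified" survey rows and public voter-roll rows.
deidentified = [
    {"zip3": "021", "birth_year": 1950, "sex": "F", "health": "fair"},
    {"zip3": "021", "birth_year": 1950, "sex": "M", "health": "good"},
    {"zip3": "021", "birth_year": 1962, "sex": "M", "health": "poor"},
]
voter_rolls = [
    {"name": "A. Smith", "zip3": "021", "birth_year": 1950, "sex": "M"},
    {"name": "B. Jones", "zip3": "021", "birth_year": 1962, "sex": "M"},
    {"name": "C. Brown", "zip3": "021", "birth_year": 1962, "sex": "M"},
]

def key(r):
    # Quasi-identifier combination shared by both data sets.
    return (r["zip3"], r["birth_year"], r["sex"])

survey_counts = Counter(key(r) for r in deidentified)
roll_counts = Counter(key(r) for r in voter_rolls)

# A record is uniquely reidentified when its quasi-identifier combination
# is unique in BOTH data sets: one survey row matches exactly one voter.
for r in deidentified:
    if survey_counts[key(r)] == 1 and roll_counts[key(r)] == 1:
        name = next(v["name"] for v in voter_rolls if key(v) == key(r))
        print(f"{name} -> health status {r['health']!r}")
# -> A. Smith -> health status 'good'
```

The 1962 records survive here only because two voters share that combination; the attack succeeds exactly where a combination is rare, which is why small uniquely identified fractions (like the 0.01% in the scenario) still matter.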

Approximately 20% of the participants in the Large National Survey see the story and exercise their right to withdraw their data from the Large National Survey by sending Max Researcher a signed letter with their designated subject ID, which is provided each time the survey is administered.

In response, the Large National Survey removes the publicly available online data set and changes its agreements to require assurances from users that they will not combine either restricted or public use data with other data sets. Max Researcher destroys the data she has been using, shuts down her lab, and takes a job at Venture Capital Drug Discovery Firm, which uses privately brokered data sets with greater utility for exploratory research. Max cannot publish her findings under the auspices of her new organization until patents for treatments are granted.

Questions:

  • What is the best way to balance funding agencies’ data-sharing mandates against privacy concerns?
  • What ethical obligations does Harvey have to share or protect the algorithms he used? What obligations do cryptographers have to accurately communicate privacy risks? What obligations do scientific journals carry to publish or protect methods that might be used for unethical purposes?
  • Should Large National Survey research participants be alerted of cryptographers’ findings as newly identified risks?
  • With the newly published algorithms, several other publicly available research data sets are vulnerable: Should these data sets also be removed? Should participants be warned?
  • The Large National Survey Data Stewards create a computing enclave where the Public Use data can be accessed and analyzed but not downloaded. The capacity limitations render many types of analysis infeasible. What alternatives exist?
  • What standards or guidelines exist now for assessing tradeoffs between privacy risks and utility?
  • Should researchers working under IRBs receive any special status or trust that would distinguish them from members of the public so that they might combine data sets to add value to the data?
  • How can the risks of reidentification be balanced against the potential loss of valuable health insights that result from the removal of data sets from the public domain?
  • How can the risks of reidentification be balanced against the burden of limited data access for researchers and potential loss of health insights (e.g., when a researcher removes themselves and their entire research program from the public domain)?

 

Title

Response

Description

Under terms of funding from NIH, data sets collected with public funds must be made available while protecting privacy. Privacy researchers have shown that such data sets do not truly protect privacy, an issue that has received substantial public attention. This has resulted in more conservative approaches by data stewards, increasing barriers to data use by researchers.

Primary actor/participant

Researcher, Data Stewards

Support actor/participant

Funding agency

Preconditions

  • Data Sharing Policies from funding agencies exist
  • Participants have provided informed consent
  • Public data repository can be accessed and combined with other public data
  • IRB and other required approvals

Post conditions

  • The researcher collects and analyzes the data for a specific research study.
  • Data sets are removed from public access

Alternatives

  • Cryptographers do not alert data stewards before results are released
  • Cryptographers post code online and make it available for unrestricted use
  • In addition to Drug Discovery, Venture Capital Drug Discovery Firm is selling data to marketers about the likely identities of patients and their treating physicians for drug detailing.

Considerations

  • Conflicts between mandates of funding agencies for data sharing and privacy concerns
  • Conflicting interests between cryptography researchers’ publishing incentives and the privacy of research participants
  • Public perception of risk vs. actual risk.

Data Elements Considered

Survey, laboratory, demographic, and geocoded data about environmental risks; consented administrative claims data

Purpose of the Data Collection

Research

Purpose of Data Use

Research

Terms of Transfer to the Data Holders

Consent

Terms of Transfer to Researchers

IRB approval, agreements negotiated with data stewards

 
