The Data Provenance Initiative aims to establish a standardized way of capturing, retaining, and exchanging the provenance of health information, including inbound, system-generated, and outbound provenance. While there have been efforts to address data provenance, no authoritative specification, standard, or model for provenance has been commonly adopted to date within the context of health IT (HIT). The Data Provenance Initiative seeks to develop a process to help establish such standards.
The goals of the Data Provenance Initiative include:
Establishing guidance for handling data provenance in content standards, including the level to which provenance should be applied
Establishing the minimum set of provenance data elements and vocabulary
Standardizing the provenance capabilities to ensure interoperability
The term “provenance” in the context of health IT refers to evidence and attributes describing the origin of health information as it is captured in a health system. The requirements for data provenance information must support the full lifecycle and lifespan of health IT data. As the exchange of health data increases, so does the demand to track the provenance of this data over time and with each exchange instance. Confidence in the authenticity, trustworthiness, and reliability of the data being shared is fundamental to robust health information exchange with enhanced privacy, safety, and security. Truth and trust may be improved by a standardized way to capture and express the provenance of the data and by the expectation that systems can recognize and validate the provenance information. This in turn can enable uses such as “chain of trust” and “chain of custody” and other business requirements and applications (for example, records management, evidentiary support, and clinical decision support).
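As an illustration only, the idea of capturing provenance in a form that a receiving system can validate might be sketched as a hash-linked list of events in Python. Nothing here comes from the initiative or from any standard: the `ProvenanceEvent` fields, the `append_event` and `verify_chain` helpers, and the sample identifiers are all hypothetical, and the three event kinds are simply borrowed from the inbound, system-generated, and outbound categories named above.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from typing import List, Optional

# Hypothetical event kinds, mirroring the three provenance categories in the text.
EVENT_KINDS = {"inbound", "system_generated", "outbound"}


@dataclass(frozen=True)
class ProvenanceEvent:
    """One link in an illustrative chain of custody (not a standard format)."""
    data_id: str              # identifier of the health data item (assumed)
    kind: str                 # one of EVENT_KINDS
    actor: str                # system or organization responsible (assumed)
    timestamp: str            # ISO 8601 string supplied by the caller
    prev_hash: Optional[str]  # hash of the previous event, None for the first


def event_hash(event: ProvenanceEvent) -> str:
    """Return the SHA-256 hex digest of a canonical JSON form of the event."""
    payload = json.dumps(asdict(event), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_event(chain: List[ProvenanceEvent], data_id: str, kind: str,
                 actor: str, timestamp: str) -> List[ProvenanceEvent]:
    """Return a new chain with one more event linked to the current last event."""
    if kind not in EVENT_KINDS:
        raise ValueError(f"unknown provenance event kind: {kind!r}")
    prev = event_hash(chain[-1]) if chain else None
    return chain + [ProvenanceEvent(data_id, kind, actor, timestamp, prev)]


def verify_chain(chain: List[ProvenanceEvent]) -> bool:
    """Check that every event points at the hash of the event before it."""
    expected = None
    for event in chain:
        if event.prev_hash != expected:
            return False
        expected = event_hash(event)
    return True


# Example: data is created, sent, and then received by another system.
chain: List[ProvenanceEvent] = []
chain = append_event(chain, "obs-001", "system_generated", "Clinic EHR", "2014-06-01T09:00:00Z")
chain = append_event(chain, "obs-001", "outbound", "Clinic EHR", "2014-06-01T09:05:00Z")
chain = append_event(chain, "obs-001", "inbound", "Regional HIE", "2014-06-01T09:05:02Z")
print(verify_chain(chain))  # True

# Altering an earlier event breaks the link held by its successor.
tampered = [replace(chain[0], actor="Unknown")] + chain[1:]
print(verify_chain(tampered))  # False
```

This sketch only detects changes to events that have a successor in the chain. Protecting the most recent event, or establishing who actually produced an event, would need additional mechanisms that are outside what this example shows.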
A significant amount of literature has discussed the value, importance and legal necessity of provenance.
*President’s Council of Advisors on Science and Technology. Report to the President: Realizing the Full Potential of Health Information Technology to Improve Healthcare for Americans: The Path Forward, at 25. http://www.whitehouse.gov/sites/default/files/microsites/ostp/pcast-health-it-report.pdf
While there are efforts to address data provenance, no authoritative specification, standard, or model for provenance has been commonly adopted to date within the context of HIT. This is exemplified by the variance in how health information exchanges (HIEs), electronic health records (EHRs), and personal health records (PHRs) currently capture, retain, and display provenance. This variability hinders the interoperable exchange (system interoperation), integration, and interpretation of health data.
Even current health information standards, such as the CDA, lack guidance for handling data provenance. Additionally, the receipt and integration of provenance information are equally variable and depend on system capabilities. Further challenges arise if one system can share detailed provenance data but the receiving systems cannot process the level of detail exchanged.
It is important for healthcare stakeholders to have confidence in the authenticity, reliability, and trustworthiness of shared data (trust). This trust requires a clear trail of data provenance in order to validate and determine how to use the data.
To help meet this challenge, the goals of this initiative include:
Although significant challenges in addressing data provenance have been identified, some out-of-scope items will not be directly addressed during this initiative.
In addition to use case development, a Tiger Team will be established to work with standards development organizations (SDOs) in support of this initiative. This group will accelerate the standards analysis activity and provide recommendations back to the initiative. It is an explicit goal of this effort to identify those components of data provenance that can be expanded to support multiple content specifications. The initial focus of the Tiger Team will be a review of CDA and the development of reusable building blocks. Subsequent application of these building blocks to other technologies will be prioritized by the initiative.
To identify and define guidance on use of standards to facilitate provenance capabilities by specifying the following (to the extent specified by the Use Case):
Out of Scope
Out of scope items include the following:
The Data Provenance Initiative will improve confidence in the integrity of health information from creation to exchange and integration across multiple health information systems and between parties. Ultimately, these standards will improve trust in healthcare data and its applications, which may include clinical care, interventions, analysis, decision making, and clinical research, among others.
Note: The standards below are a starting list for consideration and will be updated based on community feedback during Charter discussions and as use case requirements are further identified. This is not a full list of all possible standards.
This initiative may not address all of the data provenance challenges that exist today such as the following:
DPROV Charter Final For Consensus