CIO Insider


Sharing is Caring: A Prescription to Cure Data Sharing Challenges in Healthcare

Sajit Kumar C N, Director IT, AstraZeneca

Data is the new oil. We’ve all heard this several times, in several contexts. It essentially means that data holds the same status in the new-age economy that oil held in the old one. The strongest testimony to this newfound prominence is that the World Economic Forum has recently declared data an economic asset class. Interestingly, there are many similarities between data and oil, the most prominent asset class of the old economy.

Like oil, data increases in value through ‘refining’. Like oil, data has to be handled with care: a data leak is as harmful as an oil leak, if not more so. Oil is globally controlled by a few cartels, such as OPEC (the Organization of the Petroleum Exporting Countries). Similarly, in this digital era, data, and especially personal data, is to a large extent controlled by global digital platform giants like Google and Facebook.

However, the similarities end there. The strength of data lies in its ability to be shared and to create linkages, much like a network. Unlike the oil industry, where strength comes from being a monopoly, data gains its power through collaboration and sharing. This is especially so in a data-intensive industry like healthcare. In this digital age, with an overdose of data about every aspect of business, the successful organizations are the ones that can refine their data, share it usefully and benefit from that shared network. However, unlike in many other industries, in a highly regulated industry like healthcare this data is neither easily accessible nor easily minable because of privacy considerations. This makes data sharing in the healthcare sector a critical topic.

A lot of data is being generated. It is estimated that approximately 2,300 exabytes of 'new data' were generated in this industry in 2020 alone, including patient data, clinical trial data and so on. Yet we put only a very small fraction of it to any use, for various reasons. In fact, it is said that more than 30 percent of clinical trial results are never published at all. Likewise, most of the 'big data' collected from wearables like Fitbit is not put to good use. This restriction on the sharing and usage of data also holds back the advancement of science at large. For lack of original, research-oriented data collection, even scientific papers are becoming over-reliant on past publications instead of greenfield research. The key reasons for this situation are outlined below:

Privacy Considerations
There is an ongoing tension between open science and privacy when it comes to the use of personal health data. Ideally, we are expected to destroy or sufficiently anonymize such data soon after its intended use. Even then, the data is supposed to be used only with appropriate, informed consent from the patient. Here the key term is ‘informed'. There is still a lot of ambiguity around the scope of consent, especially where patients are not well-informed about their rights and privileges.

Data Hoarding
Who owns the patient data? Common sense dictates that the ownership of any data rightfully belongs to its originator, in this case the patient. The collector of the data can and should use it only with explicit permission from the owner. However, hospitals, driven by commercial interests, often refuse to part with that data.

Inadequate Data Infrastructure & Governance
Evidently, there’s a huge volume of data in the healthcare industry, collected and collated from a variety of sources. This includes claims and clinical data, electronic health record information, patient-generated information, sociodemographic data and so on. Added to this, there are various legal restrictions around privacy, consent and the like. All of this calls for a very robust data infrastructure and governance framework. Unfortunately, there is no global policy framework for data governance. Practically every data sharing agreement is essentially a bilateral agreement between a data provider and a data user, and that is a big impediment to scaling up data sharing globally. It is also important to recognize that the worth of data is always highly contextual. Data that is perfectly 'shareable' in one context may not be so in another. There is therefore a need to evolve a universal mechanism to ensure that data is used only for its intended purpose.

Mismatch Between Source & Users
The interaction between data sources and data users is very chaotic, to put it mildly. Many times, we end up collecting whatever data comes our way and then try to figure out what to do with it. In most cases, a clearly articulated need, or a set of questions we are seeking answers to, is missing. Similarly, on the data consumption side, one does not know what data comes from where. This mismatch results in a lot of redundancy and waste in the system in terms of time, cost and effort.

Do all these challenges mean that we should let go of all that data? In this digital era, that data means a lot, especially for AI-driven digital clinical trials, so we need to figure out ways to overcome these challenges. Let’s see what they are.

Involve Public:
We need to have an open, public debate on data reuse. Involve people in the decision making; after all, it’s about them and their data. The patient consent process has to be institutionalized in some form. Create a simple, universal, systemic patient consent workflow. It should support various levels of consent, narrow versus broad, based on how the data will be used. It should also support data-specific consent, based on which part of the data one is comfortable sharing. Finally, there has to be an expiry date for consent; consent cannot be for an unlimited duration. We have already witnessed a concerted drive, especially during the pandemic, to insert a sunset clause for all the personal data collected through health apps.
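The consent model described above (narrow versus broad scope, data-specific sharing and an expiry date) can be sketched as a small data structure. This is only an illustrative sketch: the names `Consent`, `ConsentScope` and `permits`, and the sample category labels, are hypothetical and do not come from any existing standard or system.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class ConsentScope(Enum):
    NARROW = "narrow"  # valid only for the purpose named at consent time
    BROAD = "broad"    # valid for any research purpose


@dataclass(frozen=True)
class Consent:
    """Hypothetical consent record; field names are illustrative only."""
    patient_id: str
    scope: ConsentScope
    purpose: str                 # purpose stated when consent was given
    data_categories: frozenset   # parts of the record the patient agreed to share
    expires_on: date             # sunset date; consent is never open-ended

    def permits(self, category: str, purpose: str, on: date) -> bool:
        """Return True only if the request falls inside what was consented to."""
        if on > self.expires_on:
            return False  # consent has expired
        if category not in self.data_categories:
            return False  # this part of the record was never shared
        if self.scope is ConsentScope.NARROW and purpose != self.purpose:
            return False  # narrow consent covers one purpose only
        return True


consent = Consent(
    patient_id="p-001",
    scope=ConsentScope.NARROW,
    purpose="diabetes-trial",
    data_categories=frozenset({"lab_results"}),
    expires_on=date(2025, 12, 31),
)
print(consent.permits("lab_results", "diabetes-trial", date(2025, 6, 1)))
print(consent.permits("lab_results", "marketing", date(2025, 6, 1)))
```

Every request is checked against all three limits, so a lapsed date, an unshared data category or an out-of-scope purpose is each enough to refuse access.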

Another initiative worth considering is a system of what could be called 'data donation', similar to 'organ donation': a process in which patients give upfront consent for their medical data to be used for larger social causes. Data sharing needs to be seen as a moral responsibility, just like organ donation! We can even look at some commercial consideration for patients if it helps them meet part of their medical expenses, especially among the underprivileged sections of society. There are already some initiatives in this space. For example, Sync for Science (S4S) is a public-private collaboration to develop a simplified, scalable and secure way for individuals to access and share their electronic health record (EHR) data with researchers.

Nationalize Health Data:
Data belongs to its originator. In the context of healthcare, that is mostly the patient. If the originator is not well-informed, it is up to the state to protect their interests. Historically, whenever the equitable distribution of resources has been perceived to be at risk, the state has intervened. For example, in India in the late 1960s, banks were nationalized when the state felt that private banks were not serving the larger social cause.

Medical data is a critical resource meant for the benefit of society as a whole. If this resource is being hoarded by an affluent few, the state has the latitude to intervene. The government should consider creating a national health registry that can store all citizens' health records, clinical trial data and other medical data. There should be a mechanism by which data sets must be deposited in this central database, each with an accession number and/or a specific access address. There should also be systemic traceability for each and every access to the database, with a record of the purpose for which the data was accessed, similar to how configuration management controls the access and use of source code in the software industry.
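As a rough illustration of the deposit-and-trace mechanism suggested above, the sketch below assigns an accession number to each deposited data set and logs every access with its stated purpose. It is an in-memory toy under stated assumptions: the class `HealthDataRegistry`, the `HDR-` accession format and all method names are hypothetical, not a description of any real registry.

```python
from datetime import datetime, timezone
from itertools import count


class HealthDataRegistry:
    """Hypothetical in-memory registry; a real one would need durable, secured storage."""

    def __init__(self):
        self._datasets = {}      # accession number -> (description, payload)
        self._access_log = []    # append-only list of access records
        self._counter = count(1)

    def deposit(self, description: str, payload: bytes) -> str:
        """Store a data set and return its accession number."""
        accession = f"HDR-{next(self._counter):06d}"  # assumed numbering scheme
        self._datasets[accession] = (description, payload)
        return accession

    def access(self, accession: str, requester: str, purpose: str) -> bytes:
        """Return the data only after recording who asked for it and why."""
        if not purpose.strip():
            raise ValueError("a stated purpose is required for every access")
        if accession not in self._datasets:
            raise KeyError(accession)
        self._access_log.append({
            "accession": accession,
            "requester": requester,
            "purpose": purpose,
            "at": datetime.now(timezone.utc),
        })
        return self._datasets[accession][1]

    def audit_trail(self, accession: str) -> list:
        """List every logged access to one data set."""
        return [entry for entry in self._access_log if entry["accession"] == accession]


registry = HealthDataRegistry()
accession = registry.deposit("Phase II trial results (sample)", b"...")
registry.access(accession, requester="researcher-17", purpose="meta-analysis")
for entry in registry.audit_trail(accession):
    print(entry["requester"], entry["purpose"])
```

Because the log entry is written before the data is returned, no read can happen without leaving a trace of the requester and the purpose.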

I am not suggesting a utopian data-socialism regime. Of course we need to protect the interests of the donors of the data and the sponsors of the clinical trials. A lot of effort and investment goes into such data collection, and that needs to be adequately acknowledged and rewarded.

However, it is imperative that withholding information or hoarding data should no longer be seen as a competitive advantage for anyone.

Curate Data Contributions:
The best way to address the source-side challenge is to create metadata about the data itself, so that potential users can easily navigate the data available from various sources. Not just that. It’s also important to store, and make available as needed, the software code required to process, validate or at times reproduce the clinical data. This practice is already in vogue at some of the major publication houses. The International Committee of Medical Journal Editors requires investigators to register a data-sharing plan at the time of registering a trial. This plan must include where the researchers will house the data and, if not in a public repository, the mechanism by which they will provide others access to the data, whether data will be freely available to anyone upon request or only after application to and approval by a learned intermediary, whether a data use agreement will be required, etc.
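A metadata catalog of this kind can be pictured as one descriptive record per data set plus a simple lookup. The sketch below is illustrative only: `DatasetMetadata`, its field names and the `search` helper are hypothetical, loosely modeled on the items a data-sharing plan is said to cover above, and are not an official schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical catalog entry describing a data set, not the data itself."""
    title: str
    source: str                        # who collected the data
    keywords: tuple                    # terms a researcher might search for
    repository: str                    # where the data is housed
    access_mode: str                   # assumed values: "open" or "reviewed"
    data_use_agreement_required: bool
    processing_code_location: Optional[str] = None  # code to process or validate the data


def search(catalog, keyword: str) -> list:
    """Return the entries tagged with the given keyword (case-insensitive)."""
    needle = keyword.lower()
    return [m for m in catalog if any(needle == k.lower() for k in m.keywords)]


catalog = [
    DatasetMetadata(
        title="Sample hypertension trial",
        source="Hospital A (illustrative)",
        keywords=("hypertension", "clinical-trial"),
        repository="public repository",
        access_mode="open",
        data_use_agreement_required=False,
    ),
    DatasetMetadata(
        title="Sample wearable activity data",
        source="Device vendor B (illustrative)",
        keywords=("wearables", "activity"),
        repository="private archive",
        access_mode="reviewed",
        data_use_agreement_required=True,
    ),
]
for entry in search(catalog, "Wearables"):
    print(entry.title, "|", entry.access_mode)
```

Keeping the access terms in the same record as the description lets a prospective user see both what exists and what it would take to obtain it.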

To address the user-side challenge, there is a need for a new role, that of data managers or data stewards, who can act as information brokers for aspiring researchers. There should also be a global community of data stewards governed by strict ethical standards, as these are the people who will 'know' a lot about the data and will need to maintain very high levels of integrity.

The availability of data, the ability to process it and the amplification of its utility through collaborative sharing across stakeholders are clearly emerging as competitive differentiators. Specifically in the healthcare industry, the benefits of being part of such a shared economy far outweigh the operational challenges of doing so. In a regulated industry like pharma, such data sharing has to be done within the tight framework of privacy and data protection. There is an urgent need for policy intervention from the government, not only to ensure a more effective democratization of science but also to create an efficient infrastructure that enables smooth and effective sharing of data.
