Introduction

In recent years, both medical research and the legal landscape have been changing as a result of rapid developments in information technology (IT). Medical researchers are collecting, re-using and linking health-related and genomic data on an unprecedented scale, based on the presupposition that this research will significantly improve human health.1, 2 Developments in IT have, however, led to increasing concern about the effectiveness of existing data protection law, and the need for more consistent and comprehensive protection of personal data has been recognised in the European Union (EU).3 The Data Protection Directive 95/46/EC (DPD) is therefore intended to be replaced by the General Data Protection Regulation (GDPR), which will be directly binding in all EU member states. On 12 March 2014, the European Parliament voted in favour of an amended draft GDPR.4 The Council of the EU agreed on a common approach on a revised text of the proposed GDPR on 15 June 2015.5 The final GDPR text depends on the outcome of the three-way negotiations between the Council, the Parliament and the European Commission. The ambition of the EU legislative bodies is to adopt the GDPR at the end of 2015.6 After adoption, the GDPR will come into force following a transition period, which is likely to be 2 years.

The EU data protection law reform has led to an intense debate about its potential effect on medical research. Essentially, the discussion is about where the limits on the use of sensitive personal data in medical research should be drawn. Resolving this matter requires a subtle negotiation of a broad range of relevant (fundamental) rights and interests. Key issues relate to the scope and limitations of consent as a legal basis for the use of sensitive personal data in medical research, and to its possible alternatives. A dominant approach in some EU member states is that the conventional or only alternative to obtaining consent is to anonymise these data. This has been referred to as the ‘consent or anonymise approach’.7, 8 Even so, derogations from this approach can be laid down in data protection law in so-called ‘research exemptions’.9 This regulatory approach will continue to exist in the forthcoming GDPR, subject to changes in emphasis and detail that are still to be determined. Both in the literature and in the medical research community, many have expressed their concern about the consequences of the legislative reform. They indicate that the combination of strict consent requirements and limited research exemptions will severely restrict medical research.10, 11, 12, 13, 14, 15, 16 To contribute to this evolving debate, this paper reviews how the consent or anonymise approach is challenged in a data-intensive medical research context, and discusses possible ways forward within the EU legal framework on data protection.

The context of data-intensive medical research

Increasingly large amounts of complex health-related and genomic data, often referred to as ‘Big Data’, are becoming available to medical researchers.1 Initially, Big Data was defined by certain characteristics of the data, such as their relatively high volume, velocity and variety.17 At present, the term is increasingly used to refer to the technical or analytical methods used to extract information from complex or multiple data sets.1, 18 Big Data sources potentially valuable to medical researchers include electronic health records (EHRs),19 aggregated clinical trial data, administrative health care data,20 and genomic and other -omics data.1, 21 Nowadays, online activities of individuals, for example on mobile phones,22 also allow the continuous collection of health-related and other data.23 Meanwhile, the wide-scale sharing of data is progressively promoted, for example in open access policies.24 Furthermore, it has been pointed out that linkage of multiple data sets at the level of the individual person is needed for Big Data to become transformative.25

Vital to the collection, re-use and linkage of multiple data sources on a large scale are the research infrastructures and networks in and outside the EU. For example, the UK Biobank provides access to medical researchers from all around the world to a wide variety of health-related data and human samples from more than 500 000 participants.26 In Europe, the Biobanking and BioMolecular resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate the re-use of human samples and health-related data available in biobanks scattered across different nations.27 Also, many initiatives exist to promote or facilitate the large-scale re-use and linkage of health-related and genomic data, such as the Global Alliance for Genomics and Health (Global Alliance).28 These developments illustrate how medical research is increasingly becoming a data-intensive activity, in which health-related, genomic and other data are being collected, re-used and linked on a large scale.

EU legal framework on data protection

In the EU, the right to data protection and the right to privacy are formalised by an overlapping but different set of rules. This is because data protection law does not codify the right to privacy as such, but regulates the use of personal data, which are data related to identifiable individuals.29 The right to data protection has recently been recognised as a separate fundamental right in Article 8 of the Charter of Fundamental Rights of the EU (the Charter). Like any fundamental right, the right to data protection is not absolute and needs to be considered in its relation to other (fundamental) rights and interests, including the social rights of access to health care, social security and social assistance in case of illness (Articles 34 and 35 of the Charter), and the fundamental freedom of the sciences (Article 13 of the Charter). To this end, EU data protection law essentially provides a system of checks and balances, consisting of a set of principles and rules. At the heart of the current principal EU data protection law, the DPD, are the principles of fairness and lawfulness. The principle of fairness requires for example that those who process personal data are clear and open with individuals about how their data will be used. The principle of lawfulness demands that each processing of personal data must be based on consent or another legitimate basis laid down by law, as is also enshrined in Article 8 of the Charter.

When it comes to the processing of sensitive personal data, such as health-related data, a more restrictive set of legal bases is provided by EU data protection law. Genetic data will be explicitly recognised as sensitive in the forthcoming GDPR, without granting this type of data a status different from other categories of sensitive personal data.30 At present, the legal basis provided by Article 8 (2a) of the DPD for the processing of sensitive personal data, in any context, is explicit consent. For consent to be valid, it also needs to be specific, freely given and informed (Article 2 (h) DPD). Research exemptions from these consent requirements can be laid down in national law for reasons of ‘substantial public interest’, subject to the provision of ‘suitable safeguards’, according to Article 8 (4) of the DPD. Recital 34, which is related to this article, explicitly mentions that reasons of public interest can relate to areas such as scientific research and public health. It has been indicated, however, that the implementation of research exemptions in national laws varies significantly between EU member states, and consequently hinders international collaboration between researchers.31

Ways forward within the consent or anonymise paradigm

Both the mechanism of consent and its conventional alternative of anonymisation are challenged in a data-intensive medical research context. Much of the debate, as outlined below, focuses on the legal or ethical acceptability of adapting consent or anonymisation mechanisms to overcome these challenges.

Adapting consent

The difficulties in obtaining consent, when personal data are to be available for linkage, re-use and analysis for largely undetermined future research purposes, have been discussed extensively in the literature.32, 33 On the one hand, it is questioned whether meaningful or legally valid (specific, explicit, freely given and informed) consent can be obtained at a one-off event at the time of data collection, as it may not be possible to foresee or comprehend the possible consequences of consenting.9, 34, 35 On the other hand, it is suggested that obtaining specific consent for every linkage or re-use may be overly burdensome or impossible, because this could result in costly and time-consuming procedures, poor recruitment, consent bias, or unwarranted intrusion into the private lives of individuals.36, 37, 38

As a response to the difficulties in obtaining specific consent, adapted models of consent have been put into practice and discussed in the literature. The most common adaptations of consent are models that shift away from specific consent, such as ‘broad consent’, covering a broad range of future data uses.32, 33 There is however an ongoing debate on the legal validity and ethical acceptability of broad consent.34, 39, 40, 41, 42 Some suggest that justifications for broad consent models remain contested in the bioethical literature, and they emphasise that these models are insufficient to ensure meaningful individual control over personal data or human samples.9, 43 Also, it is indicated that, effectively, broad consent is ‘consent for governance’ by certain institutions.41 Others argue that broad consent is an ethically sound alternative for specific consent, although individuals are not given specific information about future research projects.36, 44

The draft GDPR texts appear to reflect the currently conflicting positions of the Parliament and the Council on this topic. Some indicate that broad consent may not meet the conditions on consent as defined in the Parliament’s draft GDPR, regarding the information that must be given to the individual.37, 45 The position of the Council seems to be that broad consent should be possible for medical research.16 This position is reflected in Recital 25aa of the Council’s draft GDPR, which states that ‘data subjects can give their consent to certain areas of scientific research when in keeping with recognised ethical standards for scientific research.’ Moreover, Article 5 (1b) of the Council’s draft GDPR provides a research exemption to the principle of purpose limitation, when appropriate safeguards are in place in accordance with Article 83.

An approach to consent claimed to be potentially consistent with strict or changing legal requirements is ‘dynamic consent’. Essentially, dynamic consent focuses on using IT and engaging individuals as active participants, so that they can be informed and subsequently re-consent can be obtained more easily.46, 47 Critics, however, argue that dynamic consent could for example lead to an information overload for the individual.36 As a response to this critique, it is emphasised that dynamic consent is not a replacement for existing consent models, but rather a tool that could better facilitate the process of obtaining any form of consent.47, 48

Adapting anonymisation mechanisms

A conventional method to protect data and avoid consent or other legal requirements is anonymisation. Yet, there seems to be a broad consensus that it is impossible to guarantee anonymity, especially when health-related data are re-used in different contexts or genomic data are involved.8, 49, 50, 51, 52, 53 Such a guarantee of absolute anonymity is however not required by data protection law. The term anonymisation is defined in current EU legal documents as a technique, which irreversibly prevents identification, taking into account all the means ‘likely reasonably’ to be used.54 According to Recital 23 of the Parliament’s draft GDPR, ‘all the means reasonably likely to be used either by the controller or by any other person to identify or single out the individual directly or indirectly’ should be taken into account in this assessment. In the Council’s draft GDPR text, the phrase ‘single out’ has been removed from this recital.

Yet, it has been indicated that irreversible anonymisation entails extensive stripping of data sets and largely excludes data linkage and updating, while these activities are essential to most large research networks or projects.55, 56, 57 Some therefore argue that lowering the thresholds for anonymisation could better balance relevant interests, by considering two-way coded data as de-identified in data protection law.58, 59 However, a more broadly accepted view of pseudonymisation (single or two-way coding) is that it is a useful security measure.54, 60 In addition, Recital 23a of the Council’s draft GDPR mentions that pseudonymisation can reduce risks, but is not intended to preclude the applicability of data protection law. It should be noted, however, that in practice a clear distinction between pseudonymous and anonymous data cannot always be made.61 Another position is that anonymisation should be avoided in practice,50, 55 not only because anonymisation excludes data linkage or updating, but also because it takes away most legal obligations to protect the data or respect individual rights or interests, while the (hypothetical) risk of re-identification remains.62 In addition, information derived from anonymised data could still affect groups; risks of discrimination or stigmatisation have been described in the literature.33, 63
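The difference between irreversible and reversible coding discussed above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a method described in the cited literature: reading one-way coding as a keyed hash and two-way coding as a lookup table held by a key custodian is our interpretation, and the names `one_way_code` and `TwoWayCoder` are hypothetical.

```python
import hashlib
import hmac
import secrets


def one_way_code(identifier, secret_key):
    """Derive a stable pseudonym with a keyed hash (HMAC-SHA256).

    The same identifier and key always give the same code, so records can
    still be linked, but the code cannot be turned back into the identifier
    by computation alone.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


class TwoWayCoder:
    """Hypothetical key custodian for reversible (two-way) coding.

    Random codes are issued and the code-to-identity table is kept apart
    from the research data; whoever holds the table can re-identify.
    """

    def __init__(self):
        self._id_to_code = {}
        self._code_to_id = {}

    def encode(self, identifier):
        # Reuse the existing code so repeated records remain linkable.
        if identifier not in self._id_to_code:
            code = secrets.token_hex(8)
            self._id_to_code[identifier] = code
            self._code_to_id[code] = identifier
        return self._id_to_code[identifier]

    def decode(self, code):
        # Re-identification is possible only through the custodian's table.
        return self._code_to_id[code]


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    print(one_way_code("patient-001", key) == one_way_code("patient-001", key))

    custodian = TwoWayCoder()
    code = custodian.encode("patient-001")
    print(custodian.decode(code))
```

In both variants the person behind a code remains identifiable to someone (the holder of the secret key or of the table), which is consistent with the position that pseudonymisation reduces risk without taking the data outside data protection law.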

The search for solutions using anonymisation techniques and other innovative methods also continues. An example is to prevent re-identification by ‘taking the analysis to the data, not the data to the analysis’, as facilitated by the initiative called DataSHIELD.64 It is claimed that under DataSHIELD the re-use, linkage and analysis of personal data are enabled in accordance with legislation and guidance in the United Kingdom, primarily because no identifying or sensitive information is returned to the researcher.65, 66, 67 Significant challenges, however, need to be overcome in the implementation of this initiative.64
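The principle of ‘taking the analysis to the data’ can be made concrete with a small Python sketch. It does not reproduce the DataSHIELD software or its interface; the class `Site`, the function `pooled_mean` and the threshold `MIN_CELL_SIZE` are hypothetical names and values chosen for illustration only. Each site computes a summary locally and releases only aggregate figures, which the researcher then combines.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumed disclosure threshold, chosen for illustration only.
MIN_CELL_SIZE = 5


@dataclass(frozen=True)
class Summary:
    """Aggregate figures that leave a site; no individual-level values."""
    n: int
    total: float


class Site:
    """Hypothetical data holder that keeps individual-level data local."""

    def __init__(self, values: Sequence[float]) -> None:
        self._values = list(values)

    def summarise(self) -> Optional[Summary]:
        # Refuse to answer when the group is small enough to risk disclosure.
        if len(self._values) < MIN_CELL_SIZE:
            return None
        return Summary(n=len(self._values), total=sum(self._values))


def pooled_mean(sites: Sequence[Site]) -> float:
    """Combine the site-level summaries into one overall mean."""
    summaries = [s for s in (site.summarise() for site in sites) if s is not None]
    if not summaries:
        raise ValueError("no site returned a releasable summary")
    n = sum(s.n for s in summaries)
    return sum(s.total for s in summaries) / n


if __name__ == "__main__":
    sites = [
        Site([1, 2, 3, 4, 5]),
        Site([10, 20, 30, 40, 50, 60]),
        Site([100, 200]),  # below the threshold, so nothing is released
    ]
    print(pooled_mean(sites))
```

The researcher in this sketch never receives individual records, only counts and sums from sites that pass the threshold. Whether such outputs are non-disclosive in a given setting remains a matter for the legal and governance assessment discussed in the text.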

Ways forward outside the consent or anonymise paradigm

An alternative approach is to search for ways forward outside the consent or anonymise paradigm, by creating another legal basis than consent for the processing of sensitive personal data for medical research purposes. According to Article 81 (2a) of the Parliament’s draft GDPR, such a research exemption from consent should be provided by national law, for ‘research that serves a high public interest’. In contrast, Article 9 (2i) of the Council’s draft GDPR indicates that consent is not required when the processing is necessary for scientific purposes, subject to certain conditions and safeguards laid down in law. Differing positions on the appropriate scope of research exemptions are also reflected in the literature. Some argue that research exemptions should be kept to a minimum by using dynamic consent approaches, taking into account the requirements of necessity and proportionality.68 Others suggest that consent should serve as ‘a default starting point from which departure is possible’ for a particular data usage, when there is evidence of a strong justification in the public interest.6 A more radical view is that providing another legal basis than consent should not be considered as an ‘exemption’, but as an equally acceptable route to achieve protection when data are re-used in large biobanks and data sets.9 Also, some argue to reduce or eliminate the need for consent by focusing on solidarity arguments and harm mitigation.69

An interrelated issue is which appropriate safeguards should be put in place when a research exemption from consent is provided. In Article 81 of the Parliament’s draft GDPR, mandatory pseudonymisation under the highest technical standards is presented as such a safeguard. It is argued though that a strict interpretation of this requirement will possibly render most data useless for epidemiological research.14 According to Article 83 (2) of the Council’s draft GDPR, technological and/or organisational protection measures, such as pseudonymisation, could ensure that the processing of personal data is minimised, in pursuance of the proportionality and necessity principles. In addition, it does provide an escape where these measures would prevent achieving the scientific purpose, and this purpose cannot be fulfilled otherwise with reasonable means. Technological and organisational or governance measures have also been proposed in the literature to justify alternative legal bases to consent, such as opt-out registration,9 authorisation by an ethics committee,8 limiting data access and use, and engaging in public participation.32 To overcome some of the challenges related to implementing governance mechanisms on an international scale, an e-governance system is proposed.70

Discussion

What can we learn from the above? In the debate on how to deal with the challenges to the consent or anonymise approach in the context of data-intensive medical research, within the EU legal framework on data protection, we suggest that the following considerations should be taken into account.

To begin with, we conclude that the search for ways forward within the consent or anonymise paradigm becomes increasingly difficult in a data-intensive medical research context. Although innovative technologies or methods could reduce some of these difficulties, a common position in the reviewed literature is that obtaining meaningful consent or irreversibly anonymising data is impracticable or impossible for a great deal of data-intensive medical research. It may be for these reasons that the necessity of a research exemption, which creates an alternative legal basis to consent, seems to be beyond question in the legal debate. This necessity may increase even further, depending on which definitions of consent and anonymisation are provided by the forthcoming GDPR. These definitions need to be clear, in order to reduce legal uncertainty and prevent the erosion of data protection law.

Next, we recommend that further debate should focus on two issues related to research exemptions in data protection law. First, we do not expect that a high level of harmonisation on the conditions of research exemptions will be provided by the forthcoming GDPR. The draft GDPR texts do provide an overarching EU legal framework on this topic, but leave considerable room for more detailed regulation at the national level. It therefore seems that it will be largely up to the EU member states to determine the appropriate conditions of research exemptions. This will probably again result in a diverse implementation of research exemptions within the EU, which may impede the exchange of sensitive personal data for research across national borders. Initiatives within the medical research community to coordinate the development of harmonised approaches, such as BBMRI-ERIC and the Global Alliance, may therefore remain of vital importance to achieve the goal of international interoperability. Second, we note that there is a lack of consensus on what the conditions of a research exemption from consent should be, while these conditions greatly influence how relevant rights and interests are taken into account in a data-intensive medical research context. We agree with the suggestions in the literature that this act of balancing should include an independent necessity and proportionality test, for instance by a (data access) ethics committee. In addition, we emphasise that proportionate technical and governance measures should be incorporated in the design of data-intensive medical research projects and infrastructures, not only in order to provide a secure data processing environment, but also to allow individuals and the public to access clear information about the use of their data and their rights concerning this usage. Such transparency measures are particularly relevant where technological complexity makes it difficult for individuals to find out which personal data are used, for what purpose and by whom, as indicated in Recital 46 of both draft GDPR texts. We suggest that these measures could include the use of IT and participant interfaces to provide individuals with sufficient information and control over their data, and to stimulate participation by relevant stakeholders. Such a focus on research exemptions with appropriate safeguards should be preferred over continuing the practice of (over)stretching concepts of consent or anonymisation in order to sustain their central role. This may be necessary not only to meet legal requirements, but also to maintain public trust.

Overall, we conclude that research exemptions in data protection law should allow for the creation of a context-specific normative framework, in which the particularities of the use of sensitive personal data in medical research can be taken into account. Further interdisciplinary research is however needed to determine when a shift away from consent as a legal basis is necessary and proportionate in a data-intensive medical research context, and what technological and governance measures should be put in place when such a research exemption from consent is provided.