From the American College of Epidemiology
Ethics, big data and computing in epidemiology and public health

https://doi.org/10.1016/j.annepidem.2017.05.002

Abstract

Purpose

This article reflects on the activities of the Ethics Committee of the American College of Epidemiology (ACE). Members of the Ethics Committee identified an opportunity to elaborate on knowledge gained since the inception of the original Ethics Guidelines published by the ACE Ethics and Standards of Practice Committee in 2000.

Methods

The ACE Ethics Committee presented a symposium session at the 2016 Epidemiology Congress of the Americas in Miami on the evolving complexities of ethics and epidemiology as they pertain to “big data.” This article presents a summary and further discussion of that symposium session.

Results

Three topic areas were presented: the policy implications of big data and computing, the fallacy of “secondary” data sources, and the duty of citizens to contribute to big data. A balanced perspective is needed that provides safeguards for individuals but also furthers research to improve population health. Our in-depth review offers next steps for teaching of ethics and epidemiology, as well as for epidemiological research, public health practice, and health policy.

Conclusions

To address contemporary topics in the area of ethics and epidemiology, the Ethics Committee hosted a symposium session on the timely topic of big data. Technological advancements in clinical medicine and genetic epidemiology research coupled with rapid advancements in data networks, storage, and computation at a lower cost are resulting in the growth of huge data repositories. Big data increases concerns about data integrity; informed consent; protection of individual privacy, confidentiality, and harm; data reidentification; and the reporting of faulty inferences.

Introduction

It has been more than 15 years since the original 2000 American College of Epidemiology (ACE) Ethics Guidelines [1] were published. Since then, specialized fields of epidemiology (e.g., genetic and molecular epidemiology) have emerged, as has awareness that epidemiology is closely interconnected with other fields (e.g., health information technology, global health and noncommunicable diseases). These advances have changed the profession of epidemiology, introducing numerous concepts related to big data and computing. Since the Guidelines' original publication, additional ethical issues in the context of specialized fields of epidemiology have emerged and presented challenges. To address this need, the Ethics Committee hosted a symposium session at the 2016 Epidemiology Congress of the Americas held in Miami, FL, June 21–24, 2016. This article presents a summary and further discussion of that symposium session. The session addressed three topics: (1) the international policy and human rights implications of big data and computing (B.M.K.); (2) the fallacy of “secondary” data sources (L.M.L.); and (3) the benefits, risks, and duties of citizens to contribute to big data (K.W.G.). This article exemplifies the Ethics Committee's ongoing consultative efforts to highlight contemporary topics in the area of ethics and epidemiology relevant to professional epidemiologists. We are targeting a diverse audience of health care researchers including epidemiologists, health care informatics specialists, geneticists, health care providers, policymakers, and others who work with, or are interested in working with, big data.

Section snippets

Big data: potential and challenges

The current technological landscape permits the digitization and storage of an unprecedented amount of data from many sources, including smart phones, text messages, credit card purchases, online activity, electronic medical records, and global positioning system data. Many of these data sources contain personal information both related and unrelated to health, including, for example, geographic location, health or social security number, and credit card number. Various forms of health information

Evolving epidemiology data sources

Historically, we have divided the sources of health information into two categories: primary and secondary. “Primary data” refer to data collected for a specific research question using an instrument (e.g., a survey or laboratory test) designed or chosen to optimize validity. “Secondary data” refer to existing data that were collected for a purpose other than the specific research question at hand. Secondary data might come from routine public health surveillance, population-based health

Evolving access and regulatory landscape

The motivation for the use of big data includes the efficiencies gleaned from creating “economies of scale.” First, data are rapidly generated from genetic, medical, socioeconomic, social media, and geospatial sources; disease and other types of registries; primary care and community clinics; and from data sources that include air pollution, climate, and contaminated soils and water. Second, these data can be stored in internet-based centers, which would allow government agencies the

Ethical principles revisited

The ethical dimensions of big data and population health research are not unlike the common ethical principles in epidemiology research and practice. Whatever our data source, we must uphold the ethical principles that reflect what we value: minimizing harms while maximizing benefits; ensuring just distribution of burdens and benefits; respecting individual autonomy through informed consent, privacy, and confidentiality; building trust; and maintaining scientific rigor [1]. To honor these ethical

Societal contributions to big data

Epidemiology has always been information-intensive. Indeed, its very existence depends on the collection and analysis of data and information. Much of that data and information relates, pertains, or is somehow linked to individual people, their families, or their communities. Given the goals and successes of epidemiology and other population health sciences and the sustainability and quality of health care systems, one should infer that the collection and analysis of data and information is

Ethical framework to address big data

Traditional approaches to the issues of big data have relied on ethical principles requiring protection from presumed harm from biomedical research. A new and more positive approach to address the challenges of big data emphasizes human rights. Indeed, this is the basis for the Framework for Responsible Sharing of Genomic and Health-Related Data established by the Global Alliance for Genomics and Health (GA4GH; https://genomicsandhealth.org/). At the core of the Framework is the understanding

Conclusion

The purpose of this article is to issue a “call to action” in the area of ethics, big data, and computing in epidemiology and public health. It also aims to provide readers with information about the activities of the College and to give readers of the Annals of Epidemiology a broad perspective on a recent major epidemiological issue. Our intentions for this article and subsequent ethics and epidemiology publications as part of the work of the Ethics Committee are for these resources to be nimble, accessible,

Next steps

Our plenary session dealt with ethics, big data, and computing in epidemiology and public health from the perspective of our speakers, whose research and teaching expertise spans ethics, epidemiology, genomics, health policy, legal dimensions, bioethics, and bioinformatics. Additional perspectives may not have been adequately addressed in the plenary session, including but not limited to the ethical considerations surrounding study methods/design, data collection/analysis, standards
