
The effects of interactive training of healthcare providers on the management of life‐threatening emergencies in hospital


Background

Preparing healthcare providers to manage relatively rare life‐threatening emergency situations effectively is a challenge. Training sessions enable staff to rehearse for these events and are recommended by several reports and guidelines. In this review we have focused on interactive training: any training that is not solely didactic but provides opportunities for discussion, rehearsal, or interaction with faculty or technology. It is important to understand the effective methods and essential elements for successful emergency training so that resources can be appropriately targeted to improve outcomes.

Objectives

To assess the effects of interactive training of healthcare providers on the management of life‐threatening emergencies in hospital on patient outcomes, clinical care practices, or organisational practices, and to identify essential components of effective interactive emergency training programmes.

Search methods

We searched CENTRAL, MEDLINE, Embase, CINAHL, and ERIC, as well as two trials registers, up to 11 March 2019. We also searched the references of included studies and conference proceedings, and contacted study authors.

Selection criteria

We included randomised trials and cluster‐randomised trials comparing interactive training for emergency situations with standard/no training. We defined emergency situations as those in which immediate lifesaving action is required, for example cardiac arrests and major haemorrhage. We included all studies where healthcare workers involved in providing direct clinical care were participants. We excluded studies outside of a hospital setting or where the intervention was not targeted at practising healthcare workers. We included trials irrespective of publication status, date, and language.

Data collection and analysis

We used standard methodological procedures expected by Cochrane and the Cochrane Effective Practice and Organisation of Care (EPOC) Group. Two review authors independently extracted data and assessed the risk of bias of each included trial. Due to the small number of studies and the heterogeneity in outcome measures, we were unable to perform the planned meta‐analysis. We provide a structured synthesis for the following outcomes: survival to hospital discharge, morbidity rate, protocol or guideline adherence, patient outcomes, clinical practice outcomes, and organisation‐of‐care outcomes. We used the GRADE approach to rate the certainty of the evidence and the strength of recommendations for each outcome.

Main results

We included 11 studies that reported on 2000 healthcare providers and over 300,000 patients; one study did not report the number of participants. Seven were cluster‐randomised trials and four were single‐centre studies. Four studies focused on obstetric training, three on obstetric and neonatal care, two on neonatal training, one on trauma, and one on general resuscitation. The studies were spread across high‐, middle‐ and low‐income settings.

Interactive training may make little or no difference in survival to hospital discharge for patients requiring resuscitation (1 study; 30 participants; 98 events; low‐certainty evidence). We are uncertain if emergency training changes morbidity rate, as the certainty of the evidence is very low (3 studies; 1778 participants; 57,193 patients, when reported). We are uncertain if training alters healthcare providers' adherence to clinical protocols or guidelines, as the certainty of the evidence is very low (3 studies; 156 participants; 558 patients). We are uncertain if there were improvements in patient outcomes following interactive training for emergency situations, as we assessed the evidence as very low‐certainty (5 studies; 951 participants; 314,055 patients). We are uncertain if training for emergency situations improves clinical practice outcomes, as the certainty of the evidence is very low (4 studies; 1417 participants; 28,676 patients, when reported). Two studies reported organisation‐of‐care outcomes; we are uncertain if interactive emergency training has any effect on this outcome, as the certainty of the evidence is very low (634 participants; 179,400 patient population).

We examined prespecified subgroups and found no clear commonalities in effect of multidisciplinary training, location of training, duration of the course, or duration of follow‐up. We also examined areas arising from the studies including focus of training, proportion of staff trained, leadership of intervention, and incentive/trigger to participate, and again identified no clear mediating factors. The sources of funding for the studies were governmental, local organisations, or philanthropic donors.

Authors' conclusions

We are uncertain if there are any benefits of interactive training of healthcare providers on the management of life‐threatening emergencies in hospital, as the certainty of the evidence is very low. We were unable to identify any factors that would point to an essential element of these interactive training courses.

We found a lack of consistent reporting, which contributed to our inability to meta‐analyse across specialities. More trials are required to build the evidence base for the optimum way to prepare healthcare providers for rare life‐threatening emergency events. These trials need to be conducted with attention to outcomes important to patients, healthcare providers, and policymakers. It is vitally important to develop high‐quality studies that are adequately powered and designed to minimise the risk of bias.


The effects of interactive training of healthcare providers on the management of life‐threatening emergencies in hospital

What is the aim of this review?

We aimed to find out if healthcare workers who work in hospitals and receive training in which they can interact with learning materials and other workers provide better healthcare during emergency situations.

Key messages

We are unsure whether interactive training for emergency situations improves healthcare, as there were conflicting results between studies and problems with the methods the trials used, which could lead to misleading results.

What was studied in this review?

Hospital‐based healthcare workers need to be well prepared to react expertly to emergency situations that threaten people's lives. There are many training courses for this, some of which allow healthcare workers to interact with learning materials and other workers. However, we do not know if these training courses prepare healthcare workers to provide better healthcare.

We searched for studies that assessed the effectiveness of interactive training compared to usual training or no training. We looked only at the type of study thought to be the strongest form of evidence, that is randomised trials (where participants could be assigned to either the training group or no/standard‐training group by chance). We looked for any effects on patient outcomes (e.g. survival or length of hospital stay), any effects on staff (e.g. improved skills in an actual clinical situation), or changes within the organisation (e.g. reorganisation of working patterns). We did not look at changes in a simulated environment.

What are the main results of this review?

We found 11 studies that were relevant to this review. Nine of these focused on maternal and newborn health. Because there were so few studies and they all examined different effects of emergency training, we were unable to combine the results.

All of the trials included weaknesses in their design that could have led to inaccurate results. The certainty of the evidence for our important outcomes focusing on changes to patient care and outcomes was very low; therefore, based on the available evidence, we are uncertain whether training healthcare workers in the management of life‐threatening emergency situations made a difference to patients or organisations. The studies were funded by governments, local hospitals, or charities.

How up‐to‐date is this review?

We looked at all of the studies examining this area up until March 2019.

Authors' conclusions

Implications for practice

Logically, it seems important to train staff for in‐hospital‐based emergencies. However, due to the heterogeneity of outcomes within this review, it was not possible to provide firm conclusions as to whether interactive training works. Having said this, the structured synthesis of the evidence showed that most of the studies included in this review reported improvements in patient, staff, or organisational outcomes. The certainty of the evidence for these results is very low.

The evidence for what type of training works and what the important elements of training are is unclear. However, we did find that the effects of interactive training were not universal in any given study.

Implications for research

Whilst a wealth of studies has been carried out on emergency training, there are few well‐conducted randomised trials. Implementing training, even when it is local and low‐cost, represents a significant investment for a healthcare facility (Yau 2016); therefore, in the resource‐stretched environment of health care, it is vital that high‐quality studies are undertaken to identify whether interventions are effective or not. This is especially important as we have seen that not all training is effective, and in some cases there can be a negative impact.

Furthermore, it is important going forward that studies are carefully designed to answer important clinical questions with outcomes that patients, staff, and policymakers value. For example, it is imperative that cost‐effectiveness of interventions be considered, due to the high cost of implementing training. In terms of patient outcome measures, although interim measurements of knowledge and performance in simulated environments are useful markers of the success of an intervention, it is vital that there is a shift towards measuring the harder‐to‐measure outcomes, such as clinical and organisational practice change, as the primary outcomes of studies. Powering studies to these outcomes will ensure that there is a clearer understanding of exactly how effective the interventions are, and which ones are most worthwhile investing in to improve patient care. However, we acknowledge this requires significant investment.

It has become evident that a wealth of outcome measures are reported within each of these areas. When considering individual clinical areas (e.g. paediatrics, anaesthetics), there are still very different measures used. To enable both meta‐analysis and comparison between interventions, it is important that all studies should report a core outcome set.

Whilst these studies do not necessarily need to be randomised trials, it is important that they are well conducted and answer valuable questions.

Another point raised by this review is length of follow‐up. If, as we recommend, clinically important outcomes, rather than the more common intermediary measures, are the primary outcomes of these studies, either large sample sizes or long time periods will be necessary. Furthermore, the important issue of deterioration in effect over time needs to be addressed. As we have discussed, the idea of repeated training is important (Bluestone 2013), and as such it will be important to record the changes over time in the effectiveness of the interventions and have a sufficient follow‐up period to be able to measure the impact of time. We suggest that under one year is likely to be an insufficient length of follow‐up.

There has also been a significant concentration in studies on obstetric and neonatal emergencies, with much less focus on the other medical and surgical specialities. It is important that there is more concentration outside of maternal and neonatal health in order to ensure that patient care in all settings is enhanced.

Summary of findings

Summary of findings for the main comparison. Interactive training for in‐hospital‐based healthcare providers on the management of life‐threatening emergencies: effects on clinical practice and patient outcomes

The effects of interactive training of healthcare providers on the management of life‐threatening emergencies in hospital

Participants: Healthcare workers delivering life‐saving emergency care in a hospital setting (obstetric/labour and delivery staff, physicians, skilled birth attendants, midwives, midlevel surgical trainees, anaesthesiologists, nurses, internal medical residents)
Population: Patients who suffer life‐threatening emergencies in hospital: women around the time of birth, neonates, trauma patients, and adults undergoing resuscitation
Setting: All hospital settings are included. The evidence for this review is drawn from the Netherlands, Denmark, the USA, China, Pakistan, Kenya, Mexico, and Ghana.
Intervention: Interactive training, i.e. any training including a component in which participants are not just passive recipients of the training
Comparison: Standard training delivered at the facilities, no training, or an element of the intervention (e.g. a new training session) but only the didactic component

Outcomes (number of studies) | No. participants/no. in the population studied | Certainty of the evidence (GRADE) | Impact and selected results

Survival to hospital discharge (1 study) | 30 participants; 98 events (cardiac arrests) observed | ⊕⊕⊝⊝ Low 1 | Interactive emergency training strategies may make little or no difference in survival to hospital discharge.

Morbidity rate (3 studies) | 1778 participants; 57,193 in the population studied 2 | ⊕⊝⊝⊝ Very low 3 | It is uncertain whether interactive training leads to change in morbidity rates.

Protocol or guideline adherence (3 studies) | 156 participants; 558 in the population studied | ⊕⊝⊝⊝ Very low 4 | It is uncertain whether interactive training leads to change in protocol or guideline adherence.

Patient outcomes (5 studies) | 951 participants; 314,055 in the patient population | ⊕⊝⊝⊝ Very low 5 | It is uncertain whether interactive training leads to change in patient outcomes.

Clinical practice outcomes (4 studies) | 1417 participants; 28,676 in the population (patients and staff) 2 | ⊕⊝⊝⊝ Very low 6 | It is uncertain whether interactive training leads to changes in clinical practice outcomes.

Organisation of care (2 studies) | 634 participants; 179,400 in the patient population | ⊕⊝⊝⊝ Very low 7 | It is uncertain whether interactive training leads to change in organisation‐of‐care measures.

GRADE Working Group grades of evidence
High certainty: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: We are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: Our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
Very low certainty: We have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.

1We downgraded the certainty of the evidence to low due to high risk of bias and imprecision.
2One study, Riley 2011, did not report numbers for participants or population.
3We downgraded the certainty of evidence to very low due to high risk of bias, inconsistency and imprecision.
4We downgraded the certainty of evidence to very low due to high risk of bias, inconsistency of findings and the small number of participants.
5We downgraded the certainty of evidence to very low due to high risk of bias, inconsistent results and small sample sizes.
6We downgraded the certainty of evidence to very low due to risk of bias, inconsistency in results, and the sample size being small or unclear in some studies.
7We downgraded the certainty of evidence to very low due to high risk of bias and inconsistency between studies.

Background

Healthcare professionals strive to provide safe and effective clinical care, but suboptimal emergency care is a frequently identified factor in adverse outcomes for patients with acute conditions. A number of reports and guidelines have identified training in emergencies, in particular, as key to improving outcomes for patients (IOM 2000; ERC 2010; CMACE 2011; Soar 2015).

Training is a logical way for staff to develop their skills to respond effectively to relatively rare emergency situations. However, despite more than a decade of research, little evidence exists for the impact of this training on clinical outcomes. The best way to equip staff with the myriad skills they require to deal effectively with stressful live clinical situations remains unclear (Calvert 2013).

There is an increasing recognition that there needs to be training for both technical skills and human factors in the form of situational awareness and teamwork training (Shapiro 2004; Calvert 2013). In order to achieve these goals, there are a huge number of different, often expensive, training courses available to health professionals, many of which are interactive. However, the way this emergency training is implemented is not uniform (Anderson 2005). This lack of uniformity is further compounded by the availability of adequately trained staff to deliver the training in different locations (Anderson 2005; Calvert 2013).

The effectiveness and limitations of different models of training for these emergency situations remains unclear. This uncertainty is due in part to the heterogeneity of training models that are implemented and studied. In addition, there is wide variation in how these training models are evaluated and reported. Currently no standardised evaluation tool exists, and many of the published outcomes are based on self‐reporting or subjective assessment by observers. Many studies do not assess clinical outcomes.

Identifying the most effective methods and essential elements for successful interactive emergency training will provide a useful guide for those designing, implementing, and evaluating training. The utilisation of this knowledge will ensure that healthcare providers are given the best opportunity to gain the skills they need to provide the best possible emergency care to their patients.

Description of the condition

Training of healthcare professionals to effectively manage emergency situations presents different challenges to training staff to provide routine care, in part due to the rarity of cases (Smith 2013).

Emergency situations differ between specialities, but all are defined as "serious, unexpected, and often dangerous situations requiring immediate action" (OED 2014). For the purposes of this review, an emergency situation will be one in which immediate lifesaving action is required. Examples include cardiac or respiratory arrest, failed intubation, major haemorrhage, shoulder dystocia during childbirth, severe sepsis, and tension pneumothorax. These situations can arise either in emergency settings, for example in the emergency department, or in elective settings where staff have to respond to a patient's evolving condition, for example a failed intubation in theatre.

Training for emergencies is different to that for routine care. This is because for routine care, whether the training is interactive or didactic, it can be backed up by 'on the job' reinforcement. The ability to spend time refining skills outside a high‐pressure environment means that a training programme does not have to perform the function of fully preparing staff for a new situation. However, for emergency situations, it is crucial that professionals work efficiently, both individually and as a team, even if it is the first time they have encountered the clinical situation or worked together. This requirement for comprehensive preparation has led to the development of training interventions to address the clinical and human factors in the emergency response.

Description of the intervention

This review examined interactive training interventions preparing healthcare professionals for emergency situations. We considered training for interventions performed within hospitals, as part of the clinical role of staff. We considered hospitals to be any facility‐based care setting that provides comprehensive secondary or tertiary clinical care, which included care delivered as a first point of contact in the emergency department.

In this review we concentrated on hospital‐based emergencies as a subset of all emergency care. There are other settings in which staff are trained to respond to emergencies, either in office‐based care settings or in the community. However, these settings are very different to the hospital environment and present different challenges. Within hospital settings it is usually possible to call upon a broader team of people and specialists to appropriately respond to and comprehensively manage an emergency. The focus in the community or primary care setting may be on the immediate management and transfer to an appropriate facility. Because of these differing priorities, the interventions and measures of effectiveness are likely to be different, therefore it was important to consider these areas separately.

This review focused on interactive training, that is any form of educational session that has an interactive component. Interactive training courses can have many different formats: courses could have, for example, pre‐course e‐learning components, case‐study discussions, or skills‐drills. There must be a component of attendees interacting with the course/faculty and not only passively absorbing information. This presents a challenge when attempting to define or subcategorise interactive training. We defined interactive training by using Freeth's model (Freeth 2005; Hammick 2010):

  • exchange‐based learning (e.g. debates, seminar or workshop discussions, case and problem‐solving study sessions);

  • observation‐based learning (e.g. work shadowing, joint client/patient consultations);

  • action‐based learning (e.g. collaborative enquiry, problem‐based learning, joint research, quality improvement initiatives, practice or community development projects); and

  • simulation‐based learning (e.g. role‐play, experiential group work, the use of clinical skills centres, and integrating drama groups within teaching sessions).

In addition to the different types of interactive training, other elements within training programmes can vary considerably. Courses may be administered locally, regionally, or nationally. Some high‐profile courses conform to strict regulations in terms of content, delivery, and assessment (ALS 2014), whilst others may be arranged to suit local needs without national accreditation. Some courses contain an element of assessment (ATLS 2015), whilst others are attendance based (PROMPT 2012). Courses may be multidisciplinary in faculty and attendees (CAT 2015), whilst others are run by and for only one profession (TEAM 2015). Courses vary in duration from half a day to several days. The speed of deterioration in knowledge and skills of participants and therefore how regularly training is required must also be considered by course conveners (Crofts 2007; Yang 2012). To maintain the course qualification, some courses need to be repeated every four years (ATLS 2015), whilst others are annual (PROMPT 2012).

How the intervention might work

Interactive emergency training sessions enable healthcare professionals to familiarise themselves with required skills in a controlled environment. By having a pre‐rehearsed systematic approach to an emergency, staff may then feel more able to concentrate on the current clinical situation rather than panicking about how to approach the emergency. It is this element of rehearsal and planning for emergencies that the interactive elements of the various types of training provide that could be the key to ensuring an appropriate emergency response by each individual and the team as a whole. If a systematic, evidence‐based approach towards each in‐hospital emergency could be adopted, improved outcomes for patients could result.

Why it is important to do this review

Previous reviews have focused on single aspects of training: modality or speciality (Siassakos 2009; Cook 2011; Lockey 2018). However, this review is broad in scope for three reasons. Firstly, there is a paucity of high‐quality randomised studies investigating emergency training, so the number of studies to be examined will be increased with a cross‐speciality review. Secondly, similar methods of training are applied across a range of emergencies, for example life support courses use similar methods to teach and assess candidates. Finally, although there are differences between training programmes, key essential elements to ensure successful emergency training may be clearly illuminated by examining programmes across specialities.

This review considered all interactive training interventions, both medical and surgical, to identify essential components for effective training common to all situations. It focused on patient and organisational outcomes, rather than on acquisition of knowledge or user rating of training.

A huge number of training courses have been developed worldwide to provide healthcare workers with the skills they require to deal with emergencies. However, as was identified over a decade ago, these courses are often poorly described and even more infrequently studied (Black 2003). We have seen some positive patient outcomes from evaluations that have been carried out (Draycott 2006; Shoushtarian 2014). However, we have also begun to understand that training is not always effective, and in fact on occasion has been shown to coincide with worsening patient outcomes (MacKenzie 2007). If training programmes are evaluated as harmful, they should be quickly modified or abandoned. Training programmes are expensive to run (Yau 2016), therefore it is essential that resources are channelled to increase the effectiveness of staff training and to maximise positive outcomes for patients.

The focus of this review was on changes in staff practice and patient outcomes rather than surrogate outcome measures of change demonstrated by training programmes. An example of a surrogate measure may include change in performance in 'mock code' scenarios (Donoghue 2009). Although these measures do provide a useful way to measure behavioural change as a direct result of the course, they do not represent how these skills translate into actual clinical practice in emergency settings.

By focusing on actual behaviour change and patient outcomes in emergency situations, this review provided an opportunity to identify the essential components of effective emergency training. If this can be achieved, then the factors that are required to deliver the best possible training can be incorporated into emergency training courses to facilitate improvement in patient and organisational outcomes across specialities.

Objectives

To assess the effects of interactive training of healthcare providers on the management of life‐threatening emergencies in hospital on patient outcomes, clinical care practices or organisational practices, and to identify essential components of effective interactive emergency training programmes.

Methods

Criteria for considering studies for this review

Types of studies

We included randomised trials and cluster‐randomised trials investigating training interventions where there was the comparison of interactive training and no or standard training.

Types of participants

We considered healthcare professionals working within a hospital environment with the potential for life‐threatening, time‐pressured emergencies in which treatments require rapid physical interventions. We included studies conducted in public or private settings and in low‐, middle‐, or high‐income countries. The healthcare worker could be at any stage of their professional career. We excluded studies primarily investigating undergraduate/pre‐service healthcare students.

We considered the following specialties.

  • Emergency medicine

  • Obstetrics and gynaecology

  • Anaesthetics

  • Intensive care medicine

  • Paediatrics, including neonatology

  • All medical specialities

  • All surgical specialities

We excluded the following specialties, as they do not have life‐threatening emergencies to which their own staff would have to respond specifically. Life‐threatening emergencies in these specialities would typically be managed by a different clinical speciality, for example the medical emergency team if a patient in these settings were to have a respiratory arrest.

  • Ophthalmology

  • Radiology

  • Psychiatry

Types of interventions

We considered all types of interactive educational intervention with the primary aim of improving the performance of healthcare staff responding to life‐threatening emergencies in hospitals. We selected this broad definition to bring together the evidence of effectiveness for the variety of training opportunities offered to staff, and to allow comparison between different lengths and intensities of intervention. For the purposes of this review, we considered interactive training to be any type of educational intervention with an interactive component, as categorised by Freeth (Freeth 2002).

The training course could lead to a recognised qualification, for example an ‘Advanced Life Support provider’ certificate; however, it could not form part of a primary qualification for health professionals, such as their primary medical or nursing degree.

The intervention could be delivered by a single methodology or by a combination of methods, for example online tutorials, lectures, and workshops. These interventions could take place individually or in groups. The intervention could involve the training of a single professional group or a multiprofessional team. The intervention could be of any duration and frequency and could occur in any setting (e.g. within the clinical department, local simulation room, or regional, national, or international training centre).

Types of outcome measures

We used Kirkpatrick's model of educational outcomes, as modified and used by Freeth, to develop a categorisation scheme for outcomes (Freeth 2002). We only considered studies that examined level 3 (behavioural change) and level 4 (practice and patient outcomes) in this review. We did not include level 1 (participant reaction) and level 2 (acquisition of knowledge and skills) outcomes because, despite the usefulness and wide use of the Kirkpatrick model, there remains a lack of evidence for a clear causal chain between levels 1 and 4 (Bates 2004); therefore the use of level 1 and 2 outcomes as surrogates for level 3 and 4 outcomes cannot be assumed. In addition, because we were interested in identifying effects of training programmes on outcomes measured during or related to emergency clinical care, we excluded the level 2 surrogate outcomes of knowledge and skills measured on simulators or actual patients in training and non‐emergency settings.

Patient outcomes included mortality and severe morbidity. In order to demonstrate changes in the management of the relatively rare events leading to these outcomes, studies would be required to have extremely large sample sizes. In response to this, proxy measures of patient outcome are often used in smaller‐scale studies, and included in larger studies. These include the quality of clinical care provided or changes in organisational practice, which may be assessed by measuring adherence to guidelines, clinical errors, appropriate escalation to senior colleagues, and number of staff sick days.

The outcome measures addressed by individual studies are varied. We therefore developed a framework, based on the Kirkpatrick model, with which to present the outcome measures for this review. For clarity, we have added examples of outcomes that studies may consider.

Primary outcomes

  • Survival to hospital discharge

  • Morbidity rate (e.g. incidence of hypoxic ischaemic encephalopathy in neonates, incidence of sepsis, incidence of residual neurological symptoms) or patient deterioration (e.g. number of cardiopulmonary arrests, requirement for care escalation to a higher dependency setting, Glasgow Coma Scale, deterioration in vital signs) specific to each speciality

  • Protocol or guideline adherence (as assessed by observation or review of records, e.g. perimortem caesarean delivery during management of maternal cardiopulmonary resuscitation, time to first defibrillation in cardiopulmonary arrest)

Secondary outcomes
Patient outcomes

  • Length of stay

  • Patient‐reported outcome measures (including complaints and patient satisfaction scales)

  • Mortality

Clinical practice outcomes

  • Skills during emergency situations (e.g. structured observed assessment of intubation procedure, observation of teamwork skills)

  • Clinical endpoint of emergency situation (e.g. success of intubation, correct emergency ultrasound diagnosis)

  • Appropriate escalation of care to seniors or different specialities

  • Staff attitude (e.g. safety climate, teamwork, satisfaction, level of institutional support)

  • Clinical errors (e.g. incorrect drug dosage)

Organisation‐of‐care outcomes

  • Implementation of new systems (e.g. emergency boxes, treatment algorithms or proformas for reference during the emergency, one central emergency number to call)

  • Development of local guidelines

  • Institutional support (e.g. staff opinion, financial commitment)

  • Staffing levels (e.g. workload rating, sick leave, turnover of staff)

Search methods for identification of studies

Electronic searches

We designed a sensitive search strategy to retrieve studies from electronic bibliographic databases. We searched the following databases on 11 March 2019.

  • Cochrane Library via Wiley including the Cochrane Central Register of Controlled Trials (CENTRAL; 2019, Issue 3 of 12)

  • MEDLINE via Ovid (1946 to 11 March 2019)

  • Embase via Ovid (1947 to 11 March 2019)

  • CINAHL via EBSCO (Cumulative Index to Nursing and Allied Health Literature) (1980 to 11 March 2019)

  • ERIC via ProQuest (1980 to 11 March 2019)

We also searched the following trial registries on 11 March 2019.

  • World Health Organization International Clinical Trials Registry Platform (WHO ICTRP) (www.who.int/ictrp/en/)

  • US National Institutes of Health Ongoing Trials Register ClinicalTrials.gov (clinicaltrials.gov/)

We used the sensitivity and precision‐maximising filter for retrieving randomised trials from MEDLINE and Embase as recommended in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011), which we adapted for the other databases.

We did not apply any language restrictions. We devised the search strategy for the Ovid MEDLINE interface and then adapted it for the other databases. The search strategies are provided in Appendix 1.

Searching other resources

We scanned the reference lists of included studies and any relevant systematic reviews identified. We consulted relevant individuals and organisations for information about unpublished or ongoing studies. We also scanned abstracts from relevant conferences including the AMEE: An International Association for Medical Education and International Conference on Resident Education.

Data collection and analysis

Selection of studies

Two review authors (of AM, JF, and KB) independently screened all titles and abstracts for eligibility. We retrieved the full‐text articles for all studies deemed by any review author to be potentially eligible. Two review authors (of AM, JF, and KB) assessed the full‐text articles against the inclusion criteria. Any disagreements between the two review authors were resolved by discussion with the review team.

We kept a record of eligibility assessment for each full‐text article and presented key excluded studies in the 'Characteristics of excluded studies' table.

We documented the entire process for the selection of studies using a PRISMA flow chart to demonstrate the initial number of records, records after de‐duplication, studies excluded at title and abstract screening stage, and finally the total numbers of excluded and included studies (Moher 2009).

Data extraction and management

Two review authors (of AM, JF, and KB) independently extracted data from each study onto a data collection form based upon the Cochrane Effective Practice and Organisation of Care (EPOC) Group data collection checklist (EPOC 2013). Review authors (of AM, JF, and KB) piloted the form and ensured that it was fit for purpose and that there was consistency of approach. We refined the form as we progressed in the data extraction process by adding further fields or categories to the existing fields.

We attempted to contact the original study authors if information in the article text or in an abstract was insufficient. If we identified multiple publications from one study, we treated the study as a single entity and extracted findings across all publications onto one form.

Assessment of risk of bias in included studies

We used the EPOC 'Risk of bias' tool to assess the risk of bias (EPOC 2015). The areas of bias addressed by the tool cover the domains outlined in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011).

Two review authors (of AM, JF, and KB) independently assessed the risk of bias of each included study, and assessment was compared and reconciled, if necessary, with the help of an arbitrator. We categorised each study as having low, high, or unclear risk of bias using the EPOC 'Risk of bias' tool (EPOC 2015). Any disagreements were resolved by discussion or by consulting the senior review author.

Measures of treatment effect

From each study we collected the outcomes relevant to this review, regardless of whether they were the primary outcome for each individual study or not. We extracted the effect estimate and confidence intervals of the intervention from the data provided in the publication.

We were unable to perform a meta‐analysis due to the heterogeneity of outcomes reported in the included studies. We presented a structured synthesis of the results as reported by the authors.

Unit of analysis issues

We were unable to perform any meta‐analysis and therefore did not experience any unit of analysis issues.

Dealing with missing data

We recorded if data were missing on the data extraction forms and then contacted the authors for further information. We also considered this information when judging the risk of bias of included studies.

Assessment of heterogeneity

Due to the nature of this review, we expected significant statistical heterogeneity between studies. In addition, it was difficult to anticipate a priori the sources of heterogeneity. We therefore extracted all important sources of heterogeneity in the data abstraction form, which included methodological and contextual aspects of the included studies.

Assessment of reporting biases

There were too few studies to construct a funnel plot for any outcome; therefore we were unable to perform an analysis for publication bias (Higgins 2011).

For studies where a protocol had been published, we compared the predefined outcome measures with those that were reported. For studies with no protocol, we examined the outcomes discussed in the methods section of the publication and compared these to the results. This is reflected in the 'Risk of bias' assessment.

Data synthesis

Different outcome measures and different methods of measuring outcomes were used in the studies included in this review. We were unable to combine studies in a meta‐analysis. We have therefore presented the findings as a structured synthesis (Higgins 2011).

'Summary of findings' table and assessing the certainty of the evidence

We used the five GRADE considerations (risk of bias, inconsistency, imprecision, indirectness, and publication bias) to make judgements about the certainty of the available evidence for each main outcome (Guyatt 2011). Two review authors (of AM, JF, and KB) independently carried out this assessment, resolving any disagreements through discussion with a third review author. We presented the information in 'Summary of findings' tables along with describing key information pertaining to the findings for each outcome including comparative risks, risk ratio, and the number of participants (Higgins 2011). We justified all decisions to downgrade the certainty of the evidence in relation to each outcome using footnotes (EPOC 2017).

The 'Summary of findings' tables present evidence for the three primary outcomes (survival to hospital discharge, morbidity rate and protocol or guideline adherence) and three secondary outcomes (patient outcomes, clinical practice outcomes and organisation of care) . We used GRADEpro GDT software to generate the 'Summary of findings' tables (GRADEproGDT 2015).

Subgroup analysis and investigation of heterogeneity

We were unable to investigate statistical heterogeneity because it was not possible to meta‐analyse the studies.

As described in the Types of outcome measures section, we classified the outcomes as patient, clinical practice, or organisation of care. We were unable to perform subgroup analyses due to the inability to undertake a meta‐analysis. However, we approached the review with the possible subgroups as a structure with which to consider the data. These included the following.

  • Clinical speciality, because different specialities may have different approaches to training or emergencies that are more amenable to short training interventions than others, e.g. shoulder dystocia training versus advanced neonatal resuscitation.

  • Composition of the participant group (multiprofessional or single profession), as this would enable an assessment of whether training in multiprofessional or single‐professional groups delivers improved outcomes. It would also allow a determination of the equity of training interventions between staff groups.

  • The frequency of the intervention, e.g. one‐off, monthly, annually, as this would allow consideration of whether it is important to have frequent repetitive training or whether one‐off training is sufficient.

  • Length of training, as this would allow an understanding of whether training interventions need to be long (e.g. one week) or if short interventions (e.g. one hour) can have an impact on patient care.

  • Local or off‐site training to understand whether training location matters.

  • Public or private institution where training occurs to allow consideration of the impact of the setting of the intervention.

  • Study design, study quality, degree of adjustment, geographical location to allow an understanding of the impact of the method of investigation on the outcomes.

  • Interventions that rely on the actions of a single provider versus a team of providers.

  • Outcome types: patient outcomes, clinical practice outcomes, and organisation‐of‐care outcomes.

  • Time period, as there may be time trends that increase safety culture.

  • Type of health system, e.g. public or private system.

  • Other relevant clinical/training/specialty characteristics identified during the data extraction.

Sensitivity analysis

We were unable to perform a sensitivity analysis.

Results

Description of studies

Results of the search

We identified 3261 references from electronic database searching and handsearching of reference lists after de‐duplication. Full‐text screening of 75 records resulted in 11 studies being included in the review (Characteristics of included studies). Three studies were ongoing (Characteristics of ongoing studies). The PRISMA flow diagram is shown in Figure 1.


Study flow diagram.


Included studies

We identified 11 randomised studies for inclusion in this review. Four focused exclusively on obstetric training (Nielsen 2007; Riley 2011; Sorensen 2015; Fransen 2017), three on obstetric and neonatal training (Nisar 2011; Walker 2014; Gomez 2018), two exclusively on neonatal training (Opiyo 2008; Xu 2014), one on trauma (Knudson 2008), and one on general adult resuscitation (Weidman 2010). There were approximately 2000 healthcare workers randomised to different forms of training in these studies. Outcome data were collected on over 300,000 patients.

Study design and setting

Seven of the studies were cluster‐randomised trials (Nielsen 2007; Nisar 2011; Riley 2011; Walker 2014; Xu 2014; Fransen 2017; Gomez 2018), whilst four were single‐centre studies (Knudson 2008; Opiyo 2008; Weidman 2010; Sorensen 2015).

Regarding the cluster‐randomised trials, all but one study, Nisar 2011, focused solely on obstetrics and/or neonatology (Nielsen 2007; Riley 2011; Walker 2014; Xu 2014; Fransen 2017; Gomez 2018). The study that did not focus on emergency obstetrics included emergency obstetric training as part of the intervention (Nisar 2011). The largest trials were conducted in Ghana, China, and Mexico (Walker 2014; Xu 2014; Gomez 2018). In the Upper West, Central, and Western regions of Ghana, 40 public and mission hospitals were randomised to receive a training intervention in waves. Over the 18‐month study period, data were collected on 105,850 births (Gomez 2018). In two Eastern regions of China, 22 hospitals were randomised. Over the two‐year study period, data on 120,563 births were collected (62,774 in intervention and 57,789 in control) (Xu 2014). A large cluster‐randomised trial in three Mexican states included 24 community hospitals matched in 12 pairs, for which 58,837 deliveries occurred and 641 births were observed (Walker 2014).

There were three further large cluster‐randomised trials based in high‐income settings. One took place in 24 obstetric units in the Netherlands (12 intervention and 12 control); the authors collected outcome data on 28,657 women with a viable (beyond 24 weeks) pregnancy for one year following training (Fransen 2017). One study was undertaken in 15 labour and delivery units in the USA; 20,863 women delivered in the trial hospitals during the study (Nielsen 2007). The smallest study was conducted in three small community hospitals in the Midwestern USA; together these represented about 1800 deliveries per year (380, 889, and 500, respectively). Women admitted to the hospitals during the study period were included in the study (Riley 2011).

The cluster‐randomised trial not focusing on obstetrics/neonatology was based in three district hospitals in three cities in Pakistan; 248 life‐threatening emergencies were observed during the study (Nisar 2011).

Two of the single‐centre randomised trials focused on non‐obstetric/neonatal issues. One study was based in the emergency department of San Francisco General Hospital; the focus of the study was the treatment of trauma patients presenting to the department (Knudson 2008). A further study was based at a tertiary care facility in the USA with approximately 450 inpatient beds. Consecutive adult resuscitation attempts led by study participants occurring during the study period were included. Ninety‐eight cardiac arrests were analysed (Weidman 2010).

The remaining two studies focused on obstetrics/neonatology. One was based at Pumwani Maternity Hospital in Nairobi, Kenya, the main maternity facility for the city, with 17,000 deliveries per year; 212 resuscitations of newborns were observed there (97 in the intervention group and 115 in the control group) (Opiyo 2008). The other took place in the obstetric and anaesthesiology departments of a University of Copenhagen hospital in Denmark, which has approximately 6300 deliveries per year; 100 staff participated in the study (Sorensen 2015).

While all the included studies are randomised trials, there was heterogeneity in the study designs and settings. There was a mixture of cluster trials and standard trials as well as single site and multicentre trials. The settings also varied in terms of where the studies were conducted, and included a mix of low‐, middle‐, and high‐income countries.

Intervention and comparator groups

Gomez and colleagues delivered a low‐dose, high‐frequency intervention: eight days of low‐dose sessions at the hospitals, followed by ongoing high‐frequency practice sessions using simulators supplied by the study, delivered by a peer practice co‐ordinator who received extra training and mentoring calls. Local staff were trained to collect data. Because this was a stepped‐wedge design, sites awaiting training provided an internal no‐training control (Gomez 2018). Xu and colleagues set up a system of cascading neonatal resuscitation training through the 11 intervention sites. Thirty providers were trained at the start of the study, and these healthcare workers set up local training at their hospitals. The control sites received only the routine training that was already offered at their hospital (Xu 2014). Walker and colleagues delivered 24 hours of PRONTO interprofessional obstetric emergency training to the 12 intervention hospitals. The control hospitals received no intervention (Walker 2014).

Fransen and colleagues arranged a one‐day multiprofessional simulation‐based team training focusing on crew resource management in a simulation centre. The control arm received no intervention (Fransen 2017). Nielsen and colleagues arranged an adapted version of the MedTeams Labor & Delivery Team Coordination Course, delivered as a three‐day instructor training session; these trainers then returned to deliver local training on site. A contingency team of senior staff was also trained to respond to obstetric emergencies. A total of 1307 delivery room staff were trained. The control arm received no intervention (Nielsen 2007). Riley and colleagues randomised three units to one of three arms: TeamSTEPPS didactic training only; TeamSTEPPS training plus in situ staff training; or control (Riley 2011).

Nisar and colleagues arranged a five‐day Essential Surgical Skills course with an emphasis on emergency maternal, neonatal, and child health. The control arm received no intervention (Nisar 2011).

Knudson and colleagues delivered 10 hours of scenario‐based teaching in either a didactic manner or a simulation‐enhanced training package (Knudson 2008). Weidman and colleagues randomised eligible residents to receive standard training plus simulation training or standard resuscitation training alone. The simulation group received a four‐hour resuscitation training session using a computerised mannequin simulator in a simulation laboratory (Weidman 2010).

Opiyo and colleagues delivered a one‐day resuscitation course for resuscitation at birth. The control group received delayed training (Opiyo 2008). Sorensen and colleagues delivered an in situ simulation course or an off‐site simulation course on two obstetric emergencies; the in situ training group was the intervention group (Sorensen 2015).

As described above, no two of the tested interventions were the same, even among those focused on obstetrics/neonatology, which comprised the vast majority of studies in the review. This heterogeneity of intervention makes comparison difficult. Nonetheless, all of the interventions had simulation as a core component.

Participants

There was heterogeneity in the participants included in the studies. Gomez and colleagues recruited all the skilled birth attendants working in the study sites, which in practice meant that only midwives were recruited (Gomez 2018). In the study by Xu and colleagues, all obstetricians, paediatricians, and midwives were invited to participate; participation was measured by the number of staff who completed the evaluation: 97 in the intervention group and 87 in the control group (Xu 2014). Walker and colleagues included 450 physicians and nurses who worked directly with pregnant women or their infants during labour, birth, or the postpartum period (Walker 2014).

In the study by Fransen and colleagues, multiprofessional staff of the intervention units were obliged to participate, and were divided into multiprofessional teams. A total of 471 staff received the training course (Fransen 2017). In the study by Nielsen and colleagues, 1307 staff members from obstetrics, anaesthesiology, and nursing were trained by the newly trained team. These staff were also structured into core work teams (Nielsen 2007). In the study by Riley and colleagues, all labour and delivery staff were invited to participate (Riley 2011).

Nisar and colleagues recruited 36 doctors working in emergency departments and labour rooms and responsible for emergency management of general, obstetric, neonatal, and child health. Half of these received training (Nisar 2011).

Knudson recruited midlevel surgical residents to take part in the study; 18 participants were included, but outcomes relevant to this review were available for only 10 of them (Knudson 2008). In the study by Weidman and colleagues, postgraduate year two internal medicine residents were recruited. These residents are on call one night in four and lead the resuscitation team (Weidman 2010).

Opiyo and colleagues assessed 90 nursing and midwifery staff working in the labour ward and theatre for inclusion; only 35 met the eligibility criteria, and all 35 were offered training. The remaining 55, who were not eligible for inclusion (largely because they were unavailable at the required times), formed the control group (Opiyo 2008). Sorensen and colleagues assessed nurses, midwives, and doctors in all roles in the labour ward for eligibility. One hundred of an eligible 249 were randomised and grouped into teams of 10 (Sorensen 2015).

Who delivered training?

Three studies adopted an approach of training the trainers, who then cascaded the intervention throughout the study sites.

In Gomez 2018, experienced skilled birth attendants (in this case midwives) were trained as master trainers. Who delivered this master training is not documented. These master trainers then delivered the on‐site courses in the 40 participating health facilities. Following the initial course, they selected and trained local peer practice co‐ordinators to deliver the ongoing intervention at each site (Gomez 2018).

In Nielsen 2007, clinical staff from the intervention hospitals attended an instructor training session; these staff returned to conduct local training sessions (Nielsen 2007). Xu and colleagues developed a cascade of trainers starting from five national Neonatal Resuscitation Program trainers, who trained 30 county‐level providers who were healthcare workers. Each of these instructors set up a hospital‐based training centre (Xu 2014).

The training was carried out by a specified group of trainers in seven studies. The background of the trainers is defined in two studies. In Sorensen 2015, instructors were recruited from the working committee, which consisted of representatives from all the healthcare professionals participating in the trial (Sorensen 2015). In Fransen 2017, the training at the simulation centre was run by two members (an obstetrician and a communication expert) of a group of 10 facilitators with several years of experience (Fransen 2017).

The group delivering training was broadly defined in five studies. In Opiyo 2008, course instructors had completed a Kenya Resuscitation Council Advanced Life Support Generic Instructor Course co‐supervised by a team from the UK Resuscitation Council (Opiyo 2008). In Nisar 2011, the training was carried out by Advanced Life Support Group certified instructors, and in Walker 2014, by PRONTO trainers. In Knudson 2008, the instructors were, by implication, trauma surgeons (Knudson 2008). In Weidman 2010, scenarios were followed by a faculty‐facilitated video debriefing; the faculty is not specifically defined (Weidman 2010). In one study it was not clear who delivered the training (Riley 2011).

Outcomes

A wide variety of outcomes were reported by the studies included in the review. Many of these did not fall into our primary or secondary outcome measures, as they were not Kirkpatrick level 3 or 4 outcomes. Outcomes falling into each of our proposed categories were reported in at least one study. In terms of our primary outcomes, survival to hospital discharge was reported by one study (Weidman 2010); morbidity rate by three studies (Nielsen 2007; Riley 2011; Fransen 2017); and protocol/guideline adherence by three studies (Opiyo 2008; Weidman 2010; Nisar 2011). Regarding our secondary outcomes, patient outcomes were reported in five studies (Weidman 2010; Walker 2014; Xu 2014; Fransen 2017; Gomez 2018); clinical practice outcomes in five studies (Nielsen 2007; Knudson 2008; Riley 2011; Walker 2014; Sorensen 2015); and organisation‐of‐care outcomes in two studies (Walker 2014; Xu 2014).

Funding source

Four studies were funded from a single, government‐affiliated source. Fransen 2017 was funded by the Netherlands Organisation for Health Research and Development. Knudson 2008 and Weidman 2010 were funded by the US Army and the US National Institutes of Health, respectively. Nisar 2011 was funded by the Pakistan Initiative for Mothers and Newborns, a USAID‐funded organisation.

Five studies were funded by multiple organisations, including a government/governmental organisation. Nielsen 2007 combined funding from the Department of Defense and the American Research Institute. Riley 2011 combined a government source (US Agency for Healthcare Research and Quality) with local funding from the University of Minnesota Academic Health Center. Xu 2014 was funded by China‐Australia Health and HIV/AIDS Facility, a partnership between the governments of China and Australia.

Some studies were funded by sources from a variety of backgrounds. Sorensen 2015 was funded via the Danish Regions Development and Research Foundation, the Laerdal Foundation for Acute Medicine, and the Aase and Ejnar Danielsen Foundation. Walker 2014 was funded by the Mexican National Institute of Women (INMUJERES) and the State Secretary for Women in the states of Chiapas and Mexico. Supplemental funding was provided by the Bill and Melinda Gates Foundation and the Laerdal Foundation.

Two studies were funded from solely philanthropic sources: the Bill and Melinda Gates Foundation funded Gomez 2018, and the Laerdal Foundation and a Wellcome Trust senior research fellowship funded Opiyo 2008.

Excluded studies

We excluded 3186 studies at the screening stage (see the PRISMA diagram in Figure 1). At the full‐text stage, we screened 75 records and excluded 49, mainly because they were not randomised trials. We provide reasons for exclusion in the Characteristics of excluded studies table for seven studies that either narrowly missed our inclusion criteria or were particularly large or important studies that did not meet them.

Risk of bias in included studies

The risk of bias of the included studies is summarised in Figure 2 and Figure 3. No study had a low overall risk of bias; a key reason for this is that it was not possible to blind study participants to the intervention. However, even setting this element aside, no study displayed an overall low risk of bias.


Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.



Risk of bias summary: review authors' judgements about each risk of bias item for each included study.


Allocation

Five included studies described adequate methods of random sequence generation and allocation concealment (Nielsen 2007; Weidman 2010; Nisar 2011; Sorensen 2015; Fransen 2017). In four studies random sequence generation or allocation concealment was not discussed (Knudson 2008; Riley 2011; Xu 2014; Gomez 2018). Two studies were unable to randomise as planned: one because the local ministry of health wanted to allocate two hospitals in two of their sites to the intervention (Walker 2014), and one because so many staff met the exclusion criteria that if randomisation had taken place, the study would have been underpowered (Opiyo 2008). In these two studies the planned allocation concealment was not discussed (Opiyo 2008; Walker 2014).

Blinding

It was not possible to blind participants to their allocation, as they had to take part in the intervention. For this reason, all studies had a high risk of performance bias. Four studies described blinding of outcome assessors (Knudson 2008; Weidman 2010; Nisar 2011; Sorensen 2015), suggesting a low risk of detection bias. Two studies had unblinded data collectors (Opiyo 2008; Gomez 2018), and one study had self‐reported outcomes with random verification by evaluators (Xu 2014). The remaining studies did not discuss the blinding of their outcome assessors (Nielsen 2007; Riley 2011; Walker 2014; Fransen 2017).

Incomplete outcome data

One study discussed incomplete outcome data, and reported it as low (Sorensen 2015). Two studies discussed how they planned to minimise missing data (Nielsen 2007; Fransen 2017), however the presence of missing data was not described, therefore we have judged them as at unclear risk of bias. The remaining studies did not discuss missing data and have therefore been assessed as at an unclear risk of bias (Knudson 2008; Opiyo 2008; Weidman 2010; Nisar 2011; Riley 2011; Walker 2014; Xu 2014; Gomez 2018).

Selective reporting

We assessed eight studies as at low risk of reporting bias, as the outcomes were reported as described in their protocol or methods (Nielsen 2007; Opiyo 2008; Nisar 2011; Riley 2011; Xu 2014; Sorensen 2015; Fransen 2017; Gomez 2018). One study did not define its primary outcomes in the methods and has therefore been judged as having an unclear risk of bias (Knudson 2008). One study could not report most of its outcomes of interest because they did not occur, and therefore developed other outcome measures (Walker 2014), and one study reported additional outcome measures that were not predefined (Weidman 2010); these two studies have been allocated a high risk of bias.

Other potential sources of bias

Two studies did not appear to have any other sources of bias (Knudson 2008; Fransen 2017). Four studies that randomised staff within the same hospital carried a risk of contamination between study groups (Opiyo 2008; Weidman 2010; Nisar 2011; Sorensen 2015). In some studies, new policies of oversight and support were set up in addition to training. For example, a contingency team who responded to emergency calls was created following training in one study (Nielsen 2007), and a neonatal resuscitation quality management team in another (Xu 2014). In one study, hospitals had to be replaced because 11 units were unable to continue to participate when baseline data collection started (Walker 2014). In one study, some of the staff were transferred out of the facility before the end of the study, meaning that cross‐over could have occurred (Gomez 2018). In the remaining study, there were differences between the sites, with one site in particular having considerably more staff (Riley 2011).

Effects of interventions

See: Summary of findings for the main comparison Interactive training for in‐hospital‐based healthcare providers on the management of life‐threatening emergencies: effects on clinical practice and patient outcomes

The rationale for the level of certainty of evidence for each outcome is presented in the GRADE profile in Appendix 2.

Primary outcomes

Survival to hospital discharge

There was low‐certainty evidence from one study, with 30 participants and 98 events (cardiac arrests), that interactive training may make little or no difference to survival to hospital discharge (Weidman 2010). We downgraded the certainty of the evidence to low due to high risk of bias and imprecision.

Weidman and colleagues set their study in a tertiary healthcare facility in the USA. They recruited internal medicine residents who led the resuscitation team and measured consecutive, actual resuscitation attempts led by these residents. Survival to discharge was 15.2% in the intervention group versus 9.6% in the control group; 95% confidence intervals (CI) were not reported (Weidman 2010).

Morbidity rate

Very low‐certainty evidence from three studies showed that it is uncertain whether interactive training reduces morbidity rates (Nielsen 2007; Riley 2011; Fransen 2017). At least 1778 participants and a patient population of more than 57,193 (one study did not report figures) contributed evidence to this outcome. We downgraded the certainty of evidence to very low because of high risk of bias, inconsistency, and imprecision.

Nielsen and colleagues’ study was based in labour and delivery units in the USA. They recruited labour and delivery staff and measured outcomes in women who were admitted to the study sites at over 20 weeks’ gestation (Nielsen 2007). Riley and colleagues’ study took place in three small community hospitals in the USA. They recruited labour and delivery staff and studied all women admitted to their hospitals during the study period (Riley 2011). Fransen and colleagues conducted their study in obstetric units in the Netherlands. They recruited members of the multiprofessional team and measured obstetric outcomes in women with singleton pregnancies of over 24 weeks’ gestation (Fransen 2017).

In Nielsen 2007, no evidence of a difference was observed for the primary outcome of Adverse Outcome Index: the mean was 8.3% in the intervention group and 7.2% in the control group; the approximate 95% CI for the difference between groups was −5.6 to 3.2. The Weighted Adverse Outcome Score was 2.7 in intervention versus 2.3 in control (95% CI −3.4 to 1.4), and the Severity Index was 31.6 in intervention versus 30.6 in control (95% CI −23.0 to 7.0) (Nielsen 2007).

Fransen and colleagues reported several of our primary outcomes. Their primary outcome was a composite of obstetric complications, with an absolute number of complications of 287/14,500 in the intervention group versus 299/14,157 in the control group (odds ratio (OR) 1.0, 95% CI 0.80 to 1.3). With regard to their secondary outcome measures, trauma due to shoulder dystocia decreased in the intervention group compared to the control group (23/14,500 versus 35/14,157; OR 0.50, 95% CI 0.25 to 0.99). This was largely attributable to a reduction in clavicle fractures (13/14,500 versus 26/14,157; OR 0.38, 95% CI 0.15 to 0.93). Interestingly, there were more severe postpartum haemorrhages in the intervention group (41/14,500 versus 19/14,157; OR 2.2, 95% CI 1.2 to 3.9) and subsequently more transfusions of greater than 4 units (34/14,500 versus 18/14,157; OR 2.1, 95% CI 1.1 to 3.8); embolisations (10/14,500 versus 3/14,157; OR 4.7, 95% CI 1.3 to 17); and hysterectomies (10/14,500 versus 1/14,157; OR 10, 95% CI 0.99 to 120). All other secondary outcome measures, which included low Apgar score, eclampsia, hypoxic ischaemic encephalopathy (HIE), and a combined low Apgar score/low arterial umbilical pH, showed no evidence of a difference between groups (Fransen 2017).

The primary outcome in Riley 2011 was the Weighted Adverse Outcome Score. Riley and colleagues reported their results for each of the three sites as pre‐post intervention means. In the full intervention group (didactic and in situ simulations), there was a 37.4% reduction in the Weighted Adverse Outcome Score (1.15 standard deviation (SD) 0.47 pre‐intervention versus 0.72 SD 0.12 post‐intervention). In the didactic intervention group, there was a 1% reduction in Weighted Adverse Outcome Score (1.46 SD 1.05 pre‐intervention versus 1.45 SD 0.82 post‐intervention). Finally, in the control group there was an increase in Weighted Adverse Outcome Score of 42.7% (1.05 SD 0.79 pre‐intervention to 1.50 SD 0.35 post‐intervention) (Riley 2011).

Protocol or guideline adherence

Very low‐certainty evidence from three studies with 156 participants and 558 patients contributed to this outcome (Opiyo 2008; Weidman 2010; Nisar 2011). According to these studies, it is uncertain whether interactive training improves protocol or guideline adherence. We downgraded the certainty of evidence to very low because of the high risk of bias, inconsistency of findings and the small number of participants.

Opiyo and colleagues recruited nursing and midwifery staff from one hospital in Kenya. They studied resuscitations of newborns during the study period (Opiyo 2008). Weidman and colleagues’ study was based in a tertiary hospital in the USA. Internal medicine residents were recruited and the resuscitation attempts were analysed (Weidman 2010). Nisar and colleagues recruited doctors working in the labour room in three hospitals in Pakistan. They studied the structured approach to life‐threatening emergencies and included patients experiencing life‐threatening emergencies during the study period (Nisar 2011).

Opiyo and colleagues assessed whether nurses/midwives who received training undertook more perfect or adequate resuscitations than those who did not. In the first phase of the study, in which 35 providers were trained, a perfect (23.7% versus 10.4%; OR 2.27, 95% CI 1.23 to 4.22) or adequate (66% versus 27%; OR 2.45, 95% CI 1.75 to 3.42) resuscitation was more likely with training. Following the phase 2 roll‐out of the intervention, this held true, with 40% of resuscitations being perfect compared to 13.3% in the control group (OR 3, 95% CI 0.79 to 11.42). Similarly, 74.3% of resuscitations were adequate compared to 60% in the control group (OR 1.24, 95% CI 0.71 to 2.15). When phases 1 and 2 were combined, resuscitations in the intervention group were also more often perfect (28% versus 10.8%; risk ratio (RR) 2.60, 95% CI 1.53 to 4.43) and adequate (68.1% versus 30.8%; RR 2.22, 95% CI 1.64 to 2.99) (Opiyo 2008).

Mean resuscitation scores were higher in the intervention group during phase 1 (2.50, 95% CI 2.25 to 2.74 versus 1.95, 95% CI 1.74 to 2.6). This was also observed in the pooled data from phase 1 and 2 (2.4, 95% CI 2.18 to 2.61 versus 1.83, 95% CI 1.61 to 2.04) (Opiyo 2008).

Weidman and colleagues showed that cardiopulmonary resuscitation quality, as recorded by the defibrillators, was similar in terms of compression depth (intervention 47.9 mm (SD 7.0) versus control 48.8 mm (SD 7.7)); compression rate (106.5 min⁻¹ (SD 6.0) versus 104.4 min⁻¹ (SD 9.2)); ventilation rate (11.5 min⁻¹ (SD 4.0) versus 12.2 min⁻¹ (SD 4.1)); no‐flow fraction (median intervention 0.08 (interquartile range (IQR) 0.05 to 0.12) versus control 0.07 (IQR 0.05 to 0.11)); pre‐shock pause (5.3 s (IQR 4.0 to 8.6) versus 3.6 s (IQR 2.4 to 5.2)); post‐shock pause (2.9 s (IQR 2.2 to 3.3) versus 2.4 s (IQR 1.7 to 2.6)); and appropriate shocks (mean intervention 66.2% (SD 12.9) versus mean control 71.4% (SD 9.2)) (Weidman 2010).

Nisar and colleagues examined whether a structured approach to emergencies was taken in each group. In the individual‐level analysis, 79/124 of events in the intervention group were managed according to a structured approach compared to 46/124 in the control group (OR 2.98, 95% CI 1.78 to 4.99). In the cluster‐level analysis, 62.9% (50.4 to 75.3) were managed according to a structured approach in the intervention group compared to 36.3% (26.3 to 46.4) in the control group (Nisar 2011).
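As a check on the individual‐level figures above, the crude odds ratio and its Wald 95% confidence interval can be reproduced from the reported 2×2 counts. The following is a minimal Python sketch; the function name and the choice of the Wald method are our illustration, not taken from the study:

```python
import math

def odds_ratio_wald(a, b, c, d):
    """Crude odds ratio and Wald 95% CI for a 2x2 table:
    a/b = events/non-events in the intervention group,
    c/d = events/non-events in the control group."""
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)  # standard error of ln(OR)
    lo = math.exp(math.log(or_) - 1.96 * se_log_or)
    hi = math.exp(math.log(or_) + 1.96 * se_log_or)
    return or_, lo, hi

# Nisar 2011, individual-level analysis: 79/124 events managed with a
# structured approach in the intervention group versus 46/124 in the control.
or_, lo, hi = odds_ratio_wald(79, 124 - 79, 46, 124 - 46)
print(round(or_, 2), round(lo, 2), round(hi, 2))  # 2.98 1.78 4.99
```

This crude calculation agrees with the individual‐level OR and CI reported by Nisar and colleagues; the cluster‐level figures use a different analysis and are not reproduced this way.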

Secondary outcomes

Patient outcomes

Five studies contributed evidence to patient outcomes, with 951 participants and 314,055 in the patient population (Weidman 2010; Walker 2014; Xu 2014; Fransen 2017; Gomez 2018). Due to very low‐certainty evidence, it is uncertain whether interactive training affects patient outcomes. We downgraded the certainty of evidence to very low because of high risk of bias, inconsistent results, and small sample sizes.

Weidman and colleagues’ study was based in a tertiary hospital in the USA. Internal medicine residents were recruited and the resuscitation attempts were analysed (Weidman 2010). Walker and colleagues worked in 24 community hospitals in Mexico. They trained interprofessional teams and studied maternal and neonatal outcomes at the study sites (Walker 2014). Xu and colleagues worked in 22 provinces in China. They recruited all healthcare providers and studied resuscitation of neonates at all live births in the study hospitals (Xu 2014). Fransen and colleagues conducted their study in obstetric units in the Netherlands. They recruited members of the multiprofessional team and measured obstetric outcomes in women with singleton pregnancies of over 24 weeks’ gestation (Fransen 2017). Gomez and colleagues’ study took place in Ghana. They trained skilled birth attendants at hospitals and studied institutional deliveries at study sites (Gomez 2018).

Three studies reported improvements in patient outcomes (Walker 2014; Xu 2014; Gomez 2018). Two studies did not show improvement in patient outcomes, although the trend in their findings was towards improvement in patient outcomes (Weidman 2010; Fransen 2017).

In terms of studies reporting improvement in patient outcomes, Xu and colleagues showed that 10/62,274 died from asphyxia‐related causes in the intervention group, whilst 14/57,789 did so in the control group. Similarly, there were 464/62,274 babies born with asphyxia in the intervention group and 448/57,789 in the control group (Xu 2014). Gomez and colleagues split their results into the effect in the first and second six months. In terms of intrapartum stillbirth, there were 242/36,160 deliveries in the first six months compared to 392/38,192 at baseline (risk ratio (RR) 0.65, 95% CI 0.54 to 0.78). In the second six months, there were 165/31,498, equating to a RR of 0.49 (95% CI 0.36 to 0.65). With regard to newborn mortality within 24 hours of birth, there were 284/38,192 at baseline; 140/36,160 in the first six months (RR 0.41, 95% CI 0.32 to 0.51); and 104/31,498 in the second six months (RR 0.30, 95% CI 0.21 to 0.43). The risk ratios presented are adjusted for region and facility level (Gomez 2018).
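The unadjusted (crude) risk ratio can be recovered from the raw counts above. Note that the ratios reported by Gomez and colleagues are adjusted for region and facility level, so they can differ slightly from this crude calculation. A minimal Python sketch (the function name is our illustration):

```python
def crude_risk_ratio(events_int, n_int, events_ctl, n_ctl):
    """Unadjusted risk ratio: risk in the intervention/post period
    divided by risk in the control/baseline period."""
    return (events_int / n_int) / (events_ctl / n_ctl)

# Gomez 2018, intrapartum stillbirth: first six months versus baseline.
rr = crude_risk_ratio(242, 36160, 392, 38192)
print(round(rr, 2))  # 0.65
```

For this particular outcome the crude value happens to match the adjusted RR of 0.65 reported above to two decimal places; for the other outcomes the adjustment produces larger differences.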

Walker and colleagues measured a 44% decrease in perinatal mortality rates (95% CI −87% to −36%) (Walker 2014).

Regarding studies not showing improvement, in Fransen 2017, the secondary outcome of perinatal mortality was 0.45% in the intervention group versus 0.55% in the control group (OR 0.75, 95% CI 0.53 to 1.07). One maternal death occurred in the control group (Fransen 2017). Weidman and colleagues reported no statistically significant change in return of spontaneous circulation with the intervention (56.5% versus 51.9%; no confidence intervals presented) (Weidman 2010).

Clinical practice outcomes

Four studies contributed evidence on clinical practice outcomes (Nielsen 2007; Knudson 2008; Riley 2011; Sorensen 2015). Over 1417 participants with over 28,676 patients (one study reported no numbers) contributed to this outcome. Due to very low‐certainty evidence, we are uncertain whether interactive training makes a difference to clinical practice outcomes. We downgraded the certainty of evidence to very low because of risk of bias, inconsistency in results, and because the sample size was small or unclear in some studies.

Nielsen and colleagues’ study was based in labour and delivery units in the USA. They recruited labour and delivery staff and measured outcomes in women who were admitted to the study sites at over 20 weeks’ gestation (Nielsen 2007). Knudson and colleagues recruited mid‐level surgical trainees working in an emergency department in the USA. They studied crisis management skills in major resuscitations (Knudson 2008). Riley and colleagues’ study took place in three small community hospitals in the USA. They recruited labour and delivery staff and studied all women admitted to their hospitals during the study period (Riley 2011). Sorensen and colleagues worked in one hospital in Denmark. They recruited shift‐working staff on the labour ward and studied these participants using a safety attitudes questionnaire (Sorensen 2015).

Nielsen and colleagues examined 11 process measures, of which only two were relevant to our secondary outcomes. They documented the immediate caesarean section decision‐to‐incision interval, the only outcome that showed improvement: the adjusted mean was 21.1 minutes in the intervention group versus 33.3 minutes in the control group (95% CI −36.9 to −0.7) (Nielsen 2007).

Riley and colleagues administered a safety attitudes questionnaire to measure impressions of the culture of safety. They reported no evidence of change in safety attitudes at the control or didactic intervention sites. At the full intervention site (didactic and in situ simulation), there was an increase in the teamwork domain scores, but the authors reported that this was "not statistically significant" when adjustment was applied. Numbers were not reported (Riley 2011).

Sorensen and colleagues used safety attitudes questionnaires pre‐ and post‐intervention, and mean differences were calculated. Teamwork scores reduced by 1.4 (95% CI −5.8 to 3.1); safety climate scores increased by 1.6 (95% CI −2.0 to 5.1); job satisfaction increased by 0.6 (95% CI −2.9 to 4.1); stress recognition reduced by 2.6 (95% CI −9.2 to 4.0); and work condition scores changed by −0.3 (95% CI −5.7 to 5.1) (Sorensen 2015).

Knudson and colleagues measured the skills of surgical residents during trauma calls by videotaping them during actual resuscitations. The scores for initial treatment skills were similar for both the critical (simulation 91% SD 25 versus didactic 89% SD 28) and overall (simulation 71% SD 15 versus didactic 68% SD 14) skills. Crisis management skills were similar overall (simulation 83% SD 17 versus didactic 74% SD 22), and for decision making (simulation 71% SD 31 versus didactic 60% SD 32) and situation awareness (simulation 85% SD 14 versus didactic 79% SD 19). However, the teamwork elements of the crisis management scores appeared higher in the simulation group than in the didactic group (87 SD 19 versus 72 SD 24) (Knudson 2008).

Organisation‐of‐care outcomes

Two studies (634 participants; 179,400 patient population) contributed evidence to organisation‐of‐care outcomes (Walker 2014; Xu 2014). Due to very low‐certainty evidence, it is uncertain whether interactive training improves organisation‐of‐care outcomes. We downgraded the certainty of evidence to very low because of high risk of bias and inconsistency between studies.

Walker and colleagues worked in 24 community hospitals in Mexico. They trained interprofessional teams and studied maternal and neonatal outcomes at the study sites (Walker 2014). Xu and colleagues worked in 22 provinces in China. They recruited all healthcare providers and studied resuscitation of neonates at all live births in the study hospitals (Xu 2014).

Walker and colleagues found that the 12 intervention hospitals together identified 124 goals, of which 33 focused on teamwork; 35 focused on additional training; and 56 focused on system changes. After a 3‐month interval, between 2 and 12 goals were achieved by participant teams (mean = 6 goals) at each site. Seventy‐three (58.8%) of these goals were completed, including 28 (80%) of training goals, 30 (53%) of system change goals, and 15 (45%) of teamwork goals (Walker 2014).

Xu and colleagues distributed questionnaires to the 11 intervention and control hospitals at the end of the study. In terms of neonatal resuscitation providers being present at delivery, 10/11 intervention sites reported this as standard compared to 8/11 controls. Periodic neonatal resuscitation training was provided in 11/11 of the intervention sites compared with 8/11 of controls. Paediatricians participating in pre‐resuscitation discussion occurred in 10/11 of the intervention sites compared to 5/11 of controls. The neonatal intensive care team were present at delivery in 6/11 of the intervention sites compared to 2/11 of controls; paediatricians being in the delivery room for high‐risk deliveries occurred in 11/11 of the intervention sites compared to 6/11 of controls; and neonatal resuscitation case audit/discussion occurred in 10/11 of the intervention sites compared to 4/11 of the controls (Xu 2014).

Elements that may impact on effectiveness of intervention

We identified possible subgroups a priori. We have included a discussion of the elements identified as most important due to their potential impact on the effectiveness of the intervention. We also identified some potential mediating factors post hoc, which are discussed at the end of this section.

Multidisciplinary training

Multidisciplinary training took place in all but three of the studies involving obstetrics/neonatal emergency training. Xu and colleagues invited obstetricians, paediatricians, and midwives (Xu 2014), and Walker and colleagues offered training to all physicians and nurses who worked directly with pregnant women or their infants during labour (Walker 2014). Fransen and colleagues delivered training to several multiprofessional obstetric teams consisting of a gynaecologist/obstetrician, a secondary care midwife and/or a resident, and nurses (Fransen 2017). Nielsen and colleagues trained obstetricians, anaesthesiologists, and nurses from each of the intervention sites (Nielsen 2007). Riley and colleagues trained all labour and delivery staff (Riley 2011). Sorensen and colleagues included healthcare professionals who worked in shifts on the labour ward: consultant and trainee doctors in obstetrics and anaesthesiology, midwives, specialised midwives, auxiliary nurses, nurse anaesthetists, and operating theatre nurses (Sorensen 2015).

Opiyo and colleagues trained only nursing/midwifery staff (Opiyo 2008); Gomez and colleagues recruited all skilled birth attendants (however in actuality only midwives participated) (Gomez 2018); and Nisar and colleagues trained only doctors (Nisar 2011). Similarly, the two studies focusing on non‐obstetric/neonatal training focused on single staff groups: Knudson and colleagues trained midlevel surgical trainees (Knudson 2008), whilst Weidman and colleagues trained postgraduate year two internal medicine residents (Weidman 2010).

Location of training

The location of training was not universally reported. In situ training was delivered by Xu and colleagues and Gomez and colleagues, who reported training in local hospitals (Xu 2014; Gomez 2018). Nielsen and colleagues delivered a training‐the‐trainers course at an undisclosed location, but training to the wider staff was conducted at their hospitals (Nielsen 2007). Riley and colleagues describe the didactic component as classroom based, with the simulations in situ (Riley 2011), and Sorensen and colleagues investigated training in situ versus off‐site (Sorensen 2015).

Two studies delivered training solely in a simulation centre (Weidman 2010; Fransen 2017). The location of training in the remaining four studies was unclear or not discussed (Knudson 2008; Opiyo 2008; Nisar 2011; Walker 2014).

Duration of each course

The duration of the courses varied widely. The longest course was delivered by Gomez and colleagues, which involved eight days of training and monthly simulation sessions (Gomez 2018). Another study was a one‐off five‐day training session undertaken by doctors in the intervention group (Nisar 2011). Walker and colleagues delivered 24 hours of training (Walker 2014), and Knudson and colleagues delivered 10 hours of training spread over five weeks (Knudson 2008).

Three studies had an intervention length of one day (Opiyo 2008; Sorensen 2015; Fransen 2017). Two courses were shorter: Weidman and colleagues delivered a 4‐hour course (Weidman 2010), and Riley and colleagues delivered a 30‐minute didactic intervention or a simulation of 30 to 45 minutes with 2 hours of debriefing (Riley 2011).

In one study, it was not clear how long the training session lasted or whether staff attended repeated training sessions (Nielsen 2007); the length of the intervention in another study was also unclear (Xu 2014).

Duration of follow‐up

The duration of follow‐up varied dramatically. Nisar and colleagues and Sorensen and colleagues collected data for just four to six weeks after the intervention (Nisar 2011; Sorensen 2015). One study lasted eight months, and Opiyo and colleagues collected data for a total of one year: six months retrospectively, followed by three months of prospective collection after the first training and three months after the second (Opiyo 2008). Fransen and colleagues collected data for one year after all staff had received the intervention (Fransen 2017), as did Gomez and colleagues (who also had a six‐month run‐in period) (Gomez 2018). Another study continued for 15 months (Nielsen 2007).

Two studies lasted for three years (Walker 2014; Xu 2014); one study for four years (Riley 2011); and the Knudson 2008 study was still in progress at time of report (Knudson 2008).

Areas identified from the data

Focus of training

Three studies specifically mentioned team‐based training or a focus on team training in their packages. Fransen and colleagues delivered simulation‐based obstetric team training (Fransen 2017). Nielsen and colleagues delivered the MedTeams labour and delivery team co‐ordination course, based on crew resource management principles (Nielsen 2007), and Riley and colleagues delivered teamwork training alone or teamwork plus TeamSTEPPS simulation training (Riley 2011).

The remaining studies may have included teamwork in their training, but the focus of the intervention was skills and knowledge based. Xu and colleagues delivered neonatal resuscitation training (Xu 2014). Opiyo and colleagues focused on an ABC approach to resuscitation at birth (Opiyo 2008).

Walker and colleagues and Sorensen and colleagues delivered simulation‐based obstetric emergency training (PRONTO) (Walker 2014; Sorensen 2015). Nisar and colleagues delivered essential surgical skills training with a focus on emergency maternal, neonatal, and child health (Nisar 2011), and Gomez and colleagues delivered a curriculum of neonatal resuscitation and management of obstetric emergencies (Gomez 2018). Knudson and colleagues delivered a scenario‐based trauma curriculum (Knudson 2008), and Weidman and colleagues delivered resuscitation training (Weidman 2010).

Proportion of staff involved in the intervention

The proportion of staff involved in the intervention was not universally reported. Fransen and colleagues explicitly stated that participation in intervention units was approximately 95% (Fransen 2017). Walker and colleagues also reported this information, noting that between 6.4% and 31.6% of eligible medical personnel at each facility were trained, with a mean participation rate of 20.5%. Overall, 450 of 3228 eligible personnel in all 12 hospitals participated in the training (Walker 2014). Opiyo and colleagues reported that there were 90 providers, of which 32 were trained initially (55 not eligible to be randomised), whilst in a later phase a further 34 providers were trained (Opiyo 2008).

Some studies discussed the numbers/proportion of eligible participants trained, rather than the proportion of overall staff trained. Sorensen and colleagues reported that 100 of 249 eligible participants were recruited, of which half were assigned to intervention and half to control (Sorensen 2015). Nisar and colleagues stated that all eligible doctors were randomised, and all of the 50% assigned to the intervention group participated (Nisar 2011). Weidman reported that 30 residents were eligible to be randomised and that the intervention was delivered to all 14 residents randomised to it (Weidman 2010).

Nielsen and colleagues reported the number of staff trained across the seven hospitals as 1307; however, the proportion is not stated (Nielsen 2007). Riley and colleagues and Gomez and colleagues do not report this information, although everyone was invited to participate (Riley 2011; Gomez 2018). Similarly, Xu and colleagues imply that all were invited, but the proportion is not clear (Xu 2014). It was not clear how many were invited or eligible to participate in Knudson 2008.

Leadership of intervention

In five included studies, the research team initiated the intervention (Opiyo 2008; Nisar 2011; Walker 2014; Sorensen 2015; Fransen 2017). Knudson and colleagues do not clearly report this, but they imply that the authors themselves are involved in the postgraduate training programme and that the intervention was delivered to their trainees (Knudson 2008). Nielsen and colleagues do not clearly report the leadership of the intervention; however, they do report that the Department of Defense is committed to the crew resource training approach (Nielsen 2007). The intervention in Xu 2014 was driven by the Chinese Ministry of Health. The intervention in Gomez 2018 was driven by the priorities of the Ghanaian Health Service and the non‐governmental organisation running the study (Jhpiego). The leadership of the intervention in the remaining two studies is unclear (Weidman 2010; Riley 2011).

Incentive/trigger to participate in study

The triggers to start or participate in the study were wide‐ranging. Some studies aimed to build evidence for whether training was effective. Fransen and colleagues believed there was a lack of evidence for improvement of maternal and perinatal outcomes (Fransen 2017). Sorensen and colleagues wanted to establish the impact of off‐site or in situ training on stress and motivation to understand how to maximise learning (Sorensen 2015).

Riley and colleagues had previously identified that it did not seem that proficiency during simulation translated to clinical proficiency, and wanted to investigate this (Riley 2011). Similarly, both Opiyo and colleagues and Nisar and colleagues had found that there was little evidence of effect of training on patient outcomes (Opiyo 2008; Nisar 2011).

Xu and colleagues recognised that there had been improvements in asphyxia‐related deaths in previous studies, however counties and townships were not prioritised, and they hoped to help this with their initiative (Xu 2014). Walker and colleagues identified the need to develop low‐cost, high‐fidelity simulation training for low‐resource settings (Walker 2014). Gomez and colleagues wanted to support the government's health strategy to reduce institutional newborn mortality through training 90% of the country's skilled birth attendants (Gomez 2018).

Nielsen and colleagues identified that increasing costs of liability insurance meant that a major change in behaviour may be accepted (Nielsen 2007).

A final reason was to prepare for new roles. Knudson and colleagues aimed to prepare residents for their role as trauma team leaders, due to the need to efficiently and effectively train trauma surgeons as the burden of injury is increasing globally (Knudson 2008). Weidman and colleagues identified that residents did not feel adequately trained to lead resuscitations (Weidman 2010).

Discussion

Summary of main results

Given that the certainty of evidence in this review is generally very low, we are unable to report the effects of interactive training of healthcare providers on any of the outcome measures with any certainty. Nevertheless, of the 11 studies included in this review, nine reported that the training intervention improved at least one outcome at Kirkpatrick level 3 or 4 (Nielsen 2007; Knudson 2008; Opiyo 2008; Nisar 2011; Riley 2011; Walker 2014; Xu 2014; Fransen 2017; Gomez 2018). The remaining two studies showed no improvements in the outcomes of interest for this review (Weidman 2010; Sorensen 2015).

We have seen that interactive training can lead to changes in our primary outcomes of morbidity (Riley 2011; Fransen 2017) and adherence to protocols/guidelines (Opiyo 2008; Nisar 2011); however, no change was observed in survival to hospital discharge (Weidman 2010). Patient outcomes were improved in three studies (Walker 2014; Xu 2014; Gomez 2018), clinical practice outcomes in two studies (Nielsen 2007; Knudson 2008), and organisation‐of‐care goals in two studies (Walker 2014; Xu 2014). When considering the positive impact of training, it is important to note that in one study there was an increase in the number of severe postpartum haemorrhages, blood transfusions, and embolisations to manage the postpartum haemorrhages (Fransen 2017). Overall, given the very low certainty of the evidence, it is uncertain whether interactive training changes our outcomes of interest, and the adverse effects reported by Fransen and colleagues highlight the need for caution around the assumption that training is always a good thing.

When comparing studies that reported change with those that did not, there was little to distinguish the two groups. Both groups showed heterogeneity in terms of study location, which staff were targeted, duration of the training course, location of the training course, and proportion of staff included in the study.

Studies that included a multiprofessional staff group were more successful at modifying complex processes, which require a number of different elements to work concurrently. For example, there were improved decision‐to‐incision intervals for caesarean section in the Nielsen 2007 study. The process of transferring a woman to the operating theatre and anaesthetising her is complex and requires several staff groups working together (Nielsen 2007). A further example is from the Fransen 2017 study, where shoulder dystocia trauma was reduced following training (Fransen 2017). Managing shoulder dystocia often requires three to four people working efficiently and harmoniously.

When studies focused on single staff groups, the changes in behaviour seemed less multidimensional and more focused on the actions of one person: for example, in Nisar 2011 and Opiyo 2008 there was an improvement in the structured approach taken to manage an emergency (Opiyo 2008; Nisar 2011). In Gomez 2018, there was a reduction in intrapartum stillbirth and neonatal mortality. The authors described one reason for this reduction as being that staff were trained to resuscitate every baby born not breathing, except macerated stillbirths (babies who had clearly been dead inside the womb for some time, with external changes to reflect this). The initiation, and indeed the initial manoeuvres, of neonatal resuscitation tend to lie with a single healthcare worker (Gomez 2018).

However, Knudson and colleagues trained only surgical trainees, and the only improvement shown following training was in the teamwork elements of their crisis management scores, which would seem to be a complex process (Knudson 2008).

It needs to be acknowledged that whilst some improvements were seen in some outcomes, many of the outcome measures included in this review did not show change. As mentioned previously, no change was shown with interactive training in two studies (Weidman 2010; Sorensen 2015). However, in four studies (Opiyo 2008; Nisar 2011; Walker 2014; Gomez 2018), there were positive changes in all of the outcomes reported that were included in this review. Interestingly, three of these studies provided training to a single professional group (Opiyo 2008; Nisar 2011; Gomez 2018), whilst one provided multidisciplinary training (Walker 2014). The remaining studies showed improvements that were spread across different outcome areas.

One interesting point raised by the four studies that showed improvement across outcomes is the length of follow‐up. Two of the studies had a relatively short follow‐up of two to three months following the intervention (Opiyo 2008; Nisar 2011). One study had a follow‐up of one year, although it is important to note that the intervention was ongoing (Gomez 2018). In contrast, Walker and colleagues had a much longer follow‐up period of three years after providing training sessions (Walker 2014). This raises the question of whether the length of follow‐up in some studies is optimal. An intervention takes time to become embedded in a hospital, and it may be necessary to allow enough time for this change to be observed.

This review highlights that focused training, often of single professional groups, can result in an improvement in specific skills over a short period of time (Opiyo 2008; Nisar 2011). This raises the issue of knowledge decay, because it is less clear how long these skills last and therefore how frequently training needs to be repeated. Fransen and colleagues considered this by examining whether the effectiveness of their intervention changed across the four quarters of the year following implementation (Fransen 2017). They identified that the effectiveness of the intervention, in terms of impact on patient outcomes, seemed to decline three months after the intervention (van de Ven 2017). Other studies have recognised the deterioration in knowledge and the importance of repeated rather than one‐off interventions (Bluestone 2013). Several studies in this review seem to have considered this factor and ensured that ongoing training was part of their intervention (Nielsen 2007; Xu 2014; Gomez 2018).

One other area that could have affected the way the interactive training studies were implemented was the leadership for the intervention and the trigger for initiating the study. These studies were largely initiated by the researchers reporting them, who were seeking more evidence to understand whether interactive training can improve actual outcomes, which are identified here as being at level 3 or 4 of the Kirkpatrick scale. This was the case across all the studies reported in this review, rather than being unique to either the group showing change or not.

Another factor we identified is that a significant funding commitment is required for these important studies, which are generally large in scale or long in duration. Governments, government agencies, or large philanthropic organisations funded all of the included studies. This perhaps reflects the complexity and scale required to address the patient‐ and organisation‐of‐care‐focused outcomes.

The fact that randomised trials have been used to demonstrate changes in clinically and organisationally important outcomes is a significant step forward from the previous focus on observational studies. It is difficult to power studies to achieve this aim, not least because the events are relatively rare, so large numbers of participants or long time periods can be required to see any impact.

Overall completeness and applicability of evidence

We identified a comprehensive set of randomised trials. To achieve this we employed an inclusive search strategy to identify all randomised trials, and retrieved the full‐text reports of any that we thought may be useful. Our main limitations in terms of completeness and applicability of the evidence lie with the evidence that is available and the inherent weaknesses within it. This is largely due to the complex nature of training interventions as well as the tendency to implement projects on a small scale rather than in a research setting.

The evidence found is largely based in obstetrics and neonatology, with only two of the 11 studies focused outside of these areas. This may be particularly important because, in the experience of the review team, obstetrics and neonatology can be far more isolated medical specialities than, for example, internal medicine or surgery. This is due to the highly specialised nature of the patients being treated, meaning that the teams are relatively small and close‐knit compared with those in the broader medical specialities. However, this also means that it has not been possible to gain a comprehensive insight into the lessons across a broader group of specialities, which was one of the objectives of this review.

Due to the way the included studies are reported, it has not been possible to ascertain exactly how many patients contributed to the outcomes in this review. This is because the population studied could be either the staff group to whom training was offered or the patient population of the department participating in the study. This contributes to the difficulty in deciding how far the review findings are generalisable.

Despite having exclusively used randomised trial evidence, it has not been possible to combine results to give a pooled estimate of the effect size of training for a specific outcome. Even when studies investigated the same area, the outcome measures used were disparate. When measuring morbidity in obstetrics, for example, there are no universally defined criteria, so studies measure this same outcome differently, making combining results impossible. It was also not possible to ascertain some of the issues that are useful for understanding the implementation of the studies, such as what proportion of staff were trained and what triggered the intervention. This is in part due to a lack of uniform reporting criteria for complex behavioural‐change interventions.

Whilst all of the studies included in this review are randomised, there is a large body of evidence in this area that comes from non‐randomised studies, which provides additional information not included in this review. For example, a review of observational evidence found that technology‐enhanced training could improve patient outcomes, with a pooled effect size of 0.50 (95% CI 0.34 to 0.66; P < 0.001); however, the evidence was inconsistent and included studies reporting negative effects (Cook 2011).

Certainty of the evidence

As seen in Figure 2, Figure 3, and summary of findings Table for the main comparison, none of the studies provide evidence with a low risk of bias, even when blinding of participants (which is not possible when they are the ones receiving training) is disregarded. Furthermore, using the GRADE criteria, the certainty of evidence is low for the outcome of survival to discharge and very low for all other outcomes.

Potential biases in the review process

When screening the search results, we identified some articles that described complex interventions associated with interactive training. When the intervention described was considered to be substantially more than interactive training, the review team debated its inclusion; our concern was that any changes observed needed to be attributable to the interactive training being assessed rather than to separate organisational changes. There were further debates about how immediate the emergency care needed to be for a study to be included, as a vast array of emergency care exists.

Agreements and disagreements with other studies or reviews

Broadly, the findings of this review agree with those of other reviews, namely that there is little high‐quality randomised trial evidence to confirm that interactive training programmes affect patient or organisational outcomes. One review focused on technology‐enhanced simulation, incorporating both randomised trial and observational evidence, and concluded that technology‐enhanced simulation had a moderate effect on patient outcomes (Cook 2011). However, when looking at individual specialities rather than across specialities, there is more convincing evidence that interactive training is effective. For example, a Cochrane Review investigating in‐service training for health professionals to improve care of seriously ill newborns and children in low‐income countries found that in‐service training improves health professionals' treatment of neonates (Opiyo 2015). However, that review found only two studies and called for further high‐quality evidence. A further Cochrane Review of newborn resuscitation training programmes showed a reduction in early neonatal mortality with training (Dempsy 2015; Pammi 2016). A recent review of advanced cardiac life support reported that advanced life support courses are likely to have an effect on survival to discharge and return of spontaneous circulation, although no randomised trial evidence was eligible for inclusion in that review (Lockey 2018). Having said this, a Cochrane Review of Advanced Trauma Life Support training revealed no randomised trial evidence that trauma training programmes improve outcomes for victims of injury (Jayaraman 2014). With regard to obstetric training, there is an ongoing review (Fransen 2015) and another review that suggests there were positive results from training (Bergh 2015).

In terms of identifying the active components of training, our review has not clearly identified the essential components required to change outcomes. We have seen that multiprofessional training can affect complex processes and that training focused on single staff groups can alter individuals' behaviour. However, a previous review of obstetric emergency training drew clear conclusions about the necessary active components: institutional‐level incentives to train, multiprofessional on‐site training of all staff, teamwork training, and high‐fidelity simulation models (Siassakos 2009).

Figure 1. Study flow diagram.

Figure 2. Risk of bias graph: review authors' judgements about each risk of bias item presented as percentages across all included studies.

Figure 3. Risk of bias summary: review authors' judgements about each risk of bias item for each included study.

Summary of findings for the main comparison. Interactive training for in‐hospital‐based healthcare providers on the management of life‐threatening emergencies: effects on clinical practice and patient outcomes

The effects of interactive training of healthcare providers on the management of life‐threatening emergencies in hospital


Participants: Healthcare workers delivering life‐saving emergency care in a hospital setting (obstetric/labour and delivery staff, physicians, skilled birth attendants, midwives, midlevel surgical trainees, anaesthesiologists, nurses, internal medical residents)
Population: Patients who suffer life‐threatening emergencies in hospital: women around the time of birth, neonates, trauma patients, and adults undergoing resuscitation
Setting: All hospital settings are included. The evidence for this review is drawn from the Netherlands, Denmark, the USA, China, Pakistan, Kenya, Mexico, and Ghana.
Intervention: Interactive training, i.e. any training including a component in which participants are not just passive recipients of the training
Comparison: Standard training delivered at the facilities, no training, or only the didactic component of the intervention (e.g. a new training session delivered without the interactive element)

| Outcomes (number of studies) | No. participants/no. in the population studied | Certainty of the evidence (GRADE) | Impact and selected results |
| --- | --- | --- | --- |
| Survival to hospital discharge (1 study) | 30 participants; 98 events (cardiac arrests) observed | ⊕⊕⊝⊝ Low 1 | Interactive emergency training strategies may make little or no difference in survival to hospital discharge. |
| Morbidity rate (3 studies) | 1778 participants; 57,193 in the population studied 2 | ⊕⊝⊝⊝ Very low 3 | It is uncertain whether interactive training leads to change in morbidity rates. |
| Protocol or guideline adherence (3 studies) | 156 participants; 558 in the population studied | ⊕⊝⊝⊝ Very low 4 | It is uncertain whether interactive training leads to change in protocol or guideline adherence. |
| Patient outcomes (5 studies) | 951 participants; 314,055 in the patient population | ⊕⊝⊝⊝ Very low 5 | It is uncertain whether interactive training leads to change in patient outcomes. |
| Clinical practice outcomes (4 studies) | 1417 participants; 28,676 in the population (patients and staff) 2 | ⊕⊝⊝⊝ Very low 6 | It is uncertain whether interactive training leads to changes in clinical practice outcomes. |
| Organisation of care (2 studies) | 634 participants; 179,400 in the patient population | ⊕⊝⊝⊝ Very low 7 | It is uncertain whether interactive training leads to change in organisation‐of‐care measures. |

GRADE Working Group grades of evidence
High certainty: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate certainty: We are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: Our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
Very low certainty: We have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.

1We downgraded the certainty of the evidence to low due to high risk of bias and imprecision.
2One study, Riley 2011, did not report numbers for participants or population.
3We downgraded the certainty of evidence to very low due to high risk of bias, inconsistency and imprecision.
4We downgraded the certainty of evidence to very low due to high risk of bias, inconsistency of findings and the small number of participants.
5We downgraded the certainty of evidence to very low due to high risk of bias, inconsistent results and small sample sizes.
6We downgraded the certainty of evidence to very low due to risk of bias, inconsistency in results, and the sample size being small or unclear in some studies.
7We downgraded the certainty of evidence to very low due to high risk of bias and inconsistency between studies.
