Identifying the Sickest During Triage: Using Point-of-Care Severity Scores to Predict Prognosis in Emergency Department Patients With Suspected Sepsis
Sepsis is the leading cause of in-hospital mortality in the United States.1 Sepsis is present on admission in 85% of cases, and each hour of delay in antibiotic treatment is associated with 4% to 7% increased odds of mortality.2,3 Prompt identification and treatment of sepsis are essential for reducing morbidity and mortality, but identifying sepsis during triage is challenging.2
Risk stratification scores that rely solely on data readily available at the bedside have been developed to quickly identify those at greatest risk of poor outcomes from sepsis in real time. The quick Sequential Organ Failure Assessment (qSOFA) score, the National Early Warning Score, version 2 (NEWS2), and the Shock Index are easy-to-calculate measures that use routinely collected clinical data that are not subject to laboratory delay. These scores can be incorporated into electronic health record (EHR)-based alerts and can be calculated longitudinally to track the risk of poor outcomes over time. qSOFA was developed to quantify patient risk at the bedside in non-intensive care unit (ICU) settings, but there is no consensus about its ability to predict adverse outcomes such as mortality and ICU admission.4-6 The United Kingdom’s National Health Service uses NEWS2 to identify patients at risk for sepsis.7 NEWS has been shown to have similar or better sensitivity in identifying poorer outcomes in sepsis patients compared with systemic inflammatory response syndrome (SIRS) criteria and qSOFA.4,8-11 However, since the latest update of NEWS2 in 2017, there has been little study of its predictive ability. The Shock Index is a simple bedside score (heart rate divided by systolic blood pressure) that was developed to detect changes in cardiovascular performance before systemic shock onset. Although it was not developed for infection and has not been regularly applied in the sepsis literature, the Shock Index might be useful for identifying patients at increased risk of poor outcomes. Patients with higher and sustained Shock Index scores are more likely to experience morbidity, such as hyperlactatemia, vasopressor use, and organ failure, and also have an increased risk of mortality.12-14
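Because the Shock Index is just a ratio of two triage vital signs, it can be computed at the bedside or in an EHR rule with a few lines of code. The sketch below is illustrative only and is not the authors' implementation; the positivity cutoff of ≥0.7 is a commonly cited threshold in the Shock Index literature and is an assumption here, since this article does not state the cut-point it used.

```python
def shock_index(heart_rate: float, systolic_bp: float) -> float:
    """Shock Index = heart rate (beats/min) divided by systolic BP (mm Hg)."""
    if systolic_bp <= 0:
        raise ValueError("systolic blood pressure must be positive")
    return heart_rate / systolic_bp

# Illustrative cutoff only (assumption, not taken from this article);
# the binary cut-point should be chosen to match local validation data.
SHOCK_INDEX_CUTOFF = 0.7

def shock_index_positive(heart_rate: float, systolic_bp: float) -> bool:
    """Flag a patient as Shock Index-positive at the assumed cutoff."""
    return shock_index(heart_rate, systolic_bp) >= SHOCK_INDEX_CUTOFF
```

For example, a patient with a heart rate of 110 beats/min and a systolic blood pressure of 100 mm Hg has a Shock Index of 1.1 and would screen positive at this assumed cutoff.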
Although the predictive abilities of these bedside risk stratification scores have been assessed individually using standard binary cut-points, the comparative performance of qSOFA, the Shock Index, and NEWS2 has not been evaluated in patients presenting to an emergency department (ED) with suspected sepsis.
METHODS
Design and Setting
We conducted a retrospective cohort study of ED patients who presented with suspected sepsis to the University of California San Francisco (UCSF) Helen Diller Medical Center at Parnassus Heights between June 1, 2012, and December 31, 2018. Our institution is a 785-bed academic teaching hospital with approximately 30,000 ED encounters per year. The study was approved with a waiver of informed consent by the UCSF Human Research Protection Program.
Participants
We use an Epic-based EHR platform (Epic 2017, Epic Systems Corporation) for clinical care, which was implemented on June 1, 2012. All data elements were obtained from Clarity, the relational database that stores Epic’s inpatient data. The study included encounters for patients age ≥18 years who had blood cultures ordered within 24 hours of ED presentation and administration of intravenous antibiotics within 24 hours. Repeat encounters were treated independently in our analysis.
Outcomes and Measures
We compared the ability of qSOFA, the Shock Index, and NEWS2 to predict in-hospital mortality and admission to the ICU from the ED (ED-to-ICU admission). We used the
We compared demographic and clinical characteristics of patients who were positive for qSOFA, the Shock Index, and NEWS2. Demographic data were extracted from the EHR and included primary language, age, sex, and insurance status. All International Classification of Diseases (ICD)-9/10 diagnosis codes were pulled from Clarity billing tables. We used the Elixhauser comorbidity groupings19 of ICD-9/10 codes present on admission to identify preexisting comorbidities and underlying organ dysfunction. To estimate burden of comorbid illnesses, we calculated the validated van Walraven comorbidity index,20 which provides an estimated risk of in-hospital death based on documented Elixhauser comorbidities. Admission level of care (acute, stepdown, or intensive care) was collected for inpatient admissions to assess initial illness severity.21 We also evaluated discharge disposition and in-hospital mortality. Index blood culture results were collected, and dates and timestamps of mechanical ventilation, fluid, vasopressor, and antibiotic administration were obtained for the duration of the encounter.
UCSF uses an automated, real-time, algorithm-based severe sepsis alert that is triggered when a patient meets ≥2 SIRS criteria and again when the patient meets severe sepsis or septic shock criteria (ie, ≥2 SIRS criteria in addition to end-organ dysfunction and/or fluid nonresponsive hypotension). This sepsis screening alert was in use for the duration of our study.22
Statistical Analysis
We performed a subgroup analysis among those who were diagnosed with sepsis, according to the 2016 Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3) criteria.
All statistical analyses were conducted using Stata 14 (StataCorp). We summarized differences in demographic and clinical characteristics among the populations meeting each severity score but elected not to conduct hypothesis testing because patients could be positive for one or more scores. We calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for each score to predict in-hospital mortality and ED-to-ICU admission. To allow comparison with other studies, we also created a composite outcome of either in-hospital mortality or ED-to-ICU admission.
RESULTS
Within our sample, 23,837 ED patients had blood cultures ordered within 24 hours of ED presentation and were considered to have suspected sepsis. The mean age of the cohort was 60.8 years, and 1,612 (6.8%) had positive blood cultures. A total of 12,928 patients (54.2%) were found to have sepsis. We documented 1,427 in-hospital deaths (6.0%) and 3,149 (13.2%) ED-to-ICU admissions. At ED triage, 1,921 (8.1%) were qSOFA-positive, 4,273 (17.9%) were Shock Index-positive, and 11,832 (49.6%) were NEWS2-positive. At ED triage, blood pressure, heart rate, respiratory rate, and oxygen saturation were documented in >99% of patients, 93.5% had temperature documented, and 28.5% had a Glasgow Coma Scale (GCS) score recorded. If the window of assessment was widened to 1 hour, GCS was documented in only 44.2% of those with suspected sepsis.
Demographic Characteristics and Clinical Course
qSOFA-positive patients received antibiotics more quickly than those who were Shock Index-positive or NEWS2-positive (median 1.5, 1.8, and 2.8 hours after admission, respectively). In addition, those who were qSOFA-positive were more likely to have a positive blood culture (10.9%, 9.4%, and 8.5%, respectively) and to receive an EHR-based diagnosis of sepsis (77.0%, 69.6%, and 60.9%, respectively) than those who were Shock Index- or NEWS2-positive. Those who were qSOFA-positive also were more likely to be mechanically ventilated during their hospital stay (25.4%, 19.2%, and 10.8%, respectively) and to receive vasopressors (33.5%, 22.5%, and 12.2%, respectively). In-hospital mortality also was more common among those who were qSOFA-positive at triage (23.4%, 15.3%, and 9.2%, respectively).
Because both qSOFA and NEWS2 incorporate GCS, we explored baseline characteristics of patients with GCS documented at triage (n = 6,794). These patients were older (median age 63 and 61 years, P < .0001), more likely to be male (54.9% and 53.4%, P = .0031), more likely to have renal failure (22.8% and 20.1%, P < .0001), more likely to have liver disease (14.2% and 12.8%, P = .006), had a higher van Walraven comorbidity score on presentation (median 10 and 8, P < .0001), and were more likely to go directly to the ICU from the ED (20.2% and 10.6%, P < .0001). However, among the 6,397 GCS scores documented at triage, only 1,579 (24.7%) were abnormal.
Test Characteristics of qSOFA, Shock Index, and NEWS2 for Predicting In-hospital Mortality and ED-to-ICU Admission
Among 23,837 patients with suspected sepsis, NEWS2 had the highest sensitivity for predicting in-hospital mortality (76.0%; 95% CI, 73.7%-78.2%) and ED-to-ICU admission (78.9%; 95% CI, 77.5%-80.4%) but had the lowest specificity for in-hospital mortality (52.0%; 95% CI, 51.4%-52.7%) and for ED-to-ICU admission (54.8%; 95% CI, 54.1%-55.5%) (Table 3). qSOFA had the lowest sensitivity for in-hospital mortality (31.5%; 95% CI, 29.1%-33.9%) and ED-to-ICU admission (29.3%; 95% CI, 27.7%-30.9%) but the highest specificity for in-hospital mortality (93.4%; 95% CI, 93.1%-93.8%) and ED-to-ICU admission (95.2%; 95% CI, 94.9%-95.5%). The Shock Index had a sensitivity that fell between qSOFA and NEWS2 for in-hospital mortality (45.8%; 95% CI, 43.2%-48.5%) and ED-to-ICU admission (49.2%; 95% CI, 47.5%-51.0%). The specificity of the Shock Index also was between qSOFA and NEWS2 for in-hospital mortality (83.9%; 95% CI, 83.4%-84.3%) and ED-to-ICU admission (86.8%; 95% CI, 86.4%-87.3%). All three scores exhibited relatively low PPV, ranging from 9.2% to 23.4% for in-hospital mortality and 21.0% to 48.0% for ED-to-ICU admission. Conversely, all three scores exhibited relatively high NPV, ranging from 95.5% to 97.1% for in-hospital mortality and 89.8% to 94.5% for ED-to-ICU admission.
When considering a binary cut-point, the Shock Index exhibited the highest AUROC for in-hospital mortality (0.648; 95% CI, 0.635-0.662) and had a significantly higher AUROC than qSOFA (AUROC, 0.625; 95% CI, 0.612-0.637; P = .0005), but there was no difference compared with NEWS2 (AUROC, 0.640; 95% CI, 0.628-0.652; P = .2112). NEWS2 had a significantly higher AUROC than qSOFA for predicting in-hospital mortality (P = .0227). The Shock Index also exhibited the highest AUROC for ED-to-ICU admission (0.680; 95% CI, 0.617-0.689), which was significantly higher than the AUROC for qSOFA (P < .0001) and NEWS2 (P = .0151). NEWS2 had a significantly higher AUROC than qSOFA for predicting ED-to-ICU admission (P < .0001). Similar findings were seen in patients found to have sepsis.
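An AUROC has a simple rank interpretation: it is the probability that a randomly chosen patient who experienced the outcome received a higher score than a randomly chosen patient who did not (ties counting as half). A small self-contained sketch of that calculation follows; it computes a single empirical AUROC and is not the DeLong method used in the article to compare correlated AUROCs.

```python
def auroc(scores, labels) -> float:
    """Empirical AUROC via the Mann-Whitney rank interpretation:
    P(score of a random positive > score of a random negative),
    with ties counted as 0.5. O(n_pos * n_neg); fine for illustration."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A score that perfectly separates outcomes from non-outcomes yields 1.0, and a completely uninformative score yields 0.5, which is why values in the 0.62-0.68 range reported here indicate only modest discrimination.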
DISCUSSION
In this retrospective cohort study of 23,837 patients who presented to the ED with suspected sepsis, the standard qSOFA threshold was met least frequently, followed by the Shock Index and NEWS2. NEWS2 had the highest sensitivity but the lowest specificity for predicting in-hospital mortality and ED-to-ICU admission, making it a challenging bedside risk stratification scale for identifying patients at risk of poor clinical outcomes. When comparing predictive performance among the three scales, qSOFA had the highest specificity and the Shock Index had the highest AUROC for in-hospital mortality and ED-to-ICU admission in this cohort of patients with suspected sepsis. These trends in sensitivity, specificity, and AUROC were consistent among those who met EHR criteria for a sepsis diagnosis. In the analysis of the three scoring systems using all available cut-points, qSOFA and NEWS2 had the highest AUROCs, followed by the Shock Index.
Considering the rapid progression from organ dysfunction to death in sepsis patients, as well as the difficulty establishing a sepsis diagnosis at triage,23 providers must quickly identify patients at increased risk of poor outcomes when they present to the ED. Sepsis alerts often are built using SIRS criteria,27 including the one used for sepsis surveillance at UCSF since 2012,22 but the white blood cell count criterion is subject to a laboratory lag and could lead to a delay in identification. Implementation of a point-of-care bedside score alert that uses readily available clinical data could allow providers to identify patients at greatest risk of poor outcomes immediately at ED presentation and triage, which motivated us to explore the predictive performance of qSOFA, the Shock Index, and NEWS2.
Our study is the first to provide a head-to-head comparison of the predictive performance of qSOFA, the Shock Index, and NEWS2, three easy-to-calculate bedside risk scores that use EHR data collected among patients with suspected sepsis. The Sepsis-3 guidelines recommend qSOFA to quickly identify non-ICU patients at greatest risk of poor outcomes because the measure exhibited predictive performance similar to the more extensive SOFA score outside the ICU.16,23 Although some studies have confirmed qSOFA’s high predictive performance,28-31 our test characteristics and AUROC findings are in line with other published analyses.4,6,10,17 The UK National Health Service is using NEWS2 to screen for patients at risk of poor outcomes from sepsis. Several analyses that assessed the predictive ability of NEWS have reported estimates in line with our findings.4,10,32 The Shock Index was introduced in 1967 and provided a metric to evaluate hemodynamic stability based on heart rate and systolic blood pressure.33 The Shock Index has been studied in several contexts, including sepsis,34 and studies show that a sustained Shock Index is associated with increased odds of vasopressor administration, higher prevalence of hyperlactatemia, and increased risk of poor outcomes in the ICU.13,14
For our study, we were particularly interested in exploring how the Shock Index would compare with more frequently used severity scores such as qSOFA and NEWS2 among patients with suspected sepsis, given the simplicity of its calculation and the easy availability of required data. In our cohort of 23,837 patients, only 159 were missing blood pressure and only 71 were missing heart rate. In contrast, both qSOFA and NEWS2 include an assessment of level of consciousness that can be subject to variability in assessment methods and EHR documentation across institutions.11 In our cohort, GCS within 30 minutes of ED presentation was missing in 72 patients, which could have led to incomplete calculation of qSOFA and NEWS2 if a missing value was not actually within normal limits.
Several investigations relate qSOFA to NEWS but few compare qSOFA with the newer NEWS2, and even fewer evaluate the Shock Index with any of these scores.10,11,18,29,35-37 In general, studies have shown that NEWS exhibits a higher AUROC for predicting mortality, sepsis with organ dysfunction, and ICU admission, often as a composite outcome.4,11,18,37,38 A handful of studies compare the Shock Index to SIRS; however, little has been done to compare the Shock Index to qSOFA or NEWS2, scores that have been used specifically for sepsis and might be more predictive of poor outcomes than SIRS.33 In our study, the Shock Index had a higher AUROC than either qSOFA or NEWS2 for predicting in-hospital mortality and ED-to-ICU admission measured as separate outcomes and as a composite outcome using standard cut-points for these scores.
When selecting a severity score to apply in an institution, it is important to carefully evaluate the score’s test characteristics, in addition to considering the availability of reliable data. Tests with high sensitivity and NPV for the population being studied can be useful to rule out disease or risk of poor outcome, while tests with high specificity and PPV can be useful to rule in disease or risk of poor outcome.39 When considering specificity, qSOFA’s performance was superior to the Shock Index and NEWS2 in our study, but a small percentage of the population was identified using a cut-point of qSOFA ≥2. If we used qSOFA and applied this standard cut-point at our institution, we could be confident that those identified were at increased risk, but we would miss a significant number of patients who would experience a poor outcome. When considering sensitivity, performance of NEWS2 was superior to qSOFA and the Shock Index in our study, but one-half of the population was identified using a cut-point of NEWS2 ≥5. If we were to apply this standard NEWS2 cut-point at our institution, we would assume that one-half of our population was at risk, which might drive resource use towards patients who will not experience a poor outcome. Although none of the scores exhibited a robust AUROC measure, the Shock Index had the highest AUROC for in-hospital mortality and ED-to-ICU admission when using the standard binary cut-point, and its sensitivity and specificity is between that of qSOFA and NEWS2, potentially making it a score to use in settings where qSOFA and NEWS2 score components, such as altered mentation, are not reliably collected. Finally, our sensitivity analysis varying the binary cut-point of each score within our population demonstrated that the standard cut-points might not be as useful within a specific population and might need to be tailored for implementation, balancing sensitivity, specificity, PPV, and NPV to meet local priorities and ICU capacity.
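The sensitivity analysis described above, varying the binary cut-point of each score, amounts to sweeping candidate thresholds and recomputing test characteristics at each one. A hedged sketch of such a sweep (illustrative only, not the authors' analysis code):

```python
def cutpoint_sweep(scores, outcomes, cutoffs):
    """For each candidate cut-point c, evaluate the rule
    'score >= c predicts the outcome' and report its sensitivity
    and specificity, so local teams can pick a threshold that
    matches their priorities and ICU capacity."""
    results = {}
    for c in cutoffs:
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= c and y)
        fn = sum(1 for s, y in zip(scores, outcomes) if s < c and y)
        tn = sum(1 for s, y in zip(scores, outcomes) if s < c and not y)
        fp = sum(1 for s, y in zip(scores, outcomes) if s >= c and not y)
        results[c] = {
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        }
    return results
```

Lowering the cut-point raises sensitivity at the cost of specificity, and vice versa, which is the trade-off an institution must weigh when tailoring a standard threshold to its own population.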
Our study has limitations. It is a single-center, retrospective analysis, which could reduce generalizability. However, it includes a large and diverse patient population spanning several years. Missing GCS data could have affected the predictive ability of qSOFA and NEWS2 in our cohort. We could not reliably impute GCS because of the high rate of missingness, and we therefore assumed missing values were normal, as was done in the Sepsis-3 derivation studies.16 Previous studies that attempted to impute GCS have not observed improved performance of qSOFA in predicting mortality.40 Because manually collected variables such as GCS are less reliably documented in the EHR, their use in triage risk scores might be limited.
Although the current analysis focused on the predictive performance of qSOFA, the Shock Index, and NEWS2 at triage, performance of these scores could affect the ED team’s treatment decisions before handoff to the hospitalist team and the expected level of care the patient will receive after in-patient admission. These tests also have the advantage of being easy to calculate at the bedside over time, which could provide an objective assessment of longitudinal predicted prognosis.
CONCLUSION
Local priorities should drive selection of a screening tool, balancing sensitivity, specificity, PPV, and NPV to achieve the institution’s goals. qSOFA, Shock Index, and NEWS2 are risk stratification tools that can be easily implemented at ED triage using data available at the bedside. Although none of these scores performed strongly when comparing AUROCs, qSOFA was highly specific for identifying patients with poor outcomes, and NEWS2 was the most sensitive for ruling out those at high risk among patients with suspected sepsis. The Shock Index exhibited a sensitivity and specificity that fell between qSOFA and NEWS2 and also might be considered to identify those at increased risk, given its ease of implementation, particularly in settings where altered mentation is unreliably or inconsistently documented.
Acknowledgment
The authors thank the UCSF Division of Hospital Medicine Data Core for their assistance with data acquisition.
1. Jones SL, Ashton CM, Kiehne LB, et al. Outcomes and resource use of sepsis-associated stays by presence on admission, severity, and hospital type. Med Care. 2016;54(3):303-310. https://doi.org/10.1097/MLR.0000000000000481
2. Seymour CW, Gesten F, Prescott HC, et al. Time to treatment and mortality during mandated emergency care for sepsis. N Engl J Med. 2017;376(23):2235-2244. https://doi.org/10.1056/NEJMoa1703058
3. Kumar A, Roberts D, Wood KE, et al. Duration of hypotension before initiation of effective antimicrobial therapy is the critical determinant of survival in human septic shock. Crit Care Med. 2006;34(6):1589-1596. https://doi.org/10.1097/01.CCM.0000217961.75225.E9
4. Churpek MM, Snyder A, Sokol S, Pettit NN, Edelson DP. Investigating the impact of different suspicion of infection criteria on the accuracy of Quick Sepsis-Related Organ Failure Assessment, Systemic Inflammatory Response Syndrome, and Early Warning Scores. Crit Care Med. 2017;45(11):1805-1812. https://doi.org/10.1097/CCM.0000000000002648
5. Abdullah SMOB, Sørensen RH, Dessau RBC, Sattar SMRU, Wiese L, Nielsen FE. Prognostic accuracy of qSOFA in predicting 28-day mortality among infected patients in an emergency department: a prospective validation study. Emerg Med J. 2019;36(12):722-728. https://doi.org/10.1136/emermed-2019-208456
6. Kim KS, Suh GJ, Kim K, et al. Quick Sepsis-related Organ Failure Assessment score is not sensitive enough to predict 28-day mortality in emergency department patients with sepsis: a retrospective review. Clin Exp Emerg Med. 2019;6(1):77-83. https://doi.org/10.15441/ceem.17.294
7. National Early Warning Score (NEWS) 2: Standardising the assessment of acute-illness severity in the NHS. Royal College of Physicians; 2017.
8. Brink A, Alsma J, Verdonschot RJCG, et al. Predicting mortality in patients with suspected sepsis at the emergency department: a retrospective cohort study comparing qSOFA, SIRS and National Early Warning Score. PLoS One. 2019;14(1):e0211133. https://doi.org/10.1371/journal.pone.0211133
9. Redfern OC, Smith GB, Prytherch DR, Meredith P, Inada-Kim M, Schmidt PE. A comparison of the Quick Sequential (Sepsis-Related) Organ Failure Assessment Score and the National Early Warning Score in non-ICU patients with/without infection. Crit Care Med. 2018;46(12):1923-1933. https://doi.org/10.1097/CCM.0000000000003359
10. Churpek MM, Snyder A, Han X, et al. Quick Sepsis-related Organ Failure Assessment, Systemic Inflammatory Response Syndrome, and Early Warning Scores for detecting clinical deterioration in infected patients outside the intensive care unit. Am J Respir Crit Care Med. 2017;195(7):906-911. https://doi.org/10.1164/rccm.201604-0854OC
11. Goulden R, Hoyle MC, Monis J, et al. qSOFA, SIRS and NEWS for predicting inhospital mortality and ICU admission in emergency admissions treated as sepsis. Emerg Med J. 2018;35(6):345-349. https://doi.org/10.1136/emermed-2017-207120
12. Biney I, Shepherd A, Thomas J, Mehari A. Shock Index and outcomes in patients admitted to the ICU with sepsis. Chest. 2015;148(suppl 4):337A. https://doi.org/10.1378/chest.2281151
13. Wira CR, Francis MW, Bhat S, Ehrman R, Conner D, Siegel M. The shock index as a predictor of vasopressor use in emergency department patients with severe sepsis. West J Emerg Med. 2014;15(1):60-66. https://doi.org/10.5811/westjem.2013.7.18472
14. Berger T, Green J, Horeczko T, et al. Shock index and early recognition of sepsis in the emergency department: pilot study. West J Emerg Med. 2013;14(2):168-174. https://doi.org/10.5811/westjem.2012.8.11546
15. Middleton DJ, Smith TO, Bedford R, Neilly M, Myint PK. Shock Index predicts outcome in patients with suspected sepsis or community-acquired pneumonia: a systematic review. J Clin Med. 2019;8(8):1144. https://doi.org/10.3390/jcm8081144
16. Seymour CW, Liu VX, Iwashyna TJ, et al. Assessment of clinical criteria for sepsis: for the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA. 2016;315(8):762-774. https://doi.org/10.1001/jama.2016.0288
17. Abdullah S, Sørensen RH, Dessau RBC, Sattar S, Wiese L, Nielsen FE. Prognostic accuracy of qSOFA in predicting 28-day mortality among infected patients in an emergency department: a prospective validation study. Emerg Med J. 2019;36(12):722-728. https://doi.org/10.1136/emermed-2019-208456
18. Usman OA, Usman AA, Ward MA. Comparison of SIRS, qSOFA, and NEWS for the early identification of sepsis in the Emergency Department. Am J Emerg Med. 2018;37(8):1490-1497. https://doi.org/10.1016/j.ajem.2018.10.058
19. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Med Care. 1998;36(1):8-27. https://doi.org/10.1097/00005650-199801000-00004
20. van Walraven C, Austin PC, Jennings A, Quan H, Forster AJ. A modification of the Elixhauser comorbidity measures into a point system for hospital death using administrative data. Med Care. 2009;47(6):626-633. https://doi.org/10.1097/MLR.0b013e31819432e5
21. Prin M, Wunsch H. The role of stepdown beds in hospital care. Am J Respir Crit Care Med. 2014;190(11):1210-1216. https://doi.org/10.1164/rccm.201406-1117PP
22. Narayanan N, Gross AK, Pintens M, Fee C, MacDougall C. Effect of an electronic medical record alert for severe sepsis among ED patients. Am J Emerg Med. 2016;34(2):185-188. https://doi.org/10.1016/j.ajem.2015.10.005
23. Singer M, Deutschman CS, Seymour CW, et al. The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA. 2016;315(8):801-810. https://doi.org/10.1001/jama.2016.0287
24. Rhee C, Dantes R, Epstein L, et al. Incidence and trends of sepsis in US hospitals using clinical vs claims data, 2009-2014. JAMA. 2017;318(13):1241-1249. https://doi.org/10.1001/jama.2017.13836
25. Safari S, Baratloo A, Elfil M, Negida A. Evidence based emergency medicine; part 5 receiver operating curve and area under the curve. Emerg (Tehran). 2016;4(2):111-113.
26. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845.
27. Kangas C, Iverson L, Pierce D. Sepsis screening: combining Early Warning Scores and SIRS Criteria. Clin Nurs Res. 2021;30(1):42-49. https://doi.org/10.1177/1054773818823334
28. Freund Y, Lemachatti N, Krastinova E, et al. Prognostic accuracy of Sepsis-3 Criteria for in-hospital mortality among patients with suspected infection presenting to the emergency department. JAMA. 2017;317(3):301-308. https://doi.org/10.1001/jama.2016.20329
29. Finkelsztein EJ, Jones DS, Ma KC, et al. Comparison of qSOFA and SIRS for predicting adverse outcomes of patients with suspicion of sepsis outside the intensive care unit. Crit Care. 2017;21(1):73. https://doi.org/10.1186/s13054-017-1658-5
30. Canet E, Taylor DM, Khor R, Krishnan V, Bellomo R. qSOFA as predictor of mortality and prolonged ICU admission in Emergency Department patients with suspected infection. J Crit Care. 2018;48:118-123. https://doi.org/10.1016/j.jcrc.2018.08.022
31. Anand V, Zhang Z, Kadri SS, Klompas M, Rhee C; CDC Prevention Epicenters Program. Epidemiology of Quick Sequential Organ Failure Assessment criteria in undifferentiated patients and association with suspected infection and sepsis. Chest. 2019;156(2):289-297. https://doi.org/10.1016/j.chest.2019.03.032
32. Hamilton F, Arnold D, Baird A, Albur M, Whiting P. Early Warning Scores do not accurately predict mortality in sepsis: A meta-analysis and systematic review of the literature. J Infect. 2018;76(3):241-248. https://doi.org/10.1016/j.jinf.2018.01.002
33. Koch E, Lovett S, Nghiem T, Riggs RA, Rech MA. Shock Index in the emergency department: utility and limitations. Open Access Emerg Med. 2019;11:179-199. https://doi.org/10.2147/OAEM.S178358
34. Yussof SJ, Zakaria MI, Mohamed FL, Bujang MA, Lakshmanan S, Asaari AH. Value of Shock Index in prognosticating the short-term outcome of death for patients presenting with severe sepsis and septic shock in the emergency department. Med J Malaysia. 2012;67(4):406-411.
35. Siddiqui S, Chua M, Kumaresh V, Choo R. A comparison of pre ICU admission SIRS, EWS and q SOFA scores for predicting mortality and length of stay in ICU. J Crit Care. 2017;41:191-193. https://doi.org/10.1016/j.jcrc.2017.05.017
36. Costa RT, Nassar AP, Caruso P. Accuracy of SOFA, qSOFA, and SIRS scores for mortality in cancer patients admitted to an intensive care unit with suspected infection. J Crit Care. 2018;45:52-57. https://doi.org/10.1016/j.jcrc.2017.12.024
37. Mellhammar L, Linder A, Tverring J, et al. NEWS2 is Superior to qSOFA in detecting sepsis with organ dysfunction in the emergency department. J Clin Med. 2019;8(8):1128. https://doi.org/10.3390/jcm8081128
38. Szakmany T, Pugh R, Kopczynska M, et al. Defining sepsis on the wards: results of a multi-centre point-prevalence study comparing two sepsis definitions. Anaesthesia. 2018;73(2):195-204. https://doi.org/10.1111/anae.14062
39. Newman TB, Kohn MA. Evidence-Based Diagnosis: An Introduction to Clinical Epidemiology. Cambridge University Press; 2009.
40. Askim Å, Moser F, Gustad LT, et al. Poor performance of quick-SOFA (qSOFA) score in predicting severe sepsis and mortality - a prospective study of patients admitted with infection to the emergency department. Scand J Trauma Resusc Emerg Med. 2017;25(1):56. https://doi.org/10.1186/s13049-017-0399-4
Sepsis is the leading cause of in-hospital mortality in the United States.1 Sepsis is present on admission in 85% of cases, and each hour delay in antibiotic treatment is associated with 4% to 7% increased odds of mortality.2,3 Prompt identification and treatment of sepsis is essential for reducing morbidity and mortality, but identifying sepsis during triage is challenging.2
Risk stratification scores that rely solely on data readily available at the bedside have been developed to quickly identify those at greatest risk of poor outcomes from sepsis in real time. The quick Sequential Organ Failure Assessment (qSOFA) score, the National Early Warning System (NEWS2), and the Shock Index are easy-to-calculate measures that use routinely collected clinical data that are not subject to laboratory delay. These scores can be incorporated into electronic health record (EHR)-based alerts and can be calculated longitudinally to track the risk of poor outcomes over time. qSOFA was developed to quantify patient risk at bedside in non-intensive care unit (ICU) settings, but there is no consensus about its ability to predict adverse outcomes such as mortality and ICU admission.4-6 The United Kingdom’s National Health Service uses NEWS2 to identify patients at risk for sepsis.7 NEWS has been shown to have similar or better sensitivity in identifying poorer outcomes in sepsis patients compared with systemic inflammatory response syndrome (SIRS) criteria and qSOFA.4,8-11 However, since the latest update of NEWS2 in 2017, there has been little study of its predictive ability. The Shock Index is a simple bedside score (heart rate divided by systolic blood pressure) that was developed to detect changes in cardiovascular performance before systemic shock onset. Although it was not developed for infection and has not been regularly applied in the sepsis literature, the Shock Index might be useful for identifying patients at increased risk of poor outcomes. Patients with higher and sustained Shock Index scores are more likely to experience morbidity, such as hyperlactatemia, vasopressor use, and organ failure, and also have an increased risk of mortality.12-14
Although the predictive abilities of these bedside risk stratification scores have been assessed individually using standard binary cut-points, the comparative performance of qSOFA, the Shock Index, and NEWS2 has not been evaluated in patients presenting to an emergency department (ED) with suspected sepsis.
METHODS
Design and Setting
We conducted a retrospective cohort study of ED patients who presented with suspected sepsis to the University of California San Francisco (UCSF) Helen Diller Medical Center at Parnassus Heights between June 1, 2012, and December 31, 2018. Our institution is a 785-bed academic teaching hospital with approximately 30,000 ED encounters per year. The study was approved with a waiver of informed consent by the UCSF Human Research Protection Program.
Participants
We use an Epic-based EHR platform (Epic 2017, Epic Systems Corporation) for clinical care, which was implemented on June 1, 2012. All data elements were obtained from Clarity, the relational database that stores Epic’s inpatient data. The study included encounters for patients age ≥18 years who had blood cultures ordered within 24 hours of ED presentation and administration of intravenous antibiotics within 24 hours. Repeat encounters were treated independently in our analysis.
Outcomes and Measures
We compared the ability of qSOFA, the Shock Index, and NEWS2 to predict in-hospital mortality and admission to the ICU from the ED (ED-to-ICU admission). We used the standard binary cut-point for each score to classify patients as score-positive at ED triage.
We compared demographic and clinical characteristics of patients who were positive for qSOFA, the Shock Index, and NEWS2. Demographic data were extracted from the EHR and included primary language, age, sex, and insurance status. All International Classification of Diseases (ICD)-9/10 diagnosis codes were pulled from Clarity billing tables. We used the Elixhauser comorbidity groupings19 of ICD-9/10 codes present on admission to identify preexisting comorbidities and underlying organ dysfunction. To estimate burden of comorbid illnesses, we calculated the validated van Walraven comorbidity index,20 which provides an estimated risk of in-hospital death based on documented Elixhauser comorbidities. Admission level of care (acute, stepdown, or intensive care) was collected for inpatient admissions to assess initial illness severity.21 We also evaluated discharge disposition and in-hospital mortality. Index blood culture results were collected, and dates and timestamps of mechanical ventilation, fluid, vasopressor, and antibiotic administration were obtained for the duration of the encounter.
UCSF uses an automated, real-time, algorithm-based severe sepsis alert that is triggered when a patient meets ≥2 SIRS criteria and again when the patient meets severe sepsis or septic shock criteria (ie, ≥2 SIRS criteria in addition to end-organ dysfunction and/or fluid nonresponsive hypotension). This sepsis screening alert was in use for the duration of our study.22
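The two-stage alert described above can be sketched as follows. This is an illustrative reconstruction using the four standard SIRS criteria (temperature, heart rate, respiratory rate, white blood cell count), not the institution's actual implementation; the end-organ dysfunction and hypotension checks of the second stage are omitted.

```python
def sirs_count(temp_c, hr, rr, wbc):
    """Count how many of the four standard SIRS criteria are met.

    temp_c: temperature (Celsius); hr: heart rate (beats/min);
    rr: respiratory rate (breaths/min); wbc: white cells (10^3/uL).
    """
    criteria = [
        temp_c > 38.0 or temp_c < 36.0,  # temperature derangement
        hr > 90,                         # tachycardia
        rr > 20,                         # tachypnea
        wbc > 12.0 or wbc < 4.0,         # leukocytosis or leukopenia
    ]
    return sum(criteria)

def sirs_alert(temp_c, hr, rr, wbc):
    """First-stage alert: fires when >=2 SIRS criteria are met."""
    return sirs_count(temp_c, hr, rr, wbc) >= 2
```

Note that the WBC criterion depends on a laboratory result, which is the lag discussed later in this article.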
Statistical Analysis
We performed a subgroup analysis among those who were diagnosed with sepsis, according to the 2016 Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3) criteria.
All statistical analyses were conducted using Stata 14 (StataCorp). We summarized differences in demographic and clinical characteristics among the populations meeting each severity score but elected not to conduct hypothesis testing because patients could be positive for one or more scores. We calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for each score to predict in-hospital mortality and ED-to-ICU admission. To allow comparison with other studies, we also created a composite outcome of either in-hospital mortality or ED-to-ICU admission.
RESULTS
Within our sample, 23,837 ED patients had blood cultures ordered within 24 hours of ED presentation and were considered to have suspected sepsis. The mean age of the cohort was 60.8 years, and 1,612 (6.8%) had positive blood cultures. A total of 12,928 patients (54.2%) were found to have sepsis. We documented 1,427 in-hospital deaths (6.0%) and 3,149 (13.2%) ED-to-ICU admissions. At ED triage, 1,921 (8.1%) were qSOFA-positive, 4,273 (17.9%) were Shock Index-positive, and 11,832 (49.6%) were NEWS2-positive. At ED triage, blood pressure, heart rate, respiratory rate, and oxygen saturation were documented in >99% of patients, 93.5% had temperature documented, and 28.5% had a Glasgow Coma Scale (GCS) score recorded. If the window of assessment was widened to 1 hour, GCS was documented in only 44.2% of those with suspected sepsis.
Demographic Characteristics and Clinical Course
qSOFA-positive patients received antibiotics more quickly than those who were Shock Index-positive or NEWS2-positive (median 1.5, 1.8, and 2.8 hours after admission, respectively). In addition, those who were qSOFA-positive were more likely to have a positive blood culture (10.9%, 9.4%, and 8.5%, respectively) and to receive an EHR-based diagnosis of sepsis (77.0%, 69.6%, and 60.9%, respectively) than those who were Shock Index- or NEWS2-positive. Those who were qSOFA-positive also were more likely to be mechanically ventilated during their hospital stay (25.4%, 19.2%, and 10.8%, respectively) and to receive vasopressors (33.5%, 22.5%, and 12.2%, respectively). In-hospital mortality also was more common among those who were qSOFA-positive at triage (23.4%, 15.3%, and 9.2%, respectively).
Because both qSOFA and NEWS2 incorporate GCS, we explored baseline characteristics of patients with GCS documented at triage (n = 6,794). Compared with patients without a documented triage GCS, these patients were older (median age 63 vs 61 years, P < .0001), more likely to be male (54.9% vs 53.4%, P = .0031), more likely to have renal failure (22.8% vs 20.1%, P < .0001), more likely to have liver disease (14.2% vs 12.8%, P = .006), had a higher van Walraven comorbidity score on presentation (median 10 vs 8, P < .0001), and were more likely to go directly to the ICU from the ED (20.2% vs 10.6%, P < .0001). However, among the 6,397 GCS scores documented at triage, only 1,579 (24.7%) were abnormal.
Test Characteristics of qSOFA, Shock Index, and NEWS2 for Predicting In-hospital Mortality and ED-to-ICU Admission
Among 23,837 patients with suspected sepsis, NEWS2 had the highest sensitivity for predicting in-hospital mortality (76.0%; 95% CI, 73.7%-78.2%) and ED-to-ICU admission (78.9%; 95% CI, 77.5%-80.4%) but had the lowest specificity for in-hospital mortality (52.0%; 95% CI, 51.4%-52.7%) and for ED-to-ICU admission (54.8%; 95% CI, 54.1%-55.5%) (Table 3). qSOFA had the lowest sensitivity for in-hospital mortality (31.5%; 95% CI, 29.1%-33.9%) and ED-to-ICU admission (29.3%; 95% CI, 27.7%-30.9%) but the highest specificity for in-hospital mortality (93.4%; 95% CI, 93.1%-93.8%) and ED-to-ICU admission (95.2%; 95% CI, 94.9%-95.5%). The Shock Index had a sensitivity that fell between qSOFA and NEWS2 for in-hospital mortality (45.8%; 95% CI, 43.2%-48.5%) and ED-to-ICU admission (49.2%; 95% CI, 47.5%-51.0%). The specificity of the Shock Index also was between qSOFA and NEWS2 for in-hospital mortality (83.9%; 95% CI, 83.4%-84.3%) and ED-to-ICU admission (86.8%; 95% CI, 86.4%-87.3%). All three scores exhibited relatively low PPV, ranging from 9.2% to 23.4% for in-hospital mortality and 21.0% to 48.0% for ED-to-ICU admission. Conversely, all three scores exhibited relatively high NPV, ranging from 95.5% to 97.1% for in-hospital mortality and 89.8% to 94.5% for ED-to-ICU admission.
When considering the standard binary cut-point, the Shock Index exhibited the highest AUROC for in-hospital mortality (0.648; 95% CI, 0.635-0.662) and had a significantly higher AUROC than qSOFA (AUROC, 0.625; 95% CI, 0.612-0.637; P = .0005), but there was no difference compared with NEWS2 (AUROC, 0.640; 95% CI, 0.628-0.652; P = .2112). NEWS2 had a significantly higher AUROC than qSOFA for predicting in-hospital mortality (P = .0227). The Shock Index also exhibited the highest AUROC for ED-to-ICU admission (0.680; 95% CI, 0.617-0.689), which was significantly higher than the AUROC for qSOFA (P < .0001) and NEWS2 (P = .0151). NEWS2 had a significantly higher AUROC than qSOFA for predicting ED-to-ICU admission (P < .0001). Similar findings were seen in patients found to have sepsis.
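When a score is dichotomized at a single cut-point, its ROC curve has only one operating point, so the AUROC reduces to the average of sensitivity and specificity. The binary-cutoff AUROCs reported here are consistent with that identity (small discrepancies reflect rounding of the published values):

```python
def binary_auroc(sensitivity, specificity):
    """AUROC of a test dichotomized at one cut-point: the ROC polygon
    (0,0) -> (1-specificity, sensitivity) -> (1,1) has area (sens+spec)/2."""
    return (sensitivity + specificity) / 2

# Reported sensitivity/specificity pairs for in-hospital mortality,
# checked against the reported binary-cutoff AUROCs:
for name, sens, spec, reported in [
    ("qSOFA",       0.315, 0.934, 0.625),
    ("Shock Index", 0.458, 0.839, 0.648),
    ("NEWS2",       0.760, 0.520, 0.640),
]:
    assert abs(binary_auroc(sens, spec) - reported) < 0.001
```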
DISCUSSION
In this retrospective cohort study of 23,837 patients who presented to the ED with suspected sepsis, the standard qSOFA threshold was met least frequently, followed by the Shock Index and NEWS2. NEWS2 had the highest sensitivity but the lowest specificity for predicting in-hospital mortality and ED-to-ICU admission, making it a challenging bedside risk stratification scale for identifying patients at risk of poor clinical outcomes. When comparing predictive performance among the three scales, qSOFA had the highest specificity and the Shock Index had the highest AUROC for in-hospital mortality and ED-to-ICU admission in this cohort of patients with suspected sepsis. These trends in sensitivity, specificity, and AUROC were consistent among those who met EHR criteria for a sepsis diagnosis. In the analysis of the three scoring systems using all available cut-points, qSOFA and NEWS2 had the highest AUROCs, followed by the Shock Index.
Considering the rapid progression from organ dysfunction to death in sepsis patients, as well as the difficulty establishing a sepsis diagnosis at triage,23 providers must quickly identify patients at increased risk of poor outcomes when they present to the ED. Sepsis alerts often are built using SIRS criteria,27 including the one used for sepsis surveillance at UCSF since 2012,22 but the white blood cell count criterion is subject to a laboratory lag and could lead to a delay in identification. Implementation of a point-of-care bedside score alert that uses readily available clinical data could allow providers to identify patients at greatest risk of poor outcomes immediately at ED presentation and triage, which motivated us to explore the predictive performance of qSOFA, the Shock Index, and NEWS2.
Our study is the first to provide a head-to-head comparison of the predictive performance of qSOFA, the Shock Index, and NEWS2, three easy-to-calculate bedside risk scores that use EHR data collected among patients with suspected sepsis. The Sepsis-3 guidelines recommend qSOFA to quickly identify non-ICU patients at greatest risk of poor outcomes because the measure exhibited predictive performance similar to the more extensive SOFA score outside the ICU.16,23 Although some studies have confirmed qSOFA’s high predictive performance,28-31 our test characteristics and AUROC findings are in line with other published analyses.4,6,10,17 The UK National Health Service is using NEWS2 to screen for patients at risk of poor outcomes from sepsis. Several analyses that assessed the predictive ability of NEWS have reported estimates in line with our findings.4,10,32 The Shock Index was introduced in 1967 and provided a metric to evaluate hemodynamic stability based on heart rate and systolic blood pressure.33 The Shock Index has been studied in several contexts, including sepsis,34 and studies show that a sustained Shock Index is associated with increased odds of vasopressor administration, higher prevalence of hyperlactatemia, and increased risk of poor outcomes in the ICU.13,14
For our study, we were particularly interested in exploring how the Shock Index would compare with more frequently used severity scores such as qSOFA and NEWS2 among patients with suspected sepsis, given the simplicity of its calculation and the easy availability of required data. In our cohort of 23,837 patients, only 159 were missing a blood pressure measurement and only 71 were missing a heart rate. In contrast, both qSOFA and NEWS2 include an assessment of level of consciousness that can be subject to variability in assessment methods and EHR documentation across institutions.11 In our cohort, GCS within 30 minutes of ED presentation was missing in 72% of patients, which could have led to incomplete calculation of qSOFA and NEWS2 if a missing value was not actually within normal limits.
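Because the Shock Index needs only heart rate and systolic blood pressure, it can be computed wherever vitals are charted. A minimal sketch, with qSOFA for contrast: the Shock Index cut-point of ≥0.7 is a commonly cited threshold from the literature and is an assumption here (this article does not restate the value used); the qSOFA components are the standard Sepsis-3 criteria.

```python
def shock_index(heart_rate, sbp):
    """Shock Index = heart rate / systolic blood pressure (mm Hg)."""
    return heart_rate / sbp

def shock_index_positive(heart_rate, sbp, cutoff=0.7):
    # cutoff=0.7 is a common literature threshold (assumption, not from this text)
    return shock_index(heart_rate, sbp) >= cutoff

def qsofa(resp_rate, sbp, gcs):
    """qSOFA per Sepsis-3: one point each for RR >=22/min, SBP <=100 mm Hg,
    and GCS <15. Missing GCS is treated as normal, as in this study's methods."""
    if gcs is None:
        gcs = 15  # assume missing is normal
    return int(resp_rate >= 22) + int(sbp <= 100) + int(gcs < 15)
```

The standard positive cut-point for qSOFA is a score of ≥2, as discussed below.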
Several investigations relate qSOFA to NEWS but few compare qSOFA with the newer NEWS2, and even fewer evaluate the Shock Index with any of these scores.10,11,18,29,35-37 In general, studies have shown that NEWS exhibits a higher AUROC for predicting mortality, sepsis with organ dysfunction, and ICU admission, often as a composite outcome.4,11,18,37,38 A handful of studies compare the Shock Index to SIRS; however, little has been done to compare the Shock Index to qSOFA or NEWS2, scores that have been used specifically for sepsis and might be more predictive of poor outcomes than SIRS.33 In our study, the Shock Index had a higher AUROC than either qSOFA or NEWS2 for predicting in-hospital mortality and ED-to-ICU admission measured as separate outcomes and as a composite outcome using standard cut-points for these scores.
When selecting a severity score to apply in an institution, it is important to carefully evaluate the score’s test characteristics, in addition to considering the availability of reliable data. Tests with high sensitivity and NPV for the population being studied can be useful to rule out disease or risk of poor outcome, while tests with high specificity and PPV can be useful to rule in disease or risk of poor outcome.39 When considering specificity, qSOFA’s performance was superior to the Shock Index and NEWS2 in our study, but a small percentage of the population was identified using a cut-point of qSOFA ≥2. If we used qSOFA and applied this standard cut-point at our institution, we could be confident that those identified were at increased risk, but we would miss a significant number of patients who would experience a poor outcome. When considering sensitivity, performance of NEWS2 was superior to qSOFA and the Shock Index in our study, but one-half of the population was identified using a cut-point of NEWS2 ≥5. If we were to apply this standard NEWS2 cut-point at our institution, we would assume that one-half of our population was at risk, which might drive resource use towards patients who will not experience a poor outcome. Although none of the scores exhibited a robust AUROC measure, the Shock Index had the highest AUROC for in-hospital mortality and ED-to-ICU admission when using the standard binary cut-point, and its sensitivity and specificity is between that of qSOFA and NEWS2, potentially making it a score to use in settings where qSOFA and NEWS2 score components, such as altered mentation, are not reliably collected. Finally, our sensitivity analysis varying the binary cut-point of each score within our population demonstrated that the standard cut-points might not be as useful within a specific population and might need to be tailored for implementation, balancing sensitivity, specificity, PPV, and NPV to meet local priorities and ICU capacity.
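The sensitivity analysis described above, varying each score's binary cut-point, amounts to a simple threshold sweep. An illustrative sketch (the study's actual analysis was done in Stata; the data here are hypothetical):

```python
def sweep_cutpoints(scores, outcomes, cutpoints):
    """For each candidate cut-point, dichotomize the score and report
    sensitivity and specificity. scores and outcomes are parallel lists;
    outcomes are 0/1 (1 = poor outcome, eg, death or ED-to-ICU admission)."""
    rows = []
    for c in cutpoints:
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= c and y == 1)
        fn = sum(1 for s, y in zip(scores, outcomes) if s < c and y == 1)
        fp = sum(1 for s, y in zip(scores, outcomes) if s >= c and y == 0)
        tn = sum(1 for s, y in zip(scores, outcomes) if s < c and y == 0)
        rows.append({
            "cutpoint": c,
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        })
    return rows
```

An institution could run such a sweep on its own cohort and choose the cut-point whose sensitivity/specificity trade-off matches local priorities and ICU capacity.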
Our study has limitations. It is a single-center, retrospective analysis, which could reduce generalizability. However, it includes a large and diverse patient population spanning several years. Missing GCS data could have affected the predictive ability of qSOFA and NEWS2 in our cohort. We could not reliably perform imputation of GCS because of the high missingness and therefore assumed missing values were normal, as was done in the Sepsis-3 derivation studies.16 Previous studies have attempted to impute GCS and have not observed improved performance of qSOFA to predict mortality.40 Because manually collected variables such as GCS are less reliably documented in the EHR, there might be limitations in their use for triage risk scores.
Although the current analysis focused on the predictive performance of qSOFA, the Shock Index, and NEWS2 at triage, performance of these scores could affect the ED team’s treatment decisions before handoff to the hospitalist team and the expected level of care the patient will receive after inpatient admission. These tests also have the advantage of being easy to calculate at the bedside over time, which could provide an objective assessment of longitudinal predicted prognosis.
CONCLUSION
Local priorities should drive selection of a screening tool, balancing sensitivity, specificity, PPV, and NPV to achieve the institution’s goals. qSOFA, Shock Index, and NEWS2 are risk stratification tools that can be easily implemented at ED triage using data available at the bedside. Although none of these scores performed strongly when comparing AUROCs, qSOFA was highly specific for identifying patients with poor outcomes, and NEWS2 was the most sensitive for ruling out those at high risk among patients with suspected sepsis. The Shock Index exhibited a sensitivity and specificity that fell between qSOFA and NEWS2 and also might be considered to identify those at increased risk, given its ease of implementation, particularly in settings where altered mentation is unreliably or inconsistently documented.
Acknowledgment
The authors thank the UCSF Division of Hospital Medicine Data Core for their assistance with data acquisition.
Sepsis is the leading cause of in-hospital mortality in the United States.1 Sepsis is present on admission in 85% of cases, and each hour delay in antibiotic treatment is associated with 4% to 7% increased odds of mortality.2,3 Prompt identification and treatment of sepsis is essential for reducing morbidity and mortality, but identifying sepsis during triage is challenging.2
Risk stratification scores that rely solely on data readily available at the bedside have been developed to quickly identify those at greatest risk of poor outcomes from sepsis in real time. The quick Sequential Organ Failure Assessment (qSOFA) score, the National Early Warning System (NEWS2), and the Shock Index are easy-to-calculate measures that use routinely collected clinical data that are not subject to laboratory delay. These scores can be incorporated into electronic health record (EHR)-based alerts and can be calculated longitudinally to track the risk of poor outcomes over time. qSOFA was developed to quantify patient risk at bedside in non-intensive care unit (ICU) settings, but there is no consensus about its ability to predict adverse outcomes such as mortality and ICU admission.4-6 The United Kingdom’s National Health Service uses NEWS2 to identify patients at risk for sepsis.7 NEWS has been shown to have similar or better sensitivity in identifying poorer outcomes in sepsis patients compared with systemic inflammatory response syndrome (SIRS) criteria and qSOFA.4,8-11 However, since the latest update of NEWS2 in 2017, there has been little study of its predictive ability. The Shock Index is a simple bedside score (heart rate divided by systolic blood pressure) that was developed to detect changes in cardiovascular performance before systemic shock onset. Although it was not developed for infection and has not been regularly applied in the sepsis literature, the Shock Index might be useful for identifying patients at increased risk of poor outcomes. Patients with higher and sustained Shock Index scores are more likely to experience morbidity, such as hyperlactatemia, vasopressor use, and organ failure, and also have an increased risk of mortality.12-14
Although the predictive abilities of these bedside risk stratification scores have been assessed individually using standard binary cut-points, the comparative performance of qSOFA, the Shock Index, and NEWS2 has not been evaluated in patients presenting to an emergency department (ED) with suspected sepsis.
METHODS
Design and Setting
We conducted a retrospective cohort study of ED patients who presented with suspected sepsis to the University of California San Francisco (UCSF) Helen Diller Medical Center at Parnassus Heights between June 1, 2012, and December 31, 2018. Our institution is a 785-bed academic teaching hospital with approximately 30,000 ED encounters per year. The study was approved with a waiver of informed consent by the UCSF Human Research Protection Program.
Participants
We use an Epic-based EHR platform (Epic 2017, Epic Systems Corporation) for clinical care, which was implemented on June 1, 2012. All data elements were obtained from Clarity, the relational database that stores Epic’s inpatient data. The study included encounters for patients age ≥18 years who had blood cultures ordered within 24 hours of ED presentation and administration of intravenous antibiotics within 24 hours. Repeat encounters were treated independently in our analysis.
Outcomes and Measures
We compared the ability of qSOFA, the Shock Index, and NEWS2 to predict in-hospital mortality and admission to the ICU from the ED (ED-to-ICU admission). We used the
We compared demographic and clinical characteristics of patients who were positive for qSOFA, the Shock Index, and NEWS2. Demographic data were extracted from the EHR and included primary language, age, sex, and insurance status. All International Classification of Diseases (ICD)-9/10 diagnosis codes were pulled from Clarity billing tables. We used the Elixhauser comorbidity groupings19 of ICD-9/10 codes present on admission to identify preexisting comorbidities and underlying organ dysfunction. To estimate burden of comorbid illnesses, we calculated the validated van Walraven comorbidity index,20 which provides an estimated risk of in-hospital death based on documented Elixhauser comorbidities. Admission level of care (acute, stepdown, or intensive care) was collected for inpatient admissions to assess initial illness severity.21 We also evaluated discharge disposition and in-hospital mortality. Index blood culture results were collected, and dates and timestamps of mechanical ventilation, fluid, vasopressor, and antibiotic administration were obtained for the duration of the encounter.
UCSF uses an automated, real-time, algorithm-based severe sepsis alert that is triggered when a patient meets ≥2 SIRS criteria and again when the patient meets severe sepsis or septic shock criteria (ie, ≥2 SIRS criteria in addition to end-organ dysfunction and/or fluid nonresponsive hypotension). This sepsis screening alert was in use for the duration of our study.22
Statistical Analysis
We performed a subgroup analysis among those who were diagnosed with sepsis, according to the 2016 Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3) criteria.
All statistical analyses were conducted using Stata 14 (StataCorp). We summarized differences in demographic and clinical characteristics among the populations meeting each severity score but elected not to conduct hypothesis testing because patients could be positive for one or more scores. We calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for each score to predict in-hospital mortality and ED-to-ICU admission. To allow comparison with other studies, we also created a composite outcome of either in-hospital mortality or ED-to-ICU admission.
RESULTS
Within our sample 23,837 ED patients had blood cultures ordered within 24 hours of ED presentation and were considered to have suspected sepsis. The mean age of the cohort was 60.8 years, and 1,612 (6.8%) had positive blood cultures. A total of 12,928 patients (54.2%) were found to have sepsis. We documented 1,427 in-hospital deaths (6.0%) and 3,149 (13.2%) ED-to-ICU admissions. At ED triage 1,921 (8.1%) were qSOFA-positive, 4,273 (17.9%) were Shock Index-positive, and 11,832 (49.6%) were NEWS2-positive. At ED triage, blood pressure, heart rate, respiratory rate, and oxygen saturated were documented in >99% of patients, 93.5% had temperature documented, and 28.5% had GCS recorded. If the window of assessment was widened to 1 hour, GCS was only documented among 44.2% of those with suspected sepsis.
Demographic Characteristics and Clinical Course
qSOFA-positive patients received antibiotics more quickly than those who were Shock Index-positive or NEWS2-positive (median 1.5, 1.8, and 2.8 hours after admission, respectively). In addition, those who were qSOFA-positive were more likely to have a positive blood culture (10.9%, 9.4%, and 8.5%, respectively) and to receive an EHR-based diagnosis of sepsis (77.0%, 69.6%, and 60.9%, respectively) than those who were Shock Index- or NEWS2-positive. Those who were qSOFA-positive also were more likely to be mechanically ventilated during their hospital stay (25.4%, 19.2%, and 10.8%, respectively) and to receive vasopressors (33.5%, 22.5%, and 12.2%, respectively). In-hospital mortality also was more common among those who were qSOFA-positive at triage (23.4%, 15.3%, and 9.2%, respectively).
Because both qSOFA and NEWS2 incorporate GCS, we explored baseline characteristics of patients with GCS documented at triage (n = 6,794). These patients were older (median age 63 and 61 years, P < .0001), more likely to be male (54.9% and 53.4%, P = .0031), more likely to have renal failure (22.8% and 20.1%, P < .0001), more likely to have liver disease (14.2% and 12.8%, P = .006), had a higher van Walraven comorbidity score on presentation (median 10 and 8, P < .0001), and were more likely to go directly to the ICU from the ED (20.2% and 10.6%, P < .0001). However, among the 6,397 GCS scores documented at triage, only 1,579 (24.7%) were abnormal.
Test Characteristics of qSOFA, Shock Index, and NEWS2 for Predicting In-hospital Mortality and ED-to-ICU Admission
Among 23,837 patients with suspected sepsis, NEWS2 had the highest sensitivity for predicting in-hospital mortality (76.0%; 95% CI, 73.7%-78.2%) and ED-to-ICU admission (78.9%; 95% CI, 77.5%-80.4%) but had the lowest specificity for in-hospital mortality (52.0%; 95% CI, 51.4%-52.7%) and for ED-to-ICU admission (54.8%; 95% CI, 54.1%-55.5%) (Table 3). qSOFA had the lowest sensitivity for in-hospital mortality (31.5%; 95% CI, 29.1%-33.9%) and ED-to-ICU admission (29.3%; 95% CI, 27.7%-30.9%) but the highest specificity for in-hospital mortality (93.4%; 95% CI, 93.1%-93.8%) and ED-to-ICU admission (95.2%; 95% CI, 94.9%-95.5%). The Shock Index had a sensitivity that fell between qSOFA and NEWS2 for in-hospital mortality (45.8%; 95% CI, 43.2%-48.5%) and ED-to-ICU admission (49.2%; 95% CI, 47.5%-51.0%). The specificity of the Shock Index also was between qSOFA and NEWS2 for in-hospital mortality (83.9%; 95% CI, 83.4%-84.3%) and ED-to-ICU admission (86.8%; 95% CI, 86.4%-87.3%). All three scores exhibited relatively low PPV, ranging from 9.2% to 23.4% for in-hospital mortality and 21.0% to 48.0% for ED-to-ICU triage. Conversely, all three scores exhibited relatively high NPV, ranging from 95.5% to 97.1% for in-hospital mortality and 89.8% to 94.5% for ED-to-ICU triage.
When considering a binary cutoff, the Shock Index exhibited the highest AUROC for in-hospital mortality (0.648; 95% CI, 0.635-0.662) and had a significantly higher AUROC than qSOFA (AUROC, 0.625; 95% CI, 0.612-0.637; P = .0005), but there was no difference compared with NEWS2 (AUROC, 0.640; 95% CI, 0.628-0.652; P = .2112). NEWS2 had a significantly higher AUROC than qSOFA for predicting in-hospital mortality (P = .0227). The Shock Index also exhibited the highest AUROC for ED-to-ICU admission (0.680; 95% CI, 0.617-0.689), which was significantly higher than the AUROC for qSOFA (P < .0001) and NEWS2 (P = 0.0151). NEWS2 had a significantly higher AUROC than qSOFA for predicting ED-to-ICU admission (P < .0001). Similar findings were seen in patients found to have sepsis.
DISCUSSION
In this retrospective cohort study of 23,837 patients who presented to the ED with suspected sepsis, the standard qSOFA threshold was met least frequently, followed by the Shock Index and NEWS2. NEWS2 had the highest sensitivity but the lowest specificity for predicting in-hospital mortality and ED-to-ICU admission, making it a challenging bedside risk stratification scale for identifying patients at risk of poor clinical outcomes. When comparing predictive performance among the three scales, qSOFA had the highest specificity and the Shock Index had the highest AUROC for in-hospital mortality and ED-to-ICU admission in this cohort of patients with suspected sepsis. These trends in sensitivity, specificity, and AUROC were consistent among those who met EHR criteria for a sepsis diagnosis. In the analysis of the three scoring systems using all available cut-points, qSOFA and NEWS2 had the highest AUROCs, followed by the Shock Index.
Considering the rapid progression from organ dysfunction to death in sepsis patients, as well as the difficulty establishing a sepsis diagnosis at triage,23 providers must quickly identify patients at increased risk of poor outcomes when they present to the ED. Sepsis alerts often are built using SIRS criteria,27 including the one used for sepsis surveillance at UCSF since 2012,22 but the white blood cell count criterion is subject to a laboratory lag and could lead to a delay in identification. Implementation of a point-of-care bedside score alert that uses readily available clinical data could allow providers to identify patients at greatest risk of poor outcomes immediately at ED presentation and triage, which motivated us to explore the predictive performance of qSOFA, the Shock Index, and NEWS2.
Our study is the first to provide a head-to-head comparison of the predictive performance of qSOFA, the Shock Index, and NEWS2, three easy-to-calculate bedside risk scores that use EHR data collected among patients with suspected sepsis. The Sepsis-3 guidelines recommend qSOFA to quickly identify non-ICU patients at greatest risk of poor outcomes because the measure exhibited predictive performance similar to the more extensive SOFA score outside the ICU.16,23 Although some studies have confirmed qSOFA’s high predictive performance,28-31 our test characteristics and AUROC findings are in line with other published analyses.4,6,10,17 The UK National Health Service is using NEWS2 to screen for patients at risk of poor outcomes from sepsis. Several analyses that assessed the predictive ability of NEWS have reported estimates in line with our findings.4,10,32 The Shock Index was introduced in 1967 and provided a metric to evaluate hemodynamic stability based on heart rate and systolic blood pressure.33 The Shock Index has been studied in several contexts, including sepsis,34 and studies show that a sustained Shock Index is associated with increased odds of vasopressor administration, higher prevalence of hyperlactatemia, and increased risk of poor outcomes in the ICU.13,14
For our study, we were particularly interested in exploring how the Shock Index would compare with more frequently used severity scores such as qSOFA and NEWS2 among patients with suspected sepsis, given the simplicity of its calculation and the easy availability of required data. In our cohort of 23,837 patients, only 159 people had missing blood pressure and only 71 had omitted heart rate. In contrast, both qSOFA and NEWS2 include an assessment of level of consciousness that can be subject to variability in assessment methods and EHR documentation across institutions.11 In our cohort, GCS within 30 minutes of ED presentation was missing in 72 patients, which could have led to incomplete calculation of qSOFA and NEWS2 if a missing value was not actually within normal limits.
Several investigations relate qSOFA to NEWS but few compare qSOFA with the newer NEWS2, and even fewer evaluate the Shock Index with any of these scores.10,11,18,29,35-37 In general, studies have shown that NEWS exhibits a higher AUROC for predicting mortality, sepsis with organ dysfunction, and ICU admission, often as a composite outcome.4,11,18,37,38 A handful of studies compare the Shock Index to SIRS; however, little has been done to compare the Shock Index to qSOFA or NEWS2, scores that have been used specifically for sepsis and might be more predictive of poor outcomes than SIRS.33 In our study, the Shock Index had a higher AUROC than either qSOFA or NEWS2 for predicting in-hospital mortality and ED-to-ICU admission measured as separate outcomes and as a composite outcome using standard cut-points for these scores.
When selecting a severity score to apply in an institution, it is important to carefully evaluate the score’s test characteristics, in addition to considering the availability of reliable data. Tests with high sensitivity and NPV for the population being studied can be useful to rule out disease or risk of poor outcome, while tests with high specificity and PPV can be useful to rule in disease or risk of poor outcome.39 When considering specificity, qSOFA’s performance was superior to the Shock Index and NEWS2 in our study, but a small percentage of the population was identified using a cut-point of qSOFA ≥2. If we used qSOFA and applied this standard cut-point at our institution, we could be confident that those identified were at increased risk, but we would miss a significant number of patients who would experience a poor outcome. When considering sensitivity, performance of NEWS2 was superior to qSOFA and the Shock Index in our study, but one-half of the population was identified using a cut-point of NEWS2 ≥5. If we were to apply this standard NEWS2 cut-point at our institution, we would assume that one-half of our population was at risk, which might drive resource use towards patients who will not experience a poor outcome. Although none of the scores exhibited a robust AUROC measure, the Shock Index had the highest AUROC for in-hospital mortality and ED-to-ICU admission when using the standard binary cut-point, and its sensitivity and specificity is between that of qSOFA and NEWS2, potentially making it a score to use in settings where qSOFA and NEWS2 score components, such as altered mentation, are not reliably collected. Finally, our sensitivity analysis varying the binary cut-point of each score within our population demonstrated that the standard cut-points might not be as useful within a specific population and might need to be tailored for implementation, balancing sensitivity, specificity, PPV, and NPV to meet local priorities and ICU capacity.
Our study has limitations. It is a single-center, retrospective analysis, factors that could reduce generalizability; however, it includes a large and diverse patient population spanning several years. Missing GCS data could have affected the predictive ability of qSOFA and NEWS2 in our cohort. We could not reliably impute GCS because of the high degree of missingness, so we assumed missing values were normal, as was done in the Sepsis-3 derivation studies.16 Previous studies that imputed GCS did not observe improved performance of qSOFA in predicting mortality.40 Because manually collected variables such as GCS are less reliably documented in the EHR, their use in triage risk scores may be limited.
Although the current analysis focused on the predictive performance of qSOFA, the Shock Index, and NEWS2 at triage, these scores could also inform the ED team’s treatment decisions before handoff to the hospitalist team and the expected level of care the patient will receive after inpatient admission. These scores also have the advantage of being easy to recalculate at the bedside over time, which could provide an objective, longitudinal assessment of predicted prognosis.
CONCLUSION
Local priorities should drive selection of a screening tool, balancing sensitivity, specificity, PPV, and NPV to achieve the institution’s goals. qSOFA, the Shock Index, and NEWS2 are risk stratification tools that can be easily implemented at ED triage using data available at the bedside. Although none of these scores performed strongly when comparing AUROCs, qSOFA was highly specific for identifying patients with poor outcomes, and NEWS2 was the most sensitive for ruling out those at high risk among patients with suspected sepsis. The Shock Index exhibited a sensitivity and specificity that fell between those of qSOFA and NEWS2 and might also be considered for identifying those at increased risk, given its ease of implementation, particularly in settings where altered mentation is unreliably or inconsistently documented.
Acknowledgment
The authors thank the UCSF Division of Hospital Medicine Data Core for their assistance with data acquisition.
1. Jones SL, Ashton CM, Kiehne LB, et al. Outcomes and resource use of sepsis-associated stays by presence on admission, severity, and hospital type. Med Care. 2016;54(3):303-310. https://doi.org/10.1097/MLR.0000000000000481
2. Seymour CW, Gesten F, Prescott HC, et al. Time to treatment and mortality during mandated emergency care for sepsis. N Engl J Med. 2017;376(23):2235-2244. https://doi.org/10.1056/NEJMoa1703058
3. Kumar A, Roberts D, Wood KE, et al. Duration of hypotension before initiation of effective antimicrobial therapy is the critical determinant of survival in human septic shock. Crit Care Med. 2006;34(6):1589-1596. https://doi.org/10.1097/01.CCM.0000217961.75225.E9
4. Churpek MM, Snyder A, Sokol S, Pettit NN, Edelson DP. Investigating the impact of different suspicion of infection criteria on the accuracy of Quick Sepsis-Related Organ Failure Assessment, Systemic Inflammatory Response Syndrome, and Early Warning Scores. Crit Care Med. 2017;45(11):1805-1812. https://doi.org/10.1097/CCM.0000000000002648
5. Abdullah SMOB, Sørensen RH, Dessau RBC, Sattar SMRU, Wiese L, Nielsen FE. Prognostic accuracy of qSOFA in predicting 28-day mortality among infected patients in an emergency department: a prospective validation study. Emerg Med J. 2019;36(12):722-728. https://doi.org/10.1136/emermed-2019-208456
6. Kim KS, Suh GJ, Kim K, et al. Quick Sepsis-related Organ Failure Assessment score is not sensitive enough to predict 28-day mortality in emergency department patients with sepsis: a retrospective review. Clin Exp Emerg Med. 2019;6(1):77-83. https://doi.org/10.15441/ceem.17.294
7. National Early Warning Score (NEWS) 2: Standardising the assessment of acute-illness severity in the NHS. Royal College of Physicians; 2017.
8. Brink A, Alsma J, Verdonschot RJCG, et al. Predicting mortality in patients with suspected sepsis at the emergency department: a retrospective cohort study comparing qSOFA, SIRS and National Early Warning Score. PLoS One. 2019;14(1):e0211133. https://doi.org/10.1371/journal.pone.0211133
9. Redfern OC, Smith GB, Prytherch DR, Meredith P, Inada-Kim M, Schmidt PE. A comparison of the Quick Sequential (Sepsis-Related) Organ Failure Assessment Score and the National Early Warning Score in non-ICU patients with/without infection. Crit Care Med. 2018;46(12):1923-1933. https://doi.org/10.1097/CCM.0000000000003359
10. Churpek MM, Snyder A, Han X, et al. Quick Sepsis-related Organ Failure Assessment, Systemic Inflammatory Response Syndrome, and Early Warning Scores for detecting clinical deterioration in infected patients outside the intensive care unit. Am J Respir Crit Care Med. 2017;195(7):906-911. https://doi.org/10.1164/rccm.201604-0854OC
11. Goulden R, Hoyle MC, Monis J, et al. qSOFA, SIRS and NEWS for predicting inhospital mortality and ICU admission in emergency admissions treated as sepsis. Emerg Med J. 2018;35(6):345-349. https://doi.org/10.1136/emermed-2017-207120
12. Biney I, Shepherd A, Thomas J, Mehari A. Shock Index and outcomes in patients admitted to the ICU with sepsis. Chest. 2015;148(suppl 4):337A. https://doi.org/10.1378/chest.2281151
13. Wira CR, Francis MW, Bhat S, Ehrman R, Conner D, Siegel M. The shock index as a predictor of vasopressor use in emergency department patients with severe sepsis. West J Emerg Med. 2014;15(1):60-66. https://doi.org/10.5811/westjem.2013.7.18472
14. Berger T, Green J, Horeczko T, et al. Shock index and early recognition of sepsis in the emergency department: pilot study. West J Emerg Med. 2013;14(2):168-174. https://doi.org/10.5811/westjem.2012.8.11546
15. Middleton DJ, Smith TO, Bedford R, Neilly M, Myint PK. Shock Index predicts outcome in patients with suspected sepsis or community-acquired pneumonia: a systematic review. J Clin Med. 2019;8(8):1144. https://doi.org/10.3390/jcm8081144
16. Seymour CW, Liu VX, Iwashyna TJ, et al. Assessment of clinical criteria for sepsis: for the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA. 2016;315(8):762-774. https://doi.org/10.1001/jama.2016.0288
17. Abdullah S, Sørensen RH, Dessau RBC, Sattar S, Wiese L, Nielsen FE. Prognostic accuracy of qSOFA in predicting 28-day mortality among infected patients in an emergency department: a prospective validation study. Emerg Med J. 2019;36(12):722-728. https://doi.org/10.1136/emermed-2019-208456
18. Usman OA, Usman AA, Ward MA. Comparison of SIRS, qSOFA, and NEWS for the early identification of sepsis in the Emergency Department. Am J Emerg Med. 2018;37(8):1490-1497. https://doi.org/10.1016/j.ajem.2018.10.058
19. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Med Care. 1998;36(1):8-27. https://doi.org/10.1097/00005650-199801000-00004
20. van Walraven C, Austin PC, Jennings A, Quan H, Forster AJ. A modification of the Elixhauser comorbidity measures into a point system for hospital death using administrative data. Med Care. 2009;47(6):626-633. https://doi.org/10.1097/MLR.0b013e31819432e5
21. Prin M, Wunsch H. The role of stepdown beds in hospital care. Am J Respir Crit Care Med. 2014;190(11):1210-1216. https://doi.org/10.1164/rccm.201406-1117PP
22. Narayanan N, Gross AK, Pintens M, Fee C, MacDougall C. Effect of an electronic medical record alert for severe sepsis among ED patients. Am J Emerg Med. 2016;34(2):185-188. https://doi.org/10.1016/j.ajem.2015.10.005
23. Singer M, Deutschman CS, Seymour CW, et al. The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA. 2016;315(8):801-810. https://doi.org/10.1001/jama.2016.0287
24. Rhee C, Dantes R, Epstein L, et al. Incidence and trends of sepsis in US hospitals using clinical vs claims data, 2009-2014. JAMA. 2017;318(13):1241-1249. https://doi.org/10.1001/jama.2017.13836
25. Safari S, Baratloo A, Elfil M, Negida A. Evidence based emergency medicine; part 5 receiver operating curve and area under the curve. Emerg (Tehran). 2016;4(2):111-113.
26. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837-845.
27. Kangas C, Iverson L, Pierce D. Sepsis screening: combining Early Warning Scores and SIRS Criteria. Clin Nurs Res. 2021;30(1):42-49. https://doi.org/10.1177/1054773818823334
28. Freund Y, Lemachatti N, Krastinova E, et al. Prognostic accuracy of Sepsis-3 Criteria for in-hospital mortality among patients with suspected infection presenting to the emergency department. JAMA. 2017;317(3):301-308. https://doi.org/10.1001/jama.2016.20329
29. Finkelsztein EJ, Jones DS, Ma KC, et al. Comparison of qSOFA and SIRS for predicting adverse outcomes of patients with suspicion of sepsis outside the intensive care unit. Crit Care. 2017;21(1):73. https://doi.org/10.1186/s13054-017-1658-5
30. Canet E, Taylor DM, Khor R, Krishnan V, Bellomo R. qSOFA as predictor of mortality and prolonged ICU admission in Emergency Department patients with suspected infection. J Crit Care. 2018;48:118-123. https://doi.org/10.1016/j.jcrc.2018.08.022
31. Anand V, Zhang Z, Kadri SS, Klompas M, Rhee C; CDC Prevention Epicenters Program. Epidemiology of Quick Sequential Organ Failure Assessment criteria in undifferentiated patients and association with suspected infection and sepsis. Chest. 2019;156(2):289-297. https://doi.org/10.1016/j.chest.2019.03.032
32. Hamilton F, Arnold D, Baird A, Albur M, Whiting P. Early Warning Scores do not accurately predict mortality in sepsis: A meta-analysis and systematic review of the literature. J Infect. 2018;76(3):241-248. https://doi.org/10.1016/j.jinf.2018.01.002
33. Koch E, Lovett S, Nghiem T, Riggs RA, Rech MA. Shock Index in the emergency department: utility and limitations. Open Access Emerg Med. 2019;11:179-199. https://doi.org/10.2147/OAEM.S178358
34. Yussof SJ, Zakaria MI, Mohamed FL, Bujang MA, Lakshmanan S, Asaari AH. Value of Shock Index in prognosticating the short-term outcome of death for patients presenting with severe sepsis and septic shock in the emergency department. Med J Malaysia. 2012;67(4):406-411.
35. Siddiqui S, Chua M, Kumaresh V, Choo R. A comparison of pre ICU admission SIRS, EWS and q SOFA scores for predicting mortality and length of stay in ICU. J Crit Care. 2017;41:191-193. https://doi.org/10.1016/j.jcrc.2017.05.017
36. Costa RT, Nassar AP, Caruso P. Accuracy of SOFA, qSOFA, and SIRS scores for mortality in cancer patients admitted to an intensive care unit with suspected infection. J Crit Care. 2018;45:52-57. https://doi.org/10.1016/j.jcrc.2017.12.024
37. Mellhammar L, Linder A, Tverring J, et al. NEWS2 is Superior to qSOFA in detecting sepsis with organ dysfunction in the emergency department. J Clin Med. 2019;8(8):1128. https://doi.org/10.3390/jcm8081128
38. Szakmany T, Pugh R, Kopczynska M, et al. Defining sepsis on the wards: results of a multi-centre point-prevalence study comparing two sepsis definitions. Anaesthesia. 2018;73(2):195-204. https://doi.org/10.1111/anae.14062
39. Newman TB, Kohn MA. Evidence-Based Diagnosis: An Introduction to Clinical Epidemiology. Cambridge University Press; 2009.
40. Askim Å, Moser F, Gustad LT, et al. Poor performance of quick-SOFA (qSOFA) score in predicting severe sepsis and mortality - a prospective study of patients admitted with infection to the emergency department. Scand J Trauma Resusc Emerg Med. 2017;25(1):56. https://doi.org/10.1186/s13049-017-0399-4
© 2021 Society of Hospital Medicine
Drawing Down From Crisis: More Lessons From a Soldier
Last year, I wrote an article for the Journal of Hospital Medicine offering tips to healthcare providers in what was then an expanding COVID-19 environment.1 These lessons were drawn from my experiences during the “tough fights” and crisis situations of my military career, situations similar to what healthcare providers experienced during the pandemic.
Now, as vaccination rates rise and hospitalization rates fall, the nation and healthcare profession begin the transition to “normalcy.” What should healthcare professionals expect as they transition from a year of operating in a crisis to resumption of the habitual? What memories and lessons will linger from a long, tough fight against COVID-19, and how might physicians best approach the many post-crisis challenges they will surely face?
My military experiences inform the tips I offer to those in the medical profession. Both professions depend on adeptly leading and building a functional and effective organizational culture under trying circumstances. It may seem strange, but the challenges healthcare workers (HCWs) faced in fighting COVID-19 are comparable to what soldiers experience on a battlefield. And now, as citizens return to “normal” (however normal is defined), only naïve HCWs will believe they can simply resume their previous habits and practices. This part of the journey will present new challenges and unique opportunities.
Healthcare has changed…and so have you! Just like soldiers coming home from the battlefield face a necessarily new and different world, HCWs will also face changing circumstances, environments, and organizational requirements. Given this new landscape, I offer some of my lessons learned coming out of combat to help you adapt.
REFLECTIONS
Heading home from my last combat tour in Iraq, I found myself gazing out the aircraft window and pondering my personal experiences during a very long combat tour commanding a multinational task force. Pulling out my green soldier’s notebook, I rapidly scratched out some reflections on where I was, what I had learned, and what I needed to address personally and professionally. In talking with physicians at the healthcare organization where I now work, I find this emotional checklist mirrors some of the same thoughts they face coming out of the COVID-19 crisis.
Expect exhaustion. There’s a military axiom that “fatigue can make cowards of us all,” and while I don’t think I had succumbed to cowardice in battle, after 15 months in combat I was exhausted. Commanders in combat—or HCWs fighting a pandemic—face unrelenting demands from a variety of audiences. Leaders are asked to solve unsolvable problems, be at the right place at the right time with the right answers, have more energy than others, be upbeat, and exhibit behaviors that will motivate the “troops.” That’s true even if they’re exhausted and weary to the bone, serving on multiple teams, and attending endless meetings. There is also the common and unfortunate expectation that leaders should not take any time for themselves.
During the pandemic, most HCWs reported sleeping less, having little time to interact casually with others, and having less time for personal reflection, exercise, personal growth, or even prayer. My solution for addressing exhaustion was to develop a personal plan to address each one of these areas—mental, emotional, physical, spiritual—with a detailed rest and recovery strategy. I wrote my plan down, knowing that I would need to discuss this blueprint with both my employer and my spouse, who I suspected would have different ideas on what my schedule should look like after returning “home.” Healthcare providers have been through the same kinds of stresses and need to ask themselves: What recovery plan have I designed to help me overcome the fatigue I feel, and have I talked about this plan with the people who will be affected by it?
Take pride in what your teams accomplished. I was proud of how my teams had accomplished the impossible and how they had adapted to continually changing situations. Whenever military organizations know they’ll face the enemy in combat, they feel heightened anxiety, increased fear, and concern about the preparedness of their team. The Army, like any successful team, attempts to mitigate those emotions through training. During my reflections, I remembered the teams that came together to accomplish very tough missions. Some of those teams were those I had concerns about prior to deployment, but fortunately they often surprised me with their adaptability and successes in combat.
Leaders in healthcare can likely relate. Even in normal situations, organizational fault lines exist between physicians, nurses, and administrators. These fault lines may manifest as communication disconnects and distrust between members who differ in training, culture, or role within the organization. But during a crisis, rifts dissipate and trust evolves as different cultures are forced to work together. Many healthcare organizations report that, during the COVID crisis, most personality conflicts, communication disconnects, and organizational dysfunctions receded, and organizations saw greater coordination and collaboration. Extensive research on leadership demonstrates that crises drive teams to communicate better and become more effective and efficient in accomplishing stated goals, resulting in team members who relish “being there” for one another like never before. These positive changes must be reinforced to ensure these newly formed high-performing teams do not revert to working in silos, a regression that is usually driven by distrust.
Just as important as pride in teams is the pride in the accomplishment of specific individuals during times of crisis. Diverse members of any organization deliver some of the best solutions to the toughest problems when they are included in the discussion, allowed to bring their ideas to the table, and rewarded for their actions (and their courage)! Just one example is given by Dr Sasha Shillcut as she describes the innovations and adaptations of the women physicians she observed in her organization during the COVID-19 crisis,2 and there are many examples of other organizations citing similar transformation in areas like telemedicine, emergency department procedures, and equipment design and use.3,4
Anticipate “survivor’s guilt.” During my three combat tours, 253 soldiers under my command or in my organization sacrificed their lives for the mission, and many more were wounded in action. There are times when bad dreams remind me of some of the circumstances surrounding the incidents that took the lives of those who died, and I often wake with a start and in a sweat. The first question I always ask myself in the middle of the night when this happens is, “Why did they die, and why did I survive?” That question is always followed by, “What might I have done differently to prevent those deaths?”
As we draw down from treating patients during the COVID-19 crisis, healthcare providers must also be wary of “survivor’s guilt.” Survivor’s guilt is a strong emotion for anyone who has survived a crisis, especially when their friends or loved ones have not. Healthcare providers have lost many patients, but they have also lost colleagues, friends, and family members. Because you are in the healing profession, many of you will question what more you could have done to prevent the loss of life. You likely won’t ever be completely satisfied with the answer, but I have a recommendation that may assuage your emotions.
In combat, we continually memorialized our fallen comrades in ceremonies that are attended by the entire unit. One of my commanders had an idea to keep pictures of those who had made the ultimate sacrifice, and on my desk is a box with the 253 pictures of those dedicated individuals who were killed in action under my command or in my unit. On the top of the box are the words “Make It Matter.” I look at those pictures often to remember them and their selfless service to the nation, and I often ask myself whether I am “making it matter” in my daily activities. Does your healthcare facility have plans for a memorial service for all those who died while in your care? Is there a special tribute in your hospital to those healthcare providers who paid the ultimate sacrifice in caring for patients? Most importantly, have you rededicated yourself to your profession, knowing that what you learned during the pandemic will help you be a better physician in the future, and do you have the knowledge that you are making a meaningful difference every day you serve in healthcare?
Relish being home. On that flight back to family, my excitement was palpable. But there were challenges too, as I knew I had to continue to focus on my team, my organization, and my profession. While images on the internet often show soldiers returning from war rushing into the arms of their loved ones, soldiers never leave the demands associated with wearing the cloth of the country. As a result, many marriages and families are damaged when one member who has been so singularly focused returns home and is still caught up in the demands of the job. They find it is difficult to pick up where they’ve left off, forgetting their family has also been under a different kind of intense stress.
These same challenges will face HCWs. Many of you voluntarily distanced yourself from family and friends due to a fear of transmitting the disease. Spouses and children underwent traumatic challenges in their jobs, holding together the household and piloting kids through schooling. My biggest recommendation is this: strive for a return to a healthy balance, be wary of any sharp edges that appear in your personality or in your relationships, and be open in communicating with those you love. Relying on friends, counselors, and mentors who can provide trusted advice—as well as therapy, if necessary—is not a sign of weakness, but a sign of strength and courage. The pandemic has affected our lives more than we can imagine, and “coming out” of the crisis will continue to test our humanity and civility like never before. Trust me on this one. I’ve been there.
RECOMMENDATIONS FOR POST-CRISIS ACTIONS
These reflections point to issues physicians must address in the months after their “redeployment” from dealing with the pandemic. When soldiers redeploy from combat, every unit develops a plan to address personal and professional growth for individual members of the team. Additionally, leaders develop a plan to sustain performance and improve teams and organizational approaches. The objective? Polish the diamond from what we learned during the crisis, while preparing for those things that might detract from effectiveness in future crises. It’s an SOP (standard operating procedure) for military units to do these things. Is this approach also advisable for healthcare professionals and teams in responding to crises?
Crises increase stress on individuals and disrupt the functioning of organizations, but crises also provide phenomenal opportunities for growth.5 Adaptive organizations, be they military or healthcare, must take time to understand how the crises affected people and the organizational framework, while also preparing for potential future disruptions. While HCWs and their respective organizations are usually adept at learning from short-term emergencies (eg, limited disease outbreaks, natural disasters, mass-casualty events), they are less practiced in addressing crises that affect the profession for months. It has been a century since the medical profession has been faced with a global pandemic, but experts suggest other pandemics may be on the short-term horizon.6 We ought to use this past year of experiences to prepare for them.
Pay attention to your personal needs and the conditions of others on your team. After returning from combat, I was exhausted and stressed intellectually, physically, emotionally, and spiritually. From what I’ve seen, healthcare providers fit that same description, and the fatigue is palpable. Many of you have experienced extreme stress. I have experienced extremepost-traumatic stress, and it is important to understand that this will affect some on your team.7 In addition to addressing stress—and this is advice I give to all the physicians I know—find the time to get a physical examination. While the Army requires yearly physicals for all soldiers (especially generals!), most healthcare providers I know are shockingly deficient in taking the time to get a checkup from one of their colleagues. Commit to fixing that.
Reflect on what you have learned during this period. Take an afternoon with an adult beverage (if that’s your style) and reflect on what you learned and what others might learn from your unique experiences. Then, take some notes and shape your ideas. What did you experience? What adaptations did you or your team make during the pandemic? What worked and what didn’t? What things do you want to sustain in your practice and what things do you want to eliminate? What did you learn about the medical arts…or even about your Hippocratic Oath? If you have a mentor, share these thoughts with them; if you don’t have a mentor, find one and then share your thoughts with them. Get some outside feedback.
Assess team strengths and weaknesses. If you’re a formal physician leader (someone with a title and a position on your team), it’s your responsibility to provide feedback on both people and processes. If you’re an informal leader (someone who is a member of the team but doesn’t have specific leadership responsibilities outside your clinical role) and you don’t see this happening, volunteer to run the session for your formal leader and your organization. This session should last several hours and be held in a comfortable setting. You should prepare your team so they aren’t defensive about the points that may arise. Determine strengths and opportunities by asking for feedback on communication, behaviors, medical knowledge, emotional intelligence, and execution of tasks. Determine which processes and systems either worked or didn’t work, and either polish the approaches or drive change to improve systems as you get back to normal. Crises provide an opportunity to fix what’s broken while also reinforcing the things that worked in the crisis that might not be normal procedure. Don’t go back to old ways if those weren’t the things or the approaches you were using under critical conditions.
Encourage completion of an organization-wide after-action review (AAR). As I started writing this article, I watched CNN’s Dr Sanjay Gupta conduct a review of actions with the key physicians who contributed to the last administration’s response to the pandemic. In watching that session—and having conducted hundreds of AARs in my military career—there was discussion of obvious good and bad leadership and management procedures, process issues that needed to be addressed, and decision-making that might be applauded or questioned. Every healthcare organization ought to conduct a similar AAR, with a review of the most important aspects of actions and teamwork, the hospital’s operations, logistical preparation, and leader and organization procedures that demand to be addressed.
The successful conduct of any AAR requires asking (and getting answers to) four questions: What happened?; Why did it happen the way it did?; What needs to be fixed or “polished” in the processes, systems, or leadership approach?; and Who is responsible for ensuring the fixes or adjustments occur? The facilitator (and the key leaders of the organization) must ask the right questions, must be deeply involved in getting the right people to comment on the issues, and must “pin the rose” on someone who will be responsible for carrying through on the fixes. At the end of the AAR, after the key topics are discussed, with a plan for addressing each, the person in charge of the organization must publish an action plan with details for ensuring the fixes.
Like all citizens across our nation, my family is grateful for the skill and professionalism exhibited by clinicians and healthcare providers during this devastating pandemic. While we are all breathing a sigh of relief as we see the end in sight, true professionals must take the opportunity to learn and grow from this crisis and adapt. Hopefully, the reflections and recommendations in this article—things I learned from a different profession—will provide ideas to my new colleagues in healthcare.
1. Hertling M. Ten tips for a crisis: lessons from a soldier. J Hosp Med. 2020;15(5): 275-276. https://doi.org/10.12788/jhm.3424
2. Shillcutt S. The inspiring women physicians of the COVID-19 pandemic. MedPage Today. April 9, 2020. Accessed July 7, 2021. https://www.kevinmd.com/blog/2020/04/the-insiring-women-physicians-of-the-covid-19-pandemic.html
3. Daley B. Three medical innovations fueled by COVID-19 that will outlast the pandemic. The Conversation. March 9, 2021. Accessed July 7, 2021. https://theconversation.com/3-medical-innovations-fueled-by-covid-19-that-will-outlast-the-pandemic-156464
4. Drees J, Dyrda L, Adams K. Ten big advancements in healthcare tech during the pandemic. Becker’s Health IT. July 6, 2020. Accessed July 7, 2021. https://www.beckershospitalreview.com/digital-transformation/10-big-advancements-in-healthcare-tech-during-the-pandemic.html
5. Wang J. Developing organizational learning capacity in crisis management. Adv Dev Hum Resour. 2008;10(3):425-445. https://doi.org/10.1177/1523422308316464
6. Morens DM, Fauci AS. Emerging pandemic diseases: how we got COVID-19. Cell. 2020;182(5):1077-1092. https://doi.org/10.1016/j.cell.2020.08.021
7. What is posttraumatic stress disorder? American Psychiatric Association. Reviewed August 2020. Accessed July 7, 2021. https://www.psychiatry.org/patients-families/ptsd/what-is-ptsd
Last year, I wrote an article for the Journal of Hospital Medicine offering tips to healthcare providers in what was then an expanding COVID-19 environment.1 These lessons were drawn from my experiences during the “tough fights” and crisis situations of my military career, situations similar to what healthcare providers experienced during the pandemic.
Now, as vaccination rates rise and hospitalization rates fall, the nation and healthcare profession begin the transition to “normalcy.” What should healthcare professionals expect as they transition from a year of operating in a crisis to resumption of the habitual? What memories and lessons will linger from a long, tough fight against COVID-19, and how might physicians best approach the many post-crisis challenges they will surely face?
My military experiences inform the tips I offer to those in the medical profession. Both professions depend on adeptly leading and building a functional and effective organizational culture under trying circumstances. It may seem strange, but the challenges healthcare workers (HCWs) faced in fighting COVID-19 are comparable to what soldiers experience on a battlefield. And now, as citizens return to “normal” (however normal is defined), only naïve HCWs will believe they can simply resume their previous habits and practices. This part of the journey will present new challenges and unique opportunities.
Healthcare has changed…and so have you! Just as soldiers coming home from the battlefield face a new and different world, HCWs will also face changing circumstances, environments, and organizational requirements. Given this new landscape, I offer some of my lessons learned coming out of combat to help you adapt.
REFLECTIONS
Heading home from my last combat tour in Iraq, I found myself gazing out the aircraft window and pondering my personal experiences during a very long combat tour commanding a multinational task force. Pulling out my green soldier’s notebook, I rapidly scratched out some reflections on where I was, what I had learned, and what I needed to address personally and professionally. In talking with physicians in the healthcare organization where I now work, this emotional checklist seems to mirror some of the same thoughts they face coming out of the COVID-19 crisis.
Expect exhaustion. There’s a military axiom that “fatigue can make cowards of us all,” and while I don’t think I had succumbed to cowardice in battle, after 15 months in combat I was exhausted. Commanders in combat—or HCWs fighting a pandemic—face unrelenting demands from a variety of audiences. Leaders are asked to solve unsolvable problems, be at the right place at the right time with the right answers, have more energy than others, be upbeat, and exhibit behaviors that will motivate the “troops.” That’s true even if they’re exhausted and weary to the bone, serving on multiple teams, and attending endless meetings. There is also the common and unfortunate expectation that leaders should not take any time for themselves.
During the pandemic, most HCWs reported sleeping less, having little time to interact casually with others, and having less time for personal reflection, exercise, personal growth, or even prayer. My solution for addressing exhaustion was to develop a personal plan to address each one of these areas—mental, emotional, physical, spiritual—with a detailed rest and recovery strategy. I wrote my plan down, knowing that I would need to discuss this blueprint with both my employer and my spouse, who I suspected would have different ideas on what my schedule should look like after returning “home.” Healthcare providers have been through the same kinds of stresses and need to ask themselves: What recovery plan have I designed to help me overcome the fatigue I feel, and have I talked about this plan with the people who will be affected by it?
Take pride in what your teams accomplished. I was proud of how my teams had accomplished the impossible and how they had adapted to continually changing situations. Whenever military organizations know they’ll face the enemy in combat, they feel heightened anxiety, increased fear, and concern about the preparedness of their team. The Army, like any successful team, attempts to mitigate those emotions through training. During my reflections, I remembered the teams that came together to accomplish very tough missions. Some of those teams were those I had concerns about prior to deployment, but fortunately they often surprised me with their adaptability and successes in combat.
Leaders in healthcare can likely relate. Even in normal situations, organizational fault lines exist between physicians, nurses, and administrators. These fault lines may manifest as communication disconnects and distrust among members who differ in training, culture, or role within the organization. But during a crisis, rifts dissipate and trust grows as different cultures are forced to work together. Many healthcare organizations report that, during the COVID-19 crisis, most personality conflicts, communication disconnects, and organizational dysfunctions receded, and coordination and collaboration improved. Extensive research on leadership demonstrates that crises drive teams to communicate better and accomplish stated goals more effectively and efficiently, resulting in team members who relish “being there” for one another like never before. These positive changes must be reinforced so that newly formed high-performing teams do not revert to the work silos bred by distrust.
Just as important as pride in teams is pride in the accomplishments of specific individuals during times of crisis. Diverse members of any organization deliver some of the best solutions to the toughest problems when they are included in the discussion, allowed to bring their ideas to the table, and rewarded for their actions (and their courage)! Just one example is given by Dr Sasha Shillcutt as she describes the innovations and adaptations of the women physicians she observed in her organization during the COVID-19 crisis,2 and there are many examples of other organizations citing similar transformation in areas like telemedicine, emergency department procedures, and equipment design and use.3,4
Anticipate “survivor’s guilt.” During my three combat tours, 253 soldiers under my command or in my organization sacrificed their lives for the mission, and many more were wounded in action. There are times when bad dreams remind me of some of the circumstances surrounding the incidents that took the lives of those who died, and I often wake with a start and in a sweat. The first question I always ask myself in the middle of the night when this happens is, “Why did they die, and why did I survive?” That question is always followed by, “What might I have done differently to prevent those deaths?”
As we draw down from treating patients during the COVID-19 crisis, healthcare providers must also be wary of “survivor’s guilt.” Survivor’s guilt is a strong emotion for anyone who has survived a crisis, especially when their friends or loved ones have not. Healthcare providers have lost many patients, but they have also lost colleagues, friends, and family members. Because you are in the healing profession, many of you will question what more you could have done to prevent the loss of life. You likely won’t ever be completely satisfied with the answer, but I have a recommendation that may assuage your emotions.
In combat, we continually memorialized our fallen comrades in ceremonies attended by the entire unit. One of my commanders had an idea to keep pictures of those who had made the ultimate sacrifice, and on my desk is a box with the 253 pictures of those dedicated individuals who were killed in action under my command or in my unit. On the top of the box are the words “Make It Matter.” I look at those pictures often to remember them and their selfless service to the nation, and I often ask myself whether I am “making it matter” in my daily activities. Does your healthcare facility have plans for a memorial service for all those who died while in your care? Is there a special tribute in your hospital to those healthcare providers who paid the ultimate sacrifice in caring for patients? Most importantly, have you rededicated yourself to your profession, knowing that what you learned during the pandemic will help you be a better physician in the future, and do you have the knowledge that you are making a meaningful difference every day you serve in healthcare?
Relish being home. On that flight back to family, my excitement was palpable. But there were challenges too, as I knew I had to continue to focus on my team, my organization, and my profession. While images on the internet often show soldiers returning from war rushing into the arms of their loved ones, soldiers never leave the demands associated with wearing the cloth of the country. As a result, many marriages and families are damaged when one member who has been so singularly focused returns home and is still caught up in the demands of the job. They find it is difficult to pick up where they’ve left off, forgetting their family has also been under a different kind of intense stress.
These same challenges will face HCWs. Many of you voluntarily distanced yourself from family and friends due to a fear of transmitting the disease. Spouses and children underwent traumatic challenges in their jobs, holding together the household and piloting kids through schooling. My biggest recommendation is this: strive for a return to a healthy balance, be wary of any sharp edges that appear in your personality or in your relationships, and be open in communicating with those you love. Relying on friends, counselors, and mentors who can provide trusted advice—as well as therapy, if necessary—is not a sign of weakness, but a sign of strength and courage. The pandemic has affected our lives more than we can imagine, and “coming out” of the crisis will continue to test our humanity and civility like never before. Trust me on this one. I’ve been there.
RECOMMENDATIONS FOR POST-CRISIS ACTIONS
These reflections point to issues physicians must address in the months after their “redeployment” from dealing with the pandemic. When soldiers redeploy from combat, every unit develops a plan to address personal and professional growth for individual members of the team. Additionally, leaders develop a plan to sustain performance and improve teams and organizational approaches. The objective? Polish the diamond from what we learned during the crisis, while preparing for those things that might detract from effectiveness in future crises. It’s an SOP (standard operating procedure) for military units to do these things. Is this approach also advisable for healthcare professionals and teams in responding to crises?
Crises increase stress on individuals and disrupt the functioning of organizations, but crises also provide phenomenal opportunities for growth.5 Adaptive organizations, be they military or healthcare, must take time to understand how the crisis affected people and the organizational framework, while also preparing for potential future disruptions. While HCWs and their respective organizations are usually adept at learning from short-term emergencies (eg, limited disease outbreaks, natural disasters, mass-casualty events), they are less practiced in addressing crises that affect the profession for months. It has been a century since the medical profession last faced a global pandemic, but experts suggest other pandemics may be on the near-term horizon.6 We ought to use this past year of experiences to prepare for them.
Pay attention to your personal needs and the conditions of others on your team. After returning from combat, I was exhausted and stressed intellectually, physically, emotionally, and spiritually. From what I’ve seen, healthcare providers fit that same description, and the fatigue is palpable. Many of you have experienced extreme stress. I have experienced extreme post-traumatic stress, and it is important to understand that this will affect some on your team.7 In addition to addressing stress—and this is advice I give to all the physicians I know—find the time to get a physical examination. While the Army requires yearly physicals for all soldiers (especially generals!), most healthcare providers I know are shockingly deficient in taking the time to get a checkup from one of their colleagues. Commit to fixing that.
Reflect on what you have learned during this period. Take an afternoon with an adult beverage (if that’s your style) and reflect on what you learned and what others might learn from your unique experiences. Then, take some notes and shape your ideas. What did you experience? What adaptations did you or your team make during the pandemic? What worked and what didn’t? What things do you want to sustain in your practice and what things do you want to eliminate? What did you learn about the medical arts…or even about your Hippocratic Oath? If you have a mentor, share these thoughts with them; if you don’t have a mentor, find one and then share your thoughts with them. Get some outside feedback.
Assess team strengths and weaknesses. If you’re a formal physician leader (someone with a title and a position on your team), it’s your responsibility to provide feedback on both people and processes. If you’re an informal leader (someone who is a member of the team but doesn’t have specific leadership responsibilities outside your clinical role) and you don’t see this happening, volunteer to run the session for your formal leader and your organization. This session should last several hours and be held in a comfortable setting. You should prepare your team so they aren’t defensive about the points that may arise. Determine strengths and opportunities by asking for feedback on communication, behaviors, medical knowledge, emotional intelligence, and execution of tasks. Determine which processes and systems worked or didn’t work, and either polish the approaches or drive change to improve systems as you get back to normal. Crises provide an opportunity to fix what’s broken while also reinforcing things that worked in the crisis but are not normal procedure. Don’t revert to old ways if they weren’t the approaches that served you under critical conditions.
Encourage completion of an organization-wide after-action review (AAR). As I started writing this article, I watched CNN’s Dr Sanjay Gupta conduct a review of actions with the key physicians who contributed to the last administration’s response to the pandemic. In watching that session—and having conducted hundreds of AARs in my military career—I saw discussion of obvious good and bad leadership and management procedures, process issues that needed to be addressed, and decision-making that might be applauded or questioned. Every healthcare organization ought to conduct a similar AAR, reviewing the most important aspects of actions and teamwork, the hospital’s operations, logistical preparation, and leader and organizational procedures that warrant attention.
The successful conduct of any AAR requires asking (and getting answers to) four questions: What happened?; Why did it happen the way it did?; What needs to be fixed or “polished” in the processes, systems, or leadership approach?; and Who is responsible for ensuring the fixes or adjustments occur? The facilitator (and the key leaders of the organization) must ask the right questions, must be deeply involved in getting the right people to comment on the issues, and must “pin the rose” on someone who will be responsible for carrying through on the fixes. At the end of the AAR, after the key topics are discussed, with a plan for addressing each, the person in charge of the organization must publish an action plan with details for ensuring the fixes.
Like all citizens across our nation, my family is grateful for the skill and professionalism exhibited by clinicians and healthcare providers during this devastating pandemic. While we are all breathing a sigh of relief as we see the end in sight, true professionals must take the opportunity to learn and grow from this crisis and adapt. Hopefully, the reflections and recommendations in this article—things I learned from a different profession—will provide ideas to my new colleagues in healthcare.
Last year, I wrote an article for the Journal of Hospital Medicine offering tips to healthcare providers in what was then an expanding COVID-19 environment.1 These lessons were drawn from my experiences during the “tough fights” and crisis situations of my military career, situations similar to what healthcare providers experienced during the pandemic.
Now, as vaccination rates rise and hospitalization rates fall, the nation and healthcare profession begin the transition to “normalcy.” What should healthcare professionals expect as they transition from a year of operating in a crisis to resumption of the habitual? What memories and lessons will linger from a long, tough fight against COVID-19, and how might physicians best approach the many post-crisis challenges they will surely face?
My military experiences inform the tips I offer to those in the medical profession. Both professions depend on adeptly leading and building a functional and effective organizational culture under trying circumstances. It may seem strange, but the challenges healthcare workers (HCWs) faced in fighting COVID-19 are comparable to what soldiers experience on a battlefield. And now, as citizens return to “normal” (however normal is defined), only naïve HCWs will believe they can simply resume their previous habits and practices. This part of the journey will present new challenges and unique opportunities.
Healthcare has changed…and so have you! Just like soldiers coming home from the battlefield face a necessarily new and different world, HCWs will also face changing circumstances, environments, and organizational requirements. Given this new landscape, I offer some of my lessons learned coming out of combat to help you adapt.
REFLECTIONS
Heading home from my last combat tour in Iraq, I found myself gazing out the aircraft window and pondering my personal experiences during a very long combat tour commanding a multinational task force. Pulling out my green soldier’s notebook, I rapidly scratched out some reflections on where I was, what I had learned, and what I needed to address personally and professionally. In talking with physicians in the healthcare organization where I now work, this emotional checklist seems to mirror some of the same thoughts they face coming out of the COVID-19 crisis.
Expect exhaustion. There’s a military axiom that “fatigue can make cowards of us all,” and while I don’t think I had succumbed to cowardice in battle, after 15 months in combat I was exhausted. Commanders in combat—or HCWs fighting a pandemic—face unrelenting demands from a variety of audiences. Leaders are asked to solve unsolvable problems, be at the right place at the right time with the right answers, have more energy than others, be upbeat, and exhibit behaviors that will motivate the “troops.” That’s true even if they’re exhausted and weary to the bone, serving on multiple teams, and attending endless meetings. There is also the common and unfortunate expectation that leaders should not take any time for themselves.
During the pandemic, most HCWs reported sleeping less, having little time to interact casually with others, and having less time for personal reflection, exercise, personal growth, or even prayer. My solution for addressing exhaustion was to develop a personal plan to address each one of these areas—mental, emotional, physical, spiritual—with a detailed rest and recovery strategy. I wrote my plan down, knowing that I would need to discuss this blueprint with both my employer and my spouse, who I suspected would have different ideas on what my schedule should look like after returning “home.” Healthcare providers have been through the same kinds of stresses and need to ask themselves: What recovery plan have I designed to help me overcome the fatigue I feel, and have I talked about this plan with the people who will be affected by it?
Take pride in what your teams accomplished. I was proud of how my teams had accomplished the impossible and how they had adapted to continually changing situations. Whenever military organizations know they’ll face the enemy in combat, they feel heightened anxiety, increased fear, and concern about the preparedness of their team. The Army, like any successful team, attempts to mitigate those emotions through training. During my reflections, I remembered the teams that came together to accomplish very tough missions. Some of those teams were those I had concerns about prior to deployment, but fortunately they often surprised me with their adaptability and successes in combat.
Leaders in healthcare can likely relate. Even in normal situations, organizational fault lines exist between physicians, nurses, and administrators. These fault lines may manifest as communication disconnects and distrust between different members who may not completely trust one another due to differences in training, culture, or role within the organization. But during a crisis, rifts dissipate and trust evolves as different cultures are forced to work together. Many healthcare organizations report that, during the COVID crisis, most personality conflicts, communication disconnects, and organizational dysfunctions receded, and organizations saw more and greater coordination and collaboration. Extensive research on leadership demonstrates that crises drive teams to communicate better and become more effective and efficient in accomplishing stated goals, resulting in team members who relish “being there” for one another like never before. These positive changes must be reinforced to ensure these newly formed high-performing teams do not revert back to work silos, which usually occurs due to distrust.
Just as important as pride in teams is the pride in the accomplishment of specific individuals during times of crisis. Diverse members of any organization deliver some of the best solutions to the toughest problems when they are included in the discussion, allowed to bring their ideas to the table, and rewarded for their actions (and their courage)! Just one example is given by Dr Sasha Shillcut as she describes the innovations and adaptations of the women physicians she observed in her organization during the COVID-19 crisis,2 and there are many examples of other organizations citing similar transformation in areas like telemedicine, emergency department procedures, and equipment design and use.3,4
Anticipate “survivor’s guilt.” During my three combat tours, 253 soldiers under my command or in my organization sacrificed their lives for the mission, and many more were wounded in action. There are times when bad dreams remind me of some of the circumstances surrounding the incidents that took the lives of those who died, and I often wake with a start and in a sweat. The first question I always ask myself in the middle of the night when this happens is, “Why did they die, and why did I survive?” That question is always followed by, “What might I have done differently to prevent those deaths?”
As we draw down from treating patients during the COVID-19 crisis, healthcare providers must also be wary of “survivor’s guilt.” Survivor’s guilt is a strong emotion for anyone who has survived a crisis, especially when their friends or loved ones have not. Healthcare providers have lost many patients, but they have also lost colleagues, friends, and family members. Because you are in the healing profession, many of you will question what more you could have done to prevent the loss of life. You likely won’t ever be completely satisfied with the answer, but I have a recommendation that may assuage your emotions.
In combat, we continually memorialized our fallen comrades in ceremonies that are attended by the entire unit. One of my commanders had an idea to keep pictures of those who had made the ultimate sacrifice, and on my desk is a box with the 253 pictures of those dedicated individuals who were killed in action under my command or in my unit. On the top of the box are the words “Make It Matter.” I look at those pictures often to remember them and their selfless service to the nation, and I often ask myself whether I am “making it matter” in my daily activities. Does your healthcare facility have plans for a memorial service for all those who died while in your care? Is there a special tribute in your hospital to those healthcare providers who paid the ultimate sacrifice in caring for patients? Most importantly, have you rededicated yourself to your profession, knowing that what you learned during the pandemic will help you be a better physician in the future, and do you have the knowledge that you are making a meaningful difference every day you serve in healthcare?
Relish being home. On that flight back to family, my excitement was palpable. But there were challenges too, as I knew I had to continue to focus on my team, my organization, and my profession. While images on the internet often show soldiers returning from war rushing into the arms of their loved ones, soldiers never leave the demands associated with wearing the cloth of the country. As a result, many marriages and families are damaged when one member who has been so singularly focused returns home and is still caught up in the demands of the job. They find it is difficult to pick up where they’ve left off, forgetting their family has also been under a different kind of intense stress.
These same challenges will face HCWs. Many of you voluntarily distanced yourself from family and friends due to a fear of transmitting the disease. Spouses and children underwent traumatic challenges in their jobs, holding together the household and piloting kids through schooling. My biggest recommendation is this: strive for a return to a healthy balance, be wary of any sharp edges that appear in your personality or in your relationships, and be open in communicating with those you love. Relying on friends, counselors, and mentors who can provide trusted advice—as well as therapy, if necessary—is not a sign of weakness, but a sign of strength and courage. The pandemic has affected our lives more than we can imagine, and “coming out” of the crisis will continue to test our humanity and civility like never before. Trust me on this one. I’ve been there.
RECOMMENDATIONS FOR POST-CRISIS ACTIONS
These reflections point to issues physicians must address in the months after their “redeployment” from dealing with the pandemic. When soldiers redeploy from combat, every unit develops a plan to address personal and professional growth for individual members of the team. Additionally, leaders develop a plan to sustain performance and improve teams and organizational approaches. The objective? Polish the diamond from what we learned during the crisis, while preparing for those things that might detract from effectiveness in future crises. It’s an SOP (standard operating procedure) for military units to do these things. Is this approach also advisable for healthcare professionals and teams in responding to crises?
Crises increase stress on individuals and disrupt the functioning of organizations, but crises also provide phenomenal opportunities for growth.5 Adaptive organizations, be they military or healthcare, must take time to understand how the crises affected people and the organizational framework, while also preparing for potential future disruptions. While HCWs and their respective organizations are usually adept at learning from short-term emergencies (eg, limited disease outbreaks, natural disasters, mass-casualty events), they are less practiced in addressing crises that affect the profession for months. It has been a century since the medical profession has been faced with a global pandemic, but experts suggest other pandemics may be on the short-term horizon.6 We ought to use this past year of experiences to prepare for them.
Pay attention to your personal needs and the conditions of others on your team. After returning from combat, I was exhausted and stressed intellectually, physically, emotionally, and spiritually. From what I’ve seen, healthcare providers fit that same description, and the fatigue is palpable. Many of you have experienced extreme stress. I have experienced extreme post-traumatic stress, and it is important to understand that this will affect some on your team.7 In addition to addressing stress—and this is advice I give to all the physicians I know—find the time to get a physical examination. While the Army requires yearly physicals for all soldiers (especially generals!), most healthcare providers I know are shockingly deficient in taking the time to get a checkup from one of their colleagues. Commit to fixing that.
Reflect on what you have learned during this period. Take an afternoon with an adult beverage (if that’s your style) and reflect on what you learned and what others might learn from your unique experiences. Then, take some notes and shape your ideas. What did you experience? What adaptations did you or your team make during the pandemic? What worked and what didn’t? What things do you want to sustain in your practice and what things do you want to eliminate? What did you learn about the medical arts…or even about your Hippocratic Oath? If you have a mentor, share these thoughts with them; if you don’t have a mentor, find one and then share your thoughts with them. Get some outside feedback.
Assess team strengths and weaknesses. If you’re a formal physician leader (someone with a title and a position on your team), it’s your responsibility to provide feedback on both people and processes. If you’re an informal leader (someone who is a member of the team but doesn’t have specific leadership responsibilities outside your clinical role) and you don’t see this happening, volunteer to run the session for your formal leader and your organization. This session should last several hours and be held in a comfortable setting. You should prepare your team so they aren’t defensive about the points that may arise. Determine strengths and opportunities by asking for feedback on communication, behaviors, medical knowledge, emotional intelligence, and execution of tasks. Determine which processes and systems either worked or didn’t work, and either polish the approaches or drive change to improve systems as you get back to normal. Crises provide an opportunity to fix what’s broken while also reinforcing the things that worked in the crisis that might not be normal procedure. Don’t go back to old ways if those weren’t the things or the approaches you were using under critical conditions.
Encourage completion of an organization-wide after-action review (AAR). As I started writing this article, I watched CNN’s Dr Sanjay Gupta conduct a review of actions with the key physicians who contributed to the last administration’s response to the pandemic. In watching that session—and having conducted hundreds of AARs in my military career—there was discussion of obvious good and bad leadership and management procedures, process issues that needed to be addressed, and decision-making that might be applauded or questioned. Every healthcare organization ought to conduct a similar AAR, with a review of the most important aspects of actions and teamwork, the hospital’s operations, logistical preparation, and leader and organization procedures that need to be addressed.
The successful conduct of any AAR requires asking (and getting answers to) four questions: What happened?; Why did it happen the way it did?; What needs to be fixed or “polished” in the processes, systems, or leadership approach?; and Who is responsible for ensuring the fixes or adjustments occur? The facilitator (and the key leaders of the organization) must ask the right questions, must be deeply involved in getting the right people to comment on the issues, and must “pin the rose” on someone who will be responsible for carrying through on the fixes. At the end of the AAR, after the key topics are discussed, with a plan for addressing each, the person in charge of the organization must publish an action plan with details for ensuring the fixes.
Like all citizens across our nation, my family is grateful for the skill and professionalism exhibited by clinicians and healthcare providers during this devastating pandemic. While we are all breathing a sigh of relief as we see the end in sight, true professionals must take the opportunity to learn and grow from this crisis and adapt. Hopefully, the reflections and recommendations in this article—things I learned from a different profession—will provide ideas to my new colleagues in healthcare.
1. Hertling M. Ten tips for a crisis: lessons from a soldier. J Hosp Med. 2020;15(5): 275-276. https://doi.org/10.12788/jhm.3424
2. Shillcut S. The inspiring women physicians of the COVID-19 pandemic. MedPage Today. April 9, 2020. Accessed July 7, 2021. https://www.kevinmd.com/blog/2020/04/the-insiring-women-physicians-of-the-covid-19-pandemic.html
3. Daley B. Three medical innovations fueled by COVID-19 that will outlast the pandemic. The Conversation. March 9, 2021. Accessed July 7, 2021. https://theconversation.com/3-medical-innovations-fueled-by-covid-19-that-will-outlast-the-pandemic-156464
4. Drees J, Dyrda L, Adams K. Ten big advancements in healthcare tech during the pandemic. Becker’s Health IT. July 6, 2020. Accessed July 7, 2021. https://www.beckershospitalreview.com/digital-transformation/10-big-advancements-in-healthcare-tech-during-the-pandemic.html
5. Wang J. Developing organizational learning capacity in crisis management. Adv Developing Hum Resources. 10(3):425-445. https://doi.org/10.1177/1523422308316464
6. Morens DM, Fauci AS. Emerging pandemic diseases: how we got COVID-19. Cell. 2020;182(5):1077-1092. https://doi.org/10.1016/j.cell.2020.08.021
7. What is posttraumatic stress disorder? American Psychiatric Association. Reviewed August 2020. Accessed July 7, 2021. https://www.psychiatry.org/patients-families/ptsd/what-is-ptsd
© 2021 Society of Hospital Medicine
Preoperative Care Assessment of Need Scores Are Associated With Postoperative Mortality and Length of Stay in Veterans Undergoing Knee Replacement
Risk calculators can be of great value in guiding clinical decision making, patient-centered precision medicine, and resource allocation.1 Several perioperative risk prediction models have emerged in recent decades that estimate specific hazards (eg, cardiovascular complications after noncardiac surgery) with varying accuracy and utility. In the perioperative sphere, the time windows are often limited to an index hospitalization or 30 days following surgery or discharge.2-9 Although longer periods are of interest to patients, families, and health systems, few widely used or validated models are designed to look beyond this very narrow window.10,11 In addition, perioperative risk prediction models do not routinely incorporate parameters of a wide variety of health or demographic domains, such as patterns of health care, health care utilization, or medication use.
In 2013, in response to the need for near real-time information to guide delivery of enhanced care management services, the Veterans Health Administration (VHA) Office of Informatics and Analytics developed automated risk prediction models that used detailed electronic health record (EHR) data. These models were used to report Care Assessment Need (CAN) scores each week for all VHA enrollees and include data from a wide array of health domains. These CAN scores predict the risk for hospitalization, death, or either event within 90 days and 1 year.12,13 Each score is reported as both a predicted probability (0-1) and as a percentile in relation to all other VHA enrollees (a value between 1 and 99).13 The data used to calculate CAN scores are listed in Table 1.12
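The dual reporting of each CAN score, as a predicted probability (0-1) and as a population percentile (1-99), can be illustrated with a short sketch. This is an assumption-laden illustration only: the function name and the exact ranking and binning rule below are ours, not the VHA's published procedure.

```python
import numpy as np

def probability_to_percentile(probabilities):
    """Convert model-predicted probabilities (0-1) into percentile ranks
    (1-99) relative to the whole population, mirroring how CAN scores are
    reported. The exact VHA binning rule is an assumption here."""
    probs = np.asarray(probabilities, dtype=float)
    ranks = probs.argsort().argsort()          # 0-based rank of each patient
    return (ranks * 99 // len(probs) + 1).astype(int)

# four hypothetical patients, lowest to highest predicted risk
print(probability_to_percentile([0.02, 0.10, 0.35, 0.80]))
```

A patient with the lowest predicted probability maps near percentile 1; the highest maps near 99, matching the score's interpretation as standing relative to all VHA enrollees.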
Because the CAN models do not differentiate surgical admissions from nonsurgical admissions or other procedural clinic visits, it is not possible to isolate the effect of undergoing a surgical procedure from that of another health-related event on the CAN score. At the same time, a short-term increase in system utilization caused by an elective surgical procedure such as a total knee replacement (TKR) would presumably be reflected in a change in CAN score, but this has not been studied.
Since their introduction, CAN scores have been routinely accessed by primary care teams and used to facilitate care coordination for thousands of VHA patients. However, these CAN scores are currently not available to VHA surgeons, anesthesiologists, or other perioperative clinicians. In this study, we examine the distributions of preoperative CAN scores and explore the relationships of preoperative CAN 1-year mortality scores with 1-year survival following discharge and length of stay (LOS) during index hospitalization in a cohort of US veterans who underwent TKR, the most common elective operation performed within the VHA system.
Methods
Following approval of the Durham Veterans Affairs Medical Center Institutional Review Board, all necessary data were extracted from the VHA Corporate Data Warehouse (CDW) repository.14 Informed consent was waived due to the minimal risk nature of the study.
We used Current Procedural Terminology codes (27438, 27446, 27447, 27486, 27487, 27488) and International Classification of Diseases, 9th edition clinical modification procedure codes (81.54, 81.55, 81.59, 00.80-00.84) to identify all veterans who had undergone primary or revision TKR between July 2014 and December 2015 in VHA Veterans Integrated Service Network 1 (Maine, Vermont, New Hampshire, Massachusetts, Connecticut, Rhode Island, New York, Pennsylvania, West Virginia, Virginia, North Carolina). Because we focused on outcomes following hospital discharge, patients who died before discharge were excluded from the analysis. Preoperative CAN 1-year mortality score was chosen as the measure under the assumption that long-term survival may be the most meaningful of the 4 possible CAN score measures.
Our primary objective was to determine the distribution of preoperative CAN scores in the study population. Our secondary objective was to study the relationships of preoperative CAN 1-year mortality scores with 1-year mortality and hospital LOS.
Study Variables
For each patient, we extracted the date of index surgery. The primary exposure or independent variable was the CAN score in the week prior to this date. Because a prior study has shown that CAN score trajectories do not significantly change over time, the date-stamped CAN scores in the week before surgery represent what would have been available to clinicians in a preoperative setting.15 Since CAN scores are refreshed and overwritten every week, we extracted archived scores from the CDW.
For the 1-year survival outcome, the primary dependent variable, we queried the vital status files in the CDW for the date of death if applicable. We confirmed survival beyond 1 year by examining vital signs in the CDW for a minimum of 2 independent encounters beyond 1 year after the date of discharge. To compute the index LOS, the secondary outcome, we computed the difference between the date of admission and date of hospital discharge.
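The index LOS computation described above amounts to a simple date difference; a minimal sketch with hypothetical dates:

```python
from datetime import date

def index_los(admission: date, discharge: date) -> int:
    """Index hospitalization LOS: discharge date minus admission date, in days."""
    return (discharge - admission).days

# hypothetical admission and discharge dates for an elective TKR
print(index_los(date(2015, 3, 2), date(2015, 3, 5)))  # 3 days
```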
Statistical Methods
The parameters and performance of the multivariable logistic regression models developed to compute the various CAN mortality and hospitalization risk scores have been previously described.12 Briefly, Wang and colleagues created parsimonious regression models using backward selection. Model discrimination was evaluated using C (concordance)-statistic. Model calibration was assessed by comparing predicted vs observed event rates by risk deciles and performing Cox proportional hazards regression.
We plotted histograms to display preoperative CAN scores as a simple measure of distribution (Figure 1). We also examined the cumulative proportion of patients at each preoperative CAN 1-year mortality score.
Using a conventional t test, we compared means of preoperative CAN 1-year mortality scores in patients who survived vs those who died within 1 year. We also constructed a plot of the proportion of patients who had died within 1 year vs preoperative CAN 1-year mortality scores. Kaplan-Meier curves were then constructed examining 1-year survival by CAN 1-year mortality score by terciles.
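The comparison of mean scores between survivors and nonsurvivors and the tercile grouping used for the Kaplan-Meier curves can be sketched as follows. The data here are simulated under assumed distributions loosely based on the reported means; the real analysis used VHA CDW data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# simulated preoperative CAN 1-year mortality scores (assumed distributions)
survivors = rng.normal(47, 25, size=1000).clip(1, 99)
nonsurvivors = rng.normal(66, 20, size=100).clip(1, 99)

# conventional two-sample (Welch) t test comparing the means
t_stat, p_value = stats.ttest_ind(nonsurvivors, survivors, equal_var=False)
print(f"survivor mean={survivors.mean():.0f}, "
      f"nonsurvivor mean={nonsurvivors.mean():.0f}, p={p_value:.2g}")

# split the combined cohort into terciles for Kaplan-Meier-style grouping
scores = np.concatenate([survivors, nonsurvivors])
cuts = np.quantile(scores, [1 / 3, 2 / 3])
tercile = np.digitize(scores, cuts)  # 0 = lower, 1 = middle, 2 = upper
print("patients per tercile:", np.bincount(tercile))
```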
Finally, we examined the relationship between preoperative CAN 1-year mortality scores and index LOS in 2 ways: we plotted LOS across CAN scores, and we constructed a LOESS (locally estimated scatterplot smoothing) curve.
Results
We identified 8206 patients who had undergone a TKR over the 18-month study period. The overall mean (SD) for age was 65 (8.41) years; 93% were male, and 78% were White veterans. Patient demographics are well described in previous publications.16,17
In terms of model parameters for the CAN score models, C-statistics for the 90-day outcome models were as follows: 0.833 for the model predicting hospitalization (95% CI, 0.832-0.834); 0.865 for the model predicting death (95% CI, 0.863-0.876); and 0.811 for the model predicting either event (95% CI, 0.810-0.812). C-statistics for the 1-year outcome models were 0.809 for the model predicting hospitalization (95% CI, 0.808-0.810); 0.851 for the model predicting death (95% CI, 0.849-0.852); and 0.787 for the model predicting either event (95% CI, 0.786-0.787). Models were well calibrated with α = 0 and β = 1, demonstrating strong agreement between observed and predicted event rates.
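The C-statistic reported for these models is the probability that a randomly chosen patient who experienced the event received a higher predicted risk than a randomly chosen patient who did not. A small self-contained implementation (the function name is ours, for illustration):

```python
import numpy as np

def c_statistic(y_true, y_score):
    """Concordance (C) statistic: fraction of event/non-event pairs in which
    the event received the higher predicted score (ties count as half)."""
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    pos = y_score[y_true == 1]   # predicted scores of patients with the event
    neg = y_score[y_true == 0]   # predicted scores of patients without it
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# toy example: 3 of 4 event/non-event pairs are concordant
print(c_statistic([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value of 0.5 indicates no discrimination and 1.0 perfect discrimination, so the 0.78-0.87 range reported for the CAN models indicates good to excellent discrimination.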
The distribution of preoperative CAN 1-year mortality scores was close to normal (median, 50; interquartile range, 40; mean [SD] 48 [25.6]) (eTable). The original CAN score models were developed with an equal number of patients in each stratum and, as such, are normally distributed.12 Our cohort was similar in pattern of distribution. Distributions of the remaining preoperative CAN scores (90-day mortality, 1-year hospitalization, 90-day hospitalization) are shown in Figures 2, 3, and 4. Not surprisingly, histograms for both 90-day and 1-year hospitalization were skewed toward higher scores, indicating that these patients were expected to be hospitalized in the near future.
Overall, 1.4% (110/8096) of patients died within 1 year of surgery. Comparing 1-year mortality CAN scores in survivors vs nonsurvivors, we found statistically significant differences in means (47 vs 66 respectively, P < .001) and medians (45 vs 75 respectively, P < .001) (Table 2). In the plot examining the relationship between preoperative 1-year mortality CAN scores and 1-year mortality, the percentage who died within 1 year increased initially for patients with CAN scores > 60 and again exponentially for patients with CAN scores > 80. Examining Kaplan-Meier curves, we found that survivors and nonsurvivors separated early after surgery, and the differences between the top tercile and the middle/lower terciles were statistically significant (P < .001). Mortality rates were about 0.5% in the lower and middle terciles but about 2% in the upper tercile (Figure 5).
In the plot examining the relationship between CAN scores and index LOS, the LOS rose significantly beyond a CAN score of 60 and dramatically beyond a CAN score of 80 (Figure 6). LOESS curves also showed 2 inflection points suggesting an incremental and sequential rise in the LOS with increasing CAN scores (Figure 7). Mean (SD) LOS in days for the lowest to highest terciles was 2.6 (1.7), 2.8 (2.1), and 3.6 (2.2), respectively.
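The LOESS smoothing used to visualize LOS against CAN scores can be approximated with a simple nearest-neighbor mean smoother. This is a crude stand-in for true locally weighted regression, and the data below are simulated to mimic the inflection points described above.

```python
import numpy as np

def knn_smooth(x, y, frac=0.15):
    """Crude LOESS stand-in: the fitted value at each x is the mean of y over
    the nearest `frac` of points (a real analysis would use LOESS proper)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    k = max(2, int(frac * len(x)))
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    fitted = np.array([ys[np.argsort(np.abs(xs - xi))[:k]].mean() for xi in xs])
    return xs, fitted

# simulated LOS with step increases past CAN scores of 60 and 80
can = np.arange(1, 100)
los = 2.5 + 0.5 * (can > 60) + 1.0 * (can > 80)
xs, smooth = knn_smooth(can, los)
print(f"smoothed LOS at CAN 10: {smooth[9]:.1f}, at CAN 95: {smooth[94]:.1f}")
```

The smoothed curve rises at the simulated inflection points, illustrating how LOESS-style smoothing reveals the incremental, sequential rise in LOS with increasing CAN scores.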
Discussion
CAN scores are automatically generated each week by EHR-based multivariable risk models. These scores have excellent predictive accuracy for 90-day and 1-year mortality and hospitalization and are routinely used by VHA primary care teams to assist with clinical operations.13 We studied the distribution of CAN 1-year mortality scores in a preoperative context and examined relationships of the preoperative CAN 1-year mortality scores with postoperative mortality and LOS in 8206 veterans who underwent TKR.
There are several noteworthy findings. First, the overall 1-year mortality rate observed following TKR (1.4%) was similar to other published reports.18,19 Not surprisingly, preoperative CAN 1-year mortality scores were significantly higher in veterans who died compared with those of survivors. The majority of patients who died had a preoperative CAN 1-year mortality score > 75 while most who survived had a preoperative CAN 1-year mortality score < 45 (P < .001). Interestingly, the same scores showed a nonlinear correlation with LOS. Index LOS was about 4 days in patients in the highest tercile of CAN scores vs 2.5 days in the lowest tercile, but the initial increase in LOS was detected at a CAN score of about 55 to 60.
In addition, mortality rate varied widely in different segments of the population when grouped according to preoperative CAN scores. One-year mortality rates in the highest tercile reached 2%, about 4-fold higher than that of the lower terciles (0.5%). Examination of the Kaplan-Meier curves showed that this difference in mortality between the highest tercile and the lower 2 groups appears soon after discharge and continues to increase over time, suggesting that the factors contributing to the increased mortality are present at the time of discharge and persist beyond the postoperative period. In summary, although CAN scores were not designed for use in the perioperative context, we found that preoperative CAN 1-year mortality scores are predictive of both increases in hospital LOS following elective TKR and mortality over the year after TKR.
Our findings raise several important questions. The decision to undergo elective surgery is complex. Arguably, individuals who undergo elective knee replacement should be healthy enough to undergo, recover from, and reap the benefits of a procedure that does not extend life. The distribution of preoperative CAN 1-year mortality scores for our study population was similar to that of the general VHA enrollee population, with similar measured mortality rates (≤ 0.5% vs ≥ 1.7% in the low and high terciles, respectively).1 Further study comparing outcomes in matched cohorts who did and did not undergo joint replacement would be of interest. In lieu of this, though, the association of high but not extreme CAN scores with increased hospital LOS may potentially be used to guide allocation of resources to this group, obviating the increased cost and risk to which this group is exposed. In addition, the insight afforded by CAN scores may enhance shared decision-making models by identifying patients at the very highest risk (eg, 1-year mortality CAN score ≥ 90): patients who conceivably might not survive long enough to recover from and enjoy their reconstructed knee and who might, in the long run, be harmed by undergoing the procedure.
Many total joint arthroplasties are performed in older patients, a population in which frailty is increasingly recognized as a significant risk factor for poor outcomes.20,21 CAN scores reliably identify high-risk patients and have been shown to correlate with frailty in this group.22 Multiple authors have reported improved outcomes with cost reductions after implementation of programs targeting modifiable risk factors in high-risk surgical candidates.23-25 A preoperative assessment that includes the CAN score may be valuable in identifying patients who would benefit most from prehabilitation programs or other interventions designed to blunt the impact of frailty. It is true that many elements used to calculate the CAN score would not be considered modifiable, especially in the short term. However, specific contributors to frailty, such as nutritional status and polypharmacy might be potential candidates. As with all multivariable risk prediction models, there are multiple paths to a high CAN score, and further research to identify clinically relevant subgroups may help inform efforts to improve perioperative care within this population.
Hospital LOS is of intense interest for many reasons, not least its utility as a surrogate for cost and increased risk for immediate perioperative adverse events, such as multidrug-resistant hospital acquired infections, need for postacute facility-based rehabilitation, and deconditioning that increase risks of falls and fractures in the older population.26-29 In addition, its importance is magnified due to the COVID-19 pandemic context in which restarting elective surgery programs has changed traditional criteria by which patients are scheduled for surgery.
We have shown that elevated CAN scores are able to identify patients at risk for extended hospital stays and, as such, may be useful additional data in allocating scarce operating room time and other resources for optimal patient and health care provider safety.30,31 Individual surgeons and hospital systems would, of course, decide which patients should be triaged to go first, based on local priorities; however, choosing lower risk patients with minimal risk of morbidity and mortality while pursuing prehabilitation for higher risk patients is a reasonable approach.
Limitations
Our study has several limitations. Only a single surgical procedure was included, albeit the most common one performed in the VHA. In addition, no information was available concerning the precise clinical course for these patients, such as the duration of surgery, anesthetic technique, and management of the acute perioperative course. Although we assumed that patients received standard care such that these factors would not significantly affect either their mortality or their LOS out of proportion to their preoperative clinical status, confounding cannot be excluded. Therefore, further study is necessary to determine whether CAN scores can accurately predict mortality and/or LOS for patients undergoing other procedures. Further, a clinical trial is required to assess whether systematic provision of the CAN score at the point of surgery would impact care and, more importantly, impact outcomes. In addition, multivariable analyses were not performed, including and excluding various components of the CAN score models. Currently, CAN scores could be made available to the surgical/anesthesia communities at minimal or no cost and are updated automatically. Model calibration and discrimination in this particular setting were not validated.
Because our interest is in leveraging an existing resource for a current clinical and operational problem rather than in creating or validating a new tool, we chose to test the simple bivariate relationship between preoperative CAN scores and outcomes. We chose the preoperative 1-year mortality CAN score from among the 4 options under the assumption that long-term survival is the most meaningful of the 4 candidate outcomes. Finally, while CAN scores are currently calculated only for patients cared for within the VHA, few of the underlying data elements are unavailable to civilian health systems. The most problematic would be documentation of actual prescription filling, but this is a topic of increasing interest to the medical and academic communities, and we hope access to such information will improve.32-34
Conclusions
Although designed for use by VHA primary care teams, CAN scores also may have value for perioperative clinicians, predicting mortality and prolonged hospital LOS in those with elevated 1-year mortality scores. Advantages of CAN scores relative to other perioperative risk calculators lies in their ability to predict long-term rather than 30-day survival and that they are automatically generated on a near-real-time basis for all patients who receive care in VHA ambulatory clinics. Further study is needed to determine practical utility in shared decision making, preoperative evaluation and optimization, and perioperative resource allocation.
Acknowledgments
This work was supported by the US Department of Veterans Affairs (VA) National Center for Patient Safety, Field Office 10A4E, through the Patient Safety Center of Inquiry at the Durham VA Medical Center in North Carolina. The study also received support from the Center of Innovation to Accelerate Discovery and Practice Transformation (CIN 13-410) at the Durham VA Health Care System.
1. McNair AGK, MacKichan F, Donovan JL, et al. What surgeons tell patients and what patients want to know before major cancer surgery: a qualitative study. BMC Cancer. 2016;16:258. doi:10.1186/s12885-016-2292-3
2. Grover FL, Hammermeister KE, Burchfiel C. Initial report of the Veterans Administration Preoperative Risk Assessment Study for Cardiac Surgery. Ann Thorac Surg. 1990;50(1):12-26; discussion 27-28. doi:10.1016/0003-4975(90)90073-f
3. Khuri SF, Daley J, Henderson W, et al. The National Veterans Administration Surgical Risk Study: risk adjustment for the comparative assessment of the quality of surgical care. J Am Coll Surg. 1995;180(5):519-531.
4. Glance LG, Lustik SJ, Hannan EL, et al. The Surgical Mortality Probability Model: derivation and validation of a simple risk prediction rule for noncardiac surgery. Ann Surg. 2012;255(4):696-702. doi:10.1097/SLA.0b013e31824b45af
5. Keller DS, Kroll D, Papaconstantinou HT, Ellis CN. Development and validation of a methodology to reduce mortality using the veterans affairs surgical quality improvement program risk calculator. J Am Coll Surg. 2017;224(4):602-607. doi:10.1016/j.jamcollsurg.2016.12.033
6. Bilimoria KY, Liu Y, Paruch JL, et al. Development and evaluation of the universal ACS NSQIP surgical risk calculator: a decision aid and informed consent tool for patients and surgeons. J Am Coll Surg. 2013;217(5):833-842.e1-3. doi:10.1016/j.jamcollsurg.2013.07.385
7. Ford MK, Beattie WS, Wijeysundera DN. Systematic review: prediction of perioperative cardiac complications and mortality by the revised cardiac risk index. Ann Intern Med. 2010;152(1):26-35. doi:10.7326/0003-4819-152-1-201001050-00007
8. Gupta PK, Gupta H, Sundaram A, et al. Development and validation of a risk calculator for prediction of cardiac risk after surgery. Circulation. 2011;124(4):381-387. doi:10.1161/CIRCULATIONAHA.110.015701
9. Lee TH, Marcantonio ER, Mangione CM, et al. Derivation and prospective validation of a simple index for prediction of cardiac risk of major noncardiac surgery. Circulation. 1999;100(10):1043-1049. doi:10.1161/01.cir.100.10.1043
10. Smith T, Li X, Nylander W, Gunnar W. Thirty-day postoperative mortality risk estimates and 1-year survival in Veterans Health Administration surgery patients. JAMA Surg. 2016;151(5):417-422. doi:10.1001/jamasurg.2015.4882
11. Damhuis RA, Wijnhoven BP, Plaisier PW, Kirkels WJ, Kranse R, van Lanschot JJ. Comparison of 30-day, 90- day and in-hospital postoperative mortality for eight different cancer types. Br J Surg. 2012;99(8):1149-1154. doi:10.1002/bjs.8813
12. Wang L, Porter B, Maynard C, et al. Predicting risk of hospitalization or death among patients receiving primary care in the Veterans Health Administration. Med Care. 2013;51(4):368-373. doi:10.1016/j.amjcard.2012.06.038
13. Fihn SD, Francis J, Clancy C, et al. Insights from advanced analytics at the Veterans Health Administration. Health Aff (Millwood). 2014;33(7):1203-1211. doi:10.1377/hlthaff.2014.0054
14. Noël PH, Copeland LA, Perrin RA, et al. VHA Corporate Data Warehouse height and weight data: opportunities and challenges for health services research. J Rehabil Res Dev. 2010;47(8):739-750. doi:10.1682/jrrd.2009.08.0110
15. Wong ES, Yoon J, Piegari RI, Rosland AM, Fihn SD, Chang ET. Identifying latent subgroups of high-risk patients using risk score trajectories. J Gen Intern Med. 2018;33(12):2120-2126. doi:10.1007/s11606-018-4653-x
16. Chen Q, Hsia HL, Overman R, et al. Impact of an opioid safety initiative on patients undergoing total knee arthroplasty: a time series analysis. Anesthesiology. 2019;131(2):369-380. doi:10.1097/ALN.0000000000002771
17. Hsia HL, Takemoto S, van de Ven T, et al. Acute pain is associated with chronic opioid use after total knee arthroplasty. Reg Anesth Pain Med. 2018;43(7):705-711. doi:10.1097/AAP.0000000000000831
18. Inacio MCS, Dillon MT, Miric A, Navarro RA, Paxton EW. Mortality after total knee and total hip arthroplasty in a large integrated health care system. Perm J. 2017;21:16-171. doi:10.7812/TPP/16-171
19. Lee QJ, Mak WP, Wong YC. Mortality following primary total knee replacement in public hospitals in Hong Kong. Hong Kong Med J. 2016;22(3):237-241. doi:10.12809/hkmj154712
20. Lin HS, Watts JN, Peel NM, Hubbard RE. Frailty and post-operative outcomes in older surgical patients: a systematic review. BMC Geriatr. 2016;16(1):157. doi:10.1186/s12877-016-0329-8
21. Shinall MC Jr, Arya S, Youk A, et al. Association of preoperative patient frailty and operative stress with postoperative mortality. JAMA Surg. 2019;155(1):e194620. doi:10.1001/jamasurg.2019.4620
22. Ruiz JG, Priyadarshni S, Rahaman Z, et al. Validation of an automatically generated screening score for frailty: the care assessment need (CAN) score. BMC Geriatr. 2018;18(1):106. doi:10.1186/s12877-018-0802-7
23. Bernstein DN, Liu TC, Winegar AL, et al. Evaluation of a preoperative optimization protocol for primary hip and knee arthroplasty patients. J Arthroplasty. 2018;33(12):3642-3648. doi:10.1016/j.arth.2018.08.018
24. Sodhi N, Anis HK, Coste M, et al. A nationwide analysis of preoperative planning on operative times and postoperative complications in total knee arthroplasty. J Knee Surg. 2019;32(11):1040-1045. doi:10.1055/s-0039-1677790
25. Krause A, Sayeed Z, El-Othmani M, Pallekonda V, Mihalko W, Saleh KJ. Outpatient total knee arthroplasty: are we there yet? (part 1). Orthop Clin North Am. 2018;49(1):1-6. doi:10.1016/j.ocl.2017.08.002
26. Barrasa-Villar JI, Aibar-Remón C, Prieto-Andrés P, Mareca-Doñate R, Moliner-Lahoz J. Impact on morbidity, mortality, and length of stay of hospital-acquired infections by resistant microorganisms. Clin Infect Dis. 2017;65(4):644-652. doi:10.1093/cid/cix411
27. Nikkel LE, Kates SL, Schreck M, Maceroli M, Mahmood B, Elfar JC. Length of hospital stay after hip fracture and risk of early mortality after discharge in New York state: retrospective cohort study. BMJ. 2015;351:h6246. doi:10.1136/bmj.h6246
28. Marfil-Garza BA, Belaunzarán-Zamudio PF, Gulias-Herrero A, et al. Risk factors associated with prolonged hospital length-of-stay: 18-year retrospective study of hospitalizations in a tertiary healthcare center in Mexico. PLoS One. 2018;13(11):e0207203. doi:10.1371/journal.pone.0207203
29. Hirsch CH, Sommers L, Olsen A, Mullen L, Winograd CH. The natural history of functional morbidity in hospitalized older patients. J Am Geriatr Soc. 1990;38(12):1296-1303. doi:10.1111/j.1532-5415.1990.tb03451.x
30. Iyengar KP, Jain VK, Vaish A, Vaishya R, Maini L, Lal H. Post COVID-19: planning strategies to resume orthopaedic surgery -challenges and considerations. J Clin Orthop Trauma. 2020;11(suppl 3):S291-S295. doi:10.1016/j.jcot.2020.04.028
31. O’Connor CM, Anoushiravani AA, DiCaprio MR, Healy WL, Iorio R. Economic recovery after the COVID-19 pandemic: resuming elective orthopedic surgery and total joint arthroplasty. J Arthroplasty. 2020;35(suppl 7):S32-S36. doi:10.1016/j.arth.2020.04.038.
32. Mauseth SA, Skurtveit S, Skovlund E, Langhammer A, Spigset O. Medication use and association with urinary incontinence in women: data from the Norwegian Prescription Database and the HUNT study. Neurourol Urodyn. 2018;37(4):1448-1457. doi:10.1002/nau.23473
33. Sultan RS, Correll CU, Schoenbaum M, King M, Walkup JT, Olfson M. National patterns of commonly prescribed psychotropic medications to young people. J Child Adolesc Psychopharmacol. 2018;28(3):158-165. doi:10.1089/cap.2017.0077
34. McCoy RG, Dykhoff HJ, Sangaralingham L, et al. Adoption of new glucose-lowering medications in the U.S.-the case of SGLT2 inhibitors: nationwide cohort study. Diabetes Technol Ther. 2019;21(12):702-712. doi:10.1089/dia.2019.0213
Risk calculators can be of great value in guiding clinical decision making, patient-centered precision medicine, and resource allocation.1 Several perioperative risk prediction models have emerged in recent decades that estimate specific hazards (eg, cardiovascular complications after noncardiac surgery) with varying accuracy and utility. In the perioperative sphere, the time windows are often limited to an index hospitalization or to 30 days following surgery or discharge.2-9 Although longer periods are of interest to patients, families, and health systems, few widely used or validated models are designed to look beyond this very narrow window.10,11 In addition, perioperative risk prediction models do not routinely incorporate parameters from a wide variety of health and demographic domains, such as patterns of health care utilization or medication use.
In 2013, in response to the need for near real-time information to guide delivery of enhanced care management services, the Veterans Health Administration (VHA) Office of Informatics and Analytics developed automated risk prediction models that used detailed electronic health record (EHR) data. These models were used to report Care Assessment Need (CAN) scores each week for all VHA enrollees and include data from a wide array of health domains. These CAN scores predict the risk for hospitalization, death, or either event within 90 days and 1 year.12,13 Each score is reported as both a predicted probability (0-1) and as a percentile in relation to all other VHA enrollees (a value between 1 and 99).13 The data used to calculate CAN scores are listed in Table 1.12
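The dual reporting described above, each CAN score as both a predicted probability (0-1) and a percentile rank (1-99) relative to all other enrollees, can be sketched as follows. This is an illustrative Python sketch with invented probabilities; the VHA's exact binning rules are not described in the text.

```python
from bisect import bisect_right

def to_percentile(prob, all_probs):
    """Map a predicted probability (0-1) to a 1-99 percentile rank
    relative to all enrollees' probabilities. Illustrative only; this
    is not the VHA's actual scoring code."""
    ranked = sorted(all_probs)
    # Fraction of enrollees with a predicted probability <= this one.
    frac = bisect_right(ranked, prob) / len(ranked)
    # Clamp into the reported 1-99 range.
    return min(99, max(1, round(frac * 99)))

cohort = [0.01, 0.02, 0.05, 0.10, 0.20, 0.40, 0.60, 0.80]  # hypothetical enrollees
print(to_percentile(0.60, cohort))  # a high-probability enrollee ranks near the top
```

A percentile representation of this kind makes scores directly comparable across the enrollee population even as the underlying probability models are updated.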
Surgical procedures or admissions would not be differentiated from nonsurgical admissions or other procedural clinic visits, and as such, it is not possible to isolate the effect of undergoing a surgical procedure from another health-related event on the CAN score. At the same time though, a short-term increase in system utilization caused by an elective surgical procedure such as a total knee replacement (TKR) would presumably be reflected in a change in CAN score, but this has not been studied.
Since their introduction, CAN scores have been routinely accessed by primary care teams and used to facilitate care coordination for thousands of VHA patients. However, these CAN scores are currently not available to VHA surgeons, anesthesiologists, or other perioperative clinicians. In this study, we examine the distributions of preoperative CAN scores and explore the relationships of preoperative CAN 1-year mortality scores with 1-year survival following discharge and length of stay (LOS) during index hospitalization in a cohort of US veterans who underwent TKR, the most common elective operation performed within the VHA system.
Methods
Following approval of the Durham Veterans Affairs Medical Center Institutional Review Board, all necessary data were extracted from the VHA Corporate Data Warehouse (CDW) repository.14 Informed consent was waived due to the minimal risk nature of the study.
We used Current Procedural Terminology codes (27438, 27446, 27447, 27486, 27487, 27488) and International Classification of Diseases, 9th edition clinical modification procedure codes (81.54, 81.55, 81.59, 00.80-00.84) to identify all veterans who had undergone primary or revision TKR between July 2014 and December 2015 in VHA Veterans Integrated Service Network 1 (Maine, Vermont, New Hampshire, Massachusetts, Connecticut, Rhode Island, New York, Pennsylvania, West Virginia, Virginia, North Carolina). Because we focused on outcomes following hospital discharge, patients who died before discharge were excluded from the analysis. Preoperative CAN 1-year mortality score was chosen as the measure under the assumption that long-term survival may be the most meaningful of the 4 possible CAN score measures.
Our primary objective was to determine the distribution of preoperative CAN scores in the study population. Our secondary objective was to study the relationships between preoperative CAN 1-year mortality scores and both 1-year mortality and hospital LOS.
Study Variables
For each patient, we extracted the date of index surgery. The primary exposure, or independent variable, was the CAN score in the week prior to this date. Because prior work has shown that CAN score trajectories do not change significantly over time, the date-stamped CAN scores in the week before surgery represent what would have been available to clinicians in a preoperative setting.15 Since CAN scores are refreshed and overwritten every week, we extracted archived scores from the CDW.
For the 1-year survival outcome, the primary dependent variable, we queried the vital status files in the CDW for the date of death, if applicable. We confirmed survival beyond 1 year by examining vital signs in the CDW for a minimum of 2 independent encounters more than 1 year after the date of discharge. To compute the index LOS, the secondary outcome, we took the difference between the date of admission and the date of hospital discharge.
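The two outcome definitions above, LOS as a date difference and 1-year survival confirmed by at least 2 encounters more than a year after discharge, can be sketched in Python. The function names and dates are hypothetical; this is a sketch of the stated rules, not the authors' code.

```python
from datetime import date, timedelta

def index_los(admit, discharge):
    """Index length of stay in whole days (discharge date minus admission date)."""
    return (discharge - admit).days

def survived_one_year(discharge, death_date, encounter_dates):
    """Survival beyond 1 year of discharge: no recorded death within 365 days,
    confirmed by at least 2 independent encounters after that window.
    A sketch of the rule described in the text, not the authors' code."""
    cutoff = discharge + timedelta(days=365)
    if death_date is not None and death_date <= cutoff:
        return False
    late_encounters = [d for d in encounter_dates if d > cutoff]
    return len(late_encounters) >= 2

# Hypothetical patient: discharged Jan 1, 2015, seen twice in early 2016.
print(index_los(date(2015, 1, 1), date(2015, 1, 4)))  # 3-day stay
print(survived_one_year(date(2015, 1, 1), None,
                        [date(2016, 2, 1), date(2016, 3, 1)]))  # True
```

Requiring 2 post-window encounters guards against misclassifying patients who were simply lost to follow-up as survivors.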
Statistical Methods
The parameters and performance of the multivariable logistic regression models developed to compute the various CAN mortality and hospitalization risk scores have been previously described.12 Briefly, Wang and colleagues created parsimonious regression models using backward selection. Model discrimination was evaluated using the C (concordance) statistic. Model calibration was assessed by comparing predicted vs observed event rates by risk deciles and performing Cox proportional hazards regression.
We plotted histograms to display preoperative CAN scores as a simple measure of distribution (Figure 1). We also examined the cumulative proportion of patients at each preoperative CAN 1-year mortality score.
Using a conventional t test, we compared means of preoperative CAN 1-year mortality scores in patients who survived vs those who died within 1 year. We also constructed a plot of the proportion of patients who had died within 1 year vs preoperative CAN 1-year mortality scores. Kaplan-Meier curves were then constructed examining 1-year survival by CAN 1-year mortality score by terciles.
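The comparison of mean preoperative scores between survivors and nonsurvivors is a standard two-sample test. A minimal sketch follows using Welch's t statistic (a t-test variant that tolerates unequal group variances); all scores below are invented for illustration, and the article itself reports only that a conventional t test was used.

```python
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples. Sketches the
    survivor vs nonsurvivor comparison of mean preoperative CAN scores;
    the data are hypothetical, not from the study."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances
    se = (va / na + vb / nb) ** 0.5                  # standard error of the difference
    return (mean(sample_a) - mean(sample_b)) / se

survivor_scores = [40, 45, 47, 50, 52]   # hypothetical CAN 1-year mortality scores
nonsurvivor_scores = [60, 66, 70, 75]
t_stat = welch_t(survivor_scores, nonsurvivor_scores)
print(round(t_stat, 2))  # strongly negative: survivors score lower on average
```

A large-magnitude statistic here corresponds to the highly significant mean difference (47 vs 66, P < .001) reported in the Results.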
Finally, we examined the relationship between preoperative CAN 1-year mortality scores and index LOS in 2 ways: We plotted LOS across CAN scores, and we constructed LOESS (locally estimated scatterplot smoothing) curves.
Results
We identified 8206 patients who had undergone a TKR over the 18-month study period. The overall mean (SD) for age was 65 (8.41) years; 93% were male, and 78% were White veterans. Patient demographics are well described in a previous publication.16,17
For the CAN score models, C-statistics for the 90-day outcome models were as follows: 0.833 for the model predicting hospitalization (95% CI, 0.832-0.834); 0.865 for the model predicting death (95% CI, 0.863-0.876); and 0.811 for the model predicting either event (95% CI, 0.810-0.812). C-statistics for the 1-year outcome models were 0.809 for the model predicting hospitalization (95% CI, 0.808-0.810); 0.851 for the model predicting death (95% CI, 0.849-0.852); and 0.787 for the model predicting either event (95% CI, 0.786-0.787). Models were well calibrated with α = 0 and β = 1, demonstrating strong agreement between observed and predicted event rates.
The distribution of preoperative CAN 1-year mortality scores was close to normal (median, 50; interquartile range, 40; mean [SD], 48 [25.6]) (eTable). The original CAN score models were developed with an equal number of patients in each stratum and, as such, the scores are normally distributed.12 Our cohort showed a similar pattern of distribution. Distributions of the remaining preoperative CAN scores (90-day mortality, 1-year hospitalization, 90-day hospitalization) are shown in Figures 2, 3, and 4. Not surprisingly, histograms for both 90-day and 1-year hospitalization were skewed toward higher scores, indicating that these patients were expected to be hospitalized in the near future.
Overall, 1.4% (110/8096) of patients died within 1 year of surgery. Comparing 1-year mortality CAN scores in survivors vs nonsurvivors, we found statistically significant differences in means (47 vs 66 respectively, P < .001) and medians (45 vs 75 respectively, P < .001) (Table 2). In the plot examining the relationship between preoperative 1-year mortality CAN scores and 1-year mortality, the percentage who died within 1 year increased initially for patients with CAN scores > 60 and again exponentially for patients with CAN scores > 80. Examining Kaplan-Meier curves, we found that survivors and nonsurvivors separated early after surgery, and the differences between the top tercile and the middle/lower terciles were statistically significant (P < .001). Mortality rates were about 0.5% in the lower and middle terciles but about 2% in the upper tercile (Figure 5).
In the plot examining the relationship between CAN scores and index LOS, the LOS rose significantly beyond a CAN score of 60 and dramatically beyond a CAN score of 80 (Figure 6). LOESS curves also showed 2 inflection points suggesting an incremental and sequential rise in the LOS with increasing CAN scores (Figure 7). Mean (SD) LOS in days for the lowest to highest terciles was 2.6 (1.7), 2.8 (2.1), and 3.6 (2.2), respectively.
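The tercile grouping behind the LOS and mortality comparisons above can be sketched as follows: patients are ordered by preoperative CAN score, split into equal thirds, and summarized per group. The scores and stays here are invented for illustration, not study data.

```python
def tercile_mean_los(scores, los):
    """Mean LOS within terciles of preoperative CAN score (lowest to highest).
    A sketch of the grouping reported above; the data are invented."""
    paired = sorted(zip(scores, los))            # order patients by CAN score
    n = len(paired)
    groups = [paired[: n // 3],
              paired[n // 3 : 2 * n // 3],
              paired[2 * n // 3 :]]
    return [round(sum(stay for _, stay in g) / len(g), 2) for g in groups]

can_scores = [10, 20, 30, 40, 50, 60, 70, 80, 90]   # hypothetical patients
stays =      [2,  2,  3,  2,  3,  3,  4,  4,  5]
print(tercile_mean_los(can_scores, stays))  # mean LOS rises in the top tercile
```

This mirrors the reported pattern in which mean LOS climbed from 2.6 to 3.6 days between the lowest and highest terciles.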
Discussion
CAN scores are automatically generated each week by EHR-based multivariable risk models. These scores have excellent predictive accuracy for 90-day and 1-year mortality and hospitalization and are routinely used by VHA primary care teams to assist with clinical operations.13 We studied the distribution of CAN 1-year mortality scores in a preoperative context and examined relationships of the preoperative CAN 1-year mortality scores with postoperative mortality and LOS in 8206 veterans who underwent TKR.
There are several noteworthy findings. First, the overall 1-year mortality rate observed following TKR (1.4%) was similar to other published reports.18,19 Not surprisingly, preoperative CAN 1-year mortality scores were significantly higher in veterans who died compared with those of survivors. The majority of patients who died had a preoperative CAN 1-year mortality score > 75 while most who survived had a preoperative CAN 1-year mortality score < 45 (P < .001). Interestingly, the same scores showed a nonlinear correlation with LOS. Index LOS was about 4 days in patients in the highest tercile of CAN scores vs 2.5 days in the lowest tercile, but the initial increase in LOS was detected at a CAN score of about 55 to 60.
In addition, mortality rate varied widely across segments of the population when grouped according to preoperative CAN scores. One-year mortality rates in the highest tercile reached 2%, about 4-fold higher than that of the lower terciles (0.5%). Examination of the Kaplan-Meier curves showed that this difference in mortality between the highest tercile and the lower 2 groups appears soon after discharge and continues to increase over time, suggesting that the factors contributing to the increased mortality are present at the time of discharge and persist beyond the postoperative period. In summary, although CAN scores were not designed for use in the perioperative context, we found that preoperative CAN 1-year mortality scores were broadly predictive of both increased hospital LOS following elective TKR and mortality over the year after TKR.
Our findings raise several important questions. The decision to undergo elective surgery is complex. Arguably, individuals who undergo elective knee replacement should be healthy enough to undergo, recover from, and reap the benefits of a procedure that does not extend life. The distribution of preoperative CAN 1-year mortality scores for our study population was similar to that of the general VHA enrollee population, with similar measured mortality rates (≤ 0.5% vs ≥ 1.7% in the low and high terciles, respectively).1 Further study comparing outcomes in matched cohorts who did and did not undergo joint replacement would be of interest. In the interim, the association of high but not extreme CAN scores with increased hospital LOS could be used to guide allocation of resources to this group, mitigating the increased cost and risk to which it is exposed. The additional insight afforded by CAN scores may also enhance shared decision making by identifying patients at the very highest risk (eg, 1-year mortality CAN score ≥ 90) who conceivably might not survive long enough to recover from and enjoy their reconstructed knee and who might, in the long run, be harmed by undergoing the procedure.
Many total joint arthroplasties are performed in older patients, a population in which frailty is increasingly recognized as a significant risk factor for poor outcomes.20,21 CAN scores reliably identify high-risk patients and have been shown to correlate with frailty in this group.22 Multiple authors have reported improved outcomes and cost reductions after implementation of programs targeting modifiable risk factors in high-risk surgical candidates.23-25 A preoperative assessment that includes the CAN score may be valuable in identifying patients who would benefit most from prehabilitation programs or other interventions designed to blunt the impact of frailty. Admittedly, many elements used to calculate the CAN score would not be considered modifiable, especially in the short term. However, specific contributors to frailty, such as nutritional status and polypharmacy, might be potential candidates. As with all multivariable risk prediction models, there are multiple paths to a high CAN score, and further research to identify clinically relevant subgroups may help inform efforts to improve perioperative care within this population.
Hospital LOS is of intense interest for many reasons, not least its utility as a surrogate for cost and for increased risk of immediate perioperative adverse events, such as multidrug-resistant hospital-acquired infections, need for postacute facility-based rehabilitation, and deconditioning that increases the risk of falls and fractures in the older population.26-29 Its importance is further magnified by the COVID-19 pandemic, in which restarting elective surgery programs has changed the traditional criteria by which patients are scheduled for surgery.
We have shown that elevated CAN scores can identify patients at risk for extended hospital stays and, as such, may provide useful additional data for allocating scarce operating room time and other resources for optimal patient and health care provider safety.30,31 Individual surgeons and hospital systems would, of course, decide which patients should be triaged to go first based on local priorities; however, choosing lower risk patients with minimal risk of morbidity and mortality while pursuing prehabilitation for higher risk patients is a reasonable approach.
Limitations
Our study has several limitations. Only a single surgical procedure was included, albeit the most common one performed in the VHA. In addition, no information was available concerning the precise clinical course for these patients, such as the duration of surgery, the anesthetic technique, and the management of the acute perioperative course. Although we assumed that patients received standard care such that these factors would not affect either their mortality or their LOS out of proportion to their preoperative clinical status, confounding cannot be excluded. Therefore, further study is necessary to determine whether CAN scores can accurately predict mortality and/or LOS for patients undergoing other procedures. Further, a clinical trial is required to assess whether systematic provision of the CAN score at the point of surgery would impact care and, more importantly, outcomes. In addition, multivariable analyses including and excluding various components of the CAN score models were not performed, and model calibration and discrimination in this particular setting were not validated. Nevertheless, CAN scores could currently be made available to the surgical and anesthesia communities at minimal or no cost and are updated automatically.
Because our interest is in applying an existing resource to a current clinical and operational problem rather than in creating or validating a new tool, we chose to test the simple bivariate relationship between preoperative CAN scores and outcomes. We chose the preoperative 1-year mortality CAN score from among the 4 options under the assumption that long-term survival is the most meaningful of the 4 candidate outcomes. Finally, while CAN scores are currently calculated and generated only for patients cared for within the VHA, few of the data elements are unavailable to civilian health systems. The most problematic would be documentation of actual prescription filling, but this is a topic of increasing interest to the medical and academic communities, and we hope that access to such information will improve.32-34
Conclusions
Although designed for use by VHA primary care teams, CAN scores also may have value for perioperative clinicians, predicting mortality and prolonged hospital LOS in those with elevated 1-year mortality scores. The advantages of CAN scores relative to other perioperative risk calculators lie in their ability to predict long-term rather than 30-day survival and in their automatic, near-real-time generation for all patients who receive care in VHA ambulatory clinics. Further study is needed to determine their practical utility in shared decision making, preoperative evaluation and optimization, and perioperative resource allocation.
Acknowledgments
This work was supported by the US Department of Veterans Affairs (VA) National Center for Patient Safety, Field Office 10A4E, through the Patient Safety Center of Inquiry at the Durham VA Medical Center in North Carolina. The study also received support from the Center of Innovation to Accelerate Discovery and Practice Transformation (CIN 13-410) at the Durham VA Health Care System.
Risk calculators can be of great value in guiding clinical decision making, patient-centered precision medicine, and resource allocation.1 Several perioperative risk prediction models have emerged in recent decades that estimate specific hazards (eg, cardiovascular complications after noncardiac surgery) with varying accuracy and utility. In the perioperative sphere, the time windows are often limited to an index hospitalization or 30 days following surgery or discharge.2-9 Although longer periods are of interest to patients, families, and health systems, few widely used or validated models are designed to look beyond this very narrow window.10,11 In addition, perioperative risk prediction models do not routinely incorporate parameters of a wide variety of health or demographic domains, such as patterns of health care, health care utilization, or medication use.
In 2013, in response to the need for near real-time information to guide delivery of enhanced care management services, the Veterans Health Administration (VHA) Office of Informatics and Analytics developed automated risk prediction models that used detailed electronic health record (EHR) data. These models were used to report Care Assessment Need (CAN) scores each week for all VHA enrollees and include data from a wide array of health domains. These CAN scores predict the risk for hospitalization, death, or either event within 90 days and 1 year.12,13 Each score is reported as both a predicted probability (0-1) and as a percentile in relation to all other VHA enrollees (a value between 1 and 99).13 The data used to calculate CAN scores are listed in Table 1.12
Surgical procedures or admissions would not be differentiated from nonsurgical admissions or other procedural clinic visits, and as such, it is not possible to isolate the effect of undergoing a surgical procedure from another health-related event on the CAN score. At the same time though, a short-term increase in system utilization caused by an elective surgical procedure such as a total knee replacement (TKR) would presumably be reflected in a change in CAN score, but this has not been studied.
Since their introduction, CAN scores have been routinely accessed by primary care teams and used to facilitate care coordination for thousands of VHA patients. However, these CAN scores are currently not available to VHA surgeons, anesthesiologists, or other perioperative clinicians. In this study, we examine the distributions of preoperative CAN scores and explore the relationships of preoperative CAN 1-year mortality scores with 1-year survival following discharge and length of stay (LOS) during index hospitalization in a cohort of US veterans who underwent TKR, the most common elective operation performed within the VHA system.
Methods
Following approval of the Durham Veterans Affairs Medical Center Institutional Review Board, all necessary data were extracted from the VHA Corporate Data Warehouse (CDW) repository.14 Informed consent was waived due to the minimal risk nature of the study.
We used Current Procedural Terminology codes (27438, 27446, 27447, 27486, 27487, 27488) and International Classification of Diseases, 9th edition clinical modification procedure codes (81.54, 81.55, 81.59, 00.80-00.84) to identify all veterans who had undergone primary or revision TKR between July 2014 and December 2015 in VHA Veterans Integrated Service Network 1 (Maine, Vermont, New Hampshire, Massachusetts, Connecticut, Rhode Island, New York, Pennsylvania, West Virginia, Virginia, North Carolina). Because we focused on outcomes following hospital discharge, patients who died before discharge were excluded from the analysis. Preoperative CAN 1-year mortality score was chosen as the measure under the assumption that long-term survival may be the most meaningful of the 4 possible CAN score measures.
Our primary objective was to determine distribution of preoperative CAN scores in the study population. Our secondary was to study relationships among the preoperative CAN 1-year mortality scores and 1-year mortality and hospital LOS.
Study Variables
For each patient, we extracted the date of index surgery. The primary exposure or independent variable was the CAN score in the week prior to this date. Because prior study has shown that CAN scores trajectories do not significantly change over time, the date-stamped CAN scores in the week before surgery represent what would have been available to clinicians in a preoperative setting.15 Since CAN scores are refreshed and overwritten every week, we extracted archived scores from the CDW.
For the 1-year survival outcome, the primary dependent variable, we queried the vital status files in the CDW for the date of death if applicable. We confirmed survival beyond 1 year by examining vital signs in the CDW for a minimum of 2 independent encounters beyond 1 year after the date of discharge. To compute the index LOS, the secondary outcome, we computed the difference between the date of admission and date of hospital discharge.
Statistical Methods
The parameters and performance of the multivariable logistic regression models developed to compute the various CAN mortality and hospitalization risk scores have been previously described.12 Briefly, Wang and colleagues created parsimonious regression models using backward selection. Model discrimination was evaluated using C (concordance)-statistic. Model calibration was assessed by comparing predicted vs observed event rates by risk deciles and performing Cox proportional hazards regression.
We plotted histograms to display preoperative CAN scores as a simple measure of distribution (Figure 1). We also examined the cumulative proportion of patients at each preoperative CAN 1-year mortality score.
Using a conventional t test, we compared means of preoperative CAN 1-year mortality scores in patients who survived vs those who died within 1 year. We also constructed a plot of the proportion of patients who had died within 1 year vs preoperative CAN 1-year mortality scores. Kaplan-Meier curves were then constructed examining 1-year survival by CAN 1-year mortality score by terciles.
Finally, we examined the relationship between preoperative CAN 1-year mortality scores and index LOS in 2 ways: We plotted LOS across CAN scores, and we constructed a
Results
We identified 8206 patients who had undergone a TKR over the 18-month study period. The overall mean (SD) for age was 65 (8.41) years; 93% were male, and 78% were White veterans. Patient demographics are well described in a previous publication.16,17
In terms of model parameters for the CAN score models, C-statistics for the 90-day outcome models were as follows: 0.833 for the model predicting hospitalization (95% CI, 0.832-0.834); 0.865 for the model predicting death (95% CI, 0.863-0.876); and 0.811 for the model predicting either event (95% CI, 0.810-0.812). C-statistics for the 1-year outcome models were 0.809 for the model predicting hospitalization (95% CI, 0.808-0.810); 0.851 for the model predicting death (95% CI, 0.849-0.852); and 0.787 for the model predicting either event (95% CI, 0.786-0.787). Models were well calibrated with α = 0 and β = 1, demonstrating strong agreement between observed and predicted event rates.
The distribution of preoperative CAN 1-year mortality scores was close to normal (median, 50; interquartile range, 40; mean [SD] 48 [25.6]) (eTable). The original CAN score models were developed having an equal number of patients in each strata and as such, are normally distributed.12 Our cohort was similar in pattern of distribution. Distributions of the remaining preoperative CAN scores (90-day mortality, 1-year hospitalization, 90-day hospitalization) are shown in Figures 2, 3, and 4. Not surprisingly, histograms for both 90-day and 1-year hospitalization were skewed toward higher scores, indicating that these patients were expected to be hospitalized in the near future.
Overall, 1.4% (110/8096) of patients died within 1 year of surgery. Comparing 1-year mortality CAN scores in survivors vs nonsurvivors, we found statistically significant differences in means (47 vs 66 respectively, P < .001) and medians (45 vs 75 respectively, P < .001) (Table 2). In the plot examining the relationship between preoperative 1-year mortality CAN scores and 1-year mortality, the percentage who died within 1 year increased initially for patients with CAN scores > 60 and again exponentially for patients with CAN scores > 80. Examining Kaplan-Meier curves, we found that survivors and nonsurvivors separated early after surgery, and the differences between the top tercile and the middle/lower terciles were statistically significant (P < .001). Mortality rates were about 0.5% in the lower and middle terciles but about 2% in the upper tercile (Figure 5).
In the plot examining the relationship between CAN scores and index LOS, the LOS rose significantly beyond a CAN score of 60 and dramatically beyond a CAN score of 80 (Figure 6). LOESS curves also showed 2 inflection points suggesting an incremental and sequential rise in the LOS with increasing CAN scores (Figure 7). Mean (SD) LOS in days for the lowest to highest terciles was 2.6 (1.7), 2.8 (2.1), and 3.6 (2.2), respectively.
Discussion
CAN scores are automatically generated each week by EHR-based multivariable risk models. These scores have excellent predictive accuracy for 90-day and 1-year mortality and hospitalization and are routinely used by VHA primary care teams to assist with clinical operations.13 We studied the distribution of CAN 1-year mortality scores in a preoperative context and examined relationships of the preoperative CAN 1-year mortality scores with postoperative mortality and LOS in 8206 veterans who underwent TKR.
There are several noteworthy findings. First, the overall 1-year mortality rate observed following TKR (1.4%) was similar to that in other published reports.18,19 Not surprisingly, preoperative CAN 1-year mortality scores were significantly higher in veterans who died than in those who survived. The majority of patients who died had a preoperative CAN 1-year mortality score > 75, while most who survived had a preoperative CAN 1-year mortality score < 45 (P < .001). Interestingly, the same scores showed a nonlinear correlation with LOS. Index LOS was about 4 days in patients in the highest tercile of CAN scores vs 2.5 days in the lowest tercile, but the initial increase in LOS was detected at a CAN score of about 55 to 60.
In addition, mortality rate varied widely across segments of the population when grouped according to preoperative CAN scores. One-year mortality rates in the highest tercile reached 2%, about 4-fold higher than that of the lower terciles (0.5%). Examination of the Kaplan-Meier curves showed that this difference in mortality between the highest tercile and the lower 2 groups appears soon after discharge and continues to increase over time, suggesting that the factors contributing to the increased mortality are present at the time of discharge and persist beyond the postoperative period. In summary, although CAN scores were not designed for use in the perioperative context, we found that preoperative CAN 1-year mortality scores are broadly predictive of both increases in hospital LOS following elective TKA and mortality over the year after TKA.
Our findings raise several important questions. The decision to undergo elective surgery is complex. Arguably, individuals who undergo elective knee replacement should be healthy enough to undergo, recover from, and reap the benefits of a procedure that does not extend life. The distribution of preoperative CAN 1-year mortality scores for our study population was similar to that of the general VHA enrollee population, with similar measured mortality rates (≤ 0.5% vs ≥ 1.7% in the low and high terciles, respectively).1 Further study comparing outcomes in matched cohorts who did and did not undergo joint replacement would be of interest. In the meantime, the association of high but not extreme CAN scores with increased hospital LOS could potentially be used to guide allocation of resources to this group, mitigating the increased cost and risk to which it is exposed. The additional insight afforded by CAN scores may also enhance shared decision making by identifying patients at the very highest risk (eg, 1-year mortality CAN score ≥ 90) who conceivably might not survive long enough to recover from and enjoy their reconstructed knee, and who might in the long run be harmed by undergoing the procedure.
Many total joint arthroplasties are performed in older patients, a population in which frailty is increasingly recognized as a significant risk factor for poor outcomes.20,21 CAN scores reliably identify high-risk patients and have been shown to correlate with frailty in this group.22 Multiple authors have reported improved outcomes and cost reductions after implementation of programs targeting modifiable risk factors in high-risk surgical candidates.23-25 A preoperative assessment that includes the CAN score may be valuable in identifying patients who would benefit most from prehabilitation programs or other interventions designed to blunt the impact of frailty. It is true that many elements used to calculate the CAN score would not be considered modifiable, especially in the short term. However, specific contributors to frailty, such as nutritional status and polypharmacy, might be potential candidates. As with all multivariable risk prediction models, there are multiple paths to a high CAN score, and further research to identify clinically relevant subgroups may help inform efforts to improve perioperative care within this population.
Hospital LOS is of intense interest for many reasons, not least its utility as a surrogate for cost and its association with increased risk of immediate perioperative adverse events, such as multidrug-resistant hospital-acquired infections, need for postacute facility-based rehabilitation, and deconditioning that increases the risk of falls and fractures in the older population.26-29 Its importance is further magnified in the context of the COVID-19 pandemic, in which restarting elective surgery programs has changed the traditional criteria by which patients are scheduled for surgery.
We have shown that elevated CAN scores can identify patients at risk for extended hospital stays and, as such, may provide useful additional data for allocating scarce operating room time and other resources for optimal patient and health care provider safety.30,31 Individual surgeons and hospital systems would, of course, decide which patients should be triaged to go first based on local priorities; however, choosing lower risk patients with minimal risk of morbidity and mortality while pursuing prehabilitation for higher risk patients is a reasonable approach.
Limitations
Our study has several limitations. Only a single surgical procedure was included, albeit the most common one performed in the VHA. In addition, no information was available concerning the precise clinical course for these patients, such as the duration of surgery, anesthetic technique, and management of the acute perioperative course. Although we assumed that patients received standard care such that these factors would not affect either their mortality or their LOS out of proportion to their preoperative clinical status, confounding cannot be excluded. Therefore, further study is necessary to determine whether CAN scores can accurately predict mortality and/or LOS for patients undergoing other procedures. Further, a clinical trial is required to assess whether systematic provision of the CAN score at the point of surgery would affect care and, more importantly, outcomes. In addition, we did not perform multivariable analyses including and excluding various components of the CAN score models. Currently, CAN scores could be made available to the surgical/anesthesia communities at minimal or no cost and are updated automatically. Model calibration and discrimination in this particular setting were not validated.
Because our interest is in applying an existing resource to a current clinical and operational problem rather than in creating or validating a new tool, we chose to test the simple bivariate relationship between preoperative CAN scores and outcomes. We chose the preoperative 1-year mortality CAN score from among the 4 options under the assumption that long-term survival is the most meaningful of the 4 candidate outcomes. Finally, while CAN scores are currently calculated only for patients cared for within the VHA, few of the underlying data elements are unavailable to civilian health systems. The most problematic would be documentation of actual prescription filling, but this is a topic of increasing interest to the medical and academic communities, and we hope that access to such information will improve.32-34
Conclusions
Although designed for use by VHA primary care teams, CAN scores also may have value for perioperative clinicians, predicting mortality and prolonged hospital LOS in those with elevated 1-year mortality scores. The advantages of CAN scores relative to other perioperative risk calculators lie in their ability to predict long-term rather than 30-day survival and in their automatic generation on a near-real-time basis for all patients who receive care in VHA ambulatory clinics. Further study is needed to determine their practical utility in shared decision making, preoperative evaluation and optimization, and perioperative resource allocation.
Acknowledgments
This work was supported by the US Department of Veterans Affairs (VA) National Center for Patient Safety, Field Office 10A4E, through the Patient Safety Center of Inquiry at the Durham VA Medical Center in North Carolina. The study also received support from the Center of Innovation to Accelerate Discovery and Practice Transformation (CIN 13-410) at the Durham VA Health Care System.
1. McNair AGK, MacKichan F, Donovan JL, et al. What surgeons tell patients and what patients want to know before major cancer surgery: a qualitative study. BMC Cancer. 2016;16:258. doi:10.1186/s12885-016-2292-3
2. Grover FL, Hammermeister KE, Burchfiel C. Initial report of the Veterans Administration Preoperative Risk Assessment Study for Cardiac Surgery. Ann Thorac Surg. 1990;50(1):12-26; discussion 27-18. doi:10.1016/0003-4975(90)90073-f
3. Khuri SF, Daley J, Henderson W, et al. The National Veterans Administration Surgical Risk Study: risk adjustment for the comparative assessment of the quality of surgical care. J Am Coll Surg. 1995;180(5):519-531.
4. Glance LG, Lustik SJ, Hannan EL, et al. The Surgical Mortality Probability Model: derivation and validation of a simple risk prediction rule for noncardiac surgery. Ann Surg. 2012;255(4):696-702. doi:10.1097/SLA.0b013e31824b45af
5. Keller DS, Kroll D, Papaconstantinou HT, Ellis CN. Development and validation of a methodology to reduce mortality using the veterans affairs surgical quality improvement program risk calculator. J Am Coll Surg. 2017;224(4):602-607. doi:10.1016/j.jamcollsurg.2016.12.033
6. Bilimoria KY, Liu Y, Paruch JL, et al. Development and evaluation of the universal ACS NSQIP surgical risk calculator: a decision aid and informed consent tool for patients and surgeons. J Am Coll Surg. 2013;217(5):833-842.e831-833. doi:10.1016/j.jamcollsurg.2013.07.385
7. Ford MK, Beattie WS, Wijeysundera DN. Systematic review: prediction of perioperative cardiac complications and mortality by the revised cardiac risk index. Ann Intern Med. 2010;152(1):26-35. doi:10.7326/0003-4819-152-1-201001050-00007
8. Gupta PK, Gupta H, Sundaram A, et al. Development and validation of a risk calculator for prediction of cardiac risk after surgery. Circulation. 2011;124(4):381-387. doi:10.1161/CIRCULATIONAHA.110.015701
9. Lee TH, Marcantonio ER, Mangione CM, et al. Derivation and prospective validation of a simple index for prediction of cardiac risk of major noncardiac surgery. Circulation. 1999;100(10):1043-1049. doi:10.1161/01.cir.100.10.1043
10. Smith T, Li X, Nylander W, Gunnar W. Thirty-day postoperative mortality risk estimates and 1-year survival in Veterans Health Administration surgery patients. JAMA Surg. 2016;151(5):417-422. doi:10.1001/jamasurg.2015.4882
11. Damhuis RA, Wijnhoven BP, Plaisier PW, Kirkels WJ, Kranse R, van Lanschot JJ. Comparison of 30-day, 90-day and in-hospital postoperative mortality for eight different cancer types. Br J Surg. 2012;99(8):1149-1154. doi:10.1002/bjs.8813
12. Wang L, Porter B, Maynard C, et al. Predicting risk of hospitalization or death among patients receiving primary care in the Veterans Health Administration. Med Care. 2013;51(4):368-373. doi:10.1016/j.amjcard.2012.06.038
13. Fihn SD, Francis J, Clancy C, et al. Insights from advanced analytics at the Veterans Health Administration. Health Aff (Millwood). 2014;33(7):1203-1211. doi:10.1377/hlthaff.2014.0054
14. Noël PH, Copeland LA, Perrin RA, et al. VHA Corporate Data Warehouse height and weight data: opportunities and challenges for health services research. J Rehabil Res Dev. 2010;47(8):739-750. doi:10.1682/jrrd.2009.08.0110
15. Wong ES, Yoon J, Piegari RI, Rosland AM, Fihn SD, Chang ET. Identifying latent subgroups of high-risk patients using risk score trajectories. J Gen Intern Med. 2018;33(12):2120-2126. doi:10.1007/s11606-018-4653-x
16. Chen Q, Hsia HL, Overman R, et al. Impact of an opioid safety initiative on patients undergoing total knee arthroplasty: a time series analysis. Anesthesiology. 2019;131(2):369-380. doi:10.1097/ALN.0000000000002771
17. Hsia HL, Takemoto S, van de Ven T, et al. Acute pain is associated with chronic opioid use after total knee arthroplasty. Reg Anesth Pain Med. 2018;43(7):705-711. doi:10.1097/AAP.0000000000000831
18. Inacio MCS, Dillon MT, Miric A, Navarro RA, Paxton EW. Mortality after total knee and total hip arthroplasty in a large integrated health care system. Perm J. 2017;21:16-171. doi:10.7812/TPP/16-171
19. Lee QJ, Mak WP, Wong YC. Mortality following primary total knee replacement in public hospitals in Hong Kong. Hong Kong Med J. 2016;22(3):237-241. doi:10.12809/hkmj154712
20. Lin HS, Watts JN, Peel NM, Hubbard RE. Frailty and post-operative outcomes in older surgical patients: a systematic review. BMC Geriatr. 2016;16(1):157. doi:10.1186/s12877-016-0329-8
21. Shinall MC Jr, Arya S, Youk A, et al. Association of preoperative patient frailty and operative stress with postoperative mortality. JAMA Surg. 2019;155(1):e194620. doi:10.1001/jamasurg.2019.4620
22. Ruiz JG, Priyadarshni S, Rahaman Z, et al. Validation of an automatically generated screening score for frailty: the care assessment need (CAN) score. BMC Geriatr. 2018;18(1):106. doi:10.1186/s12877-018-0802-7
23. Bernstein DN, Liu TC, Winegar AL, et al. Evaluation of a preoperative optimization protocol for primary hip and knee arthroplasty patients. J Arthroplasty. 2018;33(12):3642-3648. doi:10.1016/j.arth.2018.08.018
24. Sodhi N, Anis HK, Coste M, et al. A nationwide analysis of preoperative planning on operative times and postoperative complications in total knee arthroplasty. J Knee Surg. 2019;32(11):1040-1045. doi:10.1055/s-0039-1677790
25. Krause A, Sayeed Z, El-Othmani M, Pallekonda V, Mihalko W, Saleh KJ. Outpatient total knee arthroplasty: are we there yet? (part 1). Orthop Clin North Am. 2018;49(1):1-6. doi:10.1016/j.ocl.2017.08.002
26. Barrasa-Villar JI, Aibar-Remón C, Prieto-Andrés P, Mareca-Doñate R, Moliner-Lahoz J. Impact on morbidity, mortality, and length of stay of hospital-acquired infections by resistant microorganisms. Clin Infect Dis. 2017;65(4):644-652. doi:10.1093/cid/cix411
27. Nikkel LE, Kates SL, Schreck M, Maceroli M, Mahmood B, Elfar JC. Length of hospital stay after hip fracture and risk of early mortality after discharge in New York state: retrospective cohort study. BMJ. 2015;351:h6246. doi:10.1136/bmj.h6246
28. Marfil-Garza BA, Belaunzarán-Zamudio PF, Gulias-Herrero A, et al. Risk factors associated with prolonged hospital length-of-stay: 18-year retrospective study of hospitalizations in a tertiary healthcare center in Mexico. PLoS One. 2018;13(11):e0207203. doi:10.1371/journal.pone.0207203
29. Hirsch CH, Sommers L, Olsen A, Mullen L, Winograd CH. The natural history of functional morbidity in hospitalized older patients. J Am Geriatr Soc. 1990;38(12):1296-1303. doi:10.1111/j.1532-5415.1990.tb03451.x
30. Iyengar KP, Jain VK, Vaish A, Vaishya R, Maini L, Lal H. Post COVID-19: planning strategies to resume orthopaedic surgery - challenges and considerations. J Clin Orthop Trauma. 2020;11(suppl 3):S291-S295. doi:10.1016/j.jcot.2020.04.028
31. O’Connor CM, Anoushiravani AA, DiCaprio MR, Healy WL, Iorio R. Economic recovery after the COVID-19 pandemic: resuming elective orthopedic surgery and total joint arthroplasty. J Arthroplasty. 2020;35(suppl 7):S32-S36. doi:10.1016/j.arth.2020.04.038
32. Mauseth SA, Skurtveit S, Skovlund E, Langhammer A, Spigset O. Medication use and association with urinary incontinence in women: data from the Norwegian Prescription Database and the HUNT study. Neurourol Urodyn. 2018;37(4):1448-1457. doi:10.1002/nau.23473
33. Sultan RS, Correll CU, Schoenbaum M, King M, Walkup JT, Olfson M. National patterns of commonly prescribed psychotropic medications to young people. J Child Adolesc Psychopharmacol. 2018;28(3):158-165. doi:10.1089/cap.2017.0077
34. McCoy RG, Dykhoff HJ, Sangaralingham L, et al. Adoption of new glucose-lowering medications in the U.S.-the case of SGLT2 inhibitors: nationwide cohort study. Diabetes Technol Ther. 2019;21(12):702-712. doi:10.1089/dia.2019.0213
The Hospital Readmissions Reduction Program: Inconvenient Observations
Centers for Medicare and Medicaid Services (CMS)–promulgated quality metrics continue to attract critics. Physicians decry that many metrics are outside their control, while patient groups are frustrated that metrics lack meaning for beneficiaries. The Hospital Readmissions Reduction Program (HRRP) reduces payments for “excess” 30-day risk-standardized readmissions for six conditions and procedures, and may be less effective in reducing readmissions than previously reported due to intentional and increasing use of hospital observation stays.1
In this issue, Sheehy et al2 report that nearly one in five rehospitalizations were unrecognized because either the index hospitalization or the rehospitalization was an observation stay, highlighting yet another challenge with the HRRP. Limitations of their study include the use of a single year of claims data and the exclusion of Medicare Advantage claims data, as one might expect lower readmission rates in this capitated program. Opportunities for improving the HRRP could consist of updating the HRRP metric to include observation stays and, for surgical hospitalizations, extended-stay surgical recovery, wherein patients may be observed for up to 2 days following a procedure. Unfortunately, despite the HRRP missing nearly one in five readmissions, CMS would likely need additional statutory authority from Congress in order to reinterpret the definition of readmission3 to include observation stays.
Challenges with the HRRP metrics raise broader concerns about the program. For decades, administrators viewed readmissions as a utilization metric, only to have the Affordable Care Act re-designate and define all-cause readmissions as a quality metric. Yet hospitals and health systems control only some factors driving readmission. Readmissions occur for a variety of reasons, including not only poor quality of initial hospital care and inadequate care coordination, but also factors that are beyond the hospital’s purview, such as lack of access to ambulatory services, multiple and severe chronic conditions that progress or remain unresponsive to intervention,4 and demographic and social factors such as housing instability, health literacy, or residence in a food desert. These non-hospital factors reside within the domain of other market participants or local, state, and federal government agencies.
Challenges to the utility, validity, and appropriateness of HRRP metrics should remind policymakers of the dangers of over-legislating the details of healthcare policy and the statutory inflexibility that can ensue. Clinical care evolves, and artificial constructs—including payment categories such as observation status—may age poorly over time, exemplified best by the challenges of accessing post-acute care due to the 3-day rule.5 Introduced as a statutory requirement in 1967, when the average length of stay was 13.8 days and observation care did not exist as a payment category, the 3-day rule requires Medicare beneficiaries to spend 3 days admitted to the hospital in order to qualify for coverage of post-acute care, creating care gaps for observation stay patients.
Observation care itself is an artificial construct of CMS payment policy. In the Medicare program, observation care falls under Part B, exposing patients to both greater financial responsibility and billing complexity through the engagement of their supplemental insurance, even though those receiving observation care experience the same care as if hospitalized—routine monitoring, nursing care, blood draws, imaging, and diagnostic tests. While CMS requires notification of observation status and explanation of the difference in patient financial responsibility, in clinical practice, patient understanding is limited. Policymakers can support both Medicare beneficiaries and hospitals by reexamining observation care as a payment category.
Sheehy and colleagues’ work simultaneously challenges the face validity of the HRRP and the reasonableness of categorizing some inpatient stays as outpatient care in the hospital—issues that policymakers can and should address.
1. Sabbatini AK, Wright B. Excluding observation stays from readmission rates – what quality measures are missing. N Engl J Med. 2018;378(22):2062-2065. https://doi.org/10.1056/NEJMp1800732
2. Sheehy AM, Kaiksow F, Powell WR, et al. The hospital readmissions reduction program’s blind spot: observation hospitalizations. J Hosp Med. 2021;16(7):409-411. https://doi.org/10.12788/jhm.3634
3. The Patient Protection and Affordable Care Act, 42 USC 18001§3025 (2010).
4. Reuben DB, Tinetti ME. The hospital-dependent patient. N Engl J Med. 2014;370(8):694-697. https://doi.org/10.1056/NEJMp1315568
5. Patel N, Slota JM, Miller BJ. The continued conundrum of discharge to a skilled nursing facility after a medicare observation stay. JAMA Health Forum. 2020;1(5):e200577. https://doi.org/10.1001/jamahealthforum.2020.0577
Centers for Medicare and Medicaid Services (CMS)–promulgated quality metrics continue to attract critics. Physicians decry that many metrics are outside their control, while patient groups are frustrated that metrics lack meaning for beneficiaries. The Hospital Readmissions Reduction Program (HRRP) reduces payments for “excess” 30-day risk-standardized readmissions for six conditions and procedures, and may be less effective in reducing readmissions than previously reported due to intentional and increasing use of hospital observation stays.1
Centers for Medicare and Medicaid Services (CMS)–promulgated quality metrics continue to attract critics. Physicians decry that many metrics are outside their control, while patient groups are frustrated that metrics lack meaning for beneficiaries. The Hospital Readmissions Reduction Program (HRRP) reduces payments for “excess” 30-day risk-standardized readmissions for six conditions and procedures, and may be less effective in reducing readmissions than previously reported due to intentional and increasing use of hospital observation stays.1
In this issue, Sheehy et al2 report that nearly one in five rehospitalizations were unrecognized because either the index hospitalization or the rehospitalization was an observation stay, highlighting yet another challenge with the HRRP. Limitations of their study include the use of a single year of claims data and the exclusion of Medicare Advantage claims data, as one might expect lower readmission rates in this capitated program. Opportunities for improving the HRRP include updating the metric to count observation stays and, for surgical hospitalizations, extended-stay surgical recovery, wherein patients may be observed for up to 2 days following a procedure. Unfortunately, despite the HRRP missing nearly one in five readmissions, CMS would likely need additional statutory authority from Congress in order to reinterpret the definition of readmission3 to include observation stays.
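The scale of this blind spot can be illustrated with a toy calculation. The sketch below is not the HRRP's actual risk-standardized methodology; the stay records and the `readmissions` helper are hypothetical, and it simply counts 30-day rehospitalizations with and without observation stays in the denominator and numerator:

```python
from datetime import date

# Hypothetical stay records: (patient_id, admit, discharge, is_observation).
stays = [
    ("A", date(2021, 3, 1), date(2021, 3, 4), False),   # inpatient index
    ("A", date(2021, 3, 20), date(2021, 3, 21), True),  # observation return
    ("B", date(2021, 3, 2), date(2021, 3, 3), True),    # observation index
    ("B", date(2021, 3, 15), date(2021, 3, 18), False), # inpatient return
    ("C", date(2021, 3, 5), date(2021, 3, 8), False),
    ("C", date(2021, 3, 25), date(2021, 3, 28), False), # counted either way
]

def readmissions(stays, include_observation):
    """Count 30-day rehospitalizations; optionally ignore observation stays."""
    kept = [s for s in stays if include_observation or not s[3]]
    kept.sort(key=lambda s: (s[0], s[1]))
    count = 0
    for prev, nxt in zip(kept, kept[1:]):
        # Same patient, readmitted within 30 days of the prior discharge.
        if prev[0] == nxt[0] and (nxt[1] - prev[2]).days <= 30:
            count += 1
    return count

print(readmissions(stays, include_observation=False))  # inpatient-only: 1
print(readmissions(stays, include_observation=True))   # all stays: 3
```

In this contrived cohort, an inpatient-only definition sees one rehospitalization where counting observation stays reveals three, which is the mechanism behind the "one in five" finding.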
Challenges with the HRRP metrics raise broader concerns about the program. For decades, administrators viewed readmissions as a utilization metric, only to have the Affordable Care Act re-designate and define all-cause readmissions as a quality metric. Yet hospitals and health systems control only some factors driving readmission. Readmissions occur for a variety of reasons, including not only poor quality of initial hospital care and inadequate care coordination, but also factors that are beyond the hospital’s purview, such as lack of access to ambulatory services, multiple and severe chronic conditions that progress or remain unresponsive to intervention,4 and demographic and social factors such as housing instability, health literacy, or residence in a food desert. These non-hospital factors reside within the domain of other market participants or local, state, and federal government agencies.
Challenges to the utility, validity, and appropriateness of HRRP metrics should remind policymakers of the dangers of over-legislating the details of healthcare policy and the statutory inflexibility that can ensue. Clinical care evolves, and artificial constructs—including payment categories such as observation status—may age poorly over time, exemplified best by the challenges of accessing post-acute care due to the 3-day rule.5 Introduced as a statutory requirement in 1967, when the average length of stay was 13.8 days and observation care did not exist as a payment category, the 3-day rule requires Medicare beneficiaries to spend 3 days admitted to the hospital in order to qualify for coverage of post-acute care, creating care gaps for observation stay patients.
Observation care itself is an artificial construct of CMS payment policy. In the Medicare program, observation care falls under Part B, exposing patients to both greater financial responsibility and billing complexity through the engagement of their supplemental insurance, even though those receiving observation care experience the same care as if hospitalized—routine monitoring, nursing care, blood draws, imaging, and diagnostic tests. While CMS requires notification of observation status and explanation of the difference in patient financial responsibility, in clinical practice, patient understanding is limited. Policymakers can support both Medicare beneficiaries and hospitals by reexamining observation care as a payment category.
Sheehy and colleagues’ work simultaneously challenges the face validity of the HRRP and the reasonableness of categorizing some inpatient stays as outpatient care in the hospital—issues that policymakers can and should address.
1. Sabbatini AK, Wright B. Excluding observation stays from readmission rates – what quality measures are missing. N Engl J Med. 2018;378(22):2062-2065. https://doi.org/10.1056/NEJMp1800732
2. Sheehy AM, Kaiksow F, Powell WR, et al. The hospital readmissions reduction program’s blind spot: observation hospitalizations. J Hosp Med. 2021;16(7):409-411. https://doi.org/10.12788/jhm.3634
3. The Patient Protection and Affordable Care Act, 42 USC 18001§3025 (2010).
4. Reuben DB, Tinetti ME. The hospital-dependent patient. N Engl J Med. 2014;370(8):694-697. https://doi.org/10.1056/NEJMp1315568
5. Patel N, Slota JM, Miller BJ. The continued conundrum of discharge to a skilled nursing facility after a medicare observation stay. JAMA Health Forum. 2020;1(5):e200577. https://doi.org/10.1001/jamahealthforum.2020.0577
© 2021 Society of Hospital Medicine
Measuring Trainee Duty Hours: The Times They Are a-Changin’
“If your time to you is worth savin’
Then you better start swimmin’ or you’ll sink like a stone
For the times they are a-changin’...”
–Bob Dylan
The Accreditation Council for Graduate Medical Education requires residency programs to limit and track trainee work hours to reduce the risk of fatigue, burnout, and medical errors. These hours are documented most often by self-report, at the cost of additional administrative burden for trainees and programs, dubious accuracy, and potentially incentivizing misrepresentation.1
Thus, the study by Soleimani and colleagues2 in this issue is a welcome addition to the literature on duty-hours tracking. Using timestamp data from the electronic health record (EHR), the authors developed and collected validity evidence for an automated computerized algorithm to measure how much time trainees were spending on clinical work. The study was conducted at a large academic internal medicine residency program and tracked 203 trainees working 14,610 days. The authors compared their results to trainee self-report data. Though the approach centered on EHR access logs, it accommodated common scenarios of time away from the computer while at the hospital (eg, during patient rounds). Crucially, the algorithm included EHR access while at home. The absolute discrepancy between the algorithm and self-report averaged 1.38 hours per day. Notably, EHR work at home accounted for about an extra hour per day. When considering in-hospital work alone, the authors found 3% to 13% of trainees exceeding 80-hour workweek limits, but when adding out-of-hospital work, this percentage rose to 10% to 21%.
The authors used inventive methods to improve accuracy. They prespecified EHR functions that constituted active clinical work, classifying record review that involved neither editing notes nor placing orders as “educational study,” which they excluded from duty hours. They ensured that time spent off-site was included and that logins from personal devices while in-hospital were not double-counted. Caveats to the study include the limited generalizability for institutions without the computational resources to replicate the model. The authors acknowledged the inherent flaw in using trainee self-report as the “gold standard,” and potentially some subset of the results could have been corroborated with time-motion observation studies.3 The decision to exclude passive medical record review at home as work arguably discounts the integral value that the “chart biopsy” has on direct patient care; it probably led to systematic underestimation of duty hours for junior and senior residents, who may be most likely to contribute in this way. Similarly, not counting time spent with patients at the end of the day after sign-out risks undercounting hours as well. Nonetheless, this study represents a rigorously designed and scalable approach to meeting regulatory requirements that can potentially lighten the administrative task load for trainees, improve reporting accuracy, and facilitate research comparing work hours to other variables of interest (eg, efficiency). The model can be generalized to other specialties and could document workload for staff physicians as well.
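For readers curious how such an algorithm works mechanically, the following is a minimal sketch, not the authors' actual model: it sessionizes timestamped EHR access-log events, bridging short gaps away from the keyboard (eg, during rounds), and counts only an assumed set of "active" event types, excluding passive chart review in the spirit of the study's "educational study" category. The one-hour gap threshold and the event labels are illustrative assumptions.

```python
from datetime import datetime, timedelta

# Assumed gap threshold: active EHR events closer together than this are
# treated as one continuous work session (bridging bedside time between
# keyboard interactions). The published algorithm's parameters differ.
SESSION_GAP = timedelta(hours=1)

# Event types assumed to count as active clinical work; passive chart
# review ("chart_review") is deliberately excluded.
ACTIVE = {"note_edit", "order_entry", "result_ack"}

def duty_hours(events):
    """Estimate work hours from (timestamp, event_type) access-log rows.

    events: list of (datetime, str) for one trainee-day, sorted by time.
    Returns estimated hours as a float.
    """
    active = [t for t, kind in events if kind in ACTIVE]
    if not active:
        return 0.0
    total = timedelta()
    start = prev = active[0]
    for t in active[1:]:
        if t - prev > SESSION_GAP:   # gap too long: close the session
            total += prev - start
            start = t
        prev = t
    total += prev - start            # close the final session
    return total.total_seconds() / 3600

logs = [
    (datetime(2021, 7, 1, 7, 0), "note_edit"),
    (datetime(2021, 7, 1, 7, 45), "order_entry"),
    (datetime(2021, 7, 1, 8, 30), "chart_review"),  # passive: excluded
    (datetime(2021, 7, 1, 11, 0), "note_edit"),     # >1 h gap: new session
    (datetime(2021, 7, 1, 12, 0), "order_entry"),
]
print(duty_hours(logs))  # 1.75
```

Real implementations must also merge in-hospital and at-home sessions, deduplicate concurrent logins from multiple devices, and attribute overnight sessions to the correct duty day, which is where much of the engineering effort lies.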
Merits of the study aside, the algorithm underscores troubling realities about the practice of medicine in the 21st century. Do we now equate clinical work with time on the computer? Is our contribution as physicians defined primarily by our presence at the keyboard, rather than the bedside?4 Future research facilitated by automated hours tracking is likely to further elucidate a connection between time spent in the EHR with burnout4 and job dissatisfaction, and the premise of this study is emblematic of the erosion of clinical work-life boundaries that began even before the pandemic.5 While the “times they are a-changin’,” in this respect, it may not be for the better.
1. Grabski DF, Goudreau BJ, Gillen JR, et al. Compliance with the Accreditation Council for Graduate Medical Education duty hours in a general surgery residency program: challenges and solutions in a teaching hospital. Surgery. 2020;167(2):302-307. https://doi.org/10.1016/j.surg.2019.05.029
2. Soleimani H, Adler-Milstein J, Cucina RJ, Murray SG. Automating measurement of trainee work hours. J Hosp Med. 2021;16(7):404-408. https://doi.org/10.12788/jhm.3607
3. Tipping MD, Forth VE, O’Leary KJ, et al. Where did the day go?—a time-motion study of hospitalists. J Hosp Med. 2010;5(6):323-328. https://doi.org/10.1002/jhm.790
4. Gardner RL, Cooper E, Haskell J, et al. Physician stress and burnout: the impact of health information technology. J Am Med Inform Assoc. 2019;26(2):106-114. https://doi.org/10.1093/jamia/ocy145
5. Saag HS, Shah K, Jones SA, Testa PA, Horwitz LI. Pajama time: working after work in the electronic health record. J Gen Intern Med. 2019;34(9):1695-1696. https://doi.org/10.1007/s11606-019-05055-x
© 2021 Society of Hospital Medicine
The Medical Liability Environment: Is It Really Any Worse for Hospitalists?
Although malpractice “crises” come and go, liability fears remain near the top of mind for most physicians.1 Liability insurance premiums have plateaued in recent years but remain high, and the prospect of being reported to the National Practitioner Data Bank (NPDB) or listed on a state medical board’s website for a paid liability claim is unsettling. The high-acuity setting and the absence of longitudinal patient relationships in hospital medicine may theoretically raise malpractice risk, yet hospitalists’ liability risk remains understudied.2
The contribution by Schaffer and colleagues3 in this issue of the Journal of Hospital Medicine is thus welcome and illuminating. The researchers examine the liability risk of hospitalists compared to that of other specialties by utilizing a large database of malpractice claims compiled from multiple insurers across a decade.3 In a field of research plagued by inadequate data, the Comparative Benchmarking System (CBS) built by CRICO/RMF is a treasure. Unlike the primary national database of malpractice claims, the NPDB, the CBS contains information on claims that did not result in a payment, as well as physicians’ specialty and detailed information on the allegations, injuries, and their causes. The CBS contains almost a third of all medical liability claims made in the United States during the study period, supporting generalizability.
Schaffer and colleagues3 found that hospitalists had a lower claims rate than physicians in emergency medicine or neurosurgery. The rate was on par with that for non-hospital general internists, even though hospitalists often care for higher-acuity patients. Although claims rates dropped over the study period for physicians in neurosurgery, emergency medicine, psychiatry, and internal medicine subspecialties, the rate for hospitalists did not change significantly. Further, the median payout on claims against hospitalists was the highest of all the specialties examined, except neurosurgery. This reflects higher injury severity in hospitalist cases: half the claims against hospitalists involved death and three-quarters were high severity.
The study is not without limitations. Due to missing data, only a fraction of the claims (8.2% to 11%) in the full dataset are used in the claims rate analysis. Regression models predicting a payment are based on a small number of payments for hospitalists (n = 363). Further, the authors advance, as a potential explanation for hospitalists’ higher liability risk, that hospitalists are disproportionately young compared to other specialists, but the dataset lacks age data. These limitations suggest caution in the authors’ overall conclusion that “the malpractice environment for hospitalists is becoming less favorable.”
Nevertheless, several important insights emerge from their analysis. The very existence of claims demonstrates that patient harm continues. The contributing factors and judgment errors found in these claims demonstrate that much of this harm is potentially preventable and a risk to patient safety. Whether or not the authors’ young-hospitalist hypothesis is ultimately proven, it is difficult to argue with more mentorship as a means to improve safety. Also, preventing or intercepting judgment errors remains a vexing challenge in medicine that undoubtedly calls for creative clinical decision support solutions. Schaffer and colleagues3 also note that hospitalists are increasingly co-managing patients with other specialties, such as orthopedic surgery. Whether this new practice model drives hospitalist liability risk because hospitalists are practicing in areas in which they have less experience (as the authors posit) or whether hospitalists are simply more likely to be named in a suit as part of a specialty team with higher liability risk remains unknown and merits further investigation.
Ultimately, regardless of whether the liability environment is worsening for hospitalists, the need to improve our liability system is clear. There is room to improve the system on a number of metrics, including properly compensating negligently harmed patients without unduly burdening providers. The system also induces defensive medicine and has not driven safety improvements as expected. The liability environment, as a result, remains challenging not just for hospitalists, but for all patients and physicians as well.
1. Sage WM, Boothman RC, Gallagher TH. Another medical malpractice crisis? Try something different. JAMA. 2020;324(14):1395-1396. https://doi.org/10.1001/jama.2020.16557
2. Schaffer AC, Puopolo AL, Raman S, Kachalia A. Liability impact of the hospitalist model of care. J Hosp Med. 2014;9(12):750-755. https://doi.org/10.1002/jhm.2244
3. Schaffer AC, Yu-Moe CW, Babayan A, Wachter RM, Einbinder JS. Rates and characteristics of medical malpractice claims against hospitalists. J Hosp Med. 2021;16(7):390-396. https://doi.org/10.12788/jhm.3557
Although malpractice “crises” come and go, liability fears persist near top of mind for most physicians.1 Liability insurance premiums have plateaued in recent years, but remain at high levels, and the prospect of being reported to the National Practitioner Data Bank (NPDB) or listed on a state medical board’s website for a paid liability claim is unsettling. The high-acuity setting and the absence of longitudinal patient relationships in hospital medicine may theoretically raise malpractice risk, yet hospitalists’ liability risk remains understudied.2
The contribution by Schaffer and colleagues3 in this issue of the Journal of Hospital Medicine is thus welcome and illuminating. The researchers examine the liability risk of hospitalists compared to that of other specialties by utilizing a large database of malpractice claims compiled from multiple insurers across a decade.3 In a field of research plagued by inadequate data, the Comparative Benchmarking System (CBS) built by CRICO/RMF is a treasure. Unlike the primary national database of malpractice claims, the NPDB, the CBS contains information on claims that did not result in a payment, as well as physicians’ specialty and detailed information on the allegations, injuries, and their causes. The CBS contains almost a third of all medical liability claims made in the United States during the study period, supporting generalizability.
Schaffer and colleagues1 found that hospitalists had a lower claims rate than physicians in emergency medicine or neurosurgery. The rate was on par with that for non-hospital general internists, even though hospitalists often care for higher-acuity patients. Although claims rates dropped over the study period for physicians in neurosurgery, emergency medicine, psychiatry, and internal medicine subspecialties, the rate for hospitalists did not change significantly. Further, the median payout on claims against hospitalists was the highest of all the specialties examined, except neurosurgery. This reflects higher injury severity in hospitalist cases: half the claims against hospitalists involved death and three-quarters were high severity.
The study is not without limitations. Due to missing data, only a fraction of the claims (8.2% to 11%) in the full dataset are used in the claims rate analysis. Regression models predicting a payment are based on a small number of payments for hospitalists (n = 363). Further, the authors advance, as a potential explanation for hospitalists’ higher liability risk, that hospitalists are disproportionately young compared to other specialists, but the dataset lacks age data. These limitations suggest caution in the authors’ overall conclusion that “the malpractice environment for hospitalists is becoming less favorable.”
Nevertheless, several important insights emerge from their analysis. The very existence of claims demonstrates that patient harm continues. The contributing factors and judgment errors found in these claims demonstrate that much of this harm is potentially preventable and a risk to patient safety. Whether or not the authors’ young-hospitalist hypothesis is ultimately proven, it is difficult to argue with more mentorship as a means to improve safety. Also, preventing or intercepting judgment errors remains a vexing challenge in medicine that undoubtedly calls for creative clinical decision support solutions. Schaffer and colleagues1 also note that hospitalists are increasingly co-managing patients with other specialties, such as orthopedic surgery. Whether this new practice model drives hospitalist liability risk because hospitalists are practicing in areas in which they have less experience (as the authors posit) or whether hospitalists are simply more likely to be named in a suit as part of a specialty team with higher liability risk remains unknown and merits further investigation.
Ultimately, regardless of whether the liability environment is worsening for hospitalists, the need to improve our liability system is clear. There is room to improve the system on a number of metrics, including properly compensating negligently harmed patients without unduly burdening providers. The system also induces defensive medicine and has not driven safety improvements as expected. The liability environment, as a result, remains challenging not just for hospitalists, but for all patients and physicians as well.
Although malpractice “crises” come and go, liability fears persist near top of mind for most physicians.1 Liability insurance premiums have plateaued in recent years, but remain at high levels, and the prospect of being reported to the National Practitioner Data Bank (NPDB) or listed on a state medical board’s website for a paid liability claim is unsettling. The high-acuity setting and the absence of longitudinal patient relationships in hospital medicine may theoretically raise malpractice risk, yet hospitalists’ liability risk remains understudied.2
The contribution by Schaffer and colleagues3 in this issue of the Journal of Hospital Medicine is thus welcome and illuminating. The researchers examine the liability risk of hospitalists compared to that of other specialties by utilizing a large database of malpractice claims compiled from multiple insurers across a decade.3 In a field of research plagued by inadequate data, the Comparative Benchmarking System (CBS) built by CRICO/RMF is a treasure. Unlike the primary national database of malpractice claims, the NPDB, the CBS contains information on claims that did not result in a payment, as well as physicians’ specialty and detailed information on the allegations, injuries, and their causes. The CBS contains almost a third of all medical liability claims made in the United States during the study period, supporting generalizability.
Schaffer and colleagues3 found that hospitalists had a lower claims rate than physicians in emergency medicine or neurosurgery. The rate was on par with that for non-hospitalist general internists, even though hospitalists often care for higher-acuity patients. Although claims rates dropped over the study period for physicians in neurosurgery, emergency medicine, psychiatry, and internal medicine subspecialties, the rate for hospitalists did not change significantly. Further, the median payout on claims against hospitalists was the highest of all the specialties examined, except neurosurgery. This reflects higher injury severity in hospitalist cases: half the claims against hospitalists involved death and three-quarters were high severity.
The study is not without limitations. Due to missing data, only a fraction of the claims (8.2% to 11%) in the full dataset are used in the claims rate analysis. Regression models predicting a payment are based on a small number of payments for hospitalists (n = 363). Further, the authors advance, as a potential explanation for hospitalists’ higher liability risk, that hospitalists are disproportionately young compared to other specialists, but the dataset lacks age data. These limitations suggest caution in the authors’ overall conclusion that “the malpractice environment for hospitalists is becoming less favorable.”
Nevertheless, several important insights emerge from their analysis. The very existence of claims demonstrates that patient harm continues. The contributing factors and judgment errors found in these claims demonstrate that much of this harm is potentially preventable and a risk to patient safety. Whether or not the authors’ young-hospitalist hypothesis is ultimately proven, it is difficult to argue with more mentorship as a means to improve safety. Also, preventing or intercepting judgment errors remains a vexing challenge in medicine that undoubtedly calls for creative clinical decision support solutions. Schaffer and colleagues3 also note that hospitalists are increasingly co-managing patients with other specialties, such as orthopedic surgery. Whether this new practice model drives hospitalist liability risk because hospitalists are practicing in areas in which they have less experience (as the authors posit) or whether hospitalists are simply more likely to be named in a suit as part of a specialty team with higher liability risk remains unknown and merits further investigation.
Ultimately, regardless of whether the liability environment is worsening for hospitalists, the need to improve our liability system is clear. There is room to improve the system on a number of metrics, including properly compensating negligently harmed patients without unduly burdening providers. The system also induces defensive medicine and has not driven safety improvements as expected. The liability environment, as a result, remains challenging not just for hospitalists, but for all patients and physicians as well.
1. Sage WM, Boothman RC, Gallagher TH. Another medical malpractice crisis? Try something different. JAMA. 2020;324(14):1395-1396. https://doi.org/10.1001/jama.2020.16557
2. Schaffer AC, Puopolo AL, Raman S, Kachalia A. Liability impact of the hospitalist model of care. J Hosp Med. 2014;9(12):750-755. https://doi.org/10.1002/jhm.2244
3. Schaffer AC, Yu-Moe CW, Babayan A, Wachter RM, Einbinder JS. Rates and characteristics of medical malpractice claims against hospitalists. J Hosp Med. 2021;16(7):390-396. https://doi.org/10.12788/jhm.3557
© 2021 Society of Hospital Medicine
Leadership & Professional Development: Cultivating Microcultures of Well-being
“As we work to create light for others, we naturally light our own way.”
– Mary Anne Radmacher
Perhaps unknowingly, hospitalists establish microcultures in their everyday work. Hospitalists’ interactions with colleagues often occur in the context of shared workspaces. The nature of these seemingly minor exchanges shapes the microculture, often described as the culture shared by a small group based on location within an organization. Hospitalists have an opportunity to cultivate well-being within these microcultures through gracious and thoughtful acknowledgments of their peers. Collegial support at the micro level influences wellness at the organizational level. A larger shared culture of wellness is necessary to nurture physicians’ personal fulfillment and professional development.1
We propose the CARE framework for cultivating well-being within the microcultures of hospital medicine shared workspaces. CARE consists of Capitalization, Active listening, Recognition, and Empathy. This framework is based on positive psychology research and inspired by lessons from The Happiness Advantage by Shawn Achor.2
Capitalization. Capitalization is defined as sharing upbeat news and receiving a positive reaction. Emotional support during good times, more so than during bad times, strengthens relationships. When a peer shares good news, show enthusiasm and offer an active, constructive response to maximize the validation your colleague perceives.2
For example, Alex sits at her desk and says to Kristen: “My workshop proposal was accepted for medical education day!”
“Congratulations, Alex! Tell me more about the workshop.”
Active listening. Active listening requires concentration and observation of body language. Show engagement by maintaining an open posture, using positive facial expressions, and providing occasional cues that you’re paying attention. Paraphrasing and asking targeted questions to dive deeper demonstrates genuine interest.
“Katie, I could use your advice. Do you have a minute?”
Katie turns to face John and smiles. “Of course. How can I help?”
“My team seems drained after a code this morning. I planned a lecture for later, but I’m not sure this is the right time.”
Katie nods. “I think you’re right, John. How have you thought about handling the situation?”
Recognition. Acts of recognition and encouragement are catalysts for boosting morale. Even brief expressions of gratitude can have a significant emotional impact. Recognition is most meaningful when delivered deliberately and with warmth.
Kevin walks into the hospitalist workroom. “Diane, congratulations on your publication! I plan to make a medication interaction review part of my discharge workflow.”
Leah turns to Diane. “Diane, that’s great news! Can you send me the link to your article?”
Empathy. Burnout is prevalent in medicine, and our fellow hospitalists deserve empathy. Showing empathy reduces stress and promotes connectedness. Sense when your colleagues are in distress and take time to share in their feelings and emotions. Draw on your own clinical experience to find common ground and convey understanding.
“I transferred another patient with COVID-19 to the ICU. I spent the last hour talking to family.”
“Ashwin, you’ve had a tough week. I know how you must feel—I had to transfer a patient yesterday. Want to take a quick walk outside?”
Hospitalists are inherently busy while on service, but these four interventions are brief, requiring only several minutes. Each small investment of your time will pay significant emotional dividends. These practices will not only enhance your colleagues’ sense of well-being, but will also bolster your happiness and productivity. A positive mindset fosters creative thinking and enhances complex problem solving. Recharging the microcultures of hospitalist workspaces with positivity will spark a larger transformation at the organizational level. That’s because positive actions are contagious.2 One hospitalist’s commitment to CARE will encourage other hospitalists to adopt these behaviors, establishing a virtuous cycle that sustains an organization’s culture of wellness.
1. Bohman B, Dyrbye L, Sinsky CA, et al. Physician well-being: the reciprocity of practice efficiency, culture of wellness, and personal resilience. NEJM Catalyst. August 7, 2017. Accessed June 24, 2021. https://catalyst.nejm.org/doi/full/10.1056/CAT.17.0429
2. Achor S. The Happiness Advantage: How a Positive Brain Fuels Success in Work and Life. Currency; 2010.
© 2021 Society of Hospital Medicine
Algorithms for Prediction of Clinical Deterioration on the General Wards: A Scoping Review
The early identification of clinical deterioration among adult hospitalized patients remains a challenge.1 Delayed identification is associated with increased morbidity and mortality, unplanned intensive care unit (ICU) admissions, prolonged hospitalization, and higher costs.2,3 Earlier detection of deterioration using predictive algorithms of vital sign monitoring might avoid these negative outcomes.4 In this scoping review, we summarize current algorithms and their evidence.
Vital signs provide the backbone for detecting clinical deterioration. Early warning scores (EWS) and outreach protocols were developed to bring structure to the assessment of vital signs. Most EWS claim to predict clinical end points such as unplanned ICU admission up to 24 hours in advance.5,6 Reviews of EWS showed a positive trend toward reduced length of stay and mortality. However, conclusions about general efficacy could not be generated because of case heterogeneity and methodologic shortcomings.4,7 Continuous automated vital sign monitoring of patients on the general ward can now be accomplished with wearable devices.8 The first reports on continuous monitoring showed earlier detection of deterioration but not improved clinical end points.4,9 Since then, different reports on continuous monitoring have shown positive effects but concluded that unprocessed monitoring data per se falls short of generating actionable alarms.4,10,11
Predictive algorithms, which often use artificial intelligence (AI), are increasingly employed to recognize complex patterns or abnormalities and support predictions of events in big data sets.12,13 Especially when combined with continuous vital sign monitoring, predictive algorithms have the potential to expedite detection of clinical deterioration and improve patient outcomes. Predictive algorithms using vital signs in the ICU have shown promising results.14 The impact of predictive algorithms on the general wards, however, is unclear.
The aims of our scoping review were to explore the extent and range of and evidence for predictive vital signs–based algorithms on the adult general ward; to describe the variety of these algorithms; and to categorize effects, facilitators, and barriers of their implementation.15
MATERIALS AND METHODS
We performed a scoping review to create a summary of the current state of research. We used the five-step method of Levac and followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses Extension for Scoping Reviews guidelines (Appendix 1).16,17
PubMed, Embase, and CINAHL databases were searched for English-language articles written between January 1, 2010, and November 20, 2020. We developed the search queries with an experienced information scientist, and we used database-specific terms and strategies for input, clinical outcome, method, predictive capability, and population (Appendix 2). Additionally, we searched the references of the selected articles, as well as publications citing these articles.
All identified studies were screened by title and abstract by two researchers (RP and YE). The selected studies were read in their entirety and checked for eligibility using the following inclusion criteria: an automated, vital signs–based algorithm providing real-time prediction of clinical deterioration in an adult, general ward population. When successive publications used the same algorithm and population, we selected the most recent study.
For screening and selection, we used the Rayyan QCRI online tool (Qatar Computing Research Institute) and EndNote X9 (Clarivate Analytics). We extracted information using a data extraction form and organized it into a table of descriptive characteristics of the selected studies (Table 1); an input data table showing number of admissions, intermittent or continuous measurements, vital signs measured, and laboratory results (Appendix Table 1); a table summarizing study designs and settings (Appendix Table 2); and a prediction performance table (Table 2). We report characteristics of the populations and algorithms, prediction specifications such as the area under the receiver operating characteristic curve (AUROC), and predictive values. Predictive values are affected by prevalence, which may differ among populations. To compare the algorithms, we calculated an indexed positive predictive value (PPV) and a number needed to evaluate (NNE) using a weighted average prevalence of clinical deterioration of 3.0%.
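The indexing described above can be sketched with Bayes’ rule. The review does not specify its exact standardization procedure, so this is only one plausible implementation, and the 50%/90% operating point in the usage line is hypothetical:

```python
def indexed_ppv(sensitivity: float, specificity: float, prevalence: float = 0.03) -> float:
    """Recompute PPV at a fixed reference prevalence (here 3.0%) via Bayes'
    rule, so algorithms validated in populations with different event rates
    can be compared on a common footing."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def number_needed_to_evaluate(ppv: float) -> float:
    """NNE = 1/PPV: patients flagged per true deterioration detected."""
    return 1 / ppv


# Hypothetical operating point: 50% sensitivity, 90% specificity
ppv = indexed_ppv(0.50, 0.90)
print(f"indexed PPV {ppv:.2f}, NNE {number_needed_to_evaluate(ppv):.1f}")
# indexed PPV 0.13, NNE 7.5
```

Holding prevalence fixed at 3.0% strips out population differences, so the remaining variation in PPV and NNE reflects only the algorithms’ test characteristics.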
We defined clinical deterioration as end points, including rapid response team activation, cardiopulmonary resuscitation, transfer to an ICU, or death.
Effects, facilitators, and barriers were identified and categorized using ATLAS.ti 8 software (ATLAS.ti) and evaluated by three researchers (RP, MK, and THvdB). These were categorized using the adapted frameworks of Gagnon et al18 for the barriers and facilitators and of Donabedian19 for the effects (Appendix 3).
The Gagnon et al framework was adapted by changing two of four domains—that is, “Individual” was changed to “Professional” and “Human” to “Physiology.” The domains of “Technology” and “Organization” remained unchanged. The Donabedian domains of “Outcome,” “Process,” and “Structure” also remained unchanged (Table 3).
We divided the studies into two groups: studies on predictive algorithms with and without AI when reporting on characteristics and performance. For the secondary aim of exploring implementation impact, we reported facilitators and barriers in a narrative way, highlighting the most frequent and notable findings.
RESULTS
As shown in the Figure, we found 1741 publications, of which we read the full text of 109. There were 1632 publications that did not meet the inclusion criteria. The publications by Churpek et al,20,21 Bartkowiak et al,22 Edelson et al,23 Escobar et al,24,25 and Kipnis et al26 reported on the same algorithms or databases but had significantly different approaches. When multiple publications used the same algorithm and population, the most recent was cited, incorporating the earlier findings.20,21,27-29 The resulting 21 papers are included in this review.
Descriptive characteristics of the studies are summarized in Table 1. Nineteen of the publications were full papers and two were conference abstracts. Most of the studies (n = 18) were from the United States; there was one study from South Korea,30 one study from Portugal,31 and one study from the United Kingdom.32 In 15 of the studies, there was a strict focus on general or specific wards; 6 studies also included the ICU and/or emergency departments.
Two of the studies were clinical trials, 2 were prospective observational studies, and 17 were retrospective studies. Five studies reported on an active predictive model during admission. Of these, 3 reported that the model was clinically implemented, using the predictions in their clinical workflow. None of the implemented studies used AI.
All input variables are presented in Appendix Table 1.
The non-AI algorithm prediction horizons ranged from 4 to 24 hours, with a median of 24 hours (interquartile range [IQR], 12-24 hours). The AI algorithms ranged from 2 to 48 hours and had a median horizon of 14 hours (IQR, 12-24 hours).
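The summary statistics above can be reproduced with the standard library. The per-study horizon values below are hypothetical, chosen only to illustrate the median/IQR computation (the review reports only the range, median, and IQR, not individual study values):

```python
import statistics

# Hypothetical per-study prediction horizons in hours (illustration only;
# the text reports a range of 4-24 h, median 24 h, IQR 12-24 h for non-AI models)
horizons = [4, 12, 12, 24, 24, 24, 24]

median = statistics.median(horizons)
# statistics.quantiles with n=4 returns the three quartile cut points
q1, _, q3 = statistics.quantiles(horizons, n=4)
print(f"median {median} h, IQR {q1:g}-{q3:g} h")  # median 24 h, IQR 12-24 h
```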
We found three studies reporting patient outcomes. The most recent of these was a large multicenter implementation study by Escobar et al25 that included an extensive follow-up response. This study reported a significantly decreased 30-day mortality in the intervention cohort. A smaller randomized controlled trial reported no significant differences in patient outcomes with earlier warning alarms.27 A third study reported more appropriate rapid response team deployment and decreased mortality in a subgroup analysis.35
Effects, Facilitators, and Barriers
As shown in the Appendix Figure and further detailed in Table 3, the described effects were predominantly positive—57 positive effects vs 11 negative effects. These positive effects sorted primarily into the outcome and process domains.
All of the studies that compared their proposed model with one of various warning systems (eg, EWS, National Early Warning Score [NEWS], Modified Early Warning Score [MEWS]) showed superior performance (based on AUROC and reported predictive values). In 17 studies, the authors reported their model as more useful than or superior to the EWS.20-23,26-28,34,36-41 Four studies reported real-time detection of deterioration before regular EWS,20,26,42 and three studies reported positive effects on patient-related outcomes.26,35 Four negative effects were noted, relating to controllability, validity, and potential limitations.27,42
Of the 38 remarks in the Technology domain, difficulty with implementation in daily practice was a commonly cited barrier.22,24,40,42 Difficulties included creating real-time data feeds out of the EMR, though some successful examples were described.25,27,36 The limited interpretability of AI was also considered a potential barrier.30,32,33,35,39,41 Some authors also questioned the applicability of prolonged prediction horizons, given their decoupling from the current clinical picture.39,42
Conservative attitudes toward new technologies and inadequate knowledge were mentioned as barriers.39 Repeated remarks were made on the difficulty of interpreting and responding to a predicted escalation, as the clinical pattern might not be recognizable at such an early stage. On the other hand, it is expected that less invasive countermeasures would be adequate to avert further escalation. Earlier recognition of possible escalations also raised potential ethical questions, such as when to discuss palliative care.24
The heterogeneity of the general ward population and the relatively low prevalence of deterioration were mentioned as barriers.24,30,38,41 There were also concerns that not all escalations are preventable and that some patient outcomes may not be modifiable.24,38
Many investigators expected reductions in false alarms and associated alarm fatigue (reflected as higher PPVs). Furthermore, they expected workflow to improve and workload to decrease.21,23,27,31,33,35,38,41 Despite the capacity of modern EMRs to store large amounts of patient data, some investigators felt improvements to real-time access, data quality and validity, and data density are needed to ensure valid associated predictions.21,22,24,32,37
DISCUSSION
As the complexity and comorbidity of hospitalized adults grow, predicting clinical deterioration is becoming more important. With an ever-increasing amount of available
There are several important limitations across these studies. In a clinical setting, these models would function as a screening test. Almost all studies report an AUROC; however, sensitivity and PPV or NNE (defined as 1/PPV) may be more useful than AUROC when predicting low-frequency events with high-potential clinical impact.44 Assessing the NNE is especially relevant because of its relation to alarm fatigue and responsiveness of clinicians.43 Alarm fatigue and lack of adequate response to alarms were repeatedly cited as potential barriers for application of automated scores.
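The prevalence dependence noted above can be made concrete with a small, hypothetical calculation: a model with fixed sensitivity and specificity (ie, unchanged discrimination) yields a very different PPV and NNE as the event rate changes, which AUROC alone does not reveal:

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from test characteristics and event rate."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp)


# Same hypothetical operating point (90% sensitivity, 90% specificity),
# two event rates: a general ward (~3%) vs a higher-acuity setting (~20%)
for prevalence in (0.03, 0.20):
    p = ppv(0.90, 0.90, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {p:.2f}, NNE {1 / p:.1f}")
# prevalence 3%: PPV 0.22, NNE 4.6
# prevalence 20%: PPV 0.69, NNE 1.4
```

At a 3% event rate, even a 90%/90% model flags roughly five patients per true deterioration, which is why the NNE, not the AUROC, tracks the alarm burden clinicians actually experience.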
Although the results of our scoping review are promising, there are limited data on clinical outcomes using these algorithms. Only three of five algorithms were used to guide clinical decision-making.25,27,35 Kollef et al27 showed shorter hospitalizations and Evans et al35 found decreased mortality rates in a multimorbid subgroup. Escobar et al25 found an overall and consistent decrease in mortality in a large, heterogeneous population of inpatients across 21 hospitals. While Escobar et al’s findings provide strong evidence that predictive algorithms and structured follow-up on alarms can improve patient outcomes, the authors recognize that not all facilities will have the resources to implement them.25 Dedicated round-the-clock follow-up of alarms has yet to be proven feasible for smaller institutions, and leaner solutions must be explored. The example set by Escobar et al25 should be translated into various settings to prove its reproducibility and to substantiate the clinical impact of predictive models and structured follow-up.
According to expert opinion, the use of high-frequency or continuous monitoring at low-acuity wards and AI algorithms to detect trends and patterns will reduce failure-to-rescue rates.4,9,43 However, most studies in our review focused on periodically spot-checked vital signs, and none of the AI algorithms were implemented in clinical care (Appendix Table 1).
STRENGTHS AND LIMITATIONS
We performed a comprehensive review of the current literature using a clear and reproducible methodology to minimize the risk of missing relevant publications. The identified research is mainly limited to large US centers and consists mostly of retrospective studies. Heterogeneity among inputs, endpoints, time horizons, and evaluation metrics makes comparisons challenging. Comments on facilitators, barriers, and effects were limited.
RECOMMENDATIONS FOR FUTURE RESEARCH
Artificial intelligence and the use of continuous monitoring hold great promise in creating optimal predictive algorithms. Future studies should directly compare AI- and non-AI-based algorithms using continuous monitoring to determine predictive accuracy, feasibility, costs, and outcomes. A consensus on endpoint definitions, input variables, methodology, and reporting is needed to enhance reproducibility, comparability, and generalizability of future research.
CONCLUSION
1. van Galen LS, Struik PW, Driesen BEJM, et al. Delayed recognition of deterioration of patients in general wards is mostly caused by human related monitoring failures: a root cause analysis of unplanned ICU admissions. PLoS One. 2016;11(8):e0161393. https://doi.org/10.1371/journal.pone.0161393
2. Mardini L, Lipes J, Jayaraman D. Adverse outcomes associated with delayed intensive care consultation in medical and surgical inpatients. J Crit Care. 2012;27(6):688-693. https://doi.org/10.1016/j.jcrc.2012.04.011
3. Young MP, Gooder VJ, McBride K, James B, Fisher ES. Inpatient transfers to the intensive care unit: delays are associated with increased mortality and morbidity. J Gen Intern Med. 2003;18(2):77-83. https://doi.org/10.1046/j.1525-1497.2003.20441.x
4. Khanna AK, Hoppe P, Saugel B. Automated continuous noninvasive ward monitoring: future directions and challenges. Crit Care. 2019;23(1):194. https://doi.org/10.1186/s13054-019-2485-7
5. Ludikhuize J, Hamming A, de Jonge E, Fikkers BG. Rapid response systems in The Netherlands. Jt Comm J Qual Patient Saf. 2011;37(3):138-197. https://doi.org/10.1016/s1553-7250(11)37017-1
6. Cuthbertson BH, Boroujerdi M, McKie L, Aucott L, Prescott G. Can physiological variables and early warning scoring systems allow early recognition of the deteriorating surgical patient? Crit Care Med. 2007;35(2):402-409. https://doi.org/10.1097/01.ccm.0000254826.10520.87
7. Alam N, Hobbelink EL, van Tienhoven AJ, van de Ven PM, Jansma EP, Nanayakkara PWB. The impact of the use of the Early Warning Score (EWS) on patient outcomes: a systematic review. Resuscitation. 2014;85(5):587-594. https://doi.org/10.1016/j.resuscitation.2014.01.013
8. Weenk M, Koeneman M, van de Belt TH, Engelen LJLPG, van Goor H, Bredie SJH. Wireless and continuous monitoring of vital signs in patients at the general ward. Resuscitation. 2019;136:47-53. https://doi.org/10.1016/j.resuscitation.2019.01.017
9. Cardona-Morrell M, Prgomet M, Turner RM, Nicholson M, Hillman K. Effectiveness of continuous or intermittent vital signs monitoring in preventing adverse events on general wards: a systematic review and meta-analysis. Int J Clin Pract. 2016;70(10):806-824. https://doi.org/10.1111/ijcp.12846
10. Brown H, Terrence J, Vasquez P, Bates DW, Zimlichman E. Continuous monitoring in an inpatient medical-surgical unit: a controlled clinical trial. Am J Med. 2014;127(3):226-232. https://doi.org/10.1016/j.amjmed.2013.12.004
11. Mestrom E, De Bie A, van de Steeg M, Driessen M, Atallah L, Bezemer R. Implementation of an automated early warning scoring system in a surgical ward: practical use and effects on patient outcomes. PLoS One. 2019;14(5):e0213402. https://doi.org/10.1371/journal.pone.0213402
12. Jiang F, Jiang Y, Zhi H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243. https://doi.org/10.1136/svn-2017-000101
13. Iwashyna TJ, Liu V. What's so different about big data? A primer for clinicians trained to think epidemiologically. Ann Am Thorac Soc. 2014;11(7):1130-1135. https://doi.org/10.1513/annalsats.201405-185as
14. Jalali A, Bender D, Rehman M, Nadkanri V, Nataraj C. Advanced analytics for outcome prediction in intensive care units. Conf Proc IEEE Eng Med Biol Soc. 2016;2016:2520-2524. https://doi.org/10.1109/embc.2016.7591243
15. Munn Z, Peters MDJ, Stern C, Tufanaru C, McArthur A, Aromataris E. Systematic review or scoping review? Guidance for authors when choosing between a systematic or scoping review approach. BMC Med Res Methodol. 2018;18(1):143. https://doi.org/10.1186/s12874-018-0611-x
16. Arksey H, O'Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. 2005;8(1):19-32. https://doi.org/10.1080/1364557032000119616
17. Tricco AC, Lillie E, Zarin W, et al. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467-473. https://doi.org/10.7326/m18-0850
18. Gagnon MP, Desmartis M, Gagnon J, et al. Framework for user involvement in health technology assessment at the local level: views of health managers, user representatives, and clinicians. Int J Technol Assess Health Care. 2015;31(1-2):68-77. https://doi.org/10.1017/s0266462315000070
19. Donabedian A. The quality of care. How can it be assessed? JAMA. 1988;260(12):1743-1748. https://doi.org/10.1001/jama.260.12.1743
20. Churpek MM, Yuen TC, Winslow C, et al. Multicenter development and validation of a risk stratification tool for ward patients. Am J Respir Crit Care Med. 2014;190(6):649-655. https://doi.org/10.1164/rccm.201406-1022oc
21. Churpek MM, Yuen TC, Winslow C, Meltzer DO, Kattan MW, Edelson DP. Multicenter comparison of machine learning methods and conventional regression for predicting clinical deterioration on the wards. Crit Care Med. 2016;44(2):368-374. https://doi.org/10.1097/ccm.0000000000001571
22. Bartkowiak B, Snyder AM, Benjamin A, et al. Validating the electronic cardiac arrest risk triage (eCART) score for risk stratification of surgical inpatients in the postoperative setting: retrospective cohort study. Ann Surg. 2019;269(6):1059-1063. https://doi.org/10.1097/sla.0000000000002665
23. Edelson DP, Carey K, Winslow CJ, Churpek MM. Less is more: detecting clinical deterioration in the hospital with machine learning using only age, heart rate and respiratory rate. Abstract presented at: American Thoracic Society International Conference; May 22, 2018; San Diego, California. Am J Resp Crit Care Med. 2018;197:A4444.
24. Escobar GJ, LaGuardia JC, Turk BJ, Ragins A, Kipnis P, Draper D. Early detection of impending physiologic deterioration among patients who are not in intensive care: development of predictive models using data from an automated electronic medical record. J Hosp Med. 2012;7(5):388-395. https://doi.org/10.1002/jhm.1929
25. Escobar GJ, Liu VX, Schuler A, Lawson B, Greene JD, Kipnis P. Automated identification of adults at risk for in-hospital clinical deterioration. N Engl J Med. 2020;383(20):1951-1960. https://doi.org/10.1056/nejmsa2001090
26. Kipnis P, Turk BJ, Wulf DA, et al. Development and validation of an electronic medical record-based alert score for detection of inpatient deterioration outside the ICU. J Biomed Inform. 2016;64:10-19. https://doi.org/10.1016/j.jbi.2016.09.013
27. Kollef MH, Chen Y, Heard K, et al. A randomized trial of real-time automated clinical deterioration alerts sent to a rapid response team. J Hosp Med. 2014;9(7):424-429. https://doi.org/10.1002/jhm.2193
28. Hackmann G, Chen M, Chipara O, et al. Toward a two-tier clinical warning system for hospitalized patients. AMIA Annu Symp Proc. 2011;2011:511-519.
29. Bailey TC, Chen Y, Mao Y, Lu C, Hackmann G, Micek ST. A trial of a real-time alert for clinical deterioration in patients hospitalized on general medical wards. J Hosp Med. 2013;8(5):236-242. https://doi.org/10.1002/jhm.2009
30. Kwon JM, Lee Y, Lee Y, Lee S, Park J. An algorithm based on deep learning for predicting in-hospital cardiac arrest. J Am Heart Assoc. 2018;7(13):e008678. https://doi.org/10.1161/jaha.118.008678
31. Correia S, Gomes A, Shahriari S, Almeida JP, Severo M, Azevedo A. Performance of the early warning system vital to predict unanticipated higher-level of care admission and in-hospital death of ward patients. Value Health. 2018;21(S3):S360. https://doi.org/10.1016/j.jval.2018.09.2152
32. Shamout FE, Zhu T, Sharma P, Watkinson PJ, Clifton DA. Deep interpretable early warning system for the detection of clinical deterioration. IEEE J Biomed Health Inform. 2020;24(2):437-446. https://doi.org/10.1109/jbhi.2019.2937803
33. Bai Y, Do DH, Harris PRE, et al. Integrating monitor alarms with laboratory test results to enhance patient deterioration prediction. J Biomed Inform. 2015;53:81-92. https://doi.org/10.1016/j.jbi.2014.09.006
34. Hu X, Sapo M, Nenov V, et al. Predictive combinations of monitor alarms preceding in-hospital code blue events. J Biomed Inform. 2012;45(5):913-921. https://doi.org/10.1016/j.jbi.2012.03.001
35. Evans RS, Kuttler KG, Simpson KJ, et al. Automated detection of physiologic deterioration in hospitalized patients. J Am Med Inform Assoc. 2015;22(2):350-360. https://doi.org/10.1136/amiajnl-2014-002816
36. Ghosh E, Eshelman L, Yang L, Carlson E, Lord B. Early deterioration indicator: data-driven approach to detecting deterioration in general ward. Resuscitation. 2018;122:99-105. https://doi.org/10.1016/j.resuscitation.2017.10.026
37. Kang MA, Churpek MM, Zadravecz FJ, Adhikari R, Twu NM, Edelson DP. Real-time risk prediction on the wards: a feasibility study. Crit Care Med. 2016;44(8):1468-1473. https://doi.org/10.1097/ccm.0000000000001716
38. Hu SB, Wong DJL, Correa A, Li N, Deng JC. Prediction of clinical deterioration in hospitalized adult patients with hematologic malignancies using a neural network model. PLoS One. 2016;11(8):e0161401. https://doi.org/10.1371/journal.pone.0161401
39. Rothman MJ, Rothman SI, Beals J 4th. Development and validation of a continuous measure of patient condition using the electronic medical record. J Biomed Inform. 2013;46(5):837-848. https://doi.org/10.1016/j.jbi.2013.06.011
40. Alaa AM, Yoon J, Hu S, van der Schaar M. Personalized risk scoring for critical care prognosis using mixtures of Gaussian processes. IEEE Trans Biomed Eng. 2018;65(1):207-218. https://doi.org/10.1109/tbme.2017.2698602
41. Mohamadlou H, Panchavati S, Calvert J, et al. Multicenter validation of a machine-learning algorithm for 48-h all-cause mortality prediction. Health Informatics J. 2020;26(3):1912-1925. https://doi.org/10.1177/1460458219894494
42. Alvarez CA, Clark CA, Zhang S, et al. Predicting out of intensive care unit cardiopulmonary arrest or death using electronic medical record data. BMC Med Inform Decis Mak. 2013;13:28. https://doi.org/10.1186/1472-6947-13-28
43. Vincent JL, Einav S, Pearse R, et al. Improving detection of patient deterioration in the general hospital ward environment. Eur J Anaesthesiol. 2018;35(5):325-333. https://doi.org/10.1097/eja.0000000000000798
44. Romero-Brufau S, Huddleston JM, Escobar GJ, Liebow M. Why the C-statistic is not informative to evaluate early warning scores and what metrics to use. Crit Care. 2015;19(1):285. https://doi.org/10.1186/s13054-015-0999-1
45. Weenk M, Bredie SJ, Koeneman M, Hesselink G, van Goor H, van de Belt TH. Continuous monitoring of the vital signs in the general ward using wearable devices: randomized controlled trial. J Med Internet Res. 2020;22(6):e15471. https://doi.org/10.2196/15471
46. Wellner B, Grand J, Canzone E, et al. Predicting unplanned transfers to the intensive care unit: a machine learning approach leveraging diverse clinical elements. JMIR Med Inform. 2017;5(4):e45. https://doi.org/10.2196/medinform.8680
47. Elliott M, Baird J. Pulse oximetry and the enduring neglect of respiratory rate assessment: a commentary on patient surveillance. Br J Nurs. 2019;28(19):1256-1259. https://doi.org/10.12968/bjon.2019.28.19.1256
48. Blackwell JN, Keim-Malpass J, Clark MT, et al. Early detection of in-patient deterioration: one prediction model does not fit all. Crit Care Explor. 2020;2(5):e0116. https://doi.org/10.1097/cce.0000000000000116
49. Johnson AEW, Pollard TJ, Shen L, et al. MIMIC-III, a freely accessible critical care database. Sci Data. 2016;3:160035. https://doi.org/10.1038/sdata.2016.35
50. Bodenheimer T, Sinsky C. From triple to quadruple aim: care of the patient requires care of the provider. Ann Fam Med. 2014;12(6):573-576. https://doi.org/10.1370/afm.1713
51. Kirkland LL, Malinchoc M, O'Byrne M, et al. A clinical deterioration prediction tool for internal medicine patients. Am J Med Qual. 2013;28(2):135-142. https://doi.org/10.1177/1062860612450459
The early identification of clinical deterioration among adult hospitalized patients remains a challenge.1 Delayed identification is associated with increased morbidity and mortality, unplanned intensive care unit (ICU) admissions, prolonged hospitalization, and higher costs.2,3 Earlier detection of deterioration using predictive algorithms applied to vital sign monitoring might avert these negative outcomes.4 In this scoping review, we summarize the current algorithms and the evidence supporting them.
Vital signs provide the backbone for detecting clinical deterioration. Early warning scores (EWS) and outreach protocols were developed to bring structure to the assessment of vital signs. Most EWS claim to predict clinical end points such as unplanned ICU admission up to 24 hours in advance.5,6 Reviews of EWS showed a positive trend toward reduced length of stay and mortality. However, conclusions about general efficacy could not be drawn because of case heterogeneity and methodologic shortcomings.4,7 Continuous automated vital sign monitoring of patients on the general ward can now be accomplished with wearable devices.8 The first reports on continuous monitoring showed earlier detection of deterioration but no improvement in clinical end points.4,9 Since then, different reports on continuous monitoring have shown positive effects but concluded that unprocessed monitoring data per se fall short of generating actionable alarms.4,10,11
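To make the spot-check scores these algorithms aim to improve on concrete, an aggregate early warning score can be sketched as below. The bands and weights here are illustrative only, loosely modeled on MEWS-style scoring; they are not the validated MEWS or NEWS2 weights, and the function name is ours.

```python
def mews_like_score(resp_rate, heart_rate, sys_bp, temp_c, alert):
    """Aggregate an early warning score from spot-checked vital signs.

    Bands are illustrative, MEWS-style examples, NOT validated weights.
    """
    score = 0
    # Respiratory rate (breaths/min): extreme values score highest
    if resp_rate < 9 or resp_rate > 29:
        score += 3
    elif resp_rate > 20:
        score += 2
    # Heart rate (beats/min)
    if heart_rate < 40 or heart_rate > 129:
        score += 3
    elif heart_rate > 110:
        score += 2
    elif heart_rate < 51 or heart_rate > 100:
        score += 1
    # Systolic blood pressure (mm Hg)
    if sys_bp < 71:
        score += 3
    elif sys_bp < 81:
        score += 2
    elif sys_bp < 101 or sys_bp > 199:
        score += 1
    # Temperature (degrees Celsius)
    if temp_c < 35.0 or temp_c > 38.4:
        score += 2
    # Level of consciousness: any reduced responsiveness scores
    if not alert:
        score += 3
    return score

# A total above a trigger threshold (often 4 or 5) would prompt escalation.
print(mews_like_score(resp_rate=24, heart_rate=115, sys_bp=95,
                      temp_c=38.9, alert=True))  # -> 7
```

Spot-check scores like this are recalculated at each measurement round; the predictive algorithms reviewed here instead learn weights from data and can run continuously.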
Predictive algorithms, which often use artificial intelligence (AI), are increasingly employed to recognize complex patterns or abnormalities and support predictions of events in big data sets.12,13 Especially when combined with continuous vital sign monitoring, predictive algorithms have the potential to expedite detection of clinical deterioration and improve patient outcomes. Predictive algorithms using vital signs in the ICU have shown promising results.14 The impact of predictive algorithms on the general wards, however, is unclear.
The aims of our scoping review were to explore the extent and range of and evidence for predictive vital signs–based algorithms on the adult general ward; to describe the variety of these algorithms; and to categorize effects, facilitators, and barriers of their implementation.15
MATERIALS AND METHODS
We performed a scoping review to create a summary of the current state of research. We used the five-step method of Levac and followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses Extension for Scoping Reviews guidelines (Appendix 1).16,17
PubMed, Embase, and CINAHL databases were searched for English-language articles written between January 1, 2010, and November 20, 2020. We developed the search queries with an experienced information scientist, and we used database-specific terms and strategies for input, clinical outcome, method, predictive capability, and population (Appendix 2). Additionally, we searched the references of the selected articles, as well as publications citing these articles.
Two researchers (RP and YE) screened all identified studies by title and abstract. The selected studies were read in their entirety and checked for eligibility against the following inclusion criteria: an automated, vital signs–based algorithm providing real-time prediction of clinical deterioration in an adult, general ward population. Where there were successive publications with the same algorithm and population, we selected the most recent study.
For screening and selection, we used the Rayyan QCRI online tool (Qatar Computing Research Institute) and Endnote X9 (Clarivate Analytics). We extracted information using a data extraction form and organized it into a table of descriptive characteristics of the selected studies (Table 1); an input data table showing number of admissions, intermittent or continuous measurements, vital signs measured, and laboratory results (Appendix Table 1); a table summarizing study designs and settings (Appendix Table 2); and a prediction performance table (Table 2). We report characteristics of the populations and algorithms, prediction specifications such as area under the receiver operating characteristic curve (AUROC), and predictive values. Predictive values are affected by prevalence, which may differ among populations. To compare the algorithms, we calculated an indexed positive predictive value (PPV) and a number needed to evaluate (NNE) using a weighted average prevalence of clinical deterioration of 3.0%.
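The prevalence standardization described above can be sketched with Bayes' theorem: given a study's reported sensitivity and specificity, the PPV is re-indexed to the common 3.0% prevalence, and the NNE follows as 1/PPV. The function names and the example operating point are ours, for illustration.

```python
def indexed_ppv(sensitivity, specificity, prevalence=0.03):
    """PPV re-indexed to a common prevalence via Bayes' theorem,
    so PPVs from cohorts with different event rates are comparable."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def number_needed_to_evaluate(ppv):
    """NNE = 1/PPV: patients flagged per true deterioration found."""
    return 1 / ppv

# Example: a model reporting 80% sensitivity and 90% specificity
ppv = indexed_ppv(0.80, 0.90)
nne = number_needed_to_evaluate(ppv)
print(round(ppv, 3), round(nne, 1))  # -> 0.198 5.0
```

At a 3% event rate, even this respectable operating point means roughly five alarms must be evaluated per true deterioration, which is why the review indexes PPV to a shared prevalence before comparing algorithms.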
We defined clinical deterioration by its end points: rapid response team activation, cardiopulmonary resuscitation, transfer to an ICU, or death.
Effects, facilitators, and barriers were identified and categorized using ATLAS.ti 8 software (ATLAS.ti) and evaluated by three researchers (RP, MK, and THvdB). These were categorized using the adapted frameworks of Gagnon et al18 for the barriers and facilitators and of Donabedian19 for the effects (Appendix 3).
The Gagnon et al framework was adapted by changing two of four domains—that is, “Individual” was changed to “Professional” and “Human” to “Physiology.” The domains of “Technology” and “Organization” remained unchanged. The Donabedian domains of “Outcome,” “Process,” and “Structure” also remained unchanged (Table 3).
When reporting on characteristics and performance, we divided the studies into two groups: predictive algorithms with AI and those without. For the secondary aim of exploring implementation impact, we reported facilitators and barriers narratively, highlighting the most frequent and notable findings.
RESULTS
As shown in the Figure, we found 1741 publications, of which we read 109 in full text; the other 1632 publications did not meet the inclusion criteria. The publications by Churpek et al,20,21 Bartkowiak et al,22 Edelson et al,23 Escobar et al,24,25 and Kipnis et al26 reported on the same algorithms or databases but had significantly different approaches. Where multiple publications used the same algorithm and population, we cited the most recent and incorporated the earlier findings.20,21,27-29 The resulting 21 papers are included in this review.
Descriptive characteristics of the studies are summarized in Table 1. Nineteen of the publications were full papers and two were conference abstracts. Most of the studies (n = 18) were from the United States; there was one study from South Korea,30 one study from Portugal,31 and one study from the United Kingdom.32 In 15 of the studies, there was a strict focus on general or specific wards; 6 studies also included the ICU and/or emergency departments.
Two of the studies were clinical trials, two were prospective observational studies, and 17 were retrospective studies. Five studies reported on a predictive model active during admission. Of these, three reported that the model was clinically implemented, with its predictions used in the clinical workflow. None of the implemented models used AI.
All input variables are presented in Appendix Table 1.
The non-AI algorithm prediction horizons ranged from 4 to 24 hours, with a median of 24 hours (interquartile range [IQR], 12-24 hours). The AI algorithms ranged from 2 to 48 hours and had a median horizon of 14 hours (IQR, 12-24 hours).
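A prediction horizon of H hours means each observation is labeled positive if deterioration occurs within the next H hours; the model is then trained and evaluated against those labels. A minimal labeling sketch, assuming timestamped observations and a single (possibly absent) deterioration event per admission (the function name is ours):

```python
from datetime import datetime, timedelta

def label_with_horizon(obs_times, event_time, horizon_hours):
    """Label each observation 1 if a deterioration event occurs within
    `horizon_hours` after it, 0 otherwise. Observations recorded after
    the event are marked None and excluded from training."""
    horizon = timedelta(hours=horizon_hours)
    labels = []
    for t in obs_times:
        if event_time is not None and t > event_time:
            labels.append(None)   # post-event observation: not usable
        elif event_time is not None and event_time - t <= horizon:
            labels.append(1)      # event falls within the horizon
        else:
            labels.append(0)      # no event, or event too far ahead
    return labels

# Spot checks every 6 hours; deterioration at 20:00; 12-hour horizon
obs = [datetime(2020, 1, 1, h) for h in (0, 6, 12, 18)]
event = datetime(2020, 1, 1, 20)
print(label_with_horizon(obs, event, horizon_hours=12))  # -> [0, 0, 1, 1]
```

Stretching the horizon inflates the number of positive labels and can raise apparent sensitivity, one reason the heterogeneity of horizons (2 to 48 hours here) complicates cross-study comparison.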
We found three studies reporting patient outcomes. The most recent of these was a large multicenter implementation study by Escobar et al25 that included an extensive follow-up response. This study reported a significantly decreased 30-day mortality in the intervention cohort. A smaller randomized controlled trial reported no significant differences in patient outcomes with earlier warning alarms.27 A third study reported more appropriate rapid response team deployment and decreased mortality in a subgroup analysis.35
Effects, Facilitators, and Barriers
As shown in the Appendix Figure and further detailed in Table 3, the described effects were predominantly positive—57 positive effects vs 11 negative effects. These positive effects sorted primarily into the outcome and process domains.
All of the studies that compared their proposed model with one of various warning systems (eg, EWS, National Early Warning Score [NEWS], Modified Early Warning Score [MEWS]) showed superior performance (based on AUROC and reported predictive values). In 17 studies, the authors reported their model as more useful than or superior to the EWS.20-23,26-28,34,36-41 Four studies reported real-time detection of deterioration before regular EWS,20,26,42 and three studies reported positive effects on patient-related outcomes.26,35 Four negative effects were noted, concerning controllability, validity, and potential limitations.27,42
Of the 38 remarks in the Technology domain, difficulty with implementation in daily practice was a commonly cited barrier.22,24,40,42 Difficulties included creating real-time data feeds out of the EMR, although some successful examples were mentioned.25,27,36 The limited interpretability of AI was also considered a potential barrier.30,32,33,35,39,41 Several remarks questioned the applicability of prolonged prediction horizons because the predictions become decoupled from the clinical picture.39,42
Conservative attitudes toward new technologies and inadequate knowledge were mentioned as barriers.39 Repeated remarks were made on the difficulty of interpreting and responding to a predicted escalation, as the clinical pattern might not be recognizable at such an early stage. On the other hand, it is expected that less invasive countermeasures would be adequate to avert further escalation. Earlier recognition of possible escalations also raised potential ethical questions, such as when to discuss palliative care.24
The heterogeneity of the general ward population and the relatively low prevalence of deterioration were mentioned as barriers.24,30,38,41 There were also concerns that not all escalations are preventable and that some patient outcomes may not be modifiable.24,38
Many investigators expected reductions in false alarms and associated alarm fatigue (reflected as higher PPVs). Furthermore, they expected workflow to improve and workload to decrease.21,23,27,31,33,35,38,41 Despite the capacity of modern EMRs to store large amounts of patient data, some investigators felt that improvements in real-time access, data quality and validity, and data density were needed to ensure valid predictions.21,22,24,32,37
DISCUSSION
As the complexity and comorbidity of hospitalized adults grow, predicting clinical deterioration is becoming more important. With an ever-increasing amount of data available, predictive algorithms are well positioned to support its earlier detection.
There are several important limitations across these studies. In a clinical setting, these models would function as a screening test. Almost all studies report an AUROC; however, sensitivity and PPV or NNE (defined as 1/PPV) may be more useful than AUROC when predicting low-frequency events with high potential clinical impact.44 Assessing the NNE is especially relevant because of its relation to alarm fatigue and the responsiveness of clinicians.43 Alarm fatigue and lack of adequate response to alarms were repeatedly cited as potential barriers to the application of automated scores.
Although the results of our scoping review are promising, there are limited data on clinical outcomes using these algorithms. Only three of five algorithms were used to guide clinical decision-making.25,27,35 Kollef et al27 showed shorter hospitalizations and Evans et al35 found decreased mortality rates in a multimorbid subgroup. Escobar et al25 found an overall and consistent decrease in mortality in a large, heterogeneous population of inpatients across 21 hospitals. While Escobar et al's findings provide strong evidence that predictive algorithms and structured follow-up on alarms can improve patient outcomes, the authors recognize that not all facilities will have the resources to implement them.25 Dedicated round-the-clock follow-up of alarms has yet to be proven feasible for smaller institutions, and leaner solutions must be explored. The example set by Escobar et al25 should be translated into various settings to prove its reproducibility and to substantiate the clinical impact of predictive models and structured follow-up.
According to expert opinion, the use of high-frequency or continuous monitoring at low-acuity wards and AI algorithms to detect trends and patterns will reduce failure-to-rescue rates.4,9,43 However, most studies in our review focused on periodic spot-checked vital signs, and none of the AI algorithms were implemented in clinical care (Appendix Table 1).
STRENGTHS AND LIMITATIONS
We performed a comprehensive review of the current literature using a clear and reproducible methodology to minimize the risk of missing relevant publications. The identified research is mainly limited to large US centers and consists of mostly retrospective studies. Heterogeneity among inputs, endpoints, time horizons, and evaluation metrics make comparisons challenging. Comments on facilitators, barriers, and effects were limited.
RECOMMENDATIONS FOR FUTURE RESEARCH
Artificial intelligence and the use of continuous monitoring hold great promise in creating optimal predictive algorithms. Future studies should directly compare AI- and non-AI-based algorithms using continuous monitoring to determine predictive accuracy, feasibility, costs, and outcomes. A consensus on endpoint definitions, input variables, methodology, and reporting is needed to enhance reproducibility, comparability, and generalizability of future research.
CONCLUSION
The early identification of clinical deterioration among adult hospitalized patients remains a challenge.1 Delayed identification is associated with increased morbidity and mortality, unplanned intensive care unit (ICU) admissions, prolonged hospitalization, and higher costs.2,3 Earlier detection of deterioration using predictive algorithms of vital sign monitoring might avoid these negative outcomes.4 In this scoping review, we summarize current algorithms and their evidence.
Vital signs provide the backbone for detecting clinical deterioration. Early warning scores (EWS) and outreach protocols were developed to bring structure to the assessment of vital signs. Most EWS claim to predict clinical end points such as unplanned ICU admission up to 24 hours in advance.5,6 Reviews of EWS showed a positive trend toward reduced length of stay and mortality. However, conclusions about general efficacy could not be generated because of case heterogeneity and methodologic shortcomings.4,7 Continuous automated vital sign monitoring of patients on the general ward can now be accomplished with wearable devices.8 The first reports on continuous monitoring showed earlier detection of deterioration but not improved clinical end points.4,9 Since then, different reports on continuous monitoring have shown positive effects but concluded that unprocessed monitoring data per se falls short of generating actionable alarms.4,10,11
Predictive algorithms, which often use artificial intelligence (AI), are increasingly employed to recognize complex patterns or abnormalities and support predictions of events in big data sets.12,13 Especially when combined with continuous vital sign monitoring, predictive algorithms have the potential to expedite detection of clinical deterioration and improve patient outcomes. Predictive algorithms using vital signs in the ICU have shown promising results.14 The impact of predictive algorithms on the general wards, however, is unclear.
The aims of our scoping review were to explore the extent and range of and evidence for predictive vital signs–based algorithms on the adult general ward; to describe the variety of these algorithms; and to categorize effects, facilitators, and barriers of their implementation.15
MATERIALS AND METHODS
We performed a scoping review to create a summary of the current state of research. We used the five-step method of Levac and followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses Extension for Scoping Reviews guidelines (Appendix 1).16,17
PubMed, Embase, and CINAHL databases were searched for English-language articles written between January 1, 2010, and November 20, 2020. We developed the search queries with an experienced information scientist, and we used database-specific terms and strategies for input, clinical outcome, method, predictive capability, and population (Appendix 2). Additionally, we searched the references of the selected articles, as well as publications citing these articles.
All studies identified were screened by title and abstract by two researchers (RP and YE). The selected studies were read in their entirety and checked for eligibility using the following inclusion criteria: automated algorithm; vital signs-based; real-time prediction; of clinical deterioration; in an adult, general ward population. In cases where there were successive publications with the same algorithm and population, we selected the most recent study.
For screening and selection, we used the Rayyan QCRI online tool (Qatar Computing Research Institute) and Endnote X9 (Clarivate Analytics). We extracted information using a data extraction form and organized it into descriptive characteristics of the selected studies (Table 1): an input data table showing number of admissions, intermittent or continuous measurements, vital signs measured, laboratory results (Appendix Table 1), a table summarizing study designs and settings (Appendix Table 2), and a prediction performance table (Table 2). We report characteristics of the populations and algorithms, prediction specifications such as area under the receiver operating curve (AUROC), and predictive values. Predictive values are affected by prevalence, which may differ among populations. To compare the algorithms, we calculated an indexed positive predictive value (PPV) and a number needed to evaluate (NNE) using a weighted average prevalence of clinical deterioration of 3.0%.
We defined clinical deterioration as end points, including rapid response team activation, cardiopulmonary resuscitation, transfer to an ICU, or death.
Effects, facilitators, and barriers were identified and categorized using ATLAS.ti 8 software (ATLAS.ti) and evaluated by three researchers (RP, MK, and THvdB). These were categorized using the adapted frameworks of Gagnon et al18 for the barriers and facilitators and of Donabedian19 for the effects (Appendix 3).
The Gagnon et al framework was adapted by changing two of four domains—that is, “Individual” was changed to “Professional” and “Human” to “Physiology.” The domains of “Technology” and “Organization” remained unchanged. The Donabedian domains of “Outcome,” “Process,” and “Structure” also remained unchanged (Table 3).
We divided the studies into two groups: studies on predictive algorithms with and without AI when reporting on characteristics and performance. For the secondary aim of exploring implementation impact, we reported facilitators and barriers in a narrative way, highlighting the most frequent and notable findings.
RESULTS
As shown in the Figure, we found 1741 publications, of which we read the full-text of 109. There were 1632 publications that did not meet the inclusion criteria. The publications by Churpek et al,20,21 Bartkiowak et al,22 Edelson et al,23 Escobar et al,24,25 and Kipnis et al26 reported on the same algorithms or databases but had significantly different approaches. For multiple publications using the same algorithm and population, the most recent was named with inclusion of the earlier findings.20,21,27-29 The resulting 21 papers are included in this review.
Descriptive characteristics of the studies are summarized in Table 1. Nineteen of the publications were full papers and two were conference abstracts. Most of the studies (n = 18) were from the United States; there was one study from South Korea,30 one study from Portugal,31 and one study from the United Kingdom.32 In 15 of the studies, there was a strict focus on general or specific wards; 6 studies also included the ICU and/or emergency departments.
Two of the studies were clinical trials, 2 were prospective observational studies, and 17 were retrospective studies. Five studies reported on an active predictive model during admission. Of these, 3 reported that the model was clinically implemented, using the predictions in their clinical workflow. None of the implemented studies used AI.
All input variables are presented in Appendix Table 1.
The non-AI algorithm prediction horizons ranged from 4 to 24 hours, with a median of 24 hours (interquartile range [IQR], 12-24 hours). The AI algorithms ranged from 2 to 48 hours and had a median horizon of 14 hours (IQR, 12-24 hours).
We found three studies reporting patient outcomes. The most recent of these was a large multicenter implementation study by Escobar et al25 that included an extensive follow-up response. This study reported a significantly decreased 30-day mortality in the intervention cohort. A smaller randomized controlled trial reported no significant differences in patient outcomes with earlier warning alarms.27 A third study reported more appropriate rapid response team deployment and decreased mortality in a subgroup analysis.35
Effects, Facilitators, and Barriers
As shown in the Appendix Figure and further detailed in Table 3, the described effects were predominantly positive—57 positive effects vs 11 negative effects. The positive effects fell primarily into the outcome and process domains.
All of the studies that compared their proposed model with an existing warning system (eg, EWS, National Early Warning Score [NEWS], Modified Early Warning Score [MEWS]) showed superior performance (based on AUROC and reported predictive values). In 17 studies, the authors reported their model as more useful than, or superior to, the EWS.20-23,26-28,34,36-41 Four studies reported real-time detection of deterioration before the regular EWS,20,26,42 and three studies reported positive effects on patient-related outcomes.26,35 Four negative effects were noted, concerning controllability, validity, and potential limitations.27,42
Of the 38 remarks in the Technology domain, difficulty with implementation in daily practice was a commonly cited barrier.22,24,40,42 Difficulties included creating real-time data feeds from the EMR, although some successful examples were mentioned.25,27,36 Limited interpretability of AI was also considered a potential barrier.30,32,33,35,39,41 Some authors questioned the applicability of prolonged prediction horizons because the prediction becomes decoupled from the current clinical picture.39,42
Conservative attitudes toward new technologies and inadequate knowledge were mentioned as barriers.39 Repeated remarks were made on the difficulty of interpreting and responding to a predicted escalation, as the clinical pattern might not be recognizable at such an early stage. On the other hand, less invasive countermeasures are expected to be adequate to avert further escalation at that stage. Earlier recognition of possible escalations also raised potential ethical questions, such as when to discuss palliative care.24
The heterogeneity of the general ward population and the relatively low prevalence of deterioration were mentioned as barriers.24,30,38,41 There were also concerns that not all escalations are preventable and that some patient outcomes may not be modifiable.24,38
Many investigators expected reductions in false alarms and the associated alarm fatigue (reflected in higher PPVs), as well as improved workflow and decreased workload.21,23,27,31,33,35,38,41 Despite the capacity of modern EMRs to store large amounts of patient data, some investigators felt that real-time access, data quality and validity, and data density need improvement to ensure valid predictions.21,22,24,32,37
DISCUSSION
As the complexity and comorbidity of hospitalized adults grow, predicting clinical deterioration is becoming more important. With an ever-increasing amount of available
There are several important limitations across these studies. In a clinical setting, these models would function as a screening test. Almost all studies report an AUROC; however, sensitivity and PPV or the number needed to evaluate (NNE, defined as 1/PPV) may be more useful than the AUROC when predicting low-frequency events with high potential clinical impact.44 Assessing the NNE is especially relevant because of its relation to alarm fatigue and the responsiveness of clinicians.43 Alarm fatigue and lack of adequate response to alarms were repeatedly cited as potential barriers to the application of automated scores.
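The point about screening metrics can be made concrete with a small worked example (the numbers below are invented for illustration and are not drawn from any of the reviewed studies): at the low event prevalence typical of general wards, a model with good discrimination can still produce a low PPV and therefore a high NNE.

```python
# Illustrative calculation of sensitivity, PPV, and NNE (= 1/PPV)
# from a hypothetical confusion matrix at low event prevalence.
# All counts are invented for illustration.

def screening_metrics(tp, fp, fn, tn):
    """Return sensitivity, PPV, and number needed to evaluate (NNE)."""
    sensitivity = tp / (tp + fn)
    ppv = tp / (tp + fp)
    nne = 1 / ppv  # alarms clinicians must assess per true deterioration found
    return sensitivity, ppv, nne

# 10,000 ward patients, 1% deterioration prevalence (100 events).
# Suppose the model detects 85 of the 100 events but also raises 415 false alarms.
sens, ppv, nne = screening_metrics(tp=85, fp=415, fn=15, tn=9485)
print(f"sensitivity={sens:.2f}, PPV={ppv:.2f}, NNE={nne:.1f}")
# sensitivity=0.85, PPV=0.17, NNE=5.9
```

In this sketch, clinicians would need to evaluate roughly 6 alarms for every patient who truly deteriorates, which is why the NNE relates so directly to alarm fatigue even when the AUROC looks favorable.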
Although the results of our scoping review are promising, there are limited data on clinical outcomes using these algorithms. Only three of five algorithms were used to guide clinical decision-making.25,27,35 Kollef et al27 showed shorter hospitalizations, and Evans et al35 found decreased mortality rates in a multimorbid subgroup. Escobar et al25 found an overall and consistent decrease in mortality in a large, heterogeneous population of inpatients across 21 hospitals. While Escobar et al's findings provide strong evidence that predictive algorithms and structured follow-up on alarms can improve patient outcomes, the authors recognize that not all facilities will have the resources to implement them.25 Dedicated round-the-clock follow-up of alarms has yet to be proven feasible for smaller institutions, and leaner solutions must be explored. The example set by Escobar et al25 should be translated into various settings to prove its reproducibility and to substantiate the clinical impact of predictive models and structured follow-up.
According to expert opinion, the use of high-frequency or continuous monitoring in low-acuity wards, combined with AI algorithms to detect trends and patterns, will reduce failure-to-rescue rates.4,9,43 However, most studies in our review focused on periodically spot-checked vital signs, and none of the AI algorithms were implemented in clinical care (Appendix Table 1).
STRENGTHS AND LIMITATIONS
We performed a comprehensive review of the current literature using a clear and reproducible methodology to minimize the risk of missing relevant publications. The identified research is mainly limited to large US centers and consists mostly of retrospective studies. Heterogeneity among inputs, endpoints, time horizons, and evaluation metrics makes comparisons challenging. Comments on facilitators, barriers, and effects were limited.
RECOMMENDATIONS FOR FUTURE RESEARCH
Artificial intelligence and the use of continuous monitoring hold great promise in creating optimal predictive algorithms. Future studies should directly compare AI- and non-AI-based algorithms using continuous monitoring to determine predictive accuracy, feasibility, costs, and outcomes. A consensus on endpoint definitions, input variables, methodology, and reporting is needed to enhance reproducibility, comparability, and generalizability of future research.
CONCLUSION
REFERENCES

1. van Galen LS, Struik PW, Driesen BEJM, et al. Delayed recognition of deterioration of patients in general wards is mostly caused by human related monitoring failures: a root cause analysis of unplanned ICU admissions. PLoS One. 2016;11(8):e0161393. https://doi.org/10.1371/journal.pone.0161393
2. Mardini L, Lipes J, Jayaraman D. Adverse outcomes associated with delayed intensive care consultation in medical and surgical inpatients. J Crit Care. 2012;27(6):688-693. https://doi.org/10.1016/j.jcrc.2012.04.011
3. Young MP, Gooder VJ, McBride K, James B, Fisher ES. Inpatient transfers to the intensive care unit: delays are associated with increased mortality and morbidity. J Gen Intern Med. 2003;18(2):77-83. https://doi.org/10.1046/j.1525-1497.2003.20441.x
4. Khanna AK, Hoppe P, Saugel B. Automated continuous noninvasive ward monitoring: future directions and challenges. Crit Care. 2019;23(1):194. https://doi.org/10.1186/s13054-019-2485-7
5. Ludikhuize J, Hamming A, de Jonge E, Fikkers BG. Rapid response systems in The Netherlands. Jt Comm J Qual Patient Saf. 2011;37(3):138-197. https://doi.org/10.1016/s1553-7250(11)37017-1
6. Cuthbertson BH, Boroujerdi M, McKie L, Aucott L, Prescott G. Can physiological variables and early warning scoring systems allow early recognition of the deteriorating surgical patient? Crit Care Med. 2007;35(2):402-409. https://doi.org/10.1097/01.ccm.0000254826.10520.87
7. Alam N, Hobbelink EL, van Tienhoven AJ, van de Ven PM, Jansma EP, Nanayakkara PWB. The impact of the use of the Early Warning Score (EWS) on patient outcomes: a systematic review. Resuscitation. 2014;85(5):587-594. https://doi.org/10.1016/j.resuscitation.2014.01.013
8. Weenk M, Koeneman M, van de Belt TH, Engelen LJLPG, van Goor H, Bredie SJH. Wireless and continuous monitoring of vital signs in patients at the general ward. Resuscitation. 2019;136:47-53. https://doi.org/10.1016/j.resuscitation.2019.01.017
9. Cardona-Morrell M, Prgomet M, Turner RM, Nicholson M, Hillman K. Effectiveness of continuous or intermittent vital signs monitoring in preventing adverse events on general wards: a systematic review and meta-analysis. Int J Clin Pract. 2016;70(10):806-824. https://doi.org/10.1111/ijcp.12846
10. Brown H, Terrence J, Vasquez P, Bates DW, Zimlichman E. Continuous monitoring in an inpatient medical-surgical unit: a controlled clinical trial. Am J Med. 2014;127(3):226-232. https://doi.org/10.1016/j.amjmed.2013.12.004
11. Mestrom E, De Bie A, van de Steeg M, Driessen M, Atallah L, Bezemer R. Implementation of an automated early warning scoring system in a surgical ward: practical use and effects on patient outcomes. PLoS One. 2019;14(5):e0213402. https://doi.org/10.1371/journal.pone.0213402
12. Jiang F, Jiang Y, Zhi H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243. https://doi.org/10.1136/svn-2017-000101
13. Iwashyna TJ, Liu V. What's so different about big data? A primer for clinicians trained to think epidemiologically. Ann Am Thorac Soc. 2014;11(7):1130-1135. https://doi.org/10.1513/annalsats.201405-185as
14. Jalali A, Bender D, Rehman M, Nadkarni V, Nataraj C. Advanced analytics for outcome prediction in intensive care units. Conf Proc IEEE Eng Med Biol Soc. 2016;2016:2520-2524. https://doi.org/10.1109/embc.2016.7591243
15. Munn Z, Peters MDJ, Stern C, Tufanaru C, McArthur A, Aromataris E. Systematic review or scoping review? Guidance for authors when choosing between a systematic or scoping review approach. BMC Med Res Methodol. 2018;18(1):143. https://doi.org/10.1186/s12874-018-0611-x
16. Arksey H, O'Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. 2005;8(1):19-32. https://doi.org/10.1080/1364557032000119616
17. Tricco AC, Lillie E, Zarin W, et al. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467-473. https://doi.org/10.7326/m18-0850
18. Gagnon MP, Desmartis M, Gagnon J, et al. Framework for user involvement in health technology assessment at the local level: views of health managers, user representatives, and clinicians. Int J Technol Assess Health Care. 2015;31(1-2):68-77. https://doi.org/10.1017/s0266462315000070
19. Donabedian A. The quality of care. How can it be assessed? JAMA. 1988;260(12):1743-1748. https://doi.org/10.1001/jama.260.12.1743
20. Churpek MM, Yuen TC, Winslow C, et al. Multicenter development and validation of a risk stratification tool for ward patients. Am J Respir Crit Care Med. 2014;190(6):649-655. https://doi.org/10.1164/rccm.201406-1022oc
21. Churpek MM, Yuen TC, Winslow C, Meltzer DO, Kattan MW, Edelson DP. Multicenter comparison of machine learning methods and conventional regression for predicting clinical deterioration on the wards. Crit Care Med. 2016;44(2):368-374. https://doi.org/10.1097/ccm.0000000000001571
22. Bartkowiak B, Snyder AM, Benjamin A, et al. Validating the electronic cardiac arrest risk triage (eCART) score for risk stratification of surgical inpatients in the postoperative setting: retrospective cohort study. Ann Surg. 2019;269(6):1059-1063. https://doi.org/10.1097/sla.0000000000002665
23. Edelson DP, Carey K, Winslow CJ, Churpek MM. Less is more: detecting clinical deterioration in the hospital with machine learning using only age, heart rate and respiratory rate. Abstract presented at: American Thoracic Society International Conference; May 22, 2018; San Diego, California. Am J Resp Crit Care Med. 2018;197:A4444.
24. Escobar GJ, LaGuardia JC, Turk BJ, Ragins A, Kipnis P, Draper D. Early detection of impending physiologic deterioration among patients who are not in intensive care: development of predictive models using data from an automated electronic medical record. J Hosp Med. 2012;7(5):388-395. https://doi.org/10.1002/jhm.1929
25. Escobar GJ, Liu VX, Schuler A, Lawson B, Greene JD, Kipnis P. Automated identification of adults at risk for in-hospital clinical deterioration. N Engl J Med. 2020;383(20):1951-1960. https://doi.org/10.1056/nejmsa2001090
26. Kipnis P, Turk BJ, Wulf DA, et al. Development and validation of an electronic medical record-based alert score for detection of inpatient deterioration outside the ICU. J Biomed Inform. 2016;64:10-19. https://doi.org/10.1016/j.jbi.2016.09.013
27. Kollef MH, Chen Y, Heard K, et al. A randomized trial of real-time automated clinical deterioration alerts sent to a rapid response team. J Hosp Med. 2014;9(7):424-429. https://doi.org/10.1002/jhm.2193
28. Hackmann G, Chen M, Chipara O, et al. Toward a two-tier clinical warning system for hospitalized patients. AMIA Annu Symp Proc. 2011;2011:511-519.
29. Bailey TC, Chen Y, Mao Y, Lu C, Hackmann G, Micek ST. A trial of a real-time alert for clinical deterioration in patients hospitalized on general medical wards. J Hosp Med. 2013;8(5):236-242. https://doi.org/10.1002/jhm.2009
30. Kwon JM, Lee Y, Lee Y, Lee S, Park J. An algorithm based on deep learning for predicting in-hospital cardiac arrest. J Am Heart Assoc. 2018;7(13):e008678. https://doi.org/10.1161/jaha.118.008678
31. Correia S, Gomes A, Shahriari S, Almeida JP, Severo M, Azevedo A. Performance of the early warning system vital to predict unanticipated higher-level of care admission and in-hospital death of ward patients. Value Health. 2018;21(S3):S360. https://doi.org/10.1016/j.jval.2018.09.2152
32. Shamout FE, Zhu T, Sharma P, Watkinson PJ, Clifton DA. Deep interpretable early warning system for the detection of clinical deterioration. IEEE J Biomed Health Inform. 2020;24(2):437-446. https://doi.org/10.1109/jbhi.2019.2937803
33. Bai Y, Do DH, Harris PRE, et al. Integrating monitor alarms with laboratory test results to enhance patient deterioration prediction. J Biomed Inform. 2015;53:81-92. https://doi.org/10.1016/j.jbi.2014.09.006
34. Hu X, Sapo M, Nenov V, et al. Predictive combinations of monitor alarms preceding in-hospital code blue events. J Biomed Inform. 2012;45(5):913-921. https://doi.org/10.1016/j.jbi.2012.03.001
35. Evans RS, Kuttler KG, Simpson KJ, et al. Automated detection of physiologic deterioration in hospitalized patients. J Am Med Inform Assoc. 2015;22(2):350-360. https://doi.org/10.1136/amiajnl-2014-002816
36. Ghosh E, Eshelman L, Yang L, Carlson E, Lord B. Early deterioration indicator: data-driven approach to detecting deterioration in general ward. Resuscitation. 2018;122:99-105. https://doi.org/10.1016/j.resuscitation.2017.10.026
37. Kang MA, Churpek MM, Zadravecz FJ, Adhikari R, Twu NM, Edelson DP. Real-time risk prediction on the wards: a feasibility study. Crit Care Med. 2016;44(8):1468-1473. https://doi.org/10.1097/ccm.0000000000001716
38. Hu SB, Wong DJL, Correa A, Li N, Deng JC. Prediction of clinical deterioration in hospitalized adult patients with hematologic malignancies using a neural network model. PLoS One. 2016;11(8):e0161401. https://doi.org/10.1371/journal.pone.0161401
39. Rothman MJ, Rothman SI, Beals J 4th. Development and validation of a continuous measure of patient condition using the electronic medical record. J Biomed Inform. 2013;46(5):837-848. https://doi.org/10.1016/j.jbi.2013.06.011
40. Alaa AM, Yoon J, Hu S, van der Schaar M. Personalized risk scoring for critical care prognosis using mixtures of Gaussian processes. IEEE Trans Biomed Eng. 2018;65(1):207-218. https://doi.org/10.1109/tbme.2017.2698602
41. Mohamadlou H, Panchavati S, Calvert J, et al. Multicenter validation of a machine-learning algorithm for 48-h all-cause mortality prediction. Health Informatics J. 2020;26(3):1912-1925. https://doi.org/10.1177/1460458219894494
42. Alvarez CA, Clark CA, Zhang S, et al. Predicting out of intensive care unit cardiopulmonary arrest or death using electronic medical record data. BMC Med Inform Decis Mak. 2013;13:28. https://doi.org/10.1186/1472-6947-13-28
43. Vincent JL, Einav S, Pearse R, et al. Improving detection of patient deterioration in the general hospital ward environment. Eur J Anaesthesiol. 2018;35(5):325-333. https://doi.org/10.1097/eja.0000000000000798
44. Romero-Brufau S, Huddleston JM, Escobar GJ, Liebow M. Why the C-statistic is not informative to evaluate early warning scores and what metrics to use. Crit Care. 2015;19(1):285. https://doi.org/10.1186/s13054-015-0999-1
45. Weenk M, Bredie SJ, Koeneman M, Hesselink G, van Goor H, van de Belt TH. Continuous monitoring of the vital signs in the general ward using wearable devices: randomized controlled trial. J Med Internet Res. 2020;22(6):e15471. https://doi.org/10.2196/15471
46. Wellner B, Grand J, Canzone E, et al. Predicting unplanned transfers to the intensive care unit: a machine learning approach leveraging diverse clinical elements. JMIR Med Inform. 2017;5(4):e45. https://doi.org/10.2196/medinform.8680
47. Elliott M, Baird J. Pulse oximetry and the enduring neglect of respiratory rate assessment: a commentary on patient surveillance. Br J Nurs. 2019;28(19):1256-1259. https://doi.org/10.12968/bjon.2019.28.19.1256
48. Blackwell JN, Keim-Malpass J, Clark MT, et al. Early detection of in-patient deterioration: one prediction model does not fit all. Crit Care Explor. 2020;2(5):e0116. https://doi.org/10.1097/cce.0000000000000116
49. Johnson AEW, Pollard TJ, Shen L, et al. MIMIC-III, a freely accessible critical care database. Sci Data. 2016;3:160035. https://doi.org/10.1038/sdata.2016.35
50. Bodenheimer T, Sinsky C. From triple to quadruple aim: care of the patient requires care of the provider. Ann Fam Med. 2014;12(6):573-576. https://doi.org/10.1370/afm.1713
51. Kirkland LL, Malinchoc M, O'Byrne M, et al. A clinical deterioration prediction tool for internal medicine patients. Am J Med Qual. 2013;28(2):135-142. https://doi.org/10.1177/1062860612450459
Reducing Overuse of Proton Pump Inhibitors for Stress Ulcer Prophylaxis and Nonvariceal Gastrointestinal Bleeding in the Hospital: A Narrative Review and Implementation Guide
Proton pump inhibitors (PPIs) are among the most commonly used drugs worldwide to treat dyspepsia and prevent gastrointestinal bleeding (GIB).1 Between 40% and 70% of hospitalized patients receive acid-suppressive therapy (AST; defined as PPIs or histamine-receptor antagonists), and nearly half of these therapies are initiated during the inpatient stay.2,3 Up to 50% of inpatients started on a new AST are discharged on these medications,2 yet a majority of these prescriptions have no evidence-based indication.2,3
Growing evidence shows that PPIs are overutilized and may be associated with wide-ranging adverse events, such as acute and chronic kidney disease,4 Clostridium difficile infection,5 hypomagnesemia,6 and fractures.7 Because of this widespread overuse and the potential for harm, a concerted effort to promote appropriate PPI use in the inpatient setting is necessary. Importantly, reducing the use of PPIs does not increase the risk of GIB or worsening dyspepsia; rather, it lowers the risk of harm to patients. Efforts to reduce overuse, however, are complex and difficult.
This article summarizes evidence regarding interventions to reduce overuse and offers an implementation guide based on this evidence. This guide promotes value-based quality improvement and provides a blueprint for implementing an institution-wide program to reduce PPI overuse in the inpatient setting. We begin with a discussion about quality initiatives to reduce PPI overuse, followed by a review of the safety outcomes associated with reduced use of PPIs.
METHODS
A focused search of the US National Library of Medicine’s PubMed database was performed to identify English-language articles published between 2000 and 2018 that addressed strategies to reduce PPI overuse for stress ulcer prophylaxis (SUP) and nonvariceal GIB. The following search terms were used: PPI and inappropriate use; acid-suppressive therapy and inappropriate use; PPI and discontinuation; acid-suppressive (or suppressant) therapy and discontinuation; SUP and cost; and histamine receptor antagonist and PPI. Inpatient or outpatient studies of patients aged 18 years or older were considered for inclusion in this narrative review, and all study types were included. The primary exclusion criterion was patients aged younger than 18 years. A manual review of the full text of the retrieved articles was performed and references were reviewed for missed citations.
RESULTS
We identified a total of 1,497 unique citations through our initial search. After performing a manual review, we excluded 1,483 of the references and added an additional 2, resulting in 16 articles selected for inclusion. The selected articles addressed interventions falling into three main groupings: implementation of institutional guidelines with or without electronic health record (EHR)–based decision support, educational interventions alone, and multifaceted interventions. Each of these interventions is discussed in the sections that follow. Table 1, Table 2, and Table 3 summarize the results of the studies included in our narrative review.
QUALITY INITIATIVES TO REDUCE PPI OVERUSE
Institutional Guidelines With or Without EHR-Based Decision Support
Table 1 summarizes institutional guidelines, with or without EHR-based decision support, to reduce inappropriate PPI use. The implementation of institutional guidelines for the appropriate reduction of PPI use has had some success. Coursol and Sanzari evaluated the impact of a treatment algorithm on the appropriateness of prescriptions for SUP in the intensive care unit (ICU).8 Risk factors of patients in this study included mechanical ventilation for 48 hours, coagulopathy for 24 hours, postoperative transplant, severe burns, active gastrointestinal (GI) disease, multiple trauma, multiple organ failure, and septicemia. The three treatment options chosen for the algorithm were intravenous (IV) famotidine (if the oral route was unavailable or impractical), omeprazole tablets (if oral access was available), and omeprazole suspension (in cases of dysphagia and presence of a nasogastric or orogastric tube). After implementation of the treatment algorithm, the proportion of inappropriate prophylaxis decreased from 95.7% to 88.2% (P = .033), and the cost per patient decreased from CAN$11.11 to CAN$8.49 (P = .003).
Van Vliet et al implemented a clinical practice guideline listing specific criteria for prescribing a PPI.9 Their criteria included the presence of gastric or duodenal ulcer and use of a nonsteroidal anti-inflammatory drug (NSAID) or aspirin, plus at least one additional risk factor (eg, history of gastroduodenal hemorrhage or age >70 years). The proportion of patients started on PPIs during hospitalization decreased from 21% to 13% (odds ratio, 0.56; 95% CI, 0.33-0.97).
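As a quick arithmetic check, the reported odds ratio follows directly from the two proportions; the short calculation below is purely illustrative and is not drawn from the study's data tables:

```python
# Reproduce the reported odds ratio from the before/after proportions
# of patients started on PPIs during hospitalization (21% -> 13%).
pre, post = 0.21, 0.13
odds_pre = pre / (1 - pre)      # odds before guideline implementation
odds_post = post / (1 - post)   # odds after guideline implementation
odds_ratio = odds_post / odds_pre
print(round(odds_ratio, 2))     # prints 0.56, matching the reported value
```

Note that the odds ratio (0.56) is smaller than the simple risk ratio (0.13/0.21 ≈ 0.62); the two diverge whenever the baseline proportion is not small.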
Michal et al utilized an institutional pharmacist-driven protocol that stipulated criteria for appropriate PPI use (eg, upper GIB, mechanical ventilation, peptic ulcer disease, gastroesophageal reflux disease, coagulopathy).10 Pharmacists in the study evaluated patients for PPI appropriateness and recommended changes in medication or discontinuation of use. This institutional intervention decreased PPI use in non-ICU hospitalized adults. Discontinuation of PPIs increased from 41% of patients in the preintervention group to 66% of patients in the postintervention group (P = .001).
In addition to implementing guidelines and intervention strategies, institutions have also adopted changes to the EHR to reduce inappropriate PPI use. Herzig et al utilized a computerized clinical decision support intervention to decrease SUP in non-ICU hospitalized patients.11 When SUP was selected as the sole indication for an acid-suppressive medication order, a prompt alerted the clinician that “[SUP] is not recommended for patients outside the [ICU].” The alert resulted in a significant reduction in AST ordered solely for SUP: the percentage of patients with any inappropriate acid-suppressive exposure decreased from 4.0% to 0.6% (P < .001).
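The decision rule behind this kind of indication-triggered alert is simple enough to sketch. The following is a minimal illustration only; the function and field names are hypothetical and are not drawn from any real EHR vendor API:

```python
# Illustrative sketch of an indication-triggered best practice alert:
# fire only when stress ulcer prophylaxis (SUP) is the SOLE indication
# selected for an acid-suppressive order and the patient is outside the ICU.
# All identifiers here are hypothetical, for illustration only.

SUP = "stress ulcer prophylaxis"

def should_fire_sup_alert(indications: set[str], patient_unit: str) -> bool:
    """Return True when the order warrants the guideline prompt."""
    in_icu = "icu" in patient_unit.lower()
    return indications == {SUP} and not in_icu

def alert_text() -> str:
    # Wording paraphrased from the prompt described in the study
    return "Stress ulcer prophylaxis is not recommended for patients outside the ICU."
```

Because the rule fires only when SUP is the lone indication, orders with any accepted indication (eg, upper GIB, gastroesophageal reflux disease) pass through without interruption, which helps limit alert fatigue.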
EDUCATION
Table 2 summarizes educational interventions to reduce inappropriate PPI use.
Agee et al employed a pharmacist-led educational seminar that described SUP indications, risks, and costs.12 Inappropriate SUP prescriptions decreased from 55.5% to 30.5% after the intervention (P < .0001). However, there was no reduction in the percentage of patients discharged on inappropriate AST.
Chui et al performed an academic detailing intervention, in which an educator met one-on-one with physicians to improve their prescribing behavior.13 In this study, academic detailing focused on the most common situations in which PPIs were inappropriately prescribed at that hospital (eg, surgical prophylaxis, anemia). Inappropriate use of double-dose PPIs was also targeted. Despite these efforts, no significant difference in inappropriate PPI prescribing was observed post intervention.
Hamzat et al implemented an educational strategy to reduce inappropriate PPI prescribing during hospital stays, which included dissemination of fliers, posters, emails, and presentations over a 4-week period.14 Educational efforts targeted clinical pharmacists, nurses, physicians, and patients. Appropriate indications for PPI use in this study included peptic ulcer disease (current or previous), H pylori infection, and treatment or prevention of an NSAID-induced ulcer. The primary outcome was a reduction in PPI dose or discontinuation of PPI during the hospital admission, which increased from 9% in the preintervention (pre-education) phase to 43% during the intervention (education) phase and to 46% in the postintervention (posteducation) phase (P = .006).
Liberman and Whelan also implemented an educational intervention among internal medicine residents to reduce inappropriate use of SUP; this intervention was based on practice-based learning and improvement methodology.15 They noted that the rate of inappropriate prophylaxis with AST decreased from 59% preintervention to 33% post intervention (P < .007).
MULTIFACETED APPROACHES
Table 3 summarizes several multifaceted approaches aimed at reducing inappropriate PPI use. Belfield et al utilized an intervention consisting of an institutional guideline review, education, and monitoring of AST by clinical pharmacists to reduce inappropriate use of PPI for SUP.16 With this intervention, the primary outcome of total inappropriate days of AST during hospitalization decreased from 279 to 116 (48% relative reduction in risk, P < .01, across 142 patients studied). Furthermore, inappropriate AST prescriptions at discharge decreased from 32% to 8% (P = .006). The one case of GIB noted in this study occurred in the control group.
Del Giorno et al combined audit and feedback with education to reduce new PPI prescriptions at the time of discharge from the hospital.17 The educational component of this intervention included guidance regarding potentially inappropriate PPI use and associated side effects and targeted multiple departments in the hospital. This intervention led to a sustained reduction in new PPI prescriptions at discharge during the 3-year study period. The annual rate of new PPI prescriptions was 19%, 19%, 18%, and 16% in years 2014, 2015, 2016, and 2017, respectively, in the internal medicine department (postintervention group), compared with rates of 30%, 29%, 36%, 36% (P < .001) for the same years in the surgery department (control group).
Education and the use of medication reconciliation forms on admission and discharge were utilized by Gupta et al to reduce inappropriate AST in hospitalized patients from 51% prior to intervention to 22% post intervention (P < .001).18 Furthermore, the proportion of patients discharged on inappropriate AST decreased from 69% to 20% (P < .001).
Hatch et al also used educational resources and pharmacist-led medication reconciliation to reduce use of SUP.19 Before the intervention, 24.4% of patients were continued on SUP after hospital discharge without a clear indication for use; post intervention, 11% of patients were continued on SUP after discharge, with 8.7% of all patients lacking a clear indication. This represented a 64.4% relative decrease in inappropriately prescribed SUP after discharge (24.4% to 8.7%; P < .0001).
Khalili et al combined an educational intervention with an institutional guideline in an infectious disease ward to reduce inappropriate use of SUP.20 This intervention reduced the inappropriate use of AST from 80.9% before the intervention to 47.1% post intervention (P < .001).
Masood et al implemented two interventions wherein pharmacists reviewed SUP indications for each patient during daily team rounds, and ICU residents and fellows received education about indications for SUP and the implemented initiative on a bimonthly basis.21 Inappropriate AST decreased from 26.75 to 7.14 prescriptions per 100 patient-days of care (P < .001).
McDonald et al combined education with a web-based quality improvement tool to reduce inappropriate exit prescriptions for PPIs.22 The proportion of PPIs discontinued at hospital discharge increased from 7.7% per month to 18.5% per month (P = .03).
Finally, the initiative implemented by Tasaka et al to reduce overutilization of SUP included an institutional guideline, a pharmacist-led intervention, and an institutional education and awareness campaign.23 Their initiative was associated with nonsignificant reductions in inappropriate SUP both at the time of transfer out of the ICU (8% before intervention vs 4% post intervention, P = .54) and at the time of discharge from the hospital (7% before intervention vs 0% post intervention, P = .22).
REDUCING PPI USE AND SAFETY OUTCOMES
Proton pump inhibitors are often initiated in the hospital setting, with up to half of these new prescriptions continued at discharge.2,24,25 Inappropriate prescriptions for PPIs expose patients to excess risk of long-term adverse events.26 De-escalating PPIs, however, raises concern among clinicians and patients for potential recurrence of dyspepsia and GIB. There is limited evidence regarding long-term safety outcomes (including GIB) following the discontinuation of PPIs deemed to have been inappropriately initiated in the hospital. In view of this, clinicians should educate and monitor individual patients for symptom relapse to ensure timely and appropriate resumption of AST.
LIMITATIONS
Our literature search for this narrative review and implementation guide has limitations. First, the time frame we included (2000-2018) may have excluded relevant articles published before our starting year; we did not include articles published before 2000 because of concerns that they might contain outdated information. In addition, retrieval of relevant studies may have been incomplete given the labor-intensive process of determining whether PPI prescriptions are appropriate or inappropriate.
We noted that interventional studies aimed at reducing overuse of PPIs were often limited by small numbers of participants; these studies were also more likely to be single-center interventions, which limits generalizability. In addition, the studies often had low methodological rigor, lacked randomization or controls, and carried a high risk of bias. Some had postimplementation periods too short to fully evaluate the sustainability of the interventions, and for multifaceted interventions the efficacy of the individual components was not clearly evaluated. Some of the larger studies used overall AST prescriptions as a surrogate for more appropriate use; it would be advantageous for a site to first perform a pilot study using well-defined criteria for appropriate prescribing and thereafter correlate those findings with total prescription counts, which can be tracked automatically and far more easily. Further, although the evidence regarding appropriate PPI use for SUP and GIB has shifted rapidly in recent years, society guidelines have not been updated to reflect this change. As such, quality improvement interventions have predominantly focused on reducing PPI use for the indications reflected by these guidelines.
IMPLEMENTATION BLUEPRINT
The following are our recommendations for successfully implementing an evidence-based, institution-wide initiative to promote the appropriate use of PPIs during hospitalization. These recommendations are informed by the evidence review and reflect the consensus of the combined committees coauthoring this review.
For an initiative to succeed, participation from multiple disciplines is necessary to formulate local guidelines and design and implement interventions. Such an interdisciplinary approach requires advocates to closely monitor and evaluate the program; sustainability will be greatly facilitated by the active engagement of key stakeholders, including the hospital’s executive administration, supply chain, pharmacists, and gastroenterologists. Lack of adequate buy-in from key stakeholders is a barrier to the success of any intervention. Accordingly, before selecting a particular intervention, it is important to understand the local factors driving PPI overuse.
1. Develop evidence-based institutional guidelines for both SUP and nonvariceal upper GIB through an interdisciplinary workgroup.
- Establish an interdisciplinary group including, but not limited to, pharmacists, hospitalists, gastroenterologists, and intensivists so that changes in practice will be widely adopted as institutional policy.
- Incorporate the best evidence and clearly convey appropriate and inappropriate uses.
2. Integrate changes to the EHR.
- If possible, the EHR should be leveraged to implement changes in PPI ordering practices.
- While integrating changes to the EHR, it is important to consider informatics and implementation science, since the utility of hard stops and best practice alerts has been questioned in the setting of operational inefficiencies and alert fatigue.
- Options for integrating changes to the EHR include the following:
- Create an ordering pathway that provides clinical decision support for PPI use.
- Incorporate a best practice alert in the EHR to notify clinicians of institutional guidelines when they initiate an order for a PPI outside of the pathway.
- Consider restricting the authority to order IV PPIs by requiring a code or password or implement another means of using the EHR to limit the supply of PPI.
- Limit the duration of IV PPI by requiring daily renewal of IV PPI dosing or by altering the period of time that use of IV PPI is permitted (eg, 48 to 72 hours).
- PPIs should be removed from any current order sets that include medications for SUP.
3. Foster pharmacy-driven interventions.
- Consider requiring pharmacist approval for IV PPIs.
- Pharmacist-led review and feedback to clinicians for discontinuation of inappropriate PPIs can be effective in decreasing inappropriate utilization.
4. Provide education, audit data, and obtain feedback.
- Data auditing is needed to measure the efficacy of interventions. Outcome measures may include the number of non-ICU and ICU patients who are started on a PPI during an admission; the audit should be continued through discharge. A process measure may be the number of pharmacist calls for inappropriate PPIs. A balancing measure would be ulcer-specific upper GIB in patients who do not receive SUP during their admission. (Upper GIB from other etiologies, such as varices, portal hypertensive gastropathy, and Mallory-Weiss tears, would not be affected by PPI SUP.)
- Run or control charts should be utilized, and data should be shared with project champions and ordering clinicians—in real time if possible.
- Project champions should provide feedback to colleagues; they should also work with hospital leadership to develop new strategies to improve adherence.
- Provide ongoing education about appropriate indications for PPIs and potential adverse effects associated with their use. Whenever possible, point-of-care or just-in-time teaching is the preferred format.
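To make the audit measures above concrete, the sketch below normalizes inappropriate-prescription counts to a rate per 100 patient-days (the denominator used by Masood et al) so that months with different census sizes can be compared on one run chart. The monthly counts here are entirely hypothetical:

```python
# Hypothetical monthly audit data:
# month -> (inappropriate PPI prescriptions, total patient-days).
monthly_counts = {
    "Jan": (52, 1900),
    "Feb": (41, 1750),
    "Mar": (23, 1820),
}

def rate_per_100_patient_days(prescriptions: int, patient_days: int) -> float:
    """Normalize a count to a rate per 100 patient-days of care."""
    return 100 * prescriptions / patient_days

# Monthly rates, ready to plot as points on a run or control chart.
run_chart = {
    month: round(rate_per_100_patient_days(rx, days), 2)
    for month, (rx, days) in monthly_counts.items()
}
print(run_chart)
```

Plotting these monthly rates with a median line, and annotating intervention start dates, is the run-chart approach that lets project champions share progress with ordering clinicians in near real time.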
CONCLUSION
Excessive use of PPIs during hospitalization is prevalent; however, quality improvement interventions can be effective in achieving sustainable reductions in overuse. There is a need for the American College of Gastroenterology to revisit and update its guidelines for the management of patients with ulcer bleeding to include stronger evidence-based recommendations on the proper use of PPIs.27 These updated guidelines could in turn be used to refine the implementation blueprint.
Quality improvement teams have an opportunity to use the principles of value-based healthcare to reduce inappropriate PPI use. By following the blueprint outlined in this article, institutions can safely and effectively tailor the use of PPIs to suitable patients in the appropriate settings. Reduction of PPI overuse can be employed as an institutional catalyst to promote implementation of further value-based measures to improve efficiency and quality of patient care.
1. Savarino V, Marabotto E, Zentilin P, et al. Proton pump inhibitors: use and misuse in the clinical setting. Expert Rev Clin Pharmacol. 2018;11(11):1123-1134. https://doi.org/10.1080/17512433.2018.1531703
2. Nardino RJ, Vender RJ, Herbert PN. Overuse of acid-suppressive therapy in hospitalized patients. Am J Gastroenterol. 2000;95(11):3118-3122. https://doi.org/10.1111/j.1572-0241.2000.03259.x
3. Ahrens D, Behrens G, Himmel W, Kochen MM, Chenot JF. Appropriateness of proton pump inhibitor recommendations at hospital discharge and continuation in primary care. Int J Clin Pract. 2012;66(8):767-773. https://doi.org/10.1111/j.1742-1241.2012.02973.x
4. Moledina DG, Perazella MA. PPIs and kidney disease: from AIN to CKD. J Nephrol. 2016;29(5):611-616. https://doi.org/10.1007/s40620-016-0309-2
5. Kwok CS, Arthur AK, Anibueze CI, Singh S, Cavallazzi R, Loke YK. Risk of Clostridium difficile infection with acid suppressing drugs and antibiotics: meta-analysis. Am J Gastroenterol. 2012;107(7):1011-1019. https://doi.org/10.1038/ajg.2012.108
6. Cheungpasitporn W, Thongprayoon C, Kittanamongkolchai W, et al. Proton pump inhibitors linked to hypomagnesemia: a systematic review and meta-analysis of observational studies. Ren Fail. 2015;37(7):1237-1241. https://doi.org/10.3109/0886022x.2015.1057800
7. Yang YX, Lewis JD, Epstein S, Metz DC. Long-term proton pump inhibitor therapy and risk of hip fracture. JAMA. 2006;296(24):2947-2953. https://doi.org/10.1001/jama.296.24.2947
8. Coursol CJ, Sanzari SE. Impact of stress ulcer prophylaxis algorithm study. Ann Pharmacother. 2005;39(5):810-816. https://doi.org/10.1345/aph.1d129
9. van Vliet EPM, Steyerberg EW, Otten HJ, et al. The effects of guideline implementation for proton pump inhibitor prescription on two pulmonary medicine wards. Aliment Pharmacol Ther. 2009;29(2):213-221. https://doi.org/10.1111/j.1365-2036.2008.03875.x
10. Michal J, Henry T, Street C. Impact of a pharmacist-driven protocol to decrease proton pump inhibitor use in non-intensive care hospitalized adults. Am J Health Syst Pharm. 2016;73(17 Suppl 4):S126-S132. https://doi.org/10.2146/ajhp150519
11. Herzig SJ, Guess JR, Feinbloom DB, et al. Improving appropriateness of acid-suppressive medication use via computerized clinical decision support. J Hosp Med. 2015;10(1):41-45. https://doi.org/10.1002/jhm.2260
12. Agee C, Coulter L, Hudson J. Effects of pharmacy resident led education on resident physician prescribing habits associated with stress ulcer prophylaxis in non-intensive care unit patients. Am J Health Syst Pharm. 2015;72(11 Suppl 1):S48-S52. https://doi.org/10.2146/sp150013
13. Chui D, Young F, Tejani AM, Dillon EC. Impact of academic detailing on proton pump inhibitor prescribing behaviour in a community hospital. Can Pharm J (Ott). 2011;144(2):66-71. https://doi.org/10.3821/1913-701X-144.2.66
14. Hamzat H, Sun H, Ford JC, Macleod J, Soiza RL, Mangoni AA. Inappropriate prescribing of proton pump inhibitors in older patients: effects of an educational strategy. Drugs Aging. 2012;29(8):681-690. https://doi.org/10.1007/bf03262283
15. Liberman JD, Whelan CT. Brief report: Reducing inappropriate usage of stress ulcer prophylaxis among internal medicine residents. A practice-based educational intervention. J Gen Intern Med. 2006;21(5):498-500. https://doi.org/10.1111/j.1525-1497.2006.00435.x
16. Belfield KD, Kuyumjian AG, Teran R, Amadi M, Blatt M, Bicking K. Impact of a collaborative strategy to reduce the inappropriate use of acid suppressive therapy in non-intensive care unit patients. Ann Pharmacother. 2017;51(7):577-583. https://doi.org/10.1177/1060028017698797
17. Del Giorno R, Ceschi A, Pironi M, Zasa A, Greco A, Gabutti L. Multifaceted intervention to curb in-hospital over-prescription of proton pump inhibitors: a longitudinal multicenter quasi-experimental before-and-after study. Eur J Intern Med. 2018;50:52-59. https://doi.org/10.1016/j.ejim.2017.11.002
18. Gupta R, Marshall J, Munoz JC, Kottoor R, Jamal MM, Vega KJ. Decreased acid suppression therapy overuse after education and medication reconciliation. Int J Clin Pract. 2013;67(1):60-65. https://doi.org/10.1111/ijcp.12046
19. Hatch JB, Schulz L, Fish JT. Stress ulcer prophylaxis: reducing non-indicated prescribing after hospital discharge. Ann Pharmacother. 2010;44(10):1565-1571. https://doi.org/10.1345/aph.1p167
20. Khalili H, Dashti-Khavidaki S, Hossein Talasaz AH, Tabeefar H, Hendoiee N. Descriptive analysis of a clinical pharmacy intervention to improve the appropriate use of stress ulcer prophylaxis in a hospital infectious disease ward. J Manag Care Pharm. 2010;16(2):114-121. https://doi.org/10.18553/jmcp.2010.16.2.114
21. Masood U, Sharma A, Bhatti Z, et al. A successful pharmacist-based quality initiative to reduce inappropriate stress ulcer prophylaxis use in an academic medical intensive care unit. Inquiry. 2018;55:46958018759116. https://doi.org/10.1177/0046958018759116
22. McDonald EG, Jones J, Green L, Jayaraman D, Lee TC. Reduction of inappropriate exit prescriptions for proton pump inhibitors: a before-after study using education paired with a web-based quality-improvement tool. J Hosp Med. 2015;10(5):281-286. https://doi.org/10.1002/jhm.2330
23. Tasaka CL, Burg C, VanOsdol SJ, et al. An interprofessional approach to reducing the overutilization of stress ulcer prophylaxis in adult medical and surgical intensive care units. Ann Pharmacother. 2014;48(4):462-469. https://doi.org/10.1177/1060028013517088
24. Zink DA, Pohlman M, Barnes M, Cannon ME. Long-term use of acid suppression started inappropriately during hospitalization. Aliment Pharmacol Ther. 2005;21(10):1203-1209. https://doi.org/10.1111/j.1365-2036.2005.02454.x
25. Pham CQ, Regal RE, Bostwick TR, Knauf KS. Acid suppressive therapy use on an inpatient internal medicine service. Ann Pharmacother. 2006;40(7-8):1261-1266. https://doi.org/10.1345/aph.1g703
26. Schoenfeld AJ, Grady D. Adverse effects associated with proton pump inhibitors [editorial]. JAMA Intern Med. 2016;176(2):172-174. https://doi.org/10.1001/jamainternmed.2015.7927
27. Laine L, Jensen DM. Management of patients with ulcer bleeding. Am J Gastroenterol. 2012;107(3):345-360; quiz 361. https://doi.org/10.1038/ajg.2011.480
REDUCING PPI USE AND SAFETY OUTCOMES
Proton pump inhibitors are often initiated in the hospital setting, with up to half of these new prescriptions continued at discharge.2,24,25 Inappropriate prescriptions for PPIs expose patients to excess risk of long-term adverse events.26 De-escalating PPIs, however, raises concern among clinicians and patients for potential recurrence of dyspepsia and GIB. There is limited evidence regarding long-term safety outcomes (including GIB) following the discontinuation of PPIs deemed to have been inappropriately initiated in the hospital. In view of this, clinicians should educate and monitor individual patients for symptom relapse to ensure timely and appropriate resumption of AST.
LIMITATIONS
Our literature search for this narrative review and implementation guide has limitations. First, the time frame we included (2000-2018) may have excluded relevant articles published before our starting year. We did not include articles published before 2000 based on concerns these might contain outdated information. Also, there may have been incomplete retrieval of relevant studies/articles due to the labor-intensive nature involved in determining whether PPI prescriptions are appropriate or inappropriate.
We noted that interventional studies aimed at reducing overuse of PPIs were often limited by a low number of participants; these studies were also more likely to be single-center interventions, which limits generalizability. In addition, the studies often had low methodological rigor and lacked randomization or controls. Moreover, to fully evaluate the sustainability of interventions, some of the studies had a limited postimplementation period. For multifaceted interventions, the efficacy of individual components of the interventions was not clearly evaluated. Moreover, there was a high risk of bias in many of the included studies. Some of the larger studies used overall AST prescriptions as a surrogate for more appropriate use. It would be advantageous for a site to perform a pilot study that provides well-defined parameters for appropriate prescribing, and then correlate with the total number of prescriptions (automated and much easier) thereafter. Further, although the evidence regarding appropriate PPI use for SUP and GIB has shifted rapidly in recent years, society guidelines have not been updated to reflect this change. As such, quality improvement interventions have predominantly focused on reducing PPI use for the indications reflected by these guidelines.
IMPLEMENTATION BLUEPRINT
The following are our recommendations for successfully implementing an evidence-based, institution-wide initiative to promote the appropriate use of PPIs during hospitalization. These recommendations are informed by the evidence review and reflect the consensus of the combined committees coauthoring this review.
For an initiative to succeed, participation from multiple disciplines is necessary to formulate local guidelines and design and implement interventions. Such an interdisciplinary approach requires advocates to closely monitor and evaluate the program; sustainability will be greatly facilitated by the active engagement of key stakeholders, including the hospital’s executive administration, supply chain, pharmacists, and gastroenterologists. Lack of adequate buy-in on the part of key stakeholders is a barrier to the success of any intervention. Accordingly, before selecting a particular intervention, it is important to understand local factors driving the overuse of PPI.
1. Develop evidence-based institutional guidelines for both SUP and nonvariceal upper GIB through an interdisciplinary workgroup.
- Establish an interdisciplinary group including, but not limited to, pharmacists, hospitalists, gastroenterologists, and intensivists so that changes in practice will be widely adopted as institutional policy.
- Incorporate the best evidence and clearly convey appropriate and inappropriate uses.
2. Integrate changes to the EHR.
- If possible, the EHR should be leveraged to implement changes in PPI ordering practices.
- While integrating changes to the EHR, it is important to consider informatics and implementation science, since the utility of hard stops and best practice alerts has been questioned in the setting of operational inefficiencies and alert fatigue.
- Options for integrating changes to the EHR include the following:
- Create an ordering pathway that provides clinical decision support for PPI use.
- Incorporate a best practice alert in the EMR to notify clinicians of institutional guidelines when they initiate an order for PPI outside of the pathway.
- Consider restricting the authority to order IV PPIs by requiring a code or password or implement another means of using the EHR to limit the supply of PPI.
- Limit the duration of IV PPI by requiring daily renewal of IV PPI dosing or by altering the period of time that use of IV PPI is permitted (eg, 48 to 72 hours).
- PPIs should be removed from any current order sets that include medications for SUP.
3. Foster pharmacy-driven interventions.
- Consider requiring pharmacist approval for IV PPIs.
- Pharmacist-led review and feedback to clinicians for discontinuation of inappropriate PPIs can be effective in decreasing inappropriate utilization.
4. Provide education, audit data, and obtain feedback.
- Data auditing is needed to measure the efficacy of interventions. Outcome measures may include the number of non-ICU and ICU patients who are started on a PPI during an admission; the audit should be continued through discharge. A process measure may be the number of pharmacist calls for inappropriate PPIs. A balancing measure would be ulcer-specific upper GIB in patients who do not receive SUP during their admission. (Upper GIB from other etiologies, such as varices, portal hypertensive gastropathy, and Mallory-Weiss tear would not be affected by PPI SUP.)
- Run or control charts should be utilized, and data should be shared with project champions and ordering clinicians—in real time if possible.
- Project champions should provide feedback to colleagues; they should also work with hospital leadership to develop new strategies to improve adherence.
- Provide ongoing education about appropriate indications for PPIs and potential adverse effects associated with their use. Whenever possible, point-of-care or just-in-time teaching is the preferred format.
CONCLUSION
Excessive use of PPIs during hospitalization is prevalent; however, quality improvement interventions can be effective in achieving sustainable reductions in overuse. There is a need for the American College of Gastroenterology to revisit and update their guidelines for management of patients with ulcer bleeding to include stronger evidence-based recommendations on the proper use of PPIs.27 These updated guidelines could be used to update the implementation blueprint.
Quality improvement teams have an opportunity to use the principles of value-based healthcare to reduce inappropriate PPI use. By following the blueprint outlined in this article, institutions can safely and effectively tailor the use of PPIs to suitable patients in the appropriate settings. Reduction of PPI overuse can be employed as an institutional catalyst to promote implementation of further value-based measures to improve efficiency and quality of patient care.
Proton pump inhibitors (PPIs) are among the most commonly used drugs worldwide to treat dyspepsia and prevent gastrointestinal bleeding (GIB).1 Between 40% and 70% of hospitalized patients receive acid-suppressive therapy (AST; defined as PPIs or histamine-receptor antagonists), and nearly half of these prescriptions are initiated during the inpatient stay.2,3 Up to 50% of inpatients started on a new AST are discharged on these medications,2 yet a majority of these prescriptions lack an evidence-based indication.2,3
Growing evidence shows that PPIs are overutilized and may be associated with wide-ranging adverse events, such as acute and chronic kidney disease,4 Clostridium difficile infection,5 hypomagnesemia,6 and fractures.7 Because of the widespread overuse and the potential harm associated with PPIs, a concerted effort to promote their appropriate use in the inpatient setting is necessary. It is important to note that reducing the use of PPIs does not increase the risks of GIB or worsening dyspepsia. Rather, reducing overuse of PPIs lowers the risk of harm to patients. Efforts to reduce overuse, however, are complex and difficult.
This article summarizes evidence regarding interventions to reduce overuse and offers an implementation guide based on this evidence. This guide promotes value-based quality improvement and provides a blueprint for implementing an institution-wide program to reduce PPI overuse in the inpatient setting. We begin with a discussion about quality initiatives to reduce PPI overuse, followed by a review of the safety outcomes associated with reduced use of PPIs.
METHODS
A focused search of the US National Library of Medicine’s PubMed database was performed to identify English-language articles published between 2000 and 2018 that addressed strategies to reduce PPI overuse for stress ulcer prophylaxis (SUP) and nonvariceal GIB. The following search terms were used: PPI and inappropriate use; acid-suppressive therapy and inappropriate use; PPI and discontinuation; acid-suppressive (or suppressant) therapy and discontinuation; SUP and cost; and histamine receptor antagonist and PPI. Inpatient or outpatient studies of patients aged 18 years or older were considered for inclusion in this narrative review, and all study types were included. The primary exclusion criterion was patients aged younger than 18 years. A manual review of the full text of the retrieved articles was performed and references were reviewed for missed citations.
RESULTS
We identified a total of 1,497 unique citations through our initial search. After performing a manual review, we excluded 1,483 of the references and added an additional 2, resulting in 16 articles selected for inclusion. The selected articles addressed interventions falling into three main groupings: implementation of institutional guidelines with or without electronic health record (EHR)–based decision support, educational interventions alone, and multifaceted interventions. Each of these interventions is discussed in the sections that follow. Table 1, Table 2, and Table 3 summarize the results of the studies included in our narrative review.
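The screening arithmetic above can be verified directly; a minimal sketch (the variable names are illustrative):

```python
# Check of the citation-screening arithmetic reported above.
initial_citations = 1497        # unique records from the initial PubMed search
excluded_on_review = 1483       # removed during manual review
added_from_references = 2       # found by hand-searching reference lists

included = initial_citations - excluded_on_review + added_from_references
print(included)  # 16 articles selected for inclusion
```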
QUALITY INITIATIVES TO REDUCE PPI OVERUSE
Institutional Guidelines With or Without EHR-Based Decision Support
Table 1 summarizes institutional guidelines, with or without EHR-based decision support, to reduce inappropriate PPI use. The implementation of institutional guidelines for the appropriate reduction of PPI use has had some success. Coursol and Sanzari evaluated the impact of a treatment algorithm on the appropriateness of prescriptions for SUP in the intensive care unit (ICU).8 Risk factors of patients in this study included mechanical ventilation for 48 hours, coagulopathy for 24 hours, postoperative transplant, severe burns, active gastrointestinal (GI) disease, multiple trauma, multiple organ failure, and septicemia. The three treatment options chosen for the algorithm were intravenous (IV) famotidine (if the oral route was unavailable or impractical), omeprazole tablets (if oral access was available), and omeprazole suspension (in cases of dysphagia and presence of nasogastric or orogastric tube). After implementation of the treatment algorithm, the proportion of inappropriate prophylaxis decreased from 95.7% to 88.2% (P = .033), and the cost per patient decreased from $11.11 to $8.49 Canadian dollars (P = .003).
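The route-based choice among the three agents in this algorithm can be sketched as a simple selection function; this is an illustrative reconstruction, not the study's actual implementation, and the function name and parameters are assumptions:

```python
def select_sup_agent(oral_access, dysphagia_or_feeding_tube):
    """Illustrative sketch of the route-based agent selection in the
    Coursol and Sanzari algorithm (names and structure are assumptions)."""
    if dysphagia_or_feeding_tube:
        return "omeprazole suspension"  # dysphagia or nasogastric/orogastric tube
    if not oral_access:
        return "IV famotidine"          # oral route unavailable or impractical
    return "omeprazole tablets"         # oral access available

print(select_sup_agent(oral_access=True, dysphagia_or_feeding_tube=False))
# omeprazole tablets
```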
Van Vliet et al implemented a clinical practice guideline listing specific criteria for prescribing a PPI.9 Their criteria included the presence of gastric or duodenal ulcer and use of a nonsteroidal anti-inflammatory drug (NSAID) or aspirin, plus at least one additional risk factor (eg, history of gastroduodenal hemorrhage or age >70 years). The proportion of patients started on PPIs during hospitalization decreased from 21% to 13% (odds ratio, 0.56; 95% CI, 0.33-0.97).
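The prescribing criteria above reduce to a simple boolean rule; the following is an illustrative sketch (function and parameter names are assumptions, and the rule is simplified to the criteria quoted above):

```python
def ppi_indicated(has_gastroduodenal_ulcer, on_nsaid_or_aspirin, n_risk_factors):
    """Sketch of the van Vliet et al criteria: a gastric or duodenal ulcer,
    or NSAID/aspirin use plus at least one additional risk factor
    (eg, prior gastroduodenal hemorrhage or age >70 years)."""
    return has_gastroduodenal_ulcer or (on_nsaid_or_aspirin and n_risk_factors >= 1)

print(ppi_indicated(False, True, 0))  # False: NSAID use alone does not qualify
```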
Michal et al utilized an institutional pharmacist-driven protocol that stipulated criteria for appropriate PPI use (eg, upper GIB, mechanical ventilation, peptic ulcer disease, gastroesophageal reflux disease, coagulopathy).10 Pharmacists in the study evaluated patients for PPI appropriateness and recommended changes in medication or discontinuation of use. This institutional intervention decreased PPI use in non-ICU hospitalized adults. Discontinuation of PPIs increased from 41% of patients in the preintervention group to 66% of patients in the postintervention group (P = .001).
In addition to implementing guidelines and intervention strategies, institutions have also adopted changes to the EHR to reduce inappropriate PPI use. Herzig et al utilized a computerized clinical decision support intervention to decrease SUP in non-ICU hospitalized patients.11 Of the available response options for acid-suppressive medication, when SUP was chosen as the only indication for PPI use a prompt alerted the clinician that “[SUP] is not recommended for patients outside the [ICU]”; the alert resulted in a significant reduction in AST for the sole purpose of SUP. With this intervention, the percentage of patients who had any inappropriate acid-suppressive exposure decreased from 4.0% to 0.6% (P < .001).
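Alert logic of this kind, firing only when SUP is the sole selected indication for a non-ICU patient, can be sketched as follows; the function, its arguments, and the indication strings are illustrative assumptions, not the actual Herzig et al implementation:

```python
ALERT_TEXT = ("Stress ulcer prophylaxis is not recommended "
              "for patients outside the intensive care unit.")

def sup_alert(selected_indications, in_icu):
    """Fire the alert only when stress ulcer prophylaxis is the sole
    selected indication for a patient outside the ICU."""
    if selected_indications == ["stress ulcer prophylaxis"] and not in_icu:
        return ALERT_TEXT
    return None  # no alert: other indications selected, or patient is in the ICU

print(sup_alert(["stress ulcer prophylaxis"], in_icu=False))
```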
EDUCATION
Table 2 summarizes educational interventions to reduce inappropriate PPI use.
Agee et al employed a pharmacist-led educational seminar that described SUP indications, risks, and costs.12 Inappropriate SUP prescriptions decreased from 55.5% to 30.5% after the intervention (P < .0001). However, there was no reduction in the percentage of patients discharged on inappropriate AST.
Chui et al performed an intervention using academic detailing, wherein physicians received one-on-one educational visits aimed at improving prescribing behavior.13 In this study, academic detailing focused on the most common scenarios in which PPIs were inappropriately utilized at that hospital (eg, surgical prophylaxis, anemia). Inappropriate use of double-dose PPIs was also targeted. Despite these efforts, no significant difference in inappropriate PPI prescribing was observed post intervention.
Hamzat et al implemented an educational strategy to reduce inappropriate PPI prescribing during hospital stays, which included dissemination of fliers, posters, emails, and presentations over a 4-week period.14 Educational efforts targeted clinical pharmacists, nurses, physicians, and patients. Appropriate indications for PPI use in this study included peptic ulcer disease (current or previous), H pylori infection, and treatment or prevention of an NSAID-induced ulcer. The primary outcome was a reduction in PPI dose or discontinuation of PPI during the hospital admission, which increased from 9% in the preintervention (pre-education) phase to 43% during the intervention (education) phase and to 46% in the postintervention (posteducation) phase (P = .006).
Liberman and Whelan also implemented an educational intervention among internal medicine residents to reduce inappropriate use of SUP; this intervention was based on practice-based learning and improvement methodology.15 They noted that the rate of inappropriate prophylaxis with AST decreased from 59% preintervention to 33% post intervention (P < .007).
MULTIFACETED APPROACHES
Table 3 summarizes several multifaceted approaches aimed at reducing inappropriate PPI use. Belfield et al utilized an intervention consisting of an institutional guideline review, education, and monitoring of AST by clinical pharmacists to reduce inappropriate use of PPI for SUP.16 With this intervention, the primary outcome of total inappropriate days of AST during hospitalization decreased from 279 to 116 (48% relative reduction in risk, P < .01, across 142 patients studied). Furthermore, inappropriate AST prescriptions at discharge decreased from 32% to 8% (P = .006). The one case of GIB noted in this study occurred in the control group.
Del Giorno et al combined audit and feedback with education to reduce new PPI prescriptions at the time of discharge from the hospital.17 The educational component of this intervention included guidance regarding potentially inappropriate PPI use and associated side effects and targeted multiple departments in the hospital. This intervention led to a sustained reduction in new PPI prescriptions at discharge during the 3-year study period. The annual rate of new PPI prescriptions was 19%, 19%, 18%, and 16% in years 2014, 2015, 2016, and 2017, respectively, in the internal medicine department (postintervention group), compared with rates of 30%, 29%, 36%, and 36% (P < .001) for the same years in the surgery department (control group).
Education and the use of medication reconciliation forms on admission and discharge were utilized by Gupta et al to reduce inappropriate AST in hospitalized patients from 51% prior to intervention to 22% post intervention (P < .001).18 Furthermore, the proportion of patients discharged on inappropriate AST decreased from 69% to 20% (P < .001).
Hatch et al also used educational resources and pharmacist-led medication reconciliation to reduce use of SUP.19 Before the intervention, 24.4% of patients were continued on SUP after hospital discharge in the absence of a clear indication for use; post intervention, 11% of patients were continued on SUP after hospital discharge (of these patients, 8.7% had no clear indication for use). This represented a 64.4% decrease in inappropriately prescribed SUP after discharge (P < .0001).
Khalili et al combined an educational intervention with an institutional guideline in an infectious disease ward to reduce inappropriate use of SUP.20 This intervention reduced the inappropriate use of AST from 80.9% before the intervention to 47.1% post intervention (P < .001).
Masood et al implemented two interventions wherein pharmacists reviewed SUP indications for each patient during daily team rounds, and ICU residents and fellows received education about indications for SUP and the implemented initiative on a bimonthly basis.21 Inappropriate AST decreased from 26.75 to 7.14 prescriptions per 100 patient-days of care (P < .001).
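The rate metric used by Masood et al, prescriptions per 100 patient-days of care, is computed as follows; the counts in the usage line are hypothetical, chosen only to illustrate the arithmetic (the underlying numerator and denominator are not reported above):

```python
def rate_per_100_patient_days(prescriptions, patient_days):
    """Inappropriate AST prescriptions per 100 patient-days of care."""
    return 100 * prescriptions / patient_days

# Hypothetical counts chosen only to illustrate the arithmetic:
print(rate_per_100_patient_days(214, 800))  # 26.75
```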
McDonald et al combined education with a web-based quality improvement tool to reduce inappropriate exit prescriptions for PPIs.22 The proportion of PPIs discontinued at hospital discharge increased from 7.7% per month to 18.5% per month (P = .03).
Finally, the initiative implemented by Tasaka et al to reduce overutilization of SUP included an institutional guideline, a pharmacist-led intervention, and an institutional education and awareness campaign.23 Their initiative led to a reduction in inappropriate SUP both at the time of transfer out of the ICU (8% before intervention, 4% post intervention, P = .54) and at the time of discharge from the hospital (7% before intervention, 0% post intervention, P = .22), although neither change reached statistical significance.
REDUCING PPI USE AND SAFETY OUTCOMES
Proton pump inhibitors are often initiated in the hospital setting, with up to half of these new prescriptions continued at discharge.2,24,25 Inappropriate prescriptions for PPIs expose patients to excess risk of long-term adverse events.26 De-escalating PPIs, however, raises concern among clinicians and patients about potential recurrence of dyspepsia and GIB. There is limited evidence regarding long-term safety outcomes (including GIB) following the discontinuation of PPIs deemed to have been inappropriately initiated in the hospital. In view of this, clinicians should educate patients about, and monitor them for, symptom relapse to ensure timely and appropriate resumption of AST.
LIMITATIONS
Our literature search for this narrative review and implementation guide has limitations. First, the time frame we included (2000-2018) may have excluded relevant articles published before our starting year; we omitted articles published before 2000 out of concern that they might contain outdated information. In addition, retrieval of relevant studies may have been incomplete because determining whether a PPI prescription is appropriate or inappropriate is labor-intensive.
We noted that interventional studies aimed at reducing overuse of PPIs were often limited by small numbers of participants; these studies were also more likely to be single-center interventions, which limits generalizability. In addition, the studies often had low methodological rigor, lacked randomization or controls, and carried a high risk of bias. Some had postimplementation periods too short to fully evaluate the sustainability of the interventions, and for multifaceted interventions the efficacy of the individual components was not clearly evaluated. Some of the larger studies used overall AST prescriptions as a surrogate for appropriate use; it would be advantageous for a site to first perform a pilot study with well-defined parameters for appropriate prescribing and then correlate those findings with the total number of prescriptions, which can be counted automatically and far more easily. Further, although the evidence regarding appropriate PPI use for SUP and GIB has shifted rapidly in recent years, society guidelines have not been updated to reflect this change. As such, quality improvement interventions have predominantly focused on reducing PPI use for the indications reflected in these guidelines.
IMPLEMENTATION BLUEPRINT
The following are our recommendations for successfully implementing an evidence-based, institution-wide initiative to promote the appropriate use of PPIs during hospitalization. These recommendations are informed by the evidence review and reflect the consensus of the combined committees coauthoring this review.
For an initiative to succeed, participation from multiple disciplines is necessary to formulate local guidelines and design and implement interventions. Such an interdisciplinary approach requires advocates to closely monitor and evaluate the program; sustainability will be greatly facilitated by the active engagement of key stakeholders, including the hospital’s executive administration, supply chain, pharmacists, and gastroenterologists. Lack of adequate buy-in on the part of key stakeholders is a barrier to the success of any intervention. Accordingly, before selecting a particular intervention, it is important to understand local factors driving the overuse of PPI.
1. Develop evidence-based institutional guidelines for both SUP and nonvariceal upper GIB through an interdisciplinary workgroup.
- Establish an interdisciplinary group including, but not limited to, pharmacists, hospitalists, gastroenterologists, and intensivists so that changes in practice will be widely adopted as institutional policy.
- Incorporate the best evidence and clearly convey appropriate and inappropriate uses.
2. Integrate changes to the EHR.
- If possible, the EHR should be leveraged to implement changes in PPI ordering practices.
- While integrating changes to the EHR, it is important to consider informatics and implementation science, since the utility of hard stops and best practice alerts has been questioned in the setting of operational inefficiencies and alert fatigue.
- Options for integrating changes to the EHR include the following:
- Create an ordering pathway that provides clinical decision support for PPI use.
- Incorporate a best practice alert in the EHR to notify clinicians of institutional guidelines when they initiate an order for a PPI outside of the pathway.
- Consider restricting the authority to order IV PPIs by requiring a code or password or implement another means of using the EHR to limit the supply of PPI.
- Limit the duration of IV PPI by requiring daily renewal of IV PPI dosing or by altering the period of time that use of IV PPI is permitted (eg, 48 to 72 hours).
- Remove PPIs from any current order sets that include medications for SUP.
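As one illustration of the duration-limit option above, an EHR rule could flag IV PPI orders that outlive the permitted window so that continued dosing requires active renewal; this is a hedged sketch, with the 72-hour limit and all names chosen for illustration:

```python
from datetime import datetime, timedelta

IV_PPI_LIMIT = timedelta(hours=72)  # assumed limit from the 48- to 72-hour range above

def needs_renewal(order_start, now):
    """Flag an IV PPI order once it has run past the permitted window,
    so that continued dosing requires an active renewal."""
    return now - order_start >= IV_PPI_LIMIT

start = datetime(2024, 1, 1, 8, 0)
print(needs_renewal(start, start + timedelta(hours=80)))  # True
```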
3. Foster pharmacy-driven interventions.
- Consider requiring pharmacist approval for IV PPIs.
- Pharmacist-led review and feedback to clinicians for discontinuation of inappropriate PPIs can be effective in decreasing inappropriate utilization.
4. Provide education, audit data, and obtain feedback.
- Data auditing is needed to measure the efficacy of interventions. Outcome measures may include the number of non-ICU and ICU patients who are started on a PPI during an admission; the audit should be continued through discharge. A process measure may be the number of pharmacist calls for inappropriate PPIs. A balancing measure would be ulcer-specific upper GIB in patients who do not receive SUP during their admission. (Upper GIB from other etiologies, such as varices, portal hypertensive gastropathy, and Mallory-Weiss tear, would not be affected by PPI SUP.)
- Run or control charts should be utilized, and data should be shared with project champions and ordering clinicians—in real time if possible.
- Project champions should provide feedback to colleagues; they should also work with hospital leadership to develop new strategies to improve adherence.
- Provide ongoing education about appropriate indications for PPIs and potential adverse effects associated with their use. Whenever possible, point-of-care or just-in-time teaching is the preferred format.
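For the run or control charts recommended above, a common choice for a monthly proportion (eg, the share of admissions started on a PPI) is a p-chart with 3-sigma limits; this sketch uses hypothetical numbers, not data from any study reviewed here:

```python
import math

def p_chart_limits(p_bar, n):
    """3-sigma control limits for a monthly proportion (p-chart)."""
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - 3 * sigma), min(1.0, p_bar + 3 * sigma)

# Hypothetical: on average 20% of 400 monthly admissions start a PPI.
low, high = p_chart_limits(0.20, 400)
print(round(low, 2), round(high, 2))  # 0.14 0.26
```

Monthly proportions falling outside these limits signal a real shift in prescribing rather than ordinary month-to-month variation, which is the feedback project champions would act on.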
CONCLUSION
Excessive use of PPIs during hospitalization is prevalent; however, quality improvement interventions can be effective in achieving sustainable reductions in overuse. There is a need for the American College of Gastroenterology to revisit and update its guidelines for the management of patients with ulcer bleeding to include stronger evidence-based recommendations on the proper use of PPIs.27 These updated guidelines could in turn be used to refine the implementation blueprint.
Quality improvement teams have an opportunity to use the principles of value-based healthcare to reduce inappropriate PPI use. By following the blueprint outlined in this article, institutions can safely and effectively tailor the use of PPIs to suitable patients in the appropriate settings. Reduction of PPI overuse can be employed as an institutional catalyst to promote implementation of further value-based measures to improve efficiency and quality of patient care.
1. Savarino V, Marabotto E, Zentilin P, et al. Proton pump inhibitors: use and misuse in the clinical setting. Exp Rev Clin Pharmacol. 2018;11(11):1123-1134. https://doi.org/10.1080/17512433.2018.1531703
2. Nardino RJ, Vender RJ, Herbert PN. Overuse of acid-suppressive therapy in hospitalized patients. Am J Gastroenterol. 2000;95(11):3118-3122. https://doi.org/10.1111/j.1572-0241.2000.03259.x
3. Ahrens D, Behrens G, Himmel W, Kochen MM, Chenot JF. Appropriateness of proton pump inhibitor recommendations at hospital discharge and continuation in primary care. Int J Clin Pract. 2012;66(8):767-773. https://doi.org/10.1111/j.1742-1241.2012.02973.x
4. Moledina DG, Perazella MA. PPIs and kidney disease: from AIN to CKD. J Nephrol. 2016;29(5):611-616. https://doi.org/10.1007/s40620-016-0309-2
5. Kwok CS, Arthur AK, Anibueze CI, Singh S, Cavallazzi R, Loke YK. Risk of Clostridium difficile infection with acid suppressing drugs and antibiotics: meta-analysis. Am J Gastroenterol. 2012;107(7):1011-1019. https://doi.org/10.1038/ajg.2012.108
6. Cheungpasitporn W, Thongprayoon C, Kittanamongkolchai W, et al. Proton pump inhibitors linked to hypomagnesemia: a systematic review and meta-analysis of observational studies. Ren Fail. 2015;37(7):1237-1241. https://doi.org/10.3109/0886022x.2015.1057800
© 2021 Society of Hospital Medicine
Things We Do For No Reason™: Rasburicase for Adult Patients With Tumor Lysis Syndrome
Inspired by the ABIM Foundation’s Choosing Wisely® campaign, the “Things We Do for No Reason™” (TWDFNR) series reviews practices that have become common parts of hospital care but may provide little value to our patients. Practices reviewed in the TWDFNR series do not represent clear-cut conclusions or clinical practice standards but are meant as a starting place for research and active discussions among hospitalists and patients. We invite you to be part of that discussion.
CLINICAL SCENARIO
A 35-year-old man with a history of diffuse large B-cell lymphoma (DLBCL), who most recently received treatment 12 months earlier, presents to the emergency department with abdominal pain and constipation. A computed tomography scan of the abdomen reveals retroperitoneal and mesenteric lymphadenopathy causing small bowel obstruction. Laboratory studies reveal a creatinine of 1.1 mg/dL, calcium of 8.5 mg/dL, phosphorus of 4 mg/dL, potassium of 4.5 mEq/L, and uric acid of 7.3 mg/dL. The admitting team contemplates using allopurinol or rasburicase for tumor lysis syndrome (TLS) prevention in the setting of recurrent DLBCL.
BACKGROUND
Tumor lysis syndrome is characterized by metabolic derangement and end-organ damage in the setting of cytotoxic chemotherapy, chemosensitive malignancy, and/or increased tumor burden.1 Risk stratification for TLS takes into account patient and disease characteristics (Table 1). Other risk factors include tumor bulk, elevated baseline serum lactate dehydrogenase, and certain types of chemotherapy (eg, cisplatin, cytarabine, etoposide, paclitaxel, cytotoxic therapies), immunotherapy, or targeted therapy.2 Elevated serum levels of uric acid, potassium, and phosphorus, as well as preexisting renal dysfunction, predispose patients to clinical TLS.3
The Cairo-Bishop classification system is most frequently used to diagnose TLS (Table 2).3 Laboratory features include hyperkalemia, hyperphosphatemia, hyperuricemia, and hypocalcemia secondary to lysis of proliferating tumor cells and their nuclei. Clinical features include arrhythmias, seizures, and acute kidney injury (AKI).1 Acute kidney injury, the most common clinical complication of TLS, results from crystallization of markedly elevated plasma uric acid, leading to tubular obstruction.1,4 The development of AKI can predict morbidity (namely, the need for renal replacement therapy [RRT]) and mortality in this patient population.1
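The laboratory arm of the Cairo-Bishop classification can be sketched as a simple rule check. The sketch below is illustrative only: the thresholds are the commonly quoted adult cutoffs attributed to Cairo and Bishop, the function name is our own, and the 25%-change-from-baseline alternative criteria and the timing window are omitted for brevity. Verify the values against Table 2 before relying on them.

```python
# Hedged sketch of the Cairo-Bishop *laboratory* TLS criteria (see Table 2).
# Thresholds are the commonly cited adult cutoffs; the 25%-change-from-baseline
# alternatives and the 3-days-before/7-days-after-therapy window are omitted.

def cairo_bishop_lab_tls(uric_acid_mg_dl, potassium_meq_l,
                         phosphorus_mg_dl, calcium_mg_dl):
    """Return (meets_criteria, abnormalities); laboratory TLS requires >=2."""
    abnormalities = []
    if uric_acid_mg_dl >= 8.0:
        abnormalities.append("hyperuricemia")
    if potassium_meq_l >= 6.0:
        abnormalities.append("hyperkalemia")
    if phosphorus_mg_dl >= 4.5:   # adult threshold; pediatric cutoff differs
        abnormalities.append("hyperphosphatemia")
    if calcium_mg_dl <= 7.0:
        abnormalities.append("hypocalcemia")
    return len(abnormalities) >= 2, abnormalities

# The patient in the clinical scenario (uric acid 7.3, K 4.5, phos 4, Ca 8.5)
# meets none of the four laboratory criteria:
meets, abn = cairo_bishop_lab_tls(7.3, 4.5, 4.0, 8.5)
print(meets, abn)  # prints: False []
```

Applied to the scenario's values, no laboratory criterion is met, which is consistent with the admitting team facing a prevention decision rather than established TLS.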
Stratifying a patient’s baseline risk of developing TLS often dictates the prevention and management plan. Prophylaxis and management strategies for TLS include aggressive fluid resuscitation, diuresis, monitoring of plasma uric acid (PUA) and electrolyte levels, and, in certain life-threatening situations, dialysis. Oncologists presume that reducing uric acid levels prevents and treats TLS.
Current methods to reduce PUA as a means of preventing or treating TLS include xanthine oxidase inhibitors (eg, allopurinol) or urate oxidase (eg, rasburicase). Before the US Food and Drug Administration’s (FDA) approval of rasburicase to manage TLS, providers combined allopurinol (a purine analog that inhibits the enzyme xanthine oxidase, decreasing uric acid level) with aggressive fluid resuscitation. Approved by the FDA in 2002, rasburicase offers an alternative treatment for hyperuricemia by directly decreasing levels of uric acid instead of merely preventing the increased formation of uric acid. As a urate oxidase, rasburicase converts uric acid to the non-nephrotoxic, water-soluble, and freely excreted allantoin.
WHY YOU MIGHT THINK YOU SHOULD USE URATE OXIDASE IN TUMOR LYSIS SYNDROME FOR THE PREVENTION AND MANAGEMENT OF ACUTE KIDNEY INJURY
Rasburicase is often considered the standard-of-care treatment for hyperuricemia due to its ability to reduce circulating uric acid levels rapidly. The primary goal of uric acid reduction is to prevent the occurrence of AKI.
Based on its biologically plausible link to clinically meaningful endpoints, researchers selected PUA reduction as the primary outcome in the randomized controlled trials (RCTs) and observational studies used to justify treatment with rasburicase. In RCTs, compassionate-use trials, and systematic reviews and meta-analyses, rasburicase demonstrated a more rapid reduction in uric acid levels than allopurinol.5 In one study by Goldman et al,6 rasburicase decreased baseline uric acid levels in pediatric oncology patients by 86% (statistically significant) at 4 hours after administration, compared with a 12% reduction with allopurinol. According to a study by Cairo et al, allopurinol may take up to 1 day to reduce PUA.3
WHY URATE OXIDASE MAY NOT IMPROVE CLINICAL OUTCOMES IN PATIENTS AT RISK FOR OR WITH TUMOR LYSIS SYNDROME
Randomized controlled trials examining the safety, efficacy, and cost-effectiveness of rasburicase in adult patients remain sparse. Both RCTs and systematic reviews and meta-analyses rely on PUA levels as a surrogate endpoint and fail to include clinically meaningful primary endpoints (eg, change in baseline creatinine or need for RRT), raising the question as to whether rasburicase improves patient-centered outcomes.5 Since previous studies in the oncology literature show low or modest correlations between PUA reduction and patient-oriented outcomes, we must question whether PUA reduction serves as a meaningful surrogate endpoint.
Treatment of Tumor Lysis Syndrome
Two meta-analyses focusing on the treatment of TLS by Dinnel et al5 and Lopez-Olivo et al8 each included only three unique RCTs (two of the three RCTs were referenced in both meta-analyses). Moreover, both studies included only one RCT comparing rasburicase directly to allopurinol (a 2010 RCT by Cortes et al9) while the other RCTs compared the impact of different rasburicase dosing regimens. Researchers powered the head-to-head RCT by Cortes et al9 to detect a difference in PUA levels across three different arms: rasburicase, rasburicase plus allopurinol, or allopurinol alone. All three treatment arms resulted in a statistically significant reduction in serum PUA levels (87%, 78%, 66%, respectively; P = .001) without a change in the secondary, underpowered clinical outcomes such as clinical TLS or reduced renal function (defined in this study as increased creatinine, renal failure/impairment, or acute renal failure).
More recently, retrospective analyses of patients with AKI secondary to TLS found no difference in creatinine improvement, renal recovery, or prevention of RRT between patients who received rasburicase and those who received allopurinol.10,11 While rasburicase achieves greater PUA reduction than allopurinol, the RCT and observational data discussed previously and described further in the following section show that this does not translate into clinically important risk reduction.
Prevention of Tumor Lysis Syndrome
There is little compelling evidence to support the use of rasburicase for preventing AKI secondary to TLS. Even among patients at high risk for TLS (the only group for whom rasburicase is currently recommended),5 rasburicase does not definitively prevent AKI. Data suggest that, despite lowering uric acid levels, rasburicase does not consistently prevent renal injury11 or decrease the total number of subsequent inpatient days.12 The only phase 3 trial that compared the efficacy of rasburicase to allopurinol for the prevention of TLS and included clinically meaningful endpoints (eg, renal failure) found that, while rasburicase reduced uric acid levels faster than allopurinol, it did not decrease rates of clinical TLS.9
The published literature offers limited efficacy data of rasburicase in preventing TLS in low-risk patients; however, the absence of benefit of rasburicase in preventing renal failure in high-risk patients warrants skepticism as to its potential efficacy in low-risk patients.8,10
Cost-Effectiveness and Other Ethical Considerations
Rasburicase is an expensive treatment. The estimated cost of the FDA-recommended dosing is around $37,500.13 Moreover, studies comparing the cost-effectiveness of rasburicase to allopurinol focus primarily on patients at high risk for TLS, which overestimates the cost-effectiveness of rasburicase in patients at low-to-intermediate risk for TLS.14,15 Unfortunately, some providers regularly prescribe rasburicase inappropriately to patients at low or intermediate risk for TLS. In observational studies of rasburicase across clinical settings, including inpatient wards and the emergency department, inappropriate use (eg, hyperuricemia without evidence of a high-risk TLS tumor, no prior trial of allopurinol, preserved renal function, no laboratory evaluation) ranged from 32% to 70%.14,15
Finally, while <1% of patients experience rasburicase-induced anaphylaxis, 20% to 50% of patients develop gastrointestinal symptoms and viral-syndrome-like symptoms.16 Meanwhile, major side effects of allopurinol that occur with 1% to 10% frequency include maculopapular rash, pruritus, gout, nausea, vomiting, and renal failure syndrome.17 Even if the costs of rasburicase and allopurinol were similar, the lack of improved efficacy and the side-effect profiles of the two medications should make us question whether to prescribe rasburicase preferentially over allopurinol.
WHEN MIGHT URATE OXIDASE BE HELPFUL IN TUMOR LYSIS SYNDROME
While some experts recommend rasburicase prophylaxis in patients at high risk for developing TLS, such recommendations rely on low-quality evidence.2 When prescribing rasburicase, the hospitalist must ensure correct dosing. The FDA approved rasburicase for weight-based dosing at 0.2 mg/kg, though current evidence favors a single, fixed dose of 3 mg.16,17 Compared to weight-based dosing, which has an estimated cost-effectiveness ratio ranging from $27,982.77 to $119,643.59 per quality-adjusted life-year, single dosing has equivalent efficacy at approximately 50% lower cost per dose.11,17,18
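The dosing comparison above amounts to simple arithmetic. In this hypothetical sketch, the patient weight and per-milligram price are made-up placeholders (the article quotes course costs, not unit prices); only the 0.2 mg/kg and 3 mg figures come from the text.

```python
# Hypothetical comparison of the FDA weight-based regimen (0.2 mg/kg per dose)
# with the single fixed 3-mg dose discussed in the text.
# PRICE_PER_MG is a placeholder, not a quoted acquisition cost.

PRICE_PER_MG = 250.0  # hypothetical USD per mg

def rasburicase_dose_mg(weight_kg: float, strategy: str = "fixed") -> float:
    """Per-dose amount in mg under either strategy."""
    if strategy == "weight_based":
        return 0.2 * weight_kg  # FDA-approved weight-based dosing
    return 3.0                  # single fixed dose

for strategy in ("weight_based", "fixed"):
    dose = rasburicase_dose_mg(80, strategy)  # 80 kg is an example weight
    print(f"{strategy}: {dose} mg, ~${dose * PRICE_PER_MG:,.0f}")
```

For an 80-kg patient, the weight-based dose is 16 mg per administration versus 3 mg fixed; realized savings depend on vial sizes and actual acquisition costs.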
WHAT YOU SHOULD DO INSTEAD
As a preventive treatment for TLS, clinicians should consider prescribing rasburicase, as a single fixed dose of 3 mg, only for high-risk patients.17 In the event of AKI secondary to TLS, clinicians should proceed with the mainstay of treatment: aggressive fluid resuscitation, with a goal urine output of at least 2 mL/kg/h.1 Fluid resuscitation should be used cautiously in patients with oliguric or anuric AKI, pulmonary hypertension, congestive heart failure, or hemodynamically significant valvular disease. Clinicians should provide continuous cardiac monitoring during the initial presentation to detect electrocardiographic changes from hyperkalemia and hypocalcemia, and they should consult nephrology, oncology, and critical care services early in the disease course to maximize coordination of care.
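The urine-output goal above is a per-weight rate, so hourly and daily targets follow directly. A minimal sketch, assuming a hypothetical 80-kg patient:

```python
# Convert the >=2 mL/kg/h urine-output goal into absolute targets.
def urine_output_goal_ml_per_h(weight_kg: float,
                               rate_ml_per_kg_h: float = 2.0) -> float:
    """Minimum target urine output in mL/h for a given weight."""
    return rate_ml_per_kg_h * weight_kg

hourly = urine_output_goal_ml_per_h(80)  # example 80-kg patient
print(hourly)        # 160.0 (mL/h)
print(hourly * 24)   # 3840.0 (mL over 24 h)
```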
RECOMMENDATIONS
Prevention
- Identify patients at high-risk of TLS (Table 1) and consider a single 3-mg dose of rasburicase.
- Manage low- and intermediate-risk patients with allopurinol and hydration.
Treatment
- Identify patients with TLS using the clinical and laboratory findings outlined in the Cairo-Bishop classification system (Table 2).
- Initiate aggressive fluid resuscitation and manage electrolyte abnormalities.
- If urate-lowering therapy is part of local hospital guidelines for TLS management, consider a single dose regimen of rasburicase utilizing shared decision-making.
CONCLUSION
Tumor lysis syndrome remains a metabolic emergency that requires rapid diagnosis and management to prevent morbidity and mortality. Current data show rasburicase rapidly decreases PUA compared to allopurinol. However, the current literature does not provide compelling evidence that rapidly lowering uric acid with rasburicase to prevent TLS or to treat AKI secondary to TLS improves patient-oriented outcomes.
Do you think this is a low-value practice? Is this truly a “Thing We Do for No Reason™”? Share what you do in your practice and join in the conversation online by retweeting it on Twitter (#TWDFNR) and liking it on Facebook. We invite you to propose ideas for other “Things We Do for No Reason™” topics by emailing TWDFNR@hospitalmedicine.org
1. Howard SC, Jones DP, Pui CH. The tumor lysis syndrome. N Engl J Med. 2011;364(19):1844-1854. https://doi.org/10.1056/nejmra0904569
2. Cairo MS, Coiffier B, Reiter A, Younes A; TLS Expert Panel. Recommendations for the evaluation of risk and prophylaxis of tumour lysis syndrome (TLS) in adults and children with malignant diseases: an expert TLS panel consensus. Br J Haematol. 2010;149(4):578-586. https://doi.org/10.1111/j.1365-2141.2010.08143.x
3. Cairo MS, Bishop M. Tumour lysis syndrome: new therapeutic strategies and classification. Br J Haematol. 2004;127(1):3-11. https://doi.org/10.1111/j.1365-2141.2004.05094.x
4. Durani U, Shah ND, Go RS. In-hospital outcomes of tumor lysis syndrome: a population-based study using the National Inpatient Sample. Oncologist. 2017;22(12):1506-1509. https://doi.org/10.1634/theoncologist.2017-0147
5. Dinnel J, Moore BL, Skiver BM, Bose P. Rasburicase in the management of tumor lysis: an evidence-based review of its place in therapy. Core Evid. 2015;10:23-38. https://doi.org/10.2147/ce.s54995
6. Goldman SC, Holcenberg JS, Finklestein JZ, et al. A randomized comparison between rasburicase and allopurinol in children with lymphoma or leukemia at high risk for tumor lysis. Blood. 2001;97(10):2998-3003. https://doi.org/10.1182/blood.v97.10.2998
7. Haslam A, Hey SP, Gill J, Prasad V. A systematic review of trial-level meta-analyses measuring the strength of association between surrogate end-points and overall survival in oncology. Eur J Cancer. 2019;106:196-211. https://doi.org/10.1016/j.ejca.2018.11.012
8. Lopez-Olivo MA, Pratt G, Palla SL, Salahudeen A. Rasburicase in tumor lysis syndrome of the adult: a systematic review and meta-analysis. Am J Kidney Dis. 2013;62(3):481-492. https://doi.org/10.1053/j.ajkd.2013.02.378
9. Cortes J, Moore JO, Maziarz RT, et al. Control of plasma uric acid in adults at risk for tumor lysis syndrome: efficacy and safety of rasburicase alone and rasburicase followed by allopurinol compared with allopurinol alone—results of a multicenter phase III study. J Clin Oncol. 2010;28(27):4207-4213. https://doi.org/10.1200/jco.2009.26.8896
10. Martens KL, Khalighi PR, Li S, et al. Comparative effectiveness of rasburicase versus allopurinol for cancer patients with renal dysfunction and hyperuricemia. Leuk Res. 2020;89:106298. https://doi.org/10.1016/j.leukres.2020.106298
11. Personett HA, Barreto EF, McCullough K, Dierkhising R, Leung N, Habermann TM. Impact of early rasburicase on incidence and outcomes of clinical tumor lysis syndrome in lymphoma. Leuk Lymphoma. 2019;60(9):2271-2277. https://doi.org/10.1080/10428194.2019.1574000
12. Howard SC, Cockerham AR, Yvonne Barnes DN, Ryan M, Irish W, Gordan L. Real-world analysis of outpatient rasburicase to prevent and manage tumor lysis syndrome in newly diagnosed adults with leukemia or lymphoma. J Clin Pathways. 2020;6(2):46-51.
13. Abu-Hashyeh AM, Shenouda M, Al-Sharedi M. The efficacy of cost-effective fixed dose of rasburicase compared to weight-based dose in treatment and prevention of tumor lysis syndrome (TLS). J Natl Compr Canc Netw. 2020;18(3.5):QIM20-119. https://doi.org/10.6004/jnccn.2019.7516
14. Patel KK, Brown TJ, Gupta A, et al. Decreasing inappropriate use of rasburicase to promote cost-effective care. J Oncol Pract. 2019;15(2):e178-e186. https://doi.org/10.1200/jop.18.00528
15. Khalighi PR, Martens KL, White AA, et al. Utilization patterns and clinical outcomes of rasburicase administration according to tumor risk stratification. J Oncol Pharm Pract. 2020;26(3):529-535. https://doi.org/10.1177/1078155219851543
16. Elitek. Prescribing information. Sanofi-Aventis U.S., LLC; 2019. Accessed June 1, 2021. https://products.sanofi.us/elitek/Elitek.html
17. Allopurinol. Drugs & Diseases. Medscape. Accessed June 1, 2021. https://reference.medscape.com/drug/zyloprim-aloprim-allopurinol-342811
18. Jones GL, Will A, Jackson GH, Webb NJA, Rule S; British Committee for Standards in Haematology. Guidelines for the management of tumour lysis syndrome in adults and children with haematological malignancies on behalf of the British Committee for Standards in Haematology. Br J Haematol. 2015;169(5):661-671. https://doi.org/10.1111/bjh.13403
19. Boutin A, Blackman A, O’Sullivan DM, Forcello N. The value of fixed rasburicase dosing versus weight-based dosing in the treatment and prevention of tumor lysis syndrome. J Oncol Pharm Pract. 2019;25(3):577-583. https://doi.org/10.1177/1078155217752075
Finally, while <1% of patients experience rasburicase-induced anaphylaxis, 20% to 50% of patients develop gastrointestinal symptoms and viral-syndrome-like symptoms.16 Meanwhile, major side effects from allopurinol that occur with 1% to 10% frequency include maculopapular rash, pruritis, gout, nausea, vomiting, and renal failure syndrome.17 Even if the cost for rasburicase and allopurinol were similar, the lack of improved efficacy and the side-effect profiles of the two medications should make us question whether to prescribe rasburicase preferentially over allopurinol.
WHEN MIGHT URATE OXIDASE BE HELPFUL IN TUMOR LYSIS SYNDROME
While some experts recommend rasburicase prophylaxis in patients at high risk for developing TLS, such recommendations rely on low-quality evidence.2 When prescribing rasburicase, the hospitalist must ensure correct dosing. The FDA approved rasburicase for weight-based dosing at 0.2 mg/kg, though current evidence favors a single, fixed dose of 3 mg.16,17 Compared to weight-based dosing, which has an estimated cost-effectiveness ratio ranging from $27,982.77 to $119,643.59 per quality-adjusted life-year, single dosing has equivalent efficacy at approximately 50% lower cost per dose.11,17,18
WHAT YOU SHOULD DO INSTEAD
As a preventive treatment for TLS, clinicians should only consider prescribing rasburicase as a single fixed dose of 3 mg to high-risk patients.17 In the event of AKI secondary to TLS, clinicians should proceed with the mainstay treatment of resuscitation with aggressive fluid resuscitation, with a goal urine output of at least 2 mL/kg/h.1 Fluid resuscitation should be used cautiously in patients with oliguric or anuric AKI, pulmonary hypertension, congestive heart failure, and hemodynamically significant valvular disease. Clinicians should provide continuous cardiac monitoring during the initial presentation to monitor for electrocardiographic changes in the setting of hyperkalemia and hypocalcemia, and they should consult nephrology, oncology, and critical care services early in the disease course to maximize coordination of care.
RECOMMENDATIONS
Prevention
- Identify patients at high-risk of TLS (Table 1) and consider a single 3-mg dose of rasburicase.
- Manage low- and intermediate-risk patients with allopurinol and hydration.
Treatment
- Identify patients with TLS using the clinical and laboratory findings outlined in the Cairo-Bishop classification system (Table 2).
- Initiate aggressive fluid resuscitation and manage electrolyte abnormalities.
- If urate-lowering therapy is part of local hospital guidelines for TLS management, consider a single dose regimen of rasburicase utilizing shared decision-making.
CONCLUSION
Tumor lysis syndrome remains a metabolic emergency that requires rapid diagnosis and management to prevent morbidity and mortality. Current data show rasburicase rapidly decreases PUA compared to allopurinol. However, the current literature does not provide compelling evidence that rapidly lowering uric acid with rasburicase to prevent TLS or to treat AKI secondary to TLS improves patient-oriented outcomes.
Do you think this is a low-value practice? Is this truly a “Thing We Do for No Reason™”? Share what you do in your practice and join in the conversation online by retweeting it on Twitter (#TWDFNR) and liking it on Facebook. We invite you to propose ideas for other “Things We Do for No Reason™” topics by emailing TWDFNR@hospitalmedicine.org
Inspired by the ABIM Foundation’s Choosing Wisely ® campaign, the “Things We Do for No Reason™” (TWDFNR) series reviews practices that have become common parts of hospital care but may provide little value to our patients. Practices reviewed in the TWDFNR series do not represent clear-cut conclusions or clinical practice standards but are meant as a starting place for research and active discussions among hospitalists and patients. We invite you to be part of that discussion.
CLINICAL SCENARIO
A 35-year-old man with a history of diffuse large B-cell lymphoma (DLBCL), who most recently received treatment 12 months earlier, presents to the emergency department with abdominal pain and constipation. A computed tomography scan of the abdomen reveals retroperitoneal and mesenteric lymphadenopathy causing small bowel obstruction. The basic metabolic panel reveals a creatinine of 1.1 mg/dL, calcium of 8.5 mg/dL, phosphorus of 4 mg/dL, potassium of 4.5 mEq/L, and uric acid of 7.3 mg/dL. The admitting team contemplates using allopurinol or rasburicase for tumor lysis syndrome (TLS) prevention in the setting of recurrent DLBCL.
BACKGROUND
Tumor lysis syndrome is characterized by metabolic derangement and end-organ damage in the setting of cytotoxic chemotherapy, chemosensitive malignancy, and/or increased tumor burden.1 Risk stratification for TLS takes into account patient and disease characteristics (Table 1). Other risk factors include tumor bulk, elevated baseline serum lactate dehydrogenase, and certain chemotherapies (eg, cisplatin, cytarabine, etoposide, paclitaxel, and other cytotoxic agents), immunotherapies, or targeted therapies.2 Elevated serum levels of uric acid, potassium, and phosphorus, as well as preexisting renal dysfunction, predispose patients to clinical TLS.3
The Cairo-Bishop classification system is most frequently used to diagnose TLS (Table 2).3 Laboratory features include hyperkalemia, hyperphosphatemia, hyperuricemia, and hypocalcemia secondary to lysis of proliferating tumor cells and their nuclei. Clinical features include arrhythmias, seizures, and acute kidney injury (AKI).1 Acute kidney injury, the most common clinical complication of TLS, results from crystallization of markedly elevated plasma uric acid, leading to tubular obstruction.1,4 The development of AKI can predict morbidity (namely, the need for renal replacement therapy [RRT]) and mortality in this patient population.1
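As a rough illustration, the laboratory arm of the Cairo-Bishop classification reduces to a handful of threshold checks. The sketch below uses the commonly cited adult cutoffs; the full criteria also accept a ≥25% change from baseline within a defined window around chemotherapy, which is omitted here, so confirm against Table 2 before any clinical use.

```python
def laboratory_tls(uric_acid, potassium, phosphorus, calcium):
    """Return True if >= 2 Cairo-Bishop laboratory criteria are met.

    Units: uric_acid, phosphorus, calcium in mg/dL; potassium in mEq/L.
    Thresholds are the commonly cited adult absolute values; the
    25%-change-from-baseline alternative and the timing window around
    chemotherapy are omitted from this sketch.
    """
    criteria = [
        uric_acid >= 8.0,    # hyperuricemia
        potassium >= 6.0,    # hyperkalemia
        phosphorus >= 4.5,   # hyperphosphatemia (adult threshold)
        calcium <= 7.0,      # hypocalcemia
    ]
    return sum(criteria) >= 2

# The clinical scenario's labs (uric acid 7.3 mg/dL, K 4.5 mEq/L,
# phosphorus 4 mg/dL, calcium 8.5 mg/dL) meet none of the thresholds:
print(laboratory_tls(7.3, 4.5, 4.0, 8.5))  # -> False
```

Applied to the opening scenario, none of the four thresholds is crossed, consistent with a patient who does not yet meet laboratory TLS.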
Stratifying a patient’s baseline risk of developing TLS often dictates the prevention and management plan. Therapeutic prophylaxis and management strategies for TLS include aggressive fluid resuscitation, diuresis, monitoring of plasma uric acid (PUA) and electrolyte levels, and, in certain life-threatening situations, dialysis. Oncologists presume that reducing uric acid levels prevents and treats TLS.
Current methods to reduce PUA as a means of preventing or treating TLS include xanthine oxidase inhibitors (eg, allopurinol) or urate oxidase (eg, rasburicase). Before the US Food and Drug Administration’s (FDA) approval of rasburicase to manage TLS, providers combined allopurinol (a purine analog that inhibits the enzyme xanthine oxidase, decreasing uric acid level) with aggressive fluid resuscitation. Approved by the FDA in 2002, rasburicase offers an alternative treatment for hyperuricemia by directly decreasing levels of uric acid instead of merely preventing the increased formation of uric acid. As a urate oxidase, rasburicase converts uric acid to the non-nephrotoxic, water-soluble, and freely excreted allantoin.
WHY YOU MIGHT THINK YOU SHOULD USE URATE OXIDASE IN TUMOR LYSIS SYNDROME FOR THE PREVENTION AND MANAGEMENT OF ACUTE KIDNEY INJURY
Rasburicase is often considered the standard-of-care treatment for hyperuricemia due to its ability to reduce circulating uric acid levels rapidly. The primary goal of uric acid reduction is to prevent the occurrence of AKI.
Because PUA reduction has biologically plausible relevance to clinically meaningful endpoints, researchers selected it as the primary outcome in the randomized controlled trials (RCTs) and observational studies used to justify treatment with rasburicase. In RCTs, compassionate-use trials, and systematic reviews and meta-analyses, rasburicase produced a more rapid reduction in uric acid levels than allopurinol.5 In one study by Goldman et al,6 rasburicase decreased baseline uric acid levels in pediatric oncology patients by a statistically significant 86% at 4 hours after administration, whereas allopurinol reduced baseline uric acid by only 12%. According to a study by Cairo et al, allopurinol may take up to 1 day to reduce PUA.3
WHY URATE OXIDASE MAY NOT IMPROVE CLINICAL OUTCOMES IN PATIENTS AT RISK FOR OR WITH TUMOR LYSIS SYNDROME
Randomized controlled trials examining the safety, efficacy, and cost-effectiveness of rasburicase in adult patients remain sparse. Both RCTs and systematic reviews and meta-analyses rely on PUA levels as a surrogate endpoint and fail to include clinically meaningful primary endpoints (eg, change in baseline creatinine or need for RRT), raising the question as to whether rasburicase improves patient-centered outcomes.5 Since previous studies in the oncology literature show low or modest correlations between PUA reduction and patient-oriented outcomes, we must question whether PUA reduction serves as a meaningful surrogate endpoint.
Treatment of Tumor Lysis Syndrome
Two meta-analyses focusing on the treatment of TLS by Dinnel et al5 and Lopez-Olivo et al8 each included only three unique RCTs (two of the three RCTs were referenced in both meta-analyses). Moreover, both studies included only one RCT comparing rasburicase directly to allopurinol (a 2010 RCT by Cortes et al9), while the other RCTs compared different rasburicase dosing regimens. Researchers powered the head-to-head RCT by Cortes et al9 to detect a difference in PUA levels across three arms: rasburicase alone, rasburicase followed by allopurinol, or allopurinol alone. All three treatment arms produced a statistically significant reduction in serum PUA levels (87%, 78%, and 66%, respectively; P = .001) without a change in the secondary, underpowered clinical outcomes such as clinical TLS or reduced renal function (defined in this study as increased creatinine, renal failure/impairment, or acute renal failure).
More recently, retrospective analyses of patients with AKI secondary to TLS found no difference in creatinine improvement, renal recovery, or prevention of RRT based on whether the patients received either rasburicase or allopurinol.10,11 While rasburicase is associated with greater PUA reduction compared to allopurinol, according to meaningful RCT and observational data as discussed previously and described further in the following section, this does not translate to clinically important risk reduction.
Prevention of Tumor Lysis Syndrome
Little compelling evidence supports the use of rasburicase for preventing AKI secondary to TLS. Even among patients at high risk for TLS (the only group for whom rasburicase is currently recommended),5 rasburicase does not definitively prevent AKI. Data suggest that, despite lowering uric acid levels, rasburicase does not consistently prevent renal injury11 or decrease the total number of subsequent inpatient days.12 The only phase 3 trial that compared the efficacy of rasburicase to allopurinol for the prevention of TLS and included clinically meaningful endpoints (eg, renal failure) found that, while rasburicase reduced uric acid levels faster than allopurinol, it did not decrease rates of clinical TLS.9
The published literature offers limited efficacy data of rasburicase in preventing TLS in low-risk patients; however, the absence of benefit of rasburicase in preventing renal failure in high-risk patients warrants skepticism as to its potential efficacy in low-risk patients.8,10
Cost-Effectiveness and Other Ethical Considerations
Rasburicase is an expensive treatment. The estimated cost of the FDA-recommended dosing is around $37,500.13 Moreover, studies comparing the cost-effectiveness of rasburicase to allopurinol focus primarily on patients at high risk for TLS, which overestimates the cost-effectiveness of rasburicase in patients at low-to-intermediate risk for TLS.14,15 Unfortunately, some providers regularly prescribe rasburicase inappropriately to patients at low or intermediate risk for TLS. Based on observational studies of rasburicase in various clinical scenarios, including inpatient and emergency department settings, inappropriate use of rasburicase (eg, in the setting of hyperuricemia without evidence of a high-risk TLS tumor, no prior trial of allopurinol, preserved renal function, or no laboratory evaluation) ranges from 32% to 70%.14,15
Finally, while <1% of patients experience rasburicase-induced anaphylaxis, 20% to 50% of patients develop gastrointestinal symptoms and viral-syndrome-like symptoms.16 Meanwhile, major side effects from allopurinol that occur with 1% to 10% frequency include maculopapular rash, pruritus, gout, nausea, vomiting, and renal failure syndrome.17 Even if the costs of rasburicase and allopurinol were similar, the lack of improved efficacy and the side-effect profiles of the two medications should make us question whether to prescribe rasburicase preferentially over allopurinol.
WHEN MIGHT URATE OXIDASE BE HELPFUL IN TUMOR LYSIS SYNDROME
While some experts recommend rasburicase prophylaxis in patients at high risk for developing TLS, such recommendations rely on low-quality evidence.2 When prescribing rasburicase, the hospitalist must ensure correct dosing. The FDA approved rasburicase for weight-based dosing at 0.2 mg/kg, though current evidence favors a single, fixed dose of 3 mg.16,17 Compared to weight-based dosing, which has an estimated cost-effectiveness ratio ranging from $27,982.77 to $119,643.59 per quality-adjusted life-year, single dosing has equivalent efficacy at approximately 50% lower cost per dose.11,17,18
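The cost gap between dosing strategies follows from simple arithmetic. The sketch below backs an assumed per-mg price out of the article's cited ~$37,500 course cost; the 75-kg patient weight and 5-day course length are illustrative assumptions, not figures from the article, so treat the derived numbers as order-of-magnitude only.

```python
# Illustrative arithmetic only; the patient weight and course length are
# assumptions, and the per-mg price is derived, not a published figure.
WEIGHT_KG = 75           # hypothetical patient weight
COURSE_DAYS = 5          # hypothetical full weight-based course length
CITED_COURSE_COST = 37_500  # USD, the article's cited cost estimate

daily_dose_mg = 0.2 * WEIGHT_KG               # FDA weight-based dosing: 0.2 mg/kg
course_dose_mg = daily_dose_mg * COURSE_DAYS  # total mg over the assumed course
price_per_mg = CITED_COURSE_COST / course_dose_mg  # derived: $500/mg here

fixed_dose_cost = 3.0 * price_per_mg  # single fixed 3-mg dose
print(f"assumed price/mg: ${price_per_mg:,.0f}; "
      f"fixed 3-mg dose: ${fixed_dose_cost:,.0f}")
```

Under these assumptions a single fixed 3-mg dose costs a small fraction of a full weight-based course, which is the intuition behind the fixed-dose recommendation.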
WHAT YOU SHOULD DO INSTEAD
As a preventive treatment for TLS, clinicians should consider prescribing rasburicase only as a single fixed dose of 3 mg, and only to high-risk patients.17 In the event of AKI secondary to TLS, clinicians should proceed with the mainstay of treatment: aggressive fluid resuscitation, with a goal urine output of at least 2 mL/kg/h.1 Fluid resuscitation should be used cautiously in patients with oliguric or anuric AKI, pulmonary hypertension, congestive heart failure, or hemodynamically significant valvular disease. Clinicians should provide continuous cardiac monitoring during the initial presentation to watch for electrocardiographic changes in the setting of hyperkalemia and hypocalcemia, and they should consult nephrology, oncology, and critical care services early in the disease course to maximize coordination of care.
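The cited resuscitation target of at least 2 mL/kg/h translates to a concrete hourly number once patient weight is known, as this minimal sketch shows:

```python
def hourly_urine_goal_ml(weight_kg, ml_per_kg_per_h=2.0):
    """Target hourly urine output (mL) for the cited >= 2 mL/kg/h goal."""
    return weight_kg * ml_per_kg_per_h

# For an illustrative 70-kg patient: at least 140 mL/h (~3.4 L/day)
print(hourly_urine_goal_ml(70))  # -> 140.0
```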
RECOMMENDATIONS
Prevention
- Identify patients at high risk of TLS (Table 1) and consider a single 3-mg dose of rasburicase.
- Manage low- and intermediate-risk patients with allopurinol and hydration.
Treatment
- Identify patients with TLS using the clinical and laboratory findings outlined in the Cairo-Bishop classification system (Table 2).
- Initiate aggressive fluid resuscitation and manage electrolyte abnormalities.
- If urate-lowering therapy is part of local hospital guidelines for TLS management, consider a single dose regimen of rasburicase utilizing shared decision-making.
CONCLUSION
Tumor lysis syndrome remains a metabolic emergency that requires rapid diagnosis and management to prevent morbidity and mortality. Current data show rasburicase rapidly decreases PUA compared to allopurinol. However, the current literature does not provide compelling evidence that rapidly lowering uric acid with rasburicase to prevent TLS or to treat AKI secondary to TLS improves patient-oriented outcomes.
Do you think this is a low-value practice? Is this truly a “Thing We Do for No Reason™”? Share what you do in your practice and join in the conversation online by retweeting it on Twitter (#TWDFNR) and liking it on Facebook. We invite you to propose ideas for other “Things We Do for No Reason™” topics by emailing TWDFNR@hospitalmedicine.org
1. Howard SC, Jones DP, Pui CH. The tumor lysis syndrome. N Engl J Med. 2011;364(19):1844-1854. https://doi.org/10.1056/nejmra0904569
2. Cairo MS, Coiffier B, Reiter A, Younes A; TLS Expert Panel. Recommendations for the evaluation of risk and prophylaxis of tumour lysis syndrome (TLS) in adults and children with malignant diseases: an expert TLS panel consensus. Br J Haematol. 2010;149(4):578-586. https://doi.org/10.1111/j.1365-2141.2010.08143.x
3. Cairo MS, Bishop M. Tumour lysis syndrome: new therapeutic strategies and classification. Br J Haematol. 2004;127(1):3-11. https://doi.org/10.1111/j.1365-2141.2004.05094.x
4. Durani U, Shah ND, Go RS. In-hospital outcomes of tumor lysis syndrome: a population-based study using the National Inpatient Sample. Oncologist. 2017;22(12):1506-1509. https://doi.org/10.1634/theoncologist.2017-0147
5. Dinnel J, Moore BL, Skiver BM, Bose P. Rasburicase in the management of tumor lysis: an evidence-based review of its place in therapy. Core Evid. 2015;10:23-38. https://doi.org/10.2147/ce.s54995
6. Goldman SC, Holcenberg JS, Finklestein JZ, et al. A randomized comparison between rasburicase and allopurinol in children with lymphoma or leukemia at high risk for tumor lysis. Blood. 2001;97(10):2998-3003. https://doi.org/10.1182/blood.v97.10.2998
7. Haslam A, Hey SP, Gill J, Prasad V. A systematic review of trial-level meta-analyses measuring the strength of association between surrogate end-points and overall survival in oncology. Eur J Cancer. 2019;106:196-211. https://doi.org/10.1016/j.ejca.2018.11.012
8. Lopez-Olivo MA, Pratt G, Palla SL, Salahudeen A. Rasburicase in tumor lysis syndrome of the adult: a systematic review and meta-analysis. Am J Kidney Dis. 2013;62(3):481-492. https://doi.org/10.1053/j.ajkd.2013.02.378
9. Cortes J, Moore JO, Maziarz RT, et al. Control of plasma uric acid in adults at risk for tumor lysis syndrome: efficacy and safety of rasburicase alone and rasburicase followed by allopurinol compared with allopurinol alone—results of a multicenter phase III study. J Clin Oncol. 2010;28(27):4207-4213. https://doi.org/10.1200/jco.2009.26.8896
10. Martens KL, Khalighi PR, Li S, et al. Comparative effectiveness of rasburicase versus allopurinol for cancer patients with renal dysfunction and hyperuricemia. Leuk Res. 2020;89:106298. https://doi.org/10.1016/j.leukres.2020.106298
11. Personett HA, Barreto EF, McCullough K, Dierkhising R, Leung N, Habermann TM. Impact of early rasburicase on incidence and outcomes of clinical tumor lysis syndrome in lymphoma. Leuk Lymphoma. 2019;60(9):2271-2277. https://doi.org/10.1080/10428194.2019.1574000
12. Howard SC, Cockerham AR, Yvonne Barnes DN, Ryan M, Irish W, Gordan L. Real-world analysis of outpatient rasburicase to prevent and manage tumor lysis syndrome in newly diagnosed adults with leukemia or lymphoma. J Clin Pathways. 2020;6(2):46-51.
13. Abu-Hashyeh AM, Shenouda M, Al-Sharedi M. The efficacy of cost-effective fixed dose of rasburicase compared to weight-based dose in treatment and prevention of tumor lysis syndrome (TLS). J Natl Compr Canc Netw. 2020;18(3.5):QIM20-119. https://doi.org/10.6004/jnccn.2019.7516
14. Patel KK, Brown TJ, Gupta A, et al. Decreasing inappropriate use of rasburicase to promote cost-effective care. J Oncol Pract. 2019;15(2):e178-e186. https://doi.org/10.1200/jop.18.00528
15. Khalighi PR, Martens KL, White AA, et al. Utilization patterns and clinical outcomes of rasburicase administration according to tumor risk stratification. J Oncol Pharm Pract. 2020;26(3):529-535. https://doi.org/10.1177/1078155219851543
16. Elitek. Prescribing information. Sanofi-Aventis U.S., LLC; 2019. Accessed June 1, 2021. https://products.sanofi.us/elitek/Elitek.html
17. Allopurinol. Drugs & Diseases. Medscape. Accessed June 1, 2021. https://reference.medscape.com/drug/zyloprim-aloprim-allopurinol-342811
18. Jones GL, Will A, Jackson GH, Webb NJA, Rule S; British Committee for Standards in Haematology. Guidelines for the management of tumour lysis syndrome in adults and children with haematological malignancies on behalf of the British Committee for Standards in Haematology. Br J Haematol. 2015;169(5):661-671. https://doi.org/10.1111/bjh.13403
19. Boutin A, Blackman A, O’Sullivan DM, Forcello N. The value of fixed rasburicase dosing versus weight-based dosing in the treatment and prevention of tumor lysis syndrome. J Oncol Pharm Pract. 2019;25(3):577-583. https://doi.org/10.1177/1078155217752075
© 2021 Society of Hospital Medicine