Abstract
Background: Though cardiac stress tests have long been the standard of care for initial evaluation of cardiac symptoms, our institution, along with others, has noted high rates of incomplete tests.
Objective: To identify sociodemographic factors associated with the completion of cardiac stress tests and to assess the value of completed tests.
Design & Participants: We conducted a retrospective chart review evaluating 150 patients with cardiac stress test orders placed in 1 urban hospital-based primary care practice from 1/1/2018-12/31/2021.
Main Measures: Our primary outcome was the completion of the stress test. We examined rates of completion based on sociodemographic factors including age, gender, race, language, and social vulnerability, markers of chronic illness, risk of atherosclerotic cardiovascular disease, and pretest probability of coronary artery disease.
Key Results: In a multivariable adjusted model, female gender (OR:0.43 [0.18-1.00]), Black race (OR:0.26 [0.11-0.61]), and dyslipidemia (OR:0.27 [0.090-0.78]) were associated with lower test completion rates. We found no relationship between the likelihood of test completion and pretest probability. In an analysis of tests with low pretest probability, 100% of low-risk stress tests were negative; had any of those tests been positive the highest positive predictive value would have been 25%.
Conclusions: Test completion rates were significantly lower for individuals with female gender, Black race, and a diagnosis of dyslipidemia, highlighting inequities in the completion rates for a potentially lifesaving test. In addition, a substantial number of ordered tests were low risk and low value, highlighting an opportunity to advance the value of cardiovascular care delivered.
- Academic Medical Centers
- Diagnostic Errors
- Diagnostic Tests
- Patient Safety
- Primary Health Care
- Quality Improvement
- Retrospective Studies
- Sociodemographic Factors
- Cardiac Stress Test
Introduction
Cardiac stress tests are a critical diagnostic tool for patients with known or suspected heart disease, and several million tests are ordered by clinicians each year. Recent studies by our team and others have found low patient completion rates of ordered stress tests (termed diagnostic “loop closure”), with only 60% of stress tests being completed when ordered in the primary care setting.1,2 Failure to close “diagnostic loops” represents one of the leading causes of diagnostic errors and delayed treatment in the United States.3 Our team has previously examined demographic and limited clinical factors associated with the failure to complete stress tests in a timely way.1,2 Through prior qualitative analysis, we have found that, in addition to patient factors, difficulties with scheduling and access represent a large barrier to test completion. These issues have only worsened since the COVID pandemic.4 In addition, prior studies have indicated that many ordered cardiac tests may be of low value.5–8 In light of this, we sought to examine the association of sociodemographic and clinical factors with completion of stress tests through granular variables not available in a large database, and to evaluate whether the clinical value of stress tests was linked to test completion. In doing this, we also identified the frequency of low-value tests to evaluate the opportunity to improve access to stress testing for those who needed it most by reducing low-value testing.
Methods
We conducted a retrospective chart review evaluating patients with chest pain (n = 147) or an anginal equivalent (n = 3) who had cardiac exercise stress test orders (imaging and nonimaging) placed in 1 urban hospital-based primary care practice from 1/1/2018-12/31/2021. From this cohort, we placed patients into 3 categories: 1) completed the stress test within 45 days of the order (on time), 2) completed the stress test after 45 days (delayed), and 3) never completed the test. The 45-day time frame was selected based on urgency, as determined by the physicians administering the test, and on feasibility, to give patients enough time to schedule given current systemic delays. Tests that were ordered urgently were not included in this cohort. We identified 1180 total stress test orders, of which 719 (60.9%) were completed on time, 100 (8.4%) were completed in a delayed fashion, and 361 (30.6%) were never completed. In each of these categories, we identified a random sample of 50 patients for chart review, for a total sample of 150 patients.
Our primary outcome was the completion of the stress test (ie, closure of the diagnostic loop) and our secondary outcome was the time to completion. We report rates of loop closure based on sociodemographic factors including age, gender, race, language, and social vulnerability (based on the Centers for Disease Control and Prevention’s Social Vulnerability Index [SVI], which measures vulnerability across 4 domains: socioeconomic status [RPL theme 1], household characteristics [RPL theme 2], racial and ethnic minority status [RPL theme 3], and housing type and transportation [RPL theme 4]).9 Additional clinical factors included presence of chronic illness (hypertension [ICD code I10], dyslipidemia [ICD code E78], diabetes [ICD codes E08-E13]), risk of atherosclerotic cardiovascular disease (ASCVD risk score),10 and pretest probability of coronary artery disease (CAD). Pretest probability of CAD was calculated using the European Society of Cardiology model, which is an expanded version of the well-established Diamond and Forrester scoring system and incorporates age, gender, and chest pain quality (divided into 3 categories: typical, atypical, and nonanginal), which we collected by chart review.11–13
In a secondary analysis, we linked pretest probability of CAD to cardiac stress test result to examine the value of the ordered tests. Using the model described above, we risk stratified all patients with completed tests into low, intermediate, and high-risk categories. We then linked this stratification to the outcome of their stress test, which was collected through a retrospective chart review. Finally, we calculated posttest probability of CAD as a means of highlighting the impact of a positive test based on test characteristics (imaging vs nonimaging) and pretest probability. Tests with a pretest probability of 0 to 5% were considered to be low risk, those with a pretest probability of 6 to 14% were considered to be intermediate risk, and those with a pretest probability of ≥15% were considered to be high risk.13 Posttest probability of CAD ranged from 0 to 13.5% (nonimaging) and 0 to 25% (imaging) for low-risk tests, 15.9 to 32.5% (nonimaging) and 28.8 to 50.8% (imaging) for intermediate-risk tests, and 34.3 to 76.2% (nonimaging) and 52.8 to 87.3% (imaging) for high-risk tests.
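The risk stratification and posttest calculation described above can be illustrated with a minimal Python sketch. It assumes posttest probability after a positive test follows from Bayes' theorem in odds form; the function names and the sensitivity and specificity values in the usage note are hypothetical placeholders, not the test characteristics used in this study, and the handling of non-integer percentages between 5 and 6 is likewise an assumption.

```python
def risk_category(pretest_pct: float) -> str:
    """Classify a pretest probability (in percent) using the stated cut points.

    Assumption: values between 5 and 6 are treated as intermediate, since the
    text gives only integer ranges (0-5, 6-14, >=15).
    """
    if pretest_pct < 0 or pretest_pct > 100:
        raise ValueError("pretest_pct must be between 0 and 100")
    if pretest_pct <= 5:
        return "low"
    if pretest_pct < 15:
        return "intermediate"
    return "high"


def posttest_probability(pretest: float, sensitivity: float, specificity: float) -> float:
    """Probability of CAD after a positive test (all arguments are proportions).

    Converts pretest probability to odds, multiplies by the positive
    likelihood ratio, and converts back to a probability.
    """
    if not 0 <= pretest < 1:
        raise ValueError("pretest must be in [0, 1)")
    if not 0 < sensitivity <= 1 or not 0 <= specificity < 1:
        raise ValueError("sensitivity must be in (0, 1], specificity in [0, 1)")
    lr_positive = sensitivity / (1 - specificity)   # positive likelihood ratio
    pretest_odds = pretest / (1 - pretest)
    posttest_odds = pretest_odds * lr_positive
    return posttest_odds / (1 + posttest_odds)
```

For example, with a hypothetical sensitivity and specificity of 0.80 each, `posttest_probability(0.05, 0.80, 0.80)` gives roughly 0.17, showing why a positive result in a low-risk patient still leaves CAD unlikely.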
To examine the association between patient characteristics and loop closure and the association between patient characteristics and time to loop closure, we employed a logistic regression model and a Cox proportional hazards model,14 respectively. Given the presence of missing data, we conducted 5 imputations under the assumption that variables were missing at random. To account for the limited sample size, within each imputation, we conducted univariate analysis for both models to allow subsequent model selection. Model selection used backward stepwise selection with the Akaike information criterion (AIC). The majority method for model selection15 was employed, retaining the covariates selected at least 3 times for inclusion in the final models. The results of the 5 models from the 5 imputations were combined using Rubin’s rules.16 We reported hazard ratios for the Cox proportional hazards model, odds ratios for the logistic regression model, and their corresponding 95% confidence intervals. The threshold for significance was set at 0.05. All analyses were performed in R (version 4.2.2).17 Multiple imputations were performed with the MICE R package,18 and the fitting of the Cox proportional hazards model was done with the survival R package.19–20
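As an illustration of the pooling and selection steps, the following Python sketch shows Rubin's rules for combining a coefficient across imputations and the majority rule for retaining covariates. It is not the study's R code: the function names and input values are hypothetical, and the confidence interval uses a normal approximation, whereas multiple-imputation software typically uses a t reference distribution with adjusted degrees of freedom.

```python
import math
from collections import Counter


def pool_rubin(estimates, variances):
    """Pool one coefficient (eg, a log odds ratio) across m imputations.

    Returns the pooled estimate and its total variance:
    T = W + (1 + 1/m) * B, where W is the mean within-imputation variance
    and B is the between-imputation variance of the estimates.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least 2 imputations")
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    return q_bar, within + (1 + 1 / m) * between


def ratio_with_ci(log_estimate, total_variance, z=1.96):
    """Exponentiate a pooled log ratio; normal-approximation 95% CI (assumption)."""
    se = math.sqrt(total_variance)
    return (math.exp(log_estimate),
            math.exp(log_estimate - z * se),
            math.exp(log_estimate + z * se))


def majority_covariates(selected_per_imputation, min_count=3):
    """Keep covariates chosen by stepwise selection in at least min_count imputations."""
    counts = Counter(name for sel in selected_per_imputation for name in set(sel))
    return sorted(name for name, c in counts.items() if c >= min_count)
```

With 5 imputations, `min_count=3` corresponds to the "selected at least 3 times" rule described above.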
Results
Description of Cohort
A total of 150 stress tests were analyzed. Patients who did not complete the ordered test were significantly older (median age 64 vs 57.5 years), more likely to identify as female (70 vs 51%), more likely to identify as Black (56 vs 31%), from more socially vulnerable communities (median RPL theme 0.71 vs 0.46), more likely to speak English (100 vs 83%), and more likely to have a diagnosis of dyslipidemia (82 vs 68%) (P < .05). They also tended to have a higher ASCVD score (10.40 vs 7.80) and were more likely to have a diagnosis of diabetes (30 vs 16%) (P < .10) (Table 1). The risk of coronary artery disease based on their calculated pretest probability did not differ significantly between those with tests completed and those whose tests were not completed (of those completed on time, 30% were low risk, while 34% of those not completed were low-risk patients based on the calculated probability of coronary artery disease (P = .46)). In an analysis of test completion by year, there was no statistically significant difference in rates of test completion when comparing pre-COVID and post-COVID timeframes.
Table 1. Baseline Characteristics
Factors Associated with Lower Rates of Loop Closure
In a single variable logistic regression, female gender (OR: 0.38 [0.18-0.83]), Black race (OR: 0.35 [0.16-0.74]), social vulnerability (OR: 0.21 [0.062-0.70]) and presence of dyslipidemia (OR: 0.35 [0.13-0.93]) were associated with lower rates of loop closure. Age (OR: 0.97 [0.93-1.00]) and diagnosis of diabetes (OR: 0.44 [0.19-1.02]) also trended toward lower rates of loop closure (Table 2). In a fully adjusted model, these differences persisted for female gender (OR: 0.43 [0.18-1.00]), Black race (OR: 0.26 [0.11-0.61]) and dyslipidemia (OR: 0.27 [0.090-0.78]) (Table 3).
Table 2. Univariate Analysis for Loop Closure
Table 3. Cox Model for Loop Closure
Factors Associated with Lower Time to Loop Closure
In an analysis of time to loop closure, female gender (HR: 0.66 [0.44-0.99]), Black race (HR: 0.49 [0.31-0.77]), social vulnerability (HR: 0.41 [0.21-0.78]), hypertension (HR: 0.64 [0.42-0.99]), and dyslipidemia (HR: 0.58 [0.37-0.91]) were associated with increased time to loop closure in the single variable Cox model, while having English as a nonprimary language (HR: 2.23 [1.27-3.94]) was associated with decreased time to loop closure. A diagnosis of diabetes (HR: 0.57 [0.33-1.00]) also trended toward increased time to loop closure (Table 4). In the adjusted model, these differences persisted only for Black race (HR: 0.51 [0.31-0.85]), dyslipidemia (HR: 0.46 [0.28-0.77]), and non-English language (HR: 2.66 [1.47-4.84]) (Table 5).
Table 4. Univariate Analysis for Time to Loop Closure
Table 5. Cox Model for Time to Loop Closure
Analysis of Stress Test Value
In a secondary analysis, we calculated the pretest probability for all stress tests (n = 150). In our descriptive analysis, we found no trend between likelihood of loop closure and pretest probability (Table 6). We then conducted a sensitivity analysis of all completed tests (n = 97), looking at the relationship between pretest probability and test outcome. In review of completed tests by risk category, 100% of low-risk tests (n = 27), 82.9% of intermediate-risk tests (n = 41) and 86.2% of high-risk tests (n = 29) were negative. We calculated posttest probability for all positive tests (n = 11). Of all patients with a pretest probability in the intermediate risk range, posttest probability ranged from 15.9 to 48.6%, with a higher probability associated with imaging studies. Of all patients with a pretest probability in the high-risk range, posttest probability ranged from 45.5 to 74.9%, with a higher probability associated with imaging studies.
Table 6. Association of Test Completion with Pretest Probability of Coronary Artery Disease (CAD)
Finally, assuming a similar distribution of low-, intermediate-, and high-risk tests across the entire patient population, our results suggest that in the larger cohort from which these random samples were drawn, with a total of 819 completed tests, 228 (27.8%) would have been low risk, 346 (42.3%) would have been intermediate risk, and 245 (29.9%) would have been high risk. Of those who completed the test, we would have expected 59 (17.1%) intermediate-risk and 34 (13.8%) high-risk tests to have been positive. In our smaller cohort, had the 50 patients who did not complete the test completed it, we would have expected a total of 17 positive tests, compared with 11 (11.3%) positive tests in the current cohort.
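The extrapolation above is simple proportional scaling, which the following Python sketch reproduces. The positive counts per category (0, 7, and 4) are derived here from the reported negative rates (100%, 82.9%, and 86.2%) and the 11 positive tests overall; the function name is hypothetical.

```python
# Completed tests in the chart-review sample: category -> (tests, positive tests).
# Positive counts are back-calculated from the reported negative percentages.
SAMPLE = {"low": (27, 0), "intermediate": (41, 7), "high": (29, 4)}


def extrapolate(sample, total_completed):
    """Scale sample counts to a larger cohort, assuming the same distribution.

    Returns category -> (expected tests, expected positive tests), rounded
    to whole tests.
    """
    n_sample = sum(tests for tests, _ in sample.values())
    scale = total_completed / n_sample
    return {
        category: (round(tests * scale), round(positives * scale))
        for category, (tests, positives) in sample.items()
    }


if __name__ == "__main__":
    # 819 completed tests = 719 on time + 100 delayed in the full cohort.
    for category, (tests, positives) in extrapolate(SAMPLE, 819).items():
        print(f"{category}: {tests} tests, {positives} expected positive")
```

Applied to the 819 completed tests, this yields the 228, 346, and 245 tests and the 59 and 34 expected positives reported above.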
Discussion
Rates of loop closure for nonurgent tests were low for all patient groups based on our entire cohort but were shown to be significantly lower for Black individuals and those with a diagnosis of dyslipidemia in our random samples of patients with completed, delayed and never completed tests. These findings were also noted with time to loop closure, with these 2 groups having significantly increased time to test completion. In addition, the univariate analysis highlighted that social vulnerability, female gender, older age, and a diagnosis of diabetes were linked to lower rates of loop closure. Of note, Black race was associated with lower rates of loop closure while RPL theme 3 (residence in a US census tract with higher proportions of individuals of racial and ethnic minority status) was not, suggesting that this population may face particular barriers to loop closure and that disparities may be mediated primarily at the individual, rather than neighborhood or census-tract, level. Taken together, these patient factors highlight a particularly concerning finding—the sickest and most socially vulnerable patients have lower completion rates for a potentially lifesaving test.
Furthermore, our secondary analysis highlights the lack of association between pretest probability and likelihood of loop closure (ie, patients with a high risk of CAD were no more likely to complete their stress tests than those with a low risk), suggesting that important clinical variables are not being used by clinicians or the test sites to prioritize and motivate nonurgent patients for testing and test completion. However, one notable finding was the frequency with which patients in the lowest risk group (pretest probability of CAD of less than or equal to 5%) were tested, highlighting a category of patients who are likely receiving low-value stress tests that are unlikely to provide any clinically significant information. The European Society of Cardiology (ESC), which created and validated this pretest probability risk score, has studied outcomes and found that the annual risk of cardiovascular death or MI is less than 1% in patients with a pretest probability less than 15%; it recommends against routine testing in patients in this cohort (low and intermediate risk) to reduce unnecessary procedures and costs. Other measures such as the ASCVD risk score may also help inform clinical decision making. However, given that this metric uses a race factor that tends to increase estimates of cardiovascular risk among African American patients, we used the ESC pretest probability risk score, which is widely studied and does not include race, in our evaluation of test value.
In light of these findings, it is crucial to consider opportunities to improve the likelihood of test completion for the highest-risk and most vulnerable patients and to reduce unnecessary tests for low-risk patients. The first step to addressing this complex issue is recognizing the underlying systemic factors that may prevent racial minority and underserved patients from completing ordered tests (ie, poor access to care, higher burden of health-related social need, poor access to transportation), which must be addressed at the system level. In addition, with current access and staffing difficulties throughout our health care system, test scheduling is often complex, with very little proactive outreach and no safeguards to ensure that patients have scheduled their appointments. Past studies have identified several opportunities to intervene, including automated tracking for outstanding tests within electronic medical records, phone outreach to patients, automated text and e-mail reminders, and the use of referral managers to help vulnerable patients schedule their tests.21 Further research will be needed to evaluate the relative efficacy of these interventions and to identify best practices for their implementation.
Given the racial and gender disparities in loop closure identified in this study, additional efforts will be needed to reduce population-specific barriers to loop closure; for instance, women and minoritized populations have often reported clinician bias and distrust that may hinder test completion.22 Interventions to address these barriers may include educational programming, standardized protocols to reduce bias, and race- and gender-sensitive data collection for practice-specific quality improvement initiatives.23 Finally, in settings where the capacity for testing limits the ability to close the loop on ordered tests, calculating the pretest probability of disease based on clinical variables at the point of ordering along with a calculated predictive value of a positive test would help to diminish the number of low value tests ordered,24–25 creating more opportunities to schedule patients who are higher risk for higher value testing.
Limitations of this study include the limited sample size due to the necessity for chart review, which may limit the generalizability of results; the exclusion of urgent stress tests, which may have affected results; and the lack of proportional stratified random sampling. In addition, though having English as a nonprimary language was significantly associated with decreased time to loop closure, all patients with open loops spoke English, so we were not able to accurately assess this effect. Our secondary analysis was further limited by the small sample size and the inclusion of only well-established cardiovascular risk factors in the model, and it was designed to be descriptive, warranting further research to validate our findings. Strengths of this study include the novelty of the inquiry and the breadth of the patient/system factors evaluated, including chronic illness, risk of ASCVD, pretest probability of CAD, and social vulnerability across 4 domains in addition to traditional demographic measures.
In conclusion, this study found that Black race, female gender, and a diagnosis of dyslipidemia were associated with lower rates of loop closure and that the pretest probability of disease was not related to loop closure. In addition, our secondary analysis underscores symptom characteristics that may help us identify patients who are low risk and would not benefit from cardiac stress testing. Taken together, these findings could help us distinguish patient groups who would most benefit from a cardiac stress test when presenting with chest pain in the outpatient setting and prioritize them for test completion, while also recognizing that some patient groups may need special attention or assistance to complete testing.
Acknowledgments
The authors would like to acknowledge Scot Sternberg and Keishi Nambara for their contributions.
This work was presented at the SGIM Annual Meeting in May 2023 and the SGIM Northeast Regional Meeting in November 2023.
Notes
This article was externally peer reviewed.
Funding: Dr. Phillips was supported by the Agency for Healthcare Research and Quality (AHRQ) R18 5R18HS027282. Dr. Amat was supported by the Arnold Tofias and Leo Condakes Quality Scholarship Program.
Conflict of interest: None.
To see this article online, please go to: http://jabfm.org/content/37/6/1088.full.
- Received for publication February 15, 2024.
- Revision received May 14, 2024.
- Accepted for publication May 28, 2024.