Abstract
Introduction: A clinical decision rule to improve the accuracy of a diagnosis of influenza could help clinicians avoid unnecessary use of diagnostic tests and treatments. Our objective was to develop and validate a simple clinical decision rule for diagnosis of influenza.
Methods: We combined data from 2 studies of influenza diagnosis in adult outpatients with suspected influenza: one set in California and one in Switzerland. Patients in both studies underwent a structured history and physical examination and had a reference standard test for influenza (polymerase chain reaction or culture). We randomly divided the dataset into derivation and validation groups and then evaluated simple heuristics and decision rules from previous studies and 3 rules based on our own multivariate analysis. Cutpoints for stratification of risk groups in each model were determined using the derivation group before evaluating them in the validation group. For each decision rule, the positive predictive value and likelihood ratio for influenza in low-, moderate-, and high-risk groups, and the percentage of patients allocated to each risk group, were reported.
Results: The simple heuristics (fever and cough; fever, cough, and acute onset) were helpful when positive but not when negative. The most useful and accurate clinical rule assigned 2 points for fever plus cough, 2 points for myalgias, and 1 point each for duration <48 hours and chills or sweats. The risk of influenza was 8% for 0 to 2 points, 30% for 3 points, and 59% for 4 to 6 points; the rule performed similarly in derivation and validation groups. Approximately two-thirds of patients fell into the low- or high-risk group and would not require further diagnostic testing.
Conclusion: A simple, valid clinical rule can be used to guide point-of-care testing and empiric therapy for patients with suspected influenza.
- Clinical Epidemiology
- Decision Sciences
- Evidence-Based Medicine
- Primary Health Care
- Respiratory Tract Diseases
Seasonal influenza accounts for losses in workforce productivity, a strain on health services, and an average of 36,000 deaths1 and 200,000 hospitalizations2 in the United States every year. Neuraminidase inhibitors such as zanamivir and oseltamivir can reduce the duration and severity of symptoms for both influenza virus types A and B, but empiric therapy must be initiated within 36 to 48 hours of symptom onset.3 Therefore, prompt diagnosis of influenza is needed to guide the use of antiviral therapy and infection control measures. Although rapid tests for the diagnosis of influenza are available, such tests have limited sensitivity and often perform no better than clinical criteria or a physician's unaided clinical judgment.4,5
As has been demonstrated with other conditions such as sore throat,6 pulmonary embolism,7 deep vein thrombosis,8 and ankle injury,9 the history and physical examination can help to establish a patient's pretest probability of influenza and inform the interpretation of subsequent diagnostic tests. Previous systematic reviews10,11 have found that individual signs or symptoms are of limited value in distinguishing influenza from other influenza-like illnesses. In a systematic review of studies published between 1966 and 2004, the summary likelihood ratio (LR) did not exceed 2.0 for any single sign or symptom.11 A more recent systematic review studied decision rules using combinations of symptoms or developed with multivariate techniques. It found that the positive predictive value of clinical heuristics such as “cough and fever” (26%–87%) and “cough, fever, and acute onset” (30%–77%) varied widely between studies.12
Our goal is to help physicians make the best possible use of the history and physical examination to minimize unnecessary testing and inappropriate antiviral use. To this end, we assembled data from 2 similarly designed studies of the clinical diagnosis of influenza with the intention of developing and validating a clinical decision rule (also called a “clinical decision aid” or “clinical prediction rule”). By combining data from 2 studies we were able to create a larger development set, still have enough patients for a validation group, and provide greater generalizability. We also attempted to validate previously proposed simple heuristics and point scores based on the multivariate analysis of influenza data published by Monto and colleagues.13
Methods
We identified 2 prospective cohort studies5,14 that reported the accuracy of signs and symptoms of influenza for adults in the outpatient setting and used an acceptable reference standard (polymerase chain reaction [PCR] or culture). Details of their study designs have been published previously.5,14 The study populations had many similarities, including similar mean ages, age ranges, sex distributions, and percentages of patients with cough, sore throat, rhinitis, headache, chills, and fatigue (Table 1). Though the populations differed somewhat regarding the proportion with influenza and the proportion presenting with fever, we felt that the increased generalizability of creating a decision rule using patients from 2 different countries justified combining the datasets. Also, the prevalence of influenza in the final dataset is very similar to that seen during the typical peak influenza season in the United States, and there is no evidence of spectrum bias in previous studies. We used the random number function of an Excel spreadsheet (Microsoft Corp, Redmond, WA) to divide the dataset into derivation (70%; n = 326) and validation (30%; n = 133) subgroups.
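The random division described above can be sketched as follows. This is a minimal illustration in Python, not the Excel procedure actually used: it assumes each patient is assigned independently by comparing a uniform random draw with 0.70, which would explain why a nominal 70/30 split need not produce exactly 70% in the derivation group. The function name, the seed, and the placeholder patient identifiers are hypothetical.

```python
import random


def split_derivation_validation(patient_ids, derivation_fraction=0.70, seed=0):
    """Assign each patient independently to the derivation or validation group.

    Assumption: one uniform random draw per patient, compared with the
    target fraction. The seed is illustrative and only makes the sketch
    reproducible.
    """
    rng = random.Random(seed)
    derivation, validation = [], []
    for pid in patient_ids:
        if rng.random() < derivation_fraction:
            derivation.append(pid)
        else:
            validation.append(pid)
    return derivation, validation


# Placeholder identifiers for the 459 patients in the combined dataset.
patient_ids = list(range(459))
derivation, validation = split_derivation_validation(patient_ids)
print(len(derivation), len(validation))
```

Under this approach the group sizes vary from run to run around the target proportions; every patient lands in exactly one group.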
Table 1. Characteristics of Included Studies
The dependent, or outcome, variable was influenza as diagnosed by PCR or culture. Independent, or predictor, variables statistically significant at P < .05 in the univariate analysis (Table 2) were used to develop a multivariate model. A backward, stepwise logistic regression was performed using the derivation group and removing variables with P > .20. This model was simplified into a point score, flu score 1, based on the odds ratios (approximately twice the odds ratio was used to avoid half points). We then created 2 additional models that included interaction terms for fever, cough, and acute onset (flu score 2) and fever and cough (flu score 3).
Table 2. Univariate Analysis of Individual and Pooled Datasets
We also proposed 2 point scores (flu scores 4 and 5) based on Monto et al's13 previously published multivariate model. These scores differed in the approach they took to assigning points to each clinical variable, with flu score 5 taking a more simplified approach. We adjudicated minor discrepancies in variable definitions as follows: (1) Monto et al13 defined acute onset as symptoms present for <36 hours, whereas our datasets defined it as presentation to a physician within 48 hours of the onset of symptoms; (2) “stuffy nose” and “rhinitis” in our datasets were considered equivalent to “nasal congestion” in the Monto et al13 study; (3) we considered “fatigue” equivalent to “weakness”; and (4) we did not have data regarding loss of appetite, so that variable was omitted from the point scores. In Monto et al's13 model, acute onset was associated with a decreased likelihood of influenza. Because this is different from all other studies in the literature,10,11 we assigned acute onset positive points in each model (ie, predicting an increased risk of influenza). Two subjects were excluded from this validation because of missing information.
Cutpoints for flu scores 1 through 5 were determined by visual inspection of the model as applied to the derivation group of 326 patients. Our goal was to create low-, moderate-, and high-risk groups, with the low-risk group having a probability of influenza <10% and the high-risk group having a probability of influenza of at least 50%. These cutoffs were chosen because they corresponded to reasonable no-test/test and test/treat thresholds based on the opinion of the research team (all but one of whom are experienced primary care physicians), an informal email poll of 20 academic generalists, and a previous decision threshold analysis.15 Patients in the low-risk group would not require further testing because disease had been ruled out, and those in the high-risk group would not require further testing because disease had been “ruled in.” We then evaluated these models and cutpoints using the validation group of 133 patients.
Finally, the combined adult dataset was used to validate 2 simple heuristics previously reported in the literature.12 These heuristics were the “fever and cough rule” (influenza is diagnosed if both fever and cough are present) and the “fever, cough, and acute onset rule” (influenza is diagnosed if all 3 are present). The simple heuristics were evaluated using the entire combined dataset of 459 patients because their definitions of abnormal were known at the outset of the study.
The accuracy of clinical decision rules and simple heuristics was evaluated using standard measures such as sensitivity, specificity, LRs, and posttest probabilities. Unless otherwise noted, all statistical analyses were performed using Stata software (version 11.0, Stata Corp, College Station, TX).
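For readers less familiar with these measures, the sketch below shows how they follow from a 2 × 2 table of rule result against reference standard. It is written in Python for illustration only (the analyses themselves were run in Stata); the function names and the counts in the example are hypothetical and are not study data.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Standard accuracy measures from a 2 x 2 table.

    tp/fn: patients with influenza who were rule-positive/rule-negative.
    fp/tn: patients without influenza who were rule-positive/rule-negative.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
measures = accuracy_measures(tp=60, fp=20, fn=40, tn=80)
prevalence = (60 + 40) / 200
probability_if_positive = posttest_probability(prevalence, measures["lr_positive"])
print(measures)
print(round(probability_if_positive, 3))
```

When the pretest probability is set to the prevalence in the same table, the posttest probability after a positive result equals the positive predictive value (here 60/80).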
Funding Source
This study did not receive any external funding. It was approved by the Human Subjects Committee of the University of Georgia.
Results
Development and Validation of Clinical Rules
The univariate analyses for the individual and pooled datasets from the studies of Senn et al14 and Stein et al5 are shown in Table 2. Variables significantly associated with influenza in the combined dataset included fever, cough, myalgia, rhinitis, headache, chills, and acute onset (symptoms present for <48 hours). These variables were used in the multivariate analyses to develop flu score 1 (without interaction terms) and flu scores 2 and 3 (each including an interaction term). Flu score 4 and flu score 5 are based on the multivariate model of Monto and colleagues,13 with flu score 5 taking a more simplified approach using only whole integers. Flu scores 1 through 5, including the underlying multivariate models and final point scores, are shown in Table 3.
Table 3. Clinical Decision Rules Based on Multivariate Models from Combined Dataset (Flu Scores 1 to 3) and Multivariate Model of Monto et al13 (Flu Scores 4 and 5)
Each flu score was applied to the derivation group of 326 patients, and visual inspection of the data was used to identify logical cutpoints that created low-risk (<10%), moderate-risk (10%–50%), and high-risk (>50%) groups for influenza. It was not possible to create a low-risk group with a probability of influenza less than 10% for flu scores 4 and 5. Each score then was evaluated for accuracy using the validation group of 133 patients. Posttest probabilities and LRs for the derivation, validation, and combined groups are shown in Table 4. The number of patients in each risk group also is shown. This is an important factor when considering the usefulness of a clinical rule because clinical rules are most helpful when they assist clinicians in ruling in or ruling out disease in a large number of individuals.
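The three quantities reported for each risk group (share of patients, probability of influenza, and stratum-specific LR) can be computed from simple counts, as in the Python sketch below. The function name and the counts are hypothetical illustrations, not values from Table 4; the stratum-specific LR is taken here to be the proportion of patients with influenza who fall in the group divided by the proportion of patients without influenza who fall in the same group.

```python
def stratum_summary(strata):
    """Summarize each risk group.

    strata maps a group name to (number with influenza, number without).
    """
    total_flu = sum(flu for flu, _ in strata.values())
    total_no_flu = sum(no_flu for _, no_flu in strata.values())
    total = total_flu + total_no_flu
    summary = {}
    for name, (flu, no_flu) in strata.items():
        group_size = flu + no_flu
        summary[name] = {
            "percent_of_patients": 100 * group_size / total,
            "probability_of_influenza": flu / group_size,
            # Stratum-specific LR: P(group | influenza) / P(group | no influenza)
            "likelihood_ratio": (flu / total_flu) / (no_flu / total_no_flu),
        }
    return summary


# Hypothetical counts for illustration only.
hypothetical_counts = {"low": (5, 45), "moderate": (15, 35), "high": (30, 20)}
summary = stratum_summary(hypothetical_counts)
for name, values in summary.items():
    print(name, {key: round(value, 2) for key, value in values.items()})
```

A rule is more useful when the low- and high-risk rows together account for a large share of patients, which is why the percentage in each group is reported alongside the probabilities.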
Table 4. Performance of Clinical Prediction Rules and Simple Heuristics for the Diagnosis of Influenza
Validation of Simple Heuristics
Finally, we also validated 2 simple heuristics (fever and cough, and fever with cough and acute onset) using the entire combined dataset. The fever and cough heuristic was 61.1% sensitive and 79.8% specific, whereas the fever with cough and acute onset heuristic was 41% sensitive and 93% specific. The posttest probabilities and LRs for these heuristics using the pooled dataset are shown in Table 4.
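As a rough check, the positive and negative LRs implied by these figures can be worked out directly from the reported sensitivity and specificity. Because the inputs are rounded, the results are approximate and may differ slightly from the values in Table 4.

```latex
\begin{align*}
\text{Fever and cough:}\quad
  LR^{+} &= \frac{0.611}{1 - 0.798} \approx 3.0, &
  LR^{-} &= \frac{1 - 0.611}{0.798} \approx 0.49 \\[4pt]
\text{Fever, cough, and acute onset:}\quad
  LR^{+} &= \frac{0.41}{1 - 0.93} \approx 5.9, &
  LR^{-} &= \frac{1 - 0.41}{0.93} \approx 0.63
\end{align*}
```

Both negative LRs are fairly close to 1, which fits the observation that a negative heuristic does little to lower the probability of influenza.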
Selection of the Most Useful Clinical Rule
The simple heuristics (fever and cough, and fever with cough and acute onset) were helpful when positive but not when negative: when a heuristic was negative, the residual probability of influenza was 20.2% to 24.9%, well above the test threshold. Furthermore, the simple heuristics classified only a relatively small percentage of patients as high risk (18.7%–34.2%). They were, therefore, inferior to the clinical decision rules. All 5 clinical decision rules successfully identified patients who are at low, moderate, and high risk for influenza (Table 4), and all the models generalized well with no evidence of overfitting.
Flu scores 2, 4, and 5 had the most patients in the moderate-risk group and the fewest patients in the low- and high-risk groups, making these rules less useful for clinical decision making because relatively few patients would have influenza ruled in (above the test/treat threshold of 50%) or ruled out (below the no-test/test threshold).
Flu scores 1 and 3 both identified a large percentage of patients in the low-risk (26.1%–32.5%) and high-risk (39.4%–40.7%) groups. The ratio of LR+/LR− (a measure of discrimination) was slightly higher for flu score 1 than for flu score 3 (19.2 vs 16.0). Although both scores had 4 variables, flu score 3 is simpler and easier to remember for point-of-care use. Therefore, we selected flu score 3 as the most useful and accurate clinical decision rule for identifying a significant number of patients as low or high risk.
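Flu score 3 is simple enough to express in a few lines of code. The Python sketch below uses the point values and cutpoints given in the abstract (2 points for fever plus cough, 2 for myalgias, 1 each for onset <48 hours and for chills or sweats; 0 to 2 points low risk, 3 points moderate risk, 4 to 6 points high risk). The function and parameter names are illustrative assumptions, and the sketch is not a validated clinical tool.

```python
# Probability of influenza reported for each risk group in the abstract.
REPORTED_RISK = {"low": 0.08, "moderate": 0.30, "high": 0.59}


def flu_score_3(fever, cough, myalgia, onset_under_48h, chills_or_sweats):
    """Return the flu score 3 point total (0 to 6)."""
    score = 0
    if fever and cough:  # points are given only when both are present
        score += 2
    if myalgia:
        score += 2
    if onset_under_48h:
        score += 1
    if chills_or_sweats:
        score += 1
    return score


def risk_group(score):
    """Map a point total to the low-, moderate-, or high-risk group."""
    if score <= 2:
        return "low"
    if score == 3:
        return "moderate"
    return "high"


# Example: fever, cough, and onset within 48 hours, but no myalgia or chills.
example_score = flu_score_3(True, True, False, True, False)
example_group = risk_group(example_score)
print(example_score, example_group, REPORTED_RISK[example_group])
```

Note that fever without cough (or cough without fever) earns no points under this scheme, because the 2 points attach to the combination.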
As a further internal validation, we applied flu score 3 to the original Swiss and US study populations, which had very different prevalences of influenza (53% and 21%, respectively). The rule performed well, with influenza prevalences of 9%, 35%, and 65% in the low-, moderate-, and high-risk groups for the Swiss study population (53% overall influenza prevalence), and 8%, 27%, and 43% in the low-, moderate-, and high-risk groups for the US study population (20% overall influenza prevalence).
Discussion
We were able to develop and validate a simple clinical decision rule that stratifies patients into groups at low, moderate, and high risk for influenza. Combining data from the derivation and validation groups (because the rule performed similarly in both subgroups), approximately 32% of patients fell into the low-risk group and had an 8% likelihood of influenza, below our predetermined test threshold of 10%. Conversely, 39% fell into the high-risk group and had a 59% likelihood of influenza, which was above our treatment threshold of 50%. Thus, over two-thirds of patients did not require further testing.
The likelihood of influenza depends on the baseline probability of influenza in the community, the results of the clinical examination, and, optionally, the results of point-of-care tests for influenza. We determined the probability of influenza during each season based on data from the Centers for Disease Control and Prevention.16 A recent systematic review found that point-of-care tests are approximately 72% sensitive and 96% accurate for seasonal influenza.17 Using these data for seasonal probability and test accuracy, the likelihood ratios for flu score 3, a no-test/test threshold of 10%, and a test/treat threshold of 50%, we have summarized a suggested approach to the evaluation of patients with suspected influenza in Table 5. Physicians wishing to limit use of anti-influenza drugs should consider rapid testing even in patients who are at high risk during peak flu season. Empiric therapy might be considered for patients at high risk of complications.18 Though the Swiss physicians identified a population with a 53% likelihood of influenza using implicit criteria, using this to guide treatment likely would result in overtreatment of otherwise healthy adults who have limited benefit from oseltamivir and zanamivir.
Table 5. Predictive Values Based on Integration of Pretest Probability and Diagnostic Test With Flu Score 3* Using Test and Treatment Thresholds of 10% and 50%, Respectively
Based on these thresholds, neither testing nor treatment should be ordered for patients outside of flu season (which is consistent with usual practice) or for low- or moderate-risk patients during shoulder season. Conversely, high-risk patients during flu season should be treated empirically. Point-of-care testing should be considered for high-risk patients during shoulder season and for moderate-risk patients during flu season.
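The threshold logic behind these recommendations can be sketched in Python as follows. This is an illustration under stated assumptions, not a reproduction of Table 5: the function names are hypothetical, and the rapid-test LRs are derived by treating the reported figures of 72% and 96% as sensitivity and specificity, respectively. The 30% starting probability is the moderate-risk figure reported for the combined dataset.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Apply a likelihood ratio to a probability via odds."""
    odds = pretest_probability / (1 - pretest_probability)
    odds *= likelihood_ratio
    return odds / (1 + odds)


def recommend(probability, test_threshold=0.10, treat_threshold=0.50):
    """Threshold model: below 10% do nothing, at or above 50% treat, else test."""
    if probability < test_threshold:
        return "no test, no treatment"
    if probability >= treat_threshold:
        return "treat empirically"
    return "point-of-care test"


# Assumption: 72% sensitivity and 96% specificity for the rapid test.
RAPID_SENSITIVITY = 0.72
RAPID_SPECIFICITY = 0.96
lr_positive = RAPID_SENSITIVITY / (1 - RAPID_SPECIFICITY)
lr_negative = (1 - RAPID_SENSITIVITY) / RAPID_SPECIFICITY

# A moderate-risk patient (30% probability of influenza in the combined data).
moderate_risk = 0.30
first_step = recommend(moderate_risk)
after_positive_test = posttest_probability(moderate_risk, lr_positive)
after_negative_test = posttest_probability(moderate_risk, lr_negative)
print(first_step)
print(round(after_positive_test, 2), recommend(after_positive_test))
print(round(after_negative_test, 2))
```

In practice the starting probability would also reflect the seasonal prevalence of influenza in the community, so the same score can lead to different actions in and out of flu season.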
Although we did not perform a cost-effectiveness analysis of the strategy described in Table 5, we believe it has the potential to reduce testing and treatment compared with the usual practice of many physicians. For example, in the Swiss population, if a physician treated all patients with influenza-like illness, he or she would treat the entire study population. Using our strategy, he or she would treat only those in the high-risk group and those with a positive rapid test in the moderate-risk group. A physician using rapid antigen tests for all patients with an influenza-like illness would test all these patients, whereas our strategy would result in testing for only those at moderate risk (25% of the total).
Our study had several limitations that should be acknowledged. We combined data from 2 different populations with somewhat different inclusion criteria, although the resulting dataset has the advantage of greater generalizability because it includes patients from 2 countries during 2 different flu seasons and has an overall pretest probability typical of that for influenza season.16 Also, data collection was limited to adults, so it is not clear whether these findings would apply to younger patients. Although simple, the point scoring may be too complex to remember and would be aided by programming as an application for smart phones and/or the Internet. Although we performed an internal validation using split-sample methodology, as well as re-evaluation of each score in the original individual study populations, a more robust validation will involve prospective evaluation in a completely separate population. Finally, it would be useful to evaluate the clinical decision rule during several different flu seasons with different viral subtypes. Because we used data from endemic, seasonal influenza, these results should be applied cautiously if at all to any future pandemic of novel influenza strains such as the recent H1N1 outbreak.
Conclusions
We have developed and validated a clinical decision rule that successfully classifies patients as being at low, moderate, or high risk for influenza based on 4 simple clinical findings. This clinical rule was designed to be consistent with the threshold model of medical decision making, so it can most effectively guide decisions about testing and treatment. Further research is required to develop test and treatment thresholds that are more rigorously derived and that incorporate patient preferences. It also will be important to validate this model prospectively in diverse populations and settings and outside of flu season.
Notes
This article was externally peer reviewed.
Funding: none.
Conflict of interest: none declared.
This article received a Distinguished Paper Award at NAPCRG 2011.
- Received for publication May 2, 2011.
- Revision received August 27, 2011.
- Accepted for publication September 6, 2011.