Abstract
Introduction: To develop and externally validate a simple risk score for influenza diagnosis based on vaccination history and patient-reported symptoms.
Methods: Adult outpatients in 12 European countries presenting during flu season with a chief complaint of acute cough between 2007 and 2010 were used to derive and internally validate the risk score (Genomics to combat Resistance against Antibiotics in Community-acquired LRTI in Europe [GRACE] data), and contemporary US data were used for external validation (EAST-PC data). Patient-reported symptoms were recorded and polymerase chain reaction (PCR) was used to diagnose influenza. The score was derived using logistic regression, with points assigned based on the β-coefficients. Accuracy was measured using influenza prevalence in each risk group and the area under the receiver operating characteristic curve (AUC). Calibration was assessed by plotting observed versus expected values.
Results: We developed a risk score with 6 items (subjective fever, interfered with usual activity, headache, wheeze, phlegm, and recent flu vaccine) and a range from -5 to 6 points. The AUC was 0.75 for both derivation and internal validation subgroups. The prevalence of influenza was 15.1% in the GRACE data and 14.4% in the EAST-PC data. The percentage with influenza in the low, moderate, and high-risk groups was 6.8%, 21.8%, and 35.3% in the external validation population (EAST-PC data). The low-risk group included 61% of participants in the external validation. Calibration was excellent.
Conclusions: We developed and externally validated the FluScoreVax risk score, available as an app. It classifies 61% of patients as low risk, of whom only 7% had influenza.
- Clinical Prediction Rule
- Evidence-Based Medicine
- Infectious Diseases
- Influenza
- Influenza Vaccines
- Logistic Regression
- Respiratory Diseases
- Risk Score
- Telehealth
- Vaccination
Introduction
Seasonal influenza accounts for losses in workforce productivity, a strain on health services, and an average of approximately 40,000 deaths and between 140,000 and 710,000 hospitalizations in the United States every year.1,2 Antiviral drugs such as baloxavir and oseltamivir can reduce the duration and severity of influenza symptoms for both influenza virus type A and B, but therapy should ideally be initiated within 24 to 36 hours of symptom onset.3–5 Prompt diagnosis of influenza may guide the use of antiviral therapy and infection control measures. In addition, identifying low risk patients who require no testing or treatment for influenza can improve clinical efficiency.
Clinical prediction rules or risk scores use a combination of signs, symptoms, and sometimes simple point of care tests to assist diagnostic and therapeutic decisions. As has been demonstrated with other conditions such as sore throat,6 pulmonary embolism,7,8 and ankle injury,9 clinical prediction rules can help establish a patient’s probability of a given condition and inform the interpretation of subsequent diagnostic tests. As influenza and other infections are increasingly managed via home tests and telehealth, risk scores designed for this setting may be useful to guide decision making.10
The original FluScore was developed in a combined US/Swiss population and has been prospectively validated in 2 different populations.11,12 While classification of low risk patients was similar in the derivation and external validation groups (likelihood ratio 0.20 to 0.23), prospective validation in the high risk groups was less accurate (likelihood ratio 1.46 to 1.67 compared with 2.72 in the original derivation study). One limitation of the FluScore was that vaccination data were not available when it was derived. In this study, we use a large prospectively collected European dataset of adult outpatients with cough during flu season to develop and internally validate a novel risk score for influenza that incorporates flu vaccination status. We then externally validate it in a contemporary US population of adult outpatients, again during flu season.
Methods
Informed Consent
The original GRACE (Genomics to combat Resistance against Antibiotics in Community-acquired LRTI in Europe) study received full human subjects approval for the original data collection and all participants provided signed informed consent. The current study was cross-sectional and used previously collected, deidentified data; this use was judged by the GRACE investigators to fall within the scope of the original approval. EAST-PC (Enhancing Antibiotic Stewardship in Primary Care) was a prospective observational study of adults presenting to an outpatient setting with acute cough. It was approved by the Western Institutional Review Board (study number 1253415) and the IRBs of each participating institution. EAST-PC was sponsored by the federal Agency for Healthcare Research and Quality (grant number 1R01HS025584-01A1).
Populations Studied
For derivation and internal validation, we used data from the previously reported GRACE dataset, which recruited adults with acute cough or suspected of having a lower respiratory tract infection (LRTI) from 16 primary care research networks in 12 European countries between October, 2007 and April, 2010.13,14 All had cough as the main or dominant symptom, which had been present for less than 28 days. The population was randomly divided into derivation (65%, n = 992) and validation (35%, n = 532) subgroups.
For external validation, we used data from the EAST-PC study. Participants were identified when they registered in a primary care or urgent care clinic in the Washington, D.C.; Madison, Wisconsin; or Athens, Georgia metro areas between June, 2019 and April, 2023. All participants were between 18 and 75 years of age and reported a cough for no more than 14 days accompanied by at least one lower respiratory or systemic symptom. In both datasets, only patients presenting during flu season were included.
Influenza Season
For the GRACE data, we defined the influenza season each year based on annual surveillance reports published by the European Centre for Disease Prevention and Control.15–17 These dates were Dec 15, 2007 to Mar 15, 2008; Dec 1, 2008 to Mar 21, 2009; and Sep 15, 2009 to Dec 31, 2009.
For the EAST-PC population, we excluded data from the 2020 to 2021 flu season (due to the pandemic, only 20 patients were enrolled during that period, none of whom had influenza). Dates for each year’s flu season were based on Centers for Disease Control and Prevention (CDC) reporting and were Oct 1, 2019 to Apr 4, 2020; Oct 1, 2021 to Jun 11, 2022; and Oct 2, 2022 to Sep 9, 2023 (although our data collection ended in April, 2023).18
Data Collected
For the GRACE data, general practitioners recorded signs and symptoms at the index visit, as well as influenza vaccination status. A patient was considered vaccinated if they reported having been vaccinated before the visit and before or during the current flu season. A symptom was considered present if reported at any level of severity. All participants had a nasopharyngeal swab for polymerase chain reaction (PCR) for influenza A and B.19
For the EAST-PC study, patients self-reported the presence or absence of symptoms and severity (absent, mild, moderate or severe) to research assistants. A combined midturbinate and pharyngeal swab was obtained and tested for 47 pathogens including influenza A and B by the CDC respiratory laboratory.
Derivation of the Risk Score
Univariate analysis of the GRACE dataset was used to identify symptoms, signs, and historic factors such as flu vaccine status associated with the diagnosis of influenza at P < .2. Those predictors were entered into a logistic regression model and backwards stepwise selection was used to create the final model. The logistic regression coefficients were used to create a simplified risk score, dividing each β-coefficient by the smallest β-coefficient and rounding off to whole numbers.20
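The point assignment described above can be sketched in a few lines of Python. This is only an illustration: the predictor names and coefficient values are hypothetical placeholders rather than the fitted values from Table 2, and reading "smallest" as smallest in absolute value is an assumption.

```python
def coefficients_to_points(betas):
    """Convert logistic regression coefficients to whole-number points."""
    # Assumption: "smallest" means smallest in absolute value, so the
    # weakest predictor is worth exactly 1 point (or -1 if protective).
    smallest = min(abs(beta) for beta in betas.values())
    return {name: round(beta / smallest) for name, beta in betas.items()}


def total_score(points, present):
    """Sum the points for the predictors a patient reports."""
    return sum(points[name] for name in present)


# Hypothetical coefficients for illustration only (not from the study).
example_betas = {
    "symptom_a": 0.45,
    "symptom_b": 0.92,
    "symptom_c": 1.38,
    "protective_factor": -0.88,
}
example_points = coefficients_to_points(example_betas)
```

With these placeholder values the weakest predictor is worth 1 point and the protective factor receives negative points, which is how a score of this kind can range below zero.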
Visual inspection of the distribution of influenza prevalence by point score in the derivation group was used to identify cutoffs to define low, moderate and high risk groups for influenza. In a previous study we had identified test thresholds of 5% for US physicians and 31% for Swiss physicians; the difference was hypothesized to be due to differences in availability and use of rapid tests, which are rarely used in Swiss general practice and widely used in the US. The same study found treatment thresholds of 55% for US physicians and 67% for Swiss physicians, again hypothesized to be due to a greater propensity to prescribe oseltamivir in the US.21 Our primary goal was therefore to identify a low risk group with a probability of influenza less than 15%, and a high risk group with a probability of influenza greater than 50%. The risk score and identified cutoffs were then applied to the internal validation subgroup of the GRACE dataset, and to the external validation population from the EAST-PC study.
Classification accuracy was measured using the prevalence of influenza and stratum specific likelihood ratios for each risk group. Area under the receiver operating characteristic curve (AUC) was used as a measure of overall accuracy, with plots of observed vs expected values to evaluate calibration. Logistic regression was performed using Stata (version 18)22 and R (version 4.3.1)23 was used to create receiver operating characteristic curves and calibration belts for the derivation and internal and external validation groups. R Shiny was used to create an online interactive version of the risk score.
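For readers less familiar with these measures, the following Python sketch shows one way to compute stratum specific likelihood ratios and an AUC from counts of patients with and without influenza in each ordered risk group. The counts are made up for illustration and are not study data; the function names are likewise hypothetical, and the study itself used Stata and R rather than this code.

```python
def stratum_likelihood_ratios(strata):
    """Likelihood ratio for each stratum.

    strata: list of (flu_positive, flu_negative) counts, one per group.
    LR = share of all positives in the stratum / share of all negatives.
    """
    total_pos = sum(pos for pos, _ in strata)
    total_neg = sum(neg for _, neg in strata)
    return [(pos / total_pos) / (neg / total_neg) for pos, neg in strata]


def auc_from_strata(strata):
    """AUC by pairwise comparison; strata ordered from lowest to highest risk.

    Counts the share of (positive, negative) pairs in which the positive
    patient falls in a higher stratum, with ties counted as one half.
    """
    total_pos = sum(pos for pos, _ in strata)
    total_neg = sum(neg for _, neg in strata)
    concordant = 0.0
    neg_below = 0  # negatives in strictly lower strata
    for pos, neg in strata:
        concordant += pos * (neg_below + 0.5 * neg)
        neg_below += neg
    return concordant / (total_pos * total_neg)


# Hypothetical counts for low, moderate, and high risk groups.
example_strata = [(10, 200), (20, 75), (20, 25)]
example_lrs = stratum_likelihood_ratios(example_strata)
example_auc = auc_from_strata(example_strata)
```

The same functions apply if each stratum is a single point value rather than a risk group. A stratum with no influenza-negative patients would need separate handling, which this sketch omits.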
Patient and Public Involvement
Patients and the public were not involved in the design, analysis, or writing of this study and manuscript.
Results
Participants
The characteristics of participants in the GRACE study during the defined influenza season are shown in Table 1. The mean age of patients presenting during flu season was 50 years, with a range of 18 to 92 years, and 22.6% had received a flu vaccine in the previous year. Patients with influenza A or B were significantly more likely to report fever, chest discomfort, myalgias, headache, generally feeling unwell, and that illness interfered with their activities than those without influenza. Patients with influenza were significantly less likely to report phlegm.
Table 1. Characteristics of Included Patients Limited to Those Presenting During Flu Season in the Derivation (GRACE) Population, 2007 to 2010
Logistic Regression and Derivation of the FluScoreVax Risk Point Score
The results of the logistic regression and the assignment of point scores are shown in Table 2. Using the derivation subgroup we also explored risk scores with more complex point assignments as well as a risk score that incorporated C-reactive protein and lung examination findings as predictors, but these added complexity and data burden and did not add significantly to the accuracy or predictive power of the model. The final risk score had 6 predictors and a range from -5 to 6 points. The prevalence of influenza at each level of the risk score in the derivation and internal validation groups is shown in Appendix Tables 1 and 2.
Table 2. Final Logistic Regression Model with Assigned Points Based on Logistic Regression Coefficients
Accuracy and Calibration of the Model
Classification accuracy for the FluScoreVax risk score is summarized in Table 3 for the derivation and internal validation subgroups using GRACE data. The percentage with influenza in the low, moderate and high-risk groups was 6.1%, 21.4% and 40.0% in the internal validation subgroup. The corresponding stratum specific likelihood ratios (SSLRs) were 0.37, 1.55, and 3.77. The percentage of participants classified as low, moderate, and high risk for influenza was 58.6%, 27.2% and 14.1% respectively. The overall accuracy of the model as measured by the AUC was 0.745 (95% CI 0.70 to 0.79) for the derivation subgroup and 0.748 (95% CI 0.69 to 0.81) for the internal validation subgroup and is shown in Figure 1A, with no significant difference between groups. Similarly, calibration belt plots for derivation and validation subgroups are shown in Figure 1B and 1C and illustrate excellent calibration in both subgroups, especially in the validation subgroup.
Figure 1. ROC curves for the derivation and internal validation subgroups using the GRACE data (A); calibration belt for the derivation group using the GRACE data (B) and calibration belt for internal validation using the GRACE data (C).
Table 3. Classification Accuracy of the FluScoreVax Risk Score for Influenza.
External Validation
There were differences between the EAST-PC and GRACE cohorts, including the maximum duration of symptoms (14 vs 28 days), the mean duration of symptoms before presentation (5.1 vs 9.7 days), and the mean age of participants (38.7 vs 49.7 years). We therefore compared the frequency of symptoms in the original GRACE study with those in the EAST-PC defined 2 ways, as presence of any symptom and as presence of at least a moderately severe symptom. We then calculated the difference between prevalences to see which definition better matched the patient self-reports of symptoms from the GRACE study (Appendix Table 4). Based on these data, we calculated the FluScoreVax using all symptoms reported as at least moderate severity for the EAST-PC study participants.
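The comparison of symptom definitions can be illustrated with a short Python sketch. The symptom frequencies below are invented for illustration and are not the values in Appendix Table 4, and summarizing the differences as a mean absolute difference is an assumption, since the text states only that differences between prevalences were calculated.

```python
def mean_absolute_difference(reference, candidate):
    """Average absolute gap in symptom frequency across shared symptoms."""
    shared = reference.keys() & candidate.keys()
    return sum(abs(reference[s] - candidate[s]) for s in shared) / len(shared)


# Hypothetical symptom frequencies (proportion of patients reporting each).
reference_cohort = {"fever": 0.35, "headache": 0.55, "phlegm": 0.70}
candidate_definitions = {
    "any_severity": {"fever": 0.60, "headache": 0.80, "phlegm": 0.90},
    "moderate_or_worse": {"fever": 0.30, "headache": 0.50, "phlegm": 0.65},
}

# Pick the definition whose frequencies sit closest to the reference cohort.
best_definition = min(
    candidate_definitions,
    key=lambda name: mean_absolute_difference(
        reference_cohort, candidate_definitions[name]
    ),
)
```

With these invented numbers the "moderate_or_worse" definition is the closer match, which mirrors the direction of the choice reported above without reproducing the actual data.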
The classification accuracy is shown in Table 3. It was similar to that in the GRACE study, with low, moderate, and high-risk groups having flu prevalences of 6.8%, 21.8%, and 35.3%. The percentage of participants classified as low, moderate and high risk in the external validation was also similar to that of the derivation and internal validation subgroups: 60.7%, 26.5%, and 12.7% respectively. Overall accuracy as measured by the AUC was very good for the FluScoreVax in the external validation population, with an AUC of 0.735 (95% CI 0.673 to 0.798) and is shown in Figure 2A. Calibration was excellent as shown by the calibration belt in Figure 2B.
Figure 2. ROC curve for the external validation group using the EAST-PC data (A); calibration belt for the external validation group using the EAST-PC data (B).
Discussion
We have successfully developed and both internally and externally validated a simple risk score for influenza A or B. It had very good overall accuracy (AUC 0.75 in both derivation and internal validation groups, and 0.73 in the external validation) and excellent calibration in the internal and external validation groups. While symptoms and severity of influenza may vary from year to year, by using data from 3 flu seasons the FluScoreVax risk score has the potential to be more generalizable. This is supported by the robust performance in 3 influenza seasons on a different continent 12 years later.
The primary potential value of the FluScoreVax is to identify patients at low risk of influenza. These patients do not require testing or treatment, while patients in the moderate risk group could have a rapid test and those in the high risk group could either have a test or be treated empirically based on clinician judgment. However, use of diagnostic tests for influenza varies greatly between countries, and is influenced by availability, reimbursement, medical culture, whether use of anti-influenza drugs is routine, and patient expectations.24 For example, the usual practice in the United States is to test most or all patients presenting with influenza-like illness (ILI) during flu season using a point-of-care test for influenza and to prescribe baloxavir or oseltamivir if positive. On the other hand, in countries such as the United Kingdom point of care testing for influenza is uncommon. Thus, this risk score may have the greatest utility in telehealth settings and in health systems where tests for influenza are widely used.
A traditional medical practice is to perform a diagnostic test and then target treatment to the infection that is detected. While a meta-analysis found no evidence that oseltamivir provides a reduction in symptom duration for patients not infected with influenza,25 a recent pragmatic primary care trial of treatment of influenza-like illness (ILI) found a similar benefit among patients with and without influenza by PCR and that empiric therapy was cost-effective.5,26 The latter findings would support a strategy of empirically providing an anti-influenza drug to patients with influenza-like illness, especially if they are older or have more severe symptoms (groups that had a greater symptomatic benefit in that study).
Our study has several strengths. In addition to very good accuracy and excellent calibration, the score is simple and can be used via telehealth as it relies only on patient reported symptoms and history of flu vaccination. As noted earlier, our primary goal was to identify low risk patients who did not require testing or treatment. We were successful, as the FluScoreVax classified 61% of patients into the low-risk group in the external validation with a prevalence of influenza of only 6.8%. This is considerably below the test threshold of 12% identified in a previous study of primary care physicians21 and may be useful as a way to identify patients who may not require testing or in person evaluation. This in turn has the potential to reduce costs. Patients in the moderate or high risk groups could be offered point of care tests for influenza, which are increasingly available for home use.27 They could also be empirically treated if additional information (such as a confirmed case in a household member) suggested influenza.
The prevalence of influenza was 15.1% in the GRACE data and 14.4% in the EAST-PC data. The peak prevalence of influenza was 24% to 30% between 2015 and 2020 (before the pandemic). Given the likelihood ratio of 0.41 for the low-risk group, that would correspond to a prevalence of influenza of 11.5% to 14.9% in the low-risk group even at the peak of flu season. This is close to the 12% test threshold previously discussed, so the risk score should be used thoughtfully when the point prevalence exceeds 25%.
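The arithmetic behind these figures is the standard conversion of a pretest probability to a post-test probability through odds. A minimal Python sketch follows; the function name is a hypothetical choice, while the likelihood ratio of 0.41 and the peak prevalences of 24% and 30% come from the paragraph above.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pretest probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


LOW_RISK_LR = 0.41  # likelihood ratio for the low-risk group, from the text

# Peak-season prevalence range quoted above (24% to 30%).
low_end = post_test_probability(0.24, LOW_RISK_LR)
high_end = post_test_probability(0.30, LOW_RISK_LR)

print(f"{low_end:.1%} to {high_end:.1%}")  # prints "11.5% to 14.9%"
```

The same function can be applied to any local point prevalence to check whether the low-risk group still falls below a chosen test threshold.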
We saw significant differences in patient reporting and behavior between the GRACE and EAST-PC cohorts. Others have reported between-country differences in reported symptoms for patients with menopause28 and for patients with coronavirus disease 2019 (COVID-19).29 In Europe, it may therefore be more appropriate to use the presence of any symptom in the risk score, while in the US one might ask about moderate to severe symptoms.
While the duration of illness before the index visit was significantly shorter in patients with influenza (5 vs 10 days), this is likely to depend on cultural and health system factors. For example, a significantly shorter duration of cough before presentation with influenza has been observed in US studies, which may be driven by heavy use of drugs like oseltamivir, which must be given within 2 days of onset. Future studies could include a question less tied to the number of days before seeking care, asking instead about the rapidity of onset of moderate to severe symptoms.
It would be most useful to know about the probability of influenza in patients presenting within 2 days of onset who are potentially eligible for anti-influenza drugs. However, our studies had too few patients presenting that early. We are part of a team that has assembled a dataset from multiple previous studies to perform individual patient data meta-analysis and we will explore using those data to derive and validate a risk score limited to patients with 2 or fewer days of symptoms.
We must also acknowledge limitations. This risk score was validated using symptoms reported by patients as being moderate to severe, and this same approach should be used in practice. There was not a perfect correlation between an increase in the score by 1 point and a corresponding linear increase in the probability of influenza. Validation in other populations should be pursued. The derivation data are now 12 to 15 years old and predate the COVID-19 pandemic. While there is no evidence that the body’s symptomatic response to influenza infection has changed, that is possible, and could influence usefulness of the FluScoreVax. Visual inspection of the derivation data to identify cutoffs is subjective, but readers can view the results in the Appendix tables and could conceivably use other cutoffs that fit their populations and values better. While very good at identifying a large group of patients as low risk, the score was not good at identifying a high risk group that exceeded the treatment threshold. These patients may benefit from shared decision making regarding further testing and/or empiric treatment. Finally, fever is an important predictor in our risk score, but may be low grade or absent in older persons. In the GRACE data fever was less strongly associated with influenza in patients over age 70 years. Therefore this rule should be used with caution in that population.
Summary
We have developed and both internally and externally validated a simple risk score (FluScoreVax) that has very good accuracy and excellent calibration for the risk of influenza in outpatients presenting with cough during flu season. In the external validation, it classified over 60% of patients in the low-risk group, with only a 6.8% probability of influenza. It is available as a free online calculator.
Acknowledgments
The investigators would like to thank Drs. Chris Butler, Theo Verheij, and all the GRACE investigators for their generosity in sharing these data.
Appendix
Prevalence of Influenza at Each Level of the Point Score in the Derivation, Internal Validation, and External Validation Subgroups and Comparison of Symptom and Vaccination Frequency Between the Populations
Appendix Table 1. Derivation Subgroup (GRACE)
Appendix Table 2. Internal Validation Subgroup (GRACE)
Appendix Table 3. External Validation Subgroup (EAST-PC)
Appendix Table 4. Comparison of Symptom and Vaccination Frequency Between the Derivation/Internal Validation Population (GRACE) and External Validation Population (EAST-PC)
Notes
This article was externally peer reviewed.
Funding: Drs. Ebell, Merenstein and Barrett were supported by a grant from the Agency for Healthcare Research and Quality to conduct the EAST-PC study. EAST-PC was sponsored by AHRQ (grant number 1R01HS025584-01A1).
Conflict of interest: The authors have no financial or intellectual conflict of interest to share.
To see this article online, please go to: http://jabfm.org/content/38/3/401.full.
- Received for publication October 7, 2024.
- Revision received January 24, 2025.
- Accepted for publication February 17, 2025.








