Abstract
Introduction: The Certificate of Added Qualification in Sports Medicine (CAQSM) is administered by the American Board of Family Medicine (ABFM) and cosponsored by the American Board of Emergency Medicine (ABEM), the American Board of Pediatrics (ABP), and the American Board of Physical Medicine and Rehabilitation (ABPMR). This article reviews the methodology used to determine the weighting of CAQSM assessments across five content domains.
Methods: A survey comprising 231 sports medicine clinical activities asked respondents to rate how often they perform each clinical activity (the Frequency Index, FI) and to rate the level of risk to the patient if the condition is misdiagnosed or not managed properly (the Index of Harm, IoH). A random sample of 800 diplomates representing Sports Medicine Diplomates from five member boards was selected to participate. Rasch modeling was employed to analyze the survey results. The Sports Medicine Advisory Committee voted to weight the content domains using an equal balance of the FI and IoH survey results.
Results: The survey response rate was 42.2%. Survey respondents were demographically representative of the overall sampling frame across key variables, except for primary certifying board. Weights for the four empirically weighted content domains were as follows: Musculoskeletal Conditions (32.1%), Medical Conditions (30.2%), Care of Emergency Conditions (22.4%), and Preventive Aspects of Sports Medicine (10.4%).
Discussion: This is the first nationally representative survey of sports medicine clinical activities performed in the United States. These survey data were used to develop a new sports medicine blueprint. In the future, these data may be used to develop curricula or assessments.
- Athletic Injuries
- Certification
- Clinical Competence
- Examination Questions
- Licenses
- Program Evaluation
- Psychometrics
- Quantitative Research
- Specialty Boards
- Sports Medicine
Introduction
In 2023, the Sports Medicine Advisory Committee approved an updated blueprint for the Sports Medicine Examination, scheduled for implementation in 2026. The Sports Medicine Advisory Committee is comprised of representatives of the administrative board (American Board of Family Medicine), the cosponsoring boards (American Board of Pediatrics, American Board of Physical Medicine and Rehabilitation, and American Board of Emergency Medicine), representatives of the American Medical Society for Sports Medicine (AMSSM), and a public member with an extensive understanding of the practice of sports medicine. This committee oversees the work of the Sports Medicine Assessment Committee, reviews the blueprint, and approves the passing standard for the sports medicine assessments.
This revised blueprint aligns examination content more closely with contemporary clinical practice. The blueprint serves as a critical framework for examinations, defining the content domains assessed and the percentage of exam questions allocated to each domain. In this manuscript, we describe the methodology used to determine the appropriate weighting of exam questions across the five content domains (shown below) approved by the Sports Medicine Advisory Committee.
Domain I: Foundations of Practice
Domain II: Preventive Aspects of Sports Medicine
Domain III: Care of Emergency Conditions
Domain IV: Diagnosis, Management, and Epidemiology of Sports- and Exercise-Related Musculoskeletal Conditions
Domain V: Diagnosis, Management, and Epidemiology of Sports- and Exercise-Related Medical Conditions
Domain I (Foundations of Practice) was the only content domain not defined by specific, observable clinical activities; therefore, its weighting could not be determined empirically. Instead, the Sports Medicine Advisory Committee relied on expert judgment to assign it a fixed weight of 5%. In contrast, the remaining four clinical content domains contained observable clinical activities, enabling an empirical approach. Initial weights for these domains were established based on the total number of distinct clinical activities included within each domain. These initial percentages were subsequently refined using two key factors:
The frequency with which Sports Medicine physicians from various specialties perform these clinical activities.
The potential risk of patient harm if an activity is performed incorrectly or inadequately.
The process of determining the final weights for the content domains involved two analytical phases.
In Phase 1 (Representative Data Sample), data were collected from a representative sample of physicians certified through previous Sports Medicine (SPMED) examinations across four American Board of Medical Specialties (ABMS) member boards, including the American Board of Family Medicine (ABFM), the American Board of Emergency Medicine (ABEM, including physicians initially certified through the American Board of Internal Medicine (ABIM)), the American Board of Physical Medicine and Rehabilitation (ABPMR), and the American Board of Pediatrics (ABP). These physicians provided ratings on the frequency of performing clinical activities and the potential risk of patient harm associated with these activities if performed incorrectly or inadequately.
In Phase 2 (Content Domain Weights), the collected ratings were analyzed to assign specific weights to each clinical activity. These weights were then used to refine and finalize the proportional distribution of exam questions across the content domains.
Method: Representative Data Sample
Survey
The survey comprised the 231 clinical activities grouped in the clinical content domains as specified above. For each activity, respondents were instructed to provide two ratings. The first was to rate how often they expect to perform the activity, using a 5-point Likert scale (Daily, Weekly, Monthly, Few Times Per Year, Rarely or Never).1 The second was to rate the general level of risk to the patient if the condition is misdiagnosed or not managed properly, using a 4-point Likert scale (Minimal, Moderate, Considerable, Extreme).1
Sampling Frame
A sampling frame is a comprehensive list representing the population from which a sample is randomly selected. In this study, the sampling frame included all currently certified Sports Medicine Diplomates from five ABMS member boards: ABFM, ABEM, ABIM, ABPMR, and ABP. Physicians were excluded if they were initially certified in 2024 or later, scheduled to lose certification by December 31, 2024, clinically inactive (ABFM only), declined email contact (ABFM only), or lacked a valid email address (ABFM only). After applying these criteria, the final sampling frame consisted of 4,791 eligible physicians.
Sample
From this sampling frame, a random sample of 800 diplomates was selected and invited to participate in the survey. Prior to distributing invitations, demographic characteristics of the selected sample—including race/ethnicity, certifying board, gender, medical school origin (US or International), medical degree (MD, DO, or other), and initial certification year—were compared against the full sampling frame to ensure representativeness.
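The draw described above can be sketched in a few lines of Python. This is only an illustration: the frame size (4,791) and sample size (800) come from the text, while the integer identifiers and the seed are placeholders, and the article does not state which software or procedure was actually used.

```python
import random

FRAME_SIZE = 4791   # eligible physicians in the sampling frame (from the text)
SAMPLE_SIZE = 800   # diplomates invited (from the text)

# Placeholder identifiers; a real frame would hold diplomate records.
frame_ids = list(range(1, FRAME_SIZE + 1))

# A fixed seed (arbitrary, hypothetical) makes the draw reproducible.
rng = random.Random(2025)

# Simple random sample without replacement.
sample_ids = rng.sample(frame_ids, SAMPLE_SIZE)

print(len(sample_ids), len(set(sample_ids)))
```

Sampling without replacement guarantees that no diplomate is invited twice, which is why the two printed counts match.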
Survey Administration
Invitations to participate in the survey were emailed to the selected diplomates on January 22, 2025. The survey was estimated to require approximately 30 to 45 minutes to complete. Incentives varied by certifying board: ABFM diplomates received 5 certification points toward their certification along with a $200 honorarium; ABEM diplomates received a $200 stipend; no incentives were offered to ABPMR or ABP diplomates. A total of six reminder emails were sent between January 22 and February 24, 2025, to diplomates who had not yet responded. Additionally, the survey deadline was extended three times to improve response rates.
Institutional Review Board (IRB)
The study procedures were reviewed by senior ABFM executive staff to verify compliance with ABFM privacy policies. Additionally, the American Academy of Family Physicians Institutional Review Board determined that the study met criteria for IRB exemption.
Method: Content Domain Weights
The frequency and risk-of-harm ratings from the survey were analyzed separately using a Rasch rating scale model to produce two interval-level scales: the Frequency Index (FI) and the Index of Harm (IoH).2 The Index of Harm has been used previously with ABFM blueprint validity studies.3–5 Rasch modeling was employed due to its ability to generate interval scales that are independent of individual raters’ baseline rating tendencies, thus enabling unbiased and meaningful comparisons across clinical activities.
More specifically, the model is described as

$$\ln\left(\frac{P_{aik}}{P_{ai(k-1)}}\right) = B_a - D_i - F_k$$

where:
$P_{aik}$ is the probability that Rater $a$, when rating Activity $i$, would select Rating Category $k$,
$P_{ai(k-1)}$ is the probability that the rater would select Rating Category $k-1$,
$B_a$ is the severity of Rater $a$,
$D_i$ is the difficulty of endorsing Activity $i$,
$F_k$ is the difficulty of endorsing Rating Category $k$ relative to Rating Category $k-1$, where the categories are numbered consecutively.
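To make the model concrete, the following Python sketch computes the category probabilities implied by a rating scale model of this kind. It assumes the common adjacent-category form, in which the log-odds of choosing category k over category k-1 equal the rater measure minus the activity measure minus the step difficulty; the function name, argument names, and the numeric values are hypothetical and are not taken from the study.

```python
import math

def category_probabilities(rater, activity, thresholds):
    """Return P(category k), k = 0..m, under a rating scale model.

    rater      -- rater measure in logits (illustrative value)
    activity   -- difficulty of endorsing the activity, in logits
    thresholds -- step difficulties F_1..F_m, in logits
    """
    # Category 0 has log-numerator 0; each further category adds
    # (rater - activity - F_k) to the running sum.
    log_numerators = [0.0]
    for step in thresholds:
        log_numerators.append(log_numerators[-1] + (rater - activity - step))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    numerators = [math.exp(value - peak) for value in log_numerators]
    total = sum(numerators)
    return [n / total for n in numerators]

# Hypothetical measures for a four-category scale (three steps).
probs = category_probabilities(rater=0.5, activity=-0.2, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

By construction, the log of the ratio of adjacent probabilities recovers the rater measure minus the activity measure minus the corresponding step difficulty.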
Rater Quality Control
To ensure data quality, raters whose responses substantially deviated from model expectations were identified using an outfit mean square statistic (≥ 2.75) on the risk-of-harm ratings. Both the frequency and risk-of-harm datasets were reanalyzed after excluding these misfitting raters.
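As an illustration of the screening statistic, the sketch below computes an outfit mean square for one rater as the average squared standardized residual across activities. It assumes the adjacent-category rating scale form (rater minus activity minus step difficulty); all measures and ratings are toy values, and only the 2.75 cutoff comes from the text.

```python
import math

def category_probabilities(rater, activity, thresholds):
    """Category probabilities (k = 0..m) under a rating scale model."""
    log_numerators = [0.0]
    for step in thresholds:
        log_numerators.append(log_numerators[-1] + (rater - activity - step))
    peak = max(log_numerators)
    numerators = [math.exp(value - peak) for value in log_numerators]
    total = sum(numerators)
    return [n / total for n in numerators]

def outfit_mean_square(ratings, rater, activities, thresholds):
    """Mean of squared standardized residuals for one rater."""
    squared_residuals = []
    for observed, activity in zip(ratings, activities):
        probs = category_probabilities(rater, activity, thresholds)
        expected = sum(k * p for k, p in enumerate(probs))
        variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
        squared_residuals.append((observed - expected) ** 2 / variance)
    return sum(squared_residuals) / len(squared_residuals)

MISFIT_CUTOFF = 2.75                         # threshold stated in the text
activities = [-2.0, -1.0, 0.0, 1.0, 2.0]     # hypothetical activity measures
thresholds = [-1.0, 0.0, 1.0]                # hypothetical step difficulties

# Ratings that follow the shared hierarchy versus ratings that reverse it.
careful = outfit_mean_square([3, 2, 2, 1, 0], 0.0, activities, thresholds)
careless = outfit_mean_square([0, 1, 1, 2, 3], 0.0, activities, thresholds)
print(careful < MISFIT_CUTOFF, careless >= MISFIT_CUTOFF)
```

Because the residuals are squared and unweighted, a few highly unexpected ratings are enough to push the statistic well above the cutoff, which is what makes it sensitive to careless responding.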
Adjusting the Weights for Each Clinical Activity
The baseline percentage of questions recommended for each of the four clinical content Blueprint domains was initially determined by counting the number of clinical activities within each domain, with equal weight assigned to each activity. To refine these weights, the survey collected ratings on both the frequency of performing each clinical activity and the associated risk of patient harm from a random sample of Diplomates. The Sports Medicine Advisory Committee then voted on how to combine the two ratings. The committee reached consensus that frequency and risk of harm should receive equal weight, concluding that neither could be judged more important than the other.
From these data, three additional sets of clinical activity weights were created:
Frequency-based weights (FI): Frequency ratings from the survey were analyzed using the Rasch model, producing a frequency scale. Since the Rasch-derived scale had an arbitrary zero origin, a constant was added to ensure all resulting weights were positive. These adjusted values were normalized (divided by their sum and multiplied by 100) so that the total FI weights equaled 100.
Risk-of-harm-based weights (IoH): The risk-of-harm ratings were similarly analyzed with the Rasch model to create a risk scale. Like the FI weights, a constant was added to shift the scale positively, and the resulting values were normalized, again summing to 100.
Combined FI and IoH weights: A third set of weights was calculated by averaging the normalized FI and IoH weights, equally balancing (50%) considerations of frequency and risk of harm.
These three empirically derived sets of weights were subsequently rescaled proportionally to fit within a 95% weighting framework, accommodating the predefined 5% allocated to the Foundations of Practice domain. This adjusted framework provided alternatives to the baseline weighting, further enhancing alignment of exam content with actual clinical practice patterns and patient safety considerations.
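The shift, normalize, average, and rescale steps described above can be sketched as follows. The logit values and the shift constant are hypothetical placeholders, since the article reports neither the activity measures in this section nor the constant that was added; only the 50/50 averaging and the 95% rescaling come from the text.

```python
def to_percent_weights(measures, shift):
    """Shift logit measures to be positive, then normalize to sum to 100."""
    shifted = [m + shift for m in measures]
    if min(shifted) <= 0:
        raise ValueError("shift constant must make every measure positive")
    total = sum(shifted)
    return [100.0 * value / total for value in shifted]

# Hypothetical Rasch measures (logits) for four activities.
fi_logits = [1.2, -0.4, 0.3, -1.1]
ioh_logits = [-0.8, 0.9, 0.1, -0.2]
SHIFT = 3.0  # placeholder; the constant actually used is not reported

fi_weights = to_percent_weights(fi_logits, SHIFT)
ioh_weights = to_percent_weights(ioh_logits, SHIFT)

# Equal (50%) balance of frequency and risk of harm.
combined = [(f + h) / 2 for f, h in zip(fi_weights, ioh_weights)]

# Rescale so the clinical activities share 95% of the examination.
final = [0.95 * w for w in combined]
print([round(w, 2) for w in final])
```

Domain weights would then follow by summing the final activity weights within each domain. Note that the size of the shift constant affects how spread out the resulting weights are, so a different constant would give different percentages.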
Results: Representative Data Sample
Response Rate
After 34 days, the survey window closed (Figure 1). Among the 800 invited participants, 12 had invalid email addresses, and 2 became ineligible because they lost their certification during the survey window, leaving 786 eligible invitees. Of these, 337 completed the survey. However, a quality control check using misfit analysis identified five respondents whose response patterns suggested careless responding. After excluding these cases, the final adjusted response rate was 42.2% (Table 1).
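The reported rate can be reproduced from the counts in this paragraph, assuming the denominator is the number of eligible invitees and the numerator is the number of completed surveys minus the excluded misfitting respondents. The variable names are illustrative.

```python
invited = 800
invalid_email = 12
lost_certification = 2
completed = 337
misfitting = 5

eligible = invited - invalid_email - lost_certification  # 786
valid_responses = completed - misfitting                  # 332

response_rate = 100 * valid_responses / eligible
print(f"{response_rate:.1f}%")  # 42.2%
```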
Rater Quality Control
A quality control check was performed on the risk-of-harm ratings by comparing each rater’s responses to the overall risk hierarchy, which was stable across most physicians. Because this hierarchy is broadly shared regardless of individual scope of practice, substantial deviations from it suggest inattentive or invalid responses. We defined misfit as an outfit mean square statistic ≥ 2.75, a threshold for detecting extreme outliers in Rasch-based analyses used in developing the Sports Medicine Certification Examination (SMCE) blueprint. Applying this criterion led to the exclusion of five respondents.
Although a parallel analysis for frequency ratings was considered, variability in clinical scope makes it difficult to distinguish genuine differences in frequency from careless responses. Consequently, the same five misfitting raters identified through the risk-of-harm analysis were also excluded from the entire dataset, thus reducing the final response rate.
Representativeness of the Responding Raters
A key consideration in practice analysis/job analysis is whether respondents adequately represent the overall sampling frame. To address this concern, we conducted χ² tests of independence comparing survey respondents to the sampling frame across Race, Ethnicity, Gender, IMG Status, and Degree Type. None of these demographic comparisons reached statistical significance, indicating that survey respondents closely matched the sampling frame on these variables. The only statistically significant difference occurred for the certifying board, reflecting the differential incentives described in the Methods section (Table 2). Consequently, the 332 raters who provided complete and valid responses appear broadly representative of the sampling frame, suggesting minimal nonresponse bias for the evaluated demographic variables.
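For readers unfamiliar with the test, the sketch below computes a Pearson χ² statistic for a contingency table in plain Python. The counts are toy numbers chosen for illustration and are not the study's data; the 3.841 critical value is the standard 0.05 cutoff for one degree of freedom, and the article does not describe how its tests were implemented.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Toy 2 x 2 table: rows = respondents / non-respondents,
# columns = two categories of a demographic variable.
toy_table = [[30, 20], [40, 60]]
statistic, dof = pearson_chi_square(toy_table)

CRITICAL_05_DF1 = 3.841  # chi-square critical value, alpha = 0.05, df = 1
significant = statistic > CRITICAL_05_DF1
print(round(statistic, 3), dof, significant)
```

A table whose rows have identical proportions yields a statistic of zero, which corresponds to respondents mirroring the frame exactly on that variable.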
Results: Content Domain Weights
Table 3 (supplement) summarizes the frequency of all 231 clinical activities from the sports medicine clinical activities survey. The reported frequency is a weighted value based on the frequency ratings, not the raw frequency. In practice analysis, “frequency” values are produced by a Rasch rating-scale model that translates ordinal frequency ratings (e.g., never → very often) into an interval logit scale. The model estimates each activity’s location on a latent frequency/importance continuum while accounting for rater severity and category step thresholds, so the resulting numbers are weighted, relative measures rather than raw counts or absolute occurrence rates. The purpose of using Rasch modeling here is to derive the relative importance of each activity based on frequency, not to estimate absolute frequency. An activity with a higher logit is simply one that is endorsed, on average, at a higher frequency relative to other activities in this sample, holding the rating-scale structure constant. Building on this, we linearly rescaled the Rasch measures into percentage weights so that the set of all activities sums to 100%; for the final blueprint, we then proportionally adjusted those weights to sum to 95% to reserve a fixed 5% for Foundations of Practice. These percentages should be interpreted as normalized, comparative weights suitable for blueprinting and prioritization, not as direct estimates of how often activities occur in practice. To our knowledge, these are the first published data on the relative frequency of clinical activities in sports medicine practice.
Table 4 summarizes the distribution of examination questions across content domains. The Clinical Activities column reports the total number and corresponding percentage of clinical activities assigned to each domain (excluding Foundations of Practice), reflecting domain weights based solely on equal weighting without adjustments for frequency or risk factors. The Foundations of Practice domain was assigned a fixed 5% weight based on policy, as it does not include observable clinical activities. Consequently, empirically derived weights for the remaining four domains were proportionally adjusted to collectively represent the remaining 95% of the examination. The subsequent columns report the three empirically derived weighting schemas described in the Methods section: Frequency Index (FI), Index of Harm (IoH), and combined Frequency Index and Index of Harm (FI + IoH). The final column, labeled “Proposed Approval Version,” provides the recommended domain weights based on an equal weighting of the frequency and risk-of-harm indices (FI + IoH). Minor discrepancies in column totals are due to rounding; the values sum to 100% when carried to three decimal places.
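The rounding effect can be seen with the domain weights quoted in the abstract. The short check below assumes those four percentages correspond to the final approved column and adds the fixed 5% for Foundations of Practice; the dictionary layout is illustrative only.

```python
# One-decimal domain weights as reported in the abstract, plus the fixed 5%.
reported_weights = {
    "Foundations of Practice": 5.0,
    "Preventive Aspects of Sports Medicine": 10.4,
    "Care of Emergency Conditions": 22.4,
    "Musculoskeletal Conditions": 32.1,
    "Medical Conditions": 30.2,
}

total = round(sum(reported_weights.values()), 1)
clinical_total = round(total - reported_weights["Foundations of Practice"], 1)
print(total, clinical_total)  # 100.1 95.1
```

The 0.1 percentage point excess is the size of discrepancy expected when values that sum exactly to 100 are each rounded to one decimal place.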
Summary
Overview
This manuscript outlines the methodology used by the ABFM to refine content domain classifications and establish exam question distributions that accurately reflect the current scope of sports medicine practice nationwide. The survey achieved a 42.2% response rate, exceeding the typical practice analysis response rate among physicians (less than 30%) and enhancing the robustness of the findings.6 An adequate response rate is necessary to reduce the risk of nonresponse bias.7
Importantly, respondents were demographically representative of the overall sampling frame across key variables, with the only statistically significant difference observed in certifying board distribution. This high level of representativeness minimizes concerns about nonresponse bias and supports the generalizability of the proposed content domain allocations. To ensure validity and accuracy in determining content domain proportions, the Rasch model was employed.2 This methodology adjusts for individual raters’ baseline rating tendencies while integrating both the frequency with which clinical activities are performed and their potential risk of harm. By applying this empirically grounded approach, the final domain percentages reflect an objective, data-driven framework for exam content allocation.
Limitations
Three key limitations should be considered when interpreting the findings of this study. First, the Foundations of Practice content domain lacks empirical validation for its assigned percentage, as it is primarily knowledge-based rather than grounded in observable clinical activities. Unlike the other domains, which were weighted based on data-driven estimates of clinical activity frequency and risk of harm, the Foundations of Practice domain could not be empirically assessed in the same manner. As a result, its allocation of 5% of the examination was a policy decision made by the SPMED Advisory Committee, balancing the inclusion of essential foundational topics with the broader objective of basing the exam structure on empirical data. Second, while the survey sample was generally representative across key demographic variables, participation rates varied by certifying board, likely due to differences in incentive structures. This discrepancy was an inherent limitation of the study, as governance constraints prevented offering uniform incentives across all certifying boards. Despite this, the overall representativeness of the sample on other demographic variables supports the validity of the findings. Third, the resulting domain weights have yet to be externally validated in actual practice; such validation will be challenging because of the difficulty of controlling for confounding variables related to patient outcomes.
Future Work
The practice of Sports Medicine is continuously evolving, particularly in times of transformative change. Advances in technology, such as artificial intelligence and telemedicine, are likely to shape the field by enhancing diagnostic capabilities, expanding access to care, and streamlining clinical workflows. Additionally, policy changes, including regulations governing telemedicine, modifications to medical reimbursement policies, and evolving restrictions on scope of practice, will further influence the delivery of care. As a result, practice analysis activities must be responsive to these ongoing developments to ensure that assessments and certification standards remain aligned with the evolving landscape of Sports Medicine.
Conflicts of Interest
The authors report no conflicts of interest.
Corresponding Author
Bradley G. Changstrom, MD, The American Board of Family Medicine, Lexington, KY, University of Colorado School of Medicine, bradley.changstrom{at}cuanschutz.edu
This article was externally peer reviewed.
Acknowledgements
The authors would like to acknowledge the staff of the American Board of Family Medicine, members of the co-sponsoring boards, members of the sports medicine advisory committee and members of the sports medicine assessment committee for their time in creating the blueprint, assisting with research, administering the survey or reviewing the manuscript. This information is also included to disclose the participants in the blueprint process.
ABFM Staff: Kevin Rode, Amanda Dawahare, Mirelle Hughes, Hannah Gregory, Tate Downey, Sravanthi Nallandula, Eric Ding
Co-Sponsoring Boards: Michael Barone, Carolyn Kinney, Melissa Barton
Assessment Committee: Chad Asplund, Joel Brenner, Erin Hammer, Joseph Ihm, Elena Jelsing, Michelle Labotz, Amy Powell, Brett Toresdahl, Karin Van Baak, Anna Waterbrook, Irfan Asif, Nailah Adams Morancie, Alicia Tucker, Dale Colorado, Bradley Changstrom
Advisory Committee: Ronnie Barnes, David Berkoff, Anthony Beutler, Katherine Dolbec, Kristine Karlson, William Micheo, Mark Stovak, Christopher Visco, Andrew Perron, James C Puffer, James Kinderknecht, Suzanne Hecht, Holly Benjamin, Warren Newton, Bradley Changstrom
- Received for publication July 30, 2025.
- Accepted for publication September 15, 2025.




















