Abstract
Objectives: This study compared the performance of and patient preference for New York City Health and Hospitals’ (NYC H + H) social needs screener with 2 widely used screeners, a version of the Accountable Health Communities screener and the WellRx screener, that include the same core domains of social needs.
Methods: Two NYC H + H primary care clinics provided data for analysis. A convenience sample completed 1 of the 2 other screeners during May–June 2024, in addition to the NYC H + H screener. Analyses compared rates of needs detected and number of needs identified as well as patient preference.
Results: The H + H screener performed similarly to both alternate screeners in identifying patients with social needs (κ = 0.7, P < .001 and κ = 0.6, P < .001). The number of positive items identified by each screener was virtually identical. Patients preferred the H + H screener to the alternates, but differences were not statistically significant.
Conclusions: Despite differences in question phrasing and response options, all 3 screeners performed similarly.
- Health Policy
- Health Services
- New York
- Patient Preference
- Population Health
- Primary Health Care
- Screening
- Social Determinants of Health
- Urban Health
Introduction/Background
In response to the large body of knowledge related to social determinants, or drivers, of health (SDOH), and how they impact health, identifying and addressing individual level SDOH, referred to as health related social needs (HRSN), is rapidly becoming a standard of care.1,2 The Centers for Medicare and Medicaid Services (CMS) defines these HRSN as unmet social conditions that contribute to adverse health outcomes and has identified 5 core domains that all “qualified standardized screening instruments” should include: food insecurity, housing insecurity, transportation insecurity, safety, and utility insecurity.3,4 In January of 2024, CMS announced authority for states to reimburse for SDOH related screening and resources to address HRSN.5 CMS currently requires HRSN screening of adult Medicaid participants in inpatient settings and will require it in outpatient settings beginning in 2026.
Although many HRSN screeners exist, are in current use,6–8 and might claim some degree of face validity in that they include the core domains identified by CMS as required, no single tool that identifies multiple HRSN domains has emerged as a gold standard. CMS has developed the Accountable Health Communities (AHC) Health Related Social Needs screener, but it has consistently allowed flexibility in the use of qualified standardized screening instruments that identify needs in key domains: food insecurity, housing insecurity, transportation insecurity, utility insecurity, and safety.9,10 Multiple stakeholders, including the Gravity Project and the American Medical Association,11 recommend flexibility in screening tool choice.
When deciding on a screener to implement, organizations consider many factors, including their proposed use of the screener results, feasibility of implementation and acting on positive screens, literacy level, mode of administration, and length.8,12 One aspect that is harder to assess is relative performance – how do screeners compare in the rate of detecting patients with HRSN, in identifying number and types of HRSN, and in acceptability to patients? We know of no other efforts to compare the performance and acceptability of multiple screeners among the same patients. In this article, we compare the results of the screener currently in use in the New York City Health and Hospitals (H + H) ambulatory care system with 2 other widely used screeners, the New York State (NYS) version of the AHC screener and the WellRx screener, using data gathered from 2 H + H primary care clinics. H + H selected the NYS AHC screener specifically because the NYS Department of Health has identified it as an approved screener, and potentially the only approved screener, for Medicaid reimbursement in NYS. H + H chose the WellRx screener because it is a widely used screener that is similar in format to the H + H screener and it, like the NYS AHC screener, has been issued Logical Observation Identifiers, Names, and Codes (LOINCs) by the Gravity Project for reimbursement purposes.13,14 All 3 screeners include items related to core domains identified by CMS as necessary for “qualified standardized instruments.”
Methods
The NYU Grossman School of Medicine (NYUGSOM) Institutional Review Board approved this study to conduct secondary analysis on existing screener data provided by H + H from an internal quality improvement (QI) initiative comparing the existing H + H HRSN screener with 2 other screening tools.
Setting
H + H is the largest public hospital system in the country, serving over 1 million patients annually.15 In 2018, H + H piloted a HRSN screening tool, with the goal of identifying HRSN and providing support to patients in need.16 The screener was incorporated into the electronic medical records in 13 languages in 2019 and scaled throughout adult and pediatric primary care. Over the past year, over 300,000 patients were screened for HRSN system-wide, almost 70% of patients seen in primary care.
Screeners
All 3 screeners include items concerning 4 core social domains identified by CMS (food insecurity, housing insecurity, transportation insecurity, utility insecurity), plus additional items that assess education and employment related needs. The H + H and WellRx screeners also include questions about legal and childcare needs. The H + H screener includes 2 questions about health care related financial needs and public assistance needs, and the WellRx screener includes an item about income generally. All 3 screeners are available in languages other than English, including Spanish.
The NYS AHC and WellRx screeners include questions concerning the fifth core domain of safety. As is required by NYS Public Health Law,17 H + H has separate screening workflows for safety and domestic violence. Because these workflows require different interventions to address acute needs appropriately, the H + H HRSN screener does not include safety items. Although the NYS AHC and WellRx screeners included the safety items in their administration in the QI initiative, we excluded those items in the analysis to better reflect both the specific H + H setting and health care settings in general, where safety questions are often addressed elsewhere in the clinical encounter and may not be included in HRSN screening.
(See Appendix A for copies of the full screeners used in the quality improvement project.)
H + H Screener
Development of the H + H screener included review of multiple existing screeners, including the AHC, PRAPARE, WellRx, and Health Leads screeners.18–20 Questions were adapted to 1) ask what needs patients wanted help with rather than first collecting information about their social conditions; 2) be succinct enough for use in busy clinics; and 3) be written at a fifth-grade literacy level. The H + H screener is a 248 word-count, 11-item measure. All but the 2 housing instability items inquire whether the patient wants help with a particular need, for example, “I want help getting food for my family.” Answer options include “Yes,” “No,” or “Choose not to answer.”
NYS AHC Screener
The AHC screener, developed by CMS with input from a national panel, also covers the core domains.3 The screener proposed by the NYS Medicaid program (NYS AHC) includes the 10 AHC items plus 2 supplemental questions about education and employment needs and totals 374 words.21 The NYS AHC screener includes the 2-item version of the Children’s HealthWatch food insecurity instrument.22 All other items are phrased as questions: 2 ask patients if they want help with that need, and 6 are about social conditions. The question on housing instability requires respondents to select 1 of 3 options that best describe their current living situation.
WellRx Screener
The WellRx screener is a 193 word-count, 11-item instrument developed by researchers in the Office for Community Health at the University of New Mexico.19 All items on the WellRx screener are in the form of questions, and all response options are Yes/No. Three of the questions on the WellRx screener are phrased to ask if the respondent needs help.
(See Appendix B for a complete list of included screener questions cross-listed by domain.)
Sample
Community Health Workers (CHWs) or Health Advocates (volunteers who help to screen patients and connect them to resources) at 2 H + H primary care clinic sites asked a convenience sample of patients to complete a second screener in addition to the standard H + H screener administered during their regularly scheduled visits; 50 patients at 1 site in Queens completed the NYS AHC screener and 50 patients at a Bronx site completed the WellRx screener; it was deemed too great a burden to have respondents complete 3 similar screens. Fifty Spanish speakers were included in the sample, with 25 completing each non-H + H screener, respectively. Although this sample is not large enough for subgroup analyses by language, we present these analyses for descriptive/exploratory purposes. All screeners were self-administered by participants in waiting rooms at the time of their visits, with the order of screener completion randomly varied. All primary care patients 18 years and older were eligible for recruitment whether they were a new or returning H + H patient. No additional incentives were provided. On completion of both screeners, CHWs and Health Advocates asked participants to indicate which of the 2 screeners they preferred, or if they had no preference.
Scoring
Individual items on each screener were considered “positive” if the participant indicated having that social need or wanting help with that need, depending on the wording of the screener. For the H + H and WellRx screeners, and several questions on the NYS AHC screener, an item with a “Yes” indicated a positive need and was coded as such. For the NYS AHC screener, items with a 3-item Likert scale response (never, sometimes true, often true) were considered positive for the HRSN when respondents selected either “sometimes true” or “often true.” The “living situation” item on the NYS AHC screener was considered positive if any response except “I have a steady place to live” was checked. The “choose all that apply” item concerning housing quality on the NYS AHC screener was considered positive if any response other than “none of the above” was checked. The utilities question was considered positive if either “yes” or “already shut off” was answered. For each screener, an overall positive screen was counted when the respondent indicated one or more needs on that screener (excluding the safety items on the NYS AHC and WellRx screeners).
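The scoring rules for the NYS AHC screener can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the item keys (`food_worry`, `living_situation`, and so on) are hypothetical stand-ins for the screener's questions, and we assume the 3-option items are the 2 food insecurity items.

```python
# Illustrative only: item keys and response strings are hypothetical stand-ins,
# not the screener's actual field names.
THREE_OPTION_ITEMS = {"food_worry", "food_ran_out"}  # assumed to be the food items


def ahc_item_positive(item, response):
    """Return True when one NYS AHC response counts as a positive need."""
    if item == "housing_quality":  # "choose all that apply": response is a list
        return any(choice.lower() != "none of the above" for choice in response)
    answer = response.lower()
    if item in THREE_OPTION_ITEMS:
        return answer in {"sometimes true", "often true"}
    if item == "living_situation":
        return answer != "i have a steady place to live"
    if item == "utilities":
        return answer in {"yes", "already shut off"}
    return answer == "yes"  # remaining Yes/No items


def overall_positive(responses):
    """Overall positive screen: one or more positive items, safety items excluded."""
    return any(
        ahc_item_positive(item, response)
        for item, response in responses.items()
        if not item.startswith("safety")
    )


# A made-up respondent whose only non-safety need is food insecurity.
example = {
    "food_worry": "Sometimes true",
    "living_situation": "I have a steady place to live",
    "housing_quality": ["None of the above"],
    "utilities": "No",
    "safety_1": "Yes",
}
print(overall_positive(example))  # True
```

A respondent who endorsed only a safety item would screen negative under this sketch, mirroring the exclusion of safety items from the analysis.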
To see how the screeners compared in identifying the number of different types of social needs, we constructed 2 sets of count variables. First, for each screener we counted how many of the 4 core domains (insecurity in food, housing, utilities, transportation) were identified as positive; this count could thus vary from 0 to 4. For food insecurity, the NYS AHC screener utilizes the 2-item Children’s HealthWatch survey measure,22 while both the H + H and WellRx screeners use a single item to assess food insecurity. Similarly, the NYS AHC screener includes 2 items that reflect utilities insecurity while the other 2 have a single item. To avoid “double counting” on the NYS AHC screener when comparing to the H + H screener, we counted a positive for food/utilities insecurity if either related item was endorsed. Both the H + H and NYS AHC screeners include 2 items for housing, one that asks about current houselessness while the other assesses housing insecurity while housed; the WellRx screener combines those aspects of housing insecurity into a single item. Again, for scoring comparability, we considered an individual positive for housing insecurity on the H + H and NYS AHC screeners if they affirmed either of the 2 housing items.
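The "either item counts once" rule can be illustrated with a short sketch. The domain-to-item mapping below is hypothetical (the keys are placeholders, not the screeners' real item names); the point is only that each domain contributes at most 1 to the count, however many of its items are endorsed.

```python
# Hypothetical item keys grouped by core domain; the real screener items differ.
AHC_CORE_DOMAINS = {
    "food": ["food_worry", "food_ran_out"],
    "housing": ["living_situation", "housing_quality"],
    "utilities": ["utilities_item_1", "utilities_item_2"],
    "transportation": ["transportation"],
}


def count_positive_domains(item_flags, domain_items):
    """Count domains with at least one endorsed item (each domain counted once)."""
    return sum(
        any(item_flags.get(item, False) for item in items)
        for items in domain_items.values()
    )


# Made-up respondent: both food items and one housing item endorsed.
flags = {
    "food_worry": True,
    "food_ran_out": True,
    "living_situation": False,
    "housing_quality": True,
    "transportation": False,
}
print(count_positive_domains(flags, AHC_CORE_DOMAINS))  # 2
```

Passing a larger domain map to the same function would give counts over more domains in the same way.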
All 3 screeners included additional types of HRSN (eg, educational needs, employment needs, childcare, safety, etc.), but the number and type of additional needs were not identical across all 3 screeners. Accordingly, we constructed count variables to represent the number of common needs on both the H + H and NYS AHC screeners and on both the H + H and WellRx screeners. For the H + H/NYS AHC count, the 6 common types/domains represented by the items included the 4 core domains plus education and employment needs. For the H + H/WellRx count, the 8 common domains included the 4 core domains, education and employment needs, as well as legal and childcare needs.
Analysis
H + H provided researchers at NYUGSOM the deidentified data from their QI initiative. IBM SPSS Statistics Version 29.0.20 was used for all analyses. Counts and percentages are reported for all categorical variables, and means and standard deviations for continuous variables. Cohen’s kappa statistic (κ) assessed the degree of agreement between the H + H screener and each of the alternate screeners in terms of an overall positive screen result.23 We also calculated Cohen’s kappa for common individual domains on the screeners as described above.
Correlation coefficients assessed the linear relationships for counts of positive items between the H + H screener and the other 2 screeners, and paired t tests assessed the differences in the mean counts. Chi-square goodness-of-fit tests assessed differences in reported screener preference.
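For readers less familiar with these statistics, the following sketch shows how a Pearson correlation and a paired t statistic are computed from two sets of per-patient counts. The data are invented for illustration and the function names are ours; the study's analyses were run in SPSS, which also supplies the P values that this sketch omits.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of counts."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for within-patient differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Invented counts of positive core domains for 8 patients on two screeners.
hh_counts = [0, 1, 2, 3, 1, 0, 2, 4]
alt_counts = [0, 2, 2, 3, 1, 1, 2, 3]

r = pearson_r(hh_counts, alt_counts)
t_stat, df = paired_t(hh_counts, alt_counts)
print(f"r = {r:.2f}, t = {t_stat:.2f}, df = {df}")
```

A high correlation together with a t statistic near zero is the pattern that would indicate two screeners rank patients similarly and identify a similar mean number of needs.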
We conducted subanalyses within English and Spanish language screeners to detect any potential differences in agreement by language.
Results
Sample Description
Although 100 individuals completed 2 screeners, 1 Queens participant was removed from analysis because they answered none of the questions on the NYS AHC screener. Thus, 49 individuals in the sample were patients at the H + H clinic in Queens who completed the NYS AHC screener in addition to the H + H screener and 50 were patients at the clinic in the Bronx who completed the WellRx screener in addition to the H + H screener. Of the sample, 35.4% were male and 54.5% female, with 10.1% abstaining. The mean age of respondents was 48.8 years with a standard deviation of 16.3 years. The screens were completed in English by 49.5% of the sample and in Spanish by 50.5%. Demographic characteristics were consistent with the typical patient population served at these H + H sites (Lincoln Hospital: 52% female/48% male; Queens Hospital: 54% female/46% male) according to recent internal H + H data.
Comparison of Overall Positive Screens
For the 49 individuals at the Queens site who also completed the NYS AHC screener, 89.9% were positive on the H + H screener and 81.6% were positive on the NYS AHC screener. As shown in Table 1, overall agreement in screening results (both positive and negative) was 91.8% (κ = 0.7, P < .001).
Table 1. Congruence of Overall Positive Screens
For the 50 participants at the Bronx site who also completed the WellRx screener, 58.0% were positive on the H + H screener and 60.0% were positive on the WellRx screener. The overall agreement (both positive and negative) was 80% (κ = 0.6, P < .001). These patterns generally held true across language groups, with agreement slightly greater among Spanish speakers than among English speakers; agreement was lowest for H + H/WellRx for English speakers (κ = 0.5, P = .01). Agreement for the H + H versus NYS AHC Spanish comparison was 95.9%, but we were unable to calculate a kappa statistic because there were no negative screenings in the H + H Spanish group.
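The kappa reported for the Queens comparison can be roughly reproduced by hand. The sketch below is illustrative only: the 4 cell counts (40, 4, 0, 5) are an assumption, back-calculated from the percentages reported above for the Queens site (roughly 90% positive on H + H, 81.6% positive on NYS AHC, 91.8% agreement, n = 49) rather than taken from Table 1, and the function name is ours.

```python
def cohens_kappa(both_pos, first_only, second_only, both_neg):
    """Observed agreement and Cohen's kappa for two binary screen results."""
    n = both_pos + first_only + second_only + both_neg
    observed = (both_pos + both_neg) / n
    first_pos = (both_pos + first_only) / n
    second_pos = (both_pos + second_only) / n
    # Agreement expected by chance, given each screener's own positive rate.
    expected = first_pos * second_pos + (1 - first_pos) * (1 - second_pos)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both screeners give one constant result")
    return observed, (observed - expected) / (1 - expected)


# Assumed cell counts (H + H vs NYS AHC), back-calculated from reported percentages.
agreement, kappa = cohens_kappa(both_pos=40, first_only=4, second_only=0, both_neg=5)
print(f"agreement = {agreement:.1%}, kappa = {kappa:.2f}")
```

With these assumed counts the sketch gives 91.8% agreement and a kappa of about 0.67, which rounds to the reported 0.7. Kappa falls well below raw agreement here because most patients screen positive on both tools, so a high level of agreement would be expected by chance alone.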
Counts of Needs Identified
Overall, the H + H screener identified a virtually identical count of positively endorsed items out of the core 4 domains compared with the NYS AHC screener (1.7 vs 1.8, P = .437), and compared with the WellRx screener (0.8 vs 0.8, P = 1.000). This pattern was consistent by language, with minimal differences observed in mean count of positive items, none of which were statistically significant (see Table 2). The mean count of core items endorsed on the H + H screener was positively and substantially correlated with the mean count endorsed on both the NYS AHC screener (r = 0.6, P < .001) and the WellRx screener (r = 0.8, P < .001). This pattern of statistically significant positive correlations between the H + H screener and the alternate screeners was consistent across language; the lowest correlation was between H + H and NYS AHC English speakers (r = 0.5, P = .011).
Table 2. Comparison of Counts of Positive Items
Among the 6 common domains between H + H and NYS AHC, the H + H screener also identified a similar average count of endorsed items as the NYS AHC screener (2.5 vs 2.6, P = .537). Of the 8 common domains shared by the H + H screener and the WellRx screener, mean counts were also very similar (1.6 vs 1.5, P = .844). Differences in means increased slightly when calculated by language, with the only significant difference in mean being for the Spanish subset of the H + H/WellRx group (2.0 vs 1.5, P = .025) (see Table 2). The mean count of common items endorsed on the H + H screener was positively and substantially correlated with the mean count endorsed on the NYS AHC screener (r = 0.7, P < .001) and the WellRx screener (r = 0.7, P < .001). This pattern held true across language.
Domain-by-Domain Comparisons
Results of Cohen’s kappa test of agreement by individual domain showed fair to substantial levels of agreement overall across all core and common domains between the H + H screener and the alternate screeners (see Table 3). All were statistically significant. The pattern of fair to substantial relationships held true across language, although several associations were not statistically significant. (Cohen’s kappa could not be calculated for the H + H/WellRx childcare domain among English-speaking participants because of a zero cell count.) The lowest congruence occurred in the housing and utilities domains for H + H/NYS AHC; in transportation for H + H/WellRx, especially among English speakers; in food insecurity for H + H/WellRx among Spanish speakers; and in education for H + H/WellRx among English speakers.
Table 3. Congruence of Individual Domains
Screener Preference
Of the total sample of 99, 10 individuals expressed having no preference between the 2 screeners they completed (7 in the NYS AHC group and 3 in the WellRx group). Of the 89 individuals who preferred one screener over the other, over half (59.6%) preferred the H + H screener to the alternate screener administered (see Table 4). The pattern of results was consistent across both alternate screeners and by language; however, none of the differences in screener preference were statistically significant.
Table 4. Screener Preference (n = 89)
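The overall preference result can be checked with a chi-square goodness-of-fit test against an even split. The sketch below is illustrative: the count of 53 is our back-calculation from the reported 59.6% of 89, the null hypothesis of equal preference for the two screeners is an assumption about how the test was framed, and the function name is ours.

```python
from math import erfc, sqrt


def chi_square_even_split(count_a, count_b):
    """Chi-square goodness-of-fit (1 df) against a 50/50 split, with its P value."""
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# 53 of 89 preferred H + H (about 59.6%); the remaining 36 preferred the alternate.
chi2, p_value = chi_square_even_split(53, 36)
print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")
```

Under these assumptions the statistic is about 3.25 with a P value of about .07, consistent with the reported lack of statistical significance at the .05 level.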
Discussion
Overall, the H + H screener performs similarly to both the NYS AHC and WellRx screeners, despite notable differences in wording and response options. The rates of overall positive screens (ie, 1 or more affirmative responses) were similar for both alternate screeners when compared with the H + H screener and agreement was quite high, indicating that all 3 screeners identified largely the same individuals with and without HRSN. The counts of distinct types of HRSN that were endorsed as positive were remarkably similar across screens and by language, suggesting that the 3 screeners identified a very similar degree of need among patients (as assessed by number of different types of needs), although which specific need(s) each screener detected was not always the same. Differences in screener preference were not statistically significant, but the pattern of preference for the H + H screener was consistent across the board. The preference for the H + H screener was more pronounced when paired with the NYS AHC screener than with the WellRx screener. We are pursuing additional qualitative work with patients and clinicians to further explore the observed differences in patient preference for one screener over the other.
A potentially important difference in wording across the screeners was that the H + H screener mostly identified social needs for which the patient “wanted help,” while the other 2 screeners mostly asked if the patient had a particular need. H + H’s decision to word questions in this way was deliberate, as the primary purpose of the screener was to address social needs where help was desired. When developing the screener, internal H + H stakeholders discussed whether the “I want help…” phrasing would result in greater numbers of needs being identified, as patients would be eager for help with, say, housing, even though it was not a pressing need, or whether this phrasing would yield a lower rate of needs because patients might have needs but did not need or want help from their health care clinicians to address those problems. The evidence here supports neither hypothesis – regardless of whether the questions were phrased as wanting help or having needs, the screeners yielded very similar results.
There was at least moderate agreement between the H + H and alternate screeners for most of the social domains, including food insecurity, education, employment as well as legal and childcare (H + H/WellRx only). Agreement between the H + H and NYS AHC screeners was lower for utilities and housing, and between the H + H and WellRx for transportation. However, a closer look at the data did not reveal systematic patterns of one screener identifying more or less of the specific HRSN than the other. For example, the NYS AHC screener identified 8 individuals with housing instability that the H + H screener did not, but the H + H screener identified 5 with housing instability that the NYS AHC screener did not. The WellRx screener identified 6 individuals with transportation needs that the H + H screener did not pick up, while the H + H screener identified 7 individuals with transportation needs that the WellRx screener did not. These differences may be due to the specific wording of the questions, response options, or time frames (eg, in the past 12 months vs current).
A secondary purpose for HRSN screeners may be to estimate the prevalence and/or degree of HRSN at a population level, perhaps to contribute to risk adjustment scores at the institutional or aggregate level. We would argue that brief screeners are not sufficient to deliver comprehensive profiles of patients’ HRSN, but our findings with respect to the 3 screeners we compared suggest that they provide very similar estimates of the prevalence of HRSN (the percentage of the population with any HRSN) and the degree of HRSN (the number of different types of needs) for such population description.
The results of this study strongly support the approach to HRSN screening that requires inclusion of core domains, but allows flexibility in selecting screening tools, as well as the addition of domains to better fit the needs of the patient population. For example, the H + H screener contains questions regarding help with medical bills and accessing public assistance, which are important to the patient population H + H serves and can be addressed through payment assistance and referrals to financial counselors. In addition, both the H + H and WellRx screeners have a childcare question, another need that is important to many of H + H’s patients.
Some limitations of our analysis are due to the relatively small sample size, especially for subgroup analyses. A large number of tests were conducted for the sample size; however, the great majority of tests for degree of association were significant at P < .001. The small sample size was more of a limitation for screener preference, as none of the observed differences were significant. As a QI project with limited resources, a larger sample was simply not possible, and we recommend larger scale studies of concordance among tools, especially as payers are starting to require specific tools for reimbursement. Another limitation is that we could not directly compare the WellRx and NYS AHC screeners to one another because of the way the data were collected in the QI initiative. The small sample size may also limit generalizability of the findings. It is also possible that the administration of the two screeners at the same point in time may have biased concordance upwards. On the other hand, administration at points in time that are not close (ie, at two consecutive visits to the clinic) would be problematic as the underlying HRSN could well be different at these points in time. Finally, it is important to note that this study was not an attempt to validate any of the included screeners. As others have noted, there remains a lack of rigorous validation of HRSN screeners in general,6,8 and although challenging, such validation is sorely needed.
Conclusion
In this first published direct comparison of HRSN screeners, 3 screeners with similar core content but different wording and response options identified the presence of and number of HRSN very similarly. The results of this work suggest that HRSN screeners that meet core CMS requirements for content will perform similarly in detection and documentation of HRSN. This study tested whether the screeners themselves produced differential results; future research might explore whether different modes of administration – self-administered vs administered, tablet vs paper and pencil, type of administrator (eg, lay person vs clinician) – affect the identification of HRSN in the primary care setting.
Appendix A New York City Health and Hospital’s (NYC H + H) Health-Related Social Needs Screening Tool
Appendix B
Notes
This article was externally peer reviewed.
Funding: The work that produced this research was supported by funding from New York City Health + Hospital Corporation.
Conflict of interest: The authors have no conflicts of interest to report.
- Received for publication January 7, 2025.
- Revision received May 1, 2025.
- Accepted for publication May 27, 2025.









