Abstract
Background: Primary care practices with greater integration of behavioral health care have better patient-reported outcomes. We sought to identify whether there is a threshold effect in the relationship between the degree of Integrated Behavioral Health (IBH) and patient-reported outcomes.
Methods: Secondary analysis of survey results from Integrating Behavioral Health and Primary Care, a multistate, longitudinal, randomized controlled study of 3,929 adults with multiple chronic medical and behavioral conditions. Patient outcomes included functional status (Patient-Reported Outcomes Measurement Information System-29 [PROMIS-29]), depression (PHQ-9), anxiety (GAD-7), the Duke Activity Status Index, Consultation and Relational Empathy (CARE), patient centeredness, and utilization. IBH was measured by the Practice Integration Profile (PIP) version 1.0. The optimal threshold was identified by examining the relationship of PIP to PROMIS-29. The discriminatory power of the threshold was examined using multilevel linear regression with adjustment for potential confounders.
Results: Fifteen of 44 practices with 1,237 patients were highly integrated (PIP ≥ 65). All outcomes tended to be better in patients from practices with high integration. After adjustment for potential confounders, the relationship remained beneficial for all outcomes, with pain intensity (−0.51 [95% CI −0.97, −0.04]), patient centeredness (2.52 [0.88, 4.16]), and CARE (1.62 [0.62, 2.61]) statistically significant.
Conclusions: Patients in high integration practices report better outcomes. A measurable target for IBH, such as a PIP total score ≥ 65, provides a focus for practice leadership and guidance on the time and resources needed to achieve integration associated with positive patient outcomes. The results of this analysis provide further evidence of the broad, beneficial impacts of integrating behavioral health and primary care services.
- Behavioral Medicine
- Integrated Delivery of Health Care
- Patient Reported Outcomes
- Practice Management
- Primary Health Care
- Psychometrics
- Surveys and Questionnaires
Introduction
Behavioral Health (BH) care encompasses mental health, substance use, health behavior, and support for psychological and social issues. Although primary care offers a prime opportunity to address these needs1, a lack of resources and systems to deliver these services often leaves patients without needed care2, despite efficacious interventions.3–9 An arbitrary separation of physical and behavioral health has perpetuated siloed care and is inconsistent with the lived experience of most patients, particularly those with multiple chronic conditions.10
Despite evidence supporting Integrated Behavioral Health (IBH), just 39% of family physicians report collaborative work with behavioral health clinicians (BHCs).11 Implementation varies, with different approaches to integration in different practices6,12 and wide variability in the degree of integration.13 Considerable unevenness in definitions, access, conditions addressed, care delivered, and the extent of integration into practice workflow make analysis of IBH efforts difficult.11,14
Peek’s Lexicon15 established broadly accepted core dimensions of integration, defining IBH as “A practice team of primary care and behavioral health clinicians working together with patients and families, using a systematic and cost-effective approach, to provide patient-centered care for a defined population.” Common models of integration such as the Collaborative Care Model or the Primary Care Behavioral Health model vary in their emphasis on different elements of this definition. Real-world integration frequently combines selected features of multiple models, ranging from simple colocation (BHC has the same address as medical clinicians) to intensive integration of practice workflow, work space, infrastructure, records, support systems, consults, and professional education.16
The Practice Integration Profile (PIP), derived from the 2013 version of the Lexicon, is the first model-agnostic measure of integration to be psychometrically validated.13,17,18 There is a positive and statistically significant association between degree of integration (measured by PIP total score) and patient function. For example, the PROMIS-29 Mental Health Summary was 0.05 points higher for every point of Total PIP (P = .05).19 This relationship has implications for clinical, organizational, and financial operations as well as policy. Although more integration is associated with better outcomes, we do not know how much integration is needed. We sought to identify whether there is a threshold effect in the relationship between the degree of IBH and patient-reported outcomes in primary care, that is, a level of integration beyond which outcomes are significantly better and whether such a threshold predicts future patient outcomes.
Methods
Data Sources
Secondary data were analyzed from Integrating Behavioral Health and Primary Care (IBH-PC), a large study of adult primary care patients with multiple chronic conditions from 2016 to 2021.19–21 Data were collected from practice staff, clinicians and patients from each of 44 family medicine and internal medicine practices with colocated BHCs in 13 US states at baseline and follow-up 2 years later. Patients had at least 3 medical conditions or 1 behavioral and 1 medical condition from the following list: arthritis, obstructive lung disease, diabetes, heart disease (heart failure or hypertension), mood disorder (anxiety or depression), chronic pain (headache, migraine, neuralgia, fibromyalgia, or chronic musculoskeletal pain), insomnia, irritable bowel syndrome, and substance misuse (substance use disorder, tobacco use, or problem drinking). Conditions were identified via electronic medical record using International Classification of Disease diagnosis codes, medications, problem lists, and laboratory results. Participants were included without regard to whether they had received behavioral health services, which was not recorded. Participants self-reported their demographics, functional status, and other outcomes via online, mail, or telephone surveys in English or Spanish.
Instruments
The Patient-Reported Outcomes Measurement Information System-29 (PROMIS-29)22 is a 29-item questionnaire with 8 domains: physical function, anxiety, depression, fatigue, sleep disturbance, social functioning, pain intensity, and pain interference. Lower scores indicate better outcomes for depression, anxiety, fatigue, pain interference, pain intensity, and sleep disturbance. Higher scores indicate better physical function and social participation. Physical and Mental Health Summaries combine the 8 domains and rescale them so that higher scores represent better function.23 Scores are standardized to the US adult population, with a mean of 50 and a standard deviation of 10. The minimally important difference for scales in the PROMIS series is 2.0 to 5.0.24,25
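To make the scale concrete, the following is a minimal sketch of how a between-group difference on the PROMIS T-score metric might be read against the population standard deviation of 10 and the 2.0 to 5.0 minimally important difference range given above. The function names and the example inputs are hypothetical, and the sketch does not reproduce PROMIS scoring itself.

```python
# Illustrative sketch only: interprets a difference on the PROMIS T-score
# metric (population mean 50, SD 10). The MID bounds are the ones quoted in
# the text; the function names are hypothetical.
T_SCORE_SD = 10.0
MID_LOW, MID_HIGH = 2.0, 5.0


def standardized_difference(diff_points: float) -> float:
    """Express a T-score difference in population standard deviation units."""
    return diff_points / T_SCORE_SD


def compare_to_mid(diff_points: float) -> str:
    """Place the absolute difference relative to the 2.0 to 5.0 MID range."""
    size = abs(diff_points)
    if size < MID_LOW:
        return "below MID range"
    if size <= MID_HIGH:
        return "within MID range"
    return "above MID range"


if __name__ == "__main__":
    # Hypothetical differences, in T-score points.
    for diff in (1.0, 2.7, 6.0):
        print(diff, standardized_difference(diff), compare_to_mid(diff))
```

Read this way, a difference of a few T-score points corresponds to a few tenths of a standard deviation in the reference population.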
The Patient Health Questionnaire-9 (PHQ-9)26 and Generalized Anxiety Disorder-7 (GAD-7)27 measure depression and anxiety symptom severity on continuous scales from 0 to 27 and 0 to 21. Higher scores indicate higher symptom burden.
Patient-centered primary care was assessed by the Patient Centeredness Index28, a 14-item survey that records how patients perceive their care. The Consultation and Relational Empathy (CARE) scale is a 10-item survey used to assess patients’ perceptions of clinician empathy.29 Both instruments report higher scores for practices with better performance.
The Duke Activity Status Index30 is a 12-item self-reported measure of functional capacity that correlates well with maximal oxygen consumption. We converted the index to Metabolic Equivalent of Task (METs) units31; higher METs indicate better function.
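As a rough illustration of the conversion to METs, the sketch below assumes the commonly cited linear equation relating the index to estimated peak oxygen uptake (0.43 × index + 9.6 mL/kg/min) and the conventional 3.5 mL/kg/min per MET. The text does not state the exact formula applied in this study, so the constants and the function name should be read as assumptions.

```python
# Illustrative sketch only. ASSUMPTION: the commonly cited linear equation
# for estimating peak VO2 from the Duke Activity Status Index (DASI), and
# 1 MET = 3.5 mL O2/kg/min. The study's exact conversion is not given here.
DASI_MAX = 58.2          # maximum possible DASI score
ML_O2_PER_MET = 3.5      # conventional definition of 1 MET


def dasi_to_mets(dasi_score: float) -> float:
    """Convert a DASI score to estimated METs (hypothetical helper)."""
    if not 0.0 <= dasi_score <= DASI_MAX:
        raise ValueError("DASI score must be between 0 and 58.2")
    vo2_peak = 0.43 * dasi_score + 9.6   # estimated mL O2/kg/min
    return vo2_peak / ML_O2_PER_MET


if __name__ == "__main__":
    for score in (0.0, 30.0, DASI_MAX):
        print(score, round(dasi_to_mets(score), 2))
```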
The Utilization Patient Report32 is a 3-item survey assessing health care utilization in the past year, including emergency department visits, overnights in the hospital, and outpatient appointments. The Restricted Activity Days33 survey assesses restriction of daily life due to illness and disability.
PIP 1.0 is a 30-item survey of clinicians, nurses, administrators, and staff to assess the degree to which BH and primary care services are integrated within a practice.13,34 Four or 5 nurses, staff, and behavioral and medical clinicians (at least 1 of each) from each site completed the PIP. Most of the questions start with “In our practice…”, followed by a specific characteristic (such as “…we use registry tracking for patients with identified BH issues”), an example (“Insomnia registry”), a definition (“Numerator=# of patients in BH registries; Denominator=# of patients with BH needs”), and 5 responses from Never to Always. It has 6 domains: workflow, clinical services, workspace, clinician integration, patient identification, and patient engagement, with 2 to 9 questions each. Scores range from 0 (least degree of integration) to 100. The Total Integration Score represents the overall degree of behavioral health integration.
In previous analyses using data not otherwise reported here, the PIP 1.0 had a Cronbach’s α of 0.95 with high retest reliability.13,35 It was tested for validity and reliability in a sample of 1,372 respondents from 774 practices in 52 states/territories. The mean total PIP score was 58 with a standard deviation of 23, median of 61, and range from 0 to 100. Within each practice, the median range in total score among respondents was 10.13 An exploratory factor analysis found an α coefficient >0.80 for 5 of the domains. For the present analysis, we used the median value of the responses from each practice as that practice's score. Other practice characteristics were reported by clinic management or derived from census data for the practice’s county.
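The practice-level scoring described above can be sketched as follows: each practice's score is the median of its respondents' Total PIP scores, which can then be compared with a cut point such as the 65 examined in this article. The practice names and respondent scores are hypothetical.

```python
# Illustrative sketch only: practice-level PIP as the median of respondent
# Total PIP scores. All data below are hypothetical.
from statistics import median

HIGH_INTEGRATION_THRESHOLD = 65  # cut point examined in this article


def practice_pip(respondent_scores):
    """Return the median Total PIP score across a practice's respondents."""
    if not respondent_scores:
        raise ValueError("at least one respondent score is required")
    return median(respondent_scores)


def integration_level(respondent_scores, threshold=HIGH_INTEGRATION_THRESHOLD):
    """Label a practice 'high' if its median score is at or above threshold."""
    return "high" if practice_pip(respondent_scores) >= threshold else "low"


if __name__ == "__main__":
    responses = {
        "practice_a": [58, 62, 66, 71, 75],   # hypothetical respondents
        "practice_b": [40, 48, 52, 55],
    }
    for name, scores in responses.items():
        print(name, practice_pip(scores), integration_level(scores))
```

Using the median rather than the mean limits the influence of a single respondent whose rating differs sharply from colleagues in the same practice.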
Analysis
Identifying a Threshold
We sought to identify a Total PIP score threshold that would identify practices with good clinical outcomes with high specificity. In other words, above the threshold, most practices should have better-than-average clinical outcomes, and few practices should have poor outcomes. Baseline PIP was plotted against baseline mean PROMIS-29 Mental and Physical Health Summary scores for each practice, and the mean PROMIS-29 summary score was plotted as a horizontal line. We overlaid locally weighted smoothing curves (LOWESS)36 to enhance visualization. Based on visual inspection, we selected a vertical threshold at PIP = 65 that isolated a sizable number of practices above the horizontal line with few below it (Figure 1). Similarly, a receiver-operating characteristic (ROC) curve plotting the true-positive rate (sensitivity) of the PIP for better practices against the false-positive rate (1 − specificity) demonstrated good discrimination for both the Mental and Physical Health Summary Scores near a PIP of 65.
Figure 1. Relationship of baseline Patient-Reported Outcomes Measurement Information System-29 (PROMIS-29) summary scores and Practice Integration Profile (PIP) scores.
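The evaluation of a candidate cut point can be sketched as below. Each practice is treated as "test positive" if its PIP is at or above the threshold and as having a "good outcome" if its mean summary score exceeds a reference value; sensitivity, specificity, and the share of above-threshold practices with good outcomes follow from the resulting 2 × 2 table. The data, the reference value, and the function name are hypothetical, and this is not the authors' code.

```python
# Illustrative sketch only: 2 x 2 classification metrics for a candidate
# PIP threshold. Practices and the reference score are hypothetical.

def threshold_metrics(practices, pip_threshold, reference_score):
    """practices: iterable of (pip, mean_summary_score) pairs."""
    tp = fp = tn = fn = 0
    for pip, score in practices:
        predicted_good = pip >= pip_threshold
        actually_good = score > reference_score
        if predicted_good and actually_good:
            tp += 1
        elif predicted_good:
            fp += 1
        elif actually_good:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        # Share of above-threshold practices that have good outcomes.
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
    }


if __name__ == "__main__":
    hypothetical = [
        (30, 44.0), (42, 47.5), (50, 45.0), (58, 46.5), (63, 45.5),
        (66, 48.0), (70, 49.0), (75, 47.0), (82, 45.8), (90, 50.0),
    ]
    # Sweeping thresholds traces out the points of an ROC curve.
    for cut in (45, 55, 65, 75):
        print(cut, threshold_metrics(hypothetical, cut, reference_score=46.0))
```

Raising the cut point generally trades sensitivity for specificity, which is why a threshold chosen for high specificity will leave some well-performing practices below it.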
Modeling the Effect of the Threshold
Unadjusted multilevel linear regression models were used to estimate the mean differences in each patient-reported outcome in high- versus low-integration practices with 95% confidence intervals. Practice was included as a random intercept to account for similarities of patients within practices. Potential confounders included the number of qualifying diagnoses; age; gender; race coded as white versus nonwhite; ethnicity coded as Hispanic versus non-Hispanic; marital status coded as married or living as married versus widowed, separated, divorced, or single; education coded as college attendance versus high school or less; income <$30,000 per year; and employment coded as full-time, part-time, student, or homemaker versus retired, disabled, or unemployed. Each was added to the model as a fixed effect if it changed the coefficient of high- versus low-integration practices on the outcome by at least 10% in either direction37,38 relative to the unadjusted model and was associated with both the predictor and the outcome with P < .15.
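The confounder selection rule can be sketched as a simple decision function. It assumes that "changed the coefficient by 10%" means a relative change, in either direction, from the unadjusted estimate; the coefficients and P values in the example are hypothetical and would in practice come from fitted multilevel models.

```python
# Illustrative sketch only: change-in-estimate rule for confounder selection.
# ASSUMPTION: the 10% criterion is a relative change from the unadjusted
# coefficient. All numeric inputs below are hypothetical.

def relative_change(beta_unadjusted: float, beta_adjusted: float) -> float:
    """Absolute relative change in the coefficient after adding a covariate."""
    if beta_unadjusted == 0:
        raise ValueError("unadjusted coefficient must be nonzero")
    return abs(beta_adjusted - beta_unadjusted) / abs(beta_unadjusted)


def is_confounder(beta_unadjusted, beta_adjusted,
                  p_with_predictor, p_with_outcome,
                  min_change=0.10, max_p=0.15):
    """Apply both criteria: coefficient shift and association P values."""
    return (
        relative_change(beta_unadjusted, beta_adjusted) >= min_change
        and p_with_predictor < max_p
        and p_with_outcome < max_p
    )


if __name__ == "__main__":
    # Hypothetical covariate: shifts the estimate from 2.7 to 2.2 points.
    print(is_confounder(2.7, 2.2, p_with_predictor=0.01, p_with_outcome=0.03))
```

Requiring both criteria keeps covariates that merely correlate with the outcome, but do not move the estimate of interest, out of the adjusted model.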
The threshold for statistical significance was set at 5%. Analyses were performed in Stata 18.0 (StataCorp LP, College Station, TX, USA). Institutional Review Board approval was obtained before data collection.
Results
Both the PROMIS-29 Mental Health and Physical Health Summary scores tended to increase with greater integration (Figure 1). Visual inspection showed that both plots had a point near a Total PIP score of 65 that best distinguished low- from high-integration practices. Fifteen of 44 participating primary care practices (34%) scored 65 or better at baseline (high-integration practices) and cared for 31% of the patients. Of the high-integration practices, 93% had average Mental Health Summary scores above 46 and Physical Health Summary scores above 43, indicating that a PIP ≥ 65 had high specificity for practices with good outcomes. High- and low-integration practices were similar in all recorded practice-level characteristics except for PIP scores (Table 1). However, patients in high-integration practices were significantly older and more likely to be male, white, non-Hispanic, married, and employed, and had more education, higher incomes, and fewer chronic medical and behavioral problems (Table 2).
Table 1. Practice Characteristics
Table 2. Patient Characteristics
Practices that had Total PIP scores of 65 or higher at baseline had better unadjusted patient outcomes at baseline (Table 3). For instance, among the more integrated sites, the PROMIS-29 Physical Health Summary was 2.7 points higher at baseline (P = .001) and the Mental Health Summary was 2.5 points higher (P = .001). Beneficial effects were seen for all PROMIS-29 domains, clinician empathy, functional capacity, depression, anxiety, emergency department visits, and patient centeredness. Health care utilization and restricted activity days showed better outcomes in the high-integration practices but did not reach statistical significance. After adjustment for patient and neighborhood characteristics, all the effects remained beneficial but were smaller. Pain intensity, patient centeredness, and empathy remained statistically significant (Figure 2).
Figure 2. Effect of Practice Integration Profile (PIP) ≥ 65 at baseline.
Table 3. Effect of High Baseline Integration on Patient-Reported Outcomes
A similar pattern was observed at follow-up 2 years later. Unadjusted differences favored the high-integration practices in every outcome. After adjustment for patient and neighborhood characteristics, the differences still favored high integration, but were smaller; social participation, physical function, the Physical Health Summary, patient centeredness, empathy, and the Duke Activity Status Index were statistically significant (Table 3).
Discussion
In adjusted analyses, patients in practices that have achieved a Total PIP score of at least 65 at baseline reported better functional and other outcomes for nearly all domains examined, with significant differences in pain intensity, patient centeredness, and empathy. Results were similar at follow-up 2 years later, with significant effects seen for social participation, physical function, Physical Health Summary, patient centeredness, empathy, and the Duke Activity Status Index. Because the adjusted models showed smaller effects than the unadjusted regressions, it seems likely that some, but not all, of the effects may be attributable to social and demographic differences.
PIP scores anywhere in the moderate range, from about 45 to 65, are associated with a similar level of functional status as measured by the PROMIS-29 (Figure 1). The mechanism of this plateau is uncertain, but it may suggest that behavioral health services require a “critical mass” of integration to be effective.
Better outcomes in more integrated practices were observed at the practice level even though not all the patients in each practice received direct care from a behavioral health clinician. Patients in practices with higher levels of behavioral health integration reported greater levels of empathy from their PCPs and a greater sense that their care was patient-centered. These aspects of the therapeutic relationship benefit patient outcomes.39,40 The close collaboration between medical and behavioral clinicians required for IBH may positively influence the ability of the team to provide patient-centered care41 even when the BHC is not directly involved. To be patient-centered, a practice must function well as a team and communicate with patients in a way that empowers the patient to ask questions and participate in treatment decisions.42,43 Integrated care involves the entire practice team of clinicians and staff working together with patients and families to provide patient-centered care.44 The association between integration and patient centeredness may be based on the interconnected definitions of these 2 constructs.
Although the relationship between integrated care and patient health, cost of care, and improved experience of care has been demonstrated,7,8,45 the present study is unique: the level of integration of each practice was assessed using a validated measure; it focused on the practice-level effect of integration, including patients who did and did not have contact with a BHC; and it addressed both physical and mental health outcomes. Prior research often focused on models serving narrow, specific populations, like the Collaborative Care Model.46 Our study, in contrast, takes a pragmatic approach to delivering multiple interventions to diverse populations within a practice, with broad impacts on practice culture, care delivery, and meaningful patient-centered outcomes across a broad range of subjects.
Measurement of integration on a continuous scale, as provided by the Total PIP score, is useful not just for research but for planning, designing, implementing, and monitoring integration. Goal setting with objective measures and feedback is critical for effective management.47 A valid, specific target – a Total PIP score of 65, for example – can support the implementation of IBH by health care leaders, managers, clinicians, and staff using consistent measurement to effect complex change.48–50
Limitations
The data we used may not generalize to all primary care practices and patients. The IBH-PC study enrolled only patients with multiple chronic medical and behavioral conditions; the results may not apply to patients with less complex presentations. All participating practices had at least some degree of IBH in that they had a BHC on site. It is unclear what effect, if any, stems from this minimum degree of IBH. All the practices were participating in a large-scale clinical trial designed to increase IBH, demonstrating commitment to this general model of care. Although they were similar to other practices around the nation in many measured respects, it is not clear if the effects of integration are similar in other settings.
Although the high- and low-integration practices were similar in most regards, their patients differed in many characteristics, such as markers of social and economic deprivation (Table 2). Therefore, we adjusted for these factors in multivariable regression. While the adjustment partially attenuated the effect of integration level, significant effects in several domains remained (Table 3). Additional unmeasured confounders may be unaccounted for.
We made many comparisons (19 at each time point), raising the possibility of false significance due to highlighting a small number of positive outcomes among a large number of tests. In this case, all 19 outcome measures were better in the high-integration practices, suggesting that random error is unlikely to explain the differences seen. If we were to apply the Bonferroni correction (dividing the nominal P of 0.05 by 19 = 0.00263)51, at baseline only empathy would achieve significance. We note that there are arguments against adjusting for multiple comparisons.52,53
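The Bonferroni arithmetic quoted above can be checked with a short sketch: the corrected significance level is the nominal level divided by the number of tests. The helper names and the P values in the example are hypothetical and are not the study's results.

```python
# Illustrative sketch only: Bonferroni-corrected significance level.
# The P values in the example are hypothetical.

def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values: dict, alpha: float, n_tests: int):
    """Return the names of outcomes whose P value is below the corrected level."""
    cutoff = bonferroni_alpha(alpha, n_tests)
    return [name for name, p in p_values.items() if p < cutoff]


if __name__ == "__main__":
    print(round(bonferroni_alpha(0.05, 19), 5))
    hypothetical_p = {"outcome_a": 0.001, "outcome_b": 0.03, "outcome_c": 0.004}
    print(significant_after_bonferroni(hypothetical_p, alpha=0.05, n_tests=19))
```

The correction is conservative when outcomes are correlated, as the domains of a single questionnaire usually are, which is one of the arguments raised against applying it.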
Baseline data were collected before the COVID-19 pandemic whereas follow-up data were collected until December 2020, when clinical care was significantly disrupted. This may have influenced the follow-up results, but not the baseline cross-sectional analyses.
The study used PIP 1.0. Since then, version 2 has been released (www.practiceintegrationprofile.com). Although it attempts to measure the same underlying constructs, many of the items were revised.54 However, it has not yet been validated or used in a population with functional status measures as required for the analyses presented here. It is not clear how closely the 2 versions correlate or what an equivalent threshold score might be.
The threshold score of 65 was derived from the same study that was used to evaluate it. Although the threshold works well with many measures beyond those used to derive it, including those collected 2 years later, it is not clear how it will perform in fully independent populations.
Conclusions
Patients in primary care practices with Total PIP scores of 65 or more reported better functional status, with significant differences in pain intensity, patient centeredness, empathy, social participation, physical function, the Physical Health Summary, and the Duke Activity Status Index. Some, but not all, of this effect may be attributable to differing patient characteristics. A measurable target for Integrated Behavioral Health, such as a Total PIP score ≥ 65, provides a focus for practice leadership and guidance on the time and resources needed to achieve a level of behavioral health service integration associated with desired outcomes for patients. This analysis provides further evidence of the broad, beneficial impacts of integrating behavioral health and primary care services. Future replication of this study with independent data sets will help confirm or disconfirm the relationship between PIP and patient outcomes reported here.
Notes
This article was externally peer reviewed.
Funding: This work was funded through a Patient-Centered Outcomes Research Institute (PCORI) Award (PCS-1409-24372). The views, statements, and opinions presented in this report are solely the responsibility of the authors and do not necessarily represent the views of PCORI, its Board of Governors or Methodology Committee. PCORI is an independent, nonprofit organization authorized by Congress in 2010. Its mission is to fund research that will provide patients, their caregivers, and clinicians with the evidence-based information needed to make better-informed healthcare decisions. PCORI is committed to continually seeking input from a broad range of stakeholders to guide its work.
Conflict of interest: The authors have no competing or conflicting interests to declare.
- Received for publication February 8, 2025.
- Revision received April 23, 2025.
- Accepted for publication May 5, 2025.








