Abstract
Background: Educating physicians and other health care professionals about the identification and treatment of patients who drink more than recommended limits is an ongoing challenge.
Methods: An educational randomized controlled trial was conducted to test the ability of a stand-alone training simulation to improve the clinical skills of health care professionals in alcohol screening and intervention. The “virtual reality simulation” combined video, voice recognition, and nonbranching logic to create an interactive environment that allowed trainees to encounter complex social cues and realistic interpersonal exchanges. The simulation included 707 questions and statements and 1207 simulated patient responses.
Results: A sample of 102 health care professionals (10 physicians; 30 physician assistants or nurse practitioners; 36 medical students; 26 pharmacy, physician assistant, or nurse practitioner students) was randomly assigned to a no-training group (n = 51) or a computer-based virtual reality intervention (n = 51). Professionals in both groups had similar pretest standardized patient skill scores: 53.2 (experimental) vs 54.4 (control) for alcohol screening, 52.2 vs 53.7 for brief intervention, and 42.9 vs 43.5 for referral. After repeated practice with the simulation, the scores of the experimental group at 6 months after randomization were significantly higher than those of the control group for the screening (67.7 vs 58.1; P < .001) and brief intervention (58.3 vs 51.6; P < .04) scenarios.
Conclusions: The technology tested in this trial is the first virtual reality simulation to demonstrate an increase in the alcohol screening and brief intervention skills of health care professionals.
Training health care professionals to ask or talk with patients about substance use, exposure to interpersonal violence, sexual practices, and other sensitive topics is an ongoing challenge. Despite the critical importance of these areas, several studies have shown that physicians infrequently ask about these topics as part of routine care. For example, in a study of primary care practices, patients with alcohol dependence received the recommended quality of care, including assessment and referral to treatment, only approximately 10% of the time.1 Fiore et al2 reported population-based survey data showing that less than 15% of smokers who saw a physician during the previous year were offered assistance and only 3% received a follow-up appointment to address tobacco use. In a survey conducted by Elliot et al,3 10% of physicians reported screening for domestic violence and only 6% screened all of their patients. In a Canadian survey, more than 80% of physicians felt they had adequate or excellent medical training in assessing risk behaviors for heart disease and sexually transmitted diseases, but the proportion who felt this way about their training in screening for substance use disorders, family violence, and sexual abuse ranged from only 12.7% to 31.6%.4
Traditional educational methods used to increase the clinical skills of students and practitioners in these difficult clinical topics include lectures, case-based discussions, evidence-based journal reviews, role plays, videotape playback reviews, e-learning, and standardized patients.5 Bowman and colleagues6 and Olsen7 used simulated patients to improve physician performance in discussing the prevention of sexually transmitted diseases with patients and were able to demonstrate significant improvements in clinical practice. However, these methods have a significant limitation: they give learners little opportunity to repeatedly practice new clinical skills. Students need the opportunity to repeat the words in multiple clinical situations, to observe patients’ reactions to sensitive questions, to practice behavioral intervention statements over and over again, and to receive direct and immediate feedback.
Highly interactive role play simulations have been shown to improve training effectiveness and “boost learning retention rates dramatically.”8,9 Virtual reality simulations offer many potential advantages over traditional educational methods. These advantages include (1) allowing learners to practice the simulation multiple times; (2) ensuring that learners receive a different patient response to their questions and behavioral statements with each virtual reality play; (3) permitting the learner to verbally ask questions and conduct a brief intervention through what becomes a “real” clinician-patient interaction using voice-activated responses; (4) allowing the learner to play the simulation at any time and in any location (portability); (5) scoring the learner’s performance and providing immediate feedback, both from the patient and from a computerized “coach”; (6) offering a playback feature that can replay the clinician-patient interaction, repeating good interactions or trying purposeful mistakes; (7) providing educational screens that allow the learner to read about basic screening and counseling skills before or during the simulation; and (8) offering the learner course credit or continuing education credits by linking the simulation to an online connection.10
The goal of this project was to produce and test a self-contained, off-the-shelf virtual reality simulation system for health care professionals, helping them to improve their clinical skills in the areas of alcohol screening; brief alcohol interventions; and referral for at-risk, problem, and dependent drinkers. This article presents the results of an educational trial designed to test the ability of this system to improve the clinical skills of students and primary care clinicians. A successful demonstration of effect will lead to the development of additional “virtual reality” training simulations for other sensitive behavioral issues such as screening and intervention with underage drinkers; tobacco addiction; illicit drug use; nonprescription opioid abuse; interpersonal violence; sexual risk reduction; and suicide ideation.
Methods
Design
A randomized controlled educational trial was conducted to test the hypothesis of interest. One hundred two health care professionals (students and practitioners) were randomly assigned to the experimental virtual reality simulation program or to a no-education control group. The intervention was based on SIMmersion simulation technology (SIMmersion LLC, Columbia, MD).7,11 Professionals assigned to the intervention group were expected to read the educational materials and to practice the simulation on their personal computers at least 10 times during the 3-month study period. The primary outcomes of interest were changes in the clinical skills of the participants. Clinical skills were assessed using standardized patients. Each participant was tested with 3 different case scenarios at baseline and 6 months after randomization. The case scenarios were developed specifically for this study because there are currently no standard case scenarios that have been tested and validated for alcohol screening, brief intervention, and referral. Our scenarios built on prior work conducted by the principal investigator’s (MF) research group.12
Participant recruitment
Participants were initially recruited by email and invited to participate in an educational study focused on alcohol screening and brief intervention. Eligible participants included physicians, residents, nurse practitioners, physician assistants, nurse practitioner and physician assistant students, medical students, and pharmacy doctoral students at the University of Wisconsin—Madison. Fourteen percent of the students and clinicians (102 of 731) who were sent a blanket email participated in the trial. The primary groups were fourth-year medical students and primary care physicians. Participants who responded to the email invitation were contacted by telephone and screened to determine eligibility. Eligibility criteria included being 18 years of age or older; being currently enrolled in a professional training program or currently practicing medicine, nursing, or pharmacy; the ability to practice the required number of plays for the simulation; and availability to complete the 6-month posttest at the testing facility in Madison.
Participants who met these criteria were then scheduled to participate in the baseline standardized patient skills test. Written informed consent was obtained at the time of the baseline testing scenario. Randomization also occurred at this time. The study was approved by the University of Wisconsin—Madison Health Sciences Human Subjects Committee. Participants in both groups were paid $50 for completion of the pretest and $50 for the posttest. Intervention participants received an additional $10 for each play up to 10 plays. Eight participants who completed all aspects of the trial were randomly selected to receive an additional $500 for their participation at the end of the trial. The participants in the control group were given the virtual reality simulation program and voice recognition equipment at the end of the 6-month posttest.
Standardized patient testing scenarios and scoring
The testing scenarios were developed by the study research team. Three standardized cases (screening, intervention, and referral) and rating methods were used for the pretest, and a separate set of 3 cases was used for the posttest. The scoring items for each of the 6 scenarios described below are listed in Tables 1, 2, and 3.
Scores for the Screening Scenario Skills
Scores for the Brief Intervention Scenario Skills
Scores for the Referral Scenario Skills
The pretest screening case involved a 43-year-old salesman who presented to a new physician for a blood pressure medication refill and evaluation of hypertension. He had a stressful job in a new location, had approximately 4 drinks most nights after work, and occasionally had up to 6 to 8 drinks. He was annoyed by his wife’s concern about his drinking and had tried to cut down but was not successful.
The pretest brief intervention case was of a 48-year-old single, female financial advisor who presented to a new physician for trouble sleeping; she awakened often each night. She had 3 to 4 drinks each night, recently missed a crucial morning business appointment, and sprained her ankle when drinking. She had been previously diagnosed as a problem drinker but did not believe it.
The pretest referral case was of a 48-year-old divorced woman who had been a licensed practical nurse for 30 years. She presented to her physician for a refill of a diazepam prescription to help her relax and sleep. She had 5 to 6 drinks every night, her father and brother were alcoholics, and she had been irritable at work. She did not believe she was an alcoholic and would not go to Alcoholics Anonymous. She was a dependent drinker.
The posttest screening case was of a 40-year-old man who worked as a salesman for a manufacturing company and who presented to his physician to have stitches removed from an injury suffered during a bar fight. He had approximately 4 drinks most nights after work and 6 to 8 on an occasional heavy night, was annoyed by his wife’s concern about his drinking, and had hangovers several times each month.
The posttest brief intervention case was of a 48-year-old woman with 2 married children and 3 small grandchildren. Her marriage was failing and she did volunteer work for several organizations. She had 3 to 4 drinks most nights and her children were concerned about her drinking and were reluctant to have her babysit her grandchildren on weekends. She had tried to cut down and her father and brother were alcoholics. She was a problem drinker.
The posttest referral case involved a 47-year-old woman who owned her own business and presented to her physician for increasing fatigue and trouble sleeping. She had 3 to 5 drinks per night and lived with a female partner who was concerned about her drinking. Her father was an alcoholic and she had tried to cut down but was unsuccessful. She was a dependent drinker.
Training of standardized patients
The University of Wisconsin School of Medicine and Public Health Clinical Teaching and Assessment Center (CTAC) was formally established in 1994 and has an active standardized patient (SP) program for research, teaching, and clinical assessment. SPs for this project were drawn from a pool of 90 persons who had previously participated as SPs; they were selected based on longevity of experience, interest in the project, time availability, and background in a health care-related field. Nine SPs were used for the pretest and 10 for the posttest. The SPs included 6 women and 4 men. Their ages ranged from 38 to 60 years, and they had been with the SP program for up to 4 years.
The CTAC director (J.B.) was the primary trainer who taught the SPs to portray the cases and score the clinical skills of the subjects. Each SP participated in two 2-hour training sessions that focused on practicing their assigned role. The CTAC director provided feedback and comments to every SP, and each was expected to portray one case scenario for the pretest and one for the posttest. SPs were also asked to attend several meetings with the researchers to assist in script development as well as to refine checklist items so they were descriptive, clear, and intuitive to the SPs. The research team created detailed scripts and directions for the checklist completion based on these discussions. The close training with and collaboration between the SPs and members of the research team resulted in an open atmosphere in which SPs’ questions could be readily asked and addressed.
A pilot test of the cases, role plays, checklists, technology, and logistics took place in September 2007 with 3 participants and 8 SPs either portraying roles or viewing the sessions. None of the 3 participants involved in the pilot participated in the trial. Further modifications of the scripts and scoring items were facilitated by immediate postpilot discussion and videotape review involving the SPs and the research team members who observed the pilot sessions.
Testing procedure
Ten sessions were used to conduct the pretest standardized patient scenarios during a 6-week period in November and December 2007, with 8 to 12 participants tested during each session. Each of the 102 participants interviewed 3 SPs. The participants were given 5 minutes to read an abbreviated medical record on the outside of the door before entering the examination room and interviewing the SP. Each research participant had 15 minutes to interview and/or counsel the SP. The SP then had 5 minutes to score the checklist before the next clinician entered the room. Research participants completed the pretest in 60 minutes. The posttest sessions occurred during April and May 2008 using similar testing methodology. All subject-SP sessions were videotaped, and a 20% sample of the videotaped sessions was reviewed by a panel of research team members to determine the validity of the SP scoring. The reviews found strong agreement between the ratings of the SPs and the research team; the degree of agreement (kappa) for the 20% review was 0.95.
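The kappa statistic corrects observed agreement for agreement expected by chance, $\kappa = (p_o - p_e)/(1 - p_e)$, where $p_o$ is the observed proportion of agreement between the SP and panel ratings and $p_e$ is the proportion of agreement expected by chance; values above 0.80 are conventionally interpreted as very strong agreement.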
Virtual reality simulation
The virtual reality simulation software used for this study was developed by Dr. Dale Olsen at the Johns Hopkins University Applied Physics Laboratory. The stand-alone software modified for this study consisted of 6 elements. The first element was an educational program that could be read by the participant to provide basic background on alcohol screening and intervention. This was based on the National Institute on Alcohol Abuse and Alcoholism’s 2005 clinician’s guide and consisted of 20 screens of text. The second element consisted of the 707 questions and statements learners could use with the simulated patient to conduct screening, counseling, or referral to a treatment center. The third element was a set of 1207 responses developed for the simulated patient to respond to the learner (an actress recorded these responses on video with varied mood and affect in a production studio during a 5-day period). An algorithm embedded in the program selected responses based on the type and appropriateness of the question or statement made by the learner, as well as the history of the conversation between the character and the learner. Negative or inappropriate questions or statements were linked to audio and visual anger or mood changes in the simulated patient.
The fourth element was an on-screen help agent, an action figure in the corner of the screen that intermittently displayed hand and body signals to indicate especially good or not-so-good questions posed by the learner. The fifth element was an instant replay feature. The sixth element was scoring and feedback to the learner about their performance during the play. The basic computer screen for the simulation is presented in Figure 1. For this simulation the program was designed so that 40% of the time the character in the simulation would be an at-risk drinker, 40% of the time a problem or dependent drinker, and 20% of the time a low-risk drinker (based on criteria developed by the National Institute on Alcohol Abuse and Alcoholism).
The basic computer screen for the simulation.
Participants were asked to conduct face-to-face alcohol screenings, brief interventions, and referrals with the simulated character, communicating through a microphone or a computer mouse. The questions and statements were scripted to include a variety of natural choices. For each scripted question or statement there were multiple simulated character responses available. The simulated character’s “brain” selected a response based on the level of rapport developed by the physician, the character’s risk level, previously discussed information, and chance. For example, if the physician selected a series of inappropriate options, the simulated character would become curt and uncooperative; if, however, the physician selected a series of appropriate options, the character would become friendly and forthcoming. This realistic emotional variation allowed the simulated character to emulate actual human behavior.
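This response-selection logic can be illustrated with a brief sketch. The Python fragment below is a simplified illustration rather than the SIMmersion implementation; the class, the numeric rapport scale, and the mood thresholds are hypothetical, but the 40/40/20 risk mix and the idea of conditioning each response on rapport, risk level, conversation history, and chance follow the description above.

```python
import random

# Drinker risk levels and assignment probabilities (40/40/20, per the study design).
RISK_LEVELS = ["at-risk", "problem or dependent", "low-risk"]
RISK_WEIGHTS = [0.40, 0.40, 0.20]


class SimulatedPatient:
    """Toy stand-in for the simulated character's response logic (illustrative only)."""

    def __init__(self):
        self.risk_level = random.choices(RISK_LEVELS, weights=RISK_WEIGHTS, k=1)[0]
        self.rapport = 0      # rises with appropriate questions, falls with inappropriate ones
        self.history = []     # topics already discussed in this conversation

    def respond(self, question, appropriate, topic):
        """Pick a response conditioned on rapport, risk level, history, and chance."""
        self.rapport += 1 if appropriate else -2
        repeated = topic in self.history
        self.history.append(topic)

        if self.rapport < 0:
            mood = "curt and uncooperative"
        elif self.rapport > 2:
            mood = "friendly and forthcoming"
        else:
            mood = "neutral"

        # In the actual simulation this step would select one of the 1207 recorded
        # video responses; here we simply describe the kind of response that would play.
        variant = random.randint(1, 3)  # the chance element
        return (f"[{mood}] {self.risk_level} drinker response to '{topic}' "
                f"({'repeated topic' if repeated else 'new topic'}, variant {variant})")


# Example: two turns of a screening conversation.
patient = SimulatedPatient()
print(patient.respond("How many drinks do you have on a typical day?", True, "quantity"))
print(patient.respond("You really should cut that out.", False, "judgmental remark"))
```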
Participants received feedback from an on-screen help agent who provided nonverbal cues regarding the user’s choice of questions. In addition, the participant could click help buttons for assistance with question choices and character responses. The system scored the participant’s performance, and an instant replay feature enabled users to review portions of dialogue or their entire conversation. At the end of the conversation, the participant had to decide which type of drinker (randomly selected by the program) they had been talking to and was scored on the accuracy of that decision.
Each participant was expected to screen the patient for alcohol use and decide whether the patient required a brief intervention and a referral for treatment. Because each of the 3 patient types would require varying amounts of time, a completed play was based on the number of statements and questions used by the participant, with 10 statements set as the minimum requirement for one play. The guidelines could be accessed at any time during the play as a reference for the participant.
Plays were tracked using SIMmersion’s (SIMmersion, LLC) online tracking system, with the goal that each experimental group participant would play at least 10 times in the 3 to 4 months before the final posttest. After pilot testing by the expert panel, there was general consensus that 10 plays was the minimum number needed to take advantage of the various patient scenarios and responses built into the simulation. The experimental group participants were contacted by research staff once during the practice period to ensure they were using the simulation software correctly and to help resolve any issues.
The questions and simulated patient responses were written and developed by an expert panel at the University of Wisconsin. The expert panel consisted of 14 University of Wisconsin—Madison primary care clinicians, addiction medicine physicians, psychologists, and members of the SIMmersion team with significant expertise in substance abuse. Members of the panel also played and pilot-tested a number of versions of the virtual reality simulation as it was developed. The simulation took approximately 9 months to develop before testing.
Statistical analysis
The demographic variables for the experimental and control groups were described by way of frequencies (%). Univariate analysis was used to assess potential differences between groups in sex, age, clinician versus student status, and prior alcohol training. As noted in Table 4, there were no significant differences between groups on these 4 variables.
Sociodemographic Characteristics of Study Population
The primary outcomes were the intervention, screening, and referral scores from the SP scenarios. As noted in Tables 1, 2, and 3, the clinical skills of the participants were assessed using a set of clinical criteria developed for the screening, brief intervention, and referral scenarios. Eighty-five points were allocated to 17 specific skills criteria, with 5 points for each. The SPs were instructed to score each skill as a simple yes (the learner demonstrated the skill) or no (the learner did not demonstrate the skill). Participants received 5 points or 0 points for each skill; no partial credit was given. Fifteen points were used to rate the clinician’s overall performance. These scores (0 to 100) were aggregated to create a total score for each scenario and were described with means and standard deviations.
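As a worked example of this scoring scheme (a minimal sketch with hypothetical variable names, not the scoring instrument itself), the following computes a scenario score from the 17 yes/no skill items at 5 points each plus an overall-performance rating worth up to 15 points.

```python
def scenario_score(skill_items, overall_rating):
    """Compute a 0-100 scenario score: 17 yes/no skill criteria worth 5 points
    each (no partial credit) plus an overall-performance rating of 0-15 points."""
    assert len(skill_items) == 17, "each scenario is scored on 17 skill criteria"
    assert 0 <= overall_rating <= 15
    skills_points = 5 * sum(1 for demonstrated in skill_items if demonstrated)
    return skills_points + overall_rating


# Example: a learner who demonstrates 10 of the 17 skills and earns 12 of the
# 15 overall-performance points scores 10 * 5 + 12 = 62 out of 100.
demonstrated = [True] * 10 + [False] * 7
print(scenario_score(demonstrated, 12))  # 62
```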
Ninety-one of 102 participants completed the 6-month posttest. The primary reasons the 11 participants did not complete the posttest included scheduling conflicts with patient care and course work, relocation, and illness. An intention-to-treat analysis was followed. For the 11 participants who did not complete the posttest, baseline scores were imputed in place of the missing posttest scores; we elected to use this most conservative method to handle missing follow-up data. All 102 participants originally randomized into the trial were included in the outcome analysis.
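A minimal sketch of this baseline-carried-forward imputation, assuming the scores are held in a pandas DataFrame with one row per participant (the column names and values are hypothetical):

```python
import pandas as pd

# Hypothetical scores: one row per participant; participant 2 missed the posttest.
df = pd.DataFrame({
    "participant_id": [1, 2, 3],
    "pretest_screening": [53.0, 55.0, 52.0],
    "posttest_screening": [68.0, None, 57.0],
})

# Intention-to-treat handling: carry the baseline score forward in place of any
# missing posttest score, the conservative approach described above.
df["posttest_screening"] = df["posttest_screening"].fillna(df["pretest_screening"])
print(df)
```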
The mean values for the experimental and control groups on the posttest were compared with t tests to derive the effect size of the educational trial on the 3 scenarios. The t tests were executed separately for all items on the intervention, referral, and screening scales. The sample was too small to assess a dose-response effect between the number of plays and changes in clinical behavior. We stratified the results on each of the covariates in Table 4 and found no statistical association between these variables and the primary outcomes. The analyses were performed with SAS software (version 9.1 for Linux; SAS Institute, Inc., Cary, NC).
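The between-group comparison is a standard two-sample t test. The original analyses were run in SAS; the sketch below shows an equivalent comparison in Python with SciPy, using hypothetical posttest scores.

```python
from scipy import stats

# Hypothetical posttest screening scores for the two groups.
experimental = [68, 72, 65, 70, 66, 71, 69, 64]
control = [58, 60, 55, 59, 57, 61, 56, 60]

t_stat, p_value = stats.ttest_ind(experimental, control)
print(f"t = {t_stat:.2f}, P = {p_value:.4f}")
```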
Results
Table 4 provides a general description of the 102 trial participants. As noted previously, the majority of the research participants were medical students and other health care professional students. Ten primary care physicians (4 family physicians, 4 internal medicine physicians, and 2 pediatricians) participated in addition to 30 nurse practitioners and physician assistants. As in most health care professions, the majority of participants were women. Only 5% reported previous training in alcohol screening and brief intervention.
Table 5 presents the primary results of the trial. The scoring was set up to range between 0 and 100 points for each case. Baseline scores were similar for each group across all 3 scenarios. The posttest scores demonstrated significant differences between groups for alcohol screening (P < .001) and brief intervention skills (P < .04). The screening skills in the control group increased by 3.7% whereas the skills in the experimental group improved by 14.4% during the 6-month period. The number of participants who inquired about the frequency of alcohol use increased from 40 to 51 in the experimental group, with essentially no change in the control group (42 to 43). The brief intervention skills decreased by 2.1% in the control group and increased by 5.7% in the experimental group. Although there were significant changes in referral skills on the pre- and posttests, there were no differences between groups.
Scores for the Screening, Brief Intervention, and Referral Testing Scenarios
Tables 1 through 3 present the scores and differences for the individual items contained in each of the scenarios. Table 1 illustrates each individual screening item on which participants were scored. The items with significant changes between pre- and posttest included asking about the quantity of alcohol use, the frequency of alcohol use, and the frequency of heavy drinking. Participants also asked more often about alcohol-related injuries and prior treatment. Table 2 lists the items scored for the brief intervention scenario. Items with significant changes focused on the cons of drinking and readiness to change. As shown in Table 3, there were some minor differences between the control and intervention groups in the frequency of referral to Alcoholics Anonymous and making a follow-up appointment.
Discussion
The technology tested here is in a true sense a “virtual reality teaching method” that allows a learner to engage in a “real doctor-patient interaction.” The algorithm on which the technology is based allows for an unlimited number of variations in the interaction, so every time the learner plays the program the situation is novel and different. Because of the large number of potential responses to questions and counseling statements, the patient can respond in a nearly limitless number of ways. There is also realistic emotional variation in the patient response that allows the patient to react to the type and sequence of questions and statements. These emotional variations include negative or positive facial expressions, nonverbal body postures, negative vocalizations, changes in eye contact, and expressions of affect that change the atmosphere of the interaction. The simulation attempted to overcome concerns about the “reality” of patient simulations,13 allowing learners to “practice” on fake patients before applying these new skills to real patients, for whom patient safety is a concern.14
The 14 members of the expert panel who practiced and played the program reported that the interview felt like entering an office examination room and dealing with a real patient. Similar unsolicited comments were made by the students and physicians in the intervention group. The program made learning fun and interactive, especially for a topic that is difficult and sensitive. For a topic like substance abuse, where learners often roll their eyes and become somnolent when being taught, this technology represents a refreshing, innovative way to teach and enhance clinical skills for a variety of health care professionals.
The most robust finding of the study was an improvement in alcohol screening skills. This finding may be related to the nature of the simulation, which began with screening and then moved into brief intervention and referral. With a limited number of plays (minimum of 10), learners may not have spent sufficient time practicing brief intervention and referral skills. Although we were not able to test the dose effect of additional plays, future research may want to focus on the brief intervention and referral skills training portions of the simulation.
How can this virtual reality program be used to improve the knowledge and skills of health care professionals? First, the program could function as part of a core curriculum for teaching medical, physician assistant, nursing, and pharmacy students about how to conduct alcohol screening and intervention. Other parts of the curriculum could include case discussions, e-learning sites, an evidenced-based review of clinical protocols, and SP testing. The program could serve as the platform for a more comprehensive program that could include rotations on consult services, use of these skills on clinical rotations with feedback by faculty supervisors, supervised assessments in addiction medicine programs, as well as opportunities to practice these new skills in community-based prevention and treatment programs.
Second, because the program is designed as a stand-alone resource, the protocol could be used for continuing medical education (CME) for practicing physicians. In a review of studies on formal CME programs, Davis et al15 presented evidence that interactive CME sessions that enhance participant activity and provide the opportunity to practice skills can effect change in professional practice and, on occasion, health care outcomes. Although the program is easy to install and learn how to use, a brief orientation by local information technology staff may be helpful in overcoming challenges many clinicians have with new computer technology. Third, for health care systems or specific groups such as hospital-based trauma surgeons, the program could be used to meet the requirements for national certifying organizations like the Joint Commission or American Trauma Society.
The strengths of this study include random assignment to a control group, a large, diverse sample of learners, state-of-the-art measurement of clinical skills, intention-to-treat procedures, and collection of posttest follow-up information from 90% of participants. Weaknesses of the study included challenges in measuring changes in clinical behavior skills. There are no standard methods to measure changes in alcohol screening and intervention skills, and we had to develop our own SP scenarios and scoring methods. Although we pilot tested the scenarios and used the research team to review 20% of the tapes to ensure consistency, scoring remained a challenge. The screening and brief intervention scenarios seemed to work well, with large positive outcomes in the intervention group. The large change in the referral scenarios for both groups suggests a problem with the posttest scenario that needs further assessment.
Another potential limitation is the generalizability of a volunteer sample compared with a general sample of learners. On the one hand, one could argue that paid volunteers are more motivated or are more likely to change their clinical skills. However, in this case we make the opposite argument. A student or clinician who is required to play the simulation is more likely to learn the material and get a higher score on the posttest than a volunteer. Our volunteers did not receive a grade or feel any pressure to pass the posttest or to learn how to screen and conduct brief interventions. Grades and a passing requirement are powerful motivators to perform. Another argument is the observation that the pretest baseline standardized patient scores on alcohol intervention skills are likely to be lower in a general sample than in a volunteer sample. Most of our volunteers were interested in learning more about alcohol intervention and had fairly high baseline pretest scores. This created a ceiling effect that limited our ability to demonstrate change. Based on these arguments, we expect that clinicians and students who are required to play the simulation and pass a certain level of proficiency would improve their skills at least as much as our volunteer sample did.
Conclusion
The technology successfully tested in this study offers great promise. Virtual reality, game-based teaching methods are ideally suited to increasing the behavioral skills of health care clinicians. It is difficult to ask patients about personal issues (substance use, sexual practices, violence, depression) when they have made an appointment for a blood pressure check or a headache. Although patients appreciate concern and caring from their clinicians, they also expect personal questions to be asked with skill, empathy, and confidentiality. Clinicians who do not have the skills to inquire or talk about these sensitive topics often generate fear and resistance on the part of the patient. Patients’ reactions to being asked about these topics can be strong and in some cases carry the risk of harm. Virtual reality simulation offers learners the opportunity to practice and develop skills before trying to apply these skills with real patients.
Acknowledgments
Expert Panel: Bhushan Bhamb, MD; Randall Brown, MD; Richard Brown, MD, MPH; Jane Crone, NP; Tanya Jagodzinski, MD; Patricia Kokotailo, MD; Amy Miller, NP; Linda Roberts, PhD; Sharon Woodford, NP; and Aleksandra Zgierska, MD, PhD, from the University of Wisconsin; and Jason Kilmer, PhD, from Evergreen University and University of Washington.
Standardized patients: Catherine Antczak, Steven Clark, Jeanne Harris, Richard Kreklow, Karyn McCann, Rob Rivard, Joyce Schwert, Kim Stalker-Herron, Deborah Sutinen, and Dave Verban.
Production team: Zachary Barrier, Henry Dewitt, and Peter Roca, Software Development; Clay Hopper, Director; Sean Kobrin and Mark Smith, Audio and Video Producers; Julie-Ann Elliott, Actress; Elizabeth H. Richards, Female Voiceover; Michael Mortenson, Male Voiceover.
Notes
This article was externally peer reviewed.
Funding: National Institute on Alcohol Abuse and Alcoholism, grant number 1R42 AA016486-01.
Conflict of interest: none declared.
- Received for publication October 3, 2008.
- Revision received January 30, 2009.
- Accepted for publication February 3, 2009.