Abstract
Background: Artificial intelligence (AI) implementation in primary care is limited. Those set to be most impacted by AI technology in this setting should guide its application. We organized a national deliberative dialogue with primary care stakeholders from across Canada to explore how they thought AI should be applied in primary care.
Methods: We conducted 12 virtual deliberative dialogues with participants from 8 Canadian provinces to identify shared priorities for applying AI in primary care. Dialogue data were thematically analyzed using interpretive description approaches.
Results: Participants thought that AI should first be applied to documentation, practice operations, and triage tasks, in hopes of improving efficiency while maintaining person-centered delivery, relationships, and access. They viewed complex AI-driven clinical decision support and proactive care tools as impactful but recognized potential risks. Appropriate training and implementation support were the most important external enablers of safe, effective, and patient-centered use of AI in primary care settings.
Interpretation: Our findings offer an agenda for the future application of AI in primary care grounded in the shared values of patients and providers. We propose that, from conception, AI developers work with primary care stakeholders as codesign partners, developing tools that respond to shared priorities.
Introduction
Artificial intelligence (AI) is a field of computer science concerned with enabling computers to mimic human cognitive functions, such as advanced learning, problem-solving, and creativity.1 Over the past decade, advances in methods and computing power have made it possible to organize and interpret large amounts of data and, from that data, reveal patterns that accurately predict complex, nonspecific outcomes and human behaviours.2–4 This has spurred a flurry of innovation in medicine. Clinical applications of AI are most advanced in image- and signal-intensive disciplines, including radiology, dermatology, and critical care, where the performance of AI algorithms in many tasks now meets or exceeds that of individual clinicians.5
Primary care supports the health of all members of society and is primed to realize the benefits of AI on a broad scale. Primary care electronic health records (EHRs) contain longitudinal data that span diseases, care settings, socioeconomic circumstances, and life experiences. Applications of AI to these and other linked data (eg, from wearable devices6) can enable proactive care,7–9 clinical decision support (CDS),10–12 and triage.13 Automation of documentation or practice operations tasks could reduce physician burnout and reclaim time spent with patients.14 With primary care AI in its infancy, these applications remain unrealized on a broad scale.15,16
Examples of primary care-targeted AI applications under development in the public sector include automating checks on clinical decisions in real time against chronic disease guidelines, detecting signs of dementia, and predicting outcomes such as nonelective hospitalizations.14,17,18 Some American private health systems and EHRs have deployed proprietary tools for risk stratification and CDS, although the extent to which underlying algorithms have been externally validated, assessed for risk of bias, or developed in partnership with end users is unclear.14 Algorithm bias (ie, where algorithms developed using nonrepresentative datasets produce outputs biased against structurally vulnerable groups19) can worsen health equity, as has been demonstrated in some proprietary tools.20–22
In 2019, only 1.1% of Canadian physicians in any discipline reported using AI tools in patient care.23 A 2018 survey of UK general practitioners (GPs) suggests that uptake will be highest for AI tools that support tasks less dependent on empathy and communication, such as documentation.24,25 The views of primary care patients have not been explored, although 1 small Canadian study found that public attitudes toward the use of health data for AI research were conditionally positive.26 Only a handful of educational resources exist to help primary care stakeholders engage in this emerging area.27,28
There is opportunity to potentiate the benefit of AI in primary care by engaging key stakeholders in guiding its application. In service of this aim, we asked patients, primary care providers, and health system payers from across Canada about how AI should be applied in primary care and asked them to identify shared priorities.
Methods
Setting
This study was conducted in Canada, where more than 85% of citizens over 12 years old have a primary care provider, most often a family physician or a nurse practitioner.29
Participants
We used purposive maximum variation sampling to recruit a socioeconomically diverse group of patients and interprofessional providers from across Canada (see Online Appendix 1 for details of sampling frames).30 Patients who (1) spoke English, (2) were aged 18 years or older, and (3) visited a primary care provider at least once within the last year were recruited through social media and patient advisory organizations. We aimed to attain sample variation with respect to age, gender, race and ethnicity, educational attainment, income, and province of residence. Providers or managerial office assistants working in a primary care setting at least 1 day per week were recruited through social media and Dr. Andrew Pinto’s (AP) professional networks. Variation was sought with respect to gender, race and ethnicity, provider type, country of health professions training, years in practice, practice size, and province of practice. We used a critical case frame for system leaders, inviting individuals in primary care, digital health, or health informatics roles at 5 provincial/territorial governments.30 We collected demographic information during consent interviews and adjusted recruitment strategies throughout registration to promote variation (see Table 1).
Design
This emergent qualitative study used deliberative dialogue, a participatory method initially developed to enhance deliberative democracy by gathering people affected by an issue to advise decision makers. The method has been adapted for agenda setting in a variety of contexts, including health system planning.31–33 In advance of the dialogue, participants are presented with evidence-based information related to the topic and 1 or more questions posed by the dialogue organizers. They are then gathered in 1 or more synchronous meetings to collaboratively ask questions and deliberate on approaches to address the topic at hand. Rather than draw on specialist knowledge as in Delphi methodologies,34 dialogue participants are encouraged to base their views and advice in their collective interpretation of the evidence presented, as well as their individual values and lived experiences.35,36 Consensus may or may not be a prespecified goal of the dialogue.35,36
We held 12 videoconference dialogues of 90 minutes each, split over 3 rounds between September 8 and October 15, 2020 (Figure 1). Patients and providers were invited to participate in 1 session in each of the 3 rounds of dialogue. System leaders were invited to join 1 session during the final round or provide feedback on preliminary findings. All participants were offered an honorarium of Can$40 for each session attended.
Dialogue guides (see Online Appendix 1) for rounds 2 and 3 were adapted in response to findings of the previous round. Patients, providers, and health system leaders were asked how they thought AI should be applied in primary care and to work together to develop shared priorities. We sought consensus on priority applications.35,36 Patients and providers met among their respective peer groups in the first round and in mixed groups during the second round. In the third round, they were joined by system leaders. All participants completed an online module written in plain language before participating (see Online Appendix 2). Module content was derived from peer-reviewed literature and included an overview of AI, technical definitions, examples of AI in medicine, potential applications in primary care, and ethical considerations. Before distribution, the module was reviewed by 4 content experts (TCYC, JG, CSG, ADP), 3 members of the public, and a patient partner. The study was approved by the research ethics boards of Unity Health Toronto and the University of Toronto on March 9th, 2020.
Data Collection and Analysis
We used Sittig and Singh’s37 model for studying health information technology (HIT) in complex adaptive health systems to develop facilitation guides (see Online Appendix 1) and for analysis. We adapted an 8-category framework from EIT Health and McKinsey & Company to conceptually organize AI applications.38 TLU, a master’s student, developed a detailed analysis plan that was reviewed by CSG, who holds a PhD in mixed-methods health technology research. CSG also trained TLU in facilitation techniques.
Each dialogue was recorded and transcribed verbatim, and observers (ACN or JM) recorded field notes in Microsoft Word. ACN is an experienced research coordinator and JM is an MD-PhD student. Both were experienced in qualitative methodologies and were trained by TLU on the specific methods and frameworks used for this study. The facilitator (TLU) met with the observers within 24 hours to rapidly analyze field notes using interpretive description approaches.39–42 Transcripts and field notes were coded (TLU and ACN) in Microsoft Word, focusing on priority AI applications, concerns, and normative views. Guided by the conceptual model,37 codes were grouped into concepts and then concepts were grouped into initial themes in Microsoft Excel. Themes identified by analysis of first-round data were member-checked with participants at the beginning of second-round sessions, with particular emphasis placed on clarifying priority applications. Desired functions and benefits of priority applications were based in explicit descriptions offered by participants when justifying their prioritization of a given application. Although thematic saturation (where no new insights arise from subsequently collected data) is not a necessary outcome of deliberative dialogue,41 we reached it for our key concepts after 9 sessions.43
In further support of validity, we achieved data triangulation through careful analysis of transcripts, field notes, interpretive notes, and participant activities (eg, in-session polling).44 Participants had 3 opportunities to member-check researcher interpretations of priorities and other themes during the study, including a member-checking survey at the study’s conclusion, to which 18 participants responded (see Appendix 1).
Results
Dialogues involved 48 individuals from 8 Canadian provinces, including 22 patients, 21 primary care providers (14 family physicians, 2 family medicine residents, 5 other professionals), and 5 health system leaders (Table 1). Seventy-seven percent of patients and 72% of providers attended at least 2 rounds of dialogues. Of the 53 people registered for the study, 2 providers (scheduling conflicts) and 3 patients (no reasons provided) withdrew without attending sessions. While participants were diverse across most dimensions, 91% of patients had some post-secondary education, and only 1 (4.5%) reported occasional income insecurity. Two patients self-identified as having professional backgrounds in information technology. Another patient was a heavily involved patient partner. Most physician participants reported an interest in health technology, and several had related professional roles. Five health system leaders participated, including a chief data officer at a hospital and several directors and data scientists from provincial health authorities. Two provided written feedback on preliminary findings only.
We identified 3 themes: (1) priority applications of AI in primary care, (2) impact of AI on primary care provider roles, and (3) considerations for provider training in AI (see Table 2). Shared values included health equity, patient-centered care, patient safety, accessibility, and care continuity. Patients and providers identified strikingly similar priority applications for AI and similar concerns about the impact of AI on care. Health equity, patient safety, and patient-centered care could be both supported and threatened by AI and thus materialized in discussions of priority applications and concerns. Most participants thought that AI should be applied in ways that actively reduce health inequities.
Priority Applications of AI in Primary Care
Patients and providers agreed that the highest priority applications of AI in primary care are to support clinical documentation, practice operations, and triage (see Table 3). The current state of technology in care drives provider burnout, challenges patient-centeredness, or limits access to care. Among other desired benefits, participants hoped that applying AI to these areas would create more time and cognitive freedom for providers to better manage medical or social complexity, coordinate care for patients facing barriers to access, and engage in patient-centered activities (eg, face time with patients). One physician collectively described these benefits as a shift from “full scope to optimum scope.”
Study participants also shared enthusiasm for AI-driven CDS and applications that enabled robust preventative or proactive care. They recognized the “high-reward” potential for such tools to enhance patient safety, outcomes, care quality, and health equity. However, considering unresolved issues of algorithm bias and the current dearth of evidence for AI safety and effectiveness, patients and providers perceived applications in these areas to pose a high immediate risk to patient safety. Weighing this perception with the prospective utility of all possible applications, participants assigned higher priority to “safer” areas that would address patients’ and providers’ most pressing unmet needs. One physician and all 3 system leaders who participated in dialogues were interested in applying AI to population-level datasets for discovery and health resource planning.
Participants believed that software interoperability, usability, and workflow integration must be addressed to fully realize the impact of AI in primary care. Patients and providers worried that triage tools, if designed and implemented without input from key stakeholders, could disrupt continuity or limit access to those unable to use technology (eg, people without computer access, some older people). Providers acknowledged that proactive and preventative care applications must be supported by workflow reconfigurations, appropriate compensatory models, and the development of evidence-based proactive interventions. Several patients and providers were interested in remote monitoring and self-management applications (Table 3). However, they were not shared priorities, given substantial concerns about overloading end users with unhelpful or unactionable information.
Impact of AI on Primary Care Provider Roles
While they were open to the potential for AI to alleviate the burdens of certain routine tasks, most patients and providers were unconvinced that AI will ever fully replace providers, especially in the context of clinical decision-making. This skepticism arose from the idea that the patient-provider relationship is intrinsically human and at once the defining feature and enabling mechanism of patient-centered primary care. This belief was widely held among participants. Several participants were particularly pessimistic that AI tools could fairly or comprehensively consider social and economic factors that impact care, believing this knowledge to develop through strong relationships built with individuals and communities over time. For these reasons, patients and providers agreed that the design, implementation, and use of AI should uphold or enhance the patient-provider relationship.
Considerations for Provider Training in AI
Several patients and most providers worried that future provider generations could fall victim to deskilling (ie, declining proficiency over time resulting from automation of tasks) if either the design of AI applications or AI training for health professionals does not preserve the core skills that promote patient safety or patient-centeredness. They were worried that this scenario could lead providers to rely on algorithm outputs more than their clinical reasoning skills or knowledge about a patient’s unique history, circumstances, goals, and preferences. Several providers speculated that such overreliance could compromise patient safety should AI malfunction, be biased, or be unavailable in certain settings.
Participants agreed that training must prepare providers to use AI safely and effectively while retaining core clinical skills. Providers and system leaders identified 3 priority areas for formative and continuing professional education: basic AI literacy, algorithm critical appraisal, and workflow integration. They suggested that curricula cultivate AI literacy and algorithm appraisal skills early in training. Further, trainees should receive hands-on exposure to AI tools only after demonstrating clinical competence in the relevant knowledge domain to prevent deskilling.
Physicians and 1 system leader thought that training for practicing professionals should be online and largely self-directed. Further, they believed that content should be responsive to the needs determined by different practice settings (eg, remote, urban) and care models (eg, walk-in, family health team). Some physicians and a system leader viewed training on specific AI software and integration within clinic workflows as important for preventing the burnout experienced during the rollout of EHRs. Several physicians and 1 nurse practitioner stressed the need to clarify medicolegal liability for AI-enabled decisions and update risk management guidance accordingly. One system leader suggested that training would also help “demystify” AI technology, promoting adoption and enabling fulsome provider participation in its design, implementation, and regulation.
Discussion
This study is the first to engage a national group of health care stakeholders in dialogue about the role of AI in shaping the future of any medical discipline and the first to substantively describe primary care patient views about AI. Patients and providers thought that AI should be first applied to support clinical documentation, practice operations, and triage, which would improve efficiency while maintaining patient-centered care. They viewed complex AI-driven CDS and proactive care tools as impactful but lower priorities because AI applied in these areas posed a higher perceived risk to patient safety, given present technical and external limitations. Training in AI was viewed as the most important external enabler of the safe, effective, and patient-centered use of AI in primary care settings.
While the applications that our participants prioritized align in general with previous research,14,15,45 our study offers a distinction between higher- and lower-priority applications from the perspectives of end users. Notably, the highest-priority areas identified in our study are not reflected in existing AI research. A scoping review found that, as of 2018, the field focused heavily on methods development for diagnostic and treatment CDS. Our findings suggest that AI researchers should focus on the practical day-to-day challenges patients and providers face.
Participants’ priorities were influenced by how they understood the intersections of patient and provider roles, workflow and communication requirements, and external forces (eg, medical education, limited evidence of AI safety). As anticipated by the Sittig and Singh framework, the successful deployment of AI will depend on the interaction of multiple social and technical factors. To anticipate these factors, developers and software designers should engage patients and providers as codesign partners at a project’s conception phase.46,47 Development teams should be interdisciplinary from the outset, including health information and implementation specialists, whose expertise will facilitate the translation of AI algorithms into routine clinical practice.
Compared with Blease et al’s24,25 2018 survey of UK GP attitudes toward AI tools, our participants were more open to AI as a supplement to clinical reasoning. However, they shared the view that AI technology is unlikely to obviate the provider role. Notably, patients and providers in our study took a strong normative stance that AI is not an adequate substitute for providers for any clinical function that leverages knowledge gained through the patient-provider relationship, nor tasks that rely heavily on empathy and communication, skills they considered innately human.
Content about data science, AI, electronic medical records, and other HIT in medical education curricula is limited.48,49 Our findings give credence to the gaps identified by other commentators and recent surveys of medical students.50–53 This study offers concrete suggestions for the timing, focus, and delivery of training in AI. This presents another opportunity for cocreation between developers, clinicians, patients, and educators.
This study has several strengths. First, findings are based in values considered fundamental to the discipline of primary care,54 which supports transferability of our findings to non-Canadian settings. Second, deliberative dialogue created a novel opportunity for patients and interprofessional providers to engage with one another and agree upon common perspectives. Finally, unlike previous work,24–26 study participants learned about AI technology before taking part, promoting an informed discussion.
This work also has limitations. Patients had higher educational attainment and lower levels of income insecurity than observed in the general population of Canada and other Organisation for Economic Co-operation and Development countries.55,56 In addition, our sample did not include any self-identifying Indigenous people, who may have unique views on health data use.57 Physician views were likely biased toward those who are already interested in AI technology. We were also unable to engage participants from the northern territories or the other Atlantic provinces, regions that face unique resource constraints. For these reasons, we cannot assume our findings are generalizable. Many participants felt additional dialogues were needed to thoroughly explore the study topic, which encouragingly suggests an appetite for more deliberative work that robustly engages other stakeholders and broader publics.
Our findings offer an agenda for applying AI in primary care grounded in the shared values of patients and providers. We propose a paradigm shift, where from the conception phase, AI developers work with interdisciplinary teams that engage primary care end users as codesign partners, developing AI-driven tools that respond to patients’ and providers’ most pressing unmet needs.
Acknowledgments
We are grateful to the study participants who made this research possible by sharing their time and rich perspectives. We thank Millie Upshaw, Sally Headrick, Monica Brands, and Jane Cooney for reviewing the online informational module used to support participants.
Appendix 1. Deliberative Dialogue Process and Guides
Appendix 1 Table 1. Participant Sampling Frames and Recruitment Strategies
Round 1 Patient Dialogue Guide
Round 1 Provider Dialogue Guide
Round 2 Dialogue Guide with Use Cases
Round 3 Topic Prioritization Survey and Dialogue Guide
Member-Checking Survey
Appendix 1 Table 2. Patient, Provider, and Health System Leader Demographics
Round 1 Patient Dialogue Guide
Opening and Presession Questionnaire Responses (10 m)
Review study aim and guidelines for respectful dialogue.
Before we begin the structured portion of the dialogue, we want to make sure that everyone is on the same page in terms of the topics discussed in the module. There were a few questions that came up in the questionnaire at the end of the informational module. I’ve addressed these in the slides.
Are there any other questions?
Here are some of the things some of you found interesting or important after reading the module. We can speak to these ideas throughout the discussion, and I will do my best to bring them up while facilitating.
Deliberative Dialogue (1 hour 15 m)
Uses and Applications
Considering what you learned from the module and your own experiences as patients, what areas of primary health care do we think can be improved with AI-based tools?
Are there any areas of primary health care where we think the use of AI-based tools is not a good idea?
As described in the module, developing AI applications for use in primary care depends on the availability of personal health information, like the clinical data stored in electronic medical records or administrative datasets that track your use of the health system.
○ How do you feel about the use of your personal health information to train AI tools that inform the care of others?
○ What would it take to make you feel comfortable that your personal health information is truly deidentified and couldn’t be linked back to you?
AI can be applied to health care datasets to predict more concrete things like the chance that a strange mole is skin cancer, or more abstract things like how likely someone is to be admitted to hospital in the next three months. The steps a provider can take for a skin cancer diagnosis are clear, but there isn’t one intervention to prevent something like a hospital admission or death, because these events are complex.
○ What kinds of expectations should we have about the steps a provider takes in response to a prediction or recommendation made by an AI-based prediction tool? Does it depend on the prediction?
○ How important is it for providers to be able to identify the factors that contribute to a prediction or recommendation made by an AI-based tool? Does it depend on the prediction?
We explored in the module an example of the negative impacts of a biased AI algorithm. Biased algorithms may result from poor quality datasets that reflect biases in our society or from the biased assumptions of people who develop AI applications. If an AI-based tool were used in your care, how important would it be for you to know the steps that were taken to ensure that its predictions were not biased against you?
○ How and when would you like those steps communicated to you?
Concluding Remarks and Participant Experience Survey (5 m)
Round 1 Provider Dialogue Guide
Opening and Presession Questionnaire Responses (10 m)
Review study aim and guidelines for respectful dialogue.
Before we begin the structured portion of the dialogue, we want to make sure that everyone is on the same page in terms of the topics discussed in the module. There were a few questions that came up in the questionnaire at the end of the informational module. I’ve addressed these in the slides.
Are there any other questions?
Deliberative Dialogue (1 hour 15 m)
Uses and Applications
Considering what you learned from the module and your own clinical experience, what tasks or processes in primary health care practice do you think can be improved with AI-based tools?
Are there any areas of primary health care where we think the use of AI-based tools is not a good idea?
How would your workflows have to change to allow the effective use of a given AI tool?
○ What changes in communication would you expect patients would need or want, if you started using these tools? Should they know you’re using AI?
Are there populations in your practice that you could better support with accurate predictions for certain outcomes?
What are your thoughts on predicting clinical outcomes (eg, 10-year mortality) versus patient-important outcomes?
Accuracy, Transparency, and Fairness
What evidence would you need to feel confident using an AI prediction tool to support clinical decision-making?
Is evidence less important for tools that support administrative tasks?
Would you want to double-check a flag before an AI assistant contacts a patient to schedule an appointment?
How important would it be for you to fully understand the factors an AI algorithm considers in generating a prediction?
How important would it be for you to know the steps that were taken to ensure that the predictions of an AI algorithm were not biased against your clinic population?
What safeguards need to be in place to ensure a tool doesn’t make health inequity worse?
System, Organizational, and Provider Readiness for AI
How prepared do you feel to use an AI tool and its predictions in your practice, today? What supports do you need to address gaps in your knowledge?
How ready do you feel your clinic is to implement AI tools? What about your institution? Your health system?
What barriers do you anticipate to making use of the outputs of AI prediction tools in your routine workflow?
Concluding Remarks and Participant Experience Survey (5 m)
Round 2 Dialogue Guide with Use Cases
Opening and Round 1 Summary Review (20 m)
Review study aim and guidelines for respectful dialogue.
Review summary of preliminary Round 1 findings, including patient and provider use cases of interest and concerns.
Deliberative Dialogue
Part 1 (20 m)
Let’s take some time to respond to the ideas participants shared in Round 1. What comes up for members of the group when reflecting on the summary?
Part 2 (40 m)
Let’s dig into a few example applications of AI in primary care. Our study team developed the following use cases based on the areas of application that participants in the first round of dialogues were interested in. They have been written in a way to provoke dialogue. We’ll vote on the use cases using the polling link posted in the chat:
Questions for use cases:
○ How would you want this tool to work for you?
○ What challenges do you anticipate might get in the way of this tool working optimally?
Use cases:
A large primary care organization that serves 36,000 people is using a new AI tool. It processes the usual health data found in EMRs and combines it with social data that the organization has collected on patients, like their housing, income and employment status, and their language preference. Clinic leadership has asked physicians to spend dedicated time each shift focusing on their top 10 patients who are at risk of getting sick.
An EMR vendor has developed a new AI-enabled tool. The vendor claims the tool can optimize patient appointment booking to help get those most in need seen first by their provider. The tool has a smartphone app that patients can use to book appointments and describe the reasons for the visit. The vendor offers clinics that use their EMR system free use of the tool for one year.
A primary care provider is testing a new AI tool in her clinic. The tool uses natural language processing and other AI methods to automatically create notes in the patient chart during a clinic visit. A microphone connected to the computer sits on the desk in the consult room. A patient who has never heard of this kind of tool enters the room for an appointment.
AI researchers at an American university developed an AI tool that uses deep learning to help primary care providers and patients manage chronic conditions. The tool analyzes data entered by patients in an app; it can also integrate with certain smart watch models. It sends alerts to providers when it detects something unusual. The researchers started a company to sell this tool and have approached several Canadian primary care clinics.
Concluding Remarks and Participant Experience Survey (5 m)
Round 3 Topic Prioritization Survey and Dialogue Guide
Method for Topic Prioritization Survey
Participants were asked to complete a questionnaire before their Round 3 session. The questionnaire asked them to rank the applications that had come up so far in the study as high, medium, or low priority. They were then asked to consider barriers to their high or medium priority applications, and then rank the feasibility of those applications based on how easy they thought the barriers would be to address. This process was repeated for concerns. The results were analyzed for each session in advance and used to narrow the discussion.
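The tallying step described above can be sketched in a few lines of Python. This is a minimal illustration only, not the research team's actual analysis: the function name, the response format, and the topic names are assumptions, and the threshold of three high-priority votes is taken from the facilitator's opening remarks in the dialogue guide.

```python
from collections import Counter

# "Three or more participants", as described in the dialogue guide opening.
HIGH_PRIORITY_THRESHOLD = 3


def shortlist_topics(responses, threshold=HIGH_PRIORITY_THRESHOLD):
    """Return topics rated "high" by at least `threshold` participants.

    `responses` is a hypothetical format: one dict per participant,
    mapping each topic to "high", "medium", or "low".
    """
    high_votes = Counter(
        topic
        for ratings in responses
        for topic, rating in ratings.items()
        if rating == "high"
    )
    return sorted(topic for topic, votes in high_votes.items() if votes >= threshold)


# Illustrative responses from three participants (made-up data).
responses = [
    {"automated charting": "high", "appointment triage": "medium"},
    {"automated charting": "high", "appointment triage": "high"},
    {"automated charting": "high", "appointment triage": "low"},
]
shortlist = shortlist_topics(responses)
```

In this made-up example, only "automated charting" reaches the threshold, so it is the only topic carried forward for discussion.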
Dialogue Guide
Opening and Topic Prioritization Survey Results Review (20 m)
Review study aim and guidelines for respectful dialogue.
We created the topic prioritization survey after hearing from many of you in the last round that the discussion was too broad, simply because there is so much to talk about. The topics on the slides were ones that three or more participants voted on as high priority. As you can see, there are still many topics. We’re going to do a bit more in-session polling to try to narrow it down further.
Our goal is to focus in on about 10 related topics and develop some concrete recommendations for researchers and policy makers.
Dialogue
Let’s share a bit about why we prioritized the topics we did. Why are they important to us?
What barriers do we anticipate to high priority applications? How should they be addressed? Who should address them?
What barriers do we anticipate to resolving our high priority concerns? How should they be addressed? Who should address them?
Member-Checking Survey
At the conclusion of the data collection period, all participants who took part in at least 1 session were invited to provide feedback on preliminary study findings. Findings were framed as 6 recommendations. Participants were invited to improve the wording of the recommendations, add details they believed were essential to capture, and contribute any other reflections. Four of 6 recommendations are not included here, as they pertain to findings that will be published elsewhere.
System leaders who did not take part in sessions were also invited to review findings. Sixteen participants provided complete responses, including 12 providers and 4 patients. Two system leaders who did not take part in a dialogue session also provided reflections.
Future Perspectives on AI in Canadian Primary Care: A Deliberative Dialogue Series
Recommendation Feedback Survey
From September 9th to September 24th, 2020, 22 patients and 21 interprofessional primary care providers representing 7 provinces met over 9 virtual deliberative dialogue sessions. Participants worked together to identify useful applications of AI in primary care and outline concerns about AI technology. Many ideas about how AI should be used to benefit Canadian primary care arose organically during these sessions.
On October 13th and 15th, groups of patient and provider participants from previous sessions met with health system leaders for the final round of deliberative dialogues. Participants developed recommendations related to a narrow set of AI use cases and concerns voted on in advance of sessions.
This survey presents 6 recommendations based on early interpretations of data from the final round and previous phases of the study; detailed analysis is ongoing. We designed this survey to provide you with a voluntary opportunity to:
Clarify the research team’s understanding of your ideas and opinions, or improve the wording of recommendations,
Add details you believe are essential to capture in recommendations, and
Contribute any final thoughts to the study, including new recommendations.
The research team will analyze your feedback and incorporate it into a final set of recommendations presented in research publications. If you asked to receive copies of these reports during your intake interview, you will receive them sometime in 2021.
Recommendation 1: Initial applications of AI in primary care should alleviate the burden of administrative tasks for providers to create more time and mental space for patients and providers to focus on patient-centered care.
Context:
There was broad consensus among the participants across sessions that AI will supplement primary care providers' roles and skills rather than substitute them.
Patients and providers were enthusiastic about advanced AI-enabled clinical decision support (CDS) tools, particularly those that synthesize current evidence and patient histories to support diagnostics and treatment planning. However, they identified several issues to address before such tools can be deployed successfully in care settings, including data quality, availability, and security.
Participants agreed that AI is best applied in the near future to tasks such as automating charting, managing referrals, issuing prescriptions, or sending reminders.
The desired benefit of offloading administrative tasks to an AI is to promote patient-provider face time and shared decision-making that accounts for patient values, preferences, and circumstances.
Is there anything you would like to add to the above recommendation? We invite you to:
Improve the wording of the recommendation,
Add details you believe are essential to capture in the recommendation or context, and
Contribute any other reflections on the recommendation.
Recommendation 6: Primary care providers should be trained to use AI tools to protect core clinical skills and enhance critical appraisal abilities.
Context:
Patient and provider participants agreed that practicing primary care providers and trainees must not be taught to rely on AI tools as substitutes for clinical reasoning skills. Instead, providers and trainees must be taught how to appropriately use AI tools as supplements in their reasoning processes and workflows.
Several provider participants identified a need for:
○ A scientific evidence base that describes the performance of AI technologies. Many provider participants said they would be reluctant to use an AI tool sold by a private developer if it was not evaluated through peer-reviewed research.
○ Critical appraisal frameworks that are developed and endorsed by trusted sources to help practicing providers decide if a tool is safe and effective to use in caring for their patient population.
Several provider participants also highlighted the need to clarify the management of medical-legal liability if AI tools are used in clinical decision-making.
Provider participants felt that provider training is an important part of managing burnout and ensuring usability of AI tools.
Is there anything you would like to add to the above recommendation? We invite you to:
Improve the wording of the recommendation,
Add details you believe are essential to capture in the recommendation or context, and
Contribute any other reflections on the recommendation.
Appendix 2. Participant Informational Module
Future Perspectives on AI in Canadian Primary Care: A National Deliberative Dialogue Series, September 9th-October 15th, 2020; prepared by Tara Upshaw, MHSc student
1. Introduction
2. Deliberative Dialogues: What to Expect
3. Artificial Intelligence: An Overview
3.1. Machine Learning
3.2. Deep Learning
3.3. Other AI Methods
4. AI in Medicine and Health Care
5. Applying AI in Primary Care
6. Ethical Considerations When Applying AI to Health Data
7. Summary
8. Presession Questionnaire
References
Glossary
1. Introduction
What is the purpose of this module? You have received this module because you are participating in the research study: Patient, provider, and health system leader perspectives on artificial intelligence technology in primary care. You do not need to know anything about artificial intelligence (AI) to participate in this study. This module is meant to teach you about AI and its possible uses in primary care so that you can feel comfortable talking with other participants.
You are not expected to be an expert on AI, and you will likely have further questions after reading this module. We encourage you to bring these questions to your sessions, along with your views and ideas.
This module will:
Describe the design of the study and what to expect during your sessions
Define AI and describe computer programming methods that underlie AI technologies
Explore applications of AI in medicine broadly
Describe possible applications of AI in primary health care
Describe ethical considerations when applying AI in the context of health care.
Where does this information come from? A variety of sources, including academic literature and consultations with experts in health information technology and AI. A reference list and a glossary of bolded terms are available at the end of this document.
How long will this module take to complete? This module will take most people about 30 minutes to read. There is a short questionnaire at the end that will take up to 5 minutes to fill out. This questionnaire must be completed online.
How does the module work? The online module is delivered using a secure survey platform. You received a custom link that saves your progress and tracks your questionnaire responses. Since you have chosen to review the printable PDF version, please remember to return to this link to complete the questionnaire when you are finished.
2. Deliberative Dialogues: What to Expect
What is a deliberative dialogue? A deliberative dialogue is a discussion among people involved in or affected by future decisions about a high-priority issue. A dialogue aims to bring stakeholders together to understand a topic in greater depth and share different views.1 Each dialogue will involve up to 10 participants. Dialogue participants have an opportunity to identify what is important to them about an issue and work together with other participants to provide advice to researchers and policy makers. Results may include consensus recommendations and emphasis on areas of disagreement among participants.
What is primary care? Primary care refers to the services involved in health promotion, illness and injury prevention, and the diagnosis and treatment of illness and injury.2 Primary health care is usually the point of first contact with a health care provider when you are sick or injured. Primary care providers include family physicians, nurse practitioners, pharmacists, and others that contribute to your holistic well-being, including social workers, dieticians, and traditional healers. Primary care providers help you move through the health system if you need more specialized care or enroll in social programs that can support your health.
Why is a deliberative dialogue about the future of AI in primary care important? Advances in AI are driving innovation in most industries, including health care. It is expected that AI will change primary care and other areas of health care soon.3 Exactly how AI will change primary care is unclear because it is not widely used in most health care settings. Many factors will interact in complex ways, including primary care provider roles, patient preferences, policies, political considerations, and commercial interests. What is most important for the success of any AI-based technology is that its function matches the needs of patients, providers, and health administrators.
This deliberative dialogue series will help us understand how AI can be useful for people who use and work in primary care. Dialogues will allow us to anticipate how certain uses of AI will impact patient experience and clinic workflows. As well, the dialogues will help us form recommendations for policy makers and identify critical unanswered questions to address through future research.
Who is involved in this dialogue? Patients and providers from across Canada were invited to reflect diverse life and work experiences. Policy makers in all provinces and territories involved in shaping policies that affect how AI is used in health settings were invited to participate. Most participants will take part in more than 1 round of the study.
What will happen during the dialogues? Deliberative dialogues will occur over 4 rounds (Appendix 2 Figure 1). Each round, a trained facilitator will work with participants to focus on a different aspect of AI in primary health care. Participants will receive short summaries of the previous dialogue rounds before their session. These summaries may include new information not included in this module that becomes relevant as the dialogues progress.
In Round 1, primary care patients and providers will meet in separate groups to discuss the application of AI in primary care and identify possible priority uses.
In Round 2, patients and providers will meet together to explore 3 use cases “prioritized” in the previous round. Discussions will focus on the impact that an application may have on patient experience and clinical workflows. With help from the facilitator, participants may form preliminary recommendations for policy makers and researchers.
In Round 3, policy makers from across Canada will meet to discuss key findings from Rounds 1 and 2 and provide insight into relevant policy contexts.
In Round 4, representatives from all 3 participant groups will meet together to integrate Round 3 input from policy makers. Participants may adjust priority uses, revise or finalize recommendations, or highlight important areas of disagreement.
What is my role as a participant? As a participant, your role is to bring your unique perspective to the discussion and learn from others' perspectives. The facilitator may ask participants to develop the reasoning behind their views and ideas. The information you read and hear may inform your opinions, and your opinion might (or might not) change over time. A dialogue is about embracing diverse views among us and finding ways we can work together to guide future directions.
What is the role of others?
Facilitator: The facilitator will work with participants to:
Discuss questions prepared by the research team and questions that emerge during the discussion
Help participants form recommendations
Respectfully explore areas of disagreement.
Observer: A research staff member will attend all sessions to take notes on the discussion and support session logistics.
Are my contributions to dialogues confidential? Participants are free to use the information gained during dialogues. We ask that all participants not disclose the identity or the affiliation of a speaker. In other words, it is okay to talk about the dialogues with people who did not take part, but we cannot credit anything we hear to a specific person or the place they work.
Participants will be asked to identify themselves only by their first name and their participant type (ie, patient, provider, or policy maker) during the session. We will provide instructions for setting your username on Zoom.
Sessions will be audio-video recorded. Recordings will be password-protected and stored on secure servers at St. Michael's Hospital. Only members of the study team will have access to recordings. Any identifying information will be removed during analysis.
3. Artificial Intelligence: An Overview
In this section, we will define AI and review common computer programming methods used to create most everyday AI-based applications. By the end of this section, you should understand how deep learning methods fit within AI, and how AI relates to Big Data (Appendix 2 Figure 2). Do not worry about memorizing details. Focus on the big picture and use the glossary if you need to jog your memory on specific terms.
We will use the terms “algorithm,” “system,” “program,” and “method” interchangeably when talking about AI. AI methods are computer algorithms. An algorithm is a set of step-by-step instructions for solving a problem. A computer algorithm can be simple (eg, if it is Saturday at 9 am, send a reminder) or complex (eg, identify pedestrians).4 We will learn more below about how AI algorithms can be combined to produce applications with complex “intelligent” functions.
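To make the idea of a simple algorithm concrete, the Saturday reminder rule can be written as a short Python function. This is only a sketch: the function name and the decision to check the exact minute are assumptions made for illustration.

```python
from datetime import datetime

SATURDAY = 5  # datetime.weekday() counts Monday as 0, so Saturday is 5


def should_send_reminder(now):
    """Simple rule: send a reminder only if it is Saturday at 9:00 am."""
    return now.weekday() == SATURDAY and now.hour == 9 and now.minute == 0


# September 12, 2020 was a Saturday; September 9, 2020 was a Wednesday.
saturday_morning = should_send_reminder(datetime(2020, 9, 12, 9, 0))
wednesday_morning = should_send_reminder(datetime(2020, 9, 9, 9, 0))
```

Every step of this rule was written by a person and can be read directly from the code. The AI methods described in the following sections differ in that the rules are learned from data rather than written out by hand.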
What is artificial intelligence? Artificial intelligence is a broad field of science concerned with getting computers to do tasks that would normally require human intelligence. In this definition, intelligence refers to processing information, reasoning and learning, planning actions, and communicating in natural language. Another way to think about AI is as a general-purpose prediction technology that estimates missing information from available information.5
Alan Turing (Appendix 2 Figure 3) established the concept of AI in 1950.6 Since then, computer scientists have developed a variety of methods that allow computers to mimic human intelligence. These methods can be grouped into several types; each type is suited to particular intelligent tasks. A combination of methods is usually used when developing AI technologies.
Today's AI applications are “narrow.” Narrow AI programs can only do what they were designed to do.7 For example, an AI program that can beat a human in a chess match cannot solve a complicated math problem. “Narrow” AI applications are often better than humans at the tasks they were designed for but cannot develop additional skills without being programmed by humans.
In contrast, general AI refers to a single system that can learn in different situations and then apply broad knowledge to solve any kind of problem—like the human mind!7 This is what many people think of when they hear “AI.” There are currently no actual examples of general AI. This is not surprising when we consider the complexity of human learning and problem-solving.
What is Big Data? How is it related to AI? “Big Data” is a term used to describe data produced by a variety of sources in large volumes at a fast pace. Big Data has grown across industries in the last 3 decades because of advances in computer processing power and storage, widespread adoption of mobile devices, and increased Internet availability.8 AI methods allow us to condense and make sense of these datasets. Advances in machine and deep learning in the past 10 years have depended on the availability of large, labeled “Big” datasets.
A dataset is a collection of data gathered using the same criteria for a specific purpose. Datasets from different sources can be shared and combined to create linked datasets. Linked datasets contain a broad range of information that can be analyzed to shed light on complex issues. When linked data are used for research purposes, they are deidentified to reduce the possibility that they could be traced back to any individual.
For example, the Insurance Corporation of British Columbia (ICBC) collects data from individual drivers in BC each time a person renews their car insurance. The Medical Services Plan collects data on visits to physicians in British Columbia. These 2 datasets could be shared and combined to reveal relationships between health care use and driving practices.9 As information technology continues to develop, new sources of data will appear, allowing new linked datasets to be created. New linked datasets offer new opportunities for research in many areas, including AI.9 We might explore data linkage in more detail during Round 2 of the study.
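For readers curious about what linking looks like in practice, the sketch below joins 2 small made-up datasets in Python. It is illustrative only: the identifiers, field names, and values are hypothetical, and real data linkage involves privacy safeguards and matching methods that are not shown here.

```python
def link_datasets(insurance_records, physician_visits):
    """Join 2 datasets on a shared deidentified key.

    Only people who appear in both datasets are kept in the linked result.
    """
    linked = {}
    for person_id, insurance in insurance_records.items():
        if person_id in physician_visits:
            linked[person_id] = {**insurance, **physician_visits[person_id]}
    return linked


# Hypothetical deidentified records (made-up IDs, fields, and values).
insurance_records = {
    "id-001": {"collisions_last_5_years": 0},
    "id-002": {"collisions_last_5_years": 2},
}
physician_visits = {
    "id-002": {"physician_visits_last_year": 7},
    "id-003": {"physician_visits_last_year": 1},
}
linked = link_datasets(insurance_records, physician_visits)
```

Here only the person labeled "id-002" appears in both datasets, so the linked dataset contains a single combined record for that person.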
The following sections describe some common AI methods that form the basis of most modern AI applications. All of these methods work best when applied to Big Data.
3.1 Machine Learning
What is machine learning? Machine learning refers to AI methods where a computer program learns from experience over time.8
For example, algorithms that detect spam e-mails are “trained” through exposure to many examples of e-mails that have been manually labeled as spam or not spam. The spam-detection algorithm will learn particular words or combinations of words that increase the chance that an e-mail is spam. A feedback loop can be used to help the program improve after making a mistake.
The spam-detection algorithm is an example of supervised learning, in which labeled examples (in this case, e-mails labeled as spam or not spam) are included in a training dataset. In unsupervised learning, there are no labeled examples and an algorithm instead groups data by similarities. Unsupervised machine learning is increasingly used to drive discovery in basic science and medical research by revealing unexpected relationships among data. In both types of learning, the performance of an algorithm is tested on a test dataset that the algorithm has never analyzed before.4
As narrow AI, machine learning algorithms are usually more accurate than humans at the prediction tasks they are trained for. A machine learning AI system is not truly intelligent because it does not understand what it was trained to do. A spam-detection algorithm can be great at filtering spam, but it will never understand what spam is and why it is bad in the way humans do. If a new type of spam emerges, the algorithm will probably have to be retrained by a (human) computer programmer to recognize it.
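The supervised learning process described above (train on labeled examples, then test on unseen examples) can be sketched in Python. This is a deliberately simplified word-counting approach chosen for readability, not the method used by real spam filters; the e-mails, function names, and scoring rule are all assumptions made for illustration.

```python
from collections import Counter


def train(labeled_emails):
    """Count how often each word appears in spam and in non-spam training e-mails."""
    counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in labeled_emails:
        counts[label].update(text.lower().split())
    return counts


def predict(counts, text):
    """Label a new e-mail by the class in which its words were seen more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][word] for word in words)
    not_spam_score = sum(counts["not spam"][word] for word in words)
    return "spam" if spam_score > not_spam_score else "not spam"


# Training dataset: e-mails manually labeled as spam or not spam (made-up data).
training_set = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday", "not spam"),
]
# Test dataset: e-mails the algorithm has never analyzed before.
test_set = [
    ("claim your free prize", "spam"),
    ("agenda for lunch meeting", "not spam"),
]

model = train(training_set)
correct = sum(predict(model, text) == label for text, label in test_set)
accuracy = correct / len(test_set)
```

The program is never told which words signal spam. It infers this from the labeled training examples, and its accuracy is then measured on the separate test examples.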
Machine learning forms the basis of most AI systems.
3.2 Deep Learning
What is deep learning? Deep learning refers to a subtype of machine learning methods that identify hidden and complex patterns in a dataset. These patterns can be used to classify new data into a defined category (ie, using supervised learning methods) or group new data by similarity (ie, using unsupervised learning methods) with very high accuracy.10
Deep learning is a more recent type of AI that has improved existing technologies and helped create others that were not possible before, like self-driving cars. The distinguishing characteristic of a “deep” learning algorithm is the use of many layers (Appendix 2 Figure 4). Each layer identifies different features of the dataset; combining the layers allows for detection of complex patterns that can be used to make very accurate predictions. With “shallow” AI methods like classic machine learning, scientists must spend time:
Identifying data features important for making accurate predictions, and
Manually transforming these features into math a computer understands.
Deep machine learning is powerful because it eliminates the need for manual feature selection and representation. This allows for much more complex prediction tasks.
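The idea of stacked layers can be sketched in Python. In this minimal example, each layer turns its inputs into new values (“features”) that are passed to the next layer. The weights are hand-picked purely for illustration; in real deep learning they are learned from training data, and networks contain many more layers and values. All names and numbers below are assumptions.

```python
def relu(values):
    """Keep positive values and replace negative values with zero."""
    return [max(0.0, value) for value in values]


def layer(inputs, weights, biases):
    """One layer: each output is a weighted sum of all inputs plus a bias."""
    return [
        sum(weight * x for weight, x in zip(row, inputs)) + bias
        for row, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass data through each layer in turn; the output of one layer feeds the next."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = layer(activations, weights, biases)
        if index < len(layers) - 1:  # no relu after the final layer
            activations = relu(activations)
    return activations


# Two layers with hand-picked (not learned) weights, for illustration only.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # layer 1: 2 inputs -> 2 features
    ([[1.0, 1.0]], [0.0]),                    # layer 2: 2 features -> 1 output
]
prediction = forward([2.0, 1.0], layers)
```

Because each layer builds on the features produced by the layer before it, adding layers lets the overall system represent more complex patterns than any single layer could.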
Consider the example of distinguishing photographs of cats from dogs—a popular project for AI developers-in-training (Appendix 2 Figure 5). This initially seems like a simple task because we have been doing it all our lives—but think about what makes a dog visually different from a cat. Both have 4 legs, 2 eyes, and 2 ears—all roughly in the same position. Both have a tail and come in different colors. Dogs and cats can be similar in size.
So, how do we know they are not the same? Looking closely at the photographs, you will find no hard-and-fast rules for deciding. Yet, we can still tell the difference, but it is hard to pinpoint how we do it. Most of us will say that it has something to do with a subtle pattern of features that exist uniquely together. We have learned these patterns over time and by exposure to examples, and probably by making a few mistakes. As you can imagine, these visual relationships are extremely difficult to represent in computer language.
Deep learning methods solve this problem by automatically detecting features of a dataset and defining relationships between them in mathematical terms a computer understands. This mimics the nuanced and highly accurate process of human pattern recognition. Humans can learn to categorize or group objects pretty well from a small number of examples. Deep learning algorithms learn from thousands or even millions of examples, making them extremely accurate at the prediction task they are designed for.
It is important to note that the “features” that a deep learning algorithm extracts from a dataset are abstract mathematical representations of data qualities. They are not always concrete qualities that we can easily name. This can make it difficult to precisely understand how a deep learning AI system arrives at a prediction, even when it is correct. This is called nonexplainable AI. For this reason, deep learning algorithms are often compared with a “black box.”10
Deep learning is well suited to prediction problems in medicine because health and disease often involve complex interactions. As you can imagine, it is sometimes necessary for clinicians to understand contributing factors to an illness or injury to inform effective prevention and treatment. Efforts to develop “explainable” deep learning systems are ongoing.
3.3 Other AI Methods
Natural language processing (NLP) refers to AI methods used to interpret human communication and reproduce it in various forms.4 In combination with deep learning, NLP is the basis of automated translation services like Google Translate, chatbots, and virtual personal assistants such as Apple's Siri and Amazon's Alexa.
Computer vision refers to AI algorithms that interpret digital images or videos.10 Computer vision underlies facial recognition software and autonomous vehicles. Modern computer vision systems almost always use deep learning to perform their function.
For example, social media platforms like Instagram, Snapchat, Facebook, and TikTok use computer vision for their video filter technology (Appendix 2 Figure 6). Filters use an AI algorithm to detect the features of a digital image that represent a human face. An animated filter is then applied to the face and can follow that face's movement in the frame.
Cognitive analytics and robotics are other types of AI we will not discuss in depth for this study.
If you would like to take a break and return to this booklet later, this is a good place to pause.
4. AI in Medicine and Health Care
As we have learned, AI is not so much 1 thing as a set of computer programming methods for analyzing large volumes of data from various sources to identify patterns. Patterns within data can be used to make highly accurate predictions. Predictions can add value to a wide range of tasks or problems in most industries, including health care.
Several factors are driving the adoption of AI in health care today.11 First, labeled digital health datasets have grown exponentially in the past 30 years. This is the combined result of:
Development of comprehensive administrative datasets that track a patient's journey through a healthcare system
Widespread adoption of electronic medical records (ie, the computer software your doctor enters notes into during a visit)
Advances in medical imaging technology
Decreased costs of genomics technology (ie, gene sequencing and mapping) resulting in more genomic data
Broad uptake of “wearable” devices that track physical data like your heart rate, body mass index or your blood glucose levels.
These datasets can potentially be linked to provide a detailed data profile of an individual or a population. Second, faster hardware emerged in the mid-2000s that allowed for the computational power necessary to analyze such large and high-variety datasets using deep learning.
Clinical application of AI (especially deep learning) has been most rapid in medical disciplines that rely heavily on medical imaging. In many cases, deep learning AI systems outperform individual clinicians when interpreting medical images. For example, a group of AI scientists at Stanford University in California developed an AI algorithm to diagnose skin cancer from pictures of abnormal skin marks using deep learning and computer vision methods (Appendix 2 Figure 7).12 The algorithm was trained on 129,450 images of skin conditions labeled as cancer or not cancer by dermatologists. The algorithm's performance in correctly classifying skin cancer was then compared with that of 21 dermatologists. Algorithm accuracy was equal to or better than that of most dermatologists involved in the study. To our knowledge, this tool has not yet been tested in clinical settings.
Increasingly, health care organizations are interested in applying deep learning AI to develop risk stratification tools that can organize patients according to their risk level for a health event.13 These tools can be used to inform preventative care and support hospital operations (eg, staffing changes when care needs will likely be high). When implemented in clinical practice, a provider may increase support for individuals at increased risk for a particular event, shifting health care resources to those predicted by an algorithm to have the most need.
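The final step of risk stratification, ranking patients by predicted risk, can be sketched in Python. The sketch assumes that an algorithm has already produced a risk score between 0 and 1 for each patient; the patient labels, scores, and function name are hypothetical.

```python
def top_risk_patients(predicted_risk, n=10):
    """Return the IDs of the n patients with the highest predicted risk."""
    ranked = sorted(predicted_risk.items(), key=lambda item: item[1], reverse=True)
    return [patient_id for patient_id, _ in ranked[:n]]


# Hypothetical risk scores produced by a prediction algorithm (made-up data).
predicted_risk = {
    "patient-a": 0.12,
    "patient-b": 0.81,
    "patient-c": 0.47,
    "patient-d": 0.65,
}
follow_up_list = top_risk_patients(predicted_risk, n=2)
```

The ranking itself is simple. The difficult and ethically important part is producing risk scores that are accurate and fair, which is where the AI methods described in this module come in.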
For example, a group of AI researchers at the University of California applied deep learning AI to raw data from patient electronic medical records to predict hospital readmission and death.14 The algorithm was tested on 2 hospital electronic medical records datasets and predicted events more accurately than existing non-AI predictive models. Other researchers have developed tools that can predict blood infection risk or delirium during a hospital stay, among many other examples.
It is important to note that few algorithms developed by AI researchers have been widely implemented in clinical practice. This limits our understanding of how well prototype tools integrate within existing health care delivery systems or improve patient health.
5. Applying AI in Primary Care
What kind of data are created through primary care? Electronic medical records (EMRs) are the richest source of data in primary care. In Canada, more than 70% of primary care settings in most provinces collect and store patient data in electronic health records.15 EMRs contain clinical data—data collected by a hospital or provider to provide appropriate health care services. Clinical data include:
Sociodemographic information, such as your age, race or ethnicity, gender, or income level,
Laboratory and medical imaging test results, and
Unstructured notes your provider makes during your visit. These notes may include descriptions of your symptoms, existing health conditions, or even important life events, such as the loss of a loved one.
EMRs also contain your provincial health insurance number. EMRs can be linked with administrative data maintained by your provincial health authority that records your health system encounters. It is now possible to sync vital sign or activity data from wearable devices (eg, Apple Watch or Fitbit) with many EMR systems.
These data are difficult to condense and analyze for use during a clinic visit. The AI methods we have discussed can be applied to create value for primary care patients and providers.
Is primary care ready for AI? As with other medical disciplines, there are currently few examples of widely used AI tools in primary care. Searching online, you will find many tools developed by private companies that could be used in primary care. Some of these are likely already in use in some primary care settings. For most of these tools, there is little academic research on their effectiveness in improving health.
In addition, a 2020 review found that research specific to AI in primary care is in the early stages of maturity.16 Of 405 studies identified in the review, most were focused on developing or improving AI methods to achieve good algorithm performance, and not on how those tools function in a clinical setting. Only 14.1% of study teams included primary care providers. The authors of this review emphasized the need for research that:
Involves interdisciplinary research teams that include people with direct clinical experience
Engages end users (eg, patients, providers, or administrative staff) throughout the development of AI applications
Evaluates the effectiveness of AI-based tools for improving health when used in primary care settings.
What can AI do for primary care? It is not guaranteed that AI will transform primary care or any other areas of medicine. Sometimes new technology has unexpected consequences.
For example, EMRs were expected to massively increase efficiency for physicians. EMRs made it easier for providers to bill provincial health insurance plans for their services, but many providers find they have to spend more time typing to maintain the patient chart than interacting with their patients. This ‘4000-clicks-a-day’ problem has been linked to physician burnout.17,18 This problem exists because EMRs were designed to improve billing efficiency and not patient-centered care. What is most important to any new technology's success is that its function matches the end user's needs.
Appendix 2 Table 1 lists examples of ways AI may be applied to primary care data. Some of these examples are based on commercially available applications or prototype algorithms described in published research. Other examples are possible with today's AI methods and existing primary care data sources but haven't been created yet.
As you read, think about what use cases stand out to you. Which examples would most improve your experience as a patient or your job as a provider? Which ones do not seem as useful? Can you think of other problems in primary care that could be addressed by AI? Can you anticipate issues that might arise if you had to use a particular application?
These are some of the questions that we will explore in more depth throughout the deliberative dialogues. On the last page of this module, you will be asked to list some of the use cases that you think are important or interesting or other ideas you have about where AI could be applied in primary care.
6. Ethical Considerations When Applying AI to Health Data
AI has great potential to improve health and health care, but as with all innovations, there are limitations. These limitations raise some important ethical questions.
What is bias? Why does it matter? Many researchers have shown that non-health-related AI can worsen existing social inequalities by reproducing or amplifying race, gender, and other biases.22 Bias describes a preference for 1 thing, idea, person, or group compared with another, usually in a way that is considered unfair.
It is important to realize that AI algorithms are only as good as the data they are trained on. Some groups of people with better access to health care may contribute to health datasets more than other groups. AI algorithms trained on datasets that poorly represent certain groups may make less accurate predictions for members of those groups.
When used in health care decision-making, biased predictions can worsen health inequality between groups.
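To make this point concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the two groups ("A" and "B"), the single lab marker, and the cutoffs at which illness begins are assumptions, not data from any study. Group A contributes 90 records and group B only 10, and the illness starts at a lower marker value in group B. A very simple "algorithm" then picks the one cutoff that makes the fewest mistakes across all records.

```python
# Hypothetical illustration: one cutoff learned from data dominated by group A.
# All values below are invented; none come from a real dataset.

def make_data(group, true_cutoff, copies):
    """Build records (group, marker, is_ill) for marker values 0..9.
    A person is ill when their marker is at or above the group's true cutoff."""
    return [(group, m, m >= true_cutoff) for _ in range(copies) for m in range(10)]

def errors(data, cutoff):
    """Count records where predicting 'ill if marker >= cutoff' is wrong."""
    return sum((m >= cutoff) != is_ill for _, m, is_ill in data)

def fit_cutoff(data):
    """Choose the single cutoff (0..10) with the fewest errors overall."""
    return min(range(11), key=lambda c: errors(data, c))

def accuracy(data, cutoff, group):
    """Share of correct predictions for one group (measured on the same records)."""
    rows = [r for r in data if r[0] == group]
    return (len(rows) - errors(rows, cutoff)) / len(rows)

# Group A: 90 records, illness begins at marker 5.
# Group B: 10 records, illness begins at marker 3.
train = make_data("A", 5, 9) + make_data("B", 3, 1)

cutoff = fit_cutoff(train)
acc_a = accuracy(train, cutoff, "A")
acc_b = accuracy(train, cutoff, "B")
print(cutoff, acc_a, acc_b)
```

Because group A supplies most of the records, the learned cutoff matches group A's pattern and the tool misses some ill people in group B. In this toy setting the tool is correct for every group A record but for only 8 of the 10 group B records.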
Consider the following hypothetical example:
A provincial health services agency partners with a private company to develop an AI algorithm that identifies patients at high risk of serious complications from their chronic diseases. The goal is to use this algorithm to target more supportive primary care resources to these patients to reduce their risk and improve their overall health.
The company links the primary care EMR data for all provincial residents with system administrative records and applies explainable, supervised machine learning methods. The primary variable used for prediction is the historic costs of care. Patients who have previously received the most health care are the first to be targeted with new comprehensive care management programs. The company files a patent on the algorithm.
A few providers notice that most of the patients flagged as high risk are White. They voice their concern to the provincial health authority, which then hires an external group of scientists to evaluate the algorithm under a strict nondisclosure agreement. The scientists learn that most of the patients flagged by the algorithm are not the sickest patients in the province. Black and nonstatus Indigenous patients are sicker on average than other racial groups, but data on sickness were not included when training the model.
Adjusting for sickness, the researchers find that Black patients should be receiving 46.5% of comprehensive care program resources instead of the current 17.7%. Nonstatus Indigenous patients should receive 23.4%, instead of 4%. It is well known that non-White people in Canada and the United States experience more barriers to accessing health care. This means that their overall health care costs are lower than those of White people, even though their health needs are greater, on average.
This example is based on a published scientific analysis of a real AI algorithm applied to 200 million people each year in the United States.23 This algorithm is not a unique case but represents a general approach to risk prediction in the health sector. If health systems use biased algorithms when distributing health resources, the health of certain groups will improve while the health of others will get worse. To combat bias, many researchers have emphasized the importance of including patients, clinicians, and ethicists from the beginning when developing AI applications for use in patient care.
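The mechanism in the hypothetical example above, using past costs as a stand-in for health need, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the algorithm from the published analysis: the groups "A" and "B", the illness scores, and the dollar amounts are all invented. Both groups have identical health needs, but group B faces access barriers, so each unit of illness generates half as much spending.

```python
# Hypothetical illustration of proxy bias: ranking patients by past cost
# instead of by health need. All numbers are invented for this sketch.
from dataclasses import dataclass

@dataclass
class Patient:
    group: str
    illness: int   # assumed measure of need, e.g. number of chronic conditions
    cost: float    # past health care spending in dollars

def make_patients():
    patients = []
    for illness in range(10):
        # Group A: full access, so spending tracks need.
        patients.append(Patient("A", illness, 1000.0 * illness))
        # Group B: same need, but barriers to access halve spending.
        patients.append(Patient("B", illness, 500.0 * illness))
    return patients

def flag_top(patients, key, n):
    """Return the n patients ranked highest by the given key."""
    return sorted(patients, key=key, reverse=True)[:n]

def share_of_group(flagged, group):
    """Fraction of flagged patients who belong to the given group."""
    return sum(p.group == group for p in flagged) / len(flagged)

patients = make_patients()
by_cost = flag_top(patients, key=lambda p: p.cost, n=4)
by_need = flag_top(patients, key=lambda p: p.illness, n=4)

share_b_cost = share_of_group(by_cost, "B")
share_b_need = share_of_group(by_need, "B")
print(share_b_cost, share_b_need)
```

In this toy setting, ranking by cost flags no one from group B, while ranking by need flags the two groups equally. The ranking step itself treats everyone the same way; the unfairness comes entirely from choosing cost as the measure of need.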
What other ethical considerations are there? There are other ethical questions relevant to the above example. What about the patients whose data were used to train the model? Did they agree to have their data used for the development of the tool? Do they have to consent? How do you feel about your health data being used to train an AI algorithm used in other people's care?
Many private companies do not want to disclose their methods for developing a novel AI tool and file patents to protect this information. How can we trust that a tool marketed by a private company is unbiased and safe to use for all patients? Should developers and health care providers have to tell the public if they learn that an algorithm they are using is biased? What if a tool was instead developed and sold by researchers who published their methods in academic journals reviewed by other researchers?
And what about the "black box" quality of some deep learning AI systems? When is it important for the patient and provider to know what features of a dataset contribute to a prediction?
As you can see, many important ethical questions arise when we think about the use of AI tools in health care settings. There are currently no clear answers to many of these questions.
7. Summary
Artificial intelligence is a broad field of science concerned with getting computers to do tasks that would normally require human intelligence. AI methods can be applied to large datasets to identify relationships and extract meaning. These relationships can be used to sort new data into predefined categories or group data by similarity. Machine learning, deep learning, computer vision, and natural language processing are all types of AI that can be used alone or together to get a computer to perform a narrowly defined task. All of these methods work best when applied to large datasets.
Both growth in labeled digital health datasets and increases in computer processing power are driving the adoption of AI in health care today. Clinical application of AI has been most rapid in medical disciplines that rely most on medical imaging. There is growing interest in applying deep learning AI to develop risk stratification tools that can organize patients according to their risk level for a health event. To date, few algorithms developed by AI researchers or companies have been widely implemented in clinical practice. This limits our understanding of how well prototype tools integrate within existing health care delivery systems or improve health.
Primary care electronic medical records are a rich source of data that can be linked to other datasets, including data about health system encounters and data from wearable devices. The AI methods we have discussed can be applied to these data to create a wide variety of applications that may add value to primary care patients and providers. It is not guaranteed that AI will transform primary care or any other areas of medicine. Researchers studying this area have emphasized the importance of involving primary care patients, providers, and health administrators in the process of developing useful AI applications that address the needs of end users. It is also important to evaluate AI tools to be certain that they actually improve patient care.
There are some limitations to AI that raise important ethical questions. AI algorithms are only as good as the datasets they are trained on. Datasets that poorly represent certain groups may result in AI algorithms that make less accurate predictions for members of those groups, leading to bias. When used in health care decision-making, biased predictions may worsen existing social inequalities in health. Issues like consent to data use are also important to consider when applying AI in health care settings.
8. Presession Questionnaire
Please return to the online module to complete the questionnaire at least 2 hours before your first session.
Glossary
Administrative data: Data collected in the course of providing and/or paying for services (eg, hospital admissions, physician visits).
Algorithm: A set of step-by-step instructions for solving a problem. A computer algorithm can be simple (eg, If it is Saturday at 9 am, send a reminder) or complex (eg, Identify pedestrians).
Artificial intelligence: A broad field of science that is concerned with getting computers to do tasks that would normally require human intelligence; a general-purpose prediction technology that estimates missing information from prior information.
Bias: A preference for 1 thing, idea, person, or group compared with another, usually in a way that is considered unfair.
Clinical data: Data collected by a hospital or provider to provide appropriate health care services.
Computer vision: AI algorithms that interpret digital images or videos.
Dataset: A collection of data gathered using the same criteria for a specific purpose.
Deidentified data: Data where identifiers such as a person's name and address have been removed to reduce the possibility that the data could be traced back to any individual.
Deep learning: A subgroup of machine learning algorithms that identify hidden patterns in a dataset. These patterns can be used either to classify new data into a predefined category (ie, using supervised learning methods) or group new data by similarity.
Deliberative dialogue: A discussion among people involved in or affected by future decisions about a high-priority issue. A dialogue aims to bring stakeholders together to understand a topic in greater depth and share different views.
Electronic medical record: Digital records kept by hospitals or individual providers that contain clinical data.
Explainable AI: AI algorithms where the features of a dataset that contribute to a pattern are possible to identify and define.
General AI: A single system that can learn in different situations and then apply broad knowledge to solve any kind of problem. There are currently no actual examples of general AI.
Genomic data: Data describing DNA sequences.
Linked dataset: Dataset created when 2 datasets from different sources are combined. Linked datasets contain a broader range of information than either of the original datasets.
Machine learning: AI methods where a computer program learns from experience over time.
Narrow AI: AI programs that can only do what they were designed to do. “Narrow” AI applications are often better than humans at the tasks they were designed for but cannot develop additional skills without being programmed by humans.
Natural language processing: AI methods used to interpret and reproduce human communication.
Nonexplainable AI: AI algorithms where the features of a dataset that contribute to a pattern are difficult or impossible to identify and define.
Physical data: Data about how your body is functioning from moment to moment, such as your heart rate or blood glucose levels; often collected from wearable devices.
Primary care: The services involved in health promotion, illness and injury prevention, and the diagnosis and treatment of illness and injury; usually the point of first contact with a health care provider when you are sick or injured.
Risk stratification: Organizing people into risk levels that can be used to guide decisions. Risk stratification can be carried out using statistical models.
Supervised learning: A type of machine learning where an algorithm learns from labeled examples, such as positive and negative cases. Supervised learning is used to assign data into predetermined categories.
Unsupervised learning: A type of machine learning where an algorithm learns without examples. Unsupervised learning is used to group data by similarities.
Notes
This article was externally peer reviewed.
Funding: This project was supported by the Canadian Institutes for Health Research (FRN 156885). Tara Upshaw was supported by a CIHR Frederick Banting and Charles Best Canada Graduate Scholarship. Andrew Pinto is supported as a Clinician-Scientist by the Department of Family and Community Medicine, Faculty of Medicine at the University of Toronto and St. Michael’s Hospital, and the Li Ka Shing Knowledge Institute, Unity Health Toronto. He is also supported by a fellowship from the Physicians’ Services Incorporated Foundation and as the Associate Director for Clinical Research at the University of Toronto Practice-Based Research Network. The opinions, results, and conclusions reported in this article are those of the authors and are independent of any funding sources.
Conflict of interest: None to declare.
To see this article online, please go to: http://jabfm.org/content/36/2/210.full.
- Received for publication May 6, 2022.
- Revision received October 14, 2022.
- Accepted for publication December 5, 2022.