#62 Artificial Intelligence in Medicine: Where Are We Now?

Panel 1: (left to right) Dr. Joanna Yu, Dr. Jason Lerch, Dr. Marzyeh Ghassemi, Dr. Oren Kraus, and Dr. Shreejoy Tripathy

July 31, 2019

Raw Talk recently hosted their second annual live event, entitled Medicine Meets Machine: The Emerging Role of Artificial Intelligence (AI) in Healthcare. Today's episode with Grace and Stephania shares the discussion from our first panel of experts focusing on current applications and limitations of AI in medicine. Our speakers share perspectives from healthcare, academia, industry, and policy development. You'll hear from Dr. Oren Kraus, Dr. Jason Lerch, Dr. Marzyeh Ghassemi and Dr. Joanna Yu, as well as moderator Dr. Shreejoy Tripathy about the realities of AI and the challenges involved, as well as the implications of AI for their own work.

Written by: Grace Jacobs

Grace Jacobs [0:11] Hi listeners! This is Grace.

Stephania Assimopoulos [0:13] And I'm Stephania.

Grace Jacobs [0:14] Welcome to Episode 62 of the Raw Talk Podcast. We hope you're having a lovely summer. Over the next two episodes, we will be sharing the intriguing and inspiring panel discussions from our second annual Raw Talk Live event, Medicine Meets Machine: The Emerging Role of Artificial Intelligence (or AI) in Healthcare, that took place in May at J Labs in the MaRS building.

Stephania Assimopoulos [0:32] AI is a very current topic rapidly gaining popularity in what seems like almost every scientific field, and not without a good reason. AI has led to various advancements on many fronts. In medicine, for example, it has led to the development of cutting edge treatments and diagnostic tools.

Grace Jacobs [0:48] However, besides all the hype surrounding it, AI is also burdened by many misconceptions and even suspicion. So at this year's live event, we decided to keep it raw when it comes to AI to gain a better understanding of what AI is, while clarifying some common myths.

Stephania Assimopoulos [1:01] At our event, we unpacked this topic with two distinct panels of experts so that we could talk about AI from two different perspectives. Number one, the current applications and limitations, and number two, the ethical and future considerations of AI in medicine. Our panelists share diverse perspectives from healthcare, academia, industry and policy development.

Grace Jacobs [1:21] In today's episode, we will share with you the discussion from the first panel on current applications and limitations.

Stephania Assimopoulos [1:27] We also want to take a moment to thank our sponsors J Labs who let us use their fantastic venue, the Syrian initiative fund SGS, EMSA, UTGSU, News, and of course, IMS.

Grace Jacobs [1:38] A big shout out to Dr. Lu, the director of IMS for introducing our second panel. Let's begin by hearing a bit more about our moderator, Dr. Shreejoy Tripathy and panelists Drs. Oren Kraus, Jason Lerch, Marzyeh Ghassemi and Joanna Yu.

Dr. Shreejoy Tripathy [1:38] So I'm Dr. Shreejoy Tripathy. I'm a new independent scientist at CAMH and Assistant Professor of Psychiatry at U of T. As of January 2019, I'm part of the new Krembil Center for Neuroinformatics at CAMH, and our center's goal is to use the power of big data, artificial intelligence, and brain modeling to treat mental illness. The focus of my lab specifically is how to understand how an individual's genetics affects the functioning of the cells in their brain. And we use AI specifically to help draw associations between different scales of brain organization, like between human genotypes and neuronal physiology.

Dr. Oren Kraus [2:26] So I'm Oren Kraus, I'm one of the cofounders of Phenomic AI. I started the company based on some of the techniques I developed as a PhD student here at the University of Toronto. I was working in the machine learning group with Professor Brendan Frey, and some other collaborators in the Donnelly Centre. And we were really the first to apply AI to image-based microscopy screens. So on the academic side, we're using those for functional genomics to figure out where genes localize to in cells, and really understanding more in genome-wide screens. But in industry, these types of technologies are used a lot for drug screening. So there's really a huge opportunity to apply AI to these types of screens, as a company. So that's kind of where we started off two years ago. Since then, we've grown to a team of over 10 people and we're using AI on these kinds of screens every single day.

Dr. Shreejoy Tripathy [3:13] That's really exciting.

Dr. Marzyeh Ghassemi [3:15] I'm Marzyeh Ghassemi. I am a professor at the University of Toronto in Computer Science and Medicine, and a faculty member at the Vector Institute. I do research in machine learning for health. So my primary appointment is in computer science, which means a lot of what I do focuses on developing models and methods that I think are going to work really well with health data. So we already know how to process image data efficiently with convolutional neural networks, we already know how to process graphs efficiently with graph convolutional neural networks, and we know how to work with text using recurrent neural networks. A lot of data that comes out of a healthcare setting doesn't follow a lot of the same ground rules when you're trying to learn things efficiently. And a lot of the speedup that we get in learning now is because we know how to model data efficiently. So you can learn really well, even if you have very few examples of a specific minority class. And so the research that I would like to do moving forward is focusing on learning what kind of models work well in healthcare, and correspondingly, what kind of healthcare actually works well for people.

Dr. Jason Lerch [4:17] So I'm Jason Lerch. I'm at the Wellcome Trust Center for Integrative Neuroimaging at the University of Oxford, and I still have an adjunct appointment at the University of Toronto and SickKids, which is where I was at until recently. I'm going to, again, make the curmudgeonly argument that we're using AI as an extension of statistics, and that it's part of a long continuum that has a very proud tradition in medicine. It's part of the toolkit that will help to try to understand both how to optimally solve detailed problems, as well as to make better predictions about what our data is telling us about subjects, patients, participants, molecular and biological systems.

Dr. Joanna Yu [4:54] Hi, everybody. I'm Joanna Yu, I'm a colleague of Shreejoy at CAMH and the Krembil Center for Neuroinformatics. I'm working on the BrainHealth DataBank as a Senior Project Manager. So unlike the rest of my fellow panelists, I don't actually conduct AI analysis myself, but I'm really interested in ensuring that there are high-quality data sets available for people to conduct AI, for the purposes of both transforming care and informing research.

Stephania Assimopoulos [5:18] We were really excited when putting together this panel because, as you'll find out soon, our panelists, although all AI users, have quite different perspectives, which made for a lively and engaging discussion. Suffice it to say that agreement was not always reached.

Grace Jacobs [5:33] Before we dive in, what is the technical definition of artificial intelligence or AI? It's often a broad term used to encompass many methods and ideas. I definitely personally get lost trying to narrow it down. So how do our experts describe AI? Drs. Marzyeh Ghassemi and Jason Lerch gave a great definition to help clarify.

Dr. Marzyeh Ghassemi [5:51] Artificial intelligence is the colloquial term that's used to describe many methods that are used by statisticians, people who do optimization, people who do computer science. And we use these methods often to do either unsupervised learning (clustering), supervised learning (prediction), or causal/reinforcement learning (trying to understand cause and effect in data). So it's a large class of models, and it's a general term.

Dr. Shreejoy Tripathy [6:25] Jason.

Dr. Jason Lerch [6:27] It's hard to top that. It is statistics where you don't have to predefine your model exactly.

Grace Jacobs [6:32] Or equivalently, AI is a tool to find patterns in data with the help of machine learning, as Drs. Kraus and Yu pointed out. Okay, now, let's dive in.

Stephania Assimopoulos [6:41] AI is often thought to be the holy grail of modern innovation, revolutionising technology, and leading to drastic progress. But that isn't necessarily the case. Or, is it just not the case yet? To understand the importance of AI, we wanted to know from our panelists how their respective fields worked prior to using AI, and how they are now. Starting off with Dr. Oren Kraus.

Dr. Oren Kraus [7:03] So the field of generating lots of microscopy data, it's called high content screening, and there's still a lot of data science that went into it just using the traditional computer vision techniques. So things like segmentation and feature extraction.

Dr. Shreejoy Tripathy [7:17] What was it like before that?

Dr. Oren Kraus [7:19] It's really like...

Dr. Shreejoy Tripathy [7:20] ...dudes looking at photos?

Dr. Oren Kraus [7:22] Yeah, the scale came, and then once you had all these images, it was really like an afterthought of, "okay, how do we analyze them?" So the bigger companies were more focused on just narrowing that down to a specific measurement. So something simple would be like cell viability. Are the cells alive or dead under this condition versus not? But the trend in the last 5 to 10 years has been to really try to extract more and more information. And that's really where the content part of high content comes in. So measuring lots of parameters, at once, from every single cell, in these large-scale experiments. But what AI lets us do, that really only took off in the last two to three years, is really take a lot of the engineering that goes into that. It would take weeks, months, or even years sometimes to analyze all these images. But now you can really apply the exact same techniques over and over again, and get results almost as fast as you can generate the experiment in the first place. So it really makes that whole tool set a lot more productive, and a lot more useful.

Dr. Shreejoy Tripathy [8:15] Can you give us a sense of just the amount of data you're talking about? Are you talking like two slides or like 2 million?

Dr. Oren Kraus [8:20] Yeah, so the idea with these automated screens is you can image a whole 384 or 1536-well plate within a matter of hours. And then the pharma companies will screen 10s to hundreds of plates. So the data sets can be 300 gigabytes, to terabytes. You've generated that within a few days and want to analyze that almost as fast as you can generate it.

Dr. Shreejoy Tripathy [8:41] Marzyeh if you wanna go next.

Dr. Marzyeh Ghassemi [8:42] It's a hard question for me, because I was a computer scientist first. So I think that the big shift in medicine is that if you watch a doctor, at least in the ICU, and I spent a lot of time in ICUs during my PhD at MIT, so I rounded at Beth Israel with some of the attendings. A patient would come in, and I swear, they would look to me exactly the same as prior patients. And the senior doctors would call it. They would say, "this one's not going to make it, they're going to die." And I would say, "okay," I have my notepad, "why?" And they said, "I'm not sure, I have a sense." They all essentially said the same thing, "I just, I have a sense." And when you probe them on it, as annoying future academics do, they would finally come down to, "I saw a patient once who had something like this, it was a while ago, but you know, it sort of looked this way." So this makes sense to me in my head. If you talk to the doctors, they'll tell you that they operate in a shocking lack of evidence. There is no compendium of every patient that's ever been seen, and all the treatments that have been given to them. Nobody writes a textbook about that. If you're lucky, you'll get a case report about a bizarre patient. And people are trained reasonably consistently at very high profile academic hospitals. But these teaching hospitals, they train you, and then you go off wherever you're going. Maybe you don't retrain. Maybe practice changes. Maybe you haven't seen a patient who looks like this very recently. So I think that the difference that machine learning can offer, that AI and other tools can offer, is that we can look at all the patients who have ever been seen in an institution, not just your institution, but everybody's institution. We can see what kind of treatments actually work well for patients like the one you're seeing. I think that that's amazing, because even the best doctors have to really think when they look at a patient, and then they can make that prediction. I think that we can make doctors' jobs more about providing care, and less about guessing. And I think that that will improve care for everybody.

Dr. Shreejoy Tripathy [11:07] Jason, Joanna, do you want to add to that?

Dr. Jason Lerch [11:09] I think the main thing I would add is that, in my part of neuroscience and brain imaging, and the more clinical side that I'm involved in, I don't think AI has had that big of an impact, yet. So it's more coming. I think the biggest association and where it's coming now is... AI is starting to replace what we can do with more classic image processing, and sometimes classic statistics. But it can't yet, or has not yet helped, in improving understanding. I think that's one of the big things that we'll probably have more of a discussion of today: to what extent can you have a large data set and learn what we already know, and apply it much more efficiently and potentially more robustly? And to what extent are you trying to get something new out of it?

Dr. Shreejoy Tripathy [11:53] Can you ... I'm just gonna push back on that. So can you quickly give an example of what you would mean by understanding?

Dr. Jason Lerch [12:00] What very often happens with AI algorithms is that you can provide a large training data set, and you can say "can you discriminate..." and to use the example we just heard, "...cases that will make it, and cases that will not?" Then it'll tell you, "I can do this with 88% accuracy," or whatever the numbers might be. Then it starts what's almost a process of archaeology of figuring out, "well, what is it about these patients?" Which can often be very difficult to learn from, especially from a more complicated, deeper neural net, or whatever it is that you've used to apply this tool. Compared to the more classic way of analyzing the data. If you use multivariate statistics or massive univariate statistics, the accuracy wouldn't be anywhere close to that high, but our ability to interpret what the models are telling us is still higher.

Dr. Shreejoy Tripathy [12:43] Right. So the challenge is these tools seem to work, but no one really knows why or how they work.

Dr. Jason Lerch [12:51] Right. There's a lot of really interesting work happening in that field of trying to understand and interpret the output of what did the learning algorithm learn? But it isn't always immediately clear why is this actually working?

Dr. Shreejoy Tripathy [13:02] Yeah. Joanna, the BrainHealth DataBank at CAMH is still being built, but can you speak a bit to some of the promise?

Dr. Joanna Yu [13:14] Definitely. So I think to further Marzyeh's points about being able to compare to similar populations, then predicting whether or not the patient will receive a positive or negative outcome, in order to be able to do that, we really need to be able to have this data centralized, standardized, of high quality. I think that's the largest challenge going across hospitals, even within a hospital. So part of what the BrainHealth DataBank is doing is really looking to better integrate clinical and research data, and also improve the quality of it. Can we have standardized assessments being used across clinics, as appropriate? Can we then layer on additional research measures, including blood samples, or even wearables, so that we don't just have data from when the patient's coming into the clinic, but in their real world, in their lives? So just to add to that, I think where we're going is not only [to follow them] when they come in, when there's something wrong with them, but even after the discharge. Can we continue to follow them to be able to really apply all these AI and machine learning algorithms to the full capacity, and accept that they're able to hold promise for improving our future?

Stephania Assimopoulos [14:19] So from what we just heard, AI has made an impact in some fields and contexts more than others, but also in different ways. AI is leading towards more personalized medical decisions based on evidence from big data sets. However, we need to rethink how to most effectively collect such big data sets, both in the short and long term for each patient. It's also important to look past the hype and think of what we have actually gained in terms of understanding while keeping in mind the challenges related to AI.

Dr. Shreejoy Tripathy [14:44] So the question is, what are some challenges that you face using AI in your work? Joanna, maybe we can start with you? I want to know, is there pushback from the clinicians at CAMH, from these old-school doctors who are like, "why do I need to use this measure? Or some other measure? Does this data really need to go into a database? Can't we just do things the old way?"

Dr. Joanna Yu [15:05] Yeah, so that's a great question. We're fortunate at CAMH, we're working with a lot of programs that have already developed standardized care pathways. So in that sense, there's buy-in from the clinicians, and the clinic, and integrated care teams, be it nurses and therapists as well, to be following some sort of standard data collection process. But I think what's really neat about this process is that we're building it together. So a lot of it is co-design. Imagine someone discovered this fancy AI algorithm, 90% prediction rate, we can tell who's gonna respond to drug A positively, and we just sort of walked into the clinic and told the psychiatrist, "listen, we've got this." I mean, they've also had a ton of training, a ton of background, they also understand that there are different circumstances for each patient that make each case unique. There's a need for personalized medicine as well. I think we're not in an era where AI is to replace the doctor, the physicians. At the same time, can we co-design to expedite how we develop these algorithms? There's certain expertise that is not captured by just the data themselves. There's certain inherent knowledge that is within the experts, psychiatrists, therapists, everyone, that can also contribute and expedite how we move forward.

Dr. Shreejoy Tripathy [16:14] That's cool. Oren, can you speak a bit to some of the challenges in implementing AI in your work? And maybe can you speak a bit to using AI in a company?

Dr. Oren Kraus [16:23] Sure. Yeah, there are definite challenges on the infrastructure side. So it's okay if you're a data scientist working on a specific data set, and you can get results relatively quickly. But once you want to build something that works reproducibly across lots of different experiments, or whatever the data sources, you really need to put a lot of effort into the engineering and getting different types of skills together, in order to get that whole thing working. In terms of the AI component, the idea of training supervised models where you know conditions ahead of time that you care to classify, that tends to work relatively straightforwardly. But what the field is interested in, in general, across all of drug discovery, or life sciences, is actually discovering new biology or just new relations in general in the data sets that we're generating. In that setting, it's a bit harder to apply AI, because if you use some of these unsupervised clustering techniques, it's really hard to know whether you're finding something of interest or just experimental noise. So that's something we're actively exploring at Phenomic AI, and other researchers too, is how to design algorithms that really let the biologists, or the scientists, interact with the results, also give feedback, and close that loop between the data science and the experiment.

Dr. Jason Lerch [17:34] I guess the two [challenges are]... and one I have spoken to already a little bit, which is the separation between performance and gaining understanding from the algorithms. The other big one, so far, is still simply numbers. To train an algorithm, you very often need a large number of samples being put in, or large data sets.

Dr. Shreejoy Tripathy [17:54] Do you get a sense for your field, what numbers we're talking about here?

Dr. Jason Lerch [18:00] It really depends on the application. So I've seen two types of applications. One of them is where AI replaces image processing, and there very often you can bootstrap your way to the right number of samples. So you might take a data set of 10,000 brains, you segment it using classic tools, you use that to train an AI algorithm that then does a better, more robust, or much quicker job of segmenting the same brain. So I think there I'm less worried about it. But you're starting to talk about predicting patient response, trying to find who is going to respond to what drug. We tend to work in the world of rare disorders. There are seven patients in the world of that particular disorder, that's a very hard algorithm to train. Seven is the extreme, but very often, it gets into the 10s, 20s, few hundreds with all sorts of heterogeneity and diversity in there. So if it's on the patient prediction type outcome, that's much more of a challenge when you're in the rare disorders world, than it is in some of the more common fields. Trying to understand, how do you begin to generalize from one set of patient groups to another? How do you begin to train these algorithms when you don't have the classic training data set available?

Dr. Shreejoy Tripathy [19:04] Right, yeah. Marzyeh, do you have more comments about challenges?

Dr. Marzyeh Ghassemi [19:07] No, I don't think it's that challenging to use. So maybe, no, not particularly.

Dr. Shreejoy Tripathy [19:13] No? So AI will solve everything?

Dr. Marzyeh Ghassemi [19:16] I mean, people use calculators. There are already risk scores in the ICU. You come in, they add up seven numbers, and they're like, "yeah, that one has this score." It's a fancier calculator, if you want to use it that way. There's a convolutional neural network, it's just not requiring you to specify the features that you want. In the pre conv-net days, you had to say, "I think something is happening at frequency 75 hertz." Seriously, you would talk to a cardiologist and they would say, "look at this component of the QRS." If you can just learn that from a convolutional neural network and get an equivalent categorization, I think it's fantastic. I think if we're talking about exploring new science, all techniques should be put under extreme scrutiny. If we're trying to predict new drug responses, or something completely novel where we are doing completely new science, then it doesn't matter that that technique is AI. You should be terrified if your logistic regression model is super confident too. This is a recent burn I had on a paper review, somebody said, "you should use a logistic regression model, because it's interpretable." And I want to know who thinks in log odds in this room? Who thinks in e to the beta naught, plus beta one, x one, plus ... Nobody. Nobody. I agree that low capacity models are often easier to look at, because we have ways that we have trained ourselves to look at them. But there are ways to look at high capacity models. So I don't think it'll solve everything. But I also think it's a tool just like all other methods we use.
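
To make the log-odds remark concrete, here is a minimal sketch that was not part of the panel discussion. It assumes a logistic regression with a single feature; the coefficient values and the feature (heart rate) are hypothetical and chosen only for illustration. It shows that "interpreting" such a model means reading the linear predictor beta_0 + beta_1 * x1 as log odds, and e^(beta_1) as a multiplicative change in the odds per unit of x1.

```python
import math

# Hypothetical coefficients, for illustration only (not from any real clinical model).
BETA_0 = -4.0   # intercept
BETA_1 = 0.05   # coefficient for one feature, e.g. heart rate in beats per minute


def log_odds(x1: float) -> float:
    """Linear predictor beta_0 + beta_1 * x1, i.e. the log odds of the outcome."""
    return BETA_0 + BETA_1 * x1


def probability(x1: float) -> float:
    """Apply the logistic function to turn log odds into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds(x1)))


def odds(x1: float) -> float:
    """Odds of the outcome, p / (1 - p), which equals exp(log odds)."""
    return math.exp(log_odds(x1))


def odds_ratio_per_unit() -> float:
    """e^(beta_1): factor by which the odds change for a one-unit increase in x1."""
    return math.exp(BETA_1)


if __name__ == "__main__":
    for heart_rate in (60, 80, 100):
        print(heart_rate, log_odds(heart_rate), probability(heart_rate))
    print("odds ratio per unit:", odds_ratio_per_unit())
```

The coefficient is simple to state, but what it says is indirect: a one-unit increase multiplies the odds by e^(beta_1), which is not the same as adding a fixed amount to the probability.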

Grace Jacobs [20:58] It was interesting to hear Dr. Ghassemi's comment about AI just being another tool, if a more complex one. This helped put AI in context with other multivariate approaches that currently exist and present their own challenges.

Stephania Assimopoulos [21:09] This was supplemented by Dr. Lerch's compelling comments on the generalizability of these tools, based on the task at hand. Unlike a calculator, it is variable. It's a more complicated tool than others we have used so far. And there are some key requirements to make it work for us, like having good data. Although often presented as such, it's not a magical black box that will always reliably do what we want. There are standards and difficulties associated with each case, and caution is required.

Grace Jacobs [21:36] However, all panelists converged on the complex challenges related to assessing if models are generalizable, and the importance of closing the feedback loop to integrate clinicians and biologists into model development and validation.

Dr. Shreejoy Tripathy [21:49] We've spoken to this point a little bit, but can we hear a bit more about what's the role of human experts, or basically doctors? As these algorithms are coming online as they're improving, do they still have a role? And if so, what is the role like? Marzyeh, do you want to start?

Dr. Marzyeh Ghassemi [22:03] Yeah, I think the machine learning for health community, at least the people I know who worked primarily in EHR, in hospitals, on longitudinal health records, nobody would ever claim that we're trying to replace the doctor, because that's not anybody's goal. A lot of what we're trying to do is provide better recommendations, or better evidence, or better options. But just like in most fields that automation has touched in some way, you always need a pilot. You always want somebody at the end of the line who makes a decision about something. I think the wrinkle for healthcare is... think about other fields like flying. Altitude is an absolute value, I know what altitude I'm at. The plane, it crashes, or it does not crash, right? You can measure how much bouncing it does. So you can say how good a job your model did at landing the plane. So medicine, healthcare in general, is a setting where experts make decisions about what is the diagnosis. They can decide something is a syndrome one day, they can combine it with something else another day. Decisions about diagnosis and phenotype are made by doctors. There are many conditions that used to be considered one thing, but now are multiple things, because we've learned over time that they should be separated. Science has made an adjustment. So I think that while, you know, experts get to decide what a condition is, experts are allowed to disagree. Which is very unique, honestly, in many of the fields that automation has transformed. Usually, if we say that's a chair, it's a chair, and every expert in this room will say it's a chair. But that's not true in medicine. If you ask if somebody is diabetic, and when they were diabetic, people disagree about the onset time. Then even if you could have a complete compendium of "these are all the reasons I might have an onset time," do you have the data to establish that?

Dr. Shreejoy Tripathy [22:03] Yeah, that's really interesting. Maybe real quick, can someone answer the question do we think we hold AI to a higher bar in medicine than we would for other fields?

Dr. Oren Kraus [24:21] Yeah, I'm not from the medicine space. But I think so. Yeah, you can also compare it to, like autonomous driving ...

Dr. Shreejoy Tripathy [24:26] Yeah, right. Uber is, you know, killing people.

Dr. Oren Kraus [24:28] Yeah. So that's right. You know, even though, overall, even today, we'd probably be safer if everyone had an autonomous vehicle, we're still terrified of at least like one of them making a mistake once a year or something. Medicine, like Marzyeh said, it's not like we're gonna be replacing doctors, and they're still gonna have the final decision. So I think it's a matter more of educating doctors about how to use new tools. That could be more of a generational thing. It might take the next generation of doctors to really adopt some of these newer technologies.

Grace Jacobs [24:58] And pause! Before we move on to our next question, we want to lighten things up by asking our panelists to address some common misconceptions about AI in the lightning round.

Dr. Shreejoy Tripathy [25:06] Okay, so I have five questions that I've asked you guys, and you're gonna get a placard that says "agree" or "disagree". So I'm gonna say a statement, and then you're going to agree and disagree. I'm going to quickly tally up the things and maybe ask one of you to say a quick comment about it before we move on to the next question. Okay, so the first question: AI can be more accurate than the decisions of human experts? Four agrees! We don't have our doctor on the panel, she dropped out, I wonder what she would say. Okay, question number two: AI works in the same way that a human brain does? Ooh, Oren why do you say agree?

Dr. Oren Kraus [25:46] Yeah, I think we learn from experience, we just experience things a lot slower than a computer would, right? So for example, algorithms that are trained on a lot of data to make diagnoses are better just because they see tons of examples from tons of different doctors. Whereas in your own life, you see however many patients you can see a day and you make those decisions. Or in med school, you learn from case studies or textbooks. So the amount of experience you have is very, very limited compared to what you can feed an algorithm.

Dr. Shreejoy Tripathy [26:13] Right. So, in the sense that they both learn. But Jason, really quick, why do you say disagree, that AI does not work how the human brain works?

Dr. Jason Lerch [26:19] Does an AI algorithm have to sleep to consolidate memories?

Dr. Shreejoy Tripathy [26:25] Marzyeh? I don't think so, right? Well, the thing with those cool algorithms that sort of destroy chess is, they can just run all day and all night, and then they destroy our best chess players. Question number three: AI is free of bias. Four solid disagrees. Cool. Moving on. [Question number 4:] AI that is good at one thing will be good at other things. Four disagrees. Okay. And then one sort of controversial question: The use of AI in medicine has already led to the incorrect treatment of someone and their death. Okay, three agrees. And one disagree. We have a couple of maybes. But does someone want to speak more? Marzyeh do you want to speak to agree?

Dr. Marzyeh Ghassemi [27:12] It depends on what you call AI. Risk calculators are AI, as are simple logistic regression models or random forests. The issue is humans deploy these things. So I know of a hospital where they made a risk score that did not include race. The hospital administrator said that they noticed that they were actually having more deaths in black patients, who came in with pneumonia, than white patients, because the risk score was miscalibrated. Not because of any physiological thing, but because in this particular urban setting, I'm trying really hard not to disclose anything, there was a substantial low income, black population. So the assumption by some of the attendings, who I knew personally, was if they came in and they had a fever, and they were shaking, that they were withdrawing. They were on something. So they didn't do the prophylactic administration of the antibiotic, because the risk score didn't take into account that there are social determinants of health that often overpower your age, your height, your glucose level, your heart rate.

Dr. Shreejoy Tripathy [28:37] That's a great example of bias...

Dr. Marzyeh Ghassemi [28:39] ...it's a sad example.

Dr. Shreejoy Tripathy [28:40] Well, yeah, it's a sad example. That's something to keep in mind as AI proselytizers, in a sense.

Dr. Marzyeh Ghassemi [28:47] And I think it's important when a local hospital approaches me, I want to be really sure about how rigorously we can check our results in many ways.

Dr. Shreejoy Tripathy [28:57] Okay, so we're done with the lightning round, you can put your placards down. So the next question is, do you think there's a gap in the perception of AI in the public as compared to the realities of AI? Jason, do you want to start?

Dr. Jason Lerch [29:09] Probably, but mostly because it's been overhyped at some points in the past as...

Dr. Shreejoy Tripathy [29:13] Who's doing the overhyping?

Dr. Jason Lerch [29:15] I think a lot of it is in the misinterpretation of the academic world by the more popular news.

Audience [29:20] *laughs*

Dr. Shreejoy Tripathy [29:22] Right. So press officers, maybe some of them are ...

Dr. Jason Lerch [29:27] It wasn't really in the context of AI, but there was a wonderful article that I read trying to say that there's a lot of overhyping of scientific results more generally. They then traced it back and realized that most of that actually came from the quotes the scientists had given to the local press office, which were then being picked up by the journalists. So we might have to look in the mirror to see whose fault this is as well.

Dr. Shreejoy Tripathy [29:47] How can we better inform the public about some of the realities of AI, especially given that it's still nascent? Marzyeh's work is a great example, but it's definitely not perfect. Like self-driving cars, they're driving on the road, and they're driving for hours, but they're still killing people. So what do we do in this transition time to help manage the hype in the public?

Dr. Jason Lerch [30:10] I'm not sure if it's different for AI than it is for anything out of medicine or engineering that's coming out. Often something comes out, everybody's very excited, and then over the years it begins to settle more into here's what it can do, and here's what it can't do. And AI comes down to especially this idea that no matter how perfect the algorithm, if you feed in garbage to train it, you're going to get garbage out. And that's where the solutions, as well as the realism, are going to come in. It's hard to generate perfect data to train anything.

Dr. Shreejoy Tripathy [30:36] Oren, do you want to speak to that?

Dr. Oren Kraus [30:37] I think you can also, depends on the news media, highlight some examples where AI can't do something really, really simple. If you go to some conferences, like NeurIPS, you'll see a poster where they're still training things on handwritten digits, just to prove that some algorithm works with, for example, low data. Only having two or three examples per class. So that's something we can't really do yet with a lot of the machine learning techniques. So communicating those limitations. Also, those are newer ideas, which aren't really fully scaled yet to the real world problems. But just showing where there's still a lot of room for improvement would really help people kind of put everything in context.

Dr. Shreejoy Tripathy [31:12] Joanna do you want to speak a bit to managing some of the hype?

Dr. Joanna Yu [31:15] I think some more knowledge translation around what it means when someone says 80% prediction rate, 100% prediction rate, sort of the expectation...

Dr. Shreejoy Tripathy [31:22] Can you just say that again?

Dr. Joanna Yu [31:24] When they say 80% positive prediction, what are the caveats? When does that hold true? What are the limitations? Just in sort of plain language. I get, yes, that's described in publications. But what does that mean to the end user in terms of impact?
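
One way to make the caveats behind a figure like "80% positive prediction" concrete is to read it as a positive predictive value, which depends on how common the condition is in the population being tested. The sketch below is a minimal illustration under that reading, not anything the panel presented; the function name and the sensitivity, specificity, and prevalence figures are hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of positive predictions that are true positives (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Same hypothetical test (90% sensitivity, 90% specificity),
# applied to two hypothetical populations.
common = positive_predictive_value(0.90, 0.90, prevalence=0.50)
rare = positive_predictive_value(0.90, 0.90, prevalence=0.01)

print(f"condition common: {common:.2f}")
print(f"condition rare:   {rare:.2f}")
```

With these assumed numbers, the same test is right about nine times out of ten when it flags someone in the first population, and about one time in twelve in the second, which is the kind of plain-language caveat an end user would need.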

Dr. Shreejoy Tripathy [31:40] Marzyeh do you have comments about ...

Dr. Marzyeh Ghassemi [31:42] I had one thing that I think was said by a speaker at the Vector AI rounds yesterday that I really liked. The example she gave was not healthcare, she said there was an intervention done in Tennessee where they reduced classroom sizes. Lo and behold, all of those students did really well. So a paper was published and some calls were made, and the state of California decided to do this, reduce classroom sizes. It failed miserably, the students didn't do better. It's because they have a larger number of students that they have to educate. So smaller classroom sizes meant teachers were going room to room, they had to cover multiple locations, you were splitting up groups of kids who maybe were in teams together previously. So one of the things that Dr. Kleinberg said yesterday in her talk is knowing the causal effect is not good enough. Knowing the 80% positive predictive value isn't good enough. Knowing the necessary and sufficient conditions to reproduce that causal effect is. And I think that's what our limitations sections usually say. They'll say this only works in this case, under these conditions, we have this kind of algorithm with this kind of training. But that's often not what's spoken about.

Dr. Shreejoy Tripathy [33:01] Just in general, do you think there could be more effort to see whether algorithms generalize across different data sets, and populations? Do you guys think there should be more?

Dr. Marzyeh Ghassemi [33:13] Yes. So I have a student who's currently working on whether algorithms generalize over time and policy changes. So the Affordable Care Act happened, doctors started behaving differently because they were reimbursed differently by the American government...

Dr. Shreejoy Tripathy [33:26] ...AKA Obamacare, and it brought health care to most of America, unlike Canada where things are great.

Dr. Marzyeh Ghassemi [33:33] Yeah. Sorry, guys. Sorry. So when this policy change happened, our models did really poorly. You can imagine that would be true, right? An AUC that is 0.9 is suddenly 0.6. So from 90% accurate to 60% accurate. It's because something changed, and the model was not aware of it, and you didn't tell it anything. It's assuming that it can go the same way.
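
For readers less familiar with AUC, the sketch below shows what the number measures: the chance that a randomly chosen positive case gets a higher score than a randomly chosen negative one. It is a ranking measure, so it can move independently of percent-correct accuracy, and the "90% to 60%" gloss is best read as shorthand. The toy scores and the `before`/`after` split are hypothetical and are not the lab's data or models.

```python
def auc(scores, labels):
    """Probability that a random positive case outranks a random negative one."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    # Count pairwise wins; ties count as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of cases classified correctly at a fixed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


labels = [1, 1, 0, 0]
before = [0.9, 0.8, 0.7, 0.6]  # every positive is ranked above every negative
after = [0.9, 0.6, 0.8, 0.7]   # hypothetical shift: the ranking degrades

print("before:", auc(before, labels), accuracy(before, labels))
print("after: ", auc(after, labels), accuracy(after, labels))
```

In this toy case the AUC falls from 1.0 to 0.5 while accuracy at a 0.5 threshold stays at 0.5, which is why a drop in AUC after a policy change is worth tracking separately from accuracy.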

Grace Jacobs [33:57] So, all of our panelists agree that communication between the scientific community and the public regarding advancements such as AI is currently lacking and needs to improve.

Stephania Assimopoulos [34:06] There are a number of reasons why there's often a divide between science and public perspective, that our experts touched upon. Scientists usually spend a lot more time speaking to other scientists about their work. It can be difficult to take off that hat when addressing a broader community. We need to keep in mind the end user when we talk about how an algorithm performs and what that really means.

Grace Jacobs [34:26] This is an issue we really care about here at Raw Talk, where our mission is to communicate scientific discoveries and the stories behind them in an effective and accessible way, within and outside of the scientific community. This discussion was also reminiscent of last year's live event, focusing on science communication and public engagement. Be sure to check that out on episodes 44 and 45.

Dr. Shreejoy Tripathy [34:46] So we probably have a lot of people who are super interested in AI, and they want to start using it in medicine, in their own work. Can you guys just provide a couple of words about what they should consider before they start incorporating this into their work? If you could give them one or two pieces of advice. Maybe imagine yourself 10, 20, 30 years younger, what would you tell yourself if you were starting out today? Oren, do you want to start?

Dr. Oren Kraus [35:13] Sure. I'd get to know the basics. So you know, there are good resources online, like Coursera, for learning the ins and outs of machine learning, and then deep learning. But also, there's an explosion of papers on AI applications in pretty much every single subfield. So whatever data you're interested in applying it to, or whatever your field is, you can also see what's been done more recently, and that'll give you a lot of tips on how to get started applying AI to your own problems.

Dr. Shreejoy Tripathy [35:36] Marzyeh?

Dr. Marzyeh Ghassemi [35:38] I mean, my recommendation to you 10 years ago is buy real estate. But I would say for you guys, if you want to focus on models, invest really heavily in understanding what you're doing. I was just discussing that this past semester, I taught this machine learning for health graduate course in the computer science department. They gave it to me because it's like a light load. You expect, you know, 10 to 20 students in a graduate seminar course and 100 students showed up the first day. So I had a quiz, a pop quiz, about probability, statistics, inference, things that should have been covered in an introductory machine learning course. And half of the students dropped. Which I think tells you that there are a lot of automatic things right now. You can download TensorFlow, or PyTorch, or Keras, or whatever you want. There are pre-trained ResNet models, it's easy to do. But if you're not aware of how these things are trained, it's really easy to misuse them. Like shockingly easy to misuse them. I don't want somebody to download code that I've written, or a student of mine has written, and use it to discriminate against women in application essays, Blacks in probation hearings, or Southeast Asians in medical care. So take a lot of courses.

Dr. Shreejoy Tripathy [37:12] Learn the basics. Jason?

Dr. Jason Lerch [37:15] After that, think about your measurements. The learning algorithm can't overcome bad data that goes in, or can't completely overcome it. The better the data is, the better the outcome will be no matter what the algorithm is at the end of the day.

Dr. Shreejoy Tripathy [37:29] Joanna?

Dr. Joanna Yu [37:29] Yeah, I would just add to that, really understand the provenance around your data. How it was...

Dr. Shreejoy Tripathy [37:33] ...can you explain what provenance is?

Dr. Joanna Yu [37:35] So the context around your data, how it was collected, if possible, who collected it, when it was collected. Just basically all the details that you would want to know about the time point, and anything surrounding that, the process that was used. All of that to just help paint a better picture, so you can understand the quality of the data that you're working with.

Dr. Shreejoy Tripathy [37:57] Right. So just to quickly summarize, understand your data.

Dr. Joanna Yu [38:01] Yeah, and don't be afraid to ask questions from the people who collected the data if it wasn't yourself.

Dr. Shreejoy Tripathy [38:05] And understand your algorithms. We didn't really say much about the specific algorithms, but maybe those two ideas are kind of take homes? Algorithms will improve, data will improve, but there will always be bad data and there always will be bad algorithms.

Stephania Assimopoulos [38:19] When we opened up the floor to the audience, we got some great questions about interpretation of AI models, when we should and shouldn't use them, where responsibility lies when things go wrong, and patient confidentiality.

Audience Member 1 [38:31] When you're applying AI, or deep learning, or machine learning techniques, or models on top of the clinical data, one of the problems that we have with the clinicians is that they are complaining that the data is not interpretable. The problem is just getting more complicated when we are going to use...

Dr. Shreejoy Tripathy [38:53] ...do you mean the data? Or do you mean the results of the model?

Audience Member 1 [38:57] The results of the model. They are not interpretable. The problem gets worse when you're using a deep learning model.

Dr. Shreejoy Tripathy [39:08] So can you guys speak to the interpretability of machine learning? Joanna, do you want to start?

Dr. Joanna Yu [39:11] First and foremost, I think when you're working with clinicians, a lot of times there's data overload. Even just in terms of technology, and what they have to enter in paperwork for electronic medical records, or just medical records themselves. So first and foremost, and this is the approach that we're taking in working with them, is to co-design tools that allow them to use the data that's being collected in an easy, accessible, meaningful way for them, before we even get to the stage of returning these advanced algorithms. But really making it in a way that is interpretable.

Dr. Shreejoy Tripathy [39:46] Jason, you spoke a bit to interpretability. Are there recent efforts to make the outputs of machine learning elements more interpretable and more interpretable to end users like clinicians?

Dr. Jason Lerch [39:55] I think there are quite a few, especially in terms of visualization of the output, trying to understand what's happening, which patients are being classified one way or another, what are the features that seem to have the most weight, the most importance. So I think there's quite a bit of effort. But I do think it can be a problem, because there are cases where you simply want the algorithm to be as accurate as you can. Then there are cases where accuracy matters less than the understanding that comes from the physical output, and it's in that where there's still a challenge. I think there's work, and it's fascinating work, on it. But I think it'll remain a challenge for a while. But it's not that different from the more classic statistics as you move away from a one- or two-variable linear model into a multivariate canonical correlation analysis. Really understanding what that's doing isn't particularly easy, either.

Dr. Shreejoy Tripathy [40:41] Yeah for sure.

Audience Member 2 [40:44] Hi, just a quick question, what should we not use AI on?

Dr. Shreejoy Tripathy [40:49] I like that.

Dr. Oren Kraus [40:51] Yeah, data sets where there's very limited data. So if there's only a few samples, you can't really apply AI reasonably.

Dr. Shreejoy Tripathy [40:58] What's a few samples?

Dr. Oren Kraus [40:59] Depends on the problem. For images, if it's fewer than tens to hundreds of images per category, then it's hard. Also, for cases where there are no clear classes, it's really hard to figure out how to use AI for that. You could use some of the machine learning techniques, but not necessarily deep learning. Also, just understanding what your problem is first before jumping to use AI is important. Sometimes something really simple will work, or just looking at the data will be a lot faster than trying to use AI.

Dr. Marzyeh Ghassemi [41:29] You should not use AI to communicate end of life decisions to families, you should not use AI to have doctors interface with certain patients. I think in any field, you can probably segment out the things where you think, "I could probably just make that easier, if I had somebody else do it." I don't really care about how it gets done. I just want it to get done. Those things are things that AI probably can and should do. If it's something where you think, "I really want to make sure that I communicate that correctly, that I do that right." That's probably something that you should be doing. So it's fine to me if, for example in my course, if I have an algorithm that goes through and grades all the code that the students wrote. But when it comes to deciding who plagiarized, I make that call.

Audience Member 3 [42:30] So I'm interested in when things go wrong. So you mentioned in the hospital, in the United States, where there was an issue because of not calculating in various variables, people died. So I'm interested in who's responsible?

Dr. Marzyeh Ghassemi [42:49] I think, and I didn't say very much on this on the previous question, that people hold AI, because of the hype cycle, to a higher standard than they do other tools. So in that particular story, they were using an add 'em up model. It wasn't a deep learning model, random forest, it was just, "check this on the patient, give them a score from zero to five, check that, give them a score from zero to five." They added that all up, and they had a total score, and then they had ranges for decisions. That's very normal in most medical care. A lot of medical practice has these checklists, or these risk scores, that doctors are hand adding on sheets of paper. So I don't think that AI should be held to a different standard than another tool, just because it has higher capacity. I think it should be held to the same standards. Doctors are professional advice givers, just like lawyers and accountants and everybody else. We regulate professional advice givers. Giving a professional advice giver a tool, I don't think makes them a worse advice giver.
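
The "add 'em up" model described here can be sketched in a few lines: score each checklist item from zero to five, sum the scores, and map the total onto decision ranges. Everything specific below is an assumption made for illustration, including the item names, the cut-offs, and the actions; none of it comes from the hospital in the story or from any real clinical score.

```python
# Hypothetical checklist items, each scored 0-5 by the clinician.
ITEMS = ("breathing", "temperature", "heart_rate")

# Hypothetical decision bands: (minimum total, action), checked highest first.
BANDS = ((10, "admit"), (5, "observe"), (0, "discharge"))


def total_score(item_scores):
    """Validate each item score and add them up."""
    for name in ITEMS:
        value = item_scores[name]
        if not 0 <= value <= 5:
            raise ValueError(f"{name} must be scored from 0 to 5, got {value}")
    return sum(item_scores[name] for name in ITEMS)


def decision(total):
    """Map a total score onto the first band whose minimum it reaches."""
    for minimum, action in BANDS:
        if total >= minimum:
            return action
    raise ValueError("total score cannot be negative")


patient = {"breathing": 4, "temperature": 3, "heart_rate": 2}
total = total_score(patient)
print(total, decision(total))
```

Note what the sketch leaves out: any factor that is not on the checklist has no way to influence the total, which is the failure mode in the pneumonia example from earlier in the panel.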

Audience Member 4 [43:56] Hi, I'm wondering, is patient confidentiality a barrier to the true power of AI? If yes, what could be a potential solution?

Dr. Marzyeh Ghassemi [44:10] This is possibly the worst burning question to ask me, because I feel very strongly about this. So, in the United States, we have HIPAA. HIPAA regulates the use of clinical data. If you de-identify, under HIPAA standards there are 23 fields that are PHI, you get certified that you de-identified them to this standard. You are then allowed to release that data to researchers, like me, so that I can run whatever model I want on my own GPUs, and it's HIPAA compliant, because I have done the appropriate training. In the United States, there's CITI training. Here it is CPTP training that is very similar, this is the equivalent you should have in your head to map to. Then I signed an End User License Agreement saying I would never try to re-identify this and I would never try to redistribute this. And then me, and my whole lab, are able to use this data. And this is fantastic! The US is definitely moving in this direction, where they say, we'll de-identify things, and then we'll say that researchers should be able to use this in their work. I don't think that Canada is quite there in terms of social acceptance. I'm a newcomer here, so I don't know how quickly that's going to change. I'm hoping it's going to change really quickly. I'm hoping that because at this point, I can easily work with many American datasets and British datasets. But it's been very challenging to work with Canadian datasets.

Dr. Shreejoy Tripathy [45:34] Joanna, do you want to add to that?

Dr. Joanna Yu [45:36] So that certainly is a challenge. On the clinical side, if you're looking to do initiatives that are striving for quality improvement, then you are able to access that data for that purpose. On the research side, I think she covered all the points really there, we also have PHIPA as well. You are required to de-identify it, and different models are used in different places. So for example, when you're in a research study, if you additionally want to gain access to data that was collected for clinical purposes, the research ethics board would have to give their approval and you would also have to obtain consent from the patient. So going forward, the question becomes, in our healthcare system, in hospitals, do we want to empower patients, inform patients that we are using their data for additional research questions, beyond just providing them clinical care? So a model that we're currently exploring is to make patients aware as well, and ask for their consent to be able to do these things.

Grace Jacobs [46:33] There's no doubt that AI is making an impact in medicine, although we're still discovering its potential limitations and how we can use it in an optimal way. It's definitely important to keep in mind, as Dr. Lerch shared, if we put garbage into AI models, we're going to get garbage out.

Stephania Assimopoulos [46:47] As AI continues to be applied and implemented in medicine, it is essential that we improve knowledge translation to scientists interested in working with AI, to healthcare providers integrating AI, and to the public trying to navigate and understand the implications of AI. We hope this event and the episodes we're sharing are a positive step towards this goal.

Grace Jacobs [47:06] A special thank you to our fantastic panelists and moderators, everyone on the Raw Talk team, who made this year's live event possible, and our sponsors for their support. Be sure to check out our next episode discussing the second panel of the event on the future and ethical considerations of artificial intelligence. Until next time, keep it raw!