WEBVTT

1 00:00:12.280 --> 00:00:14.730 Dr. Suchi Saria: Mike, can I ask a quick question while we're waiting?
2 00:00:14.960 --> 00:00:22.050 Michael Littman: Yeah, yeah, though there are people on site. People are arriving. So we're talking at the front of the auditorium while people filter in. Go ahead.
3 00:00:22.050 --> 00:00:27.830 Dr. Suchi Saria: Okay, got it. I was just curious, are we sort of aiming for an hour with Q&A?
4 00:00:29.187 --> 00:00:33.469 Michael Littman: I think we have... the slot goes for an hour fifteen, including Q&A.
5 00:00:33.470 --> 00:00:34.599 Dr. Suchi Saria: Okay, perfect.
6 00:00:35.770 --> 00:00:37.890 Michael Littman: I can double check that in my calendar.
7 00:00:38.250 --> 00:00:39.340 CISE AD | Gregory Hager: No, that's correct.
8 00:00:39.610 --> 00:00:41.609 Michael Littman: All right. We have confirmation.
9 00:00:42.280 --> 00:00:47.400 Michael Littman: And then there's a follow-up meeting with NSF staff. That's a separate thing.
10 00:00:48.140 --> 00:00:48.970 Dr. Suchi Saria: Perfect.
11 00:00:49.930 --> 00:00:50.919 Michael Littman: Cool, cool, cool.
12 00:00:51.930 --> 00:00:55.939 Michael Littman: All right. So it looks like people are still filtering in,
13 00:00:56.190 --> 00:00:57.202 Michael Littman: so I will
14 00:00:57.730 --> 00:01:00.480 Michael Littman: wait a moment or two for the numbers to level off.
15 00:01:01.000 --> 00:01:02.520 Dr. Suchi Saria: Are you there in person?
16 00:01:03.000 --> 00:01:05.500 Michael Littman: I am in the building at NSF right now, yeah.
17 00:01:05.970 --> 00:01:09.779 Michael Littman: And I was actually on campus at Hopkins on Monday.
18 00:01:10.410 --> 00:01:11.150 Dr. Suchi Saria: Nice.
19 00:01:11.150 --> 00:01:13.140 Michael Littman: Were you? Are you there, or are you in New York?
20 00:01:13.140 --> 00:01:15.990 Dr. Suchi Saria: San Diego right now, speaking from a hotel,
21 00:01:16.120 --> 00:01:18.150 Dr. Suchi Saria: because I'm in a different meeting.
22 00:01:19.320 --> 00:01:20.150 Dr. Suchi Saria: Sadly.
23 00:01:20.150 --> 00:01:20.710 Michael Littman: Joining.
24 00:01:21.350 --> 00:01:30.910 Michael Littman: Okay, I feel like we're at sort of a trickle level of increase at the moment in terms of participants, so I will kick this thing off.
25 00:01:30.960 --> 00:01:38.399 Michael Littman: Thank you, everybody, for coming to the National Science Foundation CISE Directorate Distinguished Lecture Series.
26 00:01:38.570 --> 00:01:52.139 Michael Littman: And I'm very honored to introduce today's speaker, Professor Suchi Saria. She is the John C. Malone Associate Professor of Computer Science, of Statistics, and also of Health Policy at Johns Hopkins University,
27 00:01:52.220 --> 00:02:01.900 Michael Littman: and one of the reasons that we were very excited to invite Suchi as an NSF CISE distinguished lecturer was because she's working on so many interesting things. And of course,
28 00:02:01.900 --> 00:02:26.529 Michael Littman: that also made scheduling her talk a bit of a challenge, because she's doing so many interesting things. And in fact, I found out that she's calling in today from San Diego, because there's another set of meetings that she's involved in there, and she's taking time out to talk with us. But we're here now, and I'm really glad that she's able to join us. Her work has received recognition in numerous forms, including best paper awards at machine learning, informatics, and medical venues.
29 00:02:26.530 --> 00:02:50.390 Michael Littman: But since this is an NSF event, I'll highlight her NSF Computing Innovation Fellowship, which she got in 2011. And it's also worth mentioning that she is the founder and CEO of Bayesian Health, which is just a great name, I love that name. And in the words of Forbes magazine, "founder Suchi Saria's startup Bayesian Health offers software to help hospital staff identify high-risk patients.
30 00:02:50.420 --> 00:03:01.789 Michael Littman: Its products evaluate health history and medical records to empower healthcare providers to take timely, lifesaving actions for patients at risk of critical conditions like sepsis, deterioration, and pressure injuries.
31 00:03:01.940 --> 00:03:11.389 Michael Littman: The company has raised 15 million dollars from investors including Andreessen Horowitz, and another 15 million in grants from the National Institutes of Health,
32 00:03:11.420 --> 00:03:29.219 Michael Littman: DARPA, and, most importantly, and I'm sure it says that in the Forbes magazine blurb, the National Science Foundation." She's given various flavors, this is now me again, she's given various flavors of TED talks as well as keynote slots at many major conferences in AI, machine learning, and medicine.
33 00:03:29.220 --> 00:03:42.949 Michael Littman: I personally have been following Suchi's work since she was a PhD student with Daphne Koller, and I'm eagerly anticipating hearing what she'll share with us today. So let's all give a warm, albeit silent, welcome to Dr. Suchi Saria. Thank you for being here, Suchi.
34 00:03:43.910 --> 00:03:51.869 Dr. Suchi Saria: Thank you for having me. I'm very excited to be able to join. I wish I could see the folks who are there in person, or the folks who are attending.
35 00:03:52.772 --> 00:03:55.339 Dr. Suchi Saria: I structured my talk,
36 00:03:55.540 --> 00:03:56.709 Dr. Suchi Saria: you know, when...
37 00:03:57.180 --> 00:04:04.570 Dr. Suchi Saria: as I sent the invite out, I got a whole bunch of folks from completely different areas of computing and math and CS
38 00:04:04.590 --> 00:04:09.950 Dr. Suchi Saria: and other areas, who wrote to me saying, oh, you're giving this lecture, which made me think the audience that is
39 00:04:10.674 --> 00:04:18.479 Dr. Suchi Saria: hearing this, the intended audience, is very broad, and not just people who do AI and machine learning research all day long.
40 00:04:18.519 --> 00:04:25.269 Dr. Suchi Saria: So with that in mind, I've structured my talk so that it won't be entirely boring to people who don't do AI/ML all day.
41 00:04:25.300 --> 00:04:29.029 Dr. Suchi Saria: It will be a collection of broad descriptions of problems,
42 00:04:29.080 --> 00:04:34.889 Dr. Suchi Saria: a little bit of an overview of sort of a 10-year journey of bringing some ideas to life,
43 00:04:34.970 --> 00:04:40.990 Dr. Suchi Saria: and then in some areas we'll do a deep dive into recent state-of-the-art ideas
44 00:04:41.170 --> 00:04:43.120 Dr. Suchi Saria: on the AI/ML front.
45 00:04:44.450 --> 00:04:55.089 Dr. Suchi Saria: With that in mind, I'll jump in. So here's a sort of a paper. Generally, you know, I wouldn't have had the dates there, and I would have done a real-time poll
46 00:04:55.200 --> 00:04:58.750 Dr. Suchi Saria: on this. And the paper is basically talking about
47 00:04:58.780 --> 00:05:06.219 Dr.
Suchi Saria: how rapid advances in information science and computer science are going to absolutely, completely transform the field of medicine and the practice of medicine,
48 00:05:06.410 --> 00:05:10.640 Dr. Suchi Saria: and, you know, reimagine the role of caregivers.
49 00:05:10.940 --> 00:05:21.420 Dr. Suchi Saria: And, you know, when you read it, you think, at least for me, I feel like it's the kind of thing I would expect to be written today, because of the amount of excitement for what computer science can do,
50 00:05:21.480 --> 00:05:25.559 Dr. Suchi Saria: except the article was written in 1970,
51 00:05:25.570 --> 00:05:41.399 Dr. Suchi Saria: which, you know, as a researcher, I want to kind of learn from history. So when I read that, it was kind of disappointing to me, because I'm like, holy moly, what if 30 years from now... you know, we've made all these promises now, and 30 years from now we look back and we sort of still feel like,
52 00:05:41.800 --> 00:05:46.980 Dr. Suchi Saria: you know, we were so excited, and then, you know, not as much materialized. Here's another
53 00:05:47.130 --> 00:05:50.300 Dr. Suchi Saria: article, from 1994,
54 00:05:50.660 --> 00:05:55.500 Dr. Suchi Saria: which says: a report card for computer-assisted diagnosis, grade C.
55 00:05:55.530 --> 00:06:13.879 Dr. Suchi Saria: So as a young researcher, or a younger researcher, when I was younger, looking into this, this sort of really bothered me. And so one of the things that I often think about is how to make sure we don't repeat this, and that sort of inspired a lot of the trajectory of my work. But part of,
56 00:06:13.920 --> 00:06:40.000 Dr. Suchi Saria: part of what's also exciting is, it's not just about the methods work, it's not just about the tech itself. It's also about the ecosystem within which the tech is getting embedded, and a lot has changed about the ecosystem in which the tech is getting embedded, which is why, I think, finally, we're at this turnaround point where we are going to see this crazy, rapid adoption. Or at least I can palpably see
57 00:06:40.010 --> 00:06:42.200 Dr. Suchi Saria: a very meaningful difference in the last
58 00:06:42.290 --> 00:06:45.530 Dr. Suchi Saria: 4 to 5 years. So first,
59 00:06:45.640 --> 00:06:58.650 Dr. Suchi Saria: evidence of that: this slide is already outdated. There are upwards of about 500, maybe 600 now, maybe more, AI-enabled devices that have been approved by the FDA in the last,
60 00:06:59.180 --> 00:07:01.770 Dr. Suchi Saria: you know, in the last few years.
61 00:07:02.172 --> 00:07:09.480 Dr. Suchi Saria: Many of these are in imaging. I'll talk a little bit more about that, and sort of the opportunities in multimodal AI.
62 00:07:09.530 --> 00:07:10.164 Dr. Suchi Saria: But
63 00:07:10.890 --> 00:07:18.180 Dr. Suchi Saria: there's definite evidence for, you know, ideas going from what we would call the lab bench
64 00:07:18.230 --> 00:07:19.540 Dr. Suchi Saria: to the bedside,
65 00:07:20.340 --> 00:07:26.710 Dr. Suchi Saria: in terms of the opportunity, the real-world practical impact these ideas can have. Like, here are some real
66 00:07:26.840 --> 00:07:29.398 Dr. Suchi Saria: technologies today that are live,
67 00:07:29.940 --> 00:07:34.590 Dr. Suchi Saria: all invented in the last, like, 5 to 6, 7 years,
68 00:07:34.730 --> 00:07:36.849 Dr. Suchi Saria: and really core, fundamental,
69 00:07:37.260 --> 00:07:47.990 Dr. Suchi Saria: NSF-funded technology translating. So the first one is basically the ability...
Today, you know, today's ultrasound machines are often very expensive.
70 00:07:48.260 --> 00:07:51.609 Dr. Suchi Saria: They're hard to access. But it turns out, like,
71 00:07:51.670 --> 00:08:05.559 Dr. Suchi Saria: there are new kinds of low-cost, handheld ultrasounds getting built, coupled with AI on them, which now makes it so that you don't necessarily have to go to a very expensive facility for an ultrasound
72 00:08:05.580 --> 00:08:10.780 Dr. Suchi Saria: and bring in, you know, ultrasound experts or specialists.
73 00:08:10.830 --> 00:08:40.799 Dr. Suchi Saria: You can start to get screening for complications that could be caught earlier at the urgent care center, sort of your nearby urgent care center, which, you know, you're much more likely to visit frequently. So it's a very good example of how AI, combined with novel point-of-care diagnostics, is dramatically improving access to high-quality care, by both making it cheaper and making it easier to access.
74 00:08:41.360 --> 00:08:56.100 Dr. Suchi Saria: The third example here, diabetic retinopathy, is one where today patients have to go, again, to a specialist to get diagnosed for diabetic retinopathy, again making it less and less likely that they know about it;
75 00:08:56.280 --> 00:09:04.819 Dr. Suchi Saria: you'd have to have a premeditated reason for being there in the first place. This is a fully autonomous device that you can now put in a primary care provider's office.
76 00:09:04.900 --> 00:09:31.120 Dr. Suchi Saria: And what the device does is it allows you to take a retinal scan and automatically identify if the person has diabetic retinopathy. Again, the whole thing is cheaper than it would have been going to a specialist. But what's also exciting is, again, you go to a primary care provider way more often than you go to a specialist. So your chance of basically accessing the technology, of knowing if you're at risk, is much higher, which means you can treat more proactively.
77 00:09:31.120 --> 00:09:43.750 Dr. Suchi Saria: The last example, in the middle, is an example of stroke diagnosis, where essentially, you know, in stroke, time is money; every, you know, like, every minute matters.
78 00:09:43.750 --> 00:10:02.480 Dr. Suchi Saria: So the ability to take an image, for AI to automatically interpret it, flag high-risk cases, and make sure those patients are getting transferred to the right center in a timely fashion, means you're not waiting for the specialist to have the time to go read it. It dramatically accelerates our ability to get to treatment, and as a result
79 00:10:03.062 --> 00:10:04.780 Dr. Suchi Saria: can be lifesaving.
80 00:10:05.240 --> 00:10:15.591 Dr. Suchi Saria: So, really compelling examples of solutions that leverage AI to, you know, meaningfully improve
81 00:10:16.850 --> 00:10:19.349 Dr. Suchi Saria: quality of care and patient outcomes.
82 00:10:20.570 --> 00:10:47.679 Dr. Suchi Saria: I'm going to talk a little bit more about multimodal AI. So, like, another super exciting thing that's happened since 2010, which is not that far back in the past, is the advent of electronic health records. Right? So prior to 2010, a lot of our infrastructure in the US was very focused on paper records. You go to a clinician's office, they would be writing your notes in paper records, or some local infrastructure, and then effectively
83 00:10:47.680 --> 00:11:12.719 Dr.
Suchi Saria: you'd go the next time, they would hear you again and respond. But what's very, very much changed in the last 14 to 15 years is that now most health systems in this country, like hospitals, provider practices, etc., have gone electronic, which means all of the information about, like, you know, your history, what was given to you, how you responded, etc., is now in computable form.
84 00:11:12.810 --> 00:11:19.372 Dr. Suchi Saria: And what that now makes possible is suddenly giving us a view into,
85 00:11:19.870 --> 00:11:23.709 Dr. Suchi Saria: you know, not just a patient's journey, but also the quality of care;
86 00:11:23.810 --> 00:11:50.370 Dr. Suchi Saria: a new opportunity to use this data to invent new treatments, for earlier diagnosis, more precise targeting of treatments, but then also to deliver care in a much more precise way, by using this information to, you know, change the way we practice in real time. So this is what I view as one of the most interesting, compelling ecosystem shifts that have happened, that suddenly made possible, in medicine, which is a massive enterprise,
87 00:11:50.370 --> 00:11:59.680 Dr. Suchi Saria: the use of computing within medicine to meaningfully transform both the way care is delivered and human outcomes.
88 00:12:00.690 --> 00:12:27.110 Dr. Suchi Saria: After this slide I'll start going into more of the technical details; this is my last high-level overview. But a lot of what you'll see is very heavily colored by my real-world experience of taking research, almost a decade of research and methodological ideas we developed in the lab, and then translating it through this company that Michael described, Bayesian Health.
89 00:12:27.488 --> 00:12:53.180 Dr. Suchi Saria: When we call it Bayesian, why do we call it Bayesian? It's not necessarily because it's using purely Bayesian techniques. It's because the way to think about this is, you know, the Bayesian way of reasoning, when possible, is the most sound, optimal way of reasoning, where you can incorporate prior data with real-time data, do uncertainty quantification, know what data to trust and what not to trust, and update as new information arrives. And the idea is to be able to give this kind of
90 00:12:53.641 --> 00:13:03.240 Dr. Suchi Saria: quality and capability to our frontline care providers in, you know, one of the most important areas that matter to all of us, which is our human health.
91 00:13:03.480 --> 00:13:25.730 Dr. Suchi Saria: And how do we do this in a way that uses state-of-the-art technology to do that? And so that's sort of the company we started, Bayesian Health, and through Bayesian I've had the chance to learn and develop a lot of real-world experience partnering with health systems nationally, both big and small, not just, sort of, you know, very rich
92 00:13:25.870 --> 00:13:43.070 Dr. Suchi Saria: academic medical centers, but also rural community hospitals, where, you know, there's really great need for these kinds of technologies to up-level the quality of care. And so you'll hear a little bit of, like, some learnings there that have then informed a lot of the methodologic research you'll see that I'll talk about.
93 00:13:43.420 --> 00:13:49.909 Dr. Suchi Saria: Okay. So with that in mind, I'll dive into sort of more of a concrete example. I've worked...
94 00:13:49.990 --> 00:13:52.700 Dr. Suchi Saria: I've been really lucky to be funded by NSF
95 00:13:52.880 --> 00:14:03.469 Dr.
Suchi Saria: on my methodologic work, and have partnered in applying these methods in a variety of practical disease areas: in autoimmune diseases and
96 00:14:03.570 --> 00:14:16.259 Dr. Suchi Saria: neurologic diseases and infectious diseases, and so on and so forth, as inspiration for the foundational work we do today. For the purpose of this talk, I will pick one example area, called sepsis.
97 00:14:16.430 --> 00:14:20.080 Dr. Suchi Saria: I've, you know, really gone deep in this space,
98 00:14:20.190 --> 00:14:34.979 Dr. Suchi Saria: and done about, you know, a decade of work to build really solutions that are going to, you know, meaningfully advance standard of care. So I'll use this as an example to kind of describe how 10 years of, like,
99 00:14:35.070 --> 00:14:41.840 Dr. Suchi Saria: a bunch of ideas fit together to get you to where we are today. So, just a little bit of background:
100 00:14:41.850 --> 00:14:48.440 Dr. Suchi Saria: sepsis is an example of a diagnostic error. Today, diagnostic errors are still considered the third leading cause of death.
101 00:14:48.797 --> 00:14:55.970 Dr. Suchi Saria: These are often because the patient didn't get the right diagnosis, or they got a delayed diagnosis, when, you know, maybe the treatments were there had they gotten diagnosed,
102 00:14:56.110 --> 00:15:09.340 Dr. Suchi Saria: and there would be an opportunity to meaningfully change the outcome. Why does it occur? All sorts of reasons, right? Like, it's easy to miss data. It's easy to be biased in interpreting results. It's easy to over-rely on impression. Burnout.
103 00:15:09.410 --> 00:15:11.020 Dr. Suchi Saria: Failure to order tests.
104 00:15:11.270 --> 00:15:24.529 Dr. Suchi Saria: Sepsis is an example of an area where delays in diagnosis or misdiagnosis lead to a massive toll on human outcomes, and it turns out, in hospitals, this is the leading cause of hospital death.
105 00:15:25.950 --> 00:15:32.510 Dr. Suchi Saria: For sepsis in particular, once you're in this area, once you, you know, again, like stroke, in sepsis time is money.
106 00:15:32.630 --> 00:15:51.880 Dr. Suchi Saria: Every hour in sepsis, there's data showing, has a meaningful association with an increase in mortality. So it turns out in sepsis we do have some available treatments, but the treatments are more effective if you can identify the condition early. So the big bottleneck we want to solve is:
107 00:15:51.880 --> 00:16:18.479 Dr. Suchi Saria: how do we make sure we identify a patient who is septic as early as possible in the course, so that we basically can have a shot at rescuing them? One in 3 patients die once they reach septic shock, which is, like: a patient gets an infection, infection leads to a systemic response, which starts to attack your organ systems, leading to sometimes organ failure and organ damage and death, if not controlled in a timely fashion.
108 00:16:18.480 --> 00:16:32.140 Dr. Suchi Saria: This is your immune system becoming overactive in response to tackling your infection. Now, it turns out, when you're in that reactive state where you reach shock, mortality rates are like one in 3. So really, really high.
109 00:16:33.970 --> 00:16:42.800 Dr. Suchi Saria: Here, we started working on this about in 2013, 2014, and in 2015 wrote one of the earliest papers showing how you could take
110 00:16:42.840 --> 00:17:02.179 Dr.
Suchi Saria: EMR data, like electronic health record infrastructure that is now commonplace, or was becoming commonplace at the time, combined with machine learning techniques that make sense of this data, to be able to identify sepsis early. So that was, like, you know, 2015. It showed promise. But then, to get it from
111 00:17:02.240 --> 00:17:13.369 Dr. Suchi Saria: "this could be promising" to something that is very practical, deployable, and usable was another 10 years. And I'll talk a little bit about what it did take.
112 00:17:14.450 --> 00:17:25.760 Dr. Suchi Saria: And I have photos here of some of the students who contributed a great deal, and now have very good careers that they're pursuing post their graduation.
113 00:17:26.230 --> 00:17:37.339 Dr. Suchi Saria: And this work has now led to, like, you know, a huge amount of downstream work, like thousands of publications now. The slide is now outdated in terms of the amount of activity in this space.
114 00:17:37.430 --> 00:17:47.519 Dr. Suchi Saria: I'll start with an end result, which is from a recent result. This is a study, these are 3 studies, we published on the cover of Nature Medicine. They came out in 2022, through the summer.
115 00:17:47.770 --> 00:17:51.729 Dr. Suchi Saria: What the papers showed, at a super high level:
116 00:17:51.820 --> 00:18:17.569 Dr. Suchi Saria: these were massive, pretty large studies. These were pragmatic studies. They were done across 5 hospital sites, both academic and community, overall spanning nearly three quarters of a million patients, so 750,000 patients. This was work funded by an SBIR grant, in part, to do the pragmatic study and some of the analyses of the pragmatic study.
117 00:18:17.880 --> 00:18:26.509 Dr. Suchi Saria: There were 4,400 clinicians who participated as part of the study. It's one of the biggest practical trials in medicine with AI,
118 00:18:26.710 --> 00:18:47.120 Dr. Suchi Saria: and what we showed was... the first one was, you know, clinicians would often say, especially at a place like Hopkins, right, which ranks very well in safety and quality: well, we're already very good, I don't know the extent to which AI can really help me. And so here what we showed was, essentially, using AI running in the background, at first, the ability to identify sepsis
119 00:18:47.190 --> 00:18:58.390 Dr. Suchi Saria: meaningfully earlier than when physicians did, on average 5.7 hours earlier, on patients who ended up becoming septic and eventually died in the hospital. So, like,
120 00:18:58.540 --> 00:19:24.420 Dr. Suchi Saria: earlier than the frontline physicians. So this is the system running in the background; it's not impacting standard of care yet. And that's sort of, like, 10x higher performance than alternative tools that exist and current standard of care. The next thing was, well, okay, maybe this is possible to identify. The next question is, will clinicians adopt it? Right? So the bottleneck of clinician adoption, or physician adoption, has been a big one, and part of it is, like, physician trust
121 00:19:24.420 --> 00:19:38.799 Dr. Suchi Saria: in this technology, and how can it be delivered in a way that they trust? So with NSF, we were funded on this grant called human-machine teaming: how do we enable humans and machines to be more effective together than each individually alone?
122 00:19:38.810 --> 00:19:49.019 Dr.
Suchi Saria: And using some of the ideas in that grant, we did sort of both a pragmatic human-factors study as well as quantitative evaluations, and showed,
123 00:19:49.040 --> 00:20:11.710 Dr. Suchi Saria: over a two-and-a-half-year period, in a large population, you know, 4,400 clinicians using the platform and tool, nearly 89% physician adoption. Typical adoption for CDS tools, or alerts and alarms in the EMR, is something like 10%. So a very meaningful improvement over current approaches.
124 00:20:11.710 --> 00:20:27.640 Dr. Suchi Saria: And then, in sepsis, we know if you have earlier detection and you have meaningful adoption, then you expect to see changes in outcome, and some of the outcomes you expect to see change are reductions in mortality and reductions in complication rates, which tends to also then cut
125 00:20:28.128 --> 00:20:40.490 Dr. Suchi Saria: utilization. And that's sort of what we saw. Really, personally rewarding to me to see the real impact on human life, with nearly 20% reductions in mortality
126 00:20:40.590 --> 00:20:42.770 Dr. Suchi Saria: in this pragmatic study.
127 00:20:43.850 --> 00:20:44.750 Dr. Suchi Saria: So,
128 00:20:45.113 --> 00:20:59.019 Dr. Suchi Saria: let's talk a little bit about some learnings along the way, like, what did it take us to get there? And I could be here for 2 days giving lots of talks on lots of components, so I'll highlight just a few example ideas in the rest of this talk.
129 00:20:59.190 --> 00:21:03.819 Dr. Suchi Saria: But super high level, I put the challenges in 4 big buckets.
130 00:21:03.830 --> 00:21:07.559 Dr. Suchi Saria: The first were around modeling, which is, you know,
131 00:21:07.740 --> 00:21:35.910 Dr. Suchi Saria: ultimately, we're looking for... so there were a huge number of modeling challenges we needed to solve, relative to how the current state of the art looked. So the first one was, you know, ultimately, sepsis is still 2 to 3% of the population. It's not something that you see in 30% or 40%. So it's what I would call a needle-in-a-haystack problem. So you've got to find the signal; the signal-to-noise problem is real here, because, one, it's not that prevalent, but two, there are lots of things that mimic
132 00:21:35.910 --> 00:21:39.889 Dr. Suchi Saria: sepsis, so that it's easy to miss, and
133 00:21:39.890 --> 00:22:04.109 Dr. Suchi Saria: you know, to misidentify, or to have a very high false alerting rate, which basically makes the system not very useful in practice, and very likely to have a low likelihood of adoption. So how do we solve the signal-to-noise problem? The second thing was that, you know, a lot of times when we are learning from data, having high-quality access to ground truth is very important. But what should ground truth look like?
134 00:22:04.120 --> 00:22:25.777 Dr. Suchi Saria: And how do we develop ground truth in a scalable fashion? Right? So we can build high-quality AI models, but part of the challenge is, if they need millions of labels, and we need our physicians to sit down and chart-review each chart, and each chart review could easily take 30 minutes, how do you then get access to high-quality labels from which you can learn? So, the opportunity to,
135 00:22:26.240 --> 00:22:54.180 Dr. Suchi Saria: you know, build AI replay machines that basically are human-in-the-loop, that allow you to generate high-quality, gold-standard data that is the equivalent of human-adjudicated data or expert-adjudicated data.
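[Editor's note: as a rough illustration of the human-in-the-loop labeling idea described above, here is a minimal sketch in Python. The model, thresholds, and review function are all hypothetical and only meant to show the routing pattern (the model proposes labels, confident cases are accepted automatically, and only uncertain cases go to a clinician for chart review); it is not the adjudication pipeline used in the speaker's studies.]

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical sketch of a human-in-the-loop labeling loop: a model proposes a
# label with a probability, and only low-confidence cases are routed to an
# expert for chart review. Thresholds and names are illustrative.

@dataclass
class Case:
    case_id: str
    model_prob: float          # model's estimated probability of a positive label (e.g., sepsis)
    label: Optional[int] = None
    source: str = "unlabeled"  # becomes "model" or "expert" after adjudication

def adjudicate(cases: List[Case],
               ask_expert: Callable[[Case], int],
               lo: float = 0.10, hi: float = 0.90) -> List[Case]:
    """Auto-label confident cases; send the uncertain band (lo, hi) to an expert."""
    for c in cases:
        if c.model_prob >= hi:
            c.label, c.source = 1, "model"
        elif c.model_prob <= lo:
            c.label, c.source = 0, "model"
        else:
            c.label, c.source = ask_expert(c), "expert"   # the costly 30-minute chart review
    return cases

# Toy usage: pretend the "expert" labels anything above 0.5 as positive.
cases = [Case("a", 0.97), Case("b", 0.02), Case("c", 0.55), Case("d", 0.35)]
labeled = adjudicate(cases, ask_expert=lambda c: int(c.model_prob > 0.5))
print([(c.case_id, c.label, c.source) for c in labeled])
# Only "c" and "d" consumed expert time; the rest were labeled automatically.
```

The design point is simply that expert time, the scarce resource mentioned above, is spent only where the model is unsure, which is what makes large gold-standard label sets feasible.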
It's very easy to overfit to multimodal data. So, a lot of the opportunity is in leveraging ideas from causal inference, combined with high-dimensional learning, or high-dimensional multimodal learning, to be able to build models that
136 00:22:54.250 --> 00:23:04.809 Dr. Suchi Saria: are more grounded in causal factors we know can have an impact, which then allows us to build models that are more intelligible and actionable, and higher accuracy as well.
137 00:23:05.333 --> 00:23:26.800 Dr. Suchi Saria: Then there are lots of things around bias in data collection, and missingness. And so, you know, like, for instance, a common assumption we often make when we build machine learning algorithms is to think about data as missing, you know, missing at random, or missing completely at random.
138 00:23:26.810 --> 00:23:30.687 Dr. Suchi Saria: Turns out, in these scenarios, often, when a measurement is made,
139 00:23:31.460 --> 00:23:59.110 Dr. Suchi Saria: it was intentional, or when a measurement is not made, it can be intentional, and that is informative. So you want to take missingness into account. So the missing-completely-at-random, or MCAR, assumption is often false in practice, and you can actually do a whole lot better if you think harder about the missingness patterns. And so there are real opportunities for creating methodology that leverages the complexity of the domain, to be able to build methods that are far more accurate.
140 00:24:00.610 --> 00:24:29.020 Dr. Suchi Saria: And a lot of these ideas led to a collection of papers in NeurIPS and ICML, and Nature Medicine, and the New England Journal of Medicine, and you'll hear about a few of these. The second bucket of challenges was around what I would call safe, reliable, and robust learning, which is: it was one thing to take data in the lab and learn these models, and to be able to show in an analysis that the method performs really well. But when we move these systems from the lab to the real world, and across many sites,
141 00:24:29.160 --> 00:24:39.789 Dr. Suchi Saria: you suddenly have to think about, you know, how do we monitor for shifts and drifts? How do we monitor for things that are changing? And how do we update over time?
142 00:24:40.100 --> 00:24:42.140 Dr. Suchi Saria: And so I'll go a lot deeper into that.
143 00:24:42.260 --> 00:24:49.530 Dr. Suchi Saria: The next piece is delivery: how do we deliver it, the human-machine teaming component, in a way that builds clinician trust?
144 00:24:49.650 --> 00:24:51.259 Dr. Suchi Saria: And finally,
145 00:24:51.290 --> 00:24:59.780 Dr. Suchi Saria: how do we do real-world pragmatic evaluations? Which I think is such an exciting opportunity, where historically we've relied on, like, massive,
146 00:24:59.860 --> 00:25:10.049 Dr. Suchi Saria: premeditated RCTs that are ridiculously expensive, and as a result extremely hard to do, and a key bottleneck for bringing new ideas
147 00:25:10.050 --> 00:25:31.860 Dr. Suchi Saria: into the field. So one of the questions is: now that we have EMRs, we have the ability to embed technologies within the EMR, and there's a lot of variation in provider practice patterns. Can we take advantage of that variation to be able to design trials that are far more statistically efficient, but also can run in real-world scenarios that better represent the real world,
148 00:25:32.235 --> 00:25:44.650 Dr. Suchi Saria: and can still give you really high-quality evidence?
So that now, suddenly, the cost of the trials that you're implementing goes way down, and the ability to leverage these trials to show impact
149 00:25:44.740 --> 00:25:48.709 Dr. Suchi Saria: improves; the speed at which we can show impact improves dramatically.
150 00:25:49.530 --> 00:25:51.989 Dr. Suchi Saria: So I'll start with the first component first.
151 00:25:52.240 --> 00:26:12.700 Dr. Suchi Saria: So let's talk a little bit about the modeling pieces first. So, traditional state of the art: traditionally, what people were doing, I can kind of bucket into, you know, 4 key areas. You know, the quality of the inputs, which is: what are the kinds of data you're putting into your machine? The quality of the labels, which is: how much labeled data do you have?
152 00:26:12.800 --> 00:26:35.810 Dr. Suchi Saria: The quality of the learning strategies, which is: how are you building learning strategies that really push for the kinds of complexity you have in this domain? And then, finally, rather than a frozen system, the ability to build in monitoring, tuning, and bias-mitigation strategies, to be able to get to systems that are far more accurate than what exists. So, current state of the art: typically, you have a
153 00:26:35.810 --> 00:26:44.230 Dr. Suchi Saria: limited set of inputs, because they're often thought of as diagnostics; you want to collect only a small number of inputs that you can collect in a very controlled fashion.
154 00:26:44.380 --> 00:26:53.000 Dr. Suchi Saria: And then you're combining them, often, with labels... very often you have very limited data to train from, because of the burden of label acquisition.
155 00:26:53.080 --> 00:27:02.790 Dr. Suchi Saria: And then, when people have tried to do large labeled studies, they're basically collecting really noisy labels based on what are called billing codes, which is not very good.
156 00:27:03.030 --> 00:27:25.819 Dr. Suchi Saria: Third, they're often using off-the-shelf learning strategies, which are generic learning strategies, and not really tuning and tailoring to the kinds of complexity we see in this data. And fourth, most existing models are frozen, and they don't really adapt to the real world. So we pushed on all 4 of these dimensions to be able to see the quality of results I showed you up front.
157 00:27:26.440 --> 00:27:54.710 Dr. Suchi Saria: And so here's a really simple, pragmatic understanding of how much each of these dimensions matters. So here, what I'm showing you on the x-axis is sensitivity, which is your detection rate, and on the y-axis is positive predictive value, which means your true alerting rate, or the opposite of the false alerting rate. Right? We want the positive predictive value to be high, we want the sensitivity to be high, and ideally we want to operate in a region up here.
158 00:27:54.740 --> 00:28:09.699 Dr. Suchi Saria: And here, what we're showing is: basically, I took those 4 axes, and I said, all else equal, if all we did was go from this sort of curated, narrow set of inputs to dramatically expanding the set of inputs, to include the vast complexity of multimodal data that exists,
159 00:28:09.850 --> 00:28:26.739 Dr. Suchi Saria: how would we do in terms of improved performance? And what we're showing here is, in the 80 to 90% sensitivity range, the ability to improve alarm rates, or reduce false alarm rates, by nearly 20 to 30%. So very, very meaningful.
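[Editor's note: for readers less familiar with these two metrics, sensitivity and positive predictive value (PPV) are standard quantities defined from true positives (TP), false positives (FP), and false negatives (FN); the formulas below are generic, not specific to the speaker's system.]

```latex
\[
\text{Sensitivity} \;=\; \frac{TP}{TP + FN},
\qquad
\text{PPV} \;=\; \frac{TP}{TP + FP} \;=\; 1 - \frac{FP}{TP + FP}.
\]
```

So moving up on the y-axis means a smaller fraction of the alerts fired are false alarms, and moving right on the x-axis means fewer septic patients are missed.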
160 00:28:27.000 --> 00:28:42.750 Dr. Suchi Saria: The second slide. What I'm showing here now is, okay, all else equal, if I kept the inputs the same, and the learning strategy... of course, you know, learning strategies interact with all of these, but the learning strategy is the same... and what I did is change the quality of the labels, the quality of the data I'm learning from.
161 00:28:42.860 --> 00:28:48.050 Dr. Suchi Saria: And so this isn't changing the size of the data, but the quality of the actual label data.
162 00:28:48.260 --> 00:28:56.369 Dr. Suchi Saria: What we're seeing here is, basically, in the same sensitivity range, nearly 40 to 70% improvements in positive predictive value. So, very meaningful.
163 00:28:56.420 --> 00:29:11.150 Dr. Suchi Saria: Now, in this third slide, what I'm showing is: what if I kept all else equal and combined them, so both had better, richer inputs and better labels? The two actually interact, and we see almost 200 to 300% improvements in PPV.
164 00:29:11.400 --> 00:29:13.959 Dr. Suchi Saria: And then, finally, sort of...
165 00:29:16.350 --> 00:29:21.160 Dr. Suchi Saria: what other... I can hear a little bit of noise. I assume I should ignore it and keep going.
166 00:29:22.240 --> 00:29:25.329 Michael Littman: Yes, it turns out I was unmuted the entire time, and
167 00:29:25.860 --> 00:29:26.610 Dr. Suchi Saria: Okay.
168 00:29:26.610 --> 00:29:28.059 Michael Littman: there with background noise. I'm so sorry. Please.
169 00:29:28.060 --> 00:29:30.269 Dr. Suchi Saria: No worries at all, Michael. We couldn't hear anything.
170 00:29:30.690 --> 00:29:42.400 Dr. Suchi Saria: So, so that's... so I spoke a little bit about the inputs and labels. I'm going to talk a little bit about the learning strategies. This seems most intuitive to this audience, that the quality of the learning strategy should make a huge impact,
171 00:29:42.480 --> 00:29:48.190 Dr. Suchi Saria: though in practice, what is really interesting is, you know, if you remember this very famous Peter Norvig quote...
172 00:29:48.450 --> 00:30:03.459 Dr. Suchi Saria: Peter Norvig is at Google. In the 2010s, sort of as Google grew, there grew this understanding that, actually, I don't know the extent to which models really matter, and the extent to which the learning strategies really matter.
173 00:30:03.650 --> 00:30:07.759 Dr. Suchi Saria: Ultimately, maybe it's just about the amount of data you have and the size of the data you have.
174 00:30:07.970 --> 00:30:17.659 Dr. Suchi Saria: And it turns out, in this domain, because the data is so complex, because there's so much richness in the data, if you can truly exploit it, it dramatically
175 00:30:17.750 --> 00:30:22.873 Dr. Suchi Saria: impacts the quality of the models. So here's a paper,
176 00:30:23.750 --> 00:30:27.290 Dr. Suchi Saria: and I'm just going to present a couple of examples to show this.
177 00:30:27.510 --> 00:30:55.420 Dr. Suchi Saria: This was a paper published in TPAMI. What it talked about was intelligently leveraging missingness. So, typically, the way people think about it is: you take missing data and you either do last-one-carried-forward, or you basically, you know, ignore the samples, or you don't include the modalities where there's a lot of missingness, or, if you include them, you basically use some simple imputation strategies for filling in the data.
178 00:30:55.987 --> 00:31:11.830 Dr. Suchi Saria: By contrast, if you're able to basically leverage the ability to use context to do intelligent interpolation, where you're not just filling it in
but you're maintaining uncertainty over what that value might have been. So that's number one. And number two,
179 00:31:11.900 --> 00:31:22.159 Dr. Suchi Saria: you leverage the uncertainty in each of the missing measurements to then project your uncertainty into the actual output. So let's say, in this example, you're trying to forecast
180 00:31:22.180 --> 00:31:51.650 Dr. Suchi Saria: sepsis, no sepsis. You're giving calibrated uncertainty intervals around your prediction, in this case your probability of having sepsis. And then you can leverage that to now create what is an optimal stopping problem. Right? So you can think about it in a Bayesian decision-theoretic way: is it better for me to, you know, stop, which is to alert, or am I unsure enough that maybe I should collect a little bit more data, and
181 00:31:51.750 --> 00:31:55.029 Dr. Suchi Saria: then reconsider whether or not to alert at the next time step?
182 00:31:55.060 --> 00:32:07.850 Dr. Suchi Saria: And when you take an approach like this, it turns out you can dramatically improve, again, accuracy. So here what we're showing is, compared to standard approaches... we used a whole slew of standard approaches, so I've simplified the slide greatly;
183 00:32:08.070 --> 00:32:20.429 Dr. Suchi Saria: the results are in the paper. We show that this sort of more intelligent, wait-and-watch type strategy, that leverages, you know, intelligent uncertainty quantification, can actually,
184 00:32:20.560 --> 00:32:22.247 Dr. Suchi Saria: at the same...
185 00:32:22.820 --> 00:32:31.090 Dr. Suchi Saria: at the same PPV, dramatically increase sensitivity. So, almost increased sensitivity by 30 to 40%...
186 00:32:31.100 --> 00:33:01.070 Dr. Suchi Saria: sorry, 300 to 400%, so a 3x to 4x improvement over standard approaches. So that's just one example, one type of strategy. Here's another example. Here, what we did was basically say: when you're looking at large-scale observational data from EHRs, what we are really looking at is sequential data. And when we're looking at sequential data, a lot of the way people would often learn predictive models is, they would say, okay, let me look at the outcome,
187 00:33:01.220 --> 00:33:09.209 Dr. Suchi Saria: and then work backwards to say, what is the data? Let's say you wanted to predict: is this person at risk for mortality?
188 00:33:09.300 --> 00:33:26.210 Dr. Suchi Saria: So a patient comes into the hospital, and at a given time you're trying to predict risk of mortality. The standard practice was to say: let me look at whether or not they died in the hospital, that's the outcome; let me look at all the inputs, all the variables that exist to date, as the inputs; and then train a model to do prediction.
189 00:33:26.480 --> 00:33:27.360 Dr. Suchi Saria: The...
190 00:33:28.340 --> 00:33:58.069 Dr. Suchi Saria: now, this has become well understood, though it wasn't well understood at the time: this is a scenario where the actions the physicians take dramatically impact the next step, and the next step, and the next step. So you're really doing learning in a sequential decision-making setting, but a sequential decision-making setting where there's a model that is driving the actions. So if we can, instead of training sort of a risk objective, which is Y given X, instead do a counterfactual objective, which is
191 00:33:58.310 --> 00:33:59.480 Dr. Suchi Saria: Y
192 00:33:59.570 --> 00:34:08.248 Dr. Suchi Saria: given X and do of the actions they chose to do, which means you can now start to think about what some alternative actions would have been.
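[Editor's note: in symbols, the contrast being drawn here is roughly the following, written with standard do-notation; this is an editor's rendering, not a formula from the speaker's slides. Here \(Y\) is the outcome (say, in-hospital mortality), \(X\) is the observed patient history, and \(A\) denotes the clinicians' treatment actions.]

```latex
\[
\underbrace{p\,(Y \mid X)}_{\text{standard risk objective}}
\qquad \text{versus} \qquad
\underbrace{p\bigl(Y \mid X, \operatorname{do}(A = a)\bigr)}_{\text{counterfactual objective}}
\]
```

The first quantity silently bakes in whatever treatment policy generated the training data (for instance, sicker-looking patients get treated more aggressively, so a naively learned risk can look paradoxically low for them), while the second asks how the risk would evolve under a specified course of action, which is what makes the predictions actionable.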
193 00:34:08.620 --> 00:34:11.950 Dr. Suchi Saria: You can actually... first of all, it's sort of
194 00:34:12.010 --> 00:34:33.500 Dr. Suchi Saria: theoretically the right thing to do. Empirically, you learn models that are way more sensible, because the model now actually understands the impact of actions, and how that impacts the predictive state. And now your forecast of the predictive state isn't just about the inputs that you happen to have at the time of prediction, but also the possible action sequences that might go into impacting the outcome.
195 00:34:34.380 --> 00:35:00.379 Dr. Suchi Saria: So, the ability to leverage causal inference and counterfactual objectives to improve training, another strategy, dramatically improves both the intelligibility, actionability, and quality of the models. And then, finally, the last slide on modeling: this is more recent work from my lab. This is on taking transformer-based models that we have today, which have been shown to be very effective on language data, and really adapting them to multimodal data.
196 00:35:00.380 --> 00:35:15.000 Dr. Suchi Saria: And the adaptation to this kind of high-dimensional multimodal data isn't just as simple as: let's just take the models, let's just take all our streaming data, turn it into inputs, and go
197 00:35:15.200 --> 00:35:17.170 Dr. Suchi Saria: either regress or classify.
198 00:35:18.170 --> 00:35:21.400 Dr. Suchi Saria: But the challenge here is, again, taking into account
199 00:35:21.870 --> 00:35:33.489 Dr. Suchi Saria: the complexities of the individual modalities, and how they relate, in order to improve the quality of the resulting models, and the intelligibility and actionability of the resulting models. This is very new work.
200 00:35:34.650 --> 00:35:38.339 Dr. Suchi Saria: We have a number of papers here, but also lots in progress.
201 00:35:38.700 --> 00:35:40.327 Dr. Suchi Saria: Okay, so that's
202 00:35:41.160 --> 00:35:47.099 Dr. Suchi Saria: a little bit on the modeling front. I'm going to now talk a little bit about generalization when you move from the lab to the real-world setting.
203 00:35:48.284 --> 00:35:49.780 Dr. Suchi Saria: So in 2020,
204 00:35:50.080 --> 00:35:54.270 Dr. Suchi Saria: we wrote this paper. This was an invited perspective in the New England Journal of Medicine,
205 00:35:54.380 --> 00:35:59.239 Dr. Suchi Saria: and what it talked about was: today in medicine, we're starting to consider...
206 00:35:59.450 --> 00:36:09.679 Dr. Suchi Saria: you know, there's been a history of applying data-driven tools in the form of what are called clinical decision support tools. But with these CDS tools,
207 00:36:09.820 --> 00:36:17.350 Dr. Suchi Saria: you know, we don't really think about them as tools whose performance can really change and adapt and degrade over time
208 00:36:17.540 --> 00:36:24.749 Dr. Suchi Saria: as, you know, the environment shifts, the population shifts, and practice patterns shift. That concept, in,
209 00:36:25.682 --> 00:36:35.399 Dr. Suchi Saria: sort of, medicine, was relatively underutilized and not as broadly understood. So in 2020, this perspective sort of
210 00:36:35.530 --> 00:37:03.129 Dr.
Suchi Saria: took our experience on the AI front and married it with the experience of deploying practical tools, to say: in practice, here are the 20 ways in which things can actually drift and shift. And these are different kinds of drifts and shifts, with examples. And therefore, in order to really deploy AI robustly, we need the ability to think more holistically and systematically about how we put a monitoring system in place that detects these,
211 00:37:03.280 --> 00:37:13.478 Dr. Suchi Saria: and put correction loops in place that allow us to really give guarantees, either on our system, that it won't drift or shift, or that it can auto-tune in order to get to a level of performance
212 00:37:14.190 --> 00:37:24.180 Dr. Suchi Saria: that is as expected, and is not behaving in unreliable ways. So it's a paper that has then spawned a whole bunch of subsequent work.
213 00:37:25.440 --> 00:37:28.419 Dr. Suchi Saria: Here are some examples of shifts in practice.
214 00:37:28.750 --> 00:37:30.199 Dr. Suchi Saria: So, for instance,
215 00:37:30.380 --> 00:37:35.040 Dr. Suchi Saria: all this work in sepsis, in around 2016,
216 00:37:35.060 --> 00:37:49.270 Dr. Suchi Saria: 2017, led to a big shift in policy, where health systems... the Centers for Medicare and Medicaid started realizing: we actually should be able to improve sepsis outcomes if we can figure out a way to incentivize providers to be more vigilant. And so
217 00:37:49.430 --> 00:38:04.090 Dr. Suchi Saria: they put requirements in place where they wanted to make sure certain tests were being measured, in order to show that the providers were being vigilant, which means that the frequency with which certain tests were ordered,
218 00:38:04.710 --> 00:38:06.419 Dr. Suchi Saria: and the patterns of
219 00:38:06.430 --> 00:38:11.020 Dr. Suchi Saria: practice in how those tests were ordered, and when those tests were ordered, changed.
220 00:38:11.080 --> 00:38:15.999 Dr. Suchi Saria: So a system that leveraged as inputs test values,
221 00:38:16.530 --> 00:38:39.900 Dr. Suchi Saria: and the frequency and measurement of those test values, would suddenly no longer be reliable. So here, below, is a paper where we talk about this, and we show how, if you had instead implemented systems with more robust learning algorithms, that leverage counterfactual objectives and causal inference, you could have built systems that are more robust to these kinds of practice-pattern changes.
222 00:38:40.684 --> 00:38:42.740 Dr. Suchi Saria: I've got another example.
223 00:38:42.950 --> 00:38:59.749 Dr. Suchi Saria: This is a paper from Mount Sinai, where they showed that a system in imaging, by mistake, learned a lot of local patterns around the device rather than the disease itself, and as a result led to a model that did very well within one hospital but didn't generalize.
224 00:39:00.500 --> 00:39:17.289 Dr. Suchi Saria: So, in order to be able to tackle these, there are sort of 2 classes, 2 categories, of solutions. One is: what can we do prior to deployment to mitigate these kinds of shifts? The second is: what can we do
225 00:39:17.300 --> 00:39:23.829 Dr. Suchi Saria: post-deployment to monitor, detect, and enable learning for these kinds of shifts?
226 00:39:24.150 --> 00:39:37.210 Dr. Suchi Saria: We've spent quite a bit of time on both, and I'll give a little bit of a glimpse into these areas. So the first area is, sort of, in machine learning and
AI, we often think of it as invariant learning, or
227 00:39:37.720 --> 00:39:51.250 Dr. Suchi Saria: shift-stable learning, or robust learning, the idea being: you want model performance to not deteriorate in unanticipated ways; you want to be able to give guarantees about a model's performance under shifts.
228 00:39:51.470 --> 00:40:03.220 Dr. Suchi Saria: And sort of the question is, how do we do that? And so, under invariant learning, or shift-stable learning, there's a vast variety of methods that people have invented in order to be able to do that.
229 00:40:03.320 --> 00:40:04.530 Dr. Suchi Saria: One big
230 00:40:04.580 --> 00:40:14.640 Dr. Suchi Saria: class of these methods is what are called data-driven, domain-unaware. So essentially, the way it works is, you know, you might say: okay, let me go collect data from lots of different environments.
231 00:40:14.800 --> 00:40:38.430 Dr. Suchi Saria: By collecting data from lots of different environments, my data inherently captures the kind of variation that exists, and I can now take my learning algorithm, and in order for it to generalize across these different environments, by construction it'll be forced to learn ideas that are invariant, or ideas that generalize, across these different domains. So: pros, super easy to use.
232 00:40:38.430 --> 00:40:56.659 Dr. Suchi Saria: Cons: you really require data collection from lots of different environments, which in medicine turns out to be a really expensive enterprise. But also, this is mostly empirically motivated, right? Like, you don't have guarantees on performance. The guarantees basically rely on you collecting enough data
233 00:40:57.050 --> 00:41:03.790 Dr. Suchi Saria: that you can capture the desired invariances. So a different, parallel type of approach is
234 00:41:03.840 --> 00:41:14.080 Dr. Suchi Saria: actually using domain knowledge to be able to enforce invariances. So here the idea is: using causal graphs, you can use domain knowledge to specify
235 00:41:14.100 --> 00:41:22.632 Dr. Suchi Saria: shifts of interest that you want to be invariant to, and then learn models that are stable against them. So,
236 00:41:23.916 --> 00:41:34.679 Dr. Suchi Saria: as an example, here... so there's a class of work we did in this area where, essentially, you take data from... and so, very often, when anybody wants to do causal inference
237 00:41:34.820 --> 00:41:42.120 Dr. Suchi Saria: practically, people get very scared, because they're like: causal inference is very theoretical, we very rarely have access to a causal DAG,
238 00:41:42.390 --> 00:42:04.920 Dr. Suchi Saria: it's not very relevant, you know, it's very hard to use in practice. It also requires understanding your data and domain, a very painful exercise. So, as a result, historically, these kinds of approaches have had less traction in practical environments. Since I have access to both the methods and the practice, I'm in a unique position to challenge that myth. So here, what we did was, basically,
239 00:42:05.320 --> 00:42:32.530 Dr. Suchi Saria: we took data, real-world practical data, and learned partial DAGs from that data. So while you can't learn the full causal graph, you can get partial DAGs from the data that, combined with your own understanding of the domain, let you start identifying desired invariances. So, for instance, I won't go into the details of this, but basically, in the examples that I just gave you,
you can identify which dependencies in the PDAG you would like to be invariant to,
240 00:42:32.640 --> 00:42:38.199 Dr. Suchi Saria: and then, essentially, you can think of any learning problem,
241 00:42:38.230 --> 00:42:58.670 Dr. Suchi Saria: any learning model, like this. This approach is basically model-agnostic: it doesn't matter whether you're using transformer-based models, or just simple logistic regression, or a hierarchical mixture of experts. Independent of the choice of model used, sort of the intuition of the method is as follows: you learn a PDAG, you specify the dependencies you don't want to learn,
242 00:42:59.250 --> 00:43:08.139 Dr. Suchi Saria: and... typically people think of this as pruning, so then you remove all those variables. But here, instead of removing those variables, what we're going to do is remove those edges on the graph,
243 00:43:08.300 --> 00:43:13.109 Dr. Suchi Saria: which means remove the learning method's ability to learn those dependencies.
244 00:43:13.300 --> 00:43:14.400 Dr. Suchi Saria: And now,
245 00:43:14.660 --> 00:43:19.260 Dr. Suchi Saria: effectively, what you end up learning is, instead of learning one big consolidated model,
246 00:43:19.570 --> 00:43:29.630 Dr. Suchi Saria: which is one big joint distribution, using whatever advanced or fancy model you want to use, you are instead learning a collection of models that stitch together.
247 00:43:29.680 --> 00:43:36.220 Dr. Suchi Saria: And by doing so, you can actually give guarantees about the resulting method. First of all, it's a wrapper method; it can be
248 00:43:36.330 --> 00:43:40.119 Dr. Suchi Saria: used with any state-of-the-art model that exists,
249 00:43:40.567 --> 00:43:54.052 Dr. Suchi Saria: so it's very flexible. You can give technical guarantees, you can give theoretical guarantees, around the procedure. It's sound, which means the returned distribution is going to be invariant to the shifts that we just specified,
250 00:43:54.650 --> 00:44:02.290 Dr. Suchi Saria: using the invariance spec on the diagram, as desired. The procedure is complete, which means, if it fails, then there's no estimable
251 00:44:02.777 --> 00:44:16.372 Dr. Suchi Saria: invariant distribution, and you have to basically relax your constraints. You basically want to be able to say: okay, well, I can't get something that is fully invariant, so you're willing to take on more...
252 00:44:17.122 --> 00:44:33.797 Dr. Suchi Saria: you're willing to take on more, a willingness for the distribution to be noisy, if you will, as you move from one environment to the other. And it's also efficient: it's the best, most efficient possible estimator you can get.
253 00:44:34.450 --> 00:44:35.340 Dr. Suchi Saria: So,
254 00:44:35.480 --> 00:44:38.049 Dr. Suchi Saria: super exciting: practical,
255 00:44:38.070 --> 00:44:40.890 Dr. Suchi Saria: flexible, and can be applied in practice.
256 00:44:41.216 --> 00:44:45.959 Dr. Suchi Saria: I link to a collection of papers on this topic here, if anyone is curious.
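[Editor's note: to make the edge-removal idea concrete, here is a small toy sketch in Python. It is an editor's illustration of the general pattern (delete the unstable mechanism and stitch the remaining stable conditionals together), not the estimator from the linked papers; the variables, probability tables, and the "ordering policy" shift are invented for the example.]

```python
import numpy as np

# Toy illustration of predicting from stable conditionals only.
# Binary variables: Y = sepsis, O = lab ordered, T = lab abnormal.
# Assumed toy graph: Y -> O, Y -> T, O -> T, plus an unstable edge from the
# environment's ordering policy into O. We delete that mechanism, P(O | Y, policy),
# and keep only the stable pieces P(Y) and P(T | Y, O).

p_y = np.array([0.9, 0.1])                        # P(Y): 10% of patients septic

def p_o1_given_y(policy):
    # P(O=1 | Y): the UNSTABLE mechanism that changes with practice patterns.
    return {"source": np.array([0.2, 0.8]),       # labs ordered mostly on suspicion of sepsis
            "target": np.array([0.7, 0.9])}[policy]  # new policy: order labs on almost everyone

p_t1_given_y_o = np.array([[0.00, 0.05],          # Y=0: P(T=1 | O=0), P(T=1 | O=1)
                           [0.00, 0.70]])         # Y=1

def joint(policy):
    """Exact joint P(Y, O, T) under a given ordering policy, shape (2, 2, 2)."""
    po1 = p_o1_given_y(policy)
    j = np.zeros((2, 2, 2))
    for y in (0, 1):
        for o in (0, 1):
            for t in (0, 1):
                p_o = po1[y] if o else 1 - po1[y]
                p_t = p_t1_given_y_o[y, o] if t else 1 - p_t1_given_y_o[y, o]
                j[y, o, t] = p_y[y] * p_o * p_t
    return j

def naive_posterior(policy, o, t):
    """P(Y=1 | O=o, T=t): what a model fit on this environment's data would learn."""
    col = joint(policy)[:, o, t]
    return (col / col.sum())[1]

def edge_deleted_posterior(o, t):
    """Estimator built only from stable conditionals: proportional to P(Y) * P(T | Y, O=o).
    The unstable factor P(O | Y, policy) never enters, so this quantity is the same
    in every environment by construction."""
    lik = p_t1_given_y_o[:, o] if t else 1 - p_t1_given_y_o[:, o]
    post = p_y * lik
    return (post / post.sum())[1]

for o, t in [(1, 1), (1, 0)]:
    print(f"O={o}, T={t}: "
          f"naive learned at source = {naive_posterior('source', o, t):.2f}, "
          f"same quantity at target = {naive_posterior('target', o, t):.2f}, "
          f"edge-deleted (both)     = {edge_deleted_posterior(o, t):.2f}")
```

The toy numbers only show the qualitative point: the ordinary conditional learned in one environment moves substantially when the ordering policy changes, while the quantity stitched together from the stable conditionals does not move at all, because the deleted edge was the only place the policy entered.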
257 00:44:46.110 --> 00:44:58.540 Dr. Suchi Saria: And, you know, there's been a fair amount of interest and follow-up on this. And so then, moving to sort of the second topic here, which is more around mitigation post-deployment, an area where we're doing a whole lot of active work now.
258 00:44:58.540 --> 00:45:17.740 Dr. Suchi Saria: And so I'll talk a little bit about this. So here's kind of the idea. We have systems that are making predictions. The question is, can we use uncertainty quantification methods to figure out, you know, a predictive confidence interval around the prediction: how much do I trust it?
259 00:45:17.940 --> 00:45:21.149 Dr. Suchi Saria: But then the exciting question here is:
260 00:45:21.260 --> 00:45:42.040 Dr. Suchi Saria: one, can we get... you know, you could imagine, if you could do this well... I already gave you the examples of how you can improve accuracy by basically building more intelligent policies that leverage this to change system behavior, and I'll show you results later. But you can also use this to now change the way the humans are teaming with the system. Right? Like, you can essentially
261 00:45:42.360 --> 00:45:49.400 Dr. Suchi Saria: enable intelligent suppression, you can enable more transparency, and I'll talk a little bit about that.
262 00:45:50.410 --> 00:45:57.639 Dr. Suchi Saria: So here, again, there's been, sort of, in the last 4 to 5 years, great interest in, again, wrapper methods. So,
263 00:45:57.800 --> 00:46:12.519 Dr. Suchi Saria: 10, 15 years ago, for uncertainty quantification we would often think of it as Bayesian methods. Right? So the model is Bayesian; if you can take a fully Bayesian view, you can propagate uncertainty all the way into any variable that you care to estimate.
264 00:46:12.720 --> 00:46:20.989 Dr. Suchi Saria: But it turns out, you know, doing full Bayesian inference is extremely expensive in practice.
265 00:46:21.020 --> 00:46:40.649 Dr. Suchi Saria: We want to be able to leverage, you know, other forms of modeling approaches, ones that aren't Bayesian in nature, and still be able to do some level of uncertainty quantification. So, these wrapper methods: the benefit is they do not constrain the model to a particular form.
266 00:46:42.190 --> 00:46:44.690 Dr. Suchi Saria: And in these wrapper methods,
267 00:46:44.750 --> 00:46:51.840 Dr. Suchi Saria: you want these wrapper methods to be also accurate and informative. So what does it mean to be accurate and informative? It should contain the true label.
268 00:46:51.850 --> 00:46:54.250 Dr. Suchi Saria: We want to make sure that...
269 00:46:54.550 --> 00:47:03.160 Dr. Suchi Saria: well, containing the true label is easy if I say my uncertainty estimate is, like, you know, very wide. So you also want the uncertainty estimate to not be too wide.
270 00:47:03.750 --> 00:47:08.319 Dr. Suchi Saria: And so you want it to be narrow, and narrow is informative.
271 00:47:08.930 --> 00:47:22.059 Dr. Suchi Saria: And then you want it to be sample-efficient, which means, if different shifts do occur, you want to be able to recognize that as quickly as possible. You want it to be computationally efficient, which means you can implement these kinds of approaches in the real world;
272 00:47:22.120 --> 00:47:24.320 Dr. Suchi Saria: otherwise it's expensive.
273 00:47:24.330 --> 00:47:30.110 Dr. Suchi Saria: And you want to have guarantees to the extent possible, which means you want to be able to say something about the quality of your estimates.
274 00:47:31.330 --> 00:47:49.310 Dr. Suchi Saria: So, in practice, sort of, the challenges in the methods that exist have been twofold. First, a lot of the methods, you know, where people were working in theory, were often assuming the data is IID. But, you know, real-world data, like I showed you, with the drifts and shifts, are not IID.
275 00:47:49.390 --> 00:48:16.969 Dr. Suchi Saria: And then, second, is that to be able to use these in the real world, you kind of need to trade off between the computational demands as well as the statistical demands, meaning if something does drift, you identify it as early as possible. And the reason you want that is because, you know, in many of these high stakes applications we're talking about, there could be patient harm. So the ability to know early, or as quickly as possible, is actually quite helpful. So 276 00:48:17.490 --> 00:48:42.560 Dr. Suchi Saria: here, what I'm doing is basically laying out the work in our lab, which has essentially worked in a few different areas. So what I'm doing here is kind of laying out the landscape of approaches in distribution-free uncertainty quantification. Here on the Y-axis is statistical efficiency, on the X-axis is computational efficiency, and traditional methods were either statistically very efficient but computationally inefficient, or computationally very efficient but statistically inefficient. 277 00:48:43.012 --> 00:48:47.700 Dr. Suchi Saria: So the question was, is there a way to, like, you know, trade off the 2, 278 00:48:47.800 --> 00:49:07.169 Dr. Suchi Saria: number one? So that's sort of one area where we pushed. And second, to be able to do it in a way where we can relax assumptions around the data not being IID, but that there are other forms of shifts. And so in work published here in papers in NeurIPS and ICML and a few others, what we showed is methods that allow us to both relax 279 00:49:07.764 --> 00:49:28.885 Dr. Suchi Saria: assumptions. And you know, we point to other related work here. Super exciting area, lots of exciting work to be done in terms of how we can achieve both better trade-offs, but also relax the assumptions to be more favorable to real world scenarios like standard covariate shift, feedback covariate shift, and then, more recently, 280 00:49:30.010 --> 00:49:32.669 Dr. Suchi Saria: in a recent paper that came out in ICML, 281 00:49:33.029 --> 00:49:46.929 Dr. Suchi Saria: we talk about how we can actually get these kinds of estimates to be high quality, with guarantees, under not just standard covariate shift but multi feedback covariate shift, and really other kinds of more complex 282 00:49:46.940 --> 00:49:48.999 Dr. Suchi Saria: drifts and shifts that can occur. 283 00:49:49.258 --> 00:50:00.950 Dr. Suchi Saria: I know, in the interest of time, I won't have the chance to go into any one of these papers in great levels of detail, but I'll sort of tie it all together and how these methods can now be used in practice. What does this enable?
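To illustrate how the IID assumption can be relaxed, here is the standard weighted split-conformal construction for covariate shift, shown as a generic sketch rather than the specific estimators from the papers just mentioned; the key assumption is a user-supplied `weight_fn` approximating the likelihood ratio between the deployment and training covariate distributions.

```python
import numpy as np

def weighted_conformal_interval(model, X_cal, y_cal, x_new, weight_fn, alpha=0.1):
    """Split-conformal interval reweighted for covariate shift.
    weight_fn(x) should approximate dP_test(x) / dP_train(x)."""
    scores = np.abs(y_cal - model.predict(X_cal))
    w = np.array([weight_fn(x) for x in X_cal])
    w_new = weight_fn(x_new)
    # Weighted empirical distribution of the scores, with a point mass at the
    # test point carrying an infinite score (the conservative convention).
    aug_scores = np.append(scores, np.inf)
    probs = np.append(w, w_new) / (w.sum() + w_new)
    order = np.argsort(aug_scores)
    cdf = np.cumsum(probs[order])
    q = aug_scores[order][np.searchsorted(cdf, 1 - alpha)]
    pred = model.predict(np.atleast_2d(x_new))[0]
    return pred - q, pred + q
```

With all weights equal this reduces to the plain split-conformal interval, so the shift is paid for through a reweighted quantile; estimating the weights well is its own problem and is where much of the practical difficulty lives.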
So 284 00:50:01.520 --> 00:50:02.913 Dr. Suchi Saria: here, essentially, 285 00:50:04.050 --> 00:50:13.625 Dr. Suchi Saria: one way to think about this is, you know, we're now able to use these methods to start implementing more intelligent interfaces for suppression. So 286 00:50:14.280 --> 00:50:23.080 Dr. Suchi Saria: actually, I'll come back to that in 2 seconds. I want to talk about one more piece of work around drift and shift detection before we talk about the use of these methods 287 00:50:23.160 --> 00:50:25.139 Dr. Suchi Saria: in practice. So 288 00:50:25.450 --> 00:50:32.011 Dr. Suchi Saria: another type of example. So previously, we were talking about uncertainty quantification, and 289 00:50:32.560 --> 00:50:37.470 Dr. Suchi Saria: getting access to predictive intervals, which we can now use for designing more intelligent interfaces. 290 00:50:37.803 --> 00:50:43.956 Dr. Suchi Saria: This is another piece of work. This is with my postdoc, Yoav Wald, where what we basically looked at is: 291 00:50:44.350 --> 00:51:01.060 Dr. Suchi Saria: during Covid, one of the interesting things that happened was, you know, there's an algorithm for sepsis detection that many hospitals used, and it turns out Covid mimics a lot of the rules, or predictors, that the existing algorithm was using. So as a result, 292 00:51:01.070 --> 00:51:09.509 Dr. Suchi Saria: it started firing all over the place, leading to alarm fatigue, but also risk of harm to patients. And so 293 00:51:10.130 --> 00:51:17.339 Dr. Suchi Saria: naturally, you can think of this as open-set domain adaptation, the idea of, like, can we identify novel classes, 294 00:51:17.450 --> 00:51:42.419 Dr. Suchi Saria: novel classes that are coming up, and identify them, again, as quickly as possible. But traditionally, when people are talking about novel category detection or open-set domain adaptation, they're again thinking about it in scenarios where there are very rigid assumptions about the background distribution, which is that the background distribution isn't shifting at all. But that's again not true in the real world, right? When you go from the winter to the summer, there's shift in populations, 295 00:51:42.750 --> 00:51:48.970 Dr. Suchi Saria: there could be other kinds of shifts in practice patterns, etc. So the idea here was to think about, 296 00:51:49.290 --> 00:51:51.758 Dr. Suchi Saria: can we relax this assumption around 297 00:51:52.530 --> 00:52:10.186 Dr. Suchi Saria: you know, the background distribution, to enable or allow benign shifts, or shifts that are more reflective of the real world. And so here, what we came up with, or more like Yoav came up with, was a really nice way to 298 00:52:10.830 --> 00:52:14.190 Dr. Suchi Saria: use what we call the SCAR assumption, which is: 299 00:52:14.440 --> 00:52:17.550 Dr. Suchi Saria: let's assume that in the background distribution, 300 00:52:17.580 --> 00:52:37.562 Dr. Suchi Saria: the appearance of novel subgroups or certain kinds of anomalies is rare. And so, if we can say that certain kinds of anomalies are rare, then under that scenario we can basically give guarantees on our ability to detect, and detect as quickly and early as possible. 301 00:52:38.120 --> 00:52:47.691 Dr. Suchi Saria: It's called the scarcity of unicorns assumption, and it dramatically improves, again, our ability to detect these kinds of changes in real world scenarios.
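For intuition only, here is a generic sequential monitor in the spirit of this kind of drift and novel-subgroup detection; it is an illustrative sketch, not the method from the work being described. The assumptions are that the scores come from any novelty scorer fit on reference data and that, per the rarity idea above, high-novelty samples are rare (rate at most `p0`) when no new subgroup is present, so a sustained excess of high scores in the deployed stream is treated as evidence of a novel class.

```python
import numpy as np

def calibrate_threshold(reference_scores, q=0.99):
    """Threshold such that only the top (1 - q) fraction of reference scores exceed it."""
    return np.quantile(reference_scores, q)

def monitor_stream(stream_scores, threshold, p0=0.01, p1=0.05, alarm=8.0):
    """Bernoulli CUSUM over exceedance indicators: raise an alarm once the evidence
    that the exceedance rate has risen from p0 to at least p1 accumulates."""
    llr_exceed = np.log(p1 / p0)              # log-likelihood ratio for a high-novelty sample
    llr_normal = np.log((1 - p1) / (1 - p0))  # (negative) contribution of a normal sample
    stat = 0.0
    for t, s in enumerate(stream_scores):
        stat = max(0.0, stat + (llr_exceed if s > threshold else llr_normal))
        if stat > alarm:
            return t   # index at which the novel-subgroup alarm fires
    return None        # no alarm over the observed stream
```

The trade-off discussed above shows up directly in the `p1` and `alarm` parameters: detecting smaller shifts and tolerating fewer false alarms both cost detection delay.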
302 00:52:49.640 --> 00:52:55.289 Dr. Suchi Saria: Okay, so, how are we doing on time? I can't see my clock. 303 00:52:55.802 --> 00:52:59.860 Michael Littman: Yeah, it's 7 min before one pm East coast time. So. 304 00:52:59.860 --> 00:53:00.250 Dr. Suchi Saria: But if you. 305 00:53:00.250 --> 00:53:01.860 Michael Littman: Kind of land the plane. 306 00:53:02.020 --> 00:53:15.150 Dr. Suchi Saria: Lovely. Okay, so I'll try to land the plane in the next 7 min. I think I should be successful. So here, what we're going to talk a little bit about is: we talked a lot about methods in terms of translation from the lab to the real world. 307 00:53:15.547 --> 00:53:19.240 Dr. Suchi Saria: Now, a little bit about human machine teaming and use. And so 308 00:53:20.020 --> 00:53:29.330 Dr. Suchi Saria: here, essentially, this was a paper that came out in npj Digital Medicine, where we basically did a study on 309 00:53:29.720 --> 00:53:38.309 Dr. Suchi Saria: what are mental models of physician trust, what are barriers to getting physician trust, and, you know, what are some roadblocks we need to overcome 310 00:53:38.400 --> 00:53:41.795 Dr. Suchi Saria: in order to build trust with 311 00:53:42.370 --> 00:53:44.040 Dr. Suchi Saria: machine learning systems. 312 00:53:44.060 --> 00:53:47.319 Dr. Suchi Saria: Super interesting, because it taught us a lot about 313 00:53:47.780 --> 00:53:55.850 Dr. Suchi Saria: both models of trust, but also, like, some barriers we would need to think about in order to be able to build systems that would get adopted. 314 00:53:55.880 --> 00:54:09.609 Dr. Suchi Saria: We tackled several of those learnings in building a system that we then presented within that pragmatic study and trial that I spoke about, and in a quantitative study then showed we were able to drive very high adoption. 315 00:54:10.130 --> 00:54:15.239 Dr. Suchi Saria: And so I'll talk a little bit about some of those learnings, which were more around: 316 00:54:15.808 --> 00:54:21.940 Dr. Suchi Saria: you need the system to be accurate. If the system 317 00:54:22.060 --> 00:54:27.380 Dr. Suchi Saria: has a very high false alerting rate, or it doesn't give valuable insight early enough, 318 00:54:27.390 --> 00:54:35.960 Dr. Suchi Saria: ultimately, clinicians are very smart. They're not going to use it, because they don't see the data being informative. So that's sort of one, at a very high level. Number 2, 319 00:54:36.260 --> 00:54:39.529 Dr. Suchi Saria: you know, they have mental models of how 320 00:54:39.560 --> 00:54:47.290 Dr. Suchi Saria: the disease works. So if it contradicts the way it works, you know, they're less likely to trust it. So again, thinking about 321 00:54:47.340 --> 00:55:04.990 Dr. Suchi Saria: intelligibility is very important. Turns out they don't care a whole lot about interpretability. So, for instance, the example they would often give is: I don't need to know how every bit of my fMRI machine works, or I don't need to know what neurons are firing for my colleague, 322 00:55:05.090 --> 00:55:13.679 Dr. Suchi Saria: who I trust to give me a diagnosis or to help me with the case, but I do understand the way they think and why they think what they think, or 323 00:55:13.880 --> 00:55:19.510 Dr. Suchi Saria: I have the ability to take what they're telling me and use it in enhancing my own way of reasoning. 324 00:55:19.530 --> 00:55:38.920 Dr. Suchi Saria: And so those were some really interesting ideas: basically being able to build a teaming model where there's a very clear role for what information the system is providing, and how it is providing it, and does the clinician understand what information is being provided in a manner that they know how to ingest and incorporate into their own reasoning procedure. 325 00:55:39.350 --> 00:55:42.730 Dr. Suchi Saria: and then that allows them to be more effective with it. 326 00:55:42.820 --> 00:55:51.549 Dr. Suchi Saria: Now, when you go down that line, it turns out there are also lots of ways in which you can create under-reliance.
So under-reliance is basically: 327 00:55:51.640 --> 00:56:00.060 Dr. Suchi Saria: the clinician doesn't trust the system and won't use it. But there's also over-reliance, which is, you start trusting it too much, which is a problem when the AI is not right. 328 00:56:00.070 --> 00:56:13.020 Dr. Suchi Saria: So this collection of work kind of stems from that, which is starting to think about under-reliance and over-reliance, and, you know, interventions we can do based upon the outputs we've generated to 329 00:56:13.110 --> 00:56:23.269 Dr. Suchi Saria: improve, you know, to study over- and under-reliance. And so a couple of quick takeaways; again, in the interest of time, I won't go into depth. But here, what we're doing is, again, 330 00:56:23.890 --> 00:56:45.469 Dr. Suchi Saria: it's an RCT, it's a randomized trial where we are presenting AI advice, and we're presenting different forms of advice, correct advice versus incorrect advice. And then we're also changing the interface, which is, we're presenting different types of explanations along with the advice. We're also changing whether or not 331 00:56:45.798 --> 00:56:55.010 Dr. Suchi Saria: the AI's confidence is exposed to the user. This is a paper in radiology that's coming out, if anyone is curious. 332 00:56:55.010 --> 00:57:17.770 Dr. Suchi Saria: A bunch of really interesting experiments here. We also, from a user perspective, had experts and non-experts. So experts means, in medicine, these are specialists; they see this as their domain. Non-experts means they're still medical experts, but maybe not specialists in this domain. So, like, emergency medicine physicians would be considered non-task experts; radiologists would be considered task experts for imaging data. 333 00:57:17.810 --> 00:57:20.659 Dr. Suchi Saria: And so a couple of interesting takeaways here. 334 00:57:21.440 --> 00:57:25.849 Dr. Suchi Saria: So it turns out, basically, there are some kinds of explanations 335 00:57:26.000 --> 00:57:34.840 Dr. Suchi Saria: that physicians like. So we measured impact on diagnostic accuracy. And then what we saw was, basically, under correct advice, 336 00:57:35.090 --> 00:57:58.139 Dr. Suchi Saria: diagnostic performance improved a great deal more when the explanation was a local type of explanation as opposed to a global type of explanation. So a local type of explanation means that it localizes where in the image something is wrong, and why; and global is, it says, you know, because of other images like this image. Let me give you some examples of other images where this was thought to be true. Now, it turns out with local advice, 337 00:57:58.260 --> 00:58:05.329 Dr. Suchi Saria: they know how to ingest it better, they know how to reason with it better, and as a result their performance is impacted more. 338 00:58:05.400 --> 00:58:14.400 Dr. Suchi Saria: Now the good news is, when the AI is correct, it actually improves the diagnostic performance, because they're likely to rely on it, compared to when they're given global advice. 339 00:58:15.520 --> 00:58:19.669 Dr. Suchi Saria: So high confidence, local AI explanations. Now, even more interesting: 340 00:58:19.750 --> 00:58:40.110 Dr. Suchi Saria: they're more likely to persuade, more likely to sway non-task experts than task experts. There's, you know, now evidence in the literature that task experts
are more likely to be biased against AI, which is, if they know it's AI advice, they tend not to take it as seriously. But if the same advice came and you told them it's from a human, they're more likely to take it. 341 00:58:40.680 --> 00:59:08.249 Dr. Suchi Saria: Turns out non-task experts have less of this bias, and for non-task experts the local explanations are very, very helpful. So in a lot of these access-type applications, where we're moving applications into a setting like the primary care scenario or the ED scenario, where we are trying to move from the specialist, there's sort of this exciting insight here that you can design them in a way that non-task experts are likely to rely on and use, and that helps improve diagnostic performance. 342 00:59:08.540 --> 00:59:14.550 Dr. Suchi Saria: There's also sort of this second really interesting concept of trust and simple trust. 343 00:59:14.580 --> 00:59:20.784 Dr. Suchi Saria: And the way we measured simple trust was this: you know, in a given image there's a whole bunch of concepts. 344 00:59:21.600 --> 00:59:31.339 Dr. Suchi Saria: You know, they're basically giving their impression, on this image of a chest X-ray, of a number of different clinical things that could be going on. 345 00:59:31.440 --> 00:59:45.789 Dr. Suchi Saria: So what we're measuring is basically: let's say there are 6 things going on. Let's say the AI identified 4 of them, and the expert identified 2 of them, and they agreed on 2 of the 6. So the percentage alignment is 2 out of 6. 346 00:59:45.860 --> 00:59:51.000 Dr. Suchi Saria: And then the question is, how quickly did they align, like, how long did they hover on the image? 347 00:59:51.110 --> 00:59:58.239 Dr. Suchi Saria: And if you take the percentage alignment divided by the amount of time it took, it's a metric for measuring simple trust. 348 00:59:58.634 --> 01:00:00.700 Dr. Suchi Saria: And so the idea is, like, 349 01:00:00.850 --> 01:00:12.730 Dr. Suchi Saria: if the explanations make sense to them, and they're able to get to the right answers, that makes the numerator go up. If they can get there quickly, that makes the denominator go down. Simple trust goes up. 350 01:00:13.250 --> 01:00:14.250 Dr. Suchi Saria: And so 351 01:00:14.780 --> 01:00:31.609 Dr. Suchi Saria: I think this is the last data slide. But basically, again, interesting takeaways here. Local explanations more easily drive simple trust, meaning physicians are more likely to both agree, but also get there faster, compared to global explanations. 352 01:00:31.630 --> 01:00:40.910 Dr. Suchi Saria: On the flip side, it's also possible to be overly reliant. And one of the interesting next steps from here is something like: 353 01:00:42.160 --> 01:01:02.040 Dr. Suchi Saria: Can we maybe take other mechanisms, like maybe thinking of the uncertainty interval, or the confidence of the AI, as a way to modulate, you know, when to trust and when not to trust, so that they're less likely to over-align in the scenarios where the AI is wrong, and then they continue to adopt and 354 01:01:02.270 --> 01:01:11.921 Dr. Suchi Saria: perhaps improve reliance in the scenarios where the AI is correct, because that would lead to a system that would be more performant.
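As a small worked example of the simple trust metric just described (percentage alignment divided by the time taken), the snippet below reproduces the 2-of-6 case from the talk; the 30- and 20-second timings are made-up numbers for illustration.

```python
def simple_trust(n_findings, n_agreed, seconds):
    """Percentage alignment between the AI's and the reader's findings, divided by time taken."""
    alignment = n_agreed / n_findings
    return alignment / seconds

# The talk's example: 6 findings in the image, reader and AI agree on 2 of them.
print(simple_trust(n_findings=6, n_agreed=2, seconds=30))  # ~0.011 per second
# Agreeing on more findings, faster, yields higher simple trust.
print(simple_trust(n_findings=6, n_agreed=4, seconds=20))  # ~0.033 per second
```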
So with that, I'll start wrapping up, which is to say, you know, 355 01:01:12.700 --> 01:01:37.399 Dr. Suchi Saria: it's extremely hard to do research that both has good, sound theoretical, methodological foundations, but also is married to practical, real world problems. It's interesting. I feel like, as a faculty member, as a professor of computer science, it's extremely easy to be carried away, or to be incentivized to just write, you know, papers in ICML and NeurIPS and 356 01:01:37.400 --> 01:02:04.969 Dr. Suchi Saria: you know, machine learning venues, where, you know, the bar for quality, in terms of our deep understanding of the problem and whether the evaluations are really all that good, can really sway us into believing something is actually working when it's not, or can sway us into believing something is very useful when it's not. So it takes a lot more work to be able to understand the real world. But deep understanding of real world requirements often inspires, 357 01:02:04.980 --> 01:02:31.689 Dr. Suchi Saria: I think, some of the most exciting foundational work. It's slower, it's more painful. But I think it's just exciting what is possible. And so I want to encourage and embrace the marriage of the 2. You know, the latter is real painful work. But again, I think it's, from a reward perspective, very valuable in terms of leading us to innovations that really matter in practice. 358 01:02:31.700 --> 01:02:34.510 Dr. Suchi Saria: And then just a last thread about 359 01:02:35.902 --> 01:02:38.857 Dr. Suchi Saria: treating algorithms like prescription drugs. I think, 360 01:02:39.590 --> 01:02:48.670 Dr. Suchi Saria: as I spend more time... you know, my early research was in AI and machine learning, but more sort of focused in robotics, or 361 01:02:48.910 --> 01:02:54.330 Dr. Suchi Saria: kind of domain agnostic. And in the last decade I've spent more and more time learning the innards of medicine, 362 01:02:54.470 --> 01:03:07.589 Dr. Suchi Saria: and kind of seeing the rigor with which people pursue deeply understanding whether or not something is working. And I think, as the 2 get married, as we see AI getting out in the real world more and more, there's such an exciting, 363 01:03:08.104 --> 01:03:20.499 Dr. Suchi Saria: real opportunity to bolster AI safety research. Today, a lot of AI safety research is focused on what I call AI alignment, which is, you know, for a very highly capable AI system, 364 01:03:20.984 --> 01:03:31.630 Dr. Suchi Saria: is it showing undesirable behavior? Like, for instance, it's taking control; it's starting to do things that we didn't mean for it to do by taking control. 365 01:03:31.730 --> 01:03:37.900 Dr. Suchi Saria: But I also think, in parallel, there's a significant opportunity in understanding AI risk, leveraging the existing regulatory lens of 366 01:03:37.930 --> 01:03:39.509 Dr. Suchi Saria: risk-benefit trade-off, 367 01:03:39.620 --> 01:04:03.159 Dr. Suchi Saria: and then really thinking about this notion of reliability around intended use. So we had sort of an expectation of what we wanted it to do. The question is, is it doing that in the real world? And that sort of notion of reliability is, you know, where a lot of the second half of my talk focused, and where I think there's a lot to be done in accelerating AI adoption. This is my last slide. Thank you very much. 368 01:04:07.110 --> 01:04:32.150 Michael Littman: All right. There's thunderous applause that is being completely filtered out by the Internet. But I feel it in my bones.
Thank you so much for working through that with us. Yeah, there's a lot of material, a lot of technical stuff, because there's the whole medicine side of things, there's the whole statistics side of things, and then there's the whole computer science side of things. And you really do bring all those things together to, yeah, to have a positive impact. So I think 369 01:04:32.150 --> 01:04:56.469 Michael Littman: it's just delightful to get to hear about it. So I'm very glad that we've got some questions that have cropped up. We've got about 10 min that we could engage with you on questions. So let me kick things off. One of the things that we often ask our speakers to do is to say just a little bit about how you got here. So what was the path that you took that brought you to this particular topic, and studying it in this particular way? 370 01:04:57.560 --> 01:05:07.289 Dr. Suchi Saria: Yeah, so my early background: I grew up in India, and it was perfectly reasonable to be a nerd in India. So I got into computer science and AI research quite early. 371 01:05:07.613 --> 01:05:32.810 Dr. Suchi Saria: And in particular, you know, this is like 96, 97, 98. I know there were people in CISE who were doing it, but it certainly wasn't a popular field to the same extent it is today. And most of my interest was in building machines that were smart and intelligent and could do things for me, because I'm a pretty lazy person, so that I thought was like the coolest thing ever. And then, fast forward, I actually had 372 01:05:32.810 --> 01:05:43.780 Dr. Suchi Saria: little to no interest in biology and medicine, which is kind of pathetic, given sort of what you heard today in the talk. And then around 2008, '09, '10, while I was at Stanford, 373 01:05:44.020 --> 01:05:46.910 Dr. Suchi Saria: the HITECH Act was just about to be passed, and I sort of 374 01:05:47.150 --> 01:05:55.750 Dr. Suchi Saria: got exposed to the kinds of challenges that I thought would become challenges as these new kinds of data came to be, 375 01:05:56.050 --> 01:06:03.250 Dr. Suchi Saria: and the need for AI and ML to work in these kinds of settings. And that was just fascinating to me. The 376 01:06:03.280 --> 01:06:30.559 Dr. Suchi Saria: potential for impact was fascinating, but also realizing that it would require really getting into and understanding the weeds of a different field, which is highly complicated and often daunting, I think, as a computer scientist. But I was going through an early midlife crisis around, you know, what was I going to be as I grew up? And it felt like this was a huge, untapped area where we really ought to be spending more time. Anyway, so that's how I got interested, and then I had fascinating collaborators who brought me along. 377 01:06:31.480 --> 01:06:48.870 Michael Littman: That's great. Thank you. Thanks for sharing that with us. Because, yeah, I think you're right that really having to roll up your sleeves and become an expert in this other domain is so daunting to so many people. But I think what you've helped highlight is how important it is to actually have that kind of impact. 378 01:06:48.980 --> 01:07:05.949 Michael Littman: Computer scientists who just say, hey, I'm inspired by health or electricity grids, or whatever it is, and then just make kind of a formal model of it and study that formal model, are never going to have the same kind of real world impact as the people who get in there and figure out what's actually going on.
So. 379 01:07:05.950 --> 01:07:18.940 Dr. Suchi Saria: I wanna add one little line there, which is not really a plug in any way. But I actually felt like, for me, moving to a place like Hopkins was also very critical for that. Like, once I started to feel like this is an area 380 01:07:19.598 --> 01:07:35.040 Dr. Suchi Saria: I wanted to learn more about, I think what has been really exciting for me at a place like Hopkins is to see that they really encourage it, because of APL, because of all of the history there of collaborating with DARPA and various agencies around real world problems, 381 01:07:35.200 --> 01:07:48.360 Dr. Suchi Saria: the opportunity of an environment that encouraged that broad curiosity, which, there are departments where that's not valued to the same extent. So that really sort of also helped shape, 382 01:07:48.460 --> 01:07:51.720 Dr. Suchi Saria: you know, how I developed as a researcher. 383 01:07:52.090 --> 01:07:59.520 Michael Littman: Do you have any advice for people who are out there who want to study these things, but they're afraid that it's going to have negative career repercussions? 384 01:07:59.680 --> 01:08:01.419 Dr. Suchi Saria: I mean, I think, like, 385 01:08:01.610 --> 01:08:09.419 Dr. Suchi Saria: so, this is gonna also sound probably quite terrible on a call with this kind of audience. But sort of, I used to often say, 386 01:08:09.510 --> 01:08:15.329 Dr. Suchi Saria: I care about the work and the results almost more than I care about being a researcher, a professor, and being successful. 387 01:08:15.450 --> 01:08:22.039 Dr. Suchi Saria: You know, like, I was okay if I didn't get tenure. I was also completely okay if I didn't, 388 01:08:22.060 --> 01:08:28.979 Dr. Suchi Saria: you know, sort of... I mean, it all turned out to be highly in my favor. 389 01:08:29.000 --> 01:08:43.660 Dr. Suchi Saria: Having said that, I felt like that helped relax some of the anxieties. But sort of the advice is: just really go find other people you can partner with and come along, and then find mentors, senior mentors, 390 01:08:43.700 --> 01:08:48.369 Dr. Suchi Saria: who have been successful in this way, and sort of, you know, try to understand, 391 01:08:48.420 --> 01:09:12.400 Dr. Suchi Saria: because there are also lots of ways in which applied research can go wrong. Right? So the scenario where you're basically not able to do impactful work in either area, because you've spent so much time being so diluted that you weren't able to really leverage your strengths in either, and that's the failure scenario you're trying to avoid. And so, you know, if you can have the right mentors... it's certainly the harder thing to do, but it's doable. 392 01:09:13.630 --> 01:09:35.959 Michael Littman: Outstanding. We got a couple of questions specifically about the recording of this talk and the slides, and if you didn't see it, Edgar, who's our IT specialist handling these talks, said that those will become available on the website. So know that those are covered. We're not gonna ask Suchi to directly email each of you the slides. We have them; we'll have them posted. 393 01:09:36.109 --> 01:09:54.660 Michael Littman: All right. So a lot of the questions came in anonymously, but this one has a name on it. Syed Ramis Nakvi asked: I keep coming across discussion of AI's impact on healthcare, especially patient safety. The impact can be negative or positive.
What do you think the most important negative impact or concern in this context is nowadays? 394 01:09:55.200 --> 01:10:07.369 Dr. Suchi Saria: I think the most likely negative impact is basically over-reliance when the AI is incorrect. Right? So we want to build AI in a way where it's deeply validated to be correct. 395 01:10:07.470 --> 01:10:16.810 Dr. Suchi Saria: We want to build the interfaces in a way that we can promote reliance in the right scenarios and avoid over-reliance in the wrong scenarios. 396 01:10:18.200 --> 01:10:18.890 Michael Littman: Very good. 397 01:10:19.801 --> 01:10:28.759 Michael Littman: This one, maybe, is more technical than I actually understand, but hopefully you'll understand it. What were the performance metrics to compare against CDS tools? 398 01:10:28.760 --> 01:10:37.780 Dr. Suchi Saria: Yeah, so interestingly, it's extremely common when we try to compare performance to use what I call lab metrics. 399 01:10:37.840 --> 01:10:45.510 Dr. Suchi Saria: So lab metrics are typical model metrics like accuracy, and things like ROC curves, AUC-ROC, and things like that. 400 01:10:46.016 --> 01:10:56.140 Dr. Suchi Saria: We definitely want to do that, but that alone is not enough. And that was actually one of our very early learnings that shaped the methods that we developed. So, for instance, 401 01:10:56.500 --> 01:11:03.619 Dr. Suchi Saria: my answer to this is actually pretty long if I go into the details of every metric. But the super short answer was: 402 01:11:03.860 --> 01:11:06.800 Dr. Suchi Saria: we really wanted to understand the current standard of care, 403 01:11:07.040 --> 01:11:13.070 Dr. Suchi Saria: and think about working backwards from the current standard of care, like, how do we measure performance against 404 01:11:13.110 --> 01:11:22.360 Dr. Suchi Saria: today's standard of care, which is physician performance. So that's number one. Number 2, against other alternative tools, so other tools they might be using. 405 01:11:22.430 --> 01:11:49.010 Dr. Suchi Saria: And then, based on that, come up with meaningful metrics that actually quantify impact. So that means, is it about earlier detection? Is it about earlier detection in all types of cases? Is it about reducing false alarms? Why is that a problem? Maybe because it's causing context switching, which is costing time. So we work backwards to come up with sort of metrics that are more relevant in practice. And then we kept going backwards to say, how does this then translate into model metrics, 406 01:11:49.090 --> 01:11:52.779 Dr. Suchi Saria: and how do we measure it in terms of model metrics to be able to get to the result. 407 01:11:52.810 --> 01:11:56.720 Dr. Suchi Saria: And then, ultimately, you can do all the positing you want. 408 01:11:56.800 --> 01:12:09.329 Dr. Suchi Saria: Eventually, because of the lack of very clean gold standards, it's the real world implementations where you actually get to see, in the diversity of cases, what happened in practice. And so that was sort of the way the real world trial really came to be. 409 01:12:10.380 --> 01:12:22.370 Michael Littman: So there are a couple of versions of this question, and I think this relates to your answer to the previous question. So maybe this is a quick one. But how are physicians convinced to accept the technology so readily? 410 01:12:22.800 --> 01:12:27.950 Michael Littman: Like you said, 89% versus the normal 10% for other tools,
I believe you said. 411 01:12:27.950 --> 01:12:32.150 Dr. Suchi Saria: Yeah, I think that's actually a pretty... like, there are probably... 412 01:12:32.480 --> 01:12:40.100 Dr. Suchi Saria: if there was one hill you could climb, then we would all climb that hill and we'd be done, and it turns out there are like 15 or 16 hills we have to climb along the way. 413 01:12:40.290 --> 01:12:46.349 Dr. Suchi Saria: And some of the intuitions I gave were around improving performance. Some of it was around the way the information was delivered. 414 01:12:46.740 --> 01:12:48.750 Dr. Suchi Saria: What was the teaming interface? 415 01:12:48.870 --> 01:13:02.599 Dr. Suchi Saria: How you implement matters a lot in terms of gaining trust. Is it easy to use? Is it in the way they expect it to be? The things that are novel, do they understand why it is novel? Like, is it in a language that they understand? Does it follow their trust model? 416 01:13:02.800 --> 01:13:11.390 Dr. Suchi Saria: Those are sort of the kinds of things we have to do to drive adoption. And, you know, that paper I pointed to would be an interesting one to look at, just as a starting point. 417 01:13:11.570 --> 01:13:12.170 Michael Littman: Nice. 418 01:13:13.130 --> 01:13:32.160 Michael Littman: All right, we're almost out of time, but let me see if I can squeeze in another question. Sanjana Mendu asked: Thank you for the insightful talk. How do you recommend navigating health-related ML problems with a much narrower time window for both monitoring behavior and adapting to model responses? For example, how do you foresee these models impacting clinical practice in domains like surgery? 419 01:13:32.370 --> 01:13:49.220 Dr. Suchi Saria: I actually think the setting that I use is quite close to that setting: it's real time, the impacts are in real time, and the window of opportunity is relatively short, unlike the typical, you know, chronic disease, like diabetes, etc., 420 01:13:49.370 --> 01:13:57.960 Dr. Suchi Saria: or COPD, where maybe there are short term effects, but there are lots of long term effects. But in the scenarios I spoke about, like surgery would be an example, 421 01:13:58.293 --> 01:14:08.799 Dr. Suchi Saria: you know, there are lots of shorter term things you're looking at. So hopefully the work I presented here was actually more of a model where, even in the toughest environments where you need rapid response, 422 01:14:08.840 --> 01:14:10.220 Dr. Suchi Saria: you can actually 423 01:14:10.240 --> 01:14:12.599 Dr. Suchi Saria: kind of make these kinds of methods come to life. 424 01:14:14.780 --> 01:14:20.769 Michael Littman: Do you... this is a variation of a question from Mutin Yilmaz. 425 01:14:20.970 --> 01:14:25.589 Michael Littman: Uncertainty modeling is really important in the work that you're doing, and of course, in this domain as a whole. 426 01:14:26.080 --> 01:14:36.270 Michael Littman: Where in the modeling process do you think are the most important places to consider uncertainty? So can you imagine considering uncertainty at the very beginning, before you've even done the modeling? 427 01:14:37.390 --> 01:14:39.550 Dr. Suchi Saria: Oof. I think 428 01:14:39.710 --> 01:14:46.845 Dr. Suchi Saria: pretty much everywhere, is my answer to this question. Like, the more you do, the better you do. So it's more like, 429 01:14:47.290 --> 01:14:53.640 Dr.
Suchi Saria: how much money do you have from the various agencies to think about what to model? And then you can kind of push your 430 01:14:53.690 --> 01:14:55.639 Dr. Suchi Saria: limit on taking advantage of it. 431 01:14:56.400 --> 01:14:58.059 Michael Littman: Alright! Fair enough. 432 01:14:58.260 --> 01:15:05.179 Michael Littman: Fair enough, all right. So we are out of time. I wanna again thank you so much for coming, participating in this, and sharing your insights with folks. There are.