You now have access to the webinar AI-Driven Insights from Real-World Data, where experts discuss how artificial intelligence applied to real-world data can help accelerate the identification, diagnosis, and treatment of undiagnosed patients. We explore a specific case study in which Volv Global, in collaboration with Takeda, Cleveland Clinic, and Komodo, applied AI-driven insights from real-world data (RWD) to rapidly identify previously undiagnosed alpha-1 antitrypsin deficiency (AATD) patients, enabling earlier diagnosis, more equitable care, and personalised interventions at scale.
You can watch the full webinar below:
You will also receive the webinar link by email for future access.
If you have any questions, feel free to contact us at [email protected].
And this is where, of course, detection becomes important, because registrational trials for such drugs will require a big patient cohort that's currently unavailable given the under-recognition of alpha-1. So, in a future state, one could well imagine indications to treat individuals who have yet to develop disease, which is not currently the case. Currently, the indications for augmentation therapy, quite concordant across multiple guidelines, are that it is indicated for individuals who have severe deficiency of alpha-1 and demonstrable emphysema. Most patients with AATD-related emphysema have the PI*ZZ genotype, although there are other, rare genotypes – Mmalton, Siiyama, etc. – that predispose to emphysema. Many AATD-deficient individuals with emphysema have fixed airflow obstruction on pulmonary function tests, but some have preserved airflow with emphysema on imaging.
Vickram Tejwani answer: Yes, I agree. As Dr. Stoller alluded to, there are a number of pipeline therapies. And then the other kind of pipeline component is predictive models, typically omics-based models, to understand who's going to do poorly and who's going to do well, because even among ZZs there's a huge spectrum of heterogeneity in terms of outcomes, and MZs as well, which we're studying quite intensively, locally. I think it comes down to this crucial point: if you don't know that they have this, they'll never be included in these efforts moving ahead. And then, on a more practical, immediately implementable note, we know from work Dr. Stoller and I did that with a delayed diagnosis, these patients end up doing worse.
Anecdotally, we've had patients who get identified while perhaps asymptomatic – and we're certainly not treating them at that point, given normal lung function and no emphysema – but on subsequent surveillance, they do develop airflow obstruction. And now we're not dealing with a back-end delay in getting them tested. We're already aware of their ZZ status and able to immediately start guideline-indicated therapy, as opposed to incurring an additional delay later.
Marie Sanchirico answer: I'm not a clinician, but I would probably step back even further and simply say that when patients are aware of their genetic risk factors, physicians can work with them to minimise behaviours that would lead to the onset of pathogenic disease – reducing smoking, taking care of your lungs, limiting environmental exposures. That awareness can change not only whether the disease will onset, but certainly when it might onset.
We use Datavant tokenisation and certification to ensure that populations of interest – and the individuals within them – aren't re-identified.
Even if you build a strong predictive model using lab values or imaging results, it may not generalise well to other clinical settings, because that data may not be available or might be recorded differently. You then have difficulty moving the model from claims data to another clinical setting, which was key, I think, to this project and important in general.
We are also collaborating with Komodo Health on another project focused on the real-world natural history of diseases using claims data. At first, it might sound surprising, but, as Scott mentioned, we have ways to link EMRs or other data sources to that. The challenge is that as soon as we start to do so, the overlap often becomes small; you end up with a smaller cohort, and it's harder to track the patient journey.
So if your goal is representative data and meaningful conclusions – whether for outcome prediction modelling, disease learning, or finding undiagnosed patients – I think claims data can be fit for purpose. And we are now stretching that use case even further, performing real-world evidence generation and health economics research, as well as natural history studies.
Marie Sanchirico answer: And I would say, Vahid, it comes back to – I think you mentioned it early on – asking the right questions. You need to think deeply about whether the disease you're exploring has the right data, and whether the right information is being entered into the data that would inform a model. I have worked in other therapeutic areas and with other models where it hasn't worked, and there are a myriad of reasons. Vahid, you mentioned you have to think about the biases and about what data is being entered: is the information you would need to diagnose the patient actually captured in structured data, if you're trying to accelerate diagnosis? Is it in the structured data, or is it hidden somewhere in messy notes? So you really do have to do some deep thinking before you kick off a project and ask whether your disease state is the right one in which to deploy an AI tool, or to choose claims data, or whatever approach you might be thinking of taking.
And then, the question is: will the physician – as Vick mentioned – behave compliantly on that basis? And if he or she does, does the test demonstrate the presence or absence of alpha-1 – the 6% prevalence that Dr. Tejwani mentioned before? So in that context, because the algorithm is prompting the caring / treating physician, it is not independently generating a test outside the physician-patient relationship. This approach is, in our IRB's view, perfectly acceptable.
I'll say one other thing, because I'm aware that one of the other questions regarded newborn screening. And, of course, there is no intent for this AI algorithm to engage with newborn screening. As the questioner is aware, if we, in fact, had a population-based approach to newborn screening, which is a very complicated issue, that would pre-empt the need for testing patients with an AI algorithm, because in that regard, much like cystic fibrosis or phenylketonuria or congenital hypothyroidism, which are routinely tested in the United States at birth, newborn screening would likely obviate the need for an AI algorithm. But again, newborn screening is a very complicated issue with important guardrails on psychological well-being for the patient and for the parents.
That said, this was an in silico comparison. In Phase 2, as we scale up at Cleveland Clinic, we plan to track confirmed diagnoses and directly compare age at diagnosis under the model versus the current baseline, so we can make a more confirmatory real-world assessment of how much earlier diagnosis occurs.
We did not start from scratch. We reused the core features and signals learned from claims and then recalibrated the model in the EMR setting. As part of that recalibration, the relative importance of features shifted, which was the purpose of recalibrating on more "gold-standard" labelled patients and cohorts, as Dr. Stoller mentioned.
So the core signals were robust and we were able to migrate them, but we still needed fine-tuning and recalibration to reflect the new environment. This is critical, because one common pitfall is that models developed solely on claims can fail when moved into a clinical setting if that end-to-end adaptation isn't planned from the start. That's why we considered the final deployment setting from the beginning.
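To make the idea concrete, here is a minimal, purely illustrative sketch of that recalibration pattern – keeping the same feature set but refitting the model weights on a smaller, better-labelled cohort. This is not the actual Volv Global pipeline; the data, features, and choice of logistic regression are all stand-in assumptions.

```python
# Illustrative sketch only: recalibrating a model trained on a large,
# noisily labelled "claims" cohort against a smaller "EMR" cohort with
# gold-standard labels. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for claims data: large cohort, noisy labels.
X_claims = rng.normal(size=(5000, 4))
y_claims = (X_claims[:, 0] + 0.5 * X_claims[:, 1]
            + rng.normal(scale=1.5, size=5000)) > 0
source_model = LogisticRegression().fit(X_claims, y_claims)

# Stand-in for the EMR cohort: smaller, better-labelled, and with a
# shifted feature-outcome relationship (so feature importances move).
X_emr = rng.normal(size=(300, 4))
y_emr = (0.3 * X_emr[:, 0] + X_emr[:, 1]
         + rng.normal(scale=0.5, size=300)) > 0

# Recalibrate: same features, weights refit on the gold-standard labels.
recalibrated = LogisticRegression().fit(X_emr, y_emr)

print("claims-era weights:", np.round(source_model.coef_[0], 2))
print("EMR-era weights:   ", np.round(recalibrated.coef_[0], 2))
```

In this toy setup, the weight on the second feature grows after recalibration while the first shrinks, mirroring the point above that relative feature importance is allowed to shift when the model is adapted to the deployment setting.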
Is the MZ population also included, or only ZZ?
Jamie Stoller answer: The algorithm is designed to identify individuals with any deficiency of AAT, including PI*MZ individuals.
When I worked in newborn screening, there was a phenomenon where parents of babies with a false positive screen still believed that their baby was not healthy. Do you have any concerns that the algorithm would lead people to think they are at increased risk even if they test negative?
Jamie Stoller answer: This is a good question. I would note that the AI algorithm does not involve neonatal screening, so the issue you raise falls outside the scope of this study. I would regard the test results as definitive, and I am hopeful that clinicians discussing the results with patients or their parents could allay concerns if the AAT test shows no deficiency.
Measuring test positivity rate is a core endpoint for the next phase: in Phase 2 we will run a prospective study and track AATD test orders and results among model-flagged patients versus baseline practice, with testing performed at the clinician’s discretion. That will allow us to report the test positivity rate before and after deployment in a clinically meaningful way.
Jamie Stoller / Vickram Tejwani answer: This is a good question. Bronchiectasis does occur in association with AATD and, based on Phase 1 data, we anticipate it will be a feature that factors into increasing any single individual's likelihood of AATD. Imaging studies suggest that radiographic evidence of bronchiectasis is very common, and that clinical bronchiectasis – with copious purulent phlegm, etc. – occurs less commonly, i.e., in approximately one quarter of individuals with severe AATD. Bronchiectasis is usually associated with some evidence of emphysema, in which case augmentation therapy might be indicated. I am unaware of data supporting augmentation therapy in the presence of "pure" bronchiectasis.