Case Study: Detecting signs of Fabry and Pompe disease in UK clinical data

Volv, supported by Sanofi and working with Optimum Patient Care (OPC) and a specialist consultant clinician, is conducting research in the UK to build algorithms that better identify people living with Fabry or Pompe disease. This methodology, inTrigue, is highlighting ways to detect people living with either disease more precisely and much earlier.

Are you a Fabry or Pompe specialist in the UK who wants to know more, or to collaborate? Please contact us.

inTrigue: helping people living with disease get better outcomes

In the sections below you will find an overview of how we create models to help predict which people might be at risk of disease, some of the current performance metrics, and some background information on both Fabry and Pompe disease.

Using the inTrigue methodology, in collaboration with OPC in the UK and the OPC Research Database, and supported by Sanofi, we are learning novel patterns of disease. We do this because published medical criteria do not help find the patients who remain undiagnosed, and in fact flag many more patients who do not have the disease (false positives). The inTrigue approach looks for people who cannot be found using those methods. It is designed to help clinicians detect people who are living with a rare or difficult-to-diagnose disease and who are otherwise unlikely to get a diagnosis.
Importantly, this is a research project that:
- focusses on a limited population at first
- works with a population of clinicians that have signed up for the OPC quality improvement (QI) programme to improve the quality of care for patients in general practice
- aims to use the feedback from clinicians to improve the approach

This is a completely different level of performance that promises to reduce the time to a diagnosis and, importantly, to uncover the undiagnosed patients.

OPC quality improvement (QI) programme: https://www.primescholars.com/articles/strategies-that-promote-sustainability-in-quality-improvement-activities-for-chronic-disease-management-in-healthcare-se-100520.html

Volv, Sanofi and OPC: collaborating for people living with disease

Volv, supported by Sanofi and leveraging the data from OPC in the UK, is creating a unique collaboration that does not stop here.

Introduction

The first phase of this project was a collaboration to build new types of models for two rare diseases: Fabry and Pompe. To do this, we focussed on primary health care records, i.e. the records that general practitioners use. Both diseases are difficult for primary care clinicians to diagnose and, as a result, remain underdiagnosed. For Pompe disease in the UK, it is estimated that 50% of people with the disease are not being diagnosed, which lengthens the delay until they eventually do get a diagnosis.

The primary care data is managed by Optimum Patient Care, which provides de-identified data, covering around 8.5 million patient records, for research purposes. Data security and protection are paramount. This means that the data remains anonymous and secure during the disease model development process.
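Pseudonymisation with SHA256 is one of the protections listed for this dataset. The source does not describe how OPC applies it, so the following is only a minimal sketch of one common pattern, a keyed SHA-256 hash (HMAC). The function name `pseudonymise`, the secret key and the identifier format are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym with HMAC-SHA256.

    Assumption: a keyed hash is used. The source names only "SHA256"
    and does not say whether a key or salt is involved.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()  # 64 hexadecimal characters


# Hypothetical key and identifier, for illustration only.
key = b"example-secret-held-by-the-data-controller"
pseudonym = pseudonymise("patient-0001", key)
print(len(pseudonym), pseudonym[:12])
```

The same identifier always maps to the same pseudonym under a given key, so records can still be linked for research. Using a key rather than a bare hash makes it harder to recover identifiers by hashing guessed values.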
The data handling meets the following requirements:
- GDPR / DPA 2018 compliant
- Secured EHR data extraction
- Data is de-identified (no PID)
- Data is pseudonymised (SHA256)
- Secure data encryption (AES256)
- Secure data transfer via HSCN
- NHS DSP Toolkit (ref: 8HR5)
- Non-identifiable data is contributed to OPCRD for ethically approved research
- NHS IHRA REC (ref: 20/EM/0148)

Phase 1: Learn an algorithm/model for the diseases and validate with expert clinicians

The first phase of the inTrigue methodology was an iterative process of determining what makes patients with Fabry and Pompe disease stand out from all other patients. We used a combination of data science (or AI) approaches to arrive at a list of patients who plausibly have one of the diseases. Crucially, and as a point of difference, this phase also included validating whether the approach had worked by checking the inTrigue results with an expert clinician. We did this with a consultant in a specialist Fabry and Pompe department in a UK teaching hospital. The results of this evaluation can be seen in the results section. Once the clinician’s validation is complete, we take those inputs and optimise the algorithm, which further improves performance. Once this is done, we are ready to move to Phase 2.

Phase 2: Clinical follow-up on plausible patients, more accurately and earlier

In this second phase, the algorithm is applied to the data, and clinicians are asked whether they want to participate in the model deployment programme. Clinicians need to give their consent to be part of this quality improvement programme; several QI programmes are already in place. If they agree, they can check whether any of the patients in their practice are at risk of these diseases. This is done through the remote installation of reports in the GP system. We can then monitor whether the quality of clinical care improves.
More results on the deployment of the models will be published at a later stage, but the optimisation steps carried out after clinician validation show a significant improvement on the results presented here.

Later phases

After this programme, consideration is being given to deploying the models more widely by embedding them into GP systems nationwide.

Initial metrics on model performance

Model performance: Fabry disease in UK

Task: Use the model learned via Algorithm SLSL to find undiagnosed Fabry disease (FD) patients in the OPCRD EHR database GP-EHR-DB-UK (18M patients).

Evaluation procedure: Request that an FD specialist practising in the UK review the EHRs of the top 50 candidate patients (candidates have predicted probabilities exceeding the FD threshold).

Evaluation outcome: Results are very promising. Using the precision@k metric, precision is 88% for the top 25 of the 50 patients, and remains high at 76% when all 50 patients are considered.

Model performance: Pompe disease in UK

Task: Use the model learned via Algorithm SLSL to find undiagnosed Pompe disease (PD) patients in the OPCRD EHR database GP-EHR-DB-UK (18M patients).

Evaluation procedure: Request that a PD specialist practising in the UK review
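The precision@k figures quoted for Fabry disease can be illustrated with a short sketch. Precision@k is the fraction of the k highest-ranked candidates that the reviewer confirms as plausible. The counts below follow from the reported percentages (88% of 25 is 22 patients, 76% of 50 is 38 patients), but the position of each confirmed patient within the ranking is a hypothetical assumption, as is the function name `precision_at_k`.

```python
from typing import Sequence


def precision_at_k(confirmed: Sequence[bool], k: int) -> float:
    """Fraction of the top-k ranked candidates confirmed by the reviewer.

    `confirmed` must be ordered by descending predicted probability.
    """
    if not 0 < k <= len(confirmed):
        raise ValueError("k must be between 1 and the number of candidates")
    return sum(confirmed[:k]) / k


# Hypothetical ordering: only the totals (22 of the top 25, 38 of all 50)
# are derived from the percentages reported in the text.
review = [True] * 22 + [False] * 3 + [True] * 16 + [False] * 9

print(precision_at_k(review, 25))  # 0.88
print(precision_at_k(review, 50))  # 0.76
```

Because the metric looks only at the top of the ranked list, it matches an evaluation in which a specialist can review just a limited number of records.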
White Paper: The Path to Rare Disease Clinical Trial Innovation

By Volv Global SA and WODC EU contributors

Executive Summary

For decades, the pharmaceutical industry has faced the same recurring problems with clinical development: the struggle to recruit and retain enough patients, meet target timelines, and have trials conclude on time. The industry certainly does overestimate its ability to recruit, but a bigger issue is that study designs and protocol development seemingly fail to reflect patients’ lives or to account for the reality in the clinic. In fact, data show that the probability of success of a clinical development effort is 6.2% for orphan drugs, compared with 13.8% overall, which translates to a 93.8% failure rate for orphan drug development efforts.

Given the often progressive and irreversible nature of rare diseases, there is a need to increase efforts to find undiagnosed patients, diagnose them earlier, and bring them into the frame when developing new treatment options. To achieve this, we must, collectively as an industry, do more research into the rare disease patient population to characterise and better understand both the already diagnosed and the undiagnosed. We need this deeper understanding before deciding on the best clinical development strategy, finalising clinical trial design, and starting to enrol the patient population in a clinical study.

To do that, clinical researchers and drug developers need to bring much more knowledge and understanding of the people who are unknowingly living with the disease into the design of clinical development plans and study protocols. To find those people, there is a need to consult more extensively on the design of protocols, not just with key opinion leaders, but also with physicians who typically see and treat larger numbers of patients.

One crucial factor with rare diseases is that the diagnostic journey is arduous and lengthy, and many patients are not correctly diagnosed.
As an example, a study found that 58% of Ehlers-Danlos syndrome (EDS) patients consulted more than five doctors, and 20% consulted more than 20[i]. So, when designing and recruiting for clinical trials, drug developers must first learn where the “as yet undiagnosed patients” are “hidden”: in other words, where they may be in the healthcare system, and which specialists they are seeing. It is those specialisms that need to be brought along in the diagnostic journey, so they can learn to identify rare disease patients within their practice. This is very well illustrated by acute hepatic porphyria (AHP), where the prevailing view is that patients reside in the gastroenterology world but, in fact, an even larger group resides in other specialties. Another example is cited in Chapter 2.

With novel approaches, such as the use of Machine Learning (ML), we can now bring to their clinicians’ attention people who are not yet diagnosed but are likely to be living with a disease. ML derives subtle indicators from health care records that would be difficult or nearly impossible for a doctor to recognise amidst the wealth of data already in front of them.

Conducting thorough natural history studies of patients living with disease, and including the wider population of people suspected of living with the disease but currently undiagnosed, can help to uncover sentinel events or detectable physiologic changes that are key predictors of disease progression or that are clinically important. These can show which subgroups of people living with the disease might benefit from a drug in development and should therefore be targeted for inclusion in the clinical trial. Importantly, clinical researchers also need to scrutinise the data and adopt the insights gained from ML models, which will enable better clinical development strategy, design, and patient stratification.
First, though, we need to understand the barriers and misconceptions about the art of the possible and address those directly. This paper explores the changing expectations of the regulators, the challenges the health industry continues to face, and the ways in which we can rethink the entire clinical development process (from development strategy, to protocol design, to patient identification and recruitment) to achieve real breakthroughs in rare disease research and development.

Chapter 2: Misconceptions and industry challenges

The path to rare disease innovation begins with a better understanding of the complexity of each disease, a point well understood by the health authorities. As the US Food and Drug Administration (FDA) has identified in its guidance on natural history studies, rare diseases can have substantial genotypic and/or phenotypic heterogeneity. As such, the natural history of each subtype, if it exists at all, may be poorly understood or inadequately characterised. Above all, a typical natural history study certainly does not include the people living with the disease who, as is common in rare diseases, remain undiagnosed.

There are two levels of undiagnosed patients: those who have had no diagnosis at all and have therefore not been matched with a disease, and those who have had a partial diagnosis but whose symptoms are not well characterised and who therefore do not belong to a defined subgroup. As researchers learn more about rare diseases, they are starting to understand that different phenotypes may present with the involvement of different organ systems, with varying degrees of severity or rates of deterioration. As noted earlier, ML can help to elicit subtle indicators from electronic health records or claims data.
However, during panel debates at recent orphan drug conferences, there seemed to be a strong bias towards the use of registries for research and patient characterisation. There were also clear misconceptions, from both industry and regulators, about the usability of primary care electronic medical records (or electronic claims data) for early disease detection, whether done in a traditional manner or ML-assisted.

The limitations of registries

While disease registries have a clear purpose, they are constrained by the fact that they tend to contain data only on patients who are known to have a given disease. By focusing only on rare disease data that already exists in patient registries, research