612 - AI-assisted variant scoring to improve diagnosis of rare genetic diseases
Monday, April 27, 2026
8:00am - 10:00am ET
Publication Number: 4599.612
Klaus Schmitz-Abe, University of Miami, Miami, FL, United States; Madesh Chinnathevar Ramesh, University of Miami Leonard M. Miller School of Medicine, Miami, FL, United States; Qifei Li, University of Miami Miller School of Medicine and Holtz Children's Hospital, Miami, FL, United States; Shiyu Luo, University of Miami Leonard M. Miller School of Medicine, Miami, FL, United States; Pankaj Agrawal, Holtz Children's Hospital Jackson Memorial Hospital, Miami, FL, United States
Associate Professor, University of Miami, Miami, Florida, United States
Background: Diagnosing rare genetic disorders is especially challenging because of the wide range and complexity of symptoms. Affected children and their families often experience long periods of uncertainty and undergo multiple tests before receiving an accurate diagnosis. A major barrier to improving diagnosis is the reliance on manual interpretation of genetic variants, which is time-consuming, expertise-dependent, and inconsistent across laboratories. The lack of standardization in this process leads to variability in diagnostic yield and delays in appropriate patient care.
Objective: We are refining our original neural network (NN) model using advanced artificial intelligence (AI) methods, evaluating multiple model architectures.
Design/Methods: In 2017, we developed the Variant Explorer Pipeline (VExP), a validated pipeline that efficiently scores and prioritizes candidate pathogenic variants. Originally trained as a small NN model on 250 solved cases with whole exome sequencing (WES) data, VExP integrates genomic annotations and clinical features to enhance variant prioritization accuracy. The pipeline processes SNVs, INDELs, CNVs, and STRs using ~135 features, including CADD, SpliceAI, HPO, gnomAD, and GTEx annotations. VExP uniquely supports integrated scoring of CNVs and STRs alongside SNVs and INDELs.
Results: Applied to 791 additional solved WES cases, VExP demonstrated superior prioritization efficiency compared with two pipelines widely used in the scientific community, Exomiser [Robinson et al., 2014] and Xrare [Li et al., 2019]. VExP ranked the disease-causing variant among the top 20 in 84% of cases, significantly outperforming Exomiser (49%) and Xrare (42%). This performance highlights VExP's effectiveness across inheritance models (recessive and dominant). To date, VExP has contributed to over 25 published studies, facilitating both novel gene discoveries and expansions of genotype-phenotype correlations. In prior use, VExP increased diagnostic yield by over 15% in reanalyzed exomes from previously unsolved cases.
Conclusion(s): We are refining our original neural network (NN) model using advanced artificial intelligence (AI) methods. Our current goal is to develop an enhanced AI model, trained on a larger dataset (~1,500 solved cases) with updated databases and annotations, to further improve its accuracy and clinical utility. To ensure unbiased evaluation, validation of the new AI model is planned on three independent cohorts (~900 solved cases): the Undiagnosed Diseases Network (UDN), Deciphering Developmental Disorders (DDD), and Genomics Research to Elucidate the Genetics of Rare Diseases (GREGoR).
Variant Explorer Pipeline (VExP)
In 2017, we created the Variant Explorer Pipeline (VExP) [Schmitz-Abe et al., 2019; Schmitz-Abe et al., 2020], an innovative, validated, custom-built pipeline for narrowing down candidate variants. VExP uses a classification and scoring method to rank disease-causing variants; it was based on a small neural network (NN) model trained on 250 samples from the Gene Discovery Core of the Manton Center.
VExP results: neural network model
The graph shows the rank of the diagnostic gene, from the top 1 to the top 100 positions, for VExP and two other machine learning models, Exomiser and Xrare. Both are widely used by the scientific community and are freely available online. VExP outperforms the other methods on 791 solved cases. Within the top 20 variants, VExP captures 84.1% of the solved cases, compared with 49.1% for Exomiser and 42.2% for Xrare.
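The top-k figures quoted here (for example, the share of solved cases whose diagnostic variant falls within the top 20) can be computed as in the minimal sketch below. This is an illustration, not the VExP evaluation code: the function names and the toy ranks are hypothetical, and it assumes each solved case supplies the 1-based rank of its causal variant, or `None` when the tool did not rank it at all.

```python
from typing import Dict, Iterable, List, Optional


def top_k_recall(causal_ranks: Iterable[Optional[int]], k: int) -> float:
    """Fraction of solved cases whose causal variant has rank <= k (1-based).

    A rank of None means the causal variant was not ranked for that case;
    such cases count as misses but still count in the denominator.
    """
    ranks: List[Optional[int]] = list(causal_ranks)
    if not ranks:
        raise ValueError("no cases supplied")
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


def cumulative_curve(
    causal_ranks: Iterable[Optional[int]], ks: Iterable[int]
) -> Dict[int, float]:
    """Top-k recall at each cutoff in ks, i.e. the points of a rank curve."""
    ranks = list(causal_ranks)
    return {k: top_k_recall(ranks, k) for k in ks}


# Hypothetical toy data: eight solved cases, one with an unranked causal variant.
toy_ranks = [1, 3, 18, 25, None, 7, 60, 2]
curve = cumulative_curve(toy_ranks, [1, 10, 20, 100])
for k, recall in curve.items():
    print(f"top {k:>3}: {recall:.1%}")
```

Keeping unranked cases in the denominator is a deliberate choice in this sketch: dropping them would inflate the recall of a tool that fails to return the causal variant at all.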
VExP results: artificial intelligence model
The graphs show the rank of the diagnostic gene, from the top 1 to the top 50 positions, for VExP, our work-in-progress AI model, and two other machine learning models, Exomiser and Xrare (recessive solved cases).