Research project

Evaluation of in silico tools for pathogenicity prediction of missense variants in the diagnostic laboratory

Clinical Bioinformatics - Genomics
Verity Fryer
Training location
Royal Devon & Exeter NHS Foundation Trust

Abstract
The use of in silico tools for pathogenicity prediction as supporting evidence for variant classification is recommended by the American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) and Association for Clinical Genomic Science (ACGS) variant interpretation guidelines.
The use of ensemble prediction methods (“meta-predictors”) has been suggested by ACGS as a replacement for the current practice of using multiple individual algorithms, which must be concordant to be used as evidence. Meta-predictors are not widely used in UK diagnostic laboratories, and the evidence in this study shows that whether a bioinformatics tool is included in variant interpretation software influences how widely it is used in these laboratories.
Using a dataset of 1,413 pathogenic and 4,790 benign unique variants, filtered to exclude all variants used to train the tools under test, the widely used in silico prediction algorithms Align-GVGD, PolyPhen-2 and SIFT are compared with the ensemble prediction methods GAVIN, MPC and REVEL. The considerations for, and process of, creating a test dataset suitable for this comparison are described in detail.
The meta-predictors tested have greater sensitivity and specificity than the currently used tools, with higher positive and negative predictive values. For those tools that do not provide an easy-to-interpret, descriptive binary prediction such as “benign” or “pathogenic”, score thresholds are suggested; these were used to generate descriptive predictions and enable comparative analysis. The importance of establishing threshold values to enable these tools to be used diagnostically is discussed. The use of Align-GVGD, SIFT and PolyPhen-2 in combination is shown not to perform as well as at least two of the three meta-predictors and is discouraged as a combination for in silico pathogenicity prediction.
GAVIN achieved the best performance overall, with sensitivity and specificity of 93.7% and 98.6% respectively.
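The comparison described in the abstract rests on two steps: converting a tool's numeric score into a binary “benign”/“pathogenic” call with a threshold, then scoring those calls against known classifications. The following is a minimal Python sketch of that idea, not the study's actual pipeline. The function names, the threshold of 0.5, and the eight toy variants are all hypothetical, and the sketch assumes that higher scores indicate pathogenicity (some tools, such as SIFT, score in the opposite direction and would need the comparison reversed).

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    sensitivity: float  # TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    ppv: float          # TP / (TP + FP)
    npv: float          # TN / (TN + FN)


def classify(score: float, threshold: float) -> str:
    """Turn a numeric score into a descriptive call.

    Assumption: higher score means more likely pathogenic.
    """
    return "pathogenic" if score >= threshold else "benign"


def _ratio(numerator: int, denominator: int) -> float:
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate(scores, truths, threshold: float) -> Metrics:
    """Compare thresholded predictions with known classifications."""
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, truths):
        predicted = classify(score, threshold)
        if predicted == "pathogenic" and truth == "pathogenic":
            tp += 1
        elif predicted == "pathogenic" and truth == "benign":
            fp += 1
        elif predicted == "benign" and truth == "benign":
            tn += 1
        else:
            fn += 1
    return Metrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


# Toy data only: these scores and labels are invented for illustration.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1, 0.3, 0.6]
toy_truths = ["pathogenic"] * 3 + ["benign"] * 5
toy_metrics = evaluate(toy_scores, toy_truths, threshold=0.5)
print(toy_metrics)
```

Raising the threshold in this sketch trades sensitivity for specificity, which is why an agreed threshold is needed before a score-based tool can be compared with tools that report a descriptive call directly.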

Last updated on 10th September 2020