Event Details
Noncoding regions, which do not encode proteins, make up the majority of the genome; for example, about 99% of the human genome is noncoding DNA. Mutations in the noncoding genome have been crucial to understanding disease mechanisms, because they can dysregulate disease-associated genes. One key element of gene regulation that noncoding mutations affect is the binding of proteins to DNA sequences. Insertions and deletions of bases (InDels) are the second most common type of mutation and may alter DNA-protein binding. However, no existing method can quantify the effect of InDels on DNA-protein binding. We develop a novel statistical test, the binding changer test (BC test), to evaluate the impact of InDels on DNA binding using DNA-binding motifs and single-sequence modeling. The test predicts binding changer InDels of regulatory importance using an efficient importance sampling algorithm, which generates background sequences from an importance distribution that places more weight on large changes in binding affinity. We derive the importance distribution with the optimal tilting parameter. The BC test provides a general statistical framework applicable to any disease type and any species' genome. Simulation studies demonstrate its excellent performance. Applying the test to genome sequencing datasets from human leukemia samples uncovers candidate pathologic InDels that modulate MYC binding in leukemic genomes.
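For readers unfamiliar with importance sampling by exponential tilting, the general idea can be sketched in a few lines of Python. This is a generic illustration only, not the BC test itself: the function name `tilted_tail_probability`, the toy score matrix, the uniform background, and the fixed tilting parameter `theta` are all assumptions made for the example. It estimates the probability that a random background sequence scores at least a given threshold against a motif, by sampling from a distribution tilted toward high scores and reweighting each sample by its likelihood ratio.

```python
import math
import random

BASES = "ACGT"

def tilted_tail_probability(pwm, background, threshold, theta,
                            n_samples=20000, seed=0):
    """Estimate P(score >= threshold) under `background` by exponential tilting.

    Hypothetical helper for illustration; `pwm` is a list of per-position
    dicts mapping each base to a score.
    """
    rng = random.Random(seed)
    # Tilted per-position distribution: q_i(b) proportional to
    # background[b] * exp(theta * pwm[i][b]); log_norm accumulates log Z_i.
    tilted, log_norm = [], 0.0
    for column in pwm:
        weights = [background[b] * math.exp(theta * column[b]) for b in BASES]
        z = sum(weights)
        tilted.append([w / z for w in weights])
        log_norm += math.log(z)

    total = 0.0
    for _ in range(n_samples):
        score = 0.0
        for column, probs in zip(pwm, tilted):
            base = rng.choices(BASES, weights=probs)[0]
            score += column[base]
        if score >= threshold:
            # Likelihood ratio p(x) / q(x) = exp(-theta * score) * prod_i Z_i.
            total += math.exp(log_norm - theta * score)
    return total / n_samples

# Toy motif (assumed): each position prefers one base (+1) over the rest (-1).
pwm = [{b: (1.0 if b == best else -1.0) for b in BASES} for best in "ACGT"]
background = {b: 0.25 for b in BASES}
estimate = tilted_tail_probability(pwm, background, threshold=2.0, theta=1.0)
print(estimate)  # the exact probability for this toy case is 13/256
```

Setting `theta=0.0` reduces the sketch to plain Monte Carlo. A positive `theta` makes high-scoring sequences far more frequent among the samples, which is what keeps the estimate stable when the target probability is small. The abstract's optimal tilting parameter is derived by the speaker and is not reproduced here.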
Coffee will be served in the alcove outside FO 2.406 from 10.30am to 11am.