Mutational signatures and their impact on protein structures
In this project we ask whether different mutational signatures (combinations of mutable DNA motifs) have different impacts on the proteins they affect. Proteins differ in their structural folds; these folds, together with the local physicochemical environment, dictate the distribution of amino acids along the protein sequence, and should therefore make different DNA motifs available for mutational processes to act upon. I undertook computational analyses of observed somatic mutations, as well as mutational scanning of human coding sequences, to ask whether some mutable DNA motifs are potentially more harmful than others, and how this is balanced against the conservation of protein sequence, structure and function.
I began working in this area during my PhD, when I contributed to ZoomVar, an analysis that maps protein missense variants from health and disease onto protein structure, protein stability and abundance data.
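To make the idea of scanning a coding sequence by mutational context concrete, here is a minimal Python sketch. It is an illustration only, not the code used in the project (the actual analysis is in the R markdown notebooks listed under Materials). The function names `scan_cds` and `tally_by_context` are hypothetical, and the sketch assumes an in-frame coding sequence on the coding strand, the standard genetic code, and a simple trinucleotide context with no folding onto the pyrimidine strand.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def scan_cds(cds):
    """Enumerate every single-nucleotide substitution in an in-frame CDS.

    Hypothetical helper: returns tuples of
    (position, trinucleotide context, ref base, alt base,
     ref amino acid, alt amino acid, consequence).
    The first and last bases are skipped because they lack a flanking base.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    records = []
    for pos in range(1, len(cds) - 1):
        ref = cds[pos]
        context = cds[pos - 1:pos + 2]      # 5' base, mutated base, 3' base
        codon_start = pos - pos % 3         # start of the codon containing pos
        offset = pos - codon_start          # position of the base within the codon
        ref_codon = cds[codon_start:codon_start + 3]
        ref_aa = CODON_TABLE[ref_codon]
        for alt in "ACGT":
            if alt == ref:
                continue
            alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
            alt_aa = CODON_TABLE[alt_codon]
            if alt_aa == ref_aa:
                consequence = "synonymous"
            elif ref_aa == "*":
                consequence = "stop_lost"
            elif alt_aa == "*":
                consequence = "nonsense"
            else:
                consequence = "missense"
            records.append((pos, context, ref, alt, ref_aa, alt_aa, consequence))
    return records


def tally_by_context(records):
    """Count consequences per (trinucleotide context, substitution) pair."""
    return Counter(
        (context, f"{ref}>{alt}", consequence)
        for _, context, ref, alt, _, _, consequence in records
    )


if __name__ == "__main__":
    # Toy sequence (Met-Trp-stop), chosen only for illustration.
    for key, count in sorted(tally_by_context(scan_cds("ATGTGGTAA")).items()):
        print(key, count)
```

Tallies of this kind, computed separately for residues in different structural environments (for example surface versus core), are one way to compare how much damaging potential each motif carries. The classification used here is deliberately coarse and does not score the structural impact of a missense change.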
Publications
Ng JCF, Fraternali F. Protein structural consequences of DNA mutational signatures: A meta-analysis of somatic variants and deep mutational scanning data. bioRxiv 2021 doi: https://doi.org/10.1101/2021.05.27.445950
Laddach A, Ng JCF, Fraternali F. Pathogenic missense protein variants affect different functional pathways and proteomic features than healthy population variants. PLoS Biol. 2021 (Accepted)
Materials
Repository
On BitBucket.
R markdown
Knitted HTML versions of the R markdown documents containing all code and analysis undertaken in this project. The raw R markdown files can be found via the links provided below.
Annotating TCGA variants in terms of protein (structural) consequences | HTML notebook | R markdown code |
Selection pressure in protein surface and core | HTML notebook | R markdown code |
Annotating mutational signatures in terms of protein (structural) consequences | HTML notebook | R markdown code |
Protein impact of SNVs occurring in different mutational contexts | HTML notebook | R markdown code |
Compare homologues in terms of availability of mutable motifs | HTML notebook | R markdown code |
Web-app
ZoomvarTCGA is an R Shiny web application that allows users to browse pre-computed mappings of TCGA, ICGC metastatic tumour and HipSci iPSC variants to protein structure. It provides data presentation and visualisation tools for browsing the data generated in this project, and for interactively visualising missense variants mapped to structures from the Protein Data Bank (PDB).