Title Amino acid variation in human proteome
Title (croatian) Aminokiselinska raznolikost ljudskog proteoma
Author Kristijan Vuković
Mentor Tomica Hrenar (mentor)
Mentor Michael Inouye https://orcid.org/0000-0001-9413-6520 (mentor)
Committee member Tomica Hrenar (committee chair)
Committee member Zlatko Mihalić (committee member)
Committee member Marko Močibob (committee member)
Committee member Tajana Preočanin (committee member)
Granter University of Zagreb Faculty of Science (Department of Chemistry) Zagreb
Defense date and country 2016-09-30, Croatia
Scientific / art field, discipline and subdiscipline NATURAL SCIENCES Chemistry
Abstract Amino acids are important biological molecules that influence human health and disease
through their function as structural units of proteins. Recent advances in genome
sequencing technologies have enabled an indirect, large-scale exploration of human proteomic
diversity by mapping nucleotide sequences in protein-coding genomic regions to their
corresponding amino acid sequences in the proteome. In this thesis, data from the 1000
Genomes and several other sequencing projects were used to construct amino acid
substitution maps that describe the occurrence frequency and pathogenicity of substitutions.
The biochemical structural classification of amino acids was identified as an important element
in predicting both of these, with most class transitions showing an inversely proportional
relationship between the two. The codon distribution of the underlying genetic code was
compared with the substitution maps and proved insufficient to account for the observed
frequencies. The accuracy of SIFT and PolyPhen, two bioinformatics tools used in the
1000 Genomes Project, in classifying pathogenic variants was assessed, and
frequency-resolved substitution maps for the 5 population groups defined in the project were
constructed and analyzed. Finally, Trp → Ser, a substitution that consistently showed a high
pathogenicity signal, was further explored through a series of molecular dynamics simulations.
Abstract (translated from Croatian) Amino acids are important biological molecules that take part in building proteins as their monomeric units.
Recent developments in genome sequencing technologies have made it possible to determine their sequences indirectly,
by mapping the nucleotide sequences of protein-coding genomic regions to the corresponding amino acid
sequences of the proteome. This approach is many times faster and cheaper than direct protein sequencing, and the new
technologies have revolutionized the use of the genetic code in medical and scientific research. In
this thesis, data from the 1000 Genomes Project and several similar projects were used to construct maps of
amino acid substitutions with respect to their frequency and pathogenicity. The biochemical classification of
amino acids proved key to explaining the results, and transitions between structural groups
in these two maps showed a significant inversely proportional correlation that is consistent with evolutionary
selective pressures. The codon distribution of individual amino acids proved insufficient to explain
the resulting substitution maps. With respect to nucleotide position within the codon, increased
pathogenicity was observed for amino acid substitutions caused by a change at the 2nd nucleotide, and decreased
pathogenicity for substitutions caused by a change at the 3rd nucleotide. Two bioinformatics tools for classifying the
pathogenicity of amino acid substitutions, both used in the 1000 Genomes Project, were tested, and PolyPhen proved
somewhat better than SIFT at detecting pathogenic substitutions. In addition, frequency-resolved
substitution maps were constructed and analyzed for the 5 population groups defined in that project. Finally, the
Trp → Ser substitution, which showed a significant pathogenicity signal in several analyses, was structurally analyzed
in more detail through a series of molecular dynamics simulations. The simulated proteins were those in which this
substitution had been detected with a reliable pathogenicity classification.
Keywords
1000 Genomes Project
bioinformatics
human proteome
molecular dynamics
sequencing technologies
statistical hypothesis testing
Keywords (croatian)
projekt 1000 Genomes
bioinformatika
humani proteom
molekulska dinamika
sekvenciranje genoma
statističko testiranje hipoteza
Language English
URN:NBN urn:nbn:hr:217:538899
Study programme Title: Graduate university studies in chemistry; specializations in: Research programme; Course: Research programme; Study programme type: university; Study level: graduate; Academic / professional title: magistar/magistra kemije (magistar/magistra kemije)
Type of resource Text
File origin Born digital
Access conditions Open access; Embargo expiration date: 2017-09-30
Terms of use
Created on 2017-03-14 17:03:10