Title Uporaba AI alata AlphaFolda 2 za modeliranje strukture proteina
Title (english) The use of the AI tool AlphaFold 2 for protein structure modeling
Author Ivana Dilber
Mentor Višnja Stepanić (mentor)
Mentor Nela Malatesti (co-mentor)
Committee member Višnja Stepanić (committee member)
Committee member Nela Malatesti (committee member)
Committee member Željka Maglica (committee chair)
Committee member Ivan Gudelj (committee member)
Committee member Mladen Merćep (committee member)
Granter University of Rijeka (Faculty of Biotechnology and Drug Development) Rijeka
Defense date and country 2023-10-26, Croatia
Scientific / art field, discipline and subdiscipline BIOTECHNICAL SCIENCES Biotechnology
Abstract Proteins are biological macromolecules composed of amino acids linked by peptide bonds. Their three-dimensional (3D) structures remain challenging to determine, and the number of proteins with resolved tertiary structures is small compared to the number of known protein sequences. The 3D structures of proteins are essential for understanding their function, and thus the biological processes underlying health and disease. A 3D protein structure allows us to identify "binding pockets" and functionally relevant regions of the protein. Innovative approaches have recently been developed for the fast determination of protein conformations, including computer algorithms that predict the 3D structure of a protein from its primary polypeptide sequence. In this thesis, we use AlphaFold 2, open-source software that uses available protein datasets and artificial intelligence (AI), to predict the 3D structures of proteins. AlphaFold 2 structure models were analyzed for randomly generated amino acid sequences and for the well-known industrial biocatalysts, the halohydrin dehalogenases HheC and HheA. The random sequences were generated with the tool RandSeq, while the FASTA inputs for HheA and HheC were derived from the crystal structures 1ZMO and 1ZMT, respectively, downloaded from the Protein Data Bank (PDB). The AlphaFold 2 conformations were analyzed using the visualization software PyMOL and ChimeraX. As expected, AlphaFold 2 could not reliably predict the structures of the random sequences, whereas the structures of the enzymes HheA and HheC in their monomeric and tetrameric states were predicted with high reliability. However, the structural peculiarity of the HheC tetramer, in which the C-terminal tail enters the diagonal subunit, was not predicted. This study shows that AlphaFold 2 structures can be good starting conformations for molecular dynamics simulations, while their use in molecular docking calculations should be approached with caution.
Abstract (croatian) Proteini su biološke makromolekule sastavljene od aminokiselina povezanih peptidnim vezama. Njihove trodimenzionalne (3D) strukture još uvijek je teško odrediti, a broj proteina s riješenim tercijarnim strukturama prilično je malen u usporedbi s brojem poznatih proteinskih sekvenci. 3D strukture proteina bitne su za razumijevanje njihove funkcije, a time i bioloških procesa koji upravljaju zdravljem i bolestima. 3D proteinska struktura omogućuje nam identificiranje "veznih džepova" i funkcionalno relevantnih regija proteina. Danas se razvijaju inovativni pristupi za brzo određivanje konformacija proteina, a to uključuje i računalne algoritme koji predviđaju 3D strukturu proteina iz primarne sekvence polipeptida. U ovom diplomskom radu koristimo AlphaFold 2, softver otvorenog koda koji koristi dostupne skupove podataka o proteinima i umjetnu inteligenciju (AI) za predviđanje 3D strukture proteina. U ovoj studiji pomoću AlphaFolda 2 predviđene su strukture za nasumično generirane sekvence aminokiselina i za dobro poznate industrijske biokatalizatore halohidrin dehalogenaze HheC i HheA. Nasumične sekvence generirao je alat RandSeq, dok su FASTA ulazi za HheA i HheC formirani iz kristalnih struktura 1ZMO, odnosno 1ZMT, preuzetih iz baze Protein Data Bank (PDB). Predviđene konformacije analizirane su pomoću softvera za vizualizaciju PyMOL i ChimeraX. Iako AlphaFold 2 nije mogao pouzdano predvidjeti strukture nasumičnih sekvenci, kao što se i očekivalo, strukture enzima HheA i HheC u njihovim monomernim i tetramernim stanjima predviđene su s visokom pouzdanošću. Međutim, AlphaFold 2 nije predvidio strukturnu osobitost ulaska C-terminalnog repa u dijagonalnu podjedinicu tetramera HheC. Ova studija pokazuje da strukture predviđene AlphaFoldom 2 mogu biti dobre početne konformacije za simulacije molekulske dinamike, dok njihovu upotrebu za izračune molekulskog uklapanja treba uzeti s oprezom.
Keywords
AlphaFold 2
3D protein structure
haloalcohol/halohydrin dehalogenase
HheA
HheC
Keywords (croatian)
AlphaFold 2
3D struktura proteina
haloalkohol/halohidrin dehalogenaze
HheA
HheC
Language english
URN:NBN urn:nbn:hr:193:077207
Study programme Title: Biotechnology in medicine; Study programme type: university; Study level: graduate; Academic / professional title: magistar/magistra biotehnologije u medicini (Master of Biotechnology in Medicine)
Type of resource Text
File origin Born digital
Access conditions Open access
Terms of use
Created on 2023-11-22 19:53:20