Application of machine learning, molecular modelling and structural data mining against antiretroviral drug resistance in HIV-1
- Authors: Sheik Amamuddy, Olivier Serge André
- Date: 2020
- Subjects: Machine learning , Molecules -- Models , Data mining , Neural networks (Computer science) , Antiretroviral agents , Protease inhibitors , Drug resistance , Multidrug resistance , Molecular dynamics , Renin-angiotensin system , HIV (Viruses) -- South Africa , HIV (Viruses) -- Social aspects -- South Africa , South African Natural Compounds Database
- Language: English
- Type: text , Thesis , Doctoral , PhD
- Identifier: http://hdl.handle.net/10962/115964 , vital:34282
- Description: Millions are affected by the Human Immunodeficiency Virus (HIV) worldwide, even though the death toll is on the decline. Antiretrovirals (ARVs), more specifically protease inhibitors, have shown tremendous success since their introduction into therapy in the mid-1990s by slowing down progression to the Acquired Immune Deficiency Syndrome (AIDS). However, Drug Resistance Mutations (DRMs) are constantly selected for due to viral adaptation, making drugs less effective over time. The current challenge is to manage the infection optimally with a limited set of drugs, with differing associated levels of toxicity, in the face of a virus that (1) exists as a quasispecies, (2) may transmit acquired DRMs to drug-naive individuals and (3) can manifest class-wide resistance due to similarities in design. The presence of latent reservoirs, unawareness of infection status, education and various socio-economic factors make the problem even more complex. Adequate timing and choice of drug prescription together with treatment adherence are very important, as drug toxicities, drug failure and sub-optimal treatment regimens leave room for further development of drug resistance. While CD4 cell count and the determination of viral load from patients in resource-limited settings are very helpful to track how well a patient’s immune system is able to keep the virus in check, they can take a long time to show whether an ARV is effective. Phenosense assay kits answer this problem using viruses engineered to contain the patient sequences and evaluating their growth in the presence of different ARVs, but this can be expensive and too involved for routine checks. As a cheaper and faster alternative, genotypic assays provide similar information from HIV pol sequences obtained from blood samples, inferring ARV efficacy on the basis of drug resistance mutation patterns.
However, these are inherently complex and the various methods of in silico prediction, such as Geno2pheno, REGA and Stanford HIVdb, do not always agree, even though this gap decreases as the list of resistance mutations is updated. A major gap in HIV treatment is that the information used for predicting drug resistance is mainly computed from data containing an overwhelming majority of subtype B HIV, even though this subtype comprises only about 12% of worldwide HIV infections. In addition to growing evidence that drug resistance is subtype-related, it is intuitive to hypothesize that, as subtyping is a phylogenetic classification, the more divergent a subtype is from the strains used in training prediction models, the less their resistance profiles would correlate. For these reasons, we used a multi-faceted approach against the virus. This research aimed to (1) improve resistance prediction methods by focusing solely on the available subtype, (2) mine structural information pertaining to resistance in order to find any exploitable weak points and increase knowledge of the mechanistic processes of drug resistance in HIV protease, and (3) screen for protease inhibitors amongst a database of natural compounds [the South African natural compound database (SANCDB)] to find molecules or molecular properties usable for improved inhibition of the drug target. In this work, structural information was mined using the Anisotropic Network Model, Dynamics Cross-Correlation, Perturbation Response Scanning, residue contact network analysis and the radius of gyration. These methods failed to give any resistance-associated patterns in terms of natural movement, internal correlated motions, residue perturbation response, relational behaviour and global compaction, respectively.
Applications of drug docking, homology modelling and energy minimization for generating features suitable for machine learning were not very promising, and rather suggest that the binding energies from Vina may not, by themselves, be very reliable quantitatively. All these failures led to a refinement that resulted in a highly sensitive, statistically-guided network construction and analysis, which led to key findings in the early dynamics associated with resistance across all PI drugs. The latter experiment revealed a conserved lateral expansion motion occurring at the flap elbows, and an associated contraction that drives the base of the dimerization domain towards the catalytic site’s floor in the case of drug resistance. Interestingly, we found that despite the conserved movement, bond angles were degenerate. In parallel, 16 Artificial Neural Network models were optimised for HIV protease and reverse transcriptase inhibitors, with performances on par with Stanford HIVdb. Finally, we prioritised 9 compounds with potential protease inhibitory activity using virtual screening and molecular dynamics (MD), and additionally suggested a promising modification to one of the compounds. This yielded another molecule that inhibited both open and closed receptor target conformations equally well, whereby each of the compounds had been selected against an array of multi-drug-resistant receptor variants. While a main hurdle was a lack of non-B subtype data, our findings, especially from the statistically-guided network analysis, may extrapolate to a certain extent to non-B subtypes, as the level of conservation was very high within subtype B, despite all the present variations. This network construction method lays down a sensitive approach for analysing a pair of alternate phenotypes for which complex patterns prevail, given a sufficient number of experimental units.
During the course of research, a weighted contact mapping tool was developed to compare renin-angiotensinogen variants and packaged as part of the MD-TASK tool suite. Finally, the functionality, compatibility and performance of the MODE-TASK tool were evaluated and confirmed for both Python 2.7.x and Python 3.x, for the analysis of normal modes from single protein structures and essential modes from MD trajectories. These techniques and tools collectively add to the conventional means of MD analysis.
- Full Text:
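The description above uses the radius of gyration as a measure of global compaction. As a minimal illustrative sketch (not the thesis's own code; the function name and the toy coordinates are assumptions made for this example), the mass-weighted radius of gyration of a set of atoms can be computed as follows:

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of a set of 3D points.

    coords: list of (x, y, z) tuples; masses: list of matching atomic masses.
    Hypothetical helper written for illustration only.
    """
    total_mass = sum(masses)
    # Centre of mass, computed one axis at a time.
    com = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    # Mass-weighted sum of squared distances from the centre of mass.
    weighted_sq = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)

# Toy input: two equal masses placed symmetrically about the origin.
rg = radius_of_gyration([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)], [1.0, 1.0])
```

In practice this value would be evaluated per trajectory frame, so that a lower value indicates a more compact structure.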
Computer aided approaches against Human African Trypanosomiasis
- Authors: Kimuda, Magambo Phillip
- Date: 2020
- Subjects: African trypanosomiasis , African trypanosomiasis -- Chemotherapy , Genomics , Macrophage migration inhibitory factor , Trypanosoma brucei , Pteridines , Tetrahydrofolate dehydrogenase , Adenylic acid , Molecular dynamics , Principal components analysis , Bioinformatics , Single nucleotide polymorphisms , Single Nucleotide Variants , Candidate Gene Association Study (CGAS)
- Language: English
- Type: Thesis , Doctoral , PhD
- Identifier: http://hdl.handle.net/10962/142542 , vital:38089
- Description: The thesis presented here is divided into two parts under a common theme: the use of computer-based tools, genomics, and in vitro experiments to develop innovative ways of tackling Human African Trypanosomiasis (HAT). Part I of this thesis focused on the human host genetic determinants while Part II focused on the discovery of novel chemotherapeutics against the parasite. Part I is further sub-divided into two parts. The first involves a Candidate Gene Association Study (CGAS) on an African population to identify genetic determinants associated with disease and/or susceptibility to HAT. The second involves studying the effects of missense Single Nucleotide Variants (SNVs) on protein structure, dynamics, and function using Macrophage Migration Inhibitory Factor (MIF) as a case study. Part II is also sub-divided into two parts. The first involves computer-based rational drug discovery of potential inhibitors against the Trypanosoma folate pathway, particularly by targeting Trypanosoma brucei Pteridine Reductase (TbPTR1), an enzyme used by trypanosomes to overcome T. brucei Dihydrofolate Reductase (TbDHFR) inhibition. The second involves the derivation of CHARMM force-field parameters that can be used to accurately model the geometry and dynamics of the T. brucei Phosphodiesterase B1 enzyme (TbrPDEB1) bimetallic active site center. The derived parameters were then used in MD simulations to characterise protein-ligand residue interactions that are important in TbrPDEB1 inhibition, with the goal of targeting the cyclic Adenosine Monophosphate (cAMP) signalling pathway. In the CGAS we were unable to detect any genetic associations in the Ugandan cohort analysed that passed correction for multiple testing, in spite of the study being sufficiently powered. Additionally, our study found no association of the Apolipoprotein 1 (APOL1) G2 allele with the previously reported protection against acute HAT.
Future investigations, for example Genome Wide Association Studies using larger sample sizes (>3000 cases and controls), are required. Macrophage migration inhibitory factor (MIF) is a cytokine that is important in both innate and adaptive immunity and has been shown to play a role in T. brucei pathogenicity using murine models. A total of 27 missense SNVs were modelled using homology modelling to create MIF protein mutants that were investigated using in silico effect prediction tools, molecular dynamics (MD), Principal Component Analysis (PCA), and Dynamic Residue Network (DRN) analysis. Our results demonstrate that mutations P2Q, I5M, P16Q, L23F, T24S, T31I, Y37H, H41P, M48V, P44L, G52C, S54R, I65M, I68T, S75F, N106S, and T113S caused significant conformational changes. Further, DRN analysis showed that residues P2, T31, Y37, G52, I65, I68, S75, N106, and T113 are part of a similar local residue interaction network with functional significance. These results show how polymorphisms such as missense SNVs can affect protein conformation, dynamics, and function. Trypanosomes are auxotrophic for folates and pterins, which they require for survival, and therefore scavenge them from their hosts. PTR1 is a multifunctional enzyme, unique to trypanosomatids, that reduces both pterins and folates. In the presence of DHFR inhibitors, PTR1 is over-expressed, thus providing an escape from the effects of DHFR inhibition. Both TbPTR1 and TbDHFR are pharmacologically and genetically validated drug targets. In this study 5742 compounds were screened using molecular docking, and 13 promising binding modes were further analysed using MD simulations. The trajectories were analysed using RMSD, Rg, RMSF, PCA, Essential Dynamics Analysis (EDA), Molecular Mechanics Poisson–Boltzmann surface area (MM-PBSA) binding free energy calculations, and DRN analysis.
The computational screening approach allowed us to identify five compounds, named RUBi004, RUBi007, RUBi014, RUBi016 and RUBi018, that exhibited antitrypanosomal growth activities against trypanosomes in culture with IC50 values of 12.5 ± 4.8 μM, 32.4 ± 4.2 μM, 5.9 ± 1.4 μM, 28.2 ± 3.3 μM, and 9.7 ± 2.1 μM, respectively. Further, when used in combination with WR99210, a known TbDHFR inhibitor, RUBi004, RUBi007, RUBi014 and RUBi018 showed antagonism while RUBi016 showed an additive effect. These results indicate that the four compounds might be competing with TbDHFR while RUBi016 might be more specific for TbPTR1. These compounds provide scaffolds that can be further optimised to improve their potency and specificity. Lastly, using a systematic approach we derived CHARMM force-field parameters to accurately describe the TbrPDEB1 bi-metal catalytic center. For dynamics, we employed a mixed bonded and non-bonded approach. We optimised the structure using a two-layer QM/MM ONIOM (B3LYP/6-31(g): UFF). The TbrPDEB1 bi-metallic center bonds, angles, and dihedrals were parameterized by fitting the energy profiles from Potential Energy Surface (PES) scans to the CHARMM potential energy function. The parameters were validated by means of MD simulations and analysed using RMSD, Rg, RMSF, hydrogen bonding, bond/angle/dihedral evaluations, EDA, PCA, and DRN analysis. The force-field parameters were able to accurately reproduce the geometry and dynamics of the TbrPDEB1 bi-metal catalytic center during MD simulations. Molecular docking was used to identify 6 potential hits that inhibited trypanosome growth in vitro. The derived force-field parameters were used to simulate the 6 protein-ligand complexes with the aim of elucidating crucial protein-ligand residue interactions. Using the most potent ligand, RUBi022, which had an IC50 of 14.96 μM, we were able to identify key residue interactions that can be of use in the in silico prediction of potential TbrPDEB1 inhibitors.
Overall, we demonstrate how bioinformatics tools can complement current disease eradication strategies. Future work will focus on identifying variants identified in Genome Wide Association Studies and partnering with wet labs to carry out further enzyme-ligand activity relationship studies, structure determination or characterisation of appropriate protein-ligand complexes by crystallography, and site-specific mutation studies.
- Full Text:
Cyclooxygenase-1 as an anti-stroke target: potential inhibitor identification and non-synonymous single nucleotide polymorphism analysis
- Authors: Muronzi, Tendai
- Date: 2020
- Subjects: Cerebrovascular disease , Cerebrovascular disease -- Treatment , Cerebrovascular disease -- Chemotherapy , Cyclooxygenases , High throughput screening (Drug development) , Drug development , Molecular dynamics , South African Natural Compounds Database , ZINC database
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: http://hdl.handle.net/10962/143404 , vital:38243
- Description: Stroke is the third leading cause of death worldwide, with 87% of cases being ischemic stroke. The two primary therapeutic strategies to reduce post-ischemic brain damage are cellular and vascular approaches. The vascular strategy aims to rapidly re-open obstructed blood vessels, while the cellular approach aims to interfere with the signalling pathways that facilitate neuron damage and death. Unfortunately, popular vascular treatments have adverse side effects, necessitating alternative chemotherapeutics. In this study, cyclooxygenase-1 (COX-1), which plays a significant role in post-ischemic neuroinflammation and neuronal death, was targeted for identification of novel drug compounds and to assess the effect of nsSNPs on its structure and function. In the drug discovery part, ligands from the South African Natural Compounds Database (SANCDB, https://sancdb.rubi.ru.ac.za/) and the ZINC database (http://zinc15.docking.org/) were used for high-throughput virtual screening (HTVS) against COX-1. Additionally, five nsSNPs were investigated to assess their impact on protein structure and function. Three of these SNPs were in the COX-1 dimer interface. Molecular docking and molecular dynamics simulations revealed the asymmetric nature of the protein. Several ligands, peculiar to each monomer, exhibited favourable binding energies in the respective active sites. SNP analysis indicated effects on inter-monomer interactions and protein stability.
- Full Text:
Prediction of mass spectra for natural products using an ab initio approach
- Authors: Novokoza, Yolanda
- Date: 2020
- Subjects: Molecular dynamics , Molecular dynamics -- Computer simulation , Mass spectroscopy , Electron impact ionization
- Language: English
- Type: text , Thesis , Masters , MSc
- Identifier: http://hdl.handle.net/10962/167166 , vital:41443
- Description: Mass spectrometry (MS) is a technique that measures the fragmentation of molecules, dependent on the molecule’s chemical composition and structure, by first introducing a charge on the molecules. The instrument records the mass to charge ratio, but the energy from the ionization process causes the molecule to fragment. The resultant mass spectrum is highly indicative of not only the molecule analyzed, but also its chemical composition. MS is used in research and industry for both routine and research purposes. One way to ionize molecules for MS is by bombarding the molecule with electrons, which is the basis of electron impact mass spectrometry (EIMS). Although EIMS is widely used, prediction of electron impact mass spectra from first principles is a challenging problem due to the need to accurately determine the probability of different fragmentation pathways of a molecule. Ab initio molecular dynamics based methods are able to explore the energetically available fragmentation paths in an automatic fashion and thus give reaction mechanisms in an unbiased way. The mass spectra of five molecules have been explored in work-flows leading to the prediction of mass spectra. These molecules include three natural products, alpha-hispanolol, PFB oxime derivative and boronolide (for which experimental mass spectra were not available), and two compounds from the NIST database (for which experimental mass spectra were available). For each of these systems many random conformations were generated using the RDKit library. Random velocities were applied to each atom of every conformation. Ab initio molecular dynamics was performed on each conformer, starting from these initial random velocities, using CP2K software at DFTB+ level at a variety of highly raised temperatures (to accelerate the formation of fragments). Fragmentation was monitored by iterating through all bonds and identifying bond breakages during dynamics.
Graph theoretical packages were used then to track distinct fragments generated. For each of these fragments, charges were determined from Mulliken analysis for all atoms on the fragment from the QM calculations and sum of atomic spin densities per fragment was also plotted. The fragment with the greatest charge (corresponding to the formation of a cation fragment) was taken for plotting on the mass spectrum. Finally, from the mass of the fragment and its elemental composition, the isotopic distribution for the fragment was determined, and this distribution was included by addition in to the mass spectrum. For all trajectories, the sum of all isotopic distributions determined the final mass spectrum.
- Full Text:
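The fragment-tracking step described in the abstract above (check every bond, group atoms into fragments, keep the most positively charged fragment) can be illustrated with a minimal sketch. This is not the thesis code: the function names, the single distance cutoff `max_length`, and the toy coordinates and charges are all assumptions made for illustration, and a plain depth-first search stands in for the graph theoretical packages mentioned in the abstract.

```python
import math
from collections import defaultdict


def intact_bonds(coords, bonds, max_length):
    """Keep only the bonds whose current length is within max_length.

    A single cutoff for all bond types is a simplifying assumption.
    """
    return [(i, j) for i, j in bonds
            if math.dist(coords[i], coords[j]) <= max_length]


def find_fragments(n_atoms, bonds):
    """Return the connected components of the bond graph.

    Each fragment is a sorted list of atom indices.
    """
    adjacency = defaultdict(set)
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)

    seen = set()
    fragments = []
    for start in range(n_atoms):
        if start in seen:
            continue
        # Depth-first search collects every atom reachable from `start`.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            atom = stack.pop()
            component.append(atom)
            for neighbour in adjacency[atom]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        fragments.append(sorted(component))
    return fragments


def most_charged_fragment(fragments, atomic_charges):
    """Pick the fragment with the largest summed atomic charge."""
    return max(fragments,
               key=lambda frag: sum(atomic_charges[a] for a in frag))


# Toy frame (hypothetical data): a five-atom chain in which the bond
# between atoms 2 and 3 has stretched far beyond the cutoff.
coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (2.2, 0.0, 0.0),
          (6.0, 0.0, 0.0), (7.1, 0.0, 0.0)]
initial_bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]
charges = [0.3, 0.4, 0.2, 0.05, 0.05]  # stand-in for Mulliken charges

fragments = find_fragments(
    len(coords), intact_bonds(coords, initial_bonds, max_length=1.8))
cation = most_charged_fragment(fragments, charges)
print(fragments, cation)
```

In a real workflow the same check would be repeated for every frame of every trajectory, and the selected fragment's isotopic distribution would then be accumulated into the spectrum as the abstract describes.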
A dynamics based analysis of allosteric modulation in heat shock proteins
- Authors: Penkler, David Lawrence
- Date: 2019
- Subjects: Heat shock proteins , Molecular chaperones , Allosteric regulation , Homeostasis , Protein kinases , Transcription factors , Adenosine triphosphatase , Cancer -- Chemotherapy , Molecular dynamics , High throughput screening (Drug development)
- Language: English
- Type: text , Thesis , Doctoral , PhD
- Identifier: http://hdl.handle.net/10962/115948 , vital:34273
- Description: The 70 kDa and 90 kDa heat shock proteins (Hsp70 and Hsp90) are molecular chaperones that play central roles in maintaining cellular homeostasis in all organisms of life with the exception of archaea. In addition to their general chaperone function in protein quality control, Hsp70 and Hsp90 cooperate in the regulation and activity of some 200 known natively folded protein clients which include protein kinases, transcription factors and receptors, many of which are implicated as key regulators of essential signal transduction pathways. Both chaperones are considered to be large multi-domain proteins that rely on ATPase activity and co-chaperone interactions to regulate their conformational cycles for peptide binding and release. The unique positioning of Hsp90 at the crossroads of several fundamental cellular pathways, coupled with its known association with diverse oncogenic peptide clients, has brought the molecular chaperone under increasing interest as a potential anti-cancer target that is crucially implicated in all eight hallmarks of the disease. Current orthosteric drug discovery efforts aimed at the inhibition of the ATPase domain of Hsp90 have been limited due to high levels of associated toxicity. In an effort to circumvent this, the combined focus of research efforts is shifting toward alternative approaches such as interference with co-chaperone binding and the allosteric inhibition/activation of the molecular chaperone. The overriding aim of this thesis was to demonstrate how the computational technique of perturbation response scanning (PRS), coupled with all-atom molecular dynamics (MD) simulations and dynamic residue interaction network (DRN) analysis, can be used as a viable strategy to efficiently scan and accurately identify allosteric control elements capable of modulating the functional dynamics of a protein.
In pursuit of this goal, this thesis also contributes to the current understanding of the nucleotide dependent allosteric mechanisms at play in cellular functionality of both Hsp70 and Hsp90. All-atom MD simulations of E. coli DnaK provided evidence of nucleotide driven modulation of conformational dynamics in both the catalytically active and inactive states. PRS analysis employed on these trajectories demonstrated sensitivity toward bound nucleotide and peptide substrate, and provided evidence of a putative allosterically active intermediate state between the ATPase active and inactive conformational states. Simultaneous binding of ATP and peptide substrate was found to allosterically prime the chaperone for interstate conversion regardless of the transition direction. Detailed analysis of these allosterically primed states revealed select residue sites capable of selecting a coordinate shift towards the opposite conformational state. In an effort to validate these results, the predicted allosteric hot spot sites were cross-validated with known experimental works and found to overlap with functional sites implicated in allosteric signal propagation and ATPase activation in Hsp70. This study presented, for the first time, the application of PRS as a suitable diagnostic tool for the elucidation and quantification of the allosteric potential of select residues to effect functionally relevant global conformational rearrangements. The PRS methodology described in this study was packaged within the Python programming environment in the MD-TASK software suite for command-line ease of use and made freely available. Homology modelling techniques were used to address the lack of experimental structural data for the human cytosolic isoform of Hsp90 and, for the first time, provided accurate full-length structural models of human Hsp90α in fully-closed and partially-open conformations.
Long-range all-atom MD simulations of these structures revealed nucleotide driven modulation of conformational dynamics in Hsp90. Subsequent DRN and PRS analysis of these MD trajectories allowed for the quantification and elucidation of nucleotide driven allosteric modulation in the molecular chaperone. A detailed PRS analysis revealed allosteric inter-domain coupling between the extreme termini of the chaperone in response to external force perturbations at either domain. Furthermore, PRS also identified several individual residue sites that are capable of selecting conformational rearrangements towards functionally relevant states, which may be considered to be putative allosteric target sites for future drug discovery efforts. Molecular docking techniques were employed to investigate the modulation of conformational dynamics of human Hsp90α in response to ligand binding interactions at two identified allosteric sites at the C-terminal. High throughput screening of a small library of natural compounds indigenous to South Africa revealed three hit compounds at these sites: Cephalostatin 17, 20(29)-Lupene-3β isoferulate and 3'-Bromorubrolide F. All-atom MD simulations on these protein-ligand complexes, coupled with DRN analysis and several advanced trajectory based analysis techniques, provided evidence of selective allosteric modulation of Hsp90α conformational dynamics in response to the identity and location of the bound ligands. Ligands bound at the four-helix bundle presented as putative allosteric inhibitors of Hsp90α, driving conformational dynamics in favour of dimer opening and possibly dimer separation. Meanwhile, ligand interactions at an adjacent sub-pocket located near the interface between the middle and C-terminal domains demonstrated allosteric activation of the chaperone, modulating conformational dynamics in favour of the fully-closed catalytically active conformational state.
Taken together, the data presented in this thesis contribute to the understanding of allosteric modulation of conformational dynamics in Hsp70 and Hsp90, and provide a suitable platform for future biochemical and drug discovery studies. Furthermore, the molecular docking and computational identification of allosteric compounds with suitable binding affinity for allosteric sites at the CTD of human Hsp90α provide, for the first time, “proof-of-principle” for the use of PRS in conjunction with MD simulations and DRN analysis as a suitable method for the rapid identification of allosteric sites in proteins that can be probed by small molecule interaction. The data presented in this section could pave the way for future allosteric drug discovery studies for the treatment of Hsp90 associated pathologies.
- Full Text:
Targeting allosteric sites of Escherichia coli heat shock protein 70 for antibiotic development
- Authors: Okeke, Chiamaka Jessica
- Date: 2019
- Subjects: Heat shock proteins , Escherichia coli , Allosteric proteins , Antibiotics , Molecular chaperones , Ligands (Biochemistry) , Molecular dynamics , Principal components analysis , South African Natural Compounds Database
- Language: English
- Type: text , Thesis , Masters , MSc
- Identifier: http://hdl.handle.net/10962/115998 , vital:34287
- Description: Hsp70s are members of the heat shock protein family with a molecular weight of 70 kDa and are the most abundant group in bacterial and eukaryotic systems, hence the most extensively studied ones. These proteins are molecular chaperones that play a significant role in protein homeostasis by facilitating appropriate folding of proteins and preventing proteins from aggregating and misfolding. They are also involved in translocation of proteins into subcellular compartments and protection of cells against stress. Stress caused by environmental or biological factors affects the functionality of the cell. In response to these stressful conditions, up-regulation of Hsp70s ensures that the cells are protected by balancing out unfolded proteins, giving them ample time to repair denatured proteins. Hsp70s are connected to numerous illnesses such as autoimmune and neurodegenerative diseases, bacterial infection, cancer, malaria, and obesity. The multi-functional nature of Hsp70s predisposes them as promising therapeutic targets. Hsp70s play vital roles in various cell development and survival pathways; therefore, targeting these proteins will provide a new avenue towards the discovery of active therapeutic agents for the treatment of a wide range of diseases. Allosteric sites of these proteins in their multi-conformational states have not been explored for inhibitory properties, hence the aim of this study. This study aims at identifying allosteric sites that inhibit the ATPase and substrate binding activities using computational approaches. Using E. coli as a model organism, molecular docking for high throughput virtual screening was carried out using 623 compounds from the South African Natural Compounds Database (SANCDB; https://sancdb.rubi.ru.ac.za/) against identified allosteric sites. Ligands with the highest binding affinity (good binders) interacting with critical allosteric residues that are druggable were identified.
Molecular dynamics (MD) simulation was also performed on the identified hits to assess protein-inhibitor complex stability. Finally, principal component analysis (PCA) was performed to understand the structural dynamics of the ligand-free and ligand-bound structures during MD simulation.
- Full Text:
Bioinformatics tool development with a focus on structural bioinformatics and the analysis of genetic variation in humans
- Authors: Brown, David K
- Date: 2018
- Subjects: Bioinformatics , Human genetics -- Variation , High performance computing , Workflow management systems , Molecular dynamics , Next generation sequencing , Human Mutation Analysis (HUMA)
- Language: English
- Type: text , Thesis , Doctoral , PhD
- Identifier: http://hdl.handle.net/10962/60708 , vital:27820
- Description: This thesis is divided into three parts, united under the general theme of bioinformatics tool development and variation analysis. Part 1 describes the design and development of the Job Management System (JMS), a workflow management system for high performance computing (HPC). HPC has become an integral part of bioinformatics. Computational methods for molecular dynamics and next generation sequencing (NGS) analysis, which require complex calculations on large datasets, are not yet feasible on desktop computers. As such, powerful computer clusters have been employed to perform these calculations. However, making use of these HPC clusters requires familiarity with command line interfaces. This excludes a large number of researchers from taking advantage of these resources. JMS was developed as a tool to make it easier for researchers without a computer science background to make use of HPC. Additionally, JMS can be used to host computational tools and pipelines and generates both web-based interfaces and RESTful APIs for those tools. The web-based interfaces can be used to quickly and easily submit jobs to the underlying cluster. The RESTful web API, on the other hand, allows JMS to provide backend functionality for external tools and web servers that want to run jobs on the cluster. Numerous tools and workflows have already been added to JMS, several of which have been incorporated into external web servers. One such web server is the Human Mutation Analysis (HUMA) web server and database. HUMA, the topic of part 2 of this thesis, is a platform for the analysis of genetic variation in humans. HUMA aggregates data from various existing databases into a single, connected and related database. The advantages of this are realized in the powerful querying abilities that it provides. HUMA includes protein, gene, disease, and variation data and can be searched from the angle of any one of these categories.
For example, searching for a protein will return the protein data (e.g. protein sequences, structures, domains and families, and other meta-data). However, the related nature of the database means that genes, diseases, variation, and literature related to the protein will also be returned, giving users a powerful and holistic view of all data associated with the protein. HUMA also provides links to the original sources of the data, allowing users to follow the links to find additional details. HUMA aims to be a platform for the analysis of genetic variation. As such, it also provides tools to visualize and analyse the data (several of which run on the underlying cluster, via JMS). These tools include alignment and 3D structure visualization, homology modeling, variant analysis, and the ability to upload custom variation datasets and map them to proteins, genes and diseases. HUMA also provides collaboration features, allowing users to share and discuss datasets and job results. Finally, part 3 of this thesis focused on the development of a suite of tools, MD-TASK, to analyse genetic variation at the protein structure level via network analysis of molecular dynamics simulations. The use of MD-TASK in combination with the tools developed in the previous parts of this thesis is showcased via the analysis of variation in the renin-angiotensinogen complex, a vital part of the renin-angiotensin system.
- Full Text:
In silico study of Plasmodium 1-deoxy-D-xylulose 5-phosphate reductoisomerase (DXR) for identification of novel inhibitors from SANCDB
- Authors: Diallo, Bakary N'tji
- Date: 2018
- Subjects: Plasmodium 1-deoxy-D-xylulose 5-phosphate reductoisomerase , Isoprenoids , Plasmodium , Antimalarials , Malaria -- Chemotherapy , Molecules -- Models , Molecular dynamics , South African Natural Compounds Database
- Language: English
- Type: text , Thesis , Masters , MSc
- Identifier: http://hdl.handle.net/10962/64012 , vital:28523
- Description: Malaria remains a major health concern with a complex parasite constantly developing resistance to the different drugs introduced to treat it, threatening the efficacy of the current ACT treatment recommended by WHO (World Health Organization). Different antimalarial compounds with different mechanisms of action are ideal as this decreases chances of resistance occurring. Inhibiting DXR and consequently the MEP pathway is a good strategy to find a new antimalarial with a novel mode of action. From literature, all the enzymes of the MEP pathway have also been shown to be indispensable for the synthesis of isoprenoids. They have been validated as drug targets and the X-ray structure of each of the enzymes has been solved. DXR is a protein which catalyses the second step of the MEP pathway. There are currently 255 DXR inhibitors in the Binding Database (accessed November 2017), generally based on the fosmidomycin structural scaffold and thus often showing poor drug-likeness properties. This study aims to search for new DXR inhibitors using in silico techniques. We analysed the protein sequence and built 3D models in closed and open conformations for the different Plasmodium sequences. Then SANCDB compounds were screened to identify new potential DXR inhibitors with new chemical scaffolds. Finally, the identified hits were submitted to molecular dynamics studies, preceded by a parameterization of the manganese atom in the protein active site.
- Full Text:
The investigation of type-specific features of the copper coordinating AA9 proteins and their effect on the interaction with crystalline cellulose using molecular dynamics studies
- Authors: Moses, Vuyani
- Date: 2018
- Subjects: Copper proteins , Cellulose , Molecular dynamics , Cellulose -- Biodegradation , Bioinformatics
- Language: English
- Type: text , Thesis , Doctoral , PhD
- Identifier: http://hdl.handle.net/10962/58327 , vital:27230
- Description: AA9 proteins are metallo-enzymes which are crucial for the early stages of cellulose degradation. AA9 proteins have been suggested to cleave the glycosidic bonds linking cellulose through the use of their Cu2+ coordinating active site. AA9 proteins differ in regioselectivity, depending on the cleavage they produce, and are grouped accordingly. Type 1 AA9 proteins cleave at the C1 carbon of cellulose, Type 2 AA9 proteins cleave at the C4 carbon, and Type 3 AA9 proteins cleave at either the C1 or the C4 carbon. The steric congestion of the AA9 active site has been proposed to contribute to the observed regioselectivity. As such, a bioinformatics characterisation of type-specific sequence and structural features was performed. Initially, AA9 protein sequences were obtained from the Pfam database and a multiple sequence alignment was performed. The sequences were phylogenetically characterised and grouped into their respective types, and sub-groups were identified. A selection analysis was performed on the AA9 LPMO types to determine the selective pressure acting on AA9 protein residues. Motif discovery was then performed to identify conserved sequence motifs in AA9 proteins. Once type-specific sequence features were identified, structural mapping was performed to assess possible effects on substrate interaction. Physicochemical property analysis was also performed to assess biochemical differences between AA9 LPMO types. Molecular dynamics (MD) simulations were then employed to dynamically assess the consequences of the discovered type-specific features on AA9-cellulose interaction. Due to the absence of AA9-specific force field parameters, MD simulations were not readily applicable. As a result, Potential Energy Surface (PES) scans were performed to evaluate the force field parameters for the AA9 active site using the PM6 semi-empirical approach and least squares fitting.
A Type 1 AA9 active site was constructed from the crystal structure 4B5Q, encompassing only the Cu2+ coordinating residues, the Cu2+ ion and two water residues. Due to the similarity in AA9 active sites, the Type 1 force field parameters were validated on all three AA9 LPMO types. Two MD simulations for each AA9 LPMO type were conducted using two separate Lennard-Jones parameter sets. Once completed, the MD trajectories were analysed for various features including the RMSD, RMSF, radius of gyration, coordination during simulation, hydrogen bonding, secondary structure conservation and overall protein movement. Force field parameters were successfully evaluated and validated for AA9 proteins. MD simulations of AA9 proteins revealed unique type-specific binding modes of AA9 active sites to cellulose. These binding modes were characterised by type-specific loops which were present in Type 2 and 3 AA9 proteins but not in Type 1 AA9 proteins. The loops were found to result in steric congestion that affects how the Cu2+ ion interacts with cellulose. As a result, Cu2+ binding to cellulose was observed for Type 1 but not for Type 2 and 3 AA9 proteins. In this study, force field parameters have been evaluated for the Type 1 active site of AA9 proteins, and these parameters were evaluated on all three types and their binding to cellulose. Future work will focus on identifying the nature of the reactive oxygen species and performing QM/MM calculations to elucidate the reaction mechanism of all three AA9 LPMO types.
- Full Text:
Structural studies on yeast eIF5A using biomolecular NMR and molecular dynamics
- Authors: Sigauke, Lester Takunda
- Date: 2015
- Subjects: Molecular dynamics , Reverse transcriptase , HIV (Viruses) , HIV infections , Eukaryotic cells , Yeast
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:4547 , http://hdl.handle.net/10962/d1017927
- Description: Eukaryotic initiation factor 5A, eIF5A, is a ubiquitous eukaryotic protein that has been shown to influence the translation initiation of a specific subset of mRNAs. It is the only protein known to undergo hypusination, a two-step post-translational modification process involving the deoxyhypusine synthase (DHS) and deoxyhypusine hydroxylase (DOHH) enzymes. Hypusination has been shown to influence translation of HIV-1 and HTLV-1 nuclear export signals, while the involvement of active hypusinated eIF5A in the induction of IRES-mediated processes that initiate pro-apoptotic processes has inspired studies into the manipulation of eIF5A in anti-cancer and anti-diabetic therapies. eIF5A oligomerisation in eukaryotic systems has been shown to be influenced by hypusination, and the mechanism of dimerisation is RNA dependent. Nuclear magnetic resonance spectroscopy approaches were proposed to solve the structure of hypusinated eIF5A in solution in order to understand the influence of hypusination on the monomeric arrangement which enhances dimerisation and activates the protein. Cleavage of the 18 kDa protein monomer, by introduction of a thrombin cleavage site within the flexible domain, was thought to give rise to 10 kDa fragments accessible to a 600 MHz NMR spectrometer. Heteronuclear single quantum correlation experiments on the mutated, isotopically labelled protein expressed in E. coli showed that the eIF5A protein with a thrombin cleavage insert, eIF5AThr (eIF5A subscript Thr), was unfolded. In silico investigations of the behaviour of eIF5A and eIF5AThr models in solution using molecular dynamics showed that the mutated model had different solution dynamics to the native model.
Chemical shift predictors were used to extract atomic resolution data on solution dynamics, and the introduction of rigidity in the flexible loop region of eIF5A affected solution behaviour, consistent with the lack of in vivo function of eIF5AThr (eIF5A subscript Thr) in yeast. Residual dipolar couplings and T₁ relaxation times were calculated in anticipation of the extraction of experimental data from RDC and relaxation dispersion experiments based on HSQC measurable restraints.
- Full Text: