Application of machine learning, molecular modelling and structural data mining against antiretroviral drug resistance in HIV-1
- Authors: Sheik Amamuddy, Olivier Serge André
- Date: 2020
- Subjects: Machine learning , Molecules -- Models , Data mining , Neural networks (Computer science) , Antiretroviral agents , Protease inhibitors , Drug resistance , Multidrug resistance , Molecular dynamics , Renin-angiotensin system , HIV (Viruses) -- South Africa , HIV (Viruses) -- Social aspects -- South Africa , South African Natural Compounds Database
- Language: English
- Type: text , Thesis , Doctoral , PhD
- Identifier: http://hdl.handle.net/10962/115964 , vital:34282
- Description: Millions of people worldwide are affected by the Human Immunodeficiency Virus (HIV), even though the death toll is on the decline. Antiretrovirals (ARVs), and more specifically protease inhibitors, have shown tremendous success since their introduction into therapy in the mid-1990s by slowing down progression to the Acquired Immune Deficiency Syndrome (AIDS). However, Drug Resistance Mutations (DRMs) are constantly selected for due to viral adaptation, making drugs less effective over time. The current challenge is to manage the infection optimally with a limited set of drugs, with differing associated levels of toxicity, in the face of a virus that (1) exists as a quasispecies, (2) may transmit acquired DRMs to drug-naive individuals and (3) can manifest class-wide resistance due to similarities in design. The presence of latent reservoirs, unawareness of infection status, education and various socio-economic factors make the problem even more complex. Adequate timing and choice of drug prescription, together with treatment adherence, are very important, as drug toxicities, drug failure and sub-optimal treatment regimens leave room for further development of drug resistance. While CD4 cell count and the determination of viral load from patients in resource-limited settings are very helpful to track how well a patient’s immune system is able to keep the virus in check, they can be slow to reveal whether an ARV is effective. Phenosense assay kits answer this problem using viruses engineered to contain the patient sequences and evaluating their growth in the presence of different ARVs, but this can be expensive and too involved for routine checks. As a cheaper and faster alternative, genotypic assays provide similar information from HIV pol sequences obtained from blood samples, inferring ARV efficacy on the basis of drug resistance mutation patterns. 
However, these patterns are inherently complex and the various methods of in silico prediction, such as Geno2pheno, REGA and Stanford HIVdb, do not always agree, even though this gap decreases as the list of resistance mutations is updated. A major gap in HIV treatment is that the information used for predicting drug resistance is mainly computed from data containing an overwhelming majority of subtype B HIV, even though this subtype comprises only about 12% of worldwide HIV infections. In addition to growing evidence that drug resistance is subtype-related, it is intuitive to hypothesize that, as subtyping is a phylogenetic classification, the more divergent a subtype is from the strains used in training prediction models, the less their resistance profiles would correlate. For the aforementioned reasons, we used a multi-faceted approach to attack the virus. This research aimed to (1) improve resistance prediction methods by focusing solely on the available subtype and (2) mine structural information pertaining to resistance in order to find any exploitable weak points and increase knowledge of the mechanistic processes of drug resistance in HIV protease. Finally, (3) we screened for protease inhibitors amongst a database of natural compounds [the South African natural compound database (SANCDB)] to find molecules or molecular properties that could be used to improve inhibition of the drug target. In this work, structural information was mined using the Anisotropic Network Model, Dynamics Cross-Correlation, Perturbation Response Scanning, residue contact network analysis and the radius of gyration. These methods failed to give any resistance-associated patterns in terms of natural movement, internal correlated motions, residue perturbation response, relational behaviour and global compaction, respectively. 
Applications of drug docking, homology modelling and energy minimization for generating features suitable for machine learning were not very promising, and rather suggest that binding energies from Vina may not, by themselves, be very reliable quantitatively. These failures led to a refinement that resulted in a highly sensitive, statistically-guided network construction and analysis, which led to key findings in the early dynamics associated with resistance across all PI drugs. The latter experiment revealed a conserved lateral expansion motion occurring at the flap elbows, and an associated contraction that drives the base of the dimerization domain towards the catalytic site’s floor in the case of drug resistance. Interestingly, we found that despite the conserved movement, bond angles were degenerate. Alongside this, 16 Artificial Neural Network models were optimised for HIV protease and reverse transcriptase inhibitors, with performances on par with Stanford HIVdb. Finally, we prioritised 9 compounds with potential protease inhibitory activity using virtual screening and molecular dynamics (MD), and additionally suggested a promising modification to one of the compounds. This yielded another molecule inhibiting both open and closed receptor target conformations equally well, whereby each of the compounds had been selected against an array of multi-drug-resistant receptor variants. While a main hurdle was a lack of non-B subtype data, our findings, especially from the statistically-guided network analysis, may extrapolate to non-B subtypes to a certain extent, as the level of conservation was very high within subtype B despite all the variations present. This network construction method lays down a sensitive approach for analysing a pair of alternate phenotypes for which complex patterns prevail, given a sufficient number of experimental units. 
During the course of the research, a weighted contact mapping tool was developed to compare renin-angiotensinogen variants and packaged as part of the MD-TASK tool suite. Finally, the functionality, compatibility and performance of the MODE-TASK tool were evaluated and confirmed for both Python 2.7.x and Python 3.x, for the analysis of normal modes from single protein structures and essential modes from MD trajectories. These techniques and tools collectively add to the conventional means of MD analysis.
Synthesis of novel heterocyclic systems as potential inhibitors of HIV-1 enzymes
- Authors: Sekgota, Khethobole Cassius
- Date: 2020
- Subjects: Protease inhibitors , Heterocyclic compounds , HIV (Viruses) , Quinoline , Amides , Nuclear magnetic resonance , Antiretroviral agents , AIDS vaccines , Nitrobenzaldehyde , Propylphosphonic acid anhydride
- Language: English
- Type: Thesis , Doctoral , PhD
- Identifier: http://hdl.handle.net/10962/146502 , vital:38531
- Description: This study has focussed on the application of Baylis-Hillman methodology in the development of efficient synthetic pathways to libraries of novel 3-[(N-cycloalkylbenzamido)methyl]-2-quinolones and indolizine-2-carboxamides and on an exploration of their medicinal potential. The approach to 3-[(N-cycloalkylbenzamido)methyl]-2(1H)-quinolones involved a six-step pathway comprising: Baylis-Hillman reaction of 2-nitrobenzaldehyde derivatives and methyl acrylate to afford nitro-Baylis-Hillman adducts; thermal cyclisation of the adducts to give a range of 3-(acetoxymethyl)-2(1H)-quinolones in good to excellent yields; hydrolysis of the acetates; conversion of the resulting alcohols to the 3-chloromethyl analogues; amination; and, finally, acylation to afford the target amides. Variable-temperature NMR methods were used to facilitate analysis of the ¹H and ¹³C NMR spectra, which were complicated by internal rotation and cycloalkyl ring-flipping effects. The indolizine-2-carboxamides, on the other hand, were obtained in several steps commencing with the Baylis-Hillman reaction of pyridine-2-carboxaldehyde and methyl acrylate. Thermal cyclisation of the Baylis-Hillman adduct afforded indolizine esters, hydrolysis of which gave the corresponding acids, which served as precursors to the target indolizine-2-carboxamides. The final amidation step, however, proved to be particularly challenging. Various coupling strategies were explored to access indolizine-2-carboxamides. These included the use of 2,2,2-trifluoroethyl borate, which showed limited promise, but propylphosphonic acid anhydride (T3P) proved to be the most effective coupling agent, permitting the formation of 24 novel indolizine-2-carboxamides from hydrazines, aliphatic amines and a range of heterocyclic amines. 
A high-field NMR-based kinetic study of the mechanism of the Baylis-Hillman reaction of pyridine-4-carboxaldehyde and methyl acrylate in the presence of 3-hydroxyquinuclidine in deuterated chloroform was initiated, reaction progress being followed by the automated collection of ¹H and DEPT 135 NMR spectra over ca. 24 hours using a 600 MHz NMR instrument. The results have provided critical new insights into the mechanism. NMR analysis has also been used to elucidate the multiplicity of signals associated with rotameric equilibria observed at ambient probe temperature. Variable-temperature 1D- and 2D-NMR spectra were used to facilitate the unambiguous characterisation of the 2-quinolone benzamides and some of the indolizine-2-carboxamides. The 3-[(N-cycloalkylbenzamido)methyl]-2(1H)-quinolones, together with selected precursors, and a number of the indolizine-2-carboxamides have been screened in vitro as potential HIV-1 enzyme inhibitors. A survey of the activity of the 2-quinolones against HIV-1 integrase, protease and reverse transcriptase revealed selective inhibition of HIV-1 integrase (IN), with the most active IN inhibitor, 3-[(cyclopentylamino)methyl]-6-methoxy-2(1H)-quinolone 115e, producing residual enzyme activity of 40% at a concentration of 20 μM. Many of the 2-quinolones exhibited no significant cytotoxicity against HEK 293 cells at 20 μM concentrations. 3-[(N-Cyclohexylamino)methyl]-6-methoxy-2(1H)-quinolone 114e was the only compound to exhibit anti-plasmodial activity (55% pfLDH activity). The survey of indolizine-2-carboxamides also revealed encouraging inhibition of HIV-1 integrase. None of these compounds exhibited cytotoxicity at 20 μM against HEK 293 cells, while a number of them exhibited some activity against Plasmodium falciparum (3D7 strain) and Trypanosoma brucei. Selected indolizine-2-carboxamides exhibited significant anti-tubercular activity in the 7H9 CAS GLU Tx and 7H9 ADC GLU Tw media. 
In view of the inherent fluorescent character and biological potential of the synthesised indolizine-2-carboxamides, their photophysical properties were explored to establish their possible dual use as bio-imaging and therapeutic agents. The major absorption and corresponding emission bands, and the associated molar absorption coefficients (ε), expressed in the form of log ε, were determined. Their high extinction coefficients, large Stokes shifts and red-shifted emissions in the visible region indicate their potential for use as fluorophores.