Chlorobenzene as a new halogen bond probe in molecular dynamics simulations

By Zoe Greenburg

Non-covalent bonds play a crucial role in chemistry, biochemistry, and biology. H-bonds, and hydrophobic bonds at the core of protein folding, and govern reversible interactions between biomolecules and their ligands. In drug discovery, H-bonding and hydrophobic interactions define the affinity and selectivity of small molecules for their biological targets.

With the intent to better understand these interactions and their relevance in the design of small molecule modulators of biological function, several computational solvent-mapping methods have been developed over the past few years. Schrodinger introduced WaterMap, a method that allows understanding the locations and thermodynamics of water molecules that solvate binding sites, and guiding the optimization of small molecule hits; conserved, strongly bound water molecules are costly to displace but can be useful bridges. An alternative method for understanding the role of water in ligand binding, which uses all-atom molecular dynamics (MD), and that can be accessed through this company, was developed by De Fabritiis et al. in 2011. Mapping of other solvents like ACN, propane, benzene, and IPA has also been demonstrated with MD simulations of binary mixtures, and been put in the context of target druggability assessment.

A new category of substrate to study: organohalogens

Halogens are common substituents in drugs as halogen bonding interactions can offer key boosts in selectivity and potency. Initially attributed to van der Waals forces, this is now explained in terms of the halogen bond, another type of non-covalent interaction that in recent years has been exploited for many applications in synthetic, supramolecular and medicinal chemistry.

Anisotropy in halogen electron charge distribution allows halogens to interact with donors such as Lewis base centers and π-surfaces, something that, if one is to consider electron density alone, can be rather counter intuitive. Halogen-donor interactions increase with the polarizability of the halogen’s surrounding electron cloud, and with the withdrawing power of adjacent atoms. These observations are now explained using the sigma hole concept, which defines a region of positive charge at the surface of the halogen, along the covalent bond axis.

Within the complex structure of a protein, there are many atoms to which a halogen-containing substrate can form a halogen bond. Halogen – usually chlorine – interactions with backbone and side chain carbonyls, as well as with side chain alcohols, thiols and thioethers, amines and the surfaces of aromatic rings can affect the conformation of the protein as well as the affinity that other binding sites on the protein have for other substrates, such as water. This is however difficult to predict even though doing so could be highly beneficial in guiding structure-based drug design. Experimental techniques that allow solvent mapping such as the multiple solvent crystal structure (MCSC), can be effective but are time intensive and expensive to run, and lack of an ideal strategy and/or representative force fields have limited computational approaches.

Computational halogen bond mapping with molecular dynamics simulations

Recently, Verma and collaborators proposed a molecular dynamics based method for halogen bond mapping. This study aimed to observe the effect of a chosen probe, in this case chlorobenzene, on the binding sites of four different proteins using molecular dynamics. Chlorobenzene is potentially a bifunctional substrate; the sp2 bonded chlorine can probe for halogen bonds while the aromatic ring can probe for π-stacking, or hydrophobic sites. For proof-of-concept, MDM2(Pdb:1YCR), MCL-1(Pdb:3kJ0), interleukin-2(Pdb:1Z92), and Bcl-xL(Pdb:1BXL) were selected because of their involvement in protein-protein interactions, the availability of structural data, and the well reported structural changes they endure upon substrate binding, which can present unexposed halogen bonding sites. All-atom MD simulations were selected for the development of protein-small molecule interaction maps because other computational methods considered could not capture the protein’s dynamics sufficiently, and Lexa and Carlson showed this to be crucial for effective solvent interaction mapping.

For chlorobenzene preparation the authors used GAFF. Atomic charges were derived using the R.E.D. Server by fitting restrained electrostatic potential charges to a molecular electrostatic potential computed with Gaussian. One peculiarity of chlorobenzene when compared to benzene, a molecule the authors also used for mapping in a previous study, was found to be the propensity of the halogenated compound to aggregate at lower concentrations, which forced to reduce the substrate concentration by 25% to 0.15M, and to increase the sampling time during the simulation experiments to 10ns per trajectory.

Using molecular dynamics simulations to construct halogen affinity maps

For MDM2, all of the six halogen bonding sites found in the Pdb were recovered by the proposed method. Three of these, known to be occupied by a halogenated benzodiazepine inhibitor, and two others, occupied by Nutlin-2, were recovered on first instance by the mapping protocol. The sixth was recovered after modification of the Pdb structure, as the initial structure lacked an important sequence of residues at the N- terminus. In addition, four cryptic hydrophobic binding sites were detected that were not present in the initial input conformation of the protein. The existence of these sites could also be validated by matching with results available in the literature. For the other three targets the authors report similar results.

All-atom MD simulation with chlorobenzene can be used to map halogen and hydrophobic binding sites

Using molecular dynamics simulations, and the chlorobenzene parameters shown above, the authors could map halogen and hydrophobic binding sites on all four tested proteins. The results were consistent with available experimental data, and show the potential of this method in structure based drug design. As we have come to see in our own work at Acellera probing libraries of fragments using all-atom MD simulations, the choice of force field is difficult for halogenated molecules. The ability of the force fields used in this paper to reproduce the binding sites was remarkable, and could potentially be harnessed for a broader use in computer aided drug design.

gianniChlorobenzene as a new halogen bond probe in molecular dynamics simulations
read more

Kinetic modulation of a disordered protein domain by phosphorylation

by Santi Esteban

Protein phosphorylation plays key roles in many signal transduction processes, preferentially targeting intrinsically disordered protein domains (IDPs). IDPs remain largely unstructured under native conditions, resembling random-coil polymers akin to the unfolded states of proteins. Present in more than 50% of eukaryotic proteins, IDPs perform a plethora of biological functions and are commonly associated with a variety of human diseases.

Technical limitations hamper intrinsically disordered protein characterization

The mechanism by which phosphorylation modulates disordered domains remains unknown. This stems from the extraordinary structural heterogeneity of IDPs, which poses a challenge for their characterization using common structural determination experimental techniques. For instance, the study of IDPs is inviable with X-ray crystallography because of the fast mobility of the domains, and NMR does not allow singling out the structural details of the multiple, underlying conformational states.

Can intrinsically disordered proteins be characterized using molecular dynamics simulations?

Recent advances in computer simulation technology offer a unique opportunity to study IDPs with atomic resolution on the millisecond timescale, the regime necessary to study these systems. The introduction of accelerator processors, and their application to MD, as well as the development of HTMD, and adaptive sampling methods, have allowed an increase of sampling efficiency of several orders of magnitude. Indeed, a single workstation can now obtain microsecond long simulations with atomic resolution in one day, and milliseconds can be obtained rapidly on a supercomputer, or through distributed computing projects. Nevertheless, this bourgeoning field is still limited in its ability to analyse millions of protein conformers, and compute their population and kinetics.

All-atom MD for intrinsically disordered protein characterization

In this work (Nat. Comm. 2014, DOI: 10.1038/ncomms6272), we successfully reconstructed the conformational ensemble of an IDP domain involved in gene transcription. A new analysis tool capable of resolving the disordered state of disordered domains was used in combination with large scale all-atom, unbiased, molecular simulations. 1.7 milliseconds of aggregated time were sampled. This methodology was used to study a well-characterized disordered fragment of the protein kinase-inducible domain (KID) in the cellular transcription factor of the cAMP response element-binding protein (CREB).

Identification of slow molecular order parameters for Markov model construction

The kinetic characterization of macromolecular systems requires the description of slow relaxation processes. This depends on the identification of the structural changes involved in these processes (selection of structural descriptors), the discretization of the high-dimensional coordinate space (clustering of conformers according to the descriptors) and estimation of the rates or timescales at which these slow processes occur. Approaches to this end include Markov models, master-equation models, and kinetic network models.

In this work we analyzed the simulated data by constructing a Markov state model of the entire ensemble of trajectories using inter-residue Cα-Cα distances and φ/ψ backbone dihedral angles as general metrics to build the kinetic model. The conformational space was discretized into one thousand clusters and then projected on a 5-dimensional space using time-lagged independent component analysis (TICA), which identifies the slow coordinates in the dynamics without relying on subjective guesses.

Phosphorylation reduces the conformational kinetics of intrinsically disordered proteins

The results presented in Figure 1 show the first microscopic view of the equilibrium states and conformational kinetics of a disordered protein domain at atomic resolution. The conformational exchange between state D (disordered) and state O (ordered) is shown together with forward (τ1) and backward (τ-1) transition times. Conformations were aligned using residues with the lowest Cα-Cα distance variance. N- to C-terminal is color coded red and blue.

We find that phosphorylation induces a 60-fold slowdown in conformational kinetics, compared to a non-phosphorylated domain, which involves a low populated and partially structured excited state known to participate in an early binding intermediate (Fig. 2). Figure 2a shows residue specific relaxation times of helix folding/unfolding process (filled bars) derived from autocorrelation function. Empty bars show chemical shifts changes in pKID that map residues participating in an early binding intermediate as detected by NMR. Figure 2b is an example of a kinetically locked conformation in phosphorylated KID. The peptide backbone is shown in cartoon representation, and the heavy atoms of phosphorylated serine 133 and several charged residues are shown as sticks.

Direct comparison against NMR data supports the validity of these findings and show that, contrary to what is typically assumed and in line with experiments, mutation of the phosphoserine to glutamate cannot recover the slowdown in conformational exchange (Fig. 1c).

Is the regulation of phosphorylation via the kinetic modulation of intrinsically disordered proteins conserved?

We propose a new mechanism of phosphorylation regulation by kinetic modulation which could be general for disordered domains which lack a well-defined equilibrium structure. This mechanism emerges from a theoretically derived kinetic model that shows that it is possible to modulate the binding affinity of protein-protein complexes by inducing a slowdown in conformational kinetics affecting both on- and off-conformational transition times by the same amount and simultaneously. This model agrees with kinetic and affinity values obtained through experiment (See paper).

With disordered domains taking part in more than 50% of proteins responsible for signaling in the cell, this kinetic mechanism of modulation highlights a further way by which post-translation modifications may affect disordered domains and their interactions with binding partners.

This work was performed in the De Fabritiis lab. Gianni De Fabritiis is the founder of Acellera. Should you be interested in performing similar studies in-house let Acellera know. We will be happy to inform you.

gianniKinetic modulation of a disordered protein domain by phosphorylation
read more

Adaptive Sampling for MD Simulations Talk at GTCBio

by David Soriano

A few of weeks ago we were invited to present at GTCBio Drug Design and Medicinal Chemistry 2014 Conference. The conference was very interesting and very well organized, and we will make sure to keep it our agenda for future years. The meeting also happened to be in Berlin, which instantly became one of my favorite cities. Our talk was titled “Machine Learning in FBLD” and discussed our new adaptive sampling method for probing protein ligand interactions using atomistic molecular dynamics simulations that was developed in collaboration with the De Fabritiis lab. So how did we get there?

High-throughput molecular dynamics can be used for profiling protein-ligand interactions

In 2011 we reported our seminal work towards a practical approach for profiling protein ligand interactions in silico with all atom resolution. In this proof-of-principle study we showed that with enough sampling we could use atomistic high-throughput molecular dynamics simulations to reconstruct the binding of benzamidine to trypsin, and not only obtain accurate energies, kinetics, poses but also resolve a ligand binding pathway. We were also able to predict several metastable states later observed experimentally.

Application of molecular dynamics simulations in fragment based lead discovery (FBLD)

We next focused on expanding the library size, and on adapting the methodology to fragment based ligand design. Benzamidine is a small molecule of micromolar affinity for trypsin, and therefore extending this method to FBLD was a natural progression. For the fragment study we used a 2003 STD NMR study that targeted Factor Xa, a target related to trypsin, and a set of forty fragments. Once more our MD experiments were able to recapitulate the binding of the library to the target, and also give accurate representations of binding energies, kinetics, poses and binding pathways. Importantly, we were also able to rank the compounds in order of increasing binding affinity accurately. As exciting as the results were, we needed to collect 2milliseconds of aggregate biological timescale trajectory data, a challenge we could only meet because of our access to GPUGrid through our collaboration with the De Fabritiis group.

Development of an adaptive sampling protocol for MD

One limitation of our HTMD based in-silico binding method at that time involved the amount of sampling needed for system convergence, which was in the order of 50×10^-6 s per compound for Factor Xa. Clearly, in order to make this high-throughput molecular dynamics methodology a practical complement to current experimental FBLD screens, we needed to improve the efficiency of our simulation methods. To this end, we developed a fully automated adaptive sampling method that can deliver system convergence about an order of magnitude faster than the brute-force HTMD approach.

As opposed to HTMD, where we run hundreds of simulations at once, our adaptive sampling protocol begins with only ten trajectories each starting from random initial states and each a few tens of nanoseconds long. We then analyze these trajectories using Markov State modelling, and use residence time information obtained from clustering to select the starting conformations of future runs. This run-analyze-respawn cycle is defined as an epoch, and each adaptive experiment is composed of ten epochs. Note that all of this is automated and that we only need to setup the initial trajectories before working up the results after the last epoch. For each protein-ligand system we run ten replicate adaptive experiments.

For our proof-of-concept we tested this adaptive sampling method on trypsin-benzamidine, a system for we which we have a very large amount of data and that over the years has become our benchmark. As opposed to regular HTMD where we needed to collect about 50×10^-6 s of data for convergence, only 5×10^-6 s were needed in our adaptive experiment. In each experiment, learning was evident after the 3rd epoch, and the energy and pose matched those of our control HTMD experiment after the 5th epoch, in about 80% of the experiments.

What next?

We are very excited about these results as in principle — and assuming there were no other bottlenecks — we should be able to screen 400 compounds in the same amount of time previously spent with 40. This would make this technique fast enough for medicinal chemists to start looking at it as a practical and reliable complement to current FBLD screens. We hope to be there soon but we still need to test the universality of the method thoroughly as well as work out other kinks such as the issue of ligand parametrization.

gianniAdaptive Sampling for MD Simulations Talk at GTCBio
read more

Crucial step in AIDS virus maturation simulated for first time

Source: IMIM

Bioinformaticians at IMIM (Hospital del Mar Medical Research Institute) and UPF (Pompeu Fabra University) have used molecular simulation techniques to explain a specific step in the maturation of the HIV virions, i.e., how newly formed inert virus particles become infectious, which is essential in understanding how the virus replicates. These results, which have been published in the latest edition of PNAS, could be crucial to the design of future antiretrovirals.

HIV virions mature and become infectious as a result of the action of a protein called HIV protease. This protein acts like a pair of scissors, cutting the long chain of connected proteins that form HIV into individual proteins that will form the infectious structure of new virions. According to the researchers of the IMIM-UPF computational biophysics group, “One of the most intriguing aspects of the whole HIV maturation process is how free HIV protease, i.e. the ‘scissors protein,’ appears for the first time, since it is also initially part of the long poly-protein chains that make up new HIV virions.

Using ACEMD a software for molecular simulations and a technology known as, Gianni De Fabritiis’ group has demonstrated that the first “scissors proteins” can cut themselves out from within the middle of these poly-protein chains. They do this by binding one of their connected ends (the N-terminus) to their own active site and then cutting the chemical bond that connects them to the rest of the chain. This is the initial step of the whole HIV maturation process. If the HIV protease can be stopped during the maturation process, it will prevent viral particles, or virions, from reaching maturity and, therefore, from becoming infectious.

This work was performed using, a voluntary distributed computing platform that harnesses the processing power of thousands of NVIDIA GPU accelerators from household computers made available by the public for research purposes. It’s akin to accessing a virtual supercomputer. One of the benefits of GPU acceleration is that it provides computing power that is around 10 times higher than that generated by computers based on CPUs alone. It reduces research costs accordingly by providing a level computational power that previously was only available on dedicated, multi-million dollar supercomputers.

Researchers use this computing power to process large numbers of data and generate highly complex molecular simulations. In this specific case, thousands of computer simulations have been carried out, each for hundreds of nanoseconds (billionths of a second) for a total of almost a millisecond.

According to researchers, this discovery in the HIV maturation process provides an alternative approach in the design of future pharmaceutical products based on the use of these new molecular mechanisms. For now, this work provides a greater understanding of a crucial step in the life cycle of HIV, a virus that directly attacks and weakens the human immune system, making it vulnerable to a wide range of infections, and which affects millions of people around the world.


Kinetic characterization of the critical step in HIV-1 protease maturation”. S Kashif Sadiq, Frank Noe and Gianni De Fabritiis. PNAS. DOI:10.1073/pnas.1210983109.

ACEMD Simulation details.

alejandroCrucial step in AIDS virus maturation simulated for first time
read more

New paper in JACS on conformational characterization of HIV-1 RT

A new paper using ACEMD has been published in JACS under the title Thumbs Down for HIV: Domain Level Rearrangements Do Occur in the NNRTI-Bound HIV-1 Reverse Transcriptase stemming from a collaboration between University College London (UK) and Universitat Pompeu Fabra (Spain) researchers.

In this work researchers unveiled previously undescribed closed conformations in drug-bound HIV-1 RT which suggesting that “allosteric modulation is effected via the alteration of the kinetic landscape of conformational transitions upon drug-binding”. In the publication researchers also state that “a more detailed understanding of the mechanism of NNRTI inhibition and the effect of binding upon domain motion could aid the design of more effective inhibitors and help identify novel allosteric sites.”

The work was performed through an ensemble molecular dynamics strategy using ACEMD and aggregate simulation time of ~0.6 µs.

D. W. Wright, S. K. Sadiq, G. De Fabritiis, P. V. Coveney, Thumbs Down for HIV: Domain Level Rearrangements Do Occur in the NNRTI-Bound HIV-1 Reverse Transcriptase, J. Am. Chem. Soc., 2012, 134 (31), 12885–12888.

alejandroNew paper in JACS on conformational characterization of HIV-1 RT
read more

Science paper confirms ACEMD results

A Science paper titled “Structural Basis for Allosteric Regulation of GPCRs by Sodium Ions” by the group of Professor R. C. Stevens at The Scripps Research Institute in La Jolla, CA (USA) has confirmed the role of sodium ions in the allosteric regulation of GPCRs as it had been modeled by Selent et al. in 2010 using high-throughput MD simulations on ACEMD.

Results by Stevens Group shed on the importance of endogenous small molecules at specific binding sites as control mechanisms of membrane proteins. Such concept exceeds the common view of allostery via pharmacological ligands. The results have profound effects on the current understanding of the functional mechanisms of GPCRs, a broad family of proteins that comprises many of today’s pharmacological targets.

In that regard, computational modeling of small molecule-protein interactions using ACEMD has proven powerful tool to predict unknown solvent-derived effects on protein function.


– Liu W., et al., Structural Basis for Allosteric Regulation of GPCRs by Sodium Ions, Science 2012: 337 (6091), 232-236. DOI:10.1126/science.1219218

– Selent J., et al., Induced Effects of Sodium Ions on Dopaminergic G-Protein Coupled Receptors, PLOS Computational Biology 2010, 6, e1000884. DOI:10.1371/journal.pcbi.1000884

– Source: The Scripps Research Institute

alejandroScience paper confirms ACEMD results
read more

A great new paper using ACEMD

100 for the price of 1:

Current computational tools allow performing microsecond long simulations routinely at low cost.  In a perspective article  Harvey and De Fabritiis discuss in the context of drug discovery the potential of performing molecular dynamics simulations in high-throughput (HT-MD) and argue the time for HT-MD is here.

Sweet dynamics:

Sattelle et al. use ACEMD to produce microsecond simulations of pyranose ring puckering and study its dependence on anomeric configuration. These results appear in the Journal of Physical Chemistry B:

alejandroA great new paper using ACEMD
read more