MD Methods

Chlorobenzene as a new halogen bond probe in molecular dynamics simulations

By Zoe Greenburg

Non-covalent bonds play a crucial role in chemistry, biochemistry, and biology. H-bonds, and hydrophobic bonds at the core of protein folding, and govern reversible interactions between biomolecules and their ligands. In drug discovery, H-bonding and hydrophobic interactions define the affinity and selectivity of small molecules for their biological targets.

With the intent to better understand these interactions and their relevance in the design of small molecule modulators of biological function, several computational solvent-mapping methods have been developed over the past few years. Schrodinger introduced WaterMap, a method that allows understanding the locations and thermodynamics of water molecules that solvate binding sites, and guiding the optimization of small molecule hits; conserved, strongly bound water molecules are costly to displace but can be useful bridges. An alternative method for understanding the role of water in ligand binding, which uses all-atom molecular dynamics (MD), and that can be accessed through this company, was developed by De Fabritiis et al. in 2011. Mapping of other solvents like ACN, propane, benzene, and IPA has also been demonstrated with MD simulations of binary mixtures, and been put in the context of target druggability assessment.

A new category of substrate to study: organohalogens

Halogens are common substituents in drugs as halogen bonding interactions can offer key boosts in selectivity and potency. Initially attributed to van der Waals forces, this is now explained in terms of the halogen bond, another type of non-covalent interaction that in recent years has been exploited for many applications in synthetic, supramolecular and medicinal chemistry.

Anisotropy in halogen electron charge distribution allows halogens to interact with donors such as Lewis base centers and π-surfaces, something that, if one is to consider electron density alone, can be rather counter intuitive. Halogen-donor interactions increase with the polarizability of the halogen’s surrounding electron cloud, and with the withdrawing power of adjacent atoms. These observations are now explained using the sigma hole concept, which defines a region of positive charge at the surface of the halogen, along the covalent bond axis.

Within the complex structure of a protein, there are many atoms to which a halogen-containing substrate can form a halogen bond. Halogen – usually chlorine – interactions with backbone and side chain carbonyls, as well as with side chain alcohols, thiols and thioethers, amines and the surfaces of aromatic rings can affect the conformation of the protein as well as the affinity that other binding sites on the protein have for other substrates, such as water. This is however difficult to predict even though doing so could be highly beneficial in guiding structure-based drug design. Experimental techniques that allow solvent mapping such as the multiple solvent crystal structure (MCSC), can be effective but are time intensive and expensive to run, and lack of an ideal strategy and/or representative force fields have limited computational approaches.

Computational halogen bond mapping with molecular dynamics simulations

Recently, Verma and collaborators proposed a molecular dynamics based method for halogen bond mapping. This study aimed to observe the effect of a chosen probe, in this case chlorobenzene, on the binding sites of four different proteins using molecular dynamics. Chlorobenzene is potentially a bifunctional substrate; the sp2 bonded chlorine can probe for halogen bonds while the aromatic ring can probe for π-stacking, or hydrophobic sites. For proof-of-concept, MDM2(Pdb:1YCR), MCL-1(Pdb:3kJ0), interleukin-2(Pdb:1Z92), and Bcl-xL(Pdb:1BXL) were selected because of their involvement in protein-protein interactions, the availability of structural data, and the well reported structural changes they endure upon substrate binding, which can present unexposed halogen bonding sites. All-atom MD simulations were selected for the development of protein-small molecule interaction maps because other computational methods considered could not capture the protein’s dynamics sufficiently, and Lexa and Carlson showed this to be crucial for effective solvent interaction mapping.

For chlorobenzene preparation the authors used GAFF. Atomic charges were derived using the R.E.D. Server by fitting restrained electrostatic potential charges to a molecular electrostatic potential computed with Gaussian. One peculiarity of chlorobenzene when compared to benzene, a molecule the authors also used for mapping in a previous study, was found to be the propensity of the halogenated compound to aggregate at lower concentrations, which forced to reduce the substrate concentration by 25% to 0.15M, and to increase the sampling time during the simulation experiments to 10ns per trajectory.

Using molecular dynamics simulations to construct halogen affinity maps

For MDM2, all of the six halogen bonding sites found in the Pdb were recovered by the proposed method. Three of these, known to be occupied by a halogenated benzodiazepine inhibitor, and two others, occupied by Nutlin-2, were recovered on first instance by the mapping protocol. The sixth was recovered after modification of the Pdb structure, as the initial structure lacked an important sequence of residues at the N- terminus. In addition, four cryptic hydrophobic binding sites were detected that were not present in the initial input conformation of the protein. The existence of these sites could also be validated by matching with results available in the literature. For the other three targets the authors report similar results.

All-atom MD simulation with chlorobenzene can be used to map halogen and hydrophobic binding sites

Using molecular dynamics simulations, and the chlorobenzene parameters shown above, the authors could map halogen and hydrophobic binding sites on all four tested proteins. The results were consistent with available experimental data, and show the potential of this method in structure based drug design. As we have come to see in our own work at Acellera probing libraries of fragments using all-atom MD simulations, the choice of force field is difficult for halogenated molecules. The ability of the force fields used in this paper to reproduce the binding sites was remarkable, and could potentially be harnessed for a broader use in computer aided drug design.

gianniChlorobenzene as a new halogen bond probe in molecular dynamics simulations
read more

Kinetic modulation of a disordered protein domain by phosphorylation

by Santi Esteban

Protein phosphorylation plays key roles in many signal transduction processes, preferentially targeting intrinsically disordered protein domains (IDPs). IDPs remain largely unstructured under native conditions, resembling random-coil polymers akin to the unfolded states of proteins. Present in more than 50% of eukaryotic proteins, IDPs perform a plethora of biological functions and are commonly associated with a variety of human diseases.

Technical limitations hamper intrinsically disordered protein characterization

The mechanism by which phosphorylation modulates disordered domains remains unknown. This stems from the extraordinary structural heterogeneity of IDPs, which poses a challenge for their characterization using common structural determination experimental techniques. For instance, the study of IDPs is inviable with X-ray crystallography because of the fast mobility of the domains, and NMR does not allow singling out the structural details of the multiple, underlying conformational states.

Can intrinsically disordered proteins be characterized using molecular dynamics simulations?

Recent advances in computer simulation technology offer a unique opportunity to study IDPs with atomic resolution on the millisecond timescale, the regime necessary to study these systems. The introduction of accelerator processors, and their application to MD, as well as the development of HTMD, and adaptive sampling methods, have allowed an increase of sampling efficiency of several orders of magnitude. Indeed, a single workstation can now obtain microsecond long simulations with atomic resolution in one day, and milliseconds can be obtained rapidly on a supercomputer, or through distributed computing projects. Nevertheless, this bourgeoning field is still limited in its ability to analyse millions of protein conformers, and compute their population and kinetics.

All-atom MD for intrinsically disordered protein characterization

In this work (Nat. Comm. 2014, DOI: 10.1038/ncomms6272), we successfully reconstructed the conformational ensemble of an IDP domain involved in gene transcription. A new analysis tool capable of resolving the disordered state of disordered domains was used in combination with large scale all-atom, unbiased, molecular simulations. 1.7 milliseconds of aggregated time were sampled. This methodology was used to study a well-characterized disordered fragment of the protein kinase-inducible domain (KID) in the cellular transcription factor of the cAMP response element-binding protein (CREB).

Identification of slow molecular order parameters for Markov model construction

The kinetic characterization of macromolecular systems requires the description of slow relaxation processes. This depends on the identification of the structural changes involved in these processes (selection of structural descriptors), the discretization of the high-dimensional coordinate space (clustering of conformers according to the descriptors) and estimation of the rates or timescales at which these slow processes occur. Approaches to this end include Markov models, master-equation models, and kinetic network models.

In this work we analyzed the simulated data by constructing a Markov state model of the entire ensemble of trajectories using inter-residue Cα-Cα distances and φ/ψ backbone dihedral angles as general metrics to build the kinetic model. The conformational space was discretized into one thousand clusters and then projected on a 5-dimensional space using time-lagged independent component analysis (TICA), which identifies the slow coordinates in the dynamics without relying on subjective guesses.

Phosphorylation reduces the conformational kinetics of intrinsically disordered proteins

The results presented in Figure 1 show the first microscopic view of the equilibrium states and conformational kinetics of a disordered protein domain at atomic resolution. The conformational exchange between state D (disordered) and state O (ordered) is shown together with forward (τ1) and backward (τ-1) transition times. Conformations were aligned using residues with the lowest Cα-Cα distance variance. N- to C-terminal is color coded red and blue.

We find that phosphorylation induces a 60-fold slowdown in conformational kinetics, compared to a non-phosphorylated domain, which involves a low populated and partially structured excited state known to participate in an early binding intermediate (Fig. 2). Figure 2a shows residue specific relaxation times of helix folding/unfolding process (filled bars) derived from autocorrelation function. Empty bars show chemical shifts changes in pKID that map residues participating in an early binding intermediate as detected by NMR. Figure 2b is an example of a kinetically locked conformation in phosphorylated KID. The peptide backbone is shown in cartoon representation, and the heavy atoms of phosphorylated serine 133 and several charged residues are shown as sticks.

Direct comparison against NMR data supports the validity of these findings and show that, contrary to what is typically assumed and in line with experiments, mutation of the phosphoserine to glutamate cannot recover the slowdown in conformational exchange (Fig. 1c).

Is the regulation of phosphorylation via the kinetic modulation of intrinsically disordered proteins conserved?

We propose a new mechanism of phosphorylation regulation by kinetic modulation which could be general for disordered domains which lack a well-defined equilibrium structure. This mechanism emerges from a theoretically derived kinetic model that shows that it is possible to modulate the binding affinity of protein-protein complexes by inducing a slowdown in conformational kinetics affecting both on- and off-conformational transition times by the same amount and simultaneously. This model agrees with kinetic and affinity values obtained through experiment (See paper).

With disordered domains taking part in more than 50% of proteins responsible for signaling in the cell, this kinetic mechanism of modulation highlights a further way by which post-translation modifications may affect disordered domains and their interactions with binding partners.

This work was performed in the De Fabritiis lab. Gianni De Fabritiis is the founder of Acellera. Should you be interested in performing similar studies in-house let Acellera know. We will be happy to inform you.

gianniKinetic modulation of a disordered protein domain by phosphorylation
read more

Adaptive Sampling for MD Simulations Talk at GTCBio

by David Soriano

A few of weeks ago we were invited to present at GTCBio Drug Design and Medicinal Chemistry 2014 Conference. The conference was very interesting and very well organized, and we will make sure to keep it our agenda for future years. The meeting also happened to be in Berlin, which instantly became one of my favorite cities. Our talk was titled “Machine Learning in FBLD” and discussed our new adaptive sampling method for probing protein ligand interactions using atomistic molecular dynamics simulations that was developed in collaboration with the De Fabritiis lab. So how did we get there?

High-throughput molecular dynamics can be used for profiling protein-ligand interactions

In 2011 we reported our seminal work towards a practical approach for profiling protein ligand interactions in silico with all atom resolution. In this proof-of-principle study we showed that with enough sampling we could use atomistic high-throughput molecular dynamics simulations to reconstruct the binding of benzamidine to trypsin, and not only obtain accurate energies, kinetics, poses but also resolve a ligand binding pathway. We were also able to predict several metastable states later observed experimentally.

Application of molecular dynamics simulations in fragment based lead discovery (FBLD)

We next focused on expanding the library size, and on adapting the methodology to fragment based ligand design. Benzamidine is a small molecule of micromolar affinity for trypsin, and therefore extending this method to FBLD was a natural progression. For the fragment study we used a 2003 STD NMR study that targeted Factor Xa, a target related to trypsin, and a set of forty fragments. Once more our MD experiments were able to recapitulate the binding of the library to the target, and also give accurate representations of binding energies, kinetics, poses and binding pathways. Importantly, we were also able to rank the compounds in order of increasing binding affinity accurately. As exciting as the results were, we needed to collect 2milliseconds of aggregate biological timescale trajectory data, a challenge we could only meet because of our access to GPUGrid through our collaboration with the De Fabritiis group.

Development of an adaptive sampling protocol for MD

One limitation of our HTMD based in-silico binding method at that time involved the amount of sampling needed for system convergence, which was in the order of 50×10^-6 s per compound for Factor Xa. Clearly, in order to make this high-throughput molecular dynamics methodology a practical complement to current experimental FBLD screens, we needed to improve the efficiency of our simulation methods. To this end, we developed a fully automated adaptive sampling method that can deliver system convergence about an order of magnitude faster than the brute-force HTMD approach.

As opposed to HTMD, where we run hundreds of simulations at once, our adaptive sampling protocol begins with only ten trajectories each starting from random initial states and each a few tens of nanoseconds long. We then analyze these trajectories using Markov State modelling, and use residence time information obtained from clustering to select the starting conformations of future runs. This run-analyze-respawn cycle is defined as an epoch, and each adaptive experiment is composed of ten epochs. Note that all of this is automated and that we only need to setup the initial trajectories before working up the results after the last epoch. For each protein-ligand system we run ten replicate adaptive experiments.

For our proof-of-concept we tested this adaptive sampling method on trypsin-benzamidine, a system for we which we have a very large amount of data and that over the years has become our benchmark. As opposed to regular HTMD where we needed to collect about 50×10^-6 s of data for convergence, only 5×10^-6 s were needed in our adaptive experiment. In each experiment, learning was evident after the 3rd epoch, and the energy and pose matched those of our control HTMD experiment after the 5th epoch, in about 80% of the experiments.

What next?

We are very excited about these results as in principle — and assuming there were no other bottlenecks — we should be able to screen 400 compounds in the same amount of time previously spent with 40. This would make this technique fast enough for medicinal chemists to start looking at it as a practical and reliable complement to current FBLD screens. We hope to be there soon but we still need to test the universality of the method thoroughly as well as work out other kinks such as the issue of ligand parametrization.

gianniAdaptive Sampling for MD Simulations Talk at GTCBio
read more

Supervised Molecular Dynamics (SuMD) as a helpful tool to depict GPCR-ligand recognition pathway in a nanosecond time scale

by Davide Sabbadin, PhD.

Brief abstract: Supervised Molecular Dynamics (SuMD) is a computational technique for investigating unbiased receptor-ligand recognition pathways, in the nanosecond timescale, by combining molecular dynamics simulations and a tabu-like algorithm.

G protein-coupled receptors (GPCRs) are transmembrane proteins that play a crucial role in linking various extracellular inputs with diverse cellular responses. It has been estimated that GPCRs constitute the target of about half of all drugs in clinical use today. Thus, from structural and pharmacological perspectives, represent an ideal target to design molecules with potential therapeutic effect (1). Understanding the GPCR-ligand recognition process is crucial for the development of drug candidates with favorable pharmacodynamic profiles.

Recent technological innovation has allowed Molecular Dynamics (MD) simulations to reach timescales comparable with those on which most biomolecular events of interest take place, thus closing the gap between theoretical models and experiments (2,3). Although now possible, it remains expensive to use unbiased Molecular Dynamics to quantitatively characterise ligand-protein binding.

Supervised Molecular Dynamics (SuMD) addresses this shortcoming, enabling the complete ligand-protein recognition process to be elucidated, using unbiased all-atom Molecular Dynamics, in a reduced timescale. The general method is not sensitive to the ligand’s starting position, chemical structure and receptor binding affinity (4).

As shown in the figure below, SuMD uses a standard MD simulation in which the ligand-receptor docking pathway is supervised by a tabu-like algorithm. During the production of the MD trajectory the distances between the center of masses of the ligand atoms and the residues composing the orthosteric binding site of the GPCR (dcmL-R) are monitored over a fixed time window (Δtck, e.g. 200 ps). An arbitrary number of distance points (n: a, b, c, d, e) per each checkpoint trajectory are collected and a linear function f(x)=m·x is fitted on the distance points at the end of the checkpoint time.

The tabu-like algorithm is applied to increase the probability of ligand-receptor binding events occuring without introducing bias to the MD simulation. More precisely, if m<0, the ligand-receptor distance is likely to be shortened over the checkpoint time and so the MD simulation is restarted from the last produced set of coordinates. Otherwise, the simulation is restored from the original set of coordinates and random velocities drawn from the Boltzman distribution are reassigned. This has the effect of causing the restarted simulation to explore a different region of configuration space, while maintaining sampling from the correct NVT ensemble. The tabu-like supervision algorithm is repeatedly applied until ligand-receptor distance (dcmL-R) is less than 5 Å.

Supervised Molecular Dynamics (SuMD) can be fully integrated into
ACEMD and enable investigating the complete binding process, in a nanosecond time scale, and to reproduce with high accuracy the crystallographic pose of protein-ligand complexes. Moreover, using SuMD simulations, it is possible to easily determine and characterize all possible metastable sites on the binding pathway to the orthosteric pose.

For more information visit:
Molecular Modeling Section from the Department of Pharmaceutical and Pharmacological Sciences, University of Padova

Figure: High affinity antagonist ZM241385 (yellow spheres) recognition pathway, using Supervised Molecular Dynamics (SuMD), to the membrane embedded human A2A Adenosine Receptor. During classic Molecular Dynamics (MD) simulations, the receptor-ligand distance vector (dcmL-R) is monitored and supervised in real time by a tabu-like algorithm. This computational method allows investigating the multifaceted unbiased ligand binding process in a reduced time scale of orders of magnitude, while taking advantages of the all-atom MD simulations accuracy of a GPCR-ligand complex embedded into explicit lipid-water environment.

Video: ZM241385-human A2A Adenosine Receptor recognition mechanism using Supervised Molecular Dynamics (SuMD). Simulation time in nanoseconds is reported. Van der Waals spheres represent ZM241385 atoms. Ligand nitrogen atoms are depicted in blue, oxygen atoms in red while carbon atoms are colored in white. Receptor ribbon representation is viewed from the membrane side facing transmembrane domain 6 (TM6) and transmembrane domain 7 (TM7). Hydrogen atoms are not displayed. Ligand binding sites are highlighted by arrows.

(1) Jacobson, K. A.; Costanzi, S. New Insights for Drug Design from the X-Ray Crystallographic Structures of G-Protein-Coupled Receptors. Mol. Pharmacol. 2012, 82, 361–371.
(2) Dror, R. O.; Dirks, R. M.; Grossman, J. P.; Xu, H.; Shaw, D. E. Biomolecular Simulation: A Computational Microscope for Molecular Biology. Annu. Rev. Biophys. 2012, 41, 429–452.
(3) Buch, I.; Giorgino, T.; De Fabritiis, G. Complete Reconstruction of an Enzyme-Inhibitor Binding Process by Molecular Dynamics Simulations. Proc. Natl. Acad. Sci. U. S. A. 2011, 108, 10184–10189.
(4) Sabbadin, D.; Moro, S. Supervised Molecular Dynamics (SuMD) as a Helpful Tool to Depict GPCR-Ligand Recognition Pathway in a Nanosecond Time Scale. J. Chem. Inf. Model. 2014.

gianniSupervised Molecular Dynamics (SuMD) as a helpful tool to depict GPCR-ligand recognition pathway in a nanosecond time scale
read more