Acellera's Blog

Acellera, Spring & Summer’16 conferences

By Franck Chevalier

Meet Acellera during Spring & Summer’16 Conferences.

Fragment-based drug discovery (FBDD), Markov state models, free energy methods, drug discovery by high-throughput molecular dynamics (HTMD): these are a few of the topics we will cover at the events we are joining over the coming weeks.


GPCR Unbiased lipid-like ligand binding with ACEMD

By Franck Chevalier

Introduction

The binding of lipid-like ligands through the membrane bilayer to a GPCR target is an important but poorly explored recognition process at the atomic level. In this work, the binding of the lipid inhibitor ML056 to the sphingosine-1-phosphate receptor 1 (S1P1R) has been successfully reported using unbiased molecular dynamics simulations with an aggregate sampling of over 800 μs. The binding pathway is a multi-stage process consisting of the ligand diffusing in the bilayer leaflet to contact a “membrane vestibule” at the top of TM 7, then moving from this lipid-facing vestibule to the orthosteric binding cavity through a channel formed by TMs 1 and 7 and the N-terminus of the receptor. Unfolding of the GPCR N-terminal alpha-helix increases the volume of the channel upon ligand entry, helping the ligand reach the crystallographic pose, which also corresponds to the predicted favorable pose. The relaxation timescales of the binding process show that binding of the ligand to the “membrane vestibule” is the rate-limiting step, on the multi-microsecond timescale. We comment on the significance and parallels of the binding process in the context of other binding studies.
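For reference, such relaxation timescales are typically obtained from the eigenvalues of a Markov state model transition matrix: at lag time τ, each eigenvalue λ gives an implied timescale t = −τ/ln λ. A minimal numpy illustration of this conversion (a sketch for orientation, not the paper's code):

    import numpy as np

    def implied_timescales(P, lag):
        # Implied timescales t_i = -lag / ln(lambda_i) from a row-stochastic
        # transition matrix P estimated at lag time `lag`.
        eigvals = np.sort(np.linalg.eigvals(P).real)[::-1]
        return -lag / np.log(eigvals[1:])  # skip the stationary eigenvalue (1.0)

    # Toy 3-state transition matrix at an illustrative 100 ns lag time:
    P = np.array([[0.90, 0.08, 0.02],
                  [0.10, 0.85, 0.05],
                  [0.02, 0.08, 0.90]])
    print(implied_timescales(P, lag=100.0))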

Current challenges

The ACEMD and HTMD software made it possible to prepare, simulate and analyze the trajectories and to elucidate the binding pathway along with the critical atomic interactions and conformational changes. As in other studies of ligand binding to GPCRs, there is a barrier to binding that occurs far away from the final binding pose. This barrier is formed by the interaction between the zwitterionic head group of the ligand and R292(7.34) and E294(7.36) in the “membrane vestibule”, and by the desolvation necessary for the ligand to enter the channel.

A final barrier is due to rearrangements of the water molecules in the binding cavity and of dihedral angles in the head and tail of the ligand such that all these favorable interactions can be formed.

Throughout the binding process, the widening of the cavity by the flexible N-terminus and the desolvation of the ligand head group appear as general trends.

Results

In this work, the binding of a lipid-like ligand from the membrane bilayer directly to the orthosteric binding site of a GPCR has been shown using unbiased MD simulations. The ML056 inhibitor studied here binds S1P1R through a multi-step process that finally leads to the crystallographic binding pose.

References

Nathaniel Stanley, Leonardo Pardo & Gianni De Fabritiis, “The pathway of ligand entry from the membrane bilayer to a lipid G protein-coupled receptor (GPCR)”, Scientific Reports, 2016.


Fragment Hit Binding by Molecular Simulations

By Franck Chevalier

 

Introduction

Over the last 20 years, great interest has been shown in fragments as an answer to one of the major issues faced by traditional drug discovery programs: by their properties, fragments reduce the number of molecules that need to be tested whatever the method used, chemical series are directly related to the fragments, and hit optimization is more intuitive. Fragment-based drug discovery (FBDD) by molecular dynamics now offers a powerful tool for studying fragment hit binding.

How has FBDD evolved?

A brief analysis of the articles published over the last two decades shows that the main techniques used include bioassays (inhibition assays, radiolabeled binding assays…) and biophysical methods (crystallography, NMR, SPR, MS, ITC…).

The low affinity shown by most fragments for their targets limits the range of detection of most methods, and the fragments detected are not necessarily the highest-affinity binders. Most articles in FBDD report the use of a single method, mainly an in vitro bioassay, and new articles appear at a rate of about 150 per year. The expertise required, the throughput and the cost of the main biophysical methods still limit their accessibility.

Which alternative?

Over the same period, we observed an increase in publications on virtual methods. Molecular dynamics simulations have featured in tens of thousands of articles. Interestingly, less than 5% of these publications deal with discovery, at a rate of about 300 publications per year over the last few years.

When FBDD is performed, typically one or two techniques are used. Where more than two techniques are used, 50% of the publications mention virtual methods, and about 30% use virtual methods alone, a proportion that keeps increasing. The large set of structures generated by crystallography has empowered the use of docking, while the important role played by protein flexibility has been addressed with molecular dynamics.

Discovery or design?

At this stage, most work has focused on discovery, meaning the detection of molecules able to bind, inhibit or activate the target. Few articles yet deal with linking, merging or growing fragments.

Targeting what?

The favorite targets are kinases and other enzymes, but more complex systems involving ion channels and GPCRs are now being reported.

Reliability of molecular dynamics

In silico FBDD’s impact on lead development has been limited to date; the inability to describe kinetics and conformational changes upon binding is among the reasons. In the reference below, a comprehensive computational assay for fragment hit characterization was performed: 2.1 ms of unbiased all-atom high-throughput molecular dynamics (HTMD) data were obtained and analyzed to show the potential of this technique for FBDD.

 

Reference: Noelia Ferruz, Matthew J. Harvey, Jordi Mestres, and Gianni De Fabritiis, J. Chem. Inf. Model., 2015, 55, 2200−2205.


HTMD: Molecular discovery simplified

By Stefan Doerr

Introduction

Molecular dynamics (MD) simulations can resolve biological processes at atomic detail and very high temporal resolution, but the technique comes with its own set of limitations, the most pronounced being the accuracy of the forcefields and the limits on time sampling.

Yet, we believe that there are further important problems: the data analysis and the reproducibility of experiments. In the last few years, specialized hardware, high-throughput methods and advanced sampling techniques have significantly improved molecular dynamics, allowing it to reach aggregate simulation times of multiple milliseconds. Forcefields have also dramatically improved. This increase in simulation accuracy and data volume has created the need for a more standardized methodology for preparing, executing and handling thousands of individual trajectories.

Recently we have been working on providing an environment for molecular discovery that simplifies every step of molecular modelling and simulation. We call this environment HTMD, for high-throughput molecular dynamics, and the final paper is now published in JCTC. A short introduction is provided in this blog post.

HTMD

Current challenges

Investigating biological processes using MD usually requires the processing of large amounts of data and files, using various tools and adapting to the peculiarities of many different software packages developed over several decades. With such a fragile set of tools, it is hard to follow the steps of a workflow that leads from the original PDB file to the results, even for the scientist who wrote the workflow. It is also hard to extend the functionality of these tools, given the diversity of languages and the absence of a common programming environment in which to introduce new extensions.

Our vision

HTMD is our vision of a unified platform, a programmable workspace for simulation-based molecular discovery. We named it HTMD (high-throughput molecular dynamics) to indicate that it allows the handling of thousands of simulations and multiple systems in a controlled manner. HTMD extends the Python programming language with functions and classes to handle molecular systems at different levels while abstracting implementation details and best-practice knowledge. Python is a scripting language that enjoys widespread usage in the scientific community and thus provides an ideal platform on which to develop and distribute HTMD. HTMD’s functionality spans molecular structure manipulation, visualization, the preparation and execution of molecular simulations on different compute resources, and data analysis, including the use of Markov state models (MSMs) to identify slow events, kinetic rates, affinities and pathways.
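To make this concrete, here is a minimal sketch of an HTMD session (illustrative only: the calls follow the published API, but names and behavior may differ between HTMD versions):

    # Fetch a structure from the PDB, clean it up and inspect it with HTMD.
    from htmd.ui import Molecule

    mol = Molecule('3PTB')   # download PDB entry 3PTB (trypsin) by its ID
    mol.remove('water')      # strip crystallographic waters
    mol.filter('protein')    # keep only the protein atoms
    print(mol.numAtoms)      # quick sanity check on the system size
    mol.view()               # visualize the prepared system

From the same workspace the system can then be solvated, built for a chosen forcefield, simulated and fed into Markov state model analysis, all without leaving Python.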

References

Stefan Doerr, Matthew J. Harvey, Frank Noé, and Gianni De Fabritiis, “HTMD: High-throughput molecular dynamics for molecular discovery”, J. Chem. Theory Comput., 2016 (in press).


Large molecular dynamics simulations as a tool to understand experimental polyethylene phase transitions

By Javier Ramos, Juan F. Vega and Javier Martínez-Salazar

Research activity in the field of macromolecular science requires the use of a variety of techniques. This is due to the intrinsic characteristics of polymeric materials, which reveal interesting physical phenomena on length scales from sub-nanometer up to microns and, correspondingly, on time scales from picoseconds to years. Covering these wide length and time scales demands powerful experimental techniques, combined with complementary computational tools, in order to effectively resolve the processes of interest. In this respect, major advances in computational power, such as the use of GPU processors, and in methodologies can nowadays help to tackle classical problems and to establish a shared knowledge between theorists and experimentalists.

The polyethylene phase transitions

The glass transition and the crystallization processes in polymers still remain under debate. Among all synthetic polymeric materials, polyethylene (PE) represents a model system in polymer physics. The advantages of this polymer are the simplicity of its chemical structure and the extensive set of experimental data available to validate the simulations. The objective of this contribution is to examine fundamental physical processes in a model of entangled polyethylene (C192) using two different force fields, the so-called PYS and TraPPe-UA. We mainly focus our attention on the ability of each model to simulate complex phenomena such as the glass transition and, especially, the crystallization process.

Computational modeling as a tool to understand polymer physics

Most of the minimizations and molecular dynamics (MD) simulations were performed with the GROMACS 4.6.x package installed on Metrocubo GPU workstations purchased from Acellera. The glass transition and crystallization processes of polymer chains can be studied by cooling an amorphous system at a constant cooling rate. If the cooling is sufficiently slow, the early stages of nucleation and crystallization can be studied; on the contrary, at high cooling rates a glassy state is produced. Non-isothermal simulations were performed starting from an amorphous system at several cooling rates, ranging from 2,000 K·ns⁻¹ (tsim ≈ 0.1 ns) to 0.05 K·ns⁻¹ (tsim ≈ 4-5 μs).

The glass transition temperature (Tg) and the crystallization process

The equilibrated PE system was cooled down at different finite cooling rates, G = 1 to 2,000 K·ns⁻¹ (high enough to prevent polymer crystallization). From these simulations one can plot the apparent Tg versus cooling rate (Figure 1), yielding extrapolated Tg0 values of 187.0 K and 214.1 K for the TraPPe-UA and PYS force fields, respectively. Experimentally, the Tg is usually obtained by extrapolation to an amorphous PE free of the constraints imposed by the presence of crystals, i.e. via the Gordon-Taylor equation. This gives a value of Tg0 = 185-195 K, which is very close to that obtained with the TraPPE-UA force field.
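For a single cooling run, the apparent Tg can be extracted from the simulated specific-volume curve by the usual dilatometric construction: fit straight lines to the glassy and melt branches of v(T) and take their intersection. A minimal numpy sketch of that construction (not the authors' code; the branch cutoffs and data are illustrative):

    import numpy as np

    def apparent_tg(T, v, T_glass_max, T_melt_min):
        # Apparent Tg as the intersection of linear fits to the glassy
        # (T < T_glass_max) and melt (T > T_melt_min) branches of v(T).
        a1, b1 = np.polyfit(T[T < T_glass_max], v[T < T_glass_max], 1)
        a2, b2 = np.polyfit(T[T > T_melt_min], v[T > T_melt_min], 1)
        return (b2 - b1) / (a1 - a2)

    # Synthetic curve with a slope change (a "kink") at 200 K:
    T = np.linspace(100.0, 400.0, 61)
    v = np.where(T < 200.0, 1.00 + 2e-4 * T, 0.96 + 4e-4 * T)
    print(apparent_tg(T, v, T_glass_max=180.0, T_melt_min=220.0))  # ~200 K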

MD simulations at low cooling rates (G = 0.05 to 1 K·ns⁻¹) allow one to study the crystallization process (Figure 2). A sudden drop in the specific volume and a single peak in the specific heat clearly indicate a phase transition at a crystallization temperature, Tc, during cooling. A clear segregation process (semicrystalline state) is observed at the early stages of crystallization: one can distinguish two different layers, one ordered and the other amorphous. The final structure clearly recalls the model proposed experimentally by Ungar and coworkers for quenched long alkanes of similar molecular length.

Towards a better understanding of the phase transitions in polymers by using large molecular dynamics simulations

We have performed simulations to capture the complex behavior of the glass transition and crystallization processes at different cooling rates. This requires large simulation boxes (on the nanometer scale) and long simulations (on the microsecond scale), well suited to GPU processors. We can draw the following general conclusions:

  • The apparent glass transition and its dependence on the cooling rate are well described by the TraPPe-UA force field. In fact, the value extrapolated to experimental conditions is close to that obtained for totally amorphous PE without crystalline constraints (Gordon-Taylor equation).
  • For the first time, the TraPPe-UA force field is employed to simulate the homogeneous early stages of the crystallization process of an entangled n-alkane. The basic experimental facts are correctly described by the simulations, primarily (i) the initial fold length expected for these high supercoolings, and (ii) the segregation of the system into alternating ordered and disordered layers.

Reference and further reading:

  • Javier Ramos, Juan F Vega, Javier Martínez-Salazar, “Molecular Dynamics Simulations for the Description of Experimental Molecular Conformation, Melt Dynamics, and Phase Transitions in Polyethylene”, 2015, Macromolecules, 48(14), 5016-5027.
  • Sara Sanmartín, Javier Ramos, Juan Francisco Vega, Javier Martínez-Salazar, “Strong influence of branching on the early stage of nucleation and crystal formation of fast cooled ultralong n-alkanes as revealed by computer simulation”, 2014, European Polymer Journal, 50, 190-199
  • Binder, K.; Baschnagel, J.; Paul, W. “Glass transition of polymer melts: test of theoretical concepts by computer simulation”, 2003, Progress in Polymer Science, 28(1), 115-172
  • Goran Ungar and Xiang-bing Zeng, “Learning Polymer Crystallization with the Aid of Linear, Branched and Cyclic Model Compounds”, 2001, Chem. Reviews, 4157–4188

Acellera HTMD: A complete software workspace for simulation-guided drug design

Gianni De Fabritiis’ talk at the Boston ACS meeting on 18th August, at 8.30 AM:

HTMD: A complete software workspace for simulation-guided drug design

Abstract: Performing computational experiments using molecular dynamics simulations is still too difficult, and the recent capability of running thousands of simulations has further exacerbated the problem. HTMD is a Matlab-like programmable workspace that provides the power of system preparation, molecular dynamics, Markov state model analysis and visualization at once. As a result, a single short script can lead from the PDB structure to useful quantities such as relaxation timescales, equilibrium populations, conformations and kinetic rates. This enables scientists with minimal background to easily integrate MD into their discovery workflow, drastically reducing errors and improving reproducibility.

COMP: Division of Computers in Chemistry
Room 156A – Boston Convention & Exhibition Center
Publication Number: 153


The Road towards GPU Molecular Dynamics: Part 2

by Matt Harvey

In Part 1 of this series, I recounted how we came to be developing molecular dynamics simulation code for the IBM Cell processor. Back in 2006, the Cell seemed poised to make a profound impact on the high-performance field – even becoming the processor around which the first petascale system, LANL’s “Roadrunner”, was built. It is now barely remembered. Only those well-versed in IBM Kremlinology can say why it was effectively killed off, but quite likely its unpopularity amongst programmers forced to deal with its sheer complexity played no small part in the decision.

Lessons learned while working with IBM’s Cell Processor

The Cell was indeed complex. But its crime was not the complexity per se, but rather that so much of it was exposed directly to the programmer. The absence of any tool-kit providing a clean high-level abstraction of the device meant programmers had to deal with the hardware head-on. Nevertheless, our experience of developing for the Cell this way led us to several important conclusions:

1) Future highly parallel processors would have several levels of parallelism (instruction, vector, thread and program), all of which matter (and interact) when aiming for optimum performance.
2) Partitioned address spaces and different classes of memory would make explicit data movement essential, and with it the need to hide latency by overlapping data movement with compute.
3) Heterogeneous designs that combine different processor types each with complementary capabilities in a single device would be common.

These reinforced our view that developing high performance programs for these future processors would require much more than recompilation of existing applications with a special compiler. Their architectures would be so far away from the dominant single-core CPU + serial/threaded program model that major algorithmic changes would be required, necessitating substantial redesign/refactoring of code. Put in the context of scientific computing, where codes may live for decades, it’s clear that if you force change with every new hardware revision programmers won’t target your hardware.

The implication of this was that for the programmer to have any hope of developing new code that is both optimised and portable to other/future hardware, programming with the right high-level abstraction of the hardware would be critical. In other words, having a high quality set of software development tools would be more important than ever.

So, what happened next?

In 2007, NVIDIA, a company known for its PC graphics cards, released its new model, called the G80. Few people in the HPC space knew much about NVIDIA then – its products were mostly sold to gamers and architects – and so the implications of developments in the 3D graphics field had gone unremarked by most of the scientific computing world. The arrival of the G80 was, in retrospect, a rare moment of revolutionary change when an outsider enters a new field and really shakes it up. So what was so revolutionary about the G80? To understand, we need to take a trip down memory lane.

GeForce 8600GT, the first GPU used for MD.

A Potted History of PC Graphics

In the early 90’s a PC graphics adapter was dumb hardware – all it really did was squirt out the contents of its memory (“frame buffer”) down a cable to a monitor. The host CPU was completely responsible for writing the data representing the color of each display pixel (or character) into the frame buffer.

Having the CPU do all the work was very inefficient, not least because of the slow bus connection, so it became increasingly common for graphics adapters to have some degree of acceleration for simple, common operations: for example, to move a block of pixels from one location to another (“blitting”, useful for moving or scrolling windows), or to fill a whole area with color.

When games with 3D graphics first started to become popular (if you worked in a computer lab in the 00s you surely played Quake death matches!) the graphics scene, typically expressed as a triangulated surface, was constructed and rendered by the CPU. Because the demand for better games graphics outpaced what improvements in CPUs and buses could provide (and because the dollar size of the games market was growing dramatically), many of these geometric primitive operations became implemented in hardware. The GPU was born.

The first GPUs bore all the hallmarks of a device engineered for a specific task – the hardware was a set of relatively simple fixed-function blocks, all pipelined together, with each unit offloading one fixed aspect of the rendering and display process. Two of the more important functions of this pipeline, texturing and lighting (the processes of making a surface look more realistic by covering it with an appropriate image and illuminating it), are quite demanding of hardware, so GPUs rapidly began to acquire their own large, very-high-bandwidth memories as well as limited floating-point capabilities.

By the early 2000’s, GPUs were starting to boast performance figures high enough to cause some people in scientific computing, myself included, to start to wonder if they might be useful for other things beyond making pretty pictures. Superficially the GPUs of the day looked highly appealing – they had lots of memory bandwidth, reasonable floating point capability and low cost. Unfortunately, that’s where the good news stopped!

GPUs were still very much fixed-function devices and, furthermore, were only programmable through graphics API languages (NVIDIA’s Cg and OpenGL’s GLSL). To re-purpose them you had to have an algorithm that could map directly onto the structure imposed by the programming abstraction, in effect making the computation look like a special case of shading, with the input as a texture image and the output as the screen frame buffer. Not much fun!

To compound matters, the bus that connected the GPU to the host computer (called AGP) was horrendously slow – any time saved by doing the compute on the GPU would be lost in the ponderous copy of the results back to the host.

When we were looking for novel processors to develop for we considered, but ultimately dismissed, the contemporary GPUs because of these short-comings, although they are – in essence – more extreme versions of the problems of parallelism, data locality and architectural abstraction that we encountered with Cell.

NVIDIA brings GPUs to HPC

Returning to the G80: NVIDIA had clearly seen the short-comings of its fixed-function hardware, and realised that a more general-purpose architecture would give it a march on the competition by offering the programmer greater flexibility. The G80 was designed around a fully programmable core and, although fixed-function hardware remained, much of the functionality was now performed in software.

Interesting though this was, what made it really significant in the world beyond 3D was that NVIDIA coupled the G80 release with a new programming language, called CUDA, which provided an expressive programming model that embodied an elegant hardware abstraction.

NVIDIA had clearly set its sights on expanding into the HPC space, and it has done so very successfully – from having no presence at all in 2007, by November 2010 it had acquired the number one spot on the Top 500 list of supercomputers, with the 2.5-petaflop Tianhe-1A system.

In the next post, we’ll dive into the architecture of the modern NVIDIA GPU.


AceCloud – Simplified, Limitless Molecular Dynamics Simulations on the Cloud

by Matt Harvey, CTO

Physics-based computer simulation offers a tremendously powerful way to gain insight into the behavior of a wide variety of complex systems. At its most successful, simulation has become a ‘third way’ for scientists and engineers, complementing analytical and experimental methods. In engineering in particular, computational fluid dynamics simulation and finite element analysis are now an integral part of any design effort.

Similarly, we believe that physics-based simulation, in the form of molecular dynamics simulation, has great potential to become one of the basic tools in biochemical and biophysical R&D, and to establish itself as a robust tool in the drug discovery pipeline. Recent work, for example, has demonstrated the ability of MD simulation to produce quantitative estimates of small molecule binding kinetics (PNAS) and of conformational selection in proteins (Nat. Comm., and blog).

Why perform molecular dynamics simulations on the cloud?

Currently, the amount of MD simulation required to undertake these types of studies is often beyond the reach of anyone without access to dedicated high-performance computing resources. To help overcome this critical limiting factor and bring MD simulation to a wider audience, we are pleased to introduce our new product AceCloud.

AceCloud is designed to free you from the constraints of your workstation and – through the use of cloud computing technology – allow you to run hundreds of simulations with the same ease as running one, with no additional setup.


Performing cloud molecular dynamics simulations with AceCloud

Accessing AceCloud is a simple matter of using three new commands built into the ACEMD molecular dynamics software package: for running simulations, retrieving results, and monitoring progress. No additional knowledge is required – our software takes care of all of the interaction with the Cloud. Here’s a video of AceCloud in action:


No changes are required to your existing ACEMD simulations, and all features are supported, including extensions and plugins such as PLUMED Metadynamics. As a bonus, users who are already familiar with Gromacs may run their existing simulation inputs on AceCloud, without the need to make any conversions.

The compute resources behind AceCloud are GPU-based and allow simulation rates of over 100 ns/day for the DHFR benchmark, making them around 40% as fast as the very latest GTX980s in our Metrocubo workstation, but still offering a compelling, cost-effective level of performance.

AceCloud Costs

AceCloud uses Amazon Web Services (AWS) technology to dynamically provision compute hardware as and when you require it. The only charges for AceCloud are for the time used, from less than 0.40 US$ per hour. There are no minimum or on-going fees. There is no setup required and you can start using it at your convenience in just a few steps. Billing is hassle free and it is done through your own existing Amazon account.

You can calculate representative costs using the AceCloud Cost Estimator on the AceCloud page (this estimate already includes data transfer costs).

Start using AceCloud Today!

AceCloud is available right now. Visit our website for details on getting started.


Chlorobenzene as a new halogen bond probe in molecular dynamics simulations

By Zoe Greenburg

Non-covalent interactions play a crucial role in chemistry, biochemistry, and biology. H-bonds and hydrophobic interactions are at the core of protein folding, and govern reversible interactions between biomolecules and their ligands. In drug discovery, H-bonding and hydrophobic interactions define the affinity and selectivity of small molecules for their biological targets.

With the intent to better understand these interactions and their relevance in the design of small molecule modulators of biological function, several computational solvent-mapping methods have been developed over the past few years. Schrodinger introduced WaterMap, a method for understanding the locations and thermodynamics of the water molecules that solvate binding sites and for guiding the optimization of small-molecule hits; conserved, strongly bound water molecules are costly to displace but can be useful bridges. An alternative method for understanding the role of water in ligand binding, which uses all-atom molecular dynamics (MD) and can be accessed through Acellera, was developed by De Fabritiis et al. in 2011. Mapping of other solvents such as ACN, propane, benzene, and IPA has also been demonstrated with MD simulations of binary mixtures, and has been put in the context of target druggability assessment.

A new category of substrate to study: organohalogens

Halogens are common substituents in drugs, as halogen bonding interactions can offer key boosts in selectivity and potency. Initially attributed to van der Waals forces, these interactions are now explained in terms of the halogen bond, another type of non-covalent interaction that in recent years has been exploited for many applications in synthetic, supramolecular and medicinal chemistry.

Anisotropy in the halogen electron charge distribution allows halogens to interact with donors such as Lewis base centers and π-surfaces, something that, if one considers electron density alone, can be rather counterintuitive. Halogen-donor interactions strengthen with the polarizability of the halogen’s surrounding electron cloud and with the withdrawing power of adjacent atoms. These observations are now explained using the sigma-hole concept, which defines a region of positive charge at the surface of the halogen, along the covalent bond axis.

Within the complex structure of a protein, there are many atoms to which a halogen-containing substrate can form a halogen bond. Halogen – usually chlorine – interactions with backbone and side-chain carbonyls, as well as with side-chain alcohols, thiols and thioethers, amines and the surfaces of aromatic rings, can affect the conformation of the protein as well as the affinity that other binding sites on the protein have for other substrates, such as water. This is, however, difficult to predict, even though doing so could be highly beneficial in guiding structure-based drug design. Experimental techniques that allow solvent mapping, such as multiple solvent crystal structures (MSCS), can be effective but are time-intensive and expensive to run, while the lack of an ideal strategy and/or representative force fields has limited computational approaches.

Computational halogen bond mapping with molecular dynamics simulations

Recently, Verma and collaborators proposed a molecular dynamics based method for halogen bond mapping. This study aimed to observe the effect of a chosen probe, in this case chlorobenzene, on the binding sites of four different proteins using molecular dynamics. Chlorobenzene is potentially a bifunctional probe: the sp2-bonded chlorine can probe for halogen bonds while the aromatic ring can probe for π-stacking or hydrophobic sites. For proof of concept, MDM2 (PDB 1YCR), MCL-1 (PDB 3KJ0), interleukin-2 (PDB 1Z92), and Bcl-xL (PDB 1BXL) were selected because of their involvement in protein-protein interactions, the availability of structural data, and the well-reported structural changes they undergo upon substrate binding, which can expose otherwise hidden halogen bonding sites. All-atom MD simulations were selected for the development of protein-small molecule interaction maps because the other computational methods considered could not capture the protein’s dynamics sufficiently well, and Lexa and Carlson showed this to be crucial for effective solvent interaction mapping.

For chlorobenzene preparation the authors used GAFF. Atomic charges were derived using the R.E.D. Server by fitting restrained electrostatic potential charges to a molecular electrostatic potential computed with Gaussian. One peculiarity of chlorobenzene compared to benzene, a molecule the authors had also used for mapping in a previous study, was its propensity to aggregate at lower concentrations, which forced the authors to reduce the substrate concentration by 25%, to 0.15 M, and to increase the sampling time to 10 ns per trajectory.

Using molecular dynamics simulations to construct halogen affinity maps

For MDM2, all six halogen bonding sites found in the PDB structures were recovered by the proposed method. Three of these, known to be occupied by a halogenated benzodiazepine inhibitor, and two others, occupied by Nutlin-2, were recovered at the first attempt by the mapping protocol. The sixth was recovered after modification of the PDB structure, as the initial structure lacked an important sequence of residues at the N-terminus. In addition, four cryptic hydrophobic binding sites were detected that were not present in the initial input conformation of the protein. The existence of these sites could be validated against results available in the literature. For the other three targets the authors report similar results.
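Conceptually, such affinity maps are built by pooling the probe-atom coordinates (here, the chlorine of chlorobenzene) over all trajectory frames and binning them onto a 3D grid; voxels with occupancy well above the bulk value flag putative halogen-bonding hot spots. A minimal numpy sketch of this binning step (illustrative only, not the authors' protocol):

    import numpy as np

    def occupancy_grid(positions, box_min, box_max, spacing=1.0):
        # Bin pooled probe-atom positions, shape (N, 3) in Angstrom,
        # into a 3D occupancy grid normalized to a relative frequency.
        edges = [np.arange(lo, hi + spacing, spacing)
                 for lo, hi in zip(box_min, box_max)]
        hist, _ = np.histogramdd(positions, bins=edges)
        return hist / len(positions)

    # Placeholder coordinates standing in for Cl positions from a trajectory:
    cl_xyz = np.random.uniform(0.0, 60.0, size=(100000, 3))
    grid = occupancy_grid(cl_xyz, box_min=(0, 0, 0), box_max=(60, 60, 60))
    hotspots = np.argwhere(grid > 5 * grid.mean())  # voxels well above bulk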

All-atom MD simulation with chlorobenzene can be used to map halogen and hydrophobic binding sites

Using molecular dynamics simulations and the chlorobenzene parameters described above, the authors could map halogen and hydrophobic binding sites on all four tested proteins. The results were consistent with the available experimental data and show the potential of this method in structure-based drug design. As we have come to see in our own work at Acellera probing libraries of fragments with all-atom MD simulations, the choice of force field is difficult for halogenated molecules. The ability of the force fields used in this paper to reproduce the binding sites was remarkable, and could potentially be harnessed for broader use in computer-aided drug design.


Kinetic modulation of a disordered protein domain by phosphorylation

by Santi Esteban

Protein phosphorylation plays key roles in many signal transduction processes, preferentially targeting intrinsically disordered protein domains (IDPs). IDPs remain largely unstructured under native conditions, resembling random-coil polymers akin to the unfolded states of proteins. Present in more than 50% of eukaryotic proteins, IDPs perform a plethora of biological functions and are commonly associated with a variety of human diseases.

Technical limitations hamper intrinsically disordered protein characterization

The mechanism by which phosphorylation modulates disordered domains remains unknown. This stems from the extraordinary structural heterogeneity of IDPs, which poses a challenge for their characterization using common experimental structure-determination techniques. For instance, the study of IDPs is not viable with X-ray crystallography because of the fast mobility of the domains, and NMR does not allow one to single out the structural details of the multiple underlying conformational states.

Can intrinsically disordered proteins be characterized using molecular dynamics simulations?

Recent advances in computer simulation technology offer a unique opportunity to study IDPs with atomic resolution on the millisecond timescale, the regime necessary to study these systems. The introduction of accelerator processors and their application to MD, as well as the development of HTMD and adaptive sampling methods, have allowed an increase in sampling efficiency of several orders of magnitude. Indeed, a single workstation can now produce microsecond-long simulations with atomic resolution in one day, and milliseconds can be obtained rapidly on a supercomputer or through distributed computing projects. Nevertheless, this burgeoning field is still limited in its ability to analyse millions of protein conformers and to compute their populations and kinetics.

All-atom MD for intrinsically disordered protein characterization

In this work (Nat. Comm. 2014, DOI: 10.1038/ncomms6272), we successfully reconstructed the conformational ensemble of an IDP domain involved in gene transcription. A new analysis tool capable of resolving the disordered state of these domains was used in combination with large-scale, all-atom, unbiased molecular simulations, with 1.7 milliseconds of aggregate time sampled. This methodology was used to study a well-characterized disordered fragment, the kinase-inducible domain (KID) of the cellular transcription factor CREB (cAMP response element-binding protein).

Identification of slow molecular order parameters for Markov model construction

The kinetic characterization of macromolecular systems requires the description of slow relaxation processes. This depends on the identification of the structural changes involved in these processes (selection of structural descriptors), the discretization of the high-dimensional coordinate space (clustering of conformers according to the descriptors) and estimation of the rates or timescales at which these slow processes occur. Approaches to this end include Markov models, master-equation models, and kinetic network models.

In this work we analyzed the simulated data by constructing a Markov state model of the entire ensemble of trajectories, using inter-residue Cα-Cα distances and φ/ψ backbone dihedral angles as the general metrics for the kinetic model. The conformational space was discretized into one thousand clusters and projected onto a 5-dimensional space using time-lagged independent component analysis (TICA), which identifies the slow coordinates in the dynamics without relying on subjective guesses.
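A sketch of this kind of pipeline in PyEMMA, in the standard TICA-then-clustering order (not the authors' exact code; the topology and trajectory file names and the lag times are placeholders):

    import pyemma

    # Featurize: inter-residue Ca-Ca distances plus phi/psi backbone torsions.
    feat = pyemma.coordinates.featurizer('kid.pdb')
    feat.add_distances_ca()
    feat.add_backbone_torsions(cossin=True)
    inp = pyemma.coordinates.source(['traj-001.xtc', 'traj-002.xtc'],
                                    features=feat)

    # Project onto 5 slow coordinates with TICA, discretize into 1000 clusters,
    # then estimate the Markov state model and read off its slow timescales.
    tica = pyemma.coordinates.tica(inp, lag=100, dim=5)
    clusters = pyemma.coordinates.cluster_kmeans(tica, k=1000, max_iter=50)
    msm = pyemma.msm.estimate_markov_model(clusters.dtrajs, lag=100)
    print(msm.timescales()[:5])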

Phosphorylation reduces the conformational kinetics of intrinsically disordered proteins

The results presented in Figure 1 show the first microscopic view of the equilibrium states and conformational kinetics of a disordered protein domain at atomic resolution. The conformational exchange between state D (disordered) and state O (ordered) is shown together with the forward (τ1) and backward (τ-1) transition times. Conformations were aligned using the residues with the lowest Cα-Cα distance variance; the chain is color-coded from red (N-terminus) to blue (C-terminus).

We find that phosphorylation induces a 60-fold slowdown in conformational kinetics compared to the non-phosphorylated domain, involving a sparsely populated and partially structured excited state known to participate in an early binding intermediate (Fig. 2). Figure 2a shows residue-specific relaxation times of the helix folding/unfolding process (filled bars) derived from autocorrelation functions. Empty bars show chemical shift changes in pKID that map the residues participating in an early binding intermediate as detected by NMR. Figure 2b shows an example of a kinetically locked conformation in phosphorylated KID. The peptide backbone is shown in cartoon representation, and the heavy atoms of phosphorylated serine 133 and several charged residues are shown as sticks.

Direct comparison against NMR data supports the validity of these findings and shows that, contrary to what is typically assumed and in line with experiments, mutation of the phosphoserine to glutamate cannot recover the slowdown in conformational exchange (Fig. 1c).

Is the regulation of phosphorylation via the kinetic modulation of intrinsically disordered proteins conserved?

We propose a new mechanism of phosphorylation regulation by kinetic modulation, which could be general for disordered domains that lack a well-defined equilibrium structure. This mechanism emerges from a theoretically derived kinetic model showing that it is possible to modulate the binding affinity of protein-protein complexes by inducing a slowdown in conformational kinetics that affects both the on- and off-conformational transition times simultaneously and by the same amount. This model agrees with the kinetic and affinity values obtained through experiment (see paper).

With disordered domains taking part in more than 50% of the proteins responsible for signaling in the cell, this kinetic mechanism of modulation highlights a further way by which post-translational modifications may affect disordered domains and their interactions with binding partners.

This work was performed in the De Fabritiis lab. Gianni De Fabritiis is the founder of Acellera. Should you be interested in performing similar studies in-house, let Acellera know; we will be happy to help.
