High Performance Computing

A new coarse-grained potential for molecular simulation of proteins



Francesca COLLU


We introduce a non-radial potential term for coarse-grained (CG) molecular dynamics (MD) simulations of proteins. This term mimics the backbone dipole-dipole interactions and accounts for the needed directionality to form stable folded secondary structure elements. We show that β-sheet and α-helical peptide chains are correctly described in dynamics without the need of introducing any a priori bias potentials or ad hoc parameterizations, which limit broader applicability of CG simulations for proteins. Moreover, our model is able to catch formation of super-secondary structural motifs, like transitions from long single a-helices to helix-coil-helix assemblies. This novel scheme requires the structural information of Cα beads only, it does not introduce any additional degree of freedom to the system, and has a general formulation, which allows it to be used in synergy with various CG protocols, leading to an improved description of the structural and dynamic properties of protein assemblies and networks. Moreover, our scheme has been implemented and optimized for a highly parallelized MD code (LAMMPS), which allows us to use large HPC resources for the investigation of the function of large protein assemblies.


Molecular dynamics (MD) simulations are a powerful tool to investigate the structure and dynamics of biomolecules at the atomistic dimensionality [1]. Nonetheless, it remains still unaffordable to thoroughly sample size and time scales that are critical to most of the biological processes within the atomistic framework [2]. Thus, various coarse-grained (CG) models have been recently introduced to overcome these limitations, and tackle relevant problems such as membrane and proteins assembly, [3,4] and protein folding [5,6]. From the first topological models based on (an)elastic networks [7-9], CG potentials that adopt a force-field like functional form have more recently come out [10-13]. These protocols are of potential broad applicability for the study of dynamics of large macromolecular assemblies. At present, the functional form of the CG potential is not univocally defined, and depends on the level of coarse-graining one chooses to perform. Moreover, fully transferable parameters for the potential function have not been determined yet. Therefore, for each system, sets of new parameters have to be constantly produced to keep maximal accuracy of CG simulations [14,15]. The lack of both universality and transferability results in significant drawbacks, which limit in turn their general applicability. In particular, CG models face difficulties in reproducing anisotropic properties, which are crucial for accounting stability in secondary structure elements. Thus, it is usually required to introduce additional biased potentials [10-13], which are defined on a target conformation and do not naturally adapt to secondary structure modifications, which may occur during protein dynamics. One of the reasons for such problems lays in the fact that CG interaction potentials are usually represented in an analytical form similar to that of molecular ones [10-14]. In fact, non-bonded potentials in CG models mimic effective interactions among beads, each representing a specific group of atoms. The electrostatic potential of a group of atoms expressed as a function of a single center is formally defined by its multipolar expansion:br> V{(\vec{R})} = \sum_{{i}{\in}{bead}} {\frac{q_i}{\left|{\vec{r_i} - {\vec{R}}}}\right|} \approx {\frac{Q_{tot}}{R}} + {\frac{{\vec\mu}.{\vec{R}}}{R^3}}  + O(quadrupole),      (1)
As, in most of the cases, CG beads are designed to represent group of atoms with negligible total charge Qtot, it follows that CG electrostatic non-bonded interactions should retain a strong non-radial component. In particular, permanent dipoles should somehow be kept in the CG representation, as they constitute fundamental quantities to describe the physics of proteins [16,17].

Computational Methods

Here, we introduce a non-radial potential term, which mimics the backbone dipole-dipole interactions, and which naturally stabilizes and modulates elementary secondary structure motifs, such as α-helices and β-sheets.
Parameterized potential functions in molecular dynamics usually take the form:

{{V_{AA}} = {\sum_{bonds}} k_r \left({r - r_0}\right)^2 + {\sum_{angles}} k \left({{\gamma} - {\gamma}_0}\right)^2 + \sum_{dihedrals} \left[{1 + cos(n\varphi + \delta)}\right] + \sum_{i<j} \left({\frac{A_{i,j}}{R_{ij}^{12}}} -{B_{i,j}}{R_{ij}^{6}}}\right) + \sum_{i<j} {\frac{q_iq_j}{R_{ij}}} },      (2)
where the first three terms represent stretching, bending and torsional potentials, and the last two terms are non-bonded Van der Walls and Coulomb interactions. Our model potential for CG backbone beads is expressed in a similar form, but with some important modifications. In particular we have changed the functional form of the angular potential, we have introduced a correlation term for consecutive dihedral angles and we have replaced point-charge Coulomb interactions with backbone dipole-dipole interactions:

{{V_{CG}} = {\sum_{bonds}} k_r \left({r - r_0}\right)^2 + {\sum_{angles}} P^{(4)} \left({{\gamma} - {\gamma}_0}\right)^2 + \sum_{dihedrals}  k_{\varphi} \left[{1 + cos(n{\varphi} + {\delta})}\right] + \sum_{dihedrals -1} k_{corr} \left({{\varphi}_i - {\varphi}_{i+i}}\right)^2 + \sum_{k<j} \left({\frac{A_{kl}}{R_{kl}^{12}}} -{B_{kl}}{R_{kl}^{6}}}\right) + sum_{i<j} V \left( \vec\mu_i (C_i), \vec\mu_j (C_j) \right) },      (3)

In equation (3), indices k and l run on all beads, while indices i and j run on all goups of three consecutive bonded Cα,i-1, Cα,i,Cα,i+1 beads (called Ci triplet, see Figure 1)
Our model uses an implicit reconstruction scheme to derive the spatial orientation of the backbone dipoles, as defined by the Ci triplet [18]. The dipole is located at the middle of the bond between second and third bead, and its orientation is determined by defining its projection along such axis as a constant, while the precession angle θ around the same axis is an analytical function of the bending angle γ between the three beads (Figure 1). The intensity of backbone dipole (μ) is a constant (3.7 Debye). The dipole-dipole interaction potential is therefore expressed as an explicit function of Cα coordinates only. Thus, forces and torques acting on dipoles can be easily distributed on triplets. In particular, the forces on the dipole produce a rigid roto-translation of the triplet, apart from the component that directly acts on the angle θ. Such interaction transfers into a bending of the angle θ, according to its analytical relation with the precession angle θ (Figure1).

JPEG - 15.4 kb
fig. 1
Forces (F) and torques (t) acting on a backbone dipole transfer to the Cαtriplets of the protein scaffold. F and the component of t lying on the (n,m) plane orthogonal to the Cα,i - Cα, i+1direction produce a rigid roto-translation of the triplet. The component tv oriented along theCα,i - Cα, i+1 bond modulates the bending angle g following the function θ=θ(γ) reported in the inset. The inset shows the statistical distribution θ=θ(γ) extracted from the PDB (square symbols) and its analytical fitting function (solid line).


The scheme was tested on secondary structure motifs such as α-helices and β-sheets. Polypeptide chains were modeled as nonspecific amino-acid sequences represented by single Cα beads. Starting configurations for the α- and β-structures were taken from canonical conformations extracted from the PDB, and were initially equilibrated in MD using the alanine parameters of the MARTINI force-field [10]. Then, MD simulations were carried on using a modified Hamiltonian in which the new non-radial term was added to the initial potential function. Original Lennard-Jones parameters were recalibrated to avoid overestimation of non-bonded contributions, while bonded interactions were modified to unbias the potential function with respect to the initial structure. MD simulations were performed at 298K in the canonical ensemble in the μs timescale.

We find that the 1-4 and 1-5 pitches of the helical turn are d1-4=5.2(2) Å and d1-5=5.9(5) Å, respectively, whereas the inter-strand distance and orientation in the β-hairpin are d=5.1(1) Å and φ=91(8) deg.
These relevant structural parameters are in very good agreement with experimental values (Table 1).

Present resultsExperimental values
β-strandsd*=5.1(1); ϕ=91(8)d=4.8(1); ϕ=90(6)
α-helicesd1-4=5.2(2); d1-5=5.9(5) d1-4=5.1(1); d1-5=6.2(1)

* d(Å): inter-strand distance; φ(deg): inter-strand torsional angle; d1-4 ,d1-5 (Å): helical pitches
Table 1 - Structural parameters of α-helices and b-sheets in CG MD

β-Sheets remain naturally stable, with an rmsd as small as 1.6(2) Å from the native structure. In particular, their structure shows conserved eclipsed configurations between corresponding Cα’s in the two strands, and the correct overall averaged parallel orientation of their planes (Table 1).
These structural features cannot be easily described using standard CG models. In fact, radial potentials do not discriminate among those configurations, which keep a similar number of contacts between the two strands. This results into a disordered ensemble of structures, different from the native one, which are found during standard CG MD runs. Short α-helical peptides (i.e. 8-12 aa-long chains) were found to be invariantly stable at low and room temperature, and maintained ideal helical conformation during MD, in agreement with previous findings [19]. Longer helices with nonspecific aa sequences tend to undergo conformational transitions leading to super-secondary structure elements like, e.g., helix-coil-helix or β-hairpin motifs [19]. A similar behavior is found in our simulations of longer α-helices (25 aa-long chain), which remain stable at low temperature, but rapidly undergo conformational transition at room temperature, to form a stable helix-loop-helix structure (Figure 2).

JPEG - 13.4 kb
fig. 2
Conformational transition from a-helix to helix-coil-helix super-secondary structure. Time evolution at 298K of the N-C term distance (grey line, average value over 1000 steps in black line). Representative structures found during MD are reported.

Such transition is naturally stabilized by the dipolar coupling of the antiparallel helical conformation. Our protocol, in fact, correctly describes helical N-to-C polarity by producing realistic macromolecular dipoles, which are fundamental ingredients for the correct assembly of helical bundles.


Our protocol is able to introduce backbone dipole interactions in CG MD simulations of proteins, which, in turn, allow an unbiased representation of stable secondary structure elements, as well as prediction of their dynamical arrangement into super-secondary structure assemblies. The proposed directional potential has a general form, and in principle can be coupled to existing CG protocols (single or multibead), which retain structural information about the Cα trace of proteins. Our scheme constitutes a promising step towards the development of a more universal and transferable CG force field, not plagued with knowledge-based biases on the secondary structure. In particular, the directionality of the backbone structure is directly connected to the bending angle. Thus, the secondary structure propensity of amino-acids, which is chemically encoded in the side-chain, can be elegantly controlled by using the backbone bending potential as order parameter [7,12]. The present results anticipate the development of a new CG force field able to take into account intrinsic anisotropy of protein structures, leading to an improved description of the structural and dynamic properties of protein assemblies and networks.
Furthermore, the implementation of our scheme in the LAMMPS MD code (lammps.sandia.gov), which is highly parallelized and efficiently runs on current HPC architectures allows us to investigate with improved accuracy very large protein assemblies for timescales close to the experimental setup (of the order of microseconds and more). Finally, the recent advent of high-performance GPU computing will permit to access similar time/size-scales for CG protein dynamics at a reduced cost by porting the present scheme on GPU-based hardware.


This work is funded by the Swiss National Science Foundation (grants PP0022_118930 and 200021_122120). We thank Dr. A. Giorgetti, E. Spiga, and M. Degiacomi for useful discussions.


[1] Karplus M., McCammon J. A., Nat Struct Biol 2002, 9 (9), 646 652.
[2] Klein M. L., Shinoda W., Science 2008, 321 (5890), 798-800.
[3] Marrink S. J., de Vries A. H., Harroun T. A.; Katsaras J., Wassall S. R., J Am Chem Soc 2008, 130 (1), 10-11.
[4] Bond P. J., Sansom M. S. P., J Am Chem Soc 2006, 128 (8), 2697-2704.
[5] Tozzini V., Curr Opin Struc Biol 2005, 15 (2), 144-150.
[6] Ayton G. S., Noid W. G., Voth G. A., Curr Opin Struc Biol 2007, 17 (2),192-198.
[7] Levitt M., Warshel A., Nature 1975, 253 (5494), 694-698.
[8] Tanaka S., Scheraga H. A., Macromolecules 1976, 9 (6), 945-50.
[9] Go N., Scheraga H. A., Macromolecules 1976, 9 (4), 535-542.
[10] Monticelli L., Kandasamy S. K., Periole X., Larson R. G., Tieleman D. P., Marrink S. J., J Chem Theory Comput 2008, 4 (5), 819-834.
[11] Arkhipov A., Yin Y., Schulten K., Biophys J 2008, 95 (6), 2806-2821.
[12] Tozzini V., Rocchia W., McCammon J. A., J Chem Theory Comput 2006, 2 (3), 667-673.
[13] Bond P. J., Wee C. L., Sansom M. S. P., Biochemistry 2008, 47 (43),11321-11331.
[14] Noid W. G., Chu J. W., Ayton G. S., Krishna V., Izvekov S., Voth G. A., Das A., Andersen H. C., J Chem Phys 2008, 128 (24), 244114.
[15] Noid W. G., Liu P., Wang Y., Chu J. W., Ayton G. S., Izvekov S., Andersen H. C., Voth G. A., J Chem Phys 2008, 128 (24), 244115.
[16] Warshel A., Russell S. T., Q. Rev. Biophys. 1984, 17, 283-422
[17] Hoang T. X., Trovato A., Seno F., Banavar J. R., Maritan A., Proc Natl Acad Sci USA 2004, 101 (21), 7960-7964.
[18] Cascella M., Neri M. A., Carloni P., Dal Peraro M., J Chem Theory Comput 2008, 4 (8), 1378-1385.
[19] Blondelle S. E., Forood B., Houghten R. A., PerezPaya E., Biochemistry 1997, 36 (27), 8393-8400.

Cherchez ...

- dans tous les Flash informatique
(entre 1986 et 2001: seulement sur les titres et auteurs)
- par mot-clé


Cette page est un article d'une publication de l'EPFL.
Le contenu et certains liens ne sont peut-être plus d'actualité.


Les articles n'engagent que leurs auteurs, sauf ceux qui concernent de façon évidente des prestations officielles (sous la responsabilité du DIT ou d'autres entités). Toute reproduction, même partielle, n'est autorisée qu'avec l'accord de la rédaction et des auteurs.

Archives sur clé USB

Le Flash informatique ne paraîtra plus. Le dernier numéro est daté de décembre 2013.

Taguage des articles

Depuis 2010, pour aider le lecteur, les articles sont taggués:
  •   tout public
    que vous soyiez utilisateur occasionnel du PC familial, ou bien simplement propriétaire d'un iPhone, lisez l'article marqué tout public, vous y apprendrez plein de choses qui vous permettront de mieux appréhender ces technologies qui envahissent votre quotidien
  •   public averti
    l'article parle de concepts techniques, mais à la portée de toute personne intéressée par les dessous des nouvelles technologies
  •   expert
    le sujet abordé n'intéresse que peu de lecteurs, mais ceux-là seront ravis d'approfondir un thème, d'en savoir plus sur un nouveau langage.