  • GENESIS CGDYN: large-scale coarse-grained MD simulation with dynamic load balancing for heterogeneous biomolecular systems

    Jaewoon Jung, Cheng Tan, Yuji Sugita
    Residue-level coarse-grained (CG) molecular dynamics (MD) simulation is widely used to investigate slow biological processes that involve multiple proteins, nucleic acids, and their complexes. Biomolecules in a large simulation system are distributed non-uniformly, limiting computational efficiency with conventional methods. Here, we develop a hierarchical domain decomposition scheme with dynamic load balancing for heterogeneous biomolecular systems to keep computational efficiency even after drastic changes in particle distribution. These schemes are applied to the dynamics of intrinsically disordered protein (IDP) droplets. During the fusion of two droplets, we find that the changes in droplet shape correlate with the mixing of IDP chains. Additionally, we simulate large systems with multiple IDP droplets, achieving simulation sizes comparable to those observed in microscopy. In our MD simulations, we directly observe Ostwald ripening, a phenomenon where small droplets dissolve and their molecules redeposit into larger droplets. These methods have been implemented in CGDYN of the GENESIS software, offering a tool for investigating mesoscopic biological processes using the residue-level CG models.


  • Micelle-like clusters in phase-separated Nanog condensates: A molecular simulation study

    Azuki Mizutani, Cheng Tan, Yuji Sugita, Shoji Takada
    The phase separation model for transcription suggests that transcription factors (TFs), coactivators, and RNA polymerases form biomolecular condensates around active gene loci and regulate transcription. However, the structural details of condensates remain elusive. In this study, for Nanog, a master TF in mammalian embryonic stem cells known to form protein condensates in vitro, we examined protein structures in the condensates using residue-level coarse-grained molecular simulations. Human Nanog formed micelle-like clusters in the condensate. In the micelle-like cluster, the C-terminal disordered domains, including the tryptophan repeat (WR) regions, interacted with each other near the cluster center primarily via hydrophobic interaction. In contrast, hydrophilic disordered N-terminal and DNA-binding domains were exposed on the surface of the clusters. Electrostatic attractions of these surface residues were responsible for bridging multiple micelle-like structures in the condensate. The micelle-like structure and condensate were dynamic and liquid-like. Mutation of tryptophan residues in the WR region, which has been implicated in Nanog function, resulted in dissolution of the Nanog condensate. Finally, to examine the impact of Nanog clusters on DNA, we added DNA fragments to the Nanog condensate. Nanog DNA-binding domains exposed on the surface of the micelle-like cluster could recruit more than one DNA fragment, shortening the DNA-DNA distance.
  • Extension of the iSoLF implicit-solvent coarse-grained model for multicomponent lipid bilayers

    Diego Ugarte La Torre, Shoji Takada, Yuji Sugita
    iSoLF is a coarse-grained (CG) model for lipid molecules with the implicit-solvent approximation used in molecular dynamics (MD) simulations of biological membranes. Using the original iSoLF (iSoLFv1), MD simulations of lipid bilayers consisting of either POPC or DPPC, with or without membrane proteins, can be performed. Here, we improve the original model, explicitly treating the electrostatic interactions between different lipid molecules and adding CG particle types. As a result, the available lipid types increase to 30. To parameterize the potential functions of the new model, we performed all-atom MD simulations of each lipid at three different temperatures using the CHARMM36 force field and the modified TIP3P model. Then, we parameterized both the bonded and non-bonded interactions to fit the area per lipid and the membrane thickness of each lipid bilayer by using the multistate Boltzmann Inversion method. The final model reproduces the area per lipid and the membrane thickness of each lipid bilayer at the three temperatures. We also examined the applicability of the new model, iSoLFv2, to simulate the phase behaviors of mixtures of DOPC and DPPC at different concentrations. The simulation results with iSoLFv2 are consistent with those using Dry Martini and Martini 3, although iSoLFv2 requires far less computation. iSoLFv2 has been implemented in the GENESIS MD software and is publicly available.
  • Acceleration of generalized replica exchange with solute tempering simulations of large biological systems on massively parallel supercomputer

    Jaewoon Jung, Chigusa Kobayashi, Yuji Sugita
    Generalized replica exchange with solute tempering (gREST) is one of the enhanced sampling algorithms for proteins or other systems with rugged energy landscapes. Unlike the replica-exchange molecular dynamics (REMD) method, solvent temperatures are the same in all replicas, while solute temperatures are different and are exchanged frequently between replicas for exploring various solute structures. Here, we apply the gREST scheme to large biological systems containing over one million atoms using a large number of processors in a supercomputer. First, communication time on a multi-dimensional torus network is reduced by matching each replica to MPI processors optimally. This is applicable not only to gREST but also to other multi-copy algorithms. Second, energy evaluations, which are necessary for the multistate Bennett acceptance ratio (MBAR) method for free energy estimations, are performed on-the-fly during the gREST simulations. Using these two advanced schemes, we observed 57.72 ns/day performance in 128-replica gREST calculations of a 1.5-million-atom system using 16,384 nodes on Fugaku. These schemes implemented in the latest version of GENESIS software could open new possibilities to answer unresolved questions on large biomolecular complex systems with slow conformational dynamics.
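
    The exchange step described above can be sketched as follows. This is a minimal illustration, not the GENESIS implementation: it assumes the REST2-style energy scaling (solute-solute energy scaled by β_m/β_0, solute-solvent energy by its square root, solvent-solvent energy unscaled), which gREST generalizes, and the function names `rest2_reduced_energy` and `exchange_accepted` are hypothetical.

```python
import math
import random

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def rest2_reduced_energy(e_uu, e_uv, e_vv, t_solute, t0):
    """Reduced (dimensionless) energy of one configuration in one replica.

    Assumed REST2-style scaling: solute-solute (e_uu), solute-solvent (e_uv)
    and solvent-solvent (e_vv) energies in kcal/mol, solute temperature
    t_solute and common solvent temperature t0 in K.
    """
    beta0 = 1.0 / (KB * t0)
    scale = t0 / t_solute  # equals beta_m / beta_0
    return beta0 * (scale * e_uu + math.sqrt(scale) * e_uv + e_vv)


def exchange_accepted(u_ii, u_jj, u_ij, u_ji, rng=random):
    """Metropolis test for swapping the configurations of replicas i and j.

    u_ab is the reduced energy of replica b's configuration evaluated with
    replica a's parameters.
    """
    delta = (u_ij + u_ji) - (u_ii + u_jj)
    if delta <= 0.0:
        return True
    return rng.random() < math.exp(-delta)


# Two neighbouring replicas with made-up energy components (kcal/mol).
conf_i = dict(e_uu=-120.0, e_uv=-310.0, e_vv=-9000.0)
conf_j = dict(e_uu=-115.0, e_uv=-305.0, e_vv=-9010.0)
t0, t_i, t_j = 300.0, 300.0, 310.0

u_ii = rest2_reduced_energy(t_solute=t_i, t0=t0, **conf_i)
u_jj = rest2_reduced_energy(t_solute=t_j, t0=t0, **conf_j)
u_ij = rest2_reduced_energy(t_solute=t_i, t0=t0, **conf_j)
u_ji = rest2_reduced_energy(t_solute=t_j, t0=t0, **conf_i)
swap = exchange_accepted(u_ii, u_jj, u_ij, u_ji, rng=random.Random(0))
```

    Because the solvent-solvent term is unscaled, it cancels in the exchange criterion, so only the solute-related energies decide whether a swap is accepted.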
  • Highly Charged Proteins and Their Repulsive Interactions Antagonize Biomolecular Condensation

    Cheng Tan, Ai Niitsu, Yuji Sugita
    Biomolecular condensation is involved in various cellular processes; therefore, regulation of condensation is crucial to prevent deleterious protein aggregation and maintain a stable cellular environment. Recently, a class of highly charged proteins, known as heat-resistant obscure (Hero) proteins, was shown to protect other client proteins from pathological aggregation. However, the molecular mechanisms by which Hero proteins protect other proteins from aggregation remain unknown. In this study, we performed multiscale molecular dynamics (MD) simulations of Hero11, a Hero protein, and the C-terminal low-complexity domain (LCD) of the transactive response DNA-binding protein 43 (TDP-43), a client protein of Hero11, under various conditions to examine their interactions with each other. We found that Hero11 permeates into the condensate formed by the LCD of TDP-43 (TDP-43-LCD) and induces changes in conformation, intermolecular interactions, and dynamics of TDP-43-LCD. We also examined possible Hero11 structures in atomistic and coarse-grained MD simulations and found that Hero11 with a higher fraction of disordered region tends to assemble on the surface of the condensates. Based on the simulation results, we have proposed three possible mechanisms for Hero11’s regulatory function: (i) In the dense phase, TDP-43-LCD reduces contact with each other and shows faster diffusion and decondensation due to the repulsive Hero11–Hero11 interactions. (ii) In the dilute phase, the saturation concentration of TDP-43-LCD is increased, and its conformation is relatively more extended and variant, induced by the attractive Hero11–TDP-43-LCD interactions. (iii) Hero11 on the surface of small TDP-43-LCD condensates can contribute to avoiding their fusion due to repulsive interactions. The proposed mechanisms provide new insights into the regulation of biomolecular condensation in cells under various conditions.


  • Multiple sub-state structures of SERCA2b reveal conformational overlap at transition steps during the catalytic cycle

    Yuxia Zhang, Chigusa Kobayashi, Xiaohan Cai, Satoshi Watanabe, Akihisa Tsutsumi, Masahide Kikkawa, Yuji Sugita, Kenji Inaba
    Sarco/endoplasmic reticulum Ca2+ ATPase (SERCA) pumps Ca2+ into the endoplasmic reticulum (ER). Herein, we present cryo-electron microscopy (EM) structures of three intermediates of SERCA2b: Ca2+-bound phosphorylated (E1P·2Ca2+) and Ca2+-unbound dephosphorylated (E2·Pi) intermediates and another between the E2P and E2·Pi states. Our cryo-EM analysis demonstrates that the E1P·2Ca2+ state exists in low abundance and preferentially transitions to an E2P-like structure by releasing Ca2+ and that the Ca2+ release gate subsequently undergoes stepwise closure during the dephosphorylation processes. Importantly, each intermediate adopts multiple sub-state structures including those like the next one in the catalytic series, indicating conformational overlap at transition steps, as further substantiated by atomistic molecular dynamics simulations of SERCA2b in a lipid bilayer. The present findings provide insight into how enzymes accelerate catalytic cycles.
  • Use of multistate Bennett acceptance ratio method for free-energy calculations from enhanced sampling and free-energy perturbation

    Yasuhiro Matsunaga, Motoshi Kamiya, Hiraku Oshima, Jaewoon Jung, Shingo Ito, Yuji Sugita
    Multistate Bennett acceptance ratio (MBAR) works as a method to analyze molecular dynamics (MD) simulation data after the simulations have been finished. It is widely used to estimate free-energy changes between different states and averaged properties at the states of interest. MBAR allows us to treat a wide range of states from those at different temperature/pressure to those with different model parameters. Due to the broad applicability, the MBAR equations are rather difficult to apply for free-energy calculations using different types of MD simulations including enhanced conformational sampling methods and free-energy perturbation. In this review, we first summarize the basic theory of the MBAR equations and categorize the representative usages into the following four: (i) perturbation, (ii) scaling, (iii) accumulation, and (iv) full potential energy. For each, we explain how to prepare input data using MD simulation trajectories for solving the MBAR equations. MBAR is also useful to estimate reliable free-energy differences using MD trajectories based on a semi-empirical quantum mechanics/molecular mechanics (QM/MM) model and ab initio QM/MM energy calculations on the MD snapshots. We also explain how to use the MBAR software in the GENESIS package, which we call mbar_analysis, for the four representative cases. The proposed estimations of free-energy changes and thermodynamic averages are effective and useful for various biomolecular systems.
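
    As a minimal sketch of the self-consistent MBAR equations mentioned above, the snippet below estimates relative free energies from reduced potentials. It is not the mbar_analysis tool in GENESIS; the function name `mbar_free_energies`, the plain fixed-point iteration, and the two-harmonic-well test system are assumptions chosen for illustration.

```python
import numpy as np


def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    a_max = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(a_max, axis=axis) + np.log(np.sum(np.exp(a - a_max), axis=axis))


def mbar_free_energies(u_kn, n_k, tol=1e-10, max_iter=10000):
    """Solve the MBAR equations by fixed-point iteration.

    u_kn[k, n] is the reduced potential of pooled sample n evaluated in
    state k; n_k[k] > 0 is the number of samples drawn from state k.
    Returns dimensionless free energies relative to state 0.
    """
    u_kn = np.asarray(u_kn, dtype=float)
    log_n_k = np.log(np.asarray(n_k, dtype=float))
    f_k = np.zeros(u_kn.shape[0])
    for _ in range(max_iter):
        # log of the mixture denominator, one value per pooled sample
        log_denom = _logsumexp(log_n_k[:, None] + f_k[:, None] - u_kn, axis=0)
        f_new = -_logsumexp(-u_kn - log_denom[None, :], axis=1)
        f_new -= f_new[0]  # fix the arbitrary additive constant
        if np.max(np.abs(f_new - f_k)) < tol:
            return f_new
        f_k = f_new
    return f_k


# Toy system: two harmonic wells u_k(x) = 0.5 * k * x**2 in units of kT.
rng = np.random.default_rng(0)
k_spring = np.array([1.0, 4.0])
n_k = np.array([2000, 2000])
samples = np.concatenate(
    [rng.normal(0.0, 1.0 / np.sqrt(k), n) for k, n in zip(k_spring, n_k)]
)
u_kn = 0.5 * k_spring[:, None] * samples[None, :] ** 2
f_k = mbar_free_energies(u_kn, n_k)
exact = 0.5 * np.log(k_spring / k_spring[0])  # analytic result for comparison
```

    For this toy system the analytic free-energy difference is 0.5 ln(k2/k1), which gives a simple check on the estimate. Production analyses usually use a faster solver than plain fixed-point iteration.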
  • Implementation of residue-level coarse-grained models in GENESIS for large-scale molecular dynamics simulations

    Cheng Tan, Jaewoon Jung, Chigusa Kobayashi, Diego Ugarte La Torre, Shoji Takada, Yuji Sugita
    Residue-level coarse-grained (CG) models have become one of the most popular tools in biomolecular simulations in the trade-off between modeling accuracy and computational efficiency. To investigate large-scale biological phenomena in molecular dynamics (MD) simulations with CG models, unified treatments of proteins and nucleic acids, as well as efficient parallel computations, are indispensable. In the GENESIS MD software, we implement several residue-level CG models, covering structure-based and context-based potentials for both well-folded biomolecules and intrinsically disordered regions. An amino acid residue in a protein is represented as a single CG particle centered at the Cα atom position, while a nucleotide in RNA or DNA is modeled with three beads. Then, a single CG particle represents around ten heavy atoms in both proteins and nucleic acids. The input data in CG MD simulations are treated as GROMACS-style input files generated from a newly developed toolbox, GENESIS-CG-tool. To optimize the performance in CG MD simulations, we utilize multiple neighbor lists, each of which is attached to a different nonbonded interaction potential in the cell-linked list method. We found that the generation of Gaussian random numbers for the Langevin thermostat is one of the bottlenecks in CG MD simulations. Therefore, we parallelize the computations with the message passing interface (MPI) to improve the performance on PC clusters or supercomputers. We simulate Herpes simplex virus (HSV) type 2 B-capsid and chromatin models containing more than 1,000 nucleosomes in GENESIS as examples of large-scale biomolecular simulations with residue-level CG models. This framework extends accessible spatial and temporal scales by multi-scale simulations to study biologically relevant phenomena, such as genome-scale chromatin folding or phase-separated membrane-less condensations.
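
    The basic idea of the cell-linked list method mentioned above can be sketched as follows. This is a generic, serial illustration and not the GENESIS implementation: it assumes a non-periodic system and a single cutoff, and the function names `cell_list_pairs` and `brute_force_pairs` are hypothetical.

```python
import itertools
from collections import defaultdict

import numpy as np


def cell_list_pairs(coords, cutoff):
    """Return the set of index pairs (i, j), i < j, closer than cutoff.

    Particles are binned into cubic cells with edge length equal to the
    cutoff, so only the 27 neighbouring cells need to be searched.
    """
    coords = np.asarray(coords, dtype=float)
    cell_idx = np.floor((coords - coords.min(axis=0)) / cutoff).astype(int)
    cells = defaultdict(list)
    for i, c in enumerate(map(tuple, cell_idx)):
        cells[c].append(i)

    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    pairs = set()
    for c, members in cells.items():
        for d in offsets:
            other = (c[0] + d[0], c[1] + d[1], c[2] + d[2])
            for i in members:
                for j in cells.get(other, ()):
                    if i < j and np.sum((coords[i] - coords[j]) ** 2) < cutoff ** 2:
                        pairs.add((i, j))
    return pairs


def brute_force_pairs(coords, cutoff):
    """Reference O(N^2) search used to check the cell-list result."""
    coords = np.asarray(coords, dtype=float)
    return {
        (i, j)
        for i, j in itertools.combinations(range(len(coords)), 2)
        if np.sum((coords[i] - coords[j]) ** 2) < cutoff ** 2
    }


# 200 random CG beads in a 5 x 5 x 5 box, cutoff 1.0 (arbitrary units).
rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 5.0, size=(200, 3))
pairs = cell_list_pairs(coords, 1.0)
```

    In the scheme described in the abstract, a separate neighbor list would be kept for each nonbonded potential; here a single cutoff stands in for all of them.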
  • The inherent flexibility of receptor binding domains in SARS-CoV-2 spike protein

    Hisham M Dokainish, Suyong Re, Takaharu Mori, Chigusa Kobayashi, Jaewoon Jung, Yuji Sugita
    Spike (S) protein is the primary antigenic target for neutralization and vaccine development for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It decorates the virus surface and undergoes large motions of its receptor binding domains (RBDs) to enter the host cell. Here, we observe Down, one-Up, one-Open, and two-Up-like structures in enhanced molecular dynamics simulations, and characterize the transition pathways via inter-domain interactions. Transient salt-bridges between RBDA and RBDC and the interaction with glycan at N343B support RBDA motions from Down to one-Up. Reduced interactions between RBDA and RBDB in one-Up induce RBDB motions toward two-Up. The simulations overall agree with cryo-electron microscopy structure distributions and FRET experiments and provide hidden functional structures, namely, intermediates along Down-to-one-Up transition with druggable cryptic pockets as well as one-Open with a maximum exposed RBD. The inherent flexibility of S-protein thus provides essential information for antiviral drug rational design or vaccine development.