## Developing and applying new theoretical and computational methods to study complex condensed phase systems

## Gregory A. Voth

Haig P. Papazian Distinguished Service Professor

Department of Chemistry

Google Scholar Page

## Material for Download

### RAPTOR® Charge Transport Simulation Software

A Modern Perspective on the Hydrated Excess Proton (aka "Hydronium")

**Multi-scale Coarse-graining (MS-CG) Force Matching (FM) code is now publicly available for download**

### Coarse-Grained Force Fields from the Perspective of Statistical Mechanics: Better Understanding of the Origins of a MARTINI Hangover

The popular MARTINI coarse-grained model is used as a test case to analyze the adherence of top-down coarse-grained molecular dynamics models (i.e., models primarily parametrized to match experimental results) to the known features of statistical mechanics for the underlying all-atom representations. Specifically, the temperature dependence of various pair distribution functions, and hence their underlying potentials of mean force via the reversible work theorem, are compared between MARTINI 2.0, Dry MARTINI, and all-atom simulations mapped onto equivalent coarse-grained sites for certain lipid bilayers. It is found that the MARTINI models do not completely capture the lipid structure seen in atomistic simulations as projected onto the coarse-grained mappings and that issues of accuracy and temperature transferability arise due to an incorrect enthalpy–entropy decomposition of these potentials of mean force. The potential of mean force for the association of two amphipathic helices in a lipid bilayer is also calculated, and especially at shorter ranges, the MARTINI and all-atom projection results differ substantially. The former is much less repulsive and hence will lead to a higher probability of MARTINI helix association in the MARTINI bilayer than occurs in the actual all-atom case. Additionally, the bilayer height fluctuation spectra are calculated for the MARTINI model and compared to the all-atom results; the magnitudes of the thermally averaged amplitudes at intermediate length scales are quite different, pointing to a number of possible consequences for realistic modeling of membrane processes.
Taken as a whole, the results presented here show disagreement in the enthalpic and entropic forces driving lateral structure in lipid bilayers as well as quantitative differences in the association of embedded amphipathic helices, which can help direct future efforts to parametrize CG models with better agreement to the all-atom systems they aspire to represent.
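
As a rough illustration of the analysis described in this abstract, the sketch below converts a pair distribution function g(r) into a potential of mean force with the reversible work theorem, w(r) = -k_B T ln g(r), and then splits w(r) into enthalpic and entropic parts from its temperature dependence. It is a minimal sketch and not the procedure used in the paper: the function names, the kcal/mol units, and the two-temperature finite difference (which assumes the enthalpic and entropic parts are constant over the temperature interval) are all assumptions made for illustration.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption


def pmf_from_rdf(g, temperature):
    """Reversible work theorem: w(r) = -kB * T * ln g(r).

    Bins where g(r) == 0 are assigned an infinite PMF.
    """
    g = np.asarray(g, dtype=float)
    w = np.full(g.shape, np.inf)
    sampled = g > 0
    w[sampled] = -KB * temperature * np.log(g[sampled])
    return w


def decompose_pmf(w_low, w_high, t_low, t_high):
    """Split a PMF into enthalpic and entropic parts, w = dH - T * dS.

    Uses a two-point finite difference, dS = -(dw/dT), and evaluates
    dH = w + T * dS at the midpoint temperature. This assumes dH and dS
    do not change between t_low and t_high.
    """
    t_mid = 0.5 * (t_low + t_high)
    entropy = -(w_high - w_low) / (t_high - t_low)
    enthalpy = 0.5 * (w_low + w_high) + t_mid * entropy
    return enthalpy, entropy, t_mid
```

Applying `decompose_pmf` to PMFs obtained from the same pair of sites in a CG model and in the mapped all-atom trajectory would give the kind of enthalpy–entropy comparison the abstract refers to; in practice more than two temperatures and an error estimate would be needed.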

### A Multiscale Coarse-Grained Model of the SARS-CoV-2 Virion

Significance: This study reports the construction of a molecular model for the SARS-CoV-2 virion and details our multiscale approach toward model refinement. The resulting model and methods can be applied to and enable the simulation of SARS-CoV-2 virions.

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of the COVID-19 pandemic. Computer simulations of complete viral particles can provide theoretical insights into large-scale viral processes including assembly, budding, egress, entry, and fusion. Detailed atomistic simulations are constrained to shorter timescales and require billion-atom simulations for these processes. Here, we report the current status and ongoing development of a largely “bottom-up” coarse-grained (CG) model of the SARS-CoV-2 virion. Data from a combination of cryo-electron microscopy (cryo-EM), x-ray crystallography, and computational predictions were used to build molecular models of structural SARS-CoV-2 proteins, which were then assembled into a complete virion model. We describe how CG molecular interactions can be derived from all-atom simulations, how viral behavior difficult to capture in atomistic simulations can be incorporated into the CG models, and how the CG models can be iteratively improved as new data become publicly available. Our initial CG model and the detailed methods presented are intended to serve as a resource for researchers working on COVID-19 who are interested in performing multiscale simulations of the SARS-CoV-2 virion.

### A new one-site coarse-grained model for water: Bottom-up many-body projected water (BUMPer). I. General theory and model

Water is undoubtedly one of the most important molecules for a variety of chemical and physical systems, and constructing precise yet effective coarse-grained (CG) water models has been a high priority for computer simulations. To recapitulate important local correlations in the CG water model, explicit higher-order interactions are often included. However, the advantages of coarse-graining may then be offset by the larger computational cost in the model parameterization and simulation execution. To leverage both the computational efficiency of the CG simulation and the inclusion of higher-order interactions, we propose a new statistical mechanical theory that effectively projects many-body interactions onto pairwise basis sets. The many-body projection theory presented in this work shares similar physics with liquid state theory, providing an efficient approach to account for higher-order interactions within the reduced model. We apply this theory to project the widely used Stillinger–Weber three-body interaction onto a pairwise (two-body) interaction for water. Based on the projected interaction with the correct long-range behavior, we denote the new CG water model as the Bottom-Up Many-Body Projected Water (BUMPer) model, where the resultant CG interaction corresponds to a prior model, the iteratively force-matched model. Unlike other pairwise CG models, BUMPer provides high-fidelity recapitulation of pair correlation functions and three-body distributions, as well as *N*-body correlation functions. BUMPer extensively improves upon the existing bottom-up CG water models by extending the accuracy and applicability of such models while maintaining a reduced computational cost.

### A new one-site coarse-grained model for water: Bottom-up many-body projected water (BUMPer). II. Temperature transferability and structural properties at low temperature

A number of studies have constructed coarse-grained (CG) models of water to understand its anomalous properties. Most of these properties emerge at low temperatures, and an accurate CG model needs to be applicable to these low-temperature ranges. However, direct use of CG models parameterized from other temperatures, e.g., room temperature, encounters a problem known as transferability, as the CG potential essentially follows the form of the many-body CG free energy function. Therefore, temperature-dependent changes to CG interactions must be accounted for. The collective behavior of water at low temperature is generally a many-body process, which often motivates the use of expensive many-body terms in the CG interactions. To surmount the aforementioned problems, we apply the Bottom-Up Many-Body Projected Water (BUMPer) CG model constructed from Paper I to study the low-temperature behavior of water. We report for the first time that the embedded three-body interaction enables BUMPer, despite its pairwise form, to capture the growth of ice at the ice/water interface with corroborating many-body correlations during the crystal growth. Furthermore, we propose temperature transferable BUMPer models that are indirectly constructed from the free energy decomposition scheme. Changes in CG interactions and corresponding structures are faithfully recapitulated by this framework. We further extend BUMPer to examine its ability to predict the structure, density, and diffusion anomalies by employing an alternative analysis based on structural correlations and pairwise potential forms to predict such anomalies. The presented analysis highlights the existence of these anomalies in the low-temperature regime and overcomes potential transferability problems.

### Minimal Experimental Bias on the Hydrogen Bond Greatly Improves Ab Initio Molecular Dynamics Simulations of Water

Experiment-directed simulation (EDS) is a method within a class of techniques seeking to improve molecular simulations by minimally biasing the system Hamiltonian to reproduce certain experimental observables. In a previous application of EDS to ab initio molecular dynamics (AIMD) simulation based on electronic density functional theory (DFT), the AIMD simulations of water were biased to reproduce its experimentally derived solvation structure. In particular, by solely biasing the O–O pair correlation function, other structural and dynamical properties that were not biased were improved. In this work, the hypothesis is tested that directly biasing the O–H pair correlation (and hence the H–O···H hydrogen bonding) will provide an even better improvement of DFT-based water properties in AIMD simulations. The logic behind this hypothesis is that for most electronic DFT descriptions of water the hydrogen bonding is known to be deficient due to anomalous charge transfer and overpolarization in the DFT. Using recent advances to the EDS learning algorithm, we thus train a minimal bias on AIMD water that reproduces the O–H radial distribution function derived from the highly accurate MB-pol model of water. It is then confirmed that biasing the O–H pair correlation alone can lead to improved AIMD water properties, with structural and dynamical properties even closer to experiment than the previous EDS-AIMD model.
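
The idea of minimally biasing a Hamiltonian so that an ensemble average matches a target can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in, not the EDS learning algorithm used in this work: it adds a linear bias `alpha * x` to a one-dimensional harmonic potential and adjusts `alpha` by plain gradient descent on the squared error between the sampled average of x and a target value. The function names, the fixed step size `eta`, and the use of exact Gaussian sampling in place of a molecular dynamics run are all hypothetical choices.

```python
import numpy as np


def sample_biased(alpha, k, kT, n_samples, rng):
    """Draw x from exp(-(0.5*k*x**2 + alpha*x) / kT).

    Completing the square shows this is a Gaussian with mean -alpha/k
    and variance kT/k, so it can be sampled exactly (toy stand-in for MD).
    """
    return rng.normal(-alpha / k, np.sqrt(kT / k), size=n_samples)


def learn_bias(target, k=1.0, kT=1.0, eta=0.2, n_iter=200, n_samples=2000, seed=0):
    """Learn the coupling alpha of a linear bias so that <x> approaches target.

    Loss: 0.5 * (<x> - target)**2. For a linear bias, d<x>/d(alpha) = -Var(x)/kT,
    so the gradient can be estimated from the samples themselves.
    """
    rng = np.random.default_rng(seed)
    alpha = 0.0
    for _ in range(n_iter):
        x = sample_biased(alpha, k, kT, n_samples, rng)
        grad = -(x.mean() - target) * x.var() / kT
        alpha -= eta * grad  # plain gradient descent; step size is an assumption
    return alpha
```

For this toy potential the exact answer is `alpha = -k * target`, which gives a simple check that the learning loop converges. In a real application the observable would be a structural quantity such as a radial distribution function, and each update would be estimated from a segment of the biased trajectory.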

### Atomic-scale characterization of mature HIV-1 capsid stabilization by inositol hexakisphosphate (IP6)

Inositol hexakisphosphates (IP6) are cellular cofactors that promote the assembly of mature capsids of HIV. These negatively charged molecules coordinate an electropositive ring of arginines at the center of pores distributed throughout the capsid surface. Kinetic studies indicate that the binding of IP6 increases the stable lifetimes of the capsid by several orders of magnitude from minutes to hours. Using all-atom molecular dynamics simulations, we uncover the mechanisms that underlie the unusually high stability of mature capsids in complex with IP6. We find that capsid hexamers and pentamers have differential binding modes for IP6. Ligand density calculations show three sites of interaction with IP6 including at a known capsid inhibitor binding pocket. Free energy calculations demonstrate that IP6 preferentially stabilizes pentamers over hexamers to enhance fullerene modes of assembly. These results elucidate the molecular role of IP6 in stabilizing and assembling the retroviral capsid.

### Density Functional Theory-Based Quantum Mechanics/Coarse-Grained Molecular Mechanics: Theory and Implementation

Quantum mechanics/molecular mechanics (QM/MM) is a standard computational tool for describing chemical reactivity in systems with many degrees of freedom, including polymers, enzymes, and reacting molecules in complex solvents. However, QM/MM is less suitable for systems with complex MM dynamics due to associated long relaxation times, the high computational cost of QM energy evaluations, and expensive long-range electrostatics. Recently, a systematic coarse graining of the MM part was proposed to overcome these QM/MM limitations in the form of the quantum mechanics/coarse-grained molecular mechanics (QM/CG-MM) approach. Herein, we recast QM/CG-MM in the density functional theory formalism and, by employing the force-matching variational principle, assess the method's performance for two model systems: QM CCl4 in the MM CCl4 liquid and the reaction of *tert*-butyl hypochlorite with the benzyl radical in the MM CCl4 solvent. We find that density functional theory (DFT)-QM/CG-MM accurately reproduces DFT-QM/MM radial distribution functions and three-body correlations between the QM and CG-MM subsystems. The free-energy profile of the reaction is also described well, with an error <1–2 kcal/mol. DFT-QM/CG-MM is a general, systematic, and computationally efficient approach to include chemical reactivity in coarse-grained molecular models.
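
The force-matching variational principle mentioned above can be sketched, in its simplest form, as a linear least-squares problem: the CG pair force is expanded in a basis, and the coefficients are chosen to minimize the squared difference between the CG forces and reference forces mapped onto the CG sites. The code below is a generic toy illustration under stated assumptions, not the DFT-QM/CG-MM implementation or the group's MS-CG code: the Gaussian basis, its `CENTERS` and `WIDTH`, the absence of periodic boundaries, and all function names are hypothetical.

```python
import numpy as np

# Hypothetical Gaussian basis for the radial pair force (arbitrary length units)
CENTERS = np.array([0.5, 1.0, 1.5, 2.0])
WIDTH = 0.3


def basis(r):
    """Evaluate every basis function at distances r; output shape (..., n_basis)."""
    return np.exp(-((r[..., None] - CENTERS) ** 2) / (2.0 * WIDTH**2))


def design_matrix(positions):
    """Map pair-force coefficients to Cartesian forces for one configuration.

    positions has shape (n_sites, 3). The force on site I is modeled as
    sum_J sum_k c_k * phi_k(r_IJ) * (r_I - r_J) / r_IJ, without periodic images.
    Returns a matrix of shape (3 * n_sites, n_basis).
    """
    disp = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(disp, axis=-1)
    np.fill_diagonal(dist, np.inf)  # removes the self-interaction term
    unit = disp / dist[..., None]
    per_site = np.einsum("ijc,ijk->ick", unit, basis(dist))
    return per_site.reshape(-1, CENTERS.size)


def force_match(configs, ref_forces):
    """Least-squares fit of basis coefficients to reference forces.

    configs and ref_forces both have shape (n_frames, n_sites, 3).
    """
    a = np.vstack([design_matrix(x) for x in configs])
    b = np.asarray(ref_forces, dtype=float).reshape(-1)
    coeffs, *_ = np.linalg.lstsq(a, b, rcond=None)
    return coeffs
```

Because the model is linear in the coefficients, the minimization reduces to a single call to `numpy.linalg.lstsq`. Production force-matching codes typically use spline bases, periodic boundary conditions, and block-averaged or regularized normal equations, none of which are shown here.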

### Anisotropic Motions of Fibrils Dictated by Their Orientations in the Lamella: A Coarse-Grained Model of a Plant Cell Wall

Plant cell walls are complex systems that exhibit the characteristics of both rigid and soft materials depending on their external perturbations. The three main polymeric components in a plant primary cell wall are cellulose fibrils, hemicellulose, and pectins. These components interact in a hierarchical fashion, giving rise to mesoscale structural features such as cellulose bundles and lamella stacking. Although several studies have focused on understanding these unique structural features, a clear picture linking them to cell wall mechanics is still lacking. As a first step toward this goal, a phenomenological model of the plant cell wall has been developed in this work by using available experimental data to investigate the underlying connections between mesoscale structural features and the motions of fibrils during deformation. In this model, cellulose fibrils exhibit motions such as angular reorientations and kinking upon forced stretching. These motions are dependent on the orientation of fibrils with respect to the stretch direction, i.e., fibrils that are at an angle to the stretch direction exhibit predominant angular reorientations, while fibrils transverse to the stretch direction undergo kinking as a result of transverse compression. Varying the chain length of pectin had negligible effects on these motions. One of the main contributions from this work is the development of a simple model that can be easily fine-tuned to test other hypotheses and extended to include additional experimental knowledge about the structural aspects of cell walls in the future.