Klauda Lab Simulation Wiki

Umbrella Sampling Method for PMF Created by Robert Allsopp

This is an implementation of umbrella sampling to generate a PMF in NAMD systems this example uses distanceZ for a Z dimensional reaction coordinate.

Process Summary:

Setp 1: Implementing a moving Colvar (Collective Variables) to restrain the peptide, antibiotic… along the reaction coordinate. Umbrella sampling is required to move and hold the object of interest as it populates a path along the reaction coordinate that isn't favorable. This is then used to generate “windows” individual simulations at different points along the reaction coordinate, that are equilibrated and then run to collect data that is processed by WHAM.

Step 2: Implement a stationary Colvar that holds the peptide, antibiotic, or object of interest in place while it equilibrates for a 1-4ns

Step 3: Running simulation for official data collecting and calculating the PMF using WHAM. WHAM lookes at the deviation between the equilibrium coordinates of the spring and the spring constant to calculate the reaction coordinate.

####It is a good idea since these will run in the debug queue to change the partition to the standard instead of the high prioritey partition. This is done by changing the dyn.scr file to say

    #!/bin/bash

    sbatch -A energybio dyn.csh

Step 1: Copy the files in the folder into the directory. dyn_Umbrella.csh is a modified version of the dyn.csh file that implements the moving colective variable restraint. The csh file does all the work with sed and echo commands to update the position of the restraints and pass the information down a line of files.

Before moving on be sure to recenter the trajectory so that the protein is in the center of the cell. The recenter_PMF folder is the exact same proceedure as the one used for the EDP but the references to MEMB are replaced with PROA. Run this on the last dcd file in the trajectory of data.

Run this with ./recenter1.scr, ./recenter2.scr, ./recenter3.scr but wait after each command until the previous one leaves the queue check with squeue -u $username

Then copy the file for instance dyn150-recenter.dcd to the mkpdb_2 folder to make a PDB and PSF of the file for restarting the program.

Then open the restart.inp file and check that the setup is correct like the temperature and stuff.

Then in the restart.inp file check these lines, and go back to the mkpdb_2 folder and open the getframe.out file and scroll down to the bottom and copy the ?XTLA ?XTLB ?XTLC values and past them in the a, b, c values below

cellBasisVector1 $a 0.0 0.0; # vector to the next image cellBasisVector2 $d $b 0.0; cellBasisVector3 0.0 0.0 $c; cellOrigin 0.0 0.0 $zcen; # the *center* of the cell

cellBasisVector1 65.0759254 0.0 0.0; # vector to the next image cellBasisVector2 $d 65.0759254 0.0; cellBasisVector3 0.0 0.0 79.2041191; cellOrigin 0.0 0.0 $zcen; # the *center* of the cell

Run the restart.csh file by typing sbatch restart.csh

Now the system should be shifted so that all of the water in the system is above the peptide or antibiotic. This is necessary because the membrane will be restrained and can't move as the peptide is pulled off, and the peptide can't be pulled across the Periodic Boundary Condition.

Check by downloading the restart.psf and the restart_after_recenter.dcd file. Load them into VMD and first the .psf then the .dcd, and under Graphics→ representations select create Rep change the selected Atoms in the first to be segname PROA or the name of the antibiotic in the PDB. A second selection of segname MEMB, and a third selection of water. This should make clear where the water is and make sure the system is correct before proceeding.

What needs to be changed now is the dyn-2.inp files the bottom part needs to match the example. I think the run line is redundant. Everything in the file should corespond to something in the existing file except for a couple of things at the end. If in doubt it is possible to check with the step7*inp file since that is the template of the dyn-2.inp file and then just eliminate the lines at the end and save over dyn-2.inp.

Also change the step5_input.colvar_pulling.str file to match the membrane restraints that existed during the charmm-gui equilibration phase. This change is to the variable names as well as to the path of the files in the restraints.

Next it is important to update and define the atoms that are going to be held by the restraint. There are a few options all of which have been implemented to show the different senarios. First it is possible to use

      psfSegID PROA
atomNameResidueRange CA 1-20

or if it is just one atom it is also possible to use this.

atomNumbers 302

Lastly there is a third option similar to the first option that uses VMD.

1. Open the pdb file in vmd 2. Then in vmd console: set sel [atomselect top all] $sel set beta 0 $sel set occupancy 0 set smd [atomselect top “protein and resid 1 and name CA”] $smd set occupancy 1 $sel writepdb prot_smd.ref $sel set occupancy 0 set fixed [atomselect top “protein and not resid 1 and backbone and noh”] $fixed set beta 1 $sel writepdb prot_fixed.ref

This is helpful to generate a constraints file similar to how it is written in the first membrane restraints.

Basically this is how to tell namd which atoms to connect onto the spring.

The other step is to set up constraints if needed on the object that is being pulled. If it is being pulled and a constraint is needed to preserve the conformation then make a second .ref file that has 0 in all the 9 th column and add 1s in all the rows of the atoms that need to be constrainted.

Adjust and calibrate the constraint constants as needed.

The second thing that needs to be updated are these two lines:

set center = `echo “$center/4 -0.25” | bc -l` set targetcenter = `echo “$targetcenter/4 - 0” | bc -l`

These two lines define the starting and ending points specifically the -15.25 and the -15 lines are the starting center and the targetcenter but the value gets incremented by $center/4, (whatever is inside the center file divided by 4). After recentering the peptide this should be as shown above. and targetcenter is the end point of the center of mass that needs to be defined before use. The center refers to the center of mass of the defined peptide, antibiotic … and is a user defined selection of atoms. This value can be guessed from VMD but it is much easier to run the NAMD program with the wrong values on purpose to check the output of the dyn.*traj file and use those values, use a weak spring constants and pick the first value. It will likely say something about the periodic cell has become too small or something along those lines but there should be at least the first step in the *traj file that has the starting positions of all the atoms that have colvars turned on for them. So replace all the values in the step5_input.colvar_pulling.str file except for the ones that have $ in front of them as these are specified in the .csh file.

The other variables to check are the values inside the files center and targetcenter they should exist and be set to 0

Now it is time to look in more detail at the step5_input.colvar_pulling.str file. In this file the lines start by stating the frequency for the colective variables and the output. This can be adjusted but don't go below 50 becuase preformance suffers below about a frequency of 50. There are two components, the colvar and the harmonic. The colvar defines that the restraint is going to be in the z direction and is named dmps_head_upper for instance, however there are many options to do different direcitons but for pulling off the membrane it is simplest to stick with this. The stuff under the main{} and atomsFile … is how the atoms get selected in a similar process to the one that was just performed by tagging the atoms to be held by a restraint this time. The harmonic section links the colvar that was defined above to a harmonic spring with an equilibrium center at -3.2 and a force constant that is defined in the csh file. Output centers tells the program to output the how close the centers of mass come to the center value and this is useful to collect lots of data without having to do any post processing.

I needed to use the membrane restraints while the peptide is pulled. Otherwise the membrane will start drifting up with the peptide or antibiotic.

Scroll down some to see my colvar defined as peptide_pull the program ignores the stuff that has been hashtaged out of the input file. So the colvar section is about the same as before for the membrane section the difference is that the harmonic section uses targetCenter and targetNumSteps and targetNumStages. These implement a moving center so that the system has time to respond and does'nt give an error for being forced too fast.

Section out all the rest of the colvars so that there is only the peptide_pull for now. The other ones were used to restrain atoms on the ends of the peptide to flatten the orientation of the peptide as it leaves the membrane surface.

Also make sure that all the references to the run number of steps are consistent between the dyn.csh and the dyn-2.inp and that the smit value is also consistent with the run value. Also it may be necessary to calibrate the spring, pulling speed numstages numsteps all of that, but for most applications it should be fine.

It is a good idea to use steps of 0.25 A as is done in this example to have flexibility to make a finer grid when needed.

Everything is setup by now and should be ready to run the simulation. These are short runs that can be handelled in the debug queue of deepthought2. Just calculate how far that the object should be moved and set the last.seqno accordingly to stop the simulation at the desired time.

Step 2: This step should be done on expanse or another super computer that is faster since these will be production runs.

This is the exact same as step1 except the lines in the step5_input.colvar_pulling.str, delete the lines that causes this to be moving constratin. Specifically the targetcenter line and the numstages numsteps lines.

To do this efficiently there is a protocol to help with setting everything up.

Run some test simulaions. Do not use the pulled simulaitons since this was pulled, start from a pulled simulation point by gunzip the dyn301.vel dyn301.coor dyn301.xsc and copy them to dyn.vel dyn.coor dyn.xsc. Then run a short or long simulaiton to get a good idea of the spread of the simulaion. Plot the histogrm of the peptide or antibiotic position in one of the windows. This can be done by opening the dyn301.colvars.traj file for instance in excel from an existing opened excel window choosing space deliminated, open the file and delete all of the columns except for the peptide_pull column (this will be automated laster on to work with WHAM). Plot the historgram distribution. For the peptide example it covered a spread of 0.75 A approximately so I spaced the windows 0.5 angstroms apart. There must be overlap between the simulation windows for the WHAM to work.

Next use the already decided upon points of for the start and end of the reaction coordinate and divide this into equally spaced grid that achieves overlapp. I used 0.5 spacing with a spread of 0.75. Then these sets of files that have all the trajectory data will be copied to new folders and named window1_-14.5, window2_-14 window3_-13.5 … window15… window70… until the grid is completed. But before proceeding make sure all the files are correctly updated before hand.

For instance the last.seqno should be set to 2 steps ahead of where it is currently (we want to run one set of 2ns simulaiton to equilibrate the system and another simulation of clean data to collect for data analysis)

Update the dyn.csh there are a few options. I recommend just leaving it the same and running it how it is written. Just update this line to whatever the window z position is set to. There is a way to automate the process so that it is feasible to do 70 windows. To do this place the automation.sed script in the folder that can access all the windows. Then open this file in nano to view it. Just update the path to the windows where the dyn_production.csh file is located, add the /namd/, and replace this for all of the lines. Then adjust all of the final positions to hold the restraints. Lastly it is a good idea to switch the output name from dyn.csh to something else that is new for all the windows. Do not use dyn.csh unless you delete it from all the windows. Just update the dyn.scr to activate the new .csh file.

Lastly type chmod u+x automation.sed and then type ./automation.sed

This will correct this line of code to hold it stationary for the next 4 ns or so of production run simulation time.

set center = `echo “6.5” | bc -l`

Run this for two production runs as indicted above. The first time it is running for equilibration and the second run is the actual data that will go into WHAM. Lastly, I recommend adjusting the frequency in the step5*pulling.str file to maybe 50 since more data is good on these steps.

Step3) Collecting the data and inputing it into WHAM.

Outline of the last steps: Use an automated process to copy all the individual .traj files back with new names to deepthought2 in a new folder Use sed commands to trim the .traj files to something useable Lastly, use the wham command to calculate the PMF. Plot it by opening it as a space deliminated file Done

So to get started move the automated_copy file to a place where it can easily access all of the winodws folders. Then adjust the paths and the names of the files. All you have to do is type in the password for each copy instead of taking a more coplicated approach.

Next use the program “data_3” again it is necessary to chomd u+x data_3 and then update the names of the files what this does is erases all the columns except the ones that are needed for the WHAM calculation.

Then in the DT2 directory move the file INPT_2 into the directory that has all of the .traj files. Open this file and see that it has the path to all the files. The lines at the top are the examples of how to run the program with the hashtag removed. The other things that need to be done is the first number after the path is the equilibrium position of the spring. The second number is the K constant. Be careful with the K constant because if the grid is turned on and the width is not 1 then the k value is rescaled.