[[Analysis]]

=====Brief instruction to obtain H-bond by CHARMM script for a Single Lipid Bilayer=====
Authors: Xiaohong Zhuang and Dr. Jeffery B. Klauda 

This instruction is explain the procedure to calculate the number of H-bonding of a multiple lipid bilayer. For the H-bond of single lipid system, please use the instruction: [[hydrogen_bonds_for_single-component_bilayers|]].

! Special Note: the CHARMM version to run Xiao's H-bond script should be newer or equal to c41b2, versions before this may fail to produce the correct number of hydrogen bonds.

Script in ZT1: **/afs/shell.umd.edu/project/energybio/shared/jbklauda/scripts/xiao/lipids_analyses/hb_chm_mult_lip**\\
Example in ZT1: **/afs/shell.umd.edu/project/energybio/shared/jbklauda/scripts/xiao/lipids_analyses/hb_chm_mult_lip_eg**

Download Script: {{ ::hb_mult.tar.gz |}}

=====Initial Steps=====

1. Copy the script folder provided to your path.\\
2. Go to your path that you would like to work on H-bond analysis.\\
3. Use the following command to copy the script folder to your path

<code commend>
cp -r /afs/shell.umd.edu/project/energybio/shared/jbklauda/scripts/xiao/lipids_analyses/hb_chm_mult_lip ./
</code>

For multiple lipids system, the intralipid Hbonds and all Hbonds are calculated in section 1 and 2 separately and the interlipid Hbonds are calculated in Section 3. The example system provided has 1 sterol/cholesterol lipid: CHL1, 2 glycerol lipids: DPPC, DOPE, and 1 sphingo lipid: NSM.

=====Section 1 (Intralipid Hbonds)=====

1.1	Edits to: **1_calc_hb_run.csh**

1.1.1 Update the path of the file based on your system\\

''set sdir = "/lustre/mda003/hlms_310.15/pe_pc_nsm_chol/a_traj"''

1.1.2  Update the equilibrated dcd file range based on your system

<code commend>
@ f = 101       # first dcd file to read
@ l = 150       # last dcd file to read
</code>

1.1.3 Update lipid information including lipid type, lipid name, and headgroup name, as well as whether glycerol lipids are lyso lipids and have carbonyl groups
<code commend>
# 1. Update lipid type [type] for Glycerol lipids:G ;                Sphingo lipids:S ;  
# 2. Update lipid resdue name [rn]  
# 3. Update headgroups [h] for Glycerol lipids:PA/PC/PE/PG/PI/PS;    Sphingo lipids: SM/CER ;
# 4. For glycerol lipids, specify the following for the lipid tails:
#      a) If there is a carbonyl moiety [iscrb] as Y (usually case), otherwise N (if one tail of each, use Y)
#      b) If there is no tail (just an alcohol group) [islys] as Y, otherwise N (if one tail of each, use Y)
# DON'T include the sterol,cholestrol lipids here as it has no H acceptor, so no intralipid HB.
# DON'T include the PC glycerol lipids here as it has no H donor, so no intralipid HB.
# Exception: lyso PC lipid should be included, as alcohol group provides H donor

# List lipids in your system that have both H donor and acceptor for intra-lipid H-bonding
$chm f=$f l=$l p=${sdir} type=G rn=DOPE h=PE iscrb=Y islys=N < 1_calc_hb_intra.inp >& 1_calc_hb_intra-DOPE.out
$chm f=$f l=$l p=${sdir} type=S rn=NSM  h=SM                 < 1_calc_hb_intra.inp >& 1_calc_hb_intra-NSM.out
</code>

1.2	Update your **toppar.str** file

Update the full path in your toppar.str so that you don’t need to copy the toppar folder and it allows to call your topology files correctly, i.e. change each short path in your **toppar.str** from

<code commend>
open read card unit 10 name toppar/top_all36_prot.rtf
read rtf card unit 10
</code>

To the full path as following but with your path

<code commend>
open read card unit 10 name /lustre/xzhuang/simulation/toppar/top_all36_prot.rtf
read rtf card unit 10
</code>


If you have not done this, it is recommended that you can make one **toppar.str** and toppar folder in your path for all your analyses and update the topology files in that particular toppar folder when necessary, so that you don’t need to update the path.

1.3	 In **1_calc_hb_intra.inp**, you may need to update the following when necessary

1.3.1. After you have the full path, change the **toppar.str** path to your **toppar.str** path.

''stream "/lustre/xzhuang/simulation/toppar.str"''

1.3.2 The path of trajectory file is based on common CHARMM-GUI package setup, if your trajectory files are in a different location, please update the path 

''open read unit @u file name @P/namd/dyn/dyn@F.dcd''  

1.3.3 Update data point per dyn file when necessary

''calc ntot = 1000 * @N''

Number of data points in the DYN files = Total time steps divided by dcdfreqency. It can be found by reading your **dyn-2.inp** file to find dcd frequency and total time steps. It is usually 1000 (total times steps 1000000 per dyn / dcd frequency 1000).

1.4	Submit the calculation job by ''1_calc_hb_run.scr''.

After the run is completed, you will see multiple *.txt, which shows the number of intra-lipid H-bonds of each H donor-Acceptor pair of the lipid of each frame. The last column is the total H-bond of each frame.

1.5	Calculate the statistics of the Hbond by command ''2_calc_avg_percent.scr''

The results will be shown in two files: **avg-intra.txt** and **avg-perc-intra.txt**.

In **avg-intra.txt**, as shown in example output below, the 1st columns is lipid name, the 2nd column is number of each lipid, the 3rd column is the H donor and acceptor pair group number (the final row is the Hbond of all the pairs). The 4th and 5th columns show the block averaged number of Hbond and standard error of the block average, respectively. The block size used here are each dyn. You may update the **avg_percent.py** if you want to use different block size. The 6th and 7th columns are the block averaged number of Hbond per lipid and standard error of it.

If you are interested in detail of each pair, you may refer to the files **def_dn_ac.str**, and **def_hb_intra.str** to see the atoms in each pair of your lipid.

<code commend>
Lip	Nlip	Pair	N_avg	N_se	Na/lip	Nse/lip
DOPE	12	1	0.5637	0.0165	0.0470	0.0014
DOPE	12	2	0.1809	0.0169	0.0151	0.0014
DOPE	12	3	0.0000	0.0000	0.0000	0.0000
DOPE	12	4	0.0000	0.0000	0.0000	0.0000
DOPE	12	5	0.0000	0.0000	0.0000	0.0000
DOPE	12	6	0.0000	0.0000	0.0000	0.0000
DOPE	12	TotG	0.7446	0.0223	0.0620	0.0019
…
</code>

In **avg-perc-intra.txt**, as shown below, the total donor-acceptor pairs of the block averaged number of Hbond and standard error of the block average are shown in 3rd and 4th columns, respectively. The 6th and 7th columns are the block averaged number of Hbond per lipid and standard error of it.

<code commend>
Lip	Nlip	N_avg	N_se	Na/lip	Nse/lip
DOPE	12	0.7446	0.0223	0.0620	0.0019
NSM	12	6.5965	0.0231	0.5497	0.0019
 
TotL	24
</code>

=====Section 2 (All-lipid Hbonds)=====

2.1	Edits in **1_calc_hb_run.csh**

2.1.1 Update the path of the file:

''set sdir = "/lustre/mda003/hlms_310.15/pe_pc_nsm_chol/a_traj"''

2.1.2 Update the equilibrated dcd file range

<code commend>
@ f = 101       # first dcd file to read
@ l = 150       # last dcd file to read
</code>

2.1.3 Update lipid type, residue name and headgroup type (//It is different from intra lipid HB as all the H donor lipid, including sterol lipids should be listed here//.)

<code commend>
# Lipid type for Glycerol lipids:G ;                Sphingo lipids:S ;           Sterol/Cholestrol lipids: C ; 
# headgroups for Glycerol lipids:PA/PC/PE/PG/PI/PS; Sphingo lipids: SM/CER ;
# DON'T put PC lipid here as it has no H donor, but included PC lipids in 1_calc_hb_all.inp as it is H acceptor.
# Exception: lyso PC lipid should be included, since alcohol group gives H donor

# Update the lipids with H donors with lipid_type, lipid_name, and headgroup in each of the following lines
# For glycerol lipids: specify [islys], as with part 1 of the script (but [iscrb] not specified here)
# Donot need to specify headgroup for Sterol lipids

$chm f=$f l=$l p=${sdir} type=C rn=CHL1              < 1_calc_hb_all.inp >& 1_calc_hb_all_CHL1.out
$chm f=$f l=$l p=${sdir} type=G rn=DOPE h=PE islys=N < 1_calc_hb_all.inp >& 1_calc_hb_all_DOPE.out
$chm f=$f l=$l p=${sdir} type=S rn=NSM  h=SM         < 1_calc_hb_all.inp >& 1_calc_hb_all_NSM.out
</code>

2.2	Update your **toppar.str** file (Please skip to 2.3 if you already done this in Section 1.2).  See details of this in Section 1.2

2.3 Edits for **1_calc_hb_all.inp**
 
2.3.1	Update total number of H acceptors [includes glycerol and shingo lipid but not Sterol] 

<code commend>
set gtyp 2
set styp 1
</code>

2.3.2 List glycerols and sphingo H acceptor lipids below [No Sterol lipids, as it has no H aceptor], should be consistent as above

<code commend>
! List glycerols and sphingo lipids below [No Sterol lipids, as it has no H aceptor]
! Use rng and hg for resname and headgroup for glycerol lipids, and rns and hs for sphingo lipids
! Specify whether the glycerol lipids have a carbonyl moiety [iscrb] as Y (usually case), otherwise N
! H acceptors, should be consistent as above

set rng1 DPPC
set hg1 PC
set iscrb1 Y
set rng2 DOPE
set hg2 PE
set iscrb2 Y

set rns1 NSM
set hs1 SM
</code>

2.3.3. After you have the full path, change the **toppar.str** path to your **toppar.str** path:

''stream "/lustre/xzhuang/simulation/toppar.str"''

2.3.4 The path of trajectory file is based on common CHARMM-GUI package setup, if your trajectory files are in different location, please update the path of it:

''open read unit @u file name @P/namd/dyn/dyn@F.dcd''

2.3.5 Update data point per dyn file when necessary

''calc ntot = 1000 * @N''

Number of data points in the DYN files = Total time steps divided by dcdfreqency. It can be found by reading your **dyn-2.inp** file to find dcd frequency and total time steps. It is usually 1000 (total times steps 1000000 per dyn / dcd frequency 1000).

2.4 Submit the calculation job by ''1_calc_hb_run.scr''

After the run completed, you will see multiple **Dlip1-Alip.txt**, which shows the number of all H-bonds of each H donor-Acceptor pair of the lipid of each frame. The last column is the total H-bond of each frame.

2.5	Update **2_calc_avg_percent.scr**

<code commend>
# Update H donor [Dlip] and acceptor lipid [Alip] name 
# Sterol lipids are H donor only, not H acceptor, so it is in Dlip list , not in Alip 
# PC glycerol lipids are not H donor, but a H acceptor only, so it is not in Dlip, but in Alip list.

foreach Dlip ( CHL1 DOPE NSM )
foreach Alip ( DPPC DOPE NSM )

echo $Dlip-$Alip >> lippair.txt

end
end
</code>

2.6	Calculate the statistics of the Hbond by command ''2_calc_avg_percent.scr''

The results will be shown in two files: **avg-all.txt** and **avg-perc-all.txt**.

In **avg-all.txt**, as shown in example output below, the 1st columns is Donor lipid – Acceptor lipid names, the 2nd column is the H donor and acceptor pair group number (the final row is the Hbond of all the pairs). The 3rd and 4th columns show the block averaged number of Hbond and standard error of the block average, respectively. The block size used here are each dyn. You may update the avg_percent.py if you want to use different block size. 

If you are interested in details of each pair, you may refer to the files **def-donor.str**, **def-acceptor-g.str**, **def-acceptor-s.str**, and **def_hb_all.str** to see the atoms in each pair of your lipid.

<code commend>
Dlip-Alip	pair	N_avg	N_se

chl1-dppc	1	2.0728	0.1151
chl1-dppc	2	1.4138	0.0714
chl1-dppc	3	0.0000	0.0000
chl1-dppc	TotG	3.4866	0.1316
…
</code>

In **avg-perc-all.txt**, as shown below, the total lipid number of lipid is shown in 2nd column, the total donor-acceptor pairs of the block averaged number of Hbond and standard error of the block average are shown in 3rd and 4th columns, respectively. The 6th and 7th columns are the block averaged number of Hbond per lipid and standard error of it. The total number of lipids are shown in the last row.

<code commend>
Lip	Nlip	N_avg	N_se	Na/lip	Nse/lip
CHL1	64	9.6250	0.3658	0.1504	0.0057
DOPE	12	7.8334	0.3803	0.6528	0.0317
DPPC	12	5.5743	0.2490	0.4645	0.0208
NSM	12	12.4328	0.3289	1.0361	0.0274
 
TotL	100
</code>

=====Section 3 (Inter-Lipid HBond)=====

''Ninter_lipid = Ninter_plus_intra - Nintra_lipid''

After finish the 1-intra and 2-all calculation (complete all the steps in them including the average calculation), go to 3-inter folder, calculate inter-lipid Hbond by command ''3_run_calc_inter.scr''. 

The file **avg-perc-inter.txt** will be obtained:

<code commend>
Lip	Nlip	N_avg	N_se	Na/lip	Nse/lip
CHL1	64	9.6250	0.3658	0.1504	0.0057
DPPC	12	5.5743	0.2490	0.4645	0.0208
DOPE	12	7.0888	0.3580	0.5907	0.0298
NSM	12	5.8363	0.3058	0.4864	0.0255
</code>