
Backup Data Instructions

This wiki contains instructions for backing up your data on the Klauda Lab Backup Server. The server is named klauda-bkup1.umd.edu and can only be reached from computers on the UMD network. To reach it from off campus without a VPN, log on to DT1/DT2 first.

Policies

If you have simulation data, it MUST be backed up in two places in case of hardware failure. Data is categorized as either published or unpublished. When you start in the lab, your work is still in progress: it is unpublished but essential for a potential future publication. Data becomes published once your work has been accepted by a scientific journal and no longer requires more simulation or analysis.

Unpublished Data

This is the data you are currently running simulations and analysis on. It is backed up incrementally every week, with a full backup every month. If you mistakenly delete data, it may be recoverable from these backups, but recovery must be done before the next full backup, which occurs at the beginning of the month. You will need to set up backup scripts (see below) on DT1 and DT2, as appropriate for your simulations. If you run on MARCC or XSEDE resources, copy that data to DT1 or DT2 for analysis and backups.

Published Data

Once your data is published, reduce the trajectory file sizes and make a single tarred and compressed file (.tar.gz). The file name should make it clear to me which publication the data is from, and the file should be placed in your backup directory on the backup server. Then create another backup on an external hard drive (see Dr. Klauda) and remove the data from DT1/DT2, so that you still have two locations (external HD and backup server). Reduce the DCD/trajectory files to save space: if you have 100-400 ns of data, keep only one frame every 10 ps. For longer trajectories, consult with Dr. Klauda on the frame rate.
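As a minimal sketch of the archiving step only: it assumes the reduced trajectories already sit together in one directory (the reduction itself is done beforehand with your simulation tools). The function name archive_published and the example file names are hypothetical, not lab-provided.

```shell
# Pack one directory of published data into a single .tar.gz file.
# archive_published is a hypothetical helper, not a lab-provided script.
archive_published() {
    src_dir=$1      # directory holding the reduced trajectories and inputs
    archive=$2      # output file; pick a name that identifies the publication
    # -C switches to the parent directory so the archive stores relative paths
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}
```

For example, archive_published /export/lustre_1/username/paper_dir username_paper.tar.gz would create the archive in the current directory; tar -tzf username_paper.tar.gz lists its contents so you can check it before copying it to the backup server.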

1. Create your login and password on backup server

Use the same username as your deepthought account but a different, strong password. Pouyan or I will help you with this in the lab. Please get in touch with one of us to set a time.

2. Setting up password-less login to backup server from deepthought

a. Log in to deepthought.
b. From your home directory on deepthought, open the .ssh/known_hosts file and delete the line starting with klauda-bkup1.umd.edu. Save and exit.
c. Type ssh-keygen
d. Press y when asked y/n?, and the 'return' key at every other prompt.
e. Type ssh-copy-id -i ~/.ssh/id_rsa.pub username@klauda-bkup1.umd.edu (use your username on the backup machine).
f. Try ssh username@klauda-bkup1.umd.edu; you should not be asked for a password.
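The known_hosts cleanup in step b can also be scripted. The following is only a sketch: forget_host is a hypothetical helper, and it assumes plain (unhashed) entries whose first field begins with the host name, as the step describes.

```shell
# Remove a stale host entry from a known_hosts-style file.
# forget_host is a hypothetical helper; it assumes plain (unhashed) entries.
forget_host() {
    host=$1
    file=$2
    [ -f "$file" ] || return 0
    # The first field may be "host" or "host,ip"; compare its first name exactly.
    awk -v h="$host" '{ split($1, names, ","); if (names[1] != h) print }' \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

It would be called as forget_host klauda-bkup1.umd.edu ~/.ssh/known_hosts. OpenSSH's own ssh-keygen -R hostname does the same job and also handles hashed entries.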

3. Creating backup directories

a. Once you are logged in on the backup server, go to /local0/backup and create a directory named after your username.
b. Inside that, create two directories named full and weekly.
c. Inside weekly, create 12 directories corresponding to the 12 months: 01, 02, 03 … 12.
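The directory layout from steps a to c can be created in one go. This is a sketch under the assumption that you have write access to the backup root; make_backup_dirs is a hypothetical helper name.

```shell
# Create <root>/<user>/full and <root>/<user>/weekly/01 ... 12.
# make_backup_dirs is a hypothetical helper, not a lab-provided script.
make_backup_dirs() {
    root=$1     # backup root, /local0/backup on the backup server
    user=$2     # your username
    mkdir -p "$root/$user/full"
    for month in 01 02 03 04 05 06 07 08 09 10 11 12; do
        mkdir -p "$root/$user/weekly/$month"
    done
}
```

On the backup server this would be run as make_backup_dirs /local0/backup username, after which ls /local0/backup/username/weekly should show the twelve month directories.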

4. Creating backup scripts on deepthought

a. Create a directory named backup in your home directory on deepthought.
b. Copy backup-full.scr and backup-weekly.scr from /homes/jbklauda/backup to this backup directory.
c. Change the username from 'pypendse' to your username everywhere in each script.
d. There will be two paths: (i) the path of your directory on deepthought that will be backed up on the server, which should generally be “/export/lustre_1/username”, and (ii) the path of the directory on the server where it will be backed up, which will be “/local0/backup/username/full” or “/local0/backup/username/weekly/xx”.
e. Double check that you have changed the username. Otherwise, when you run your backup, my data will be lost.
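Since a leftover username can destroy someone else's backup, the replacement and the double check can be done together. This sketch assumes the old name appears literally as pypendse in the scripts, as stated above; set_backup_user is a hypothetical helper.

```shell
# Replace the template username in a backup script and report leftovers.
# set_backup_user is a hypothetical helper; 'pypendse' is the name the
# steps above say appears in the template scripts.
set_backup_user() {
    script=$1
    new_user=$2
    # Write through a temp file, then copy back so the script keeps its mode
    sed "s/pypendse/$new_user/g" "$script" > "$script.tmp" &&
        cat "$script.tmp" > "$script" &&
        rm -f "$script.tmp"
    # Print how many lines still contain the old name (should be 0)
    grep -c 'pypendse' "$script" || true
}
```

Running set_backup_user ~/backup/backup-full.scr username (and the same for backup-weekly.scr) should print 0. It is still worth opening each script and reading both paths before the first backup runs.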

5. Setting up crontab

Crontab will schedule your weekly and monthly backups.
a. When you are on deepthought, type crontab -e; a crontab editor window will open.
b. Use the following as a template to create your crontab:

 SHELL=/bin/csh
 # Weekly Incremental Backups
 # Sat at 0X:0X am
 0X 0X * * 6 /homes/username/backup/backup-weekly.csh
 # Monthly Full Backups
 # 1st day of month at 0X:0X am
 0X 0X 1 * * /homes/username/backup/backup-full.csh

c. 0X:0X am is the time at which your data will be backed up automatically every Saturday and on the first day of the month. The first field is the minute and the second is the hour. We keep our backup times a few hours apart from each other. Once everything else is completed, we can decide on these times for everyone.
d. When you save and exit, the crontab will be installed.
e. Check whether the crontab is installed using crontab -l
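To illustrate how the template is filled in, the sketch below prints the crontab lines for a chosen minute and hour. The helper name print_backup_crontab is hypothetical and any times passed to it are placeholders only; the real times are agreed with the lab as described above.

```shell
# Print crontab entries for given backup times.
# print_backup_crontab is a hypothetical helper; script paths follow the
# template above. Cron fields: minute hour day-of-month month day-of-week.
print_backup_crontab() {
    user=$1
    minute=$2
    hour=$3
    printf 'SHELL=/bin/csh\n'
    printf '# Weekly incremental backup: Saturday (day-of-week 6)\n'
    printf '%s %s * * 6 /homes/%s/backup/backup-weekly.csh\n' "$minute" "$hour" "$user"
    printf '# Monthly full backup: 1st day of the month\n'
    printf '%s %s 1 * * /homes/%s/backup/backup-full.csh\n' "$minute" "$hour" "$user"
}
```

For instance, print_backup_crontab username 15 02 prints entries that run at 02:15; the output can be pasted into the editor opened by crontab -e.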

Listed below are instructions for setting up accounts on deepthought1 (DT1) and deepthought2 (DT2) located at UMD.

Instructions:

You will be assigned one or both clusters by Dr. Klauda, based on your simulation needs. These instructions work for both, so follow them on the cluster(s) you are assigned. Dr. Klauda will submit a request to add you as a user to my account on DT1 and/or DT2. Once this is done, you should get a generic email from IT regarding the resource.

Details of DT1 and DT2 can be found at:

http://www.oit.umd.edu/hpcc/

Now for the cluster setup:

If you are not familiar with Linux, first go through some online tutorials on Linux commands, as well as the Linux Guide. To log in to DT1 (login.deepthought.umd.edu) or DT2 (login.deepthought2.umd.edu) from Windows, you will need PuTTY (see Extra Software):

http://www.chiark.greenend.org.uk/%7Esgtatham/putty/

The following program is useful for transferring files between deepthought and your computer:

http://winscp.net/eng/index.php

Mac users can connect from the command line with the built-in ssh command, and can transfer files with scp or with apps such as Cyberduck.

Then, look over the HPCC OIT website to get a general idea of how to log in and submit jobs. Your jobs should not run in your home directory; run them in the /lustre directory instead. First make a subdirectory there named after your username (mkdir is the command). Keep running simulations in organized directories, so think of a naming scheme for them. Before starting a simulation, you will need to make a slight change to your .cshrc.mine file in your home directory. Using an editor such as pico or vim, type the following when you are in your home directory:

 pico .cshrc.mine

At the end of the file add the following line:

 unset noclobber

This is needed so that simulations can overwrite existing files.

One more recent change: you will need to allow me to look into your /lustre files. After you've made the directory (described above), grant access to the files by typing the following, replacing username with yours:

cd /lustre/username 
chmod a+r . 
chmod -R a+r * 
find . -type d -exec chmod a+x {} \;
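To confirm the permissions took effect, a check along these lines can help. It is a sketch only: list_unshared is a hypothetical helper. Because the * pattern skips names beginning with a dot, it can also reveal hidden files at the top level that the chmod -R line missed.

```shell
# List anything under a directory that other users cannot read or enter.
# list_unshared is a hypothetical helper for checking the chmod results.
list_unshared() {
    dir=$1
    # Files and directories missing the world-read bit
    find "$dir" ! -perm -004
    # Directories missing the world-execute (search) bit
    find "$dir" -type d ! -perm -001
}
```

Running list_unshared /lustre/username should print nothing once every file is readable and every directory is searchable by others.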