## Requesting an account
Request an account by registering at the [genome center computing page](https://computing.genomecenter.ucdavis.edu/)
## Logging on
Use `ssh` or `mosh` to log on to the cluster head node at `barbera.genomecenter.ucdavis.edu`.

First, modify `.ssh/config` on your computer to contain:

```
GSSAPIAuthentication=yes
```
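If you prefer, the option can also live inside a per-host entry. A minimal sketch, assuming the standard `~/.ssh/config` syntax (the `Host` alias and the username are illustrative):

```
# example ~/.ssh/config entry; the alias "barbera" and the username are placeholders
Host barbera
    HostName barbera.genomecenter.ucdavis.edu
    User jmaloof
    GSSAPIAuthentication yes
```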
Then, at the Linux/Unix command line on your computer, enter:

```
kinit -l 14d jmaloof@GENOMECENTER.UCDAVIS.EDU   # 14-day max; change the username to yours
```
Once you log on to Barbera with `mosh` or `ssh`, you will find that you do not have permission to access your home directory. To authorize yourself:

```
kinit -l 14d   # only needs to be done once every 14 days
aklog          # enables your home directory; if you get a message about not being authorized, run kinit first
```
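In practice the two commands can be chained after any login where your ticket has expired:

```
kinit -l 14d && aklog   # renew the Kerberos ticket, then re-enable access to your home directory
```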
## Working directory
Your home directory has little storage space. For analyses, please use the Maloof Lab share:

```
cd /share/malooflab
```
Please create your own directory within `malooflab`
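For example, a minimal sketch (substitute your own name for `your_username`):

```
cd /share/malooflab
mkdir your_username
cd your_username
```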
## Shared files
Please keep genome fasta files, etc., in the shared reference directory:

```
cd /share/malooflab/ref_genomes
```
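For example, adding a new reference might look like the following sketch (the species directory and fasta file names are hypothetical):

```
mkdir -p /share/malooflab/ref_genomes/my_species
cp my_species_genome.fa /share/malooflab/ref_genomes/my_species/
```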
## Using the cluster
Most analyses are done by submitting a batch script to the queue.

__Do not run analyses on the head node!!__
## Modules
You will need to load _modules_ that contain the programs you want to use. You can see what is available with:
```
module avail
```
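If the list is long you can filter it; note that `module avail` typically writes to stderr, so redirect it before grepping:

```
module avail 2>&1 | grep -i star   # search for STAR versions
module load star/2.5.2b            # load a specific version (the one used in the script below)
module list                        # show which modules are currently loaded
```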
## Slurm script
You need to create and edit a Slurm script to submit commands for processing. These scripts can either be simple (a single job) or an array job.
### Single job
The script below runs STAR on one fastq file at a time, using 16 CPUs:
```
#!/bin/bash
#SBATCH --partition=production # partition to submit to
#SBATCH --job-name=Sol150_Star_Run # Job name
#SBATCH --nodes=1 # single node, anything more than 1 will not run
#SBATCH --ntasks=16 # equivalent to cpus
#SBATCH --mem=100000 # in MB, memory pool all cores, default is 2GB per cpu
#SBATCH --time=1-00:00:00 # expected time of completion in days, hours, minutes, seconds, default 1-day
#SBATCH --output=Sol150_Star_Run_single.out # STDOUT
#SBATCH --error=Sol150_Star_Run_single.err # STDERR
#SBATCH --mail-user=jnmaloof@ucdavis.edu #
#SBATCH --mail-type=ALL #

# This will be run once for a single process
/bin/hostname
start=`date +%s`

# Load STAR Module 2.5.2b
module load star/2.5.2b

# Change directory
cd /share/malooflab/Julin/Solanum/Sol150

#files=`ls fastq`
files=`ls -1 fastq_cat | head -n 1`

for f in $files
do
    fbase=`basename $f .fastq.gz`
    mkdir -p STAR/${fbase}.STARout
    STAR \
        --genomeDir /share/malooflab/ref_genomes/S_lycopersicum/SL3.00_STAR_REF \
        --readFilesIn /share/malooflab/Julin/Solanum/Sol150/fastq_cat/${f} \
        --quantMode TranscriptomeSAM GeneCounts \
        --twopassMode Basic \
        --alignIntronMax 10000 \
        --runThreadN 16 \
        --outSAMtype BAM SortedByCoordinate \
        --outFileNamePrefix ./STAR/${fbase}.STARout/${fbase}_ \
        --outReadsUnmapped Fastx \
        --outSAMattrRGline ID:${fbase} \
        --readFilesCommand zcat
done

end=`date +%s`
runtime=$((end-start))
echo $runtime seconds to completion
```
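Note that the `head -n 1` in the `files=` line keeps only the first fastq file, which is convenient for a test run. To process every file in `fastq_cat` serially within this single job, you could instead use:

```
files=`ls -1 fastq_cat`   # all fastq files, aligned one after another in the same job
```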
### Array job
This script creates a separate job for each fastq file:
```
#!/bin/bash
#SBATCH --partition=production # partition to submit to
#SBATCH --job-name=Brapa_Kallisto # Job name
#SBATCH --array=0-63 # for this script adjust to match number of fastq files
#SBATCH --nodes=1 # single node, anything more than 1 will not run
#SBATCH --ntasks=01 # equivalent to cpus, stick to around 20 max on gc64, or gc128 nodes
#SBATCH --mem=4000 # in MB, memory pool all cores, default is 2GB per cpu
#SBATCH --time=0-01:00:00 # expected time of completion in hours, minutes, seconds, default 1-day
#SBATCH --output=Kallisto_%A_%a.out # STDOUT
#SBATCH --error=Kallisto_%A_%a.err # STDERR
#SBATCH --mail-user=jnmaloof@ucdavis.edu #
#SBATCH --mail-type=ALL #

# This will be run once for a single process
/bin/hostname
start=`date +%s`

# Load Kallisto
module load kallisto

# Change directory
cd /share/malooflab/Julin/Brapa_microbes/20180202-samples/

# Identify each array run
echo "My SLURM_ARRAY_TASK_ID: " $SLURM_ARRAY_TASK_ID

# create an array of file names:
filelist=($(ls 20180202-data/raw-fastq/*/*gz))

# now pick the file that corresponds to the current array
# note that for this script the number of arrays should equal the number of files
f=${filelist[${SLURM_ARRAY_TASK_ID}]}

# trim off directory info and file extensions:
outdir=$(basename $f .fastq.gz)
echo "file stem: " $outdir

kallisto quant \
    --index /share/malooflab/ref_genomes/B_rapa/V3.0/B_rapa_CDS_V3.0_k31_kallisto_index \
    --output-dir 20180202-data/kallisto_outV3.0/$outdir \
    --plaintext \
    --single \
    -l 250 \
    -s 40 \
    $f

end=`date +%s`
runtime=$((end-start))
echo $runtime seconds to completion
```
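The `--array=0-63` range must match the number of fastq files (here 64 files, indexed from 0). One way to check the count before editing the script, using the same glob the script uses:

```
ls 20180202-data/raw-fastq/*/*gz | wc -l   # number of fastq files; set --array=0-(count minus 1)
```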
## Submitting your script
```
sbatch script.slurm
```
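Command-line options to `sbatch` override the `#SBATCH` directives in the script, which is handy for a quick test of an array job before launching the full range (the script name here is just a placeholder):

```
sbatch --array=0-1 script.slurm   # run only the first two array tasks as a test
```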
## Checking on your job status
```
squeue -u jmaloof #change to your username
```
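If you need to stop a job, the standard Slurm `scancel` command works here as well:

```
scancel 123456       # cancel one job by its job ID (shown in the squeue output)
scancel -u jmaloof   # cancel all of your jobs; change to your username
```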
## Interactive session
If you need to install packages (e.g., for R), compile programs, or move large files (e.g., via sftp), you should start an interactive session. Log on to Barbera first, and then from Barbera:

```
screen
srun -p production -N 1 -n 1 --time=0-04 --mem=4000 --pty /bin/bash
```
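The `screen` wrapper keeps the interactive shell alive if your connection to Barbera drops. A few standard commands for managing it:

```
# press Ctrl-a then d to detach from screen, leaving the srun session running
screen -r   # reattach to the detached session later
exit        # from inside the srun shell: end the interactive job and release the resources
```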