DO NOT FORGET TO CHANGE THE PERMISSIONS OF YOUR HOME DIRECTORY AS DESCRIBED IN STEPS 5-4 AND 5-10
The majority of NGS sequencing facilities deliver Illumina sequencing data to clients through Illumina's cloud service, BaseSpace. The following steps describe this transfer process.
Data transfer from the sequencing facility to the client's BaseSpace account
- Step 1: The client creates a BaseSpace account (a free service) and provides the account details (the email address associated with the BaseSpace account) to the sequencing facility.
- Step 2: The sequencing facility either shares the data with the client or transfers ownership to the client, using the email address the client used to create the BaseSpace account. The client receives an email notification of the event.
- Step 3: The client logs in to the BaseSpace account and accepts ownership of the data, and the data are transferred to the client's account. To check that the data were transferred successfully to your account, see step 4 of the data download section below.
Data download (downloading data from the user's BaseSpace account)
Data can be downloaded either at the command line or with a script. For the command line, please use an interactive session (qlogin on BBC, and srun --qos=general --pty bash on Xanadu).
- Step 4 (executed on the BaseSpace website): Log in to your BaseSpace account and make sure that it contains the data files related to your project.
Select the project and then Samples to see the available datasets.
- Step 5 (to be executed on the HPC/cluster):
- Log in to your HPC/cluster account and start an interactive session using the srun command.
$ srun --pty -p general --qos=general --mem=2G bash
- Create the following two directories in your home directory, or in the directory where you would like to copy the data from BaseSpace (if the destination directory is not your home directory, replace ~ with /path/to/directory):
$ mkdir basespace   # mount point: once mounted, the contents of your BaseSpace account are visible in this directory
$ mkdir dinosaur    # replace dinosaur with your project name; this is the directory the data will be copied into from BaseSpace
- Start an interactive session using srun. Check the available modules using
$ module avail
and look for basemount/0.13.3.1573 (preferred). We will use basemount in this tutorial. Load the module using the command
$ module load basemount/0.13.3.1573
- Change the permissions of your home directory to allow mounting and unmounting of the basespace directory, then mount it. Use the commands
$ chmod 777 ~
$ basemount basespace
- basemount prints an authentication URL (highlighted by the red box in the example). Copy the URL from your own output, not from the example, and paste it into a browser. This opens the BaseSpace login window; log in with your credentials to authenticate the connection. You may not be asked to authenticate again in future sessions.
- Your datasets (fastq.gz files) are now available in
basespace/Projects/$PROJECT_NAME/Samples/$SAMPLE_NAME/Files/
The directory names change depending on the projects and samples that were sequenced, so replace them with the appropriate project and sample names, e.g.
basespace/Projects/dinosaur/Samples/trex/Files/
Here the variables were replaced as
$PROJECT_NAME=dinosaur (project name)
$SAMPLE_NAME=trex (sample name)
- The next step is to copy the fastq.gz files to a local directory, here the directory dinosaur which we created in step 5-3. To do so we will use the cp command:
$ cd basespace/Projects/dinosaur/Samples/
$ sample_list=$(ls)
$ for each in $sample_list
> do
>   mkdir ~/dinosaur/$each
>   cp $each/Files/*.fastq* ~/dinosaur/$each/
> done
$ cd ~
Once this has completed, check that all files were transferred.
- The next step is to unmount the directory. Use both commands below only if you downloaded into your home directory; otherwise simply execute basemount --unmount basespace.
$ basemount --unmount basespace
$ chmod 700 ~
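The check that the transfer succeeded can be partly automated. The following is a minimal sketch, not part of basemount: verify_transfer is a hypothetical helper that assumes the Samples/<sample>/Files/ layout described above and a destination with one sub-directory per sample. It reports files that are missing, differ in size, or fail a gzip integrity test.

```shell
#!/bin/bash
# Hypothetical helper (not part of basemount): compare the fastq files under a
# mounted Samples directory with the copies in the destination directory.
# Usage: verify_transfer <samples_dir> <dest_dir>
verify_transfer() {
    local samples_dir="$1"
    local dest_dir="$2"
    local status=0
    local sample name src dest

    for sample in "$samples_dir"/*/; do
        [ -d "$sample" ] || continue              # no samples matched the glob
        sample=${sample%/}
        name=${sample##*/}
        for src in "$sample"/Files/*.fastq*; do
            [ -e "$src" ] || continue             # sample has no fastq files
            dest="$dest_dir/$name/${src##*/}"
            if [ ! -f "$dest" ]; then
                echo "MISSING  $name/${src##*/}"
                status=1
            elif [ "$(wc -c < "$src")" -ne "$(wc -c < "$dest")" ]; then
                echo "SIZE     $name/${src##*/}"
                status=1
            else
                case "$dest" in
                    *.gz)
                        # gzip -t tests the integrity of the compressed copy
                        if ! gzip -t "$dest" 2>/dev/null; then
                            echo "CORRUPT  $name/${src##*/}"
                            status=1
                        fi
                        ;;
                esac
            fi
        done
    done
    return "$status"
}

# Example, run before unmounting (paths follow the tutorial's dinosaur project):
#   verify_transfer ~/basespace/Projects/dinosaur/Samples ~/dinosaur && echo "transfer OK"
```

The function prints nothing and returns 0 when every fastq file has a matching copy, so it can be used directly in an if statement or before the unmount step.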
If you want to run the download as part of a script, compose the script as shown below (add the computational resources header appropriate to your cluster).
Here the computational resources header is for Xanadu (SLURM).
________________________________________________________________________
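#!/bin/bash
# Submission script for Xanadu
#SBATCH --job-name=Basespace_dwnld
#SBATCH --mail-user=first.last@uconn.edu
#SBATCH --mail-type=ALL
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1

module load basemount/0.13.3.1573

# The project name is passed as the first argument to the script
PROJECT_NAME=$1

# Work from the home directory, as in the interactive steps
cd ~
mkdir -p basespace
chmod 777 ~
mkdir -p $PROJECT_NAME

# Mount the BaseSpace account and copy the fastq files of every sample
basemount basespace
cd basespace/Projects/$PROJECT_NAME/Samples/
sample_list=$(ls)
for each in $sample_list
do
    mkdir -p ~/$PROJECT_NAME/$each
    cp $each/Files/*.fastq* ~/$PROJECT_NAME/$each/
done

# Unmount and restore the permissions of the home directory
cd ~
basemount --unmount basespace
chmod 700 ~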
***This is a single continuous command line
Save the script as basespace_download.sh or any other appropriate name.
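If the copy step fails partway through, the script stops being a safe default, because the home directory stays at 777. One possible safeguard is sketched below. with_open_permissions is a hypothetical wrapper, not something provided by basemount or SLURM; it opens a directory for the duration of one command and resets it to 700 afterwards, whether or not the command succeeded.

```shell
#!/bin/bash
# Hypothetical wrapper: run one command with a directory temporarily set to
# 777, then reset it to 700 regardless of the command's exit status.
# Usage: with_open_permissions <dir> <command> [args...]
with_open_permissions() {
    local dir="$1"
    shift
    chmod 777 "$dir" || return 1
    "$@"                      # the wrapped command (mount, copy, unmount)
    local status=$?
    chmod 700 "$dir"          # always restore the restrictive permissions
    return "$status"          # pass the wrapped command's status through
}

# Example (download_project is an assumed function that mounts BaseSpace,
# copies the fastq files and unmounts again):
#   with_open_permissions "$HOME" download_project dinosaur
```

This sketch does not cover a job that is killed outright by the scheduler, since the restore step never runs in that case; it is still worth checking the permissions of the home directory after a failed job.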
Run the script on Xanadu:
$ sbatch basespace_download.sh $PROJECT_NAME
For example, if $PROJECT_NAME=dinosaur, the command will be
$ sbatch basespace_download.sh dinosaur
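The copy loop used in this tutorial builds its sample list from the output of ls, which splits sample names that contain spaces. A more defensive variant is sketched below, assuming the same Samples/<sample>/Files/ layout. copy_fastq is a hypothetical function name; it iterates over the sample directories with a glob, quotes every path, and skips samples that have no fastq files.

```shell
#!/bin/bash
# Hypothetical helper: copy the fastq files of every sample under a mounted
# Samples directory into one sub-directory per sample in the destination.
# Usage: copy_fastq <samples_dir> <dest_dir>
copy_fastq() {
    local samples_dir="$1"
    local dest_dir="$2"
    local sample name

    for sample in "$samples_dir"/*/; do
        [ -d "$sample" ] || continue          # no samples matched the glob
        sample=${sample%/}
        name=${sample##*/}
        # Collect this sample's fastq files in the positional parameters
        set -- "$sample"/Files/*.fastq*
        if [ ! -e "$1" ]; then
            echo "no fastq files for sample: $name" >&2
            continue
        fi
        mkdir -p "$dest_dir/$name" || return 1
        cp "$@" "$dest_dir/$name/" || return 1
    done
    return 0
}

# Example (paths follow the tutorial's dinosaur project):
#   copy_fastq ~/basespace/Projects/dinosaur/Samples ~/dinosaur
```

Because the function stops at the first failed mkdir or cp and returns a non-zero status, a calling script can decide whether to retry or abort before unmounting.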