THE CLUSTER

Analyses that are not computationally intensive can be run on your personal desktop or laptop computer. But computationally intensive analyses (read: fMRI data) need to be run on the Discovery Cluster. (Not sure what a cluster is? Look here.)

CLUSTER RESOURCES

  • For cluster questions, John Hudson is our go-to person. He and other members of the Cluster team can help you debug, optimize, and parallelize your code.
  • This tutorial explains in more depth how Discovery works and how to connect to it. You can find other handy tutorials on that website as well.
  • Research Computing also sometimes holds intro classes on using the Cluster; look out for them on their calendar.

ACCESSING THE CLUSTER

We have a lab share on Dartmouth's Discovery Supercomputing Cluster, with subfolders for each individual lab member's data and analyses. You can request an account to access the cluster here. Once you have your account, you'll need to email John Hudson so he can grant it permission to access the lab share.

Logging Into the Cluster

To log into the cluster, you'll need to use the command line. Never used the command line before? Check out this tutorial.

Once you have a terminal window open, log in using the following command:

ssh -Y yourDartmouthID@ndoli.dartmouth.edu

To exit the cluster, simply enter:

exit

If you want to connect to the cluster from off campus, you can first use ssh to log into dexter.dartmouth.edu, then log into Discovery from there. You may have a different username/password for dexter.dartmouth.edu (ask Jed if you need help).
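The two-step login can often be collapsed into a single command. The following is a minimal sketch, assuming your local OpenSSH is version 7.3 or later (the release that added the -J jump-host option) and that dexter allows connections to be forwarded through it. Both usernames are placeholders, and the snippet only prints the command rather than running it.

```shell
# Hypothetical placeholders: replace with your own usernames
# (they may differ between the two machines).
dexter_user="yourDexterID"
cluster_user="yourDartmouthID"

# -J tells ssh to hop through dexter on the way to the cluster node;
# -Y keeps GUI forwarding on for the final connection.
login_cmd="ssh -Y -J ${dexter_user}@dexter.dartmouth.edu ${cluster_user}@ndoli.dartmouth.edu"

# Print the command for review; copy and run it once the names are filled in.
echo "$login_cmd"
```

If the jump option is unavailable or blocked, the two separate ssh logins described above still work.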

Navigating to Your Folder on the Lab Share

Once you've logged in, you can navigate to the lab share using the following command:

cd /dartfs-hpc/rc/lab/M/MeyerM/

Each lab member runs analyses in their personal folder on the lab share. If you don't already have one, you can create one:

mkdir yourlastname

Never, EVER edit or delete another lab member's folder (obviously).

Running Analyses on the Cluster

Running analyses on the Cluster is just like running analyses on your personal computer: you'll execute a script (e.g., in Python or FSL) that reads in your data, analyzes it, and outputs results. The main difference is that on a supercomputer, you have a lot more options for exactly where and how you execute your analysis script. To specify these options, you'll submit a "job" via a master script called a "pbs script" (because its filename ends in ".pbs").

The pbs script tells Discovery: "Run this analysis script, in this folder, with these specifications." You can find a basic template here. Some notes on the pbs script:

  • "walltime" is the maximum time your job is allowed to run before it terminates. For more complex analyses and more subjects, you'll need longer walltimes.
  • Where it says "cd $PBS_O_WORKDIR", replace $PBS_O_WORKDIR with the path to whatever directory you want your analyses to run in.
  • Where it says "./program_name arg1 arg2 ...", replace ./program_name with the path to & name of your analysis script, and replace arg1 arg2 ... with any input arguments your script needs.
  • The pbs script outputs a .o file for the job output, and a .e file that logs errors. These are not your analysis results, but a record of the job itself!
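To make the notes above concrete, here is a sketch of what a filled-in pbs script might look like, written to a file from the command line. This is not the lab's template: the job name, resource request, walltime, directory, and analysis script name are all illustrative assumptions, so compare it against the real template before using it.

```shell
# Write a small pbs script to myscript.pbs in the current directory.
# The directive values, paths, and script names are illustrative assumptions.
cat > myscript.pbs <<'EOF'
#!/bin/bash -l
# Name shown in the queue (hypothetical).
#PBS -N my_analysis
# Request one core on one node.
#PBS -l nodes=1:ppn=1
# walltime: the job is stopped after 2 hours.
#PBS -l walltime=02:00:00

# Directory the analysis should run in (hypothetical path).
cd /dartfs-hpc/rc/lab/M/MeyerM/yourlastname/my_project

# Analysis script plus its input arguments (hypothetical names).
./run_analysis.sh sub-01 sub-02
EOF

# Show the result so you can check it before submitting.
cat myscript.pbs
```

The heredoc is quoted ('EOF') so that nothing inside it is expanded by your login shell; the file contains exactly what you see here.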

Once you have your pbs script ready to go, make sure you're in the directory where it's located. Then submit your job using the following command (replace myscript with whatever you named your pbs script):

mksub myscript.pbs

Some useful commands to check up on your job as it runs:

  • myjobs: shows you what jobs you're currently running
  • checkjob ####: shows you what's going on if your job is failing (replace #### with your job number, which you can find using myjobs)
  • qr: query the resources you're using
  • qshow: gives you an overview of who's using the cluster and how much space they're using
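Once a job has finished, one quick check is to look for error logs that are not empty. The sketch below assumes the scheduler's default naming, in which the error file ends in .e followed by the job number (for example my_analysis.e12345, a made-up name). list_error_logs is a hypothetical helper, not a cluster command.

```shell
# List error logs in the current directory that are not empty.
# Assumes the default naming <jobname>.e<jobnumber>.
list_error_logs() {
    find . -maxdepth 1 -type f -name '*.e[0-9]*' -size +0 -print
}

list_error_logs
```

An empty list does not guarantee the analysis succeeded, and a non-empty log may contain only warnings, so treat this as a first pass before reading the files themselves.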

Using Modules

On your home computer, you couldn't run a MATLAB script without opening MATLAB. Similarly, to run a script on the cluster, you'll need to open the corresponding program first. You do this through a module. Discovery already has modules for most common programs, but if you need one that isn't on there, you can get it added. To open a module (e.g., MATLAB):

  • Make sure you included -Y in your login command. This enables X11 forwarding (handled by XQuartz on a Mac), which allows GUIs to be opened.
  • To explore what modules are available, enter the following command:

module avail

  • To load a module, enter the following command (replace matlab with your desired program):

module load matlab

You won't see a MATLAB GUI window open up as you would on your computer, but you can now run MATLAB scripts. You can unload the module with the same command, using "unload" instead of "load".

  • To open a MATLAB GUI, simply enter:

matlab
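For jobs submitted through a pbs script there is no one to click through a GUI, so MATLAB usually needs to run a script and then quit on its own. The following is a minimal sketch of one way to do that; run_matlab_script is a hypothetical helper and my_analysis.m a made-up file name. It assumes the module is called matlab, as in the example above.

```shell
# Hypothetical helper: run one MATLAB script without opening the GUI.
run_matlab_script() {
    local script_path="$1"
    # Load MATLAB for this shell session.
    module load matlab
    # -nodisplay and -nosplash suppress the GUI and splash screen;
    # -r runs the given MATLAB statements. The try/catch makes MATLAB
    # print the error and quit instead of waiting at its prompt.
    matlab -nodisplay -nosplash -r \
        "try, run('${script_path}'); catch err, disp(getReport(err)); exit(1); end; exit(0);"
}

# Example call (not executed here; my_analysis.m is a made-up file name):
# run_matlab_script my_analysis.m
```

Quitting explicitly matters on the cluster: a MATLAB session left waiting at its prompt keeps the job alive until the walltime runs out.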