Non-computationally-intensive analyses can be run on your personal desktop or laptop. But for computationally intensive analyses (read: fMRI data), you'll need to run them on the Discovery Cluster. (Not sure what a cluster is? Look here.)
- For cluster questions, John Hudson is our go-to person. He and other members of the Cluster team can help you debug, optimize, and parallelize your code.
- This tutorial explains in more depth how Discovery works and how to connect to it. You can find other handy tutorials on that website as well.
- Research Computing also sometimes holds intro classes on using the Cluster; look out for them on their calendar.
ACCESSING THE CLUSTER
We have a lab share on Dartmouth's Discovery Supercomputing Cluster, with subfolders for each individual lab member's data and analyses. You can request an account to access the cluster here. Once you have your account, you'll need to email John Hudson to have it granted permission to access the lab share.
Logging Into the Cluster
To log into the cluster, you'll need to use the command line. Never used the command line before? Check out this tutorial.
Once you have a terminal window open, log in using the following command:
ssh -Y yourDartmouthID@ndoli.dartmouth.edu
To exit the cluster, simply enter:
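The command itself seems to have been dropped here; the standard way to end an SSH session is:

```shell
# End the SSH session and return to your local machine
exit
```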
Navigating to Your Folder on the Lab Share
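This section's command appears to be missing. A typical navigation step looks like the following (the lab-share path below is a hypothetical placeholder; use the actual path you're given when your account is set up):

```shell
# Hypothetical path: substitute the real lab-share location and your own subfolder
cd /dartfs-hpc/rc/lab/YOURLAB/yourDartmouthID

# List the contents of your folder to confirm you're in the right place
ls
```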
Running Analyses on the Cluster
Running analyses on the Cluster is just like running analyses on your personal computer: you'll execute a script (e.g., Python, FSL) that reads in your data, analyzes it, and outputs results. The main difference is that on a supercomputer, you have a lot more options for exactly where and how you execute your analysis script. To specify these options, you'll submit a "job" via a master script called a "pbs script" (so named because it ends in ".pbs").
The pbs script tells Discovery: "Run this analysis script, in this folder, with these specifications." You can find a basic template here. Some notes on the pbs script:
- "walltime" is the maximum time your job is allowed to run before it terminates. For more complex analyses and more subjects, you'll need longer walltimes.
- Where it says "cd $PBS_O_workdir", replace PBS_O_workdir with the path to whatever directory you want your analyses to run in.
- Where it says "./program_name arg1 arg2 ...", replace ./program_name with the path to & name of your analysis script, and replace arg1 arg2 ... with any input arguments your script needs.
- The pbs script outputs a .o file for the job output, and a .e file that logs errors. These are not your analysis results, but a record of the job itself!
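Pulling the notes above together, a minimal pbs script might look like the sketch below. The job name, resource values, directory, and script name are all placeholders; check the lab's template for the exact directives Discovery expects.

```shell
#!/bin/bash -l
# Placeholder job name and resources: 1 node, 1 core, 2-hour walltime
#PBS -N my_analysis
#PBS -l nodes=1:ppn=1
#PBS -l walltime=02:00:00

# Move into the directory where the analysis should run
cd /path/to/your/analysis/folder

# Run the analysis script with its input arguments
./analyze_subject.sh subj01 run1
```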
Once you have your pbs script ready to go, make sure you're in the directory where it's located. Then submit your job using the following command (replace myscript with whatever you named your pbs script):
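The submission command appears to have been dropped here; on PBS/Torque systems (which Discovery's job names above suggest), it is qsub:

```shell
# Submit the job described in myscript.pbs to the scheduler
qsub myscript.pbs
```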
Some useful commands to check up on your job as it runs:
- myjobs: shows the jobs you're currently running
- checkjob ####: shows what's going on if your job is failing (replace #### with your job number, which you can find using myjobs)
- qr: queries the resources you're using
- qshow: gives an overview of who's using the cluster and how much space they're using
On your home computer, you can't run a MATLAB script without opening MATLAB. Similarly, to run a script on the cluster, you'll need to load the corresponding program first. You do this through a module. Discovery already has modules for most common programs, but if you need one that isn't there, you can request that it be added. To load a module (e.g., MATLAB):
- Make sure you included -Y in your login command. This enables X forwarding (via XQuartz on a Mac), which allows GUI windows to open on your screen.
- To explore what modules are available, enter the following command:
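The command seems to be missing here; with the standard Environment Modules system, it is:

```shell
# List all software modules available on the cluster
module avail
```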
- To load a module, enter the following command (replace matlab with your desired program):
module load matlab
You won't see a MATLAB GUI window open as you would on your own computer, but you can now run MATLAB scripts. To unload the module, use the same command with "unload" in place of "load".
- To open a MATLAB GUI, simply enter:
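The command appears to have been dropped here; with the matlab module loaded and X forwarding enabled (the -Y login flag), launching the GUI is typically just:

```shell
# Launch the MATLAB GUI (requires the matlab module and X forwarding)
matlab
```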