CNL Wiki

Docs: Running parallel analyses on Hoffman

Updated on October 13, 2023

The Hoffman2 wiki has extensive documentation on running Python jobs in parallel as array jobs: https://www.hoffman2.idre.ucla.edu/Using-H2/Computing/Computing.html?highlight=parallel#running-array-jobs

A few hints that will help you get off the ground:

• Here’s an example .py file and an example .sh file. The .sh file tells the SGE scheduler how to run the .py file in parallel. Notice that the .sh file has lines that load Anaconda and activate your environment. You need to create that environment beforehand: ssh into your Hoffman2 account, load Anaconda with “module load anaconda3”, and create an environment as you normally would with Anaconda.

• The “-t 1-30:1” line tells SGE to run 30 parallel instances with task IDs 1 through 30, in steps of 1. In the .py program, each instance reads its ID with “int(os.environ['SGE_TASK_ID'])” (after “import os”). Each of your 30 instances is given a single number from 1 to 30, so you have to design your program to use that number both to select the input parameters you want and to name the outputs you save.
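
As a rough illustration of the last hint, here is a minimal sketch of how a .py file might turn its task ID into one set of input parameters and one output file. It is not the example file linked above. The names SUBJECTS, THRESHOLDS, PARAM_GRID, run_analysis, and the “results” directory are all hypothetical placeholders, and the 5 x 6 grid is chosen only so that it has 30 entries to match “-t 1-30:1”. The fallback to task 1 is an assumption that makes the script convenient to test outside an array job, where SGE_TASK_ID is unset or not a number.

```python
import itertools
import json
import os
from pathlib import Path

# Hypothetical parameter grid: 5 subjects x 6 thresholds = 30 combinations,
# one per array task requested with "-t 1-30:1".
SUBJECTS = [f"sub-{i:02d}" for i in range(1, 6)]
THRESHOLDS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
PARAM_GRID = list(itertools.product(SUBJECTS, THRESHOLDS))


def get_task_id(default=1):
    """Read the 1-based task ID from SGE_TASK_ID.

    Falls back to `default` when the variable is missing or not a number,
    e.g. when the script is run by hand rather than as an array task.
    """
    raw = os.environ.get("SGE_TASK_ID", "")
    return int(raw) if raw.isdigit() else default


def params_for_task(task_id, grid=PARAM_GRID):
    """Map a 1-based task ID to one entry of the parameter grid."""
    if not 1 <= task_id <= len(grid):
        raise ValueError(f"task ID {task_id} is outside 1-{len(grid)}")
    return grid[task_id - 1]  # task IDs start at 1, list indices at 0


def run_analysis(subject, threshold):
    """Placeholder for the real analysis; returns a small result dict."""
    return {"subject": subject, "threshold": threshold}


def main(out_dir="results"):
    task_id = get_task_id()
    subject, threshold = params_for_task(task_id)
    result = run_analysis(subject, threshold)

    # Put the task ID in the file name so instances never overwrite each other.
    out_path = Path(out_dir) / f"result_{task_id:03d}.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(result))
    return out_path


if __name__ == "__main__":
    main()
```

With this layout, each instance writes its own file (result_001.json through result_030.json), and a separate script can collect them once all tasks have finished. If your parameter list changes length, the range in the “-t” line needs to change with it.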