How do you want the processes to be distributed?
- All on the same node to reduce network latency
- Scatter distribution of jobs to increase overall memory bandwidth
- Even distribution of processes across nodes
- Let scheduler choose
All on the same node to reduce network latency:
sample script – sample.sub
    #!/bin/bash
    # Submission script: "tasks are all grouped on same node"
    # Job name
    #SBATCH --job-name=mpi_mm
    # Output file name
    #SBATCH --output=mpi_mm_v2.out
    #SBATCH --error=mpi_mm_v2.err
    #
    # Set the required partition [change]
    #SBATCH --partition=long
    # Number of processes
    #SBATCH --ntasks=32
    # Number of nodes
    #SBATCH --nodes=1
    # Memory per process
    #SBATCH --mem-per-cpu=100
    #
    # Total wall-time
    #SBATCH --time=00:05:00
    #
    # The below statement is required if the code is floating-point intensive and CPU-bound [optional]
    #SBATCH --threads-per-core=1
    #
    # To get email alert [Optional]
    # NOTE: Remove one "#" and "write your email ID" (ex: #SBATCH --mail-user=hemanta.kumar@icts.res.in)
    ##SBATCH --mail-user= email id
    ##SBATCH --mail-type=ALL
    #
    date
    mpirun --mca btl_openib_allow_ib 1 /home/hemanta.kumar/slurm_test/mpi_mm
    date
Scatter distribution of jobs to increase overall memory bandwidth:
sample script – sample.sub
    #!/bin/bash
    # Submission script: "tasks are scattered across distinct nodes"
    # Job name
    #SBATCH --job-name=mpi_mm
    # Output file name
    #SBATCH --output=mpi_mm_v3.out
    #SBATCH --error=mpi_mm_v3.err
    #
    # Set the required partition [change]
    #SBATCH --partition=long
    # Number of processes
    #SBATCH --ntasks=2
    #SBATCH --ntasks-per-node=1
    # Memory per process
    #SBATCH --mem-per-cpu=100
    #
    # Total wall-time
    #SBATCH --time=00:05:00
    #
    # The below statement is required if the code is floating-point intensive and CPU-bound [Optional]
    #SBATCH --threads-per-core=1
    #
    # To get email alert [Optional]
    # NOTE: Remove one "#" and "write your email ID" (ex: #SBATCH --mail-user=hemanta.kumar@icts.res.in)
    ##SBATCH --mail-user= email id
    ##SBATCH --mail-type=ALL
    #
    date
    mpirun --mca btl_openib_allow_ib 1 /home/hemanta.kumar/slurm_test/mpi_mm
    date
Even distribution of processes across nodes:
sample script – sample.sub
    #!/bin/bash
    # Submission script: "tasks are evenly distributed across nodes"
    # Job name
    #SBATCH --job-name=mpi_mm
    # Output file name
    #SBATCH --output=mpi_mm_v1.out
    #SBATCH --error=mpi_mm_v1.err
    #
    # Set the required partition [change]
    #SBATCH --partition=long
    # Number of processes
    #SBATCH --ntasks=32
    # Process distribution per node
    #SBATCH --ntasks-per-node=8
    # Number of nodes
    #SBATCH --nodes=4
    # Memory per process
    #SBATCH --mem-per-cpu=100
    #
    # Total wall-time
    #SBATCH --time=00:05:00
    #
    # The below statement is required if the code is floating-point intensive and CPU-bound [Optional]
    #SBATCH --threads-per-core=1
    #
    # To get email alert [Optional]
    # NOTE: Remove one "#" and "write your email ID" (ex: #SBATCH --mail-user=hemanta.kumar@icts.res.in)
    ##SBATCH --mail-user= email id
    ##SBATCH --mail-type=ALL
    #
    date
    mpirun --mca btl_openib_allow_ib 1 /home/hemanta.kumar/slurm_test/mpi_mm
    date
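The numbers in this script are tied together: 4 nodes with 8 tasks each gives the 32 tasks requested, and the memory per node follows from --mem-per-cpu. As a quick sanity check, the arithmetic can be sketched in shell. The variable names are illustrative only, and the memory figures assume that Slurm allocates exactly one CPU per task, which may not hold on every cluster configuration.

```shell
# Values copied from the "evenly distributed" sample script above.
nodes=4
ntasks_per_node=8
mem_per_cpu_mb=100      # --mem-per-cpu is given in MB by default

# Assumption: exactly one CPU is allocated per task.
ntasks=$(( nodes * ntasks_per_node ))
mem_per_node_mb=$(( ntasks_per_node * mem_per_cpu_mb ))
mem_total_mb=$(( ntasks * mem_per_cpu_mb ))

echo "ntasks=$ntasks mem_per_node_mb=$mem_per_node_mb mem_total_mb=$mem_total_mb"
```

If the product of nodes and tasks per node does not match --ntasks, the three options should be revisited before submitting.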
Let scheduler choose:
sample script – sample.sub
    #!/bin/bash
    # Submission script: "no plan"
    # Job name
    #SBATCH --job-name=mpi_mm
    # Output file name
    #SBATCH --output=mpi_mm_v4.out
    #SBATCH --error=mpi_mm_v4.err
    #
    # Set the required partition [change]
    #SBATCH --partition=long
    # Number of processes
    #SBATCH --ntasks=64
    # Memory per process
    #SBATCH --mem-per-cpu=100
    #
    # Total wall-time
    #SBATCH --time=00:05:00
    #
    # The below statement is required if the code is floating-point intensive and CPU-bound [Optional]
    #SBATCH --threads-per-core=1
    #
    # To get email alert [Optional]
    # NOTE: Remove one "#" and "write your email ID" (ex: #SBATCH --mail-user=hemanta.kumar@icts.res.in)
    ##SBATCH --mail-user= email id
    ##SBATCH --mail-type=ALL
    #
    date
    mpirun --mca btl_openib_allow_ib 1 /home/hemanta.kumar/slurm_test/mpi_mm
    date
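To see which of the four layouts a job actually received, one option is to count how many tasks ran on each host. The sketch below is not part of the original scripts: tasks_per_node is a hypothetical helper name, and it assumes that each task prints its hostname on its own line, for example through srun hostname placed inside the job script.

```shell
# tasks_per_node: hypothetical helper (not a Slurm command).
# Reads one hostname per line on stdin and prints "<host> <count>"
# for every distinct host, sorted by hostname.
tasks_per_node() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Possible use inside a job script (not executed here):
#   srun hostname | tasks_per_node
```

For the "grouped on same node" script this should report a single host, while the "scattered" script should report one task on each of two hosts.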
Submit job:
sbatch sample.sub
The job’s status in the queue can be monitored with squeue (add -u username to show only a particular user’s jobs).
The job can be cancelled with scancel <job_id>.
When the job finishes (successfully or with an error), Slurm by default writes one file named slurm-NNNN.out (where NNNN is the job ID) in the submission directory. If --output or --error is set, as in the sample scripts above, stdout and stderr go to those files instead.
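The submit, monitor and cancel steps all revolve around the job ID, so it can help to capture it at submission time. The following is a minimal sketch: job_id_from_sbatch and default_output_name are hypothetical helper names, and the sketch assumes sbatch reports the submission with a line of the form "Submitted batch job NNNN".

```shell
# job_id_from_sbatch: hypothetical helper (not part of Slurm).
# Reads the sbatch confirmation line on stdin and prints its fourth
# field, which is the job ID in "Submitted batch job NNNN".
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# default_output_name: file name used when --output is not set.
default_output_name() {
    printf 'slurm-%s.out\n' "$1"
}

# Possible use on the cluster (not executed here):
#   jobid=$(sbatch sample.sub | job_id_from_sbatch)
#   squeue -j "$jobid"     # status of this job only
#   scancel "$jobid"       # cancel it if needed
```

sbatch also has a --parsable option that prints the job ID without the surrounding text, which may make the parsing helper unnecessary.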
Submit script flags
Resource | Flag Syntax | Description | Notes |
---|---|---|---|
job name | -J, --job-name=hello_test | Name of the job | default is the name of the batch script |
partition | -p, --partition=devel | Partition (queue) to submit the job to | the default partition is marked with * in sinfo output; devel is the default partition on Mario |
time | -t, --time=01:00:00 | Time limit for the job. Acceptable time formats are minutes, minutes:seconds, hours:minutes:seconds, days-hours, days-hours:minutes and days-hours:minutes:seconds | here it is given as 1 hour |
nodes | -N, --nodes=2 | Number of compute nodes for the job | if omitted, Slurm allocates enough nodes to satisfy the other resource requests |
number of tasks | -n, --ntasks=1 | Maximum number of tasks the job will launch; Slurm allocates sufficient resources for them | default is 1 task per node |
tasks per node | --ntasks-per-node=8 | Request that this many tasks be invoked on each node. If used with the --ntasks option, --ntasks takes precedence and --ntasks-per-node is treated as a maximum count of tasks per node | meant to be used with the --nodes option |
memory | --mem=32000 | Memory limit per compute node for the job. Do not use with the --mem-per-cpu flag | units are MB by default |
memory per CPU | --mem-per-cpu=1000 | Memory limit per allocated CPU (core). Do not use with the --mem flag | units are MB by default |
output file | -o, --output=test.out | Name of file for stdout | default is slurm-<jobid>.out |
error file | -e, --error=test.err | Name of file for stderr | default is the same file as stdout |
email address | --mail-user=username@buffalo.edu | User's email address | receives the job notification emails; omit for no email |
email notification | --mail-type=ALL, --mail-type=END | When email is sent to the user | omit for no email |
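
The time formats listed for --time are easy to misread: a bare number is minutes, and two colon-separated fields are minutes:seconds rather than hours:minutes. As an illustration, the sketch below converts each listed format to seconds. slurm_time_to_seconds is a hypothetical helper, not a Slurm command, and it performs no validation of its input.

```shell
# slurm_time_to_seconds: hypothetical helper (not part of Slurm).
# Converts a --time value in one of the formats from the table to seconds:
#   minutes, minutes:seconds, hours:minutes:seconds,
#   days-hours, days-hours:minutes, days-hours:minutes:seconds
# No input validation is done; malformed values give meaningless results.
slurm_time_to_seconds() {
    printf '%s\n' "$1" | awk '
    {
        days = 0
        spec = $0
        dash = index(spec, "-")
        if (dash > 0) {
            # Text before the dash is the day count.
            days = substr(spec, 1, dash - 1)
            spec = substr(spec, dash + 1)
        }
        n = split(spec, f, ":")
        if (dash > 0) {
            # After a day count the fields are hours[:minutes[:seconds]];
            # missing fields are empty and count as 0.
            h = f[1]; m = f[2]; s = f[3]
        } else if (n == 1) {
            h = 0; m = f[1]; s = 0
        } else if (n == 2) {
            h = 0; m = f[1]; s = f[2]
        } else {
            h = f[1]; m = f[2]; s = f[3]
        }
        print days * 86400 + h * 3600 + m * 60 + s
    }'
}

slurm_time_to_seconds 01:00:00    # the 1 hour example from the table
slurm_time_to_seconds 00:05:00    # wall-time used in the sample scripts
```

The two calls at the end print 3600 and 300, matching the 1 hour limit in the table and the 5 minute limit in the sample scripts.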