Submitting MPI jobs (multi-process)

How do you want the processes to be distributed?

All processes on the same node, to reduce network latency:

sample script – sample.sub

#!/bin/bash
# Submission script: "tasks are all grouped on same node"

# Job name
#SBATCH --job-name=mpi_mm
# Output file name
#SBATCH --output=mpi_mm_v2.out
#SBATCH --error=mpi_mm_v2.err
#
# Set the required partition
#SBATCH --partition=short
# Number of processes
#SBATCH --ntasks=32
# Number of nodes
#SBATCH --nodes=1
# Memory per process (in MB by default)
#SBATCH --mem-per-cpu=1
#
# Total wall-time
#SBATCH --time=00:05:00
#
# For floating-point-intensive, CPU-bound work, enable the following line
# by changing "### SBATCH" to "#SBATCH".
### SBATCH --threads-per-core=1
#
# To get email alerts, enable the following lines the same way.
### SBATCH --mail-user=hemanta.kumar@icts.res.in
### SBATCH --mail-type=ALL

date
mpirun /home/it/slurm_test/mpi_mm
#srun /home/it/slurm_test/mpi_mm
date

Processes scattered across distinct nodes, to increase overall memory bandwidth:

sample script – sample.sub

#!/bin/bash
# Submission script: "tasks are scattered across distinct nodes"

# Job name
#SBATCH --job-name=mpi_mm
# Output file name
#SBATCH --output=mpi_mm_v3.out
#SBATCH --error=mpi_mm_v3.err
#
# Set the required partition
#SBATCH --partition=short
# Number of processes
#SBATCH --ntasks=2
# One process per node, so the two tasks land on distinct nodes
#SBATCH --ntasks-per-node=1
# Memory per process (in MB by default)
#SBATCH --mem-per-cpu=1
#
# Total wall-time
#SBATCH --time=00:05:00
#
# For floating-point-intensive, CPU-bound work, enable the following line
# by changing "### SBATCH" to "#SBATCH".
### SBATCH --threads-per-core=1
#
# To get email alerts, enable the following lines the same way.
### SBATCH --mail-user=hemanta.kumar@icts.res.in
### SBATCH --mail-type=ALL

date
mpirun /home/it/slurm_test/mpi_mm
#srun /home/it/slurm_test/mpi_mm
date

Even distribution of processes across nodes:

sample script – sample.sub

#!/bin/bash
# Submission script: "tasks are evenly distributed across nodes"

# Job name
#SBATCH --job-name=mpi_mm
# Output file name
#SBATCH --output=mpi_mm_v1.out
#SBATCH --error=mpi_mm_v1.err
#
# Set the required partition
#SBATCH --partition=short
# Number of processes
#SBATCH --ntasks=32
# Process distribution per node
#SBATCH --ntasks-per-node=8
# Number of nodes
#SBATCH --nodes=4
# Memory per process (in MB by default)
#SBATCH --mem-per-cpu=1
#
# Total wall-time
#SBATCH --time=00:05:00
#
# For floating-point-intensive, CPU-bound work, enable the following line
# by changing "### SBATCH" to "#SBATCH".
### SBATCH --threads-per-core=1
#
# To get email alerts, enable the following lines the same way.
### SBATCH --mail-user=hemanta.kumar@icts.res.in
### SBATCH --mail-type=ALL

date
mpirun /home/it/slurm_test/mpi_mm
#srun /home/it/slurm_test/mpi_mm
date
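
For an even distribution, the three layout directives in this script are expected to agree: --ntasks should equal --nodes times --ntasks-per-node (here 4 x 8 = 32). The following is a minimal sketch of a sanity check to run before submitting; check_layout is a hypothetical helper written for this page, not a Slurm command.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): check that a requested layout is
# self-consistent before writing it into the #SBATCH directives.
check_layout() {
    local ntasks=$1 nodes=$2 per_node=$3
    if (( nodes * per_node == ntasks )); then
        echo "ok: ${nodes} nodes x ${per_node} tasks/node = ${ntasks} tasks"
    else
        echo "mismatch: ${nodes} nodes x ${per_node} tasks/node = $(( nodes * per_node )), but --ntasks=${ntasks}"
        return 1
    fi
}

check_layout 32 4 8    # the values used in the script above
```

A non-zero return status signals a mismatch, so the function can guard an sbatch call in a wrapper script.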

Let the scheduler choose:

sample script – sample.sub

#!/bin/bash
# Submission script: "no plan"

# Job name
#SBATCH --job-name=mpi_mm
# Output file name
#SBATCH --output=mpi_mm_v4.out
#SBATCH --error=mpi_mm_v4.err
#
# Set the required partition
#SBATCH --partition=short
# Number of processes
#SBATCH --ntasks=64
# Memory per process (in MB by default)
#SBATCH --mem-per-cpu=1
#
# Total wall-time
#SBATCH --time=00:05:00
#
# For floating-point-intensive, CPU-bound work, enable the following line
# by changing "### SBATCH" to "#SBATCH".
### SBATCH --threads-per-core=1
#
# To get email alerts, enable the following lines the same way.
### SBATCH --mail-user=hemanta.kumar@icts.res.in
### SBATCH --mail-type=ALL

date
mpirun /home/it/slurm_test/mpi_mm
#srun /home/it/slurm_test/mpi_mm
date
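
When the placement is left to the scheduler, it can be useful to record what was actually allocated. A minimal sketch, assuming the standard SLURM_* environment variables that Slurm sets inside a running job; report_allocation is a hypothetical function name, and its body could be pasted into the script before the mpirun line.

```shell
#!/bin/bash
# Print what the scheduler allocated. The SLURM_* variables are set by Slurm
# inside a running job; outside a job they are unset, so a placeholder is
# printed instead and the function stays runnable anywhere.
report_allocation() {
    echo "job id:    ${SLURM_JOB_ID:-n/a}"
    echo "tasks:     ${SLURM_NTASKS:-n/a}"
    echo "nodes:     ${SLURM_JOB_NUM_NODES:-n/a}"
    echo "node list: ${SLURM_JOB_NODELIST:-n/a}"
}

report_allocation
```

The report ends up in the job's output file, next to the two timestamps printed by date.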

Submit the job:

sbatch sample.sub

The job’s status in the queue can be monitored with squeue (add -u username to show only a particular user’s jobs).

The job can be cancelled with scancel <job_id>.
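
The commands above all revolve around the job ID, which sbatch reports in a confirmation line of the form "Submitted batch job 12345". A minimal sketch of capturing that ID in a script; job_id_from_sbatch is a hypothetical helper, and the submission part only runs where sbatch and sample.sub are actually present.

```shell
#!/bin/bash
# Hypothetical helper: pull the numeric job ID out of sbatch's confirmation
# line, which has the form "Submitted batch job 12345".
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Only talk to the scheduler if Slurm and the sample script are available.
if command -v sbatch >/dev/null 2>&1 && [ -f sample.sub ]; then
    job_id=$(sbatch sample.sub | job_id_from_sbatch)
    echo "submitted job ${job_id}"
    squeue -j "$job_id"        # show only this job
    # scancel "$job_id"        # uncomment to cancel it
fi
```

sbatch also accepts a --parsable option that prints the job ID without the surrounding text, which may make the parsing step unnecessary.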

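When the job finishes (successfully or with an error), its output is normally written to a single file in the submission directory named slurm-NNNN.out (where NNNN is the job ID). If --output and --error are set, as in the sample scripts above, stdout and stderr go to those files instead.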
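
The NNNN in that default name corresponds to Slurm's %j filename placeholder (job ID), which can also be used in --output and --error, together with %x (job name). The sketch below only illustrates how such a pattern expands; expand_output_name is a hypothetical helper, and the job ID 4821 is made up.

```shell
#!/bin/bash
# Hypothetical helper: expand two common Slurm filename placeholders,
# %j (job ID) and %x (job name), to preview the resulting file name.
expand_output_name() {
    local pattern=$1 job_id=$2 job_name=$3
    printf '%s\n' "$pattern" | sed -e "s/%j/${job_id}/g" -e "s/%x/${job_name}/g"
}

expand_output_name "slurm-%j.out" 4821 mpi_mm   # default pattern: slurm-4821.out
expand_output_name "%x-%j.out"    4821 mpi_mm   # e.g. #SBATCH --output=%x-%j.out
```

Including %j in the output name keeps repeated runs of the same script from overwriting each other's output.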

Submit script flags

| Resource | Flag syntax | Description | Notes |
|---|---|---|---|
| job name | --job-name=hello_test | Name of the job | default is the name of the submission script |
| partition | --partition=devel | A partition is a queue for jobs | the default partition is marked with * in sinfo output; devel is the default partition on Mario |
| time | --time=01:00:00 | Time limit for the job. Acceptable time formats include minutes, minutes:seconds, hours:minutes:seconds, days-hours, days-hours:minutes and days-hours:minutes:seconds | here it is given as 1 hour |
| nodes | --nodes=2 | Number of compute nodes for the job | default is 1 compute node |
| cpus/cores | --ntasks-per-node=8 | Number of tasks (processes) per node; normally at most the number of cores on the compute node | default is 1 task per node |
| memory | --mem=32000 | Memory limit per compute node for the job. Do not use with the --mem-per-cpu flag | memory is in MB by default |
| memory per CPU | --mem-per-cpu=1000 | Memory limit per core. Do not use with the --mem flag | memory is in MB by default |
| output file | --output=test.out | Name of the file for stdout | default is slurm-<JobID>.out |
| error file | --error=test.err | Name of the file for stderr | default is the same file as stdout |
| email address | --mail-user=username@buffalo.edu | User's email address | sends email on submission and completion of the job; omit for no email |
| email notification | --mail-type=ALL or --mail-type=END | When email is sent to the user | omit for no email |
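
The --time formats listed in the table can be easy to mix up (a bare number is minutes, not seconds). As an illustration, here is a minimal sketch that converts a --time value to seconds; slurm_time_to_seconds is a hypothetical helper for checking a value by hand, not something Slurm provides.

```shell
#!/bin/bash
# Hypothetical helper: convert a --time value to seconds, covering the
# formats listed in the table above.
slurm_time_to_seconds() {
    local spec=$1 days=0 h=0 m=0 s=0 rest
    local -a f
    if [[ $spec == *-* ]]; then
        # days-hours, days-hours:minutes or days-hours:minutes:seconds
        days=${spec%%-*}
        rest=${spec#*-}
        IFS=: read -r h m s <<< "$rest"
    else
        IFS=: read -r -a f <<< "$spec"
        case ${#f[@]} in
            1) m=${f[0]} ;;                          # minutes
            2) m=${f[0]}; s=${f[1]} ;;               # minutes:seconds
            3) h=${f[0]}; m=${f[1]}; s=${f[2]} ;;    # hours:minutes:seconds
            *) echo "unrecognised time: $spec" >&2; return 1 ;;
        esac
    fi
    # 10# forces base 10 so that fields such as "08" are not read as octal
    echo $(( 10#${days:-0} * 86400 + 10#${h:-0} * 3600 + 10#${m:-0} * 60 + 10#${s:-0} ))
}

slurm_time_to_seconds 01:00:00   # the table's example, one hour: 3600
slurm_time_to_seconds 00:05:00   # the sample scripts' limit: 300
slurm_time_to_seconds 2-12       # two days and twelve hours: 216000
```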