Karolina (IT4I)

The Karolina cluster is located at IT4I, Technical University of Ostrava.

Introduction

If you are new to this system, please see the following resources:

  • IT4I user guide

  • Batch system: PBS

  • Filesystems:

    • $HOME: per-user directory, use only for inputs, source and scripts; backed up (25GB default quota, 5k entries)

    • /scratch/: production directory; very fast for parallel jobs (10TB, 10M entries per user)

    • /mnt/: project file system (20TB, 5M entries per project)

For convenience, you can add the following variables to your .bashrc:

export SCRDIR="/scratch/project/dd-23-83/${USER}"
export WRKDIR="/mnt/proj2/dd-23-83/${USER}"

where dd-23-83 is the project identifier, which can be different in your case.
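Before relying on these variables, it can help to confirm that they are set and point to writable directories. The following is a minimal sketch, not part of the IT4I or SynchRad tooling: check_dir_var is a hypothetical helper name, and it creates the directory if it does not exist yet.

```shell
# check_dir_var NAME: hypothetical helper (not an IT4I or SynchRad tool).
# Succeeds if the variable called NAME is set and its directory is writable;
# the directory is created if it does not exist yet.
check_dir_var() {
    cdv_name="$1"
    eval "cdv_value=\${$cdv_name-}"   # look up the variable by its name
    if [ -z "$cdv_value" ]; then
        echo "$cdv_name is not set"
        return 1
    fi
    if ! mkdir -p "$cdv_value" 2>/dev/null || [ ! -w "$cdv_value" ]; then
        echo "$cdv_name=$cdv_value is not writable"
        return 1
    fi
    echo "$cdv_name=$cdv_value is ready"
}
```

In a new login shell, running check_dir_var SCRDIR and check_dir_var WRKDIR should then each print a line ending in "is ready".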

Preparation

Use the following commands to download the SynchRad source code:

# optionally, remove any previous installs if necessary
rm -rf $HOME/src/synchrad
rm -rf $HOME/sw/karolina/gpu/venvs/synchrad

git clone https://github.com/berceanu/synchrad.git $HOME/src/synchrad

On Karolina, we recommend running on the accelerator nodes with fast A100 GPUs.

We use system software modules, add environment hints and further dependencies via the file $HOME/karolina_synchrad.profile. Create it now:

cp $HOME/src/synchrad/Tools/machines/karolina-it4i/karolina_synchrad.profile.example $HOME/karolina_synchrad.profile
Script Details
Listing 1 $HOME/src/synchrad/Tools/machines/karolina-it4i/karolina_synchrad.profile.example.
# please set your project account
export proj="DD-23-83"  # change me!

# remembers the location of this script
export MY_PROFILE=$(cd $(dirname $BASH_SOURCE) && pwd)"/"$(basename $BASH_SOURCE)
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your $MY_PROFILE file! Please edit its line 2 to continue!"; return; fi

# required dependencies
module purge
ml CMake/3.23.1-GCCcore-11.3.0
ml Boost/1.79.0-GCC-11.3.0
ml OpenBLAS/0.3.20-GCC-11.3.0
ml Python/3.10.4-GCCcore-11.3.0-bare
ml OpenMPI/4.1.4-GCC-11.3.0-CUDA-11.7.0

ml git/2.36.0-GCCcore-11.3.0-nodocs

# Python virtual env
if [ -d "${HOME}/sw/karolina/gpu/venvs/synchrad" ]
then
  source ${HOME}/sw/karolina/gpu/venvs/synchrad/bin/activate
fi

# compiler environment hints
export CC=$(which gcc)
export CXX=$(which g++)
export FC=$(which gfortran)
export CUDACXX=$(which nvcc)
export CUDAHOSTCXX=${CXX}

Edit the 2nd line of this script, which sets the proj variable. For example, if you are a member of the project DD-23-83, run vi $HOME/karolina_synchrad.profile, enter edit mode by typing i, and change line 2 to read:

export proj="DD-23-83"

Exit the vi editor with Esc and then type :wq (write & quit).
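If you prefer not to use an interactive editor, the same change can be made with sed. This is a minimal sketch under stated assumptions: set_proj is a hypothetical helper name, the project ID DD-00-00 in the usage note is a placeholder, and the rewrite drops the trailing "# change me!" comment from the line.

```shell
# set_proj FILE PROJECT: hypothetical helper that rewrites the
# 'export proj=...' line of a profile file. The result is written to a
# temporary file first and then moved over the original.
set_proj() {
    sp_file="$1"
    sp_proj="$2"
    sp_tmp="${sp_file}.tmp.$$"
    sed "s|^export proj=.*|export proj=\"${sp_proj}\"|" "$sp_file" > "$sp_tmp" \
        && mv "$sp_tmp" "$sp_file"
}
```

For example, set_proj $HOME/karolina_synchrad.profile DD-00-00 would set the project, after which sed -n 2p $HOME/karolina_synchrad.profile can be used to inspect the result.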

Important

Now, and as the first step on future logins to Karolina, activate these environment settings:

source $HOME/karolina_synchrad.profile

You can also add the line above to your $HOME/.bashrc file so that it is loaded on each login.
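When adding the line to .bashrc from a script, it is easy to append it more than once. The sketch below shows one way to make the step repeatable; append_once is a hypothetical helper name, not an existing tool.

```shell
# append_once FILE LINE: hypothetical helper that appends LINE to FILE
# only if FILE does not already contain exactly that line.
append_once() {
    ao_file="$1"
    ao_line="$2"
    touch "$ao_file"
    # -x: match whole lines only, -F: treat the line as a fixed string
    grep -qxF -- "$ao_line" "$ao_file" || printf '%s\n' "$ao_line" >> "$ao_file"
}
```

A possible call is append_once "$HOME/.bashrc" 'source $HOME/karolina_synchrad.profile'. The single quotes keep $HOME unexpanded, so the line is stored exactly as written above.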

Finally, since Karolina does not yet provide software modules for some of our dependencies, install them once, and activate the newly created Python virtual environment. Further environment activations will be done automatically from inside karolina_synchrad.profile.

bash $HOME/src/synchrad/Tools/machines/karolina-it4i/install_dependencies.sh
source $HOME/sw/karolina/gpu/venvs/synchrad/bin/activate
Script Details
Listing 2 $HOME/src/synchrad/Tools/machines/karolina-it4i/install_dependencies.sh.
#!/bin/bash

# Exit on first error encountered #############################################
#
set -eu -o pipefail


# Check: ######################################################################
#
#   Was karolina_synchrad.profile sourced and configured correctly?
if [ -z ${proj-} ]; then echo "WARNING: The 'proj' variable is not yet set in your karolina_synchrad.profile file! Please edit its line 2 to continue!"; exit 1; fi


# Remove old dependencies #####################################################
#
SW_DIR="${HOME}/sw/karolina/gpu"
rm -rf ${SW_DIR}
mkdir -p ${SW_DIR}

# remove common user mistakes in python, located in .local instead of a venv
python3 -m pip uninstall -qq -y synchrad
python3 -m pip uninstall -qqq -y mpi4py 2>/dev/null || true


# Python ######################################################################
#
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade virtualenv
python3 -m pip cache purge
rm -rf ${SW_DIR}/venvs/synchrad
python3 -m venv ${SW_DIR}/venvs/synchrad
source ${SW_DIR}/venvs/synchrad/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade wheel
python3 -m pip install --upgrade black 
python3 -m pip install --upgrade numpy
python3 -m pip install --upgrade pandas
python3 -m pip install --upgrade scipy
python3 -m pip install --upgrade matplotlib
python3 -m pip install --upgrade openpmd-api
python3 -m pip install --upgrade openpmd-viewer
python3 -m pip install --upgrade mpi4py --no-cache-dir --no-build-isolation --no-binary mpi4py
python3 -m pip install --upgrade numba
python3 -m pip install --upgrade mako
python3 -m pip install --upgrade pyopencl

Finally, install SynchRad itself. This will install the package in “editable” mode, meaning any changes you make to the local source code will immediately be reflected in the installed package:

cd $HOME/src/synchrad
python3 -m pip install -e .

Now, you can submit Karolina compute jobs for SynchRad Python scripts.

Update SynchRad & Dependencies

If you already installed SynchRad in the past and want to update it, start by getting the latest source code:

cd $HOME/src/synchrad

# read the output of this command - does it look ok?
git status

# get the latest SynchRad source code
git fetch
git pull

# read the output of these commands - do they look ok?
git status
git log # press q to exit

As a last step, and only if needed, reinstall SynchRad:

python3 -m pip install -e .

This is only needed if the dependencies have changed. Otherwise, the “editable” install automatically reflects the latest changes pulled from git, with no need to reinstall the package.
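One way to tell whether a pull touched the dependency declarations is to compare the commits before and after it. The following is a minimal sketch: deps_changed is a hypothetical helper name, and the file names in the usage note (setup.py, pyproject.toml, requirements.txt) are assumptions about where the dependencies are declared, so adjust them to the actual repository layout.

```shell
# deps_changed OLD NEW FILE...: hypothetical helper.
# Succeeds (exit status 0) if any of the listed files differ between
# the two commits OLD and NEW of the repository in the current directory.
deps_changed() {
    dc_old="$1"
    dc_new="$2"
    shift 2
    [ -n "$(git diff --name-only "$dc_old" "$dc_new" -- "$@")" ]
}
```

A possible workflow is to record the current commit with old=$(git rev-parse HEAD) before git pull, and afterwards run deps_changed "$old" HEAD setup.py pyproject.toml requirements.txt && python3 -m pip install -e . to reinstall only when one of those files changed.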

Running

The batch script below can be used to run a SynchRad simulation on a single GPU (change #PBS -l select= accordingly to use more) on the supercomputer Karolina at IT4I. The GPU partition has up to 72 nodes. Every node has 8x A100 (40GB) GPUs and 2x AMD EPYC 7763, 64-core, 2.45 GHz processors.

Replace the project ID DD-23-83 in the #PBS -A line with your own. Note that we run one MPI rank per GPU.

Script Details
Listing 3 $HOME/src/synchrad/Tools/machines/karolina-it4i/karolina_gpu.qsub.
#!/bin/bash -l

#PBS -q qgpu
#PBS -N synchrad
#PBS -l select=1:ncpus=16:ngpus=1:mpiprocs=1:ompthreads=16,walltime=00:05:00
#PBS -A DD-23-83

cd ${PBS_O_WORKDIR}

# Python interpreter & input script here
INTERPRETER=python3
SCRIPT=${PBS_O_WORKDIR}/synchrad_script.py

# OpenMP threads
export OMP_NUM_THREADS=16

# run
mpirun -n 1 bash -c "
    export CUDA_VISIBLE_DEVICES=\${OMPI_COMM_WORLD_LOCAL_RANK};
    ${INTERPRETER} ${SCRIPT}" \
  > output_1_gpu.txt

To run a simulation, copy the lines above to a file karolina_gpu.qsub in your run directory, for example by copying it from the source tree:

mkdir -p $SCRDIR/runs/synchrad
cp $HOME/src/synchrad/Tools/machines/karolina-it4i/karolina_gpu.qsub $SCRDIR/runs/synchrad

and run

cd $SCRDIR/runs/synchrad
qsub karolina_gpu.qsub

to submit the job.
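To use more than one GPU, the resource request in the batch script has to be scaled up. The sketch below derives a #PBS -l line from the node description above (8 GPUs and 2 x 64 cores per node, so 16 cores per GPU) and the one-rank-per-GPU rule. pbs_select_line is a hypothetical helper name, and requesting one chunk per GPU is an assumption; check the IT4I documentation for the chunk layouts the scheduler accepts.

```shell
# Values taken from the node description: 8 GPUs and 2 x 64 cores per node.
GPUS_PER_NODE=8
CORES_PER_NODE=128

# pbs_select_line NGPUS WALLTIME: hypothetical helper that prints a
# '#PBS -l' line requesting one chunk (one MPI rank, one GPU) per GPU.
pbs_select_line() {
    psl_ngpus="$1"
    psl_walltime="$2"
    psl_cores=$(( CORES_PER_NODE / GPUS_PER_NODE ))   # 16 cores per GPU
    echo "#PBS -l select=${psl_ngpus}:ncpus=${psl_cores}:ngpus=1:mpiprocs=1:ompthreads=${psl_cores},walltime=${psl_walltime}"
}

pbs_select_line 1 00:05:00
# prints: #PBS -l select=1:ncpus=16:ngpus=1:mpiprocs=1:ompthreads=16,walltime=00:05:00
pbs_select_line 4 00:30:00
# prints: #PBS -l select=4:ncpus=16:ngpus=1:mpiprocs=1:ompthreads=16,walltime=00:30:00
```

With one GPU and a five-minute walltime, the helper reproduces the line used in the batch script. When requesting N GPUs this way, the rank count in mpirun -n 1 has to be changed to N as well.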