This cluster is intended for small and medium jobs (32 x86_64 cores, 256 GB memory). It offers high memory bandwidth (8x DDR4 channels) but only 10GbE between the nodes. Since 2020 it is also used for the preparation of GPU jobs. Access is possible after login via ssh (secure shell) to hpc18.urz.uni-magdeburg.de (intra-university network only). The operating system is Linux CentOS. This cluster is not suited for personal data.
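A minimal login sketch (the user name "myuser" is a placeholder for your university account; this works from the intra-university network only):

  ssh myuser@hpc18.urz.uni-magdeburg.de        # interactive login to the head node
  ssh -X myuser@hpc18.urz.uni-magdeburg.de     # optional: with X11 forwarding for graphical tools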
#!/bin/bash
# this is a first draft (bad CPU pinning is a problem)
#
#SBATCH -J jobname1
#SBATCH -N 2                  # number of nodes, 110 GB per node, range: 1..12
#SBATCH --ntasks-per-node 32  # range: 1..32 (max 32 cores per node)
#SBATCH --time 0:59:00        # set 59min walltime
#SBATCH --mem 110000          # please use max. 110GB for better prioritization
#
exec 2>&1   # send errors into the stdout stream
echo "DEBUG: SLURM_JOB_NODELIST=$SLURM_JOB_NODELIST"
echo "DEBUG: SLURM_NNODES=$SLURM_NNODES"
echo "DEBUG: SLURM_TASKS_PER_NODE=$SLURM_TASKS_PER_NODE"
#env | grep -e MPI -e SLURM
echo "DEBUG: host=$(hostname) pwd=$(pwd) ulimit=$(ulimit -v) \$1=$1 \$2=$2"
scontrol show Job $SLURM_JOBID   # show slurm-command and more for debugging
/usr/local/bin/quota_all         # show quotas (Feb22)
module load mpi/openmpi
module list
HOSTFILE=slurm-$SLURM_JOBID.hosts
scontrol show hostnames $SLURM_JOB_NODELIST > $HOSTFILE   # one entry per host
awk '{print $1,"slots=32"}' $HOSTFILE > $HOSTFILE.2
echo "DEBUG: taskset= $(taskset -p $$)"
NPERNODE=$SLURM_NTASKS_PER_NODE
if [ -z "$NPERNODE" ];then NPERNODE=32; fi   # default: 32 MPI tasks per node
echo "DEBUG: NPERNODE= $NPERNODE"
export OMP_NUM_THREADS=$((32/NPERNODE))
export OMP_WAIT_POLICY="PASSIVE"                  # reduces OMP energy consumption
export OMPI_MCA_mpi_yield_when_idle=1             # untested, low energy OMPI ???
export OMPI_MCA_hwloc_base_binding_policy=none    # pin-problem workaround???
# default core-binding is bad, two tasks bound to 2 hyperthreads of the same core
# but helps for srun only, not for direct mpirun (no MPI standard!?)
# obsolete, if hyperthreading is disabled in the BIOS
TASKSET="taskset 0xffffffff"   # 2019-01 ok for 1*32t, 32*1t
# try to fix the 2 simplest cases here:
if [ "$NPERNODE" == 32 ];then
  export SLURM_CPU_BIND=v,map_cpu:1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
fi
if [ "$NPERNODE" == 1 ];then
  TASKSET="taskset 0xffffffff"   # 32bit mask
fi
#mpirun -np 1 --report-bindings --oversubscribe -v --npernode $NPERNODE bash -c "taskset -p $$;ps aux"
# hybrid binary mpi+multithread
mpirun --report-bindings --oversubscribe -v --npernode $NPERNODE $TASKSET mpi-binary -t$OMP_NUM_THREADS

Run with: sbatch jobfile
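A short sketch of submitting and checking the job, assuming the script above was saved as "job.sh" (the file name and <jobid> are placeholders):

  sbatch job.sh                   # submit, prints the job id
  squeue -u $USER                 # show the queue state of your jobs
  scontrol show job <jobid>       # detailed job info
  grep -i bind slurm-<jobid>.out  # after the run: check the mpirun --report-bindings output for the CPU pinning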
2018-02     installation of 12 nodes
2018-04     bad CPU pinning by slurm(?)/openmpi(?), workaround script
2018-05     set MTU=9000 (default 1500) to improve 10GbE network speed (200%)
2019-02     +5 nodes
2019-05-14  set overcommit_memory=2 to avoid Linux crashes under memory pressure
2019-06-12  power loss due to a short circuit during work on the room's electrical installation
2020-03-03  upgrade of the ssh-access link from 1GbE to 10GbE
2020-10-14  system reconfiguration due to unstable 10GbE in progress
2021-06-15  slurm partition reconfiguration (about one week of testing)
2021-07-07  network and configuration problems after system update
2021-08-09  remote shutdown tests for 2021-08-16
2021-08-16  planned maintenance, no air conditioning, 12h downtime
2022-02-23  quota_all added, showing user quotas
2022-05-16  downtime due to slurm security update
2023-06-21  fixed default routing issues on node01 + node03
2023-06-21  firewall blocks outgoing worldwide traffic, ask the admin on need (security)
2023-06-21  dbus.service and rpcbind deactivated on the nodes (security, stability)
2023-06     planned: partial memory + network upgrade