Computer Cluster for Analysis of Data from SLS and SwissFEL Facilities

The official cluster name is Ra and the cluster will be referenced below by this name.

User Mailing List

All users are strongly invited to subscribe to the ra-users mailing list at https://psilists.ethz.ch/sympa/subscribe/ra-users, where important announcements that may impact users' activity, such as downtimes and policy changes, will be sent.

For any problem or further information please contact us at: ra-admins@lists.psi.ch

To be able to use the Ra cluster you need:

  • a PSI account
  • authorization to use the Ra cluster.

PSI Account

If you don't have a PSI account, please follow this procedure

Authorization for the Ra cluster

Please contact your beamline manager and provide the following information:

  • your PSI account
  • data identifier (Proposal ID or e-account used to collect the data) for the data you need to access.

For the beamline manager:

Log in to DUO, select "Beamline Manager" in the left-hand side menu and then "Access to ra-cluster".

For any problem or further information please contact ra-admins@lists.psi.ch or the PSI Helpdesk at helpdesk@psi.ch. Please do not call the admins unless the matter is urgent (e.g. a beamtime is severely impacted by the issue).

The names of the login nodes are

  • ra-l-003.psi.ch
  • ra-l-004.psi.ch
  • ra-l-005.psi.ch
  • ra-l-006.psi.ch

They are also available under the alias ra.psi.ch. From the PSI subnet, you can connect to the login nodes directly via ssh or with NoMachine, using ra-nx.psi.ch as the target host. From outside the PSI subnet (e.g. when not onsite, or not using the VPN), you have two options to connect to Ra.

Please note that for both options (ssh via hop.psi.ch and the NoMachine connection through rem-acc.psi.ch) you need Multi-Factor Authentication (MFA) enabled; please follow the instructions at https://www.psi.ch/en/computing/change-to-mfa

Option one: connect to the hop.psi.ch system and from there go via ssh to one of the Ra login nodes

$ ssh <PSI_account>@hop.psi.ch
Password: <password>
# after logging in on hop.psi.ch
$ ssh ra-l-003.psi.ch
Password: <password>
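
If your local ssh client supports OpenSSH's ProxyJump option (OpenSSH 7.3 or newer), the two steps above can be combined into a single command. This is only a convenience sketch, not part of the official procedure:

$ ssh -J <PSI_account>@hop.psi.ch <PSI_account>@ra-l-003.psi.ch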

For further details see

Option two (recommended): NoMachine connection via the Remote Access Server

Please follow these instructions

From the PSI subnet or when using the PSI VPN, use ra-nx as the target for the NoMachine connection, in order to reduce the load on the VPN and network infrastructure.

rem-acc is accessible from the PSI subnet, but not from the PSI VPN.

NoMachine hints:

  • Set the display preferences of NoMachine ("Ctrl+Alt+0") to "Resize remote screen". This will provide a better default resolution, matching the remote desktop on the Ra login node to your monitor.
  • For heavy graphical applications, try running them with vglrun prepended to the application name. See the difference for yourself by running "glxgears" vs. "vglrun glxgears".
  • If a NoMachine session is not accessed for 5 days, it is automatically terminated.

Login nodes usage limits

Login nodes are the entry point to the Ra cluster and are shared by all users. For this reason, please avoid overloading the login nodes with CPU- or memory-intensive jobs; those should be run through the batch system instead.

 

There are several file systems for different purposes on the Ra cluster:

File system  | Path                     | Default quota | Access rights      | Access mode | Used for
HOME         | /mnt/das-gpfs/home/$USER | 5 GB          | user only          | read-write  | user home directory, code, private data, etc.
SLS raw data | /sls/$BEAMLINE           | -             | the PI* group only | read-only   | SLS data
work data    | /das/work/$PGR/$PGROUP   | 4 TB          | the PI* group only | read-write  | derived data

*) PI -- principal investigator or Main Proposer

Example:

/sls/X06SA/Data10/e15874    # contains the raw data for the PI group 15874
/das/work/p15/p15874        # contains the derived data for the PI group 15874


To see the group members for a given group use the getent group command, e.g.:

getent group p15874
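
Conversely, to list all the groups (including p-groups) your own account belongs to, you can use the standard id command:

id -Gn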


To check the quota for your home directory, use the homequota command on one of the login nodes:

homequota

Work area

The p-groups inside the work area are split between the so-called internal and external areas. The default quota per p-group applies only to the external p-groups. The internal p-groups belong to one and only one unit, which provides a certain amount of space on the work area to all its p-groups. The p-groups of a unit do not have a default quota, so a single p-group could fill up all the space on the work area for its unit.

To see which unit a p-group belongs to and whether it is internal or external:

[talamo_i@ra-l-002 ~]$ /das/support/users/space_usage/pgroup_info p17277
Name: p17277
Unit: tomcat
Kind: internal
Used: 63 GB
Members: ozerov_d,talamo_i,gsell
Unit quota: 500 TB

Files permission and ownership

All files inside a specific p-group directory are considered to be owned by that p-group, meaning that the unix group of the files should be the p-group and that the group should have read-write access, i.e. all its members have read-write access. For this reason, a regular process checks file permissions and ownership and fixes them.

The fix happens automatically every hour, but in case you need to change permissions/ownership sooner, you can run the following commands for a single file:

chmod g+w file
chgrp your-p-group file


And in case you want to do it recursively on a directory:

chmod -R g+ws dir
chgrp -R your-p-group dir
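
On a large directory tree, it can be faster to touch only the entries that actually need fixing. Below is a minimal sketch using the standard find command; dir and your-p-group are placeholders as above, and the sketch only adds group write permission (it does not set the setgid bit):

find dir ! -group your-p-group -exec chgrp your-p-group {} +   # fix the group only where it differs
find dir ! -perm -g+w -exec chmod g+w {} +                     # add group write only where it is missing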

The data analysis and advanced development software is available via the PSI Environment Modules (Pmodules) system. Use the following commands to manage your environment:

module list               # show the loaded modules
module avail              # show the available modules
module search             # show all modules in the hierarchy
module help               # if you do not remember what to do
module add module_name    # add module_name
module rm module_name     # remove module_name


Example:

module load matlab/2015b # load Matlab 2015b


There are MX-beamline specific environment configuration files in the /etc/scripts/ directory:

 

  • /etc/scripts/mx_sls.sh - the default configuration for the analysis of SLS data
  • /etc/scripts/mx_fel.sh - the default configuration for the analysis of data from FEL sources

Source the corresponding configuration file to use the predefined settings, for example:

source /etc/scripts/mx_sls.sh


The environment settings done with module are effective only in the current shell and all its child processes.

You may wish to edit the .bashrc file in your home directory to make permanent changes to your environment: see the comments therein for more details.
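
For example, to have a frequently used module loaded in every new shell, a line like the following could be added to ~/.bashrc. This is a minimal sketch, assuming the Pmodules initialization has already run when .bashrc is sourced; matlab/2015b is just the module from the example above:

module load matlab/2015b   # loaded automatically in every new shell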

Scientific Applications

The list of available software includes: Matlab, Python, the Intel and GCC compilers, Fiji, standard MX software such as xds, shelx, hkl2map, adxv, CBFlib, ccp4, dials, mosflm, phenix, and software used for serial crystallography such as crystfel, cheetah, etc. For the complete list use the following commands:

module use unstable
module avail

The Ra login nodes should be used mainly for development, small and quick tests, and work with graphical applications.

For CPU-intensive work, the compute nodes must be used. There are presently 48 compute nodes in the cluster, each with 256 GB of RAM, an InfiniBand interconnect, and a 10 GbE network.

Computing node | Processor on each node                                     | Number of cores on each node
c-033..048     | 2x Intel Xeon Gold 6140 (2.30 GHz)                         | 36 (2x18)
c-049..072     | 2x Intel Xeon Gold 6230 (2.10 GHz)                         | 40 (2x20)
c-073..084     | 2x Intel Xeon Gold 6230R (2.10 GHz)                        | 52 (2x26)
c-085..096     | 2x AMD EPYC 7453 (2.75 GHz)                                | 56 (2x28)
gpu-001        | 2x Intel Xeon E5-2690 v4 (2.60 GHz) + 4x NVIDIA V100 16 GB | 28 (2x14)
gpu-002        | Intel Xeon Gold 6248R (3.00 GHz) + 8x NVIDIA A100 40 GB    | 48 (2x24)

Access to the compute nodes is controlled by Slurm, a modern workload manager for Linux clusters. You can allocate compute nodes for interactive use or submit batch jobs using Slurm.

Useful commands:

sinfo     # view information about Slurm nodes, in particular idle (free) nodes
squeue    # view information about jobs in the scheduling queue (useful to find your nodes)
salloc    # request a job allocation (a set of nodes) for further use
srun      # allocate compute nodes and run a command inside the allocated nodes
sbatch    # submit a batch script
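
For example, an interactive shell on a compute node could be requested as follows; the partition, core count, memory, and time limit are illustrative values, not site recommendations:

srun -p day -c 4 --mem=24G -t 2:00:00 --pty bash   # 4 cores, 24 GB, 2 hours on the "day" partition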


The present Slurm configuration on Ra implements two modes of allocating resources on the compute nodes: a) "shared" allocation (partition "shared"): your job shares the computing resources of the node with other jobs; b) "whole node" allocation (partitions "day" and "week"): you have exclusive access to the allocated compute nodes (not shared with other users) within the requested time limits. By default (if you don't specify a partition name) jobs land on the "shared" partition.

Partition name                       | shared/exclusive access to computing node | default allocation time | maximum allocation time
hour (default)                       | shared                                    | 1 hour                  | 1 hour
day                                  | shared                                    | 4 hours                 | 24 hours
week                                 | shared                                    | 2 days                  | 8 days
gpu (dedicated to specific accounts) | shared                                    | -                       | 14 days
gpu-week                             | shared                                    | -                       | 7 days
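
The partitions and their limits can also be checked directly on the cluster with sinfo; the output format string below is only one possible choice:

sinfo -o "%P %a %l %D %t"   # partition, availability, time limit, node count, node state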

Example:

sbatch job.sh                       # submit a job to the default partition, with an allocation time of 1 hour
sbatch -p week job.sh               # submit to the partition with a longer allocation time (2 days if not specified)
sbatch -p week -t 4-5:30:00 job.sh  # submit a job with a time limit of 4 days, 5 hours and 30 minutes (max. allowed time limit is 8 days)
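
As a reference, a minimal job.sh could look like the sketch below; the resource values, the module, and the final command are placeholders to adapt to your own analysis:

#!/bin/bash
#SBATCH --partition=day          # partition to submit to
#SBATCH --time=04:00:00          # time limit for the job
#SBATCH --cpus-per-task=8        # number of cores
#SBATCH --mem=48G                # total memory for the job
#SBATCH --output=job_%j.out      # where to write the job output (%j = job ID)

module load matlab/2015b         # placeholder: load whatever software your analysis needs
srun my_analysis_script.sh       # placeholder for the actual analysis command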

When the time limit is reached, all unfinished processes will be killed.

Please do not forget to release the resources you no longer need, otherwise they will remain unavailable to other users until your allocation expires. Holding on to idle nodes will have a negative impact on the scheduling priority of your future jobs, since the priority of your Slurm jobs depends on your past usage according to the fair-share mechanism.
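
For example, to list your own running and pending jobs and to cancel an allocation you no longer need (standard Slurm commands; <jobid> is a placeholder):

squeue -u $USER    # list your own jobs with their job IDs
scancel <jobid>    # cancel the job / release the allocation with that ID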

For interactive use of the Ra cluster, one can also use Jupyter notebooks.

More examples and details on how to use Slurm on the Ra cluster can be found in the Ra help pages.


Migration Guide

On September 30 2022, the Slurm configuration will migrate away from full node allocation as the default. This paragraph quickly guides you through the changes you might need to make in your submission scripts:

  • by default you will be allocated 1 core and 6 GB of RAM per core; a combined example is sketched after this list.
    • if you need more cores, please use the --cpus-per-task option
    • if you need more memory per core, please use the --mem option, otherwise memory will scale automatically with the number of cores
    • if you need to allocate a GPU (partitions gpu and gpu-week), please use --gpus="[<type>:]<num_gpus>", e.g. --gpus=2 or --gpus=A100:2 if you need to specify the GPU type
  • to quickly restore the old behavior (exclusive access to a node), please use the --exclusive option. Please note that this could waste resources and impact your queue priority, so use it only when necessary.
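
Putting these options together, a submission under the new defaults might look like the following sketch; the resource values are illustrative and the GPU type and count must match what you actually need:

sbatch -p gpu --cpus-per-task=8 --mem=64G --gpus=A100:2 job.sh   # 8 cores, 64 GB of memory in total, 2 A100 GPUs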

The rest of the Slurm documentation will be adapted after the migration.

My heavy graphical application (coot, for example) runs slowly via NoMachine

Try reducing the resolution in your NoMachine client settings (Ctrl+Alt+0, then "Display", then "Settings") to find a compromise between speed, comfort and network latency.

More detailed documentation on the use of Slurm on Ra and a description of the NoMachine setup are available on the internal PSI page (accessible from the PSI subnet, the PSI VPN, or within a NoMachine session (Ra help)).