Guidance on submitting a technical assessment for Sulis

Applications to use Sulis via the EPSRC Access to HPC call require submission of a technical assessment (TA) form to sulis@warwick.ac.uk in advance of full proposal submission.

Some specific guidance on completing a TA for Sulis is below. If you are unsure how a particular question relates to your project then please feel free to contact us on sulis@warwick.ac.uk for help completing the form. It may be the case that not all questions are relevant to your project or that you have a use case we haven't previously encountered. We are keen to work with prospective users on this.

Exploratory access to Sulis for running tests necessary to complete the TA can easily be provided and is strongly encouraged for first-time applicants to the service. Please contact us on the above email address to arrange this.

Section 1.5 (Proposed start date of Tier-2 use)

Projects should expect to start on the date advised by EPSRC.

Section 1.6 (Project length)

Projects can be six or twelve months in duration.

Section 3.1 (Support requirements)

Sulis targets high throughput and ensemble computing workflows. Please use this section to indicate how you will structure your workflow:

Built-in "task farming" capability within software packages, for example partitions in LAMMPS or jobfarms in MCCCS Towhee.
Use of job arrays in SLURM.
Use of high-level frameworks for managing concurrent execution of functions over multiple inputs, such as joblib or Dask in Python or mclapply in R.
Command line tools for passing different arguments to multiple concurrent instances of a program, such as GNU Parallel.

Typically, workloads will use more than one of these mechanisms, for example by launching multiple workers to populate a full 128-core node in each element of a job array with several hundred elements.
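As an illustration of combining two mechanisms, the following minimal Python sketch shows how one element of a SLURM job array might fill a node with worker processes. It assumes single-core tasks; the names run_task and task_indices are hypothetical placeholders and not part of any Sulis tooling. SLURM_ARRAY_TASK_ID is the environment variable SLURM sets for each element of a job array.

```python
import os
from concurrent.futures import ProcessPoolExecutor

CORES_PER_NODE = 128  # compute nodes in Sulis contain 128 cores


def run_task(task_index):
    """Hypothetical stand-in for one independent single-core task."""
    return task_index * task_index


def task_indices(array_index, tasks_per_element):
    """Global task indices handled by one element of the job array."""
    start = array_index * tasks_per_element
    return range(start, start + tasks_per_element)


def main():
    # Default to element 0 so the script also runs outside the batch system.
    array_index = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    indices = task_indices(array_index, CORES_PER_NODE)
    # One worker per core on a full node, fewer on a smaller test machine.
    workers = min(CORES_PER_NODE, os.cpu_count() or 1)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_task, indices))
    print(f"array element {array_index}: completed {len(results)} tasks")


if __name__ == "__main__":
    main()
```

Under these assumptions, a job array with several hundred elements would run this script once per element, each on its own node.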

It is also desirable to avoid workflows which require creation of large numbers of small files, as this can negatively impact performance of the file system. If this applies to your proposed project then use this section to indicate where support might be required to mitigate this.
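One possible mitigation, shown here only as a sketch, is to collect per-task results in memory and write them to a single file per job rather than one file per task. The record layout, the file name results.jsonl and the helper names below are hypothetical; the appropriate approach will depend on your software.

```python
import json
import os
import tempfile


def write_results(results, path):
    """Write one JSON record per line to a single combined file."""
    with open(path, "w") as handle:
        for record in results:
            handle.write(json.dumps(record) + "\n")


def read_results(path):
    """Read the combined file back into a list of records."""
    with open(path) as handle:
        return [json.loads(line) for line in handle]


# Hypothetical results that would otherwise go to 1000 separate small files.
results = [{"task": i, "energy": -1.5 * i} for i in range(1000)]

with tempfile.TemporaryDirectory() as workdir:
    combined = os.path.join(workdir, "results.jsonl")
    write_results(results, combined)
    files_created = len(os.listdir(workdir))
    recovered = read_results(combined)

print(f"{len(recovered)} records stored in {files_created} file")
```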

Please also indicate if your project will require use of Singularity containers.

Section 4.1 (Job size mix)

Use this section to indicate the size of the overall jobs you will submit to the batch system, not the size of the individual simulations/tasks/elements within each job. Please include information on how many simulations/tasks/elements you will run per node, and how many nodes will be used in total for each of the three job categories.

Compute nodes in Sulis contain 128 cores.
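The quantities requested in this section follow from simple arithmetic. The sketch below uses hypothetical figures (1000 four-core tasks spread over 4 nodes) and a hypothetical helper, describe_job, to show how tasks per node and overall job size relate.

```python
import math

CORES_PER_NODE = 128  # compute nodes in Sulis contain 128 cores


def describe_job(total_tasks, cores_per_task, nodes):
    """Summarise how a set of equally sized tasks maps onto whole nodes."""
    tasks_per_node = CORES_PER_NODE // cores_per_task
    concurrent_tasks = tasks_per_node * nodes
    # Number of successive batches needed to work through every task.
    batches = math.ceil(total_tasks / concurrent_tasks)
    return {
        "tasks_per_node": tasks_per_node,
        "concurrent_tasks": concurrent_tasks,
        "total_cores": nodes * CORES_PER_NODE,
        "batches": batches,
    }


# Hypothetical example values, not a recommendation.
summary = describe_job(total_tasks=1000, cores_per_task=4, nodes=4)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Here the job size to report is 4 nodes (512 cores) running 32 tasks per node, not the 4 cores used by each individual task.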

Section 4.2 (Disk space requirements)

We intend to allocate each project 2 TB of storage per user. If your project needs a larger capacity then please indicate this here and justify the amount requested.

Section 6 (Scaling evidence)

Jobs submitted to Sulis may benefit from multiple levels of parallelism. In many cases there will be no communication required between individual simulations/elements/tasks (trivial or "embarrassing" parallelism) and so the highest level of parallelism will exhibit perfect scaling. If this is the case for your project then please state this clearly. In such situations there is no need to provide numerical evidence for this.

In other cases there may be a need for light communication between the multiple simulations/elements/tasks that constitute the overall job. If so, please provide information to indicate how this impacts scaling of the overall job to the sizes you indicated in section 4.1. Please also provide information on how that communication is implemented, as some mechanisms may be unsupported.

Individual simulations/elements/tasks within the job may also be executed in parallel via MPI, OpenMP or similar. For example, a single job might run sixteen 8-core instances of an MPI code per node, and use 12 nodes for the overall job. In these cases please also provide data demonstrating that those instances scale efficiently to (in this case) 8 cores.
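Scaling data of this kind is commonly summarised as speedup and parallel efficiency relative to the smallest core count tested. The sketch below shows one way to compute these; the wall-clock timings are invented purely for illustration and the function name scaling_table is hypothetical.

```python
def scaling_table(timings):
    """Return (cores, speedup, efficiency) rows from {cores: wall time in s}.

    Speedup and efficiency are measured relative to the smallest core count.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        efficiency = speedup * base_cores / cores
        rows.append((cores, speedup, efficiency))
    return rows


# Hypothetical timings for a single instance of an MPI code.
timings = {1: 800.0, 2: 410.0, 4: 215.0, 8: 118.0}

rows = scaling_table(timings)
print("cores  speedup  efficiency")
for cores, speedup, efficiency in rows:
    print(f"{cores:5d}  {speedup:7.2f}  {efficiency:10.2f}")
```

With these invented numbers the 8-core efficiency is roughly 85%. Sixteen such instances would occupy all 128 cores of a node.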