Single Node Performance
Here is a record of simple performance and scaling tests for Lare3d from August 2018. These were generated on a 36-core Linux workstation with Intel Xeon CPU E5-2695 v4 @ 2.10GHz processors, using ARM Forge and Reports software version 18.0.2 on Ubuntu 16.04.5 LTS. The code was compiled with gfortran and OpenMPI.
The code base was a branch from Lare3d version 3.4.1.
Lare3d with 64^3 grid using idealMHD and running for 100 steps on 1 core - ARM Report
Lare3d with 64^3 grid using idealMHD and running for 100 steps on 8 cores - ARM Report
Lare3d with 128^3 grid using idealMHD and running for 10 steps on 1 core - ARM Report
Lare3d with 128^3 grid using idealMHD and running for 10 steps on 8 cores - ARM Report
Lare3d with conduction and resistivity on, with 128^3 grid using idealMHD and running for 10 steps on 8 cores - ARM Report
LARE3D and LARE2D both show good hard (strong) scaling to large numbers of processors. The above figure shows hard scaling of LARE3D simulations on a cluster using dual socket Xeon E5-2670s (16 cores per package). The simulation sizes (128^3 grid) are comparable to the smallest simulations often used in Solar MHD studies when doing initial parameter scans prior to full-scale simulations on larger grids. The smallest simulation scales from 16 processors (single package) to 512 processors with a scaling of 0.99 of ideal. Scaling then breaks down at 1024 cores. This is shown to be a limit only of the hard scaling behavior by repeating the test with a larger (256^2 x 128) grid. These latter simulations show super-linear scaling from 256 processors to 1024. LARE2D behavior is comparable to LARE3D. Note that because the scaling of LARE2D and LARE3D is linear, the core hours required per job are independent of the number of cores used.
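The closing remark, that linear scaling makes the core hours per job independent of the core count, can be checked with a little arithmetic. The following is a minimal Python sketch, not part of Lare3d: the function name `strong_scaling` is hypothetical, and the wall-clock times are made-up illustrative numbers chosen to scale perfectly, not measurements from the runs described here.

```python
def strong_scaling(base_cores, base_seconds, cores, seconds):
    """Return (speedup, efficiency, core_hours) for one run, relative to a baseline run.

    speedup    = baseline wall-clock time / this run's wall-clock time
    efficiency = speedup / ideal speedup, where ideal speedup = cores / base_cores
    core_hours = cores * wall-clock time in hours
    """
    speedup = base_seconds / seconds
    ideal_speedup = cores / base_cores
    efficiency = speedup / ideal_speedup
    core_hours = cores * seconds / 3600.0
    return speedup, efficiency, core_hours


# Hypothetical (cores, wall-clock seconds) pairs: illustrative only,
# NOT timings taken from the Lare3d tests described in this page.
runs = [(16, 3600.0), (64, 900.0), (512, 112.5)]

base_cores, base_seconds = runs[0]
for cores, seconds in runs:
    speedup, efficiency, core_hours = strong_scaling(
        base_cores, base_seconds, cores, seconds
    )
    print(
        f"{cores:5d} cores: speedup {speedup:6.2f}, "
        f"efficiency {efficiency:4.2f}, core hours {core_hours:5.2f}"
    )
```

With these perfectly scaling example timings every run costs the same 16 core hours. An efficiency below 1 would raise the core hour cost as cores are added, and super-linear scaling (efficiency above 1) would lower it.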