How to Run an AcCoRD Simulation
This page has instructions for running AcCoRD simulations, including discussions of random number seeds and running AcCoRD on a compute cluster. These instructions assume that you have already prepared a configuration file. If not, then please refer to the AcCoRD Configuration page for details or the How to Use AcCoRD page for an overview. Once your simulations have been run, you can refer to the AcCoRD Output page for reading the files and importing into MATLAB.
Instructions for Running AcCoRD
Once you have a configuration file ready, you can run it as follows:
- Open a command line window (e.g., run cmd.exe in Windows or open a terminal in Linux).
- Navigate to the AcCoRD "bin" directory using the "cd" command (e.g., enter "cd \PATH_TO_ACCORD\AcCoRD-1.0\bin\" on Windows).
- Run either the optimal or debug executable. Generally, the optimal executable is recommended. The call syntax for each is the same: "MY_EXECUTABLE MY_CONFIG SEED_VALUE", where the executable MY_EXECUTABLE is:
- accord_win.exe or accord_win_debug.exe on Windows
- ./accord_dub.out or ./accord_dub_debug.out on Debian or Ubuntu Linux
- ./accord_rc.out or ./accord_rc_debug.out on RHEL or CentOS Linux
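The call syntax above can also be assembled in a script, which is convenient when many runs are planned. The following is a minimal Python sketch and is not part of AcCoRD. The function name build_accord_command, the platform labels, and the configuration file name my_config.txt are hypothetical; only the executable names and the argument order are taken from this page. It assumes the command is issued from the "bin" directory.

```python
# Executable names as listed on this page: (optimal, debug).
# The dictionary keys are labels invented for this sketch.
EXECUTABLES = {
    "windows": ("accord_win.exe", "accord_win_debug.exe"),
    "debian_ubuntu": ("./accord_dub.out", "./accord_dub_debug.out"),
    "rhel_centos": ("./accord_rc.out", "./accord_rc_debug.out"),
}


def build_accord_command(platform_label, config_file, seed, debug=False):
    """Return the argument list MY_EXECUTABLE MY_CONFIG SEED_VALUE."""
    optimal_exe, debug_exe = EXECUTABLES[platform_label]
    executable = debug_exe if debug else optimal_exe
    return [executable, config_file, str(seed)]


if __name__ == "__main__":
    # "my_config.txt" is a placeholder for your own configuration file.
    command = build_accord_command("debian_ubuntu", "my_config.txt", 1)
    print(" ".join(command))
    # To actually launch the run from the "bin" directory:
    #   import subprocess
    #   subprocess.run(command, check=True)
```

Returning a list rather than a single string lets the command be passed directly to subprocess.run without shell quoting concerns.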
While AcCoRD runs, the information printed to the command line will include the following:
- Version information.
- Where the configuration file was found (if at all).
- Warnings from the configuration file parameters. If there are any warnings, and the "Warning Override" property is false, then execution will pause and you will be prompted to continue or cancel the simulation.
- A summary of the configuration (e.g., number of regions, subvolumes, actors, etc.)
- Location and name of the two output files that will be created. A "results" folder will be created inside the "bin" directory if it does not already exist and if there is no "results" sibling directory. The files will be named MY_OUTPUT_SEEDX.txt and MY_OUTPUT_SEEDX_summary.txt, where MY_OUTPUT is the "Output Filename" defined in the configuration file, and X is the seed value. If the output files already exist, then they will be overwritten.
- Simulation progress with estimated time remaining (if there are multiple realizations being simulated)
- Simulation run time.
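Because existing output files are overwritten without a prompt, it can be worth checking for them before a run. The sketch below is an illustration only, assuming the naming pattern described above (MY_OUTPUT_SEEDX.txt and MY_OUTPUT_SEEDX_summary.txt). The function names are hypothetical, and the results directory must be supplied by you, since its location depends on whether a "results" sibling directory exists.

```python
from pathlib import Path


def expected_output_files(output_filename, seed):
    """Return the two file names AcCoRD is described as creating for a seed."""
    base = f"{output_filename}_SEED{seed}"
    return f"{base}.txt", f"{base}_summary.txt"


def existing_outputs(results_dir, output_filename, seed):
    """List expected output files that already exist and would be overwritten."""
    results_dir = Path(results_dir)
    names = expected_output_files(output_filename, seed)
    return [results_dir / name for name in names if (results_dir / name).exists()]


if __name__ == "__main__":
    # "results" and "MY_OUTPUT" are placeholders for your own setup.
    for path in existing_outputs("results", "MY_OUTPUT", 1):
        print(f"Would be overwritten: {path}")
```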
Here is a sample command line output on Windows running one of the sample configuration files:
Here is a sample command line output on Windows running a modified sample configuration file where two unnecessary properties have been added:
As we can see above, warnings appeared and told the user what action was being taken to address them. The user also had to press "enter" in order to continue the simulation.
Taking Advantage of Seeds
One of the global parameters in an AcCoRD simulation is the random number "seed". The seed is used to initialize the random number generator (RNG). The RNG drives all of the random behavior in a simulation, such as the values of random bits, the displacement of microscopic molecules, and when chemical reactions occur. A key desired property of an RNG is that it can generate a long sequence of values that are effectively independent, such that someone who knows all the prior values in the sequence cannot guess the next value. Furthermore, sequences that are initialized with different seeds should also be independent. One of the primary features of AcCoRD is that its design facilitates repeating a simulation a large number of times. It can do this in one of two ways:
- In a single execution, a simulation is repeated multiple times according to the "Number of Repeats" property in the "Simulation Control" object. Each of these realizations should be independent.
- Every execution uses a seed. A simulation can be called multiple times, each with a different seed, and then all of the realizations in every execution should be independent.
Of course, every execution creates its own output. Why would you want to use different seeds and create more output files? The main reason is that AcCoRD is a single-threaded program. Parallel execution can be achieved by running multiple instances of the same simulation at the same time, each with its own seed. This concept can be extended to running AcCoRD on a compute cluster (see below). The number of output files is generally not an issue, since the AcCoRD import utility for MATLAB can combine output from a simulation that was run with different seeds.
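As a rough illustration of running several instances at once, the Python sketch below launches one process per seed and limits how many run concurrently. It is not part of AcCoRD: the function names and the configuration file name my_config.txt are hypothetical, and it assumes the script is started from the "bin" directory. The threads here only wait on the separate AcCoRD processes, each of which does its own work.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def seed_commands(executable, config_file, seeds):
    """Build one MY_EXECUTABLE MY_CONFIG SEED_VALUE command per seed."""
    return [[executable, config_file, str(seed)] for seed in seeds]


def run_in_parallel(commands, max_workers, runner=None):
    """Run each command, at most max_workers at a time; return results in order."""
    if runner is None:
        # Default: launch the process and return its exit code.
        def runner(command):
            return subprocess.run(command, check=True).returncode
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, commands))


def total_realizations(num_seeds, repeats_per_seed):
    """Each execution contributes its own "Number of Repeats" realizations."""
    return num_seeds * repeats_per_seed


if __name__ == "__main__":
    commands = seed_commands("./accord_dub.out", "my_config.txt", range(1, 5))
    print(total_realizations(len(commands), 100), "realizations planned")
    # Uncomment to launch four seeds, two at a time:
    # run_in_parallel(commands, max_workers=2)
```

Since a configuration warning pauses execution and waits for input, it is safer to confirm that the configuration runs without warnings before launching instances this way.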
Using a Compute Cluster
If you have access to a computing cluster, then it can be a great resource for running simulations. Clusters might be managed locally, such as in a single research lab or department, or they could be multi-institutional infrastructure, such as WestGrid and the Centre for Advanced Computing, which have both been used at different stages of AcCoRD's development. Unfortunately, different clusters can have very different ways to set up and execute code, so detailed instructions cannot be provided here and the potential for support by the AcCoRD developer is limited. However, here are some general guidelines and tips to give you an idea of what might be involved:
- You often need some form of SSH access to log in to a remote terminal that serves as the gateway to the cluster. The cluster usually has some kind of registration procedure that, once completed, will give you SSH credentials (username/password). On Windows, PuTTY is commonly used for SSH. Linux usually has SSH via command line by default.
- You will need a way to transfer files to and from the cluster. A full install of PuTTY includes utilities to transfer files. Alternatively, an interface-based program like FileZilla can be very helpful to do this.
- You will (probably) need to compile the AcCoRD executable directly on the cluster. Refer to the "Compile from Source" instructions on the AcCoRD Downloads page, but replace "preferred directory" with "directory on the cluster" and replace "command line terminal" with "SSH session". If the cluster is using a Linux-based OS (which it most likely is), then the Linux-based build scripts should work fine.
- A cluster usually has a job submission system where you submit a "job" to run one simulation. The syntax for this can vary greatly. Once you figure out how to submit a job (usually there are sample commands provided), it can be very helpful to write a script with a for-loop that submits one job for each value in a range of seed values.
- A cluster usually has some kind of fairness protocol to limit how many computing resources a user has access to. You may need to trade off between the number of seeds and the number of "Repeats" in each simulation in order to run your simulations in a reasonable time.
- The command line output on a cluster is usually written to a unique data file for each simulation. Generally, there is no mechanism to deal with configuration warnings, so be sure that your configuration file doesn't generate any warnings (or - not recommended - set "Warning Override" to true).
- If you run many simulations with a lot of seed values, then you may need to transfer large numbers of files. It can be much faster to combine files into a "zip" file (or some other compression method) and then transfer the zip file. Total file size can be reduced by up to an order of magnitude, and the transfer rate is much faster when there are fewer individual files.
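Two of the tips above, submitting one job per seed and bundling results before transfer, can be sketched in Python as follows. This is an illustration under stated assumptions, not a recipe for any particular cluster: the function names are hypothetical, and the submission template is a placeholder, because the real job submission syntax depends entirely on your cluster's scheduler.

```python
import zipfile
from pathlib import Path


def submission_commands(template, seeds):
    """Fill a command template containing "{seed}" once per seed value."""
    return [template.format(seed=seed) for seed in seeds]


def zip_results(results_dir, archive_path):
    """Compress every .txt file in results_dir into one zip archive.

    Returns the number of files added.
    """
    files = sorted(p for p in Path(results_dir).glob("*.txt") if p.is_file())
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # Store only the file name so the archive has no directory prefix.
            archive.write(path, arcname=path.name)
    return len(files)


if __name__ == "__main__":
    # "MY_SUBMIT_COMMAND" stands in for your scheduler's submission command,
    # and "my_config.txt" for your own configuration file.
    template = "MY_SUBMIT_COMMAND ./accord_dub.out my_config.txt {seed}"
    for line in submission_commands(template, range(1, 11)):
        print(line)
    # After the jobs finish, bundle the output before transferring it:
    # zip_results("results", "results.zip")
```

Printing the commands first, rather than executing them, makes it easy to inspect what would be submitted before committing cluster resources.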
The Next Step
Once you have run your simulation(s), you can examine your output and import it to MATLAB.