Remote Access

Remote Machines

From your desktop machine, you will often want to access other machines, such as Warwick's godzilla or the SCRTP cluster facilities.

Note: when accessing remote machines, you often see a welcome message. Do not ignore this! It can contain important information. For example at the time of writing this, Warwick's Orac cluser machine shows the message below. Notice the section about where you may run compute jobs.

Orac welcome message

The message above refers to login nodes as compared to compute nodes. When you access most cluster systems, you go through such an externally accessible login machine. On this machine you have access to all your documents, but cannot run compute jobs. Instead, you prepare a submission script and submit this to a queue to be run on compute nodes. Smaller systems may not make this distinction. Most systems allow compilation and file compression and copying. Some also allow basic data visualisation on login-nodes. Check the welcome messages and documentation to be sure.

Using Machines

Information on the SCRTP high performance facilities is available here and the Cluster of Workstations (COW) is here. The former provides about 2400 processors available to groups and researchers by arrangement. Ask your supervisor about access if required. The COW allows running of small jobs in the background across all SCRTP workstations and some dedicated machines. Both of these are controlled by a queueing system. More information is available on the wiki (login required).

In general, you are allowed to do tasks needed to write and compile your code on the login nodes directly, and also tasks associated with moving files, for example compressing them. All jobs must be submitted to the queue, including most visualisation and analysis tasks. For these, an interactive queue is available, which lets you work at a shell equivalent to the desktop. Small and/or simple data plotting tasks are allowed on login nodes, but be please be courteous.

If you need to run very large jobs (generally about 512 processors or more) you will want to apply for time on a large facility. For example, Warwick is part of HPC Midlands Plus. Many facilities are summarised here.

SSH

The basic command to log-on to a remote machine is

ssh [username]@[machinename]

The [username] can be left out if it is the same as on your local (current) machine.

Often you want to be able to open windows on the remote computer and see them locally, for which the flag -X is used. This allows the remote device to forward windows to the local one.

In some cases, particularly when connecting from a Mac, -Y may be required, which skips some validation of the graphical forwarding.

Note: when you first log on to a machine, you may see a message about host authenticity (see image below). You can usually answer 'yes' to this, providing the computer name is correct. Sometimes you see a similar message about a changed host (see second image below). This may look concerning, but usually just means the computer has been changed in some way. Usually you may accept the new key and proceed, but if you are working with sensitive data and have not been informed about this change it may be prudent to contact the system administrator.

Changed host message

RSA Keys

Some SCRTP machines such as Minerva or Orac do not allow you to log on using your password for security reasons. Instead, they use RSA keys. In short, this means a pair of files such that something encrypted using the first file file can only be decrypted using the second. The first is called the private key, and should be kept safe. The second is the public key. Those with the public key can read the message, but a decrypted message must have come from you.

Full details of how to generate keys and get SCRTP access are here. Once your keys are set up, they will be used where possible to log on to all SCRTP machines. Logging on using keys is very similar to using a password, but you will now be asked for the passphrase that you used to set up the key rather than your password.

SCP

The 'ssh' command is the secure (prefixed s) version of the command sh, which opens a (new) command line process. To copy files to and from remote machines, there is the secure analogue to the copy, 'cp', command, scp. Just like copy, this takes 2 file or directory names, the source and the destination.

Either or both of these can be prefixed with a user-name@computer-name or computer-ip section, followed by a colon ':', like this:

scp tmp.text user@godzilla.csc.warwick.ac.uk:~/Examples/

Notice that the '~' is used to put the file into a subdirectory of user's home on the remote machine, godzilla.

The * and ? wildcards can, of course, be used in local and remote filenames, but note that tab-completing the filename will always look on the local machine.

scp uses ssh behind the scenes, so if you're using multiple RSA keys or similar, you can supply the name of the identity file to use with the -i option.

FTP and File Fetching

You are probably familiar with HTTP from webpages, where it stands for Hyper-Text (the content of webpages) Transfer Protocol. FTP similarly stands for File Transfer Protocol and is a system for transferring files between computers. Different systems may store files slightly differently: for example, Windows and Linux machines differ in how they signal an end-of-line in text files. FTP could ensure the proper conversions were done when files were transferred.

There are a variety of programs, both command line and graphical, for fetching files using FTP, HTTP and other protocols. The former include ftp, wget and curl. The latter two can also fetch HTTPS webpages.

Note: scraping or crawling web pages using scripts is subject to some restrictions and on some websites fetching content too often can lead to you being banned. Check for any terms-of-service if you're accessing more content than you reasonably could from your browser, and obey any rate limits or other restrictions.

The basic use of wget is just wget [url] or wget -r [root_url] which recursively follows subdirectories and gets everything in them. The most likely use of this command is fetching source code or data files, and it is likely that instructions will be provided. Otherwise, more details are available here.

Next Section

The Bash Shell Environment