Running jobs on a cluster is quite different from running them on your personal laptop. In this post, I'll try to list a few tips which may help you get set up and running. I'll use the OzStar cluster as an example, but the general advice applies equally to other clusters.
Bookmark the documentation
If someone has taken the time to write documentation for the cluster, make sure you know where it is and have read at least the overview and any advice they provide. For example, the OzStar documentation can be found here.
Usually, the documentation will give you an ssh address to log in to. You'll be using this every time you log in, so create an alias. First check that you can log in following the advice given. Once you're happy with that, create an alias to avoid having to remember the ssh address. To do so, add something like this to the shell configuration file on your local computer (for example ~/.bashrc, if your shell is bash):
alias cluster='ssh -X firstname.lastname@example.org'
Here the user name and address should be replaced with your own. You may also prefer to name the alias something like ozstar instead of cluster. The -X flag enables "X-forwarding", which means that you'll be able to display graphical output from the cluster on your local computer.
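If you would rather add the alias from the command line than open the file in an editor, something like the following sketch may help. The function name add_line_once is made up for this post, and the example assumes your shell reads ~/.bashrc; it only appends the line if that exact line is not already in the file, so running it twice does no harm.

```shell
# Append a line to a file only if that exact line is not already present.
# add_line_once is a hypothetical helper name, not a standard command.
add_line_once() {
    file=$1
    line=$2
    # Make sure the file exists so grep does not complain.
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as plain text.
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example usage (replace the user name and address with your own):
# add_line_once "$HOME/.bashrc" "alias cluster='ssh -X firstname.lastname@example.org'"
```

After adding the alias, open a new terminal (or run source ~/.bashrc) for it to take effect.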
A cluster will typically not be located in the same building as you, and perhaps not even on the same continent. As such, it can often be slow to transfer data between your local computer and the cluster. While it is technically possible to run an integrated development environment (IDE) or other graphical text editor on a cluster, there will likely be times when it's too slow to be usable.
Instead, you should get familiar with at least one command-line text editor. There is plenty of unsolicited advice around about why you should or shouldn't use this one or that one. If you just want to get on with it, I suggest you use nano. It's on most clusters and has a help bar at the bottom to show you how to use it. To get started, simply run
$ nano text-file-you-want-to-edit.txt
and follow the instructions.
Persistent sessions and multiple screens
As you work more on a cluster, you may find it frustrating that it isn't easy to get back to where you were after logging out and logging back in. Moreover, you may wish to have multiple terminals: one for editing a file and another for running some script.
A solution to both of these problems is tmux. There are many good articles on everything that can be done with tmux; here I'll just give a quick guide to using it on a cluster.
First, write the following to the tmux configuration file, ~/.tmux.conf:
unbind C-b
set -g prefix C-a
bind -n C-h select-pane -L
bind -n C-j select-pane -D
bind -n C-k select-pane -U
bind -n C-l select-pane -R
# Enable mouse mode (tmux 2.1 and above)
set -g mouse on
# Enable highlight of active window
set -g window-style 'fg=colour247,bg=colour236'
set -g window-active-style 'fg=colour250,bg=black'
Then, add the following alias to your shell configuration file on the cluster (for example ~/.bashrc):
alias s='tmux new-session \; split-window -h \;'
(you can of course name the alias however you like).
Now, to start a session, at the command line run
$ s
This will open a new window with two panes. You can switch between them with CTRL+h and CTRL+l (note the plus here means press both at the same time). These choices are not the tmux defaults, but were set in the configuration file above. You can now edit files in one pane and run scripts in the other.
To leave the session, press CTRL+a and then d. This is known as detaching, and it will drop you back into the shell you were in before. The session itself keeps running in the background. Now run
$ tmux ls
This will print a list of all the active sessions. You can log into one by running
$ tmux a -t 1
where the number is the number (or name) of the session. The key point is that you can now log out of the cluster, log back in and attach to a running session, which will be in exactly the state you left it in before logging out.
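If you tend to use one long-lived session, the list-then-attach steps can be wrapped up in a small shell function. This is only a sketch: the function name attach_or_create and the default session name work are made up for this post, and it relies on the tmux has-session, attach-session and new-session commands.

```shell
# Attach to a named tmux session if it already exists, otherwise create it.
# attach_or_create and the default name "work" are hypothetical choices.
attach_or_create() {
    session=${1:-work}
    # has-session exits with status 0 only if the session exists.
    if tmux has-session -t "$session" 2>/dev/null; then
        tmux attach-session -t "$session"
    else
        tmux new-session -s "$session"
    fi
}

# Example usage:
# attach_or_create          # uses the session named "work"
# attach_or_create analysis # uses a session named "analysis"
```

Named sessions are also easier to pick out of the tmux ls output than numbered ones.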
A word of warning. It's possible with tmux to set jobs running on the head node and leave them while you log out. This may be okay for a short to medium-length script, but in general it should be avoided as it clogs up resources on the head node. It's much kinder to other users to always submit jobs through the proper queue.
Note that in the configuration file above, we set the tmux prefix to be CTRL+a. The default is CTRL+b.
One downside to using tmux is that it is possible for the X-forwarding to get out of sync. Effectively, when you log in to the cluster with X-forwarding, an environment variable $DISPLAY is set. You can view what it's set to with
$ echo $DISPLAY
localhost:11.0
for example. If, when running in tmux, you get warnings related to the DISPLAY variable, try checking that it is set to the same value in the tmux session as in the standard login shell. To set the variable, run (using the value from your login shell)
$ export DISPLAY='localhost:14.0'
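Rather than copying the value by hand, you may be able to ask tmux for it. The sketch below assumes that your tmux still has DISPLAY in its update-environment option (I believe this is the default), in which case tmux records the new value each time you attach. The function name refresh_display is made up for this post.

```shell
# Refresh DISPLAY inside a tmux pane from the value tmux recorded
# when the session was last attached.
# refresh_display is a hypothetical helper name.
refresh_display() {
    # With -s, show-environment prints the variable as shell commands,
    # something like:  DISPLAY="localhost:14.0"; export DISPLAY;
    # which eval then runs in the current shell.
    eval "$(tmux show-environment -s DISPLAY)"
}

# Example usage, from a pane inside the tmux session:
# refresh_display
# echo $DISPLAY
```

This only changes the shell it is run in, so it needs to be run in each pane where you want graphical output.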