Using the Batch Cluster
The FLC batch cluster is part of the common BIRD infrastructure at DESY which makes use of the Sun N1 Grid Engine cluster management software. You may have to contact one of your administrators to enable this resource for your account in the registry.
To use the batch cluster, log in to one of the DESY pal servers (or FLC machines lc3 or lc4) and type 'ini bird' (or source the file /usr/sge/default/common/settings.sh in your shell to set up various environment variables). Then write a shell script which will run on the cluster nodes – it should do whatever is needed for your job (prepare input data, process it, and store output data) and invoke other commands and programs as needed. This script can then be submitted to the cluster as a batch job. The main commands for job control are qsub (submit a job), qstat (display the job status), qmon (invoke a graphical monitoring tool), and qdel (delete a job). These commands have many options, some of which are:
“-P” – specify your project name (should be “-P flc”, optional if you belong to one collaboration only)
“-l h_rt=time” – request run time, e.g. “-l h_rt=02:30:00” for 2.5 hours
“-l arch=type” – request a hardware architecture (“type” can be “x86” or “amd64” or “x86|amd64”)
“-l os=type” – request an operating system (“type” can be “sld3”, “sld4” or “sld5”, or a combination of them like “sld3|sld4”)
“-w” – specify the validation level, e.g. “-w e” to treat invalid requests as errors and reject the job
“-j” – choose whether stderr is merged with stdout, e.g. “-j yes” to merge them
“-m” – notify the job owner by e-mail, e.g. “-m ae” to send a mail when the job is aborted or ends
“-M” – specify the mail address to which the notification should be sent
“-v variable1[=value],variable2[=value]” – set or export the current values of environment variables for your job
“-cwd” – (try to) run the job in the current working directory (it may not always be available, so be careful) and put the logfiles (stdout, stderr) there
Note that you can supply these options either as command line arguments or as special comments beginning with “#$” at the top of your job script, e.g. “#$ -cwd”.
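As an illustration of the “#$” form, here is a minimal sketch of a job script that embeds some of the options listed above. The file name myjob.sh, the mail address and all option values are placeholders chosen for this example, not recommended settings; adapt them to your own job.

```shell
#!/bin/sh
# Hypothetical job script "myjob.sh"; every option value below is an example only.
#$ -P flc
#$ -l h_rt=02:30:00
#$ -l arch=amd64
#$ -l os=sld5
#$ -w e
#$ -j yes
#$ -m ae
#$ -M your.name@desy.de
#$ -cwd

# The shell treats the "#$" lines as ordinary comments; only qsub interprets them.
JOB_HOST=$(uname -n)
echo "job running on $JOB_HOST"
```

Such a script would be submitted with “qsub myjob.sh”. The same requests could instead be given on the command line, e.g. “qsub -P flc -l h_rt=02:30:00 myjob.sh”.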
A temporary directory $TMPDIR is assigned to your job on the worker node. Change to that directory at the very beginning of your job script with the command “cd $TMPDIR”. There is no need to retrieve the text output of your job manually – the cluster management software automatically returns the contents of the stdout and stderr streams when your job is done. Any other files produced by your job are not returned automatically, so make sure the job script copies them to a permanent location. AFS is available on the BIRD nodes for that purpose.
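The following sketch shows one way a job body might follow this pattern: work in $TMPDIR, then copy the results away before the job ends. The names OUTPUT_DIR and result.txt are hypothetical, and the fallbacks to /tmp and to the current directory are assumptions made only so that the sketch also runs outside the cluster. SGE_O_WORKDIR is the variable in which Grid Engine records the directory the job was submitted from.

```shell
#!/bin/sh
# Sketch of a job body; OUTPUT_DIR and result.txt are hypothetical names.

# Remember where the results should go before leaving the start directory.
# Under Grid Engine SGE_O_WORKDIR holds the submit directory (e.g. in AFS);
# the $PWD fallback is only for running this sketch outside the cluster.
OUTPUT_DIR="${SGE_O_WORKDIR:-$PWD}/job_output"

# The batch system sets TMPDIR for the job; /tmp is a fallback for local tests.
WORKDIR="${TMPDIR:-/tmp}"
cd "$WORKDIR" || exit 1

# Placeholder for the real work: prepare input, process it, write output.
echo "processed on $(uname -n)" > result.txt

# stdout and stderr are returned automatically; other files must be copied.
mkdir -p "$OUTPUT_DIR" && cp result.txt "$OUTPUT_DIR/"
```

Copying only at the end keeps the heavy I/O on the local disk of the worker node, at the price of losing intermediate files if the job is killed early.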
For further information, see the sge_intro(1) manpage and the manpages of the various cluster-related commands, or write to the mailing list sge-users@desy.de.