NAF2.0 for ILC users
Why and how to run your jobs on the worker nodes (Marlin ...)
BIRD: There are more than 7000 CPUs and a lot of memory.
The ILC login machine has only 12 CPUs. Please do not run your production jobs on these 12 CPUs.
You may create a job script like the following and paste your program at the end.
#!/bin/zsh
#
#(execute my job from the current directory)
#$ -cwd
#
#(the same OS as the login machine)
#$ -l arch=amd64
#$ -l os=sld6
#
#(the cpu time for my job)
#$ -l h_rt=23:59:00
#
#(the maximum memory for my job)
#$ -l h_vmem=5G
#
#(send email when my job ends)
#$ -m ae
#
#(send to this email address)
##$ -M email@example.com
#
#(paste my init soft here)
#
#(paste my program here)
...
Now you can use the 7000 CPUs by submitting this script with qsub.
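A minimal submission round trip might look as follows (the script name myjob.sh and the job ID are placeholders, not something defined on this page):

```shell
# Submit the job script shown above; qsub prints the job ID on success.
qsub myjob.sh

# Check the status of your own jobs (qw = waiting, r = running):
qstat -u $USER

# Delete a job you no longer need, using the ID reported by qsub/qstat:
qdel <job_id>
```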
More about BIRD can be found at http://bird.desy.de/info/index.html.en
- NAF2.0 for ILC users
- ILC/CALICE software
- Useful Tricks
- More Documentation
Getting a NAF2.0 account
DESY, University of Hamburg or Humboldt University member:
- If you are a DESY user with a full account, please check your ID with the command "id".
[flc desktop] id <your_account>
uid=NNNNN(<your_account>) gid=NNN(flc) groups=1417(flc),5295(af-ilc)
- If you see "5295(af-ilc)", you should already be able to log in to the WGS.
If you do NOT have "5295(af-ilc)", please send an email to naf-ilc-support<at>desy.de.
If you have more questions about the "af-ilc" namespace, please send an email to naf-ilc-support<at>desy.de
Login to a NAF2.0 ILC Workgroupserver
There are two work group servers:
- nafhh-ilc01.desy.de SL6
- nafhh-ilc02.desy.de SL6
nafhh-ilc01 has recently been migrated to SL6 and has an identical configuration to nafhh-ilc02.
Both can access the CVMFS ilcsoft installation, the GRID storage element dCache, and the new NAF2 scratch storage space DUST.
For login, simply do an ssh to your favourite work group server.
ssh -X -l yourusername nafhh-ilc01.desy.de
ssh -X -l yourusername nafhh-ilc02.desy.de
nafhh-ilc01 and nafhh-ilc02 are both rather powerful machines on which you can test your jobs without causing any issues. However, please do not copy or move large (or many) data files on these machines: this consumes the complete bandwidth and slows the machines down dramatically. Instead, use the qrsh command (see further down below) to access a BIRD machine and run your copy job there.
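Such a copy job could look like this sketch (the source and destination paths are placeholders; the qrsh resource options follow the pattern shown further down):

```shell
# Request an interactive slot on a BIRD node for one hour...
qrsh -l distro=sld6 -l h_rt=01:00:00 -l h_vmem=2G

# ...and run the transfer there instead of on the work group server:
cp /path/to/large/input.slcio /nfs/dust/ilc/user/$USER/
```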
Batch system - BIRD cluster
The NAF2.0 uses the general purpose batch system BIRD (Batch Infrastructure Resource at DESY).
To get details about your jobs (and only yours), use
qstat -u $USER
A summary of the current queue usage can be produced with
qstat -g c
Start the GUI (the Grid Engine graphical front end):
qmon
Direct login: sometimes it is useful to run commands interactively on a node similar to the one you plan to submit your jobs to.
qlogin -q login.q
or, if you wish to specify a particular node (the queue@node form is standard Grid Engine syntax)
qlogin -q login.q@<node_name>
or, if you want to request specific resources
qrsh -l distro=sld6 -l arch=amd64 -l h_vmem=2G -l h_rt=12:00:00
Information on how to use it can be found at http://bird.desy.de/info/index.html.en
Note: The BIRD cluster has the same configuration as our work group servers. All nodes can access the CVMFS ilcsoft installation, the GRID storage element dCache, and the new NAF2 scratch storage space DUST.
If your jobs run on most of the BIRD nodes but have problems on one specific node, please send an email to bird.service<at>desy.de, stating the BIRD node name and the problem.
Note: In order to be able to send your jobs to the BIRD cluster, you need permission to access the resource "batch(IT)". You can check this yourself by logging in to https://registry.desy.de/registry/ with your DESY account and looking at the column "Current resource access" for the entry "batch(IT)". If it is not there, ask one of the administrators for help. You will find the administrator list by clicking on "Administrators" in the left column of your login screen.
The main scratch space is DUST. Please note that this is scratch space, i.e. there is no backup!
Report Storage Issues
If you have a problem with the storage and want to know how to report it, please read this documentation: https://naf-wiki.desy.de/ReportStorageIssues
AFS: The AFS cell /afs/desy.de is used to provide each user with a home directory.
To learn about your current quota usage, use the AFS fs utility:
fs listquota $HOME
Large scratch space: The technology DUST is used to provide fast scratch space. To access your working space, change into your DUST directory.
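The exact path of your working space is not given on this page; as an assumption based on the group filesets listed below (/nfs/dust/ilc/group/...), the usual NAF layout puts user space here:

```shell
# Assumed NAF DUST user path; replace with your actual fileset if different:
cd /nfs/dust/ilc/user/$USER
```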
The scratch space is also mounted on the BIRD cluster nodes, so your working space can be accessed from there, too.
For questions about your DUST quota, please send an email to naf-ilc-support<at>desy.de, stating your space name.
When your jobs are finished and you have cleaned up and want to free some space, please send an email to naf-ilc-support<at>desy.de, stating your space name and the quota that you still want to keep.
The old quota management page is now obsolete for both users and administrators.
To check your current quota and quota usage, please go to https://amfora.desy.de
As a registry namespace administrator, you will see an additional button "GPFS Management".
The following DUST related tasks can be managed via Amfora:
- Managing quotas for user and group filesets
- Creating new group filesets
- /nfs/dust/ilc/group/ild
- /nfs/dust/ilc/group/flctpc
- /nfs/dust/ilc/group/flchcal
Access to experiments data on dCache: Fast access is provided to the DESY dCache systems where experiments data is hosted.
The ILC users may access the data at /pnfs/desy.de/ilc/.
All BIRD worker nodes and the WGS can access the data directly via the full path. It is NOT necessary to copy the data to any DUST space!
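For example (only the /pnfs/desy.de/ilc/ prefix comes from this page; the file path and the Marlin steering override are illustrative):

```shell
# Browse the experiment data directly on dCache:
ls /pnfs/desy.de/ilc/

# Hand the full /pnfs path straight to your job, e.g. via a Marlin
# steering-file override, instead of copying the file to DUST first:
Marlin --global.LCIOInputFiles=/pnfs/desy.de/ilc/<path-to-file>.slcio MySteering.xml
```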
- GRID UI is available on both SL6 WGS (nafhh-ilc01/nafhh-ilc02).
Please use a clean bash/zsh environment to initialize your GRID certificate. Do NOT run any initialization script in your profile unless you know it will not affect your GRID UI setup.
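One generic way to get such a clean shell (a sketch, not a DESY-specific recipe) is to start bash without inheriting your environment or reading any startup files:

```shell
# Start bash with an emptied environment and no profile/rc files, so
# nothing from your login setup can interfere with the GRID UI:
env -i HOME="$HOME" TERM="$TERM" PATH=/usr/bin:/bin bash --noprofile --norc
```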
- If you need to access Grid:
You can initialize a Grid VO proxy (ilc/calice) on both NAF2.0 ILC work group servers, nafhh-ilc01.desy.de (SL6) and nafhh-ilc02.desy.de (SL6).
ssh -X -l yourusername nafhh-ilc02.desy.de
# No init script is needed; the grid environment is ready for the user.
# Currently, please do "export X509_USER_PROXY=$HOME/k5-ca-proxy.pem",
# or put it into your "$HOME/.zshrc" if you use zsh.
export X509_USER_PROXY=$HOME/k5-ca-proxy.pem
# If your grid certificate has been initialized, you can read it or
# re-initialize it. It will be saved into the same "$HOME/k5-ca-proxy.pem",
# which can be accessed from AFS, NAF2, and the BIRD worker nodes.
voms-proxy-info -all
voms-proxy-init -voms ilc -valid 24:00
Example for "zsh" and "bash" shell users. You can find out which shell you are using with the command "echo $SHELL".
echo $SHELL
/bin/zsh
echo $SHELL
/bin/bash
ILCsoft CVMFS installation
[@nafhh-ilc02] ls -all /cvmfs/ilc.desy.de/sw
drwxr-xr-x 23 cvmfs cvmfs 4096 Nov 25 17:23 x86_64_gcc48_sl6
drwxr-xr-x 17 cvmfs cvmfs 4096 Jan 19 11:01 x86_64_gcc49_sl6
[@nafhh-ilc02] ls -all /cvmfs/ilc.desy.de/sw/x86_64_gcc48_sl6
drwxr-xr-x 63 cvmfs cvmfs 4096 Jan 11 2016 v01-17-07
drwxr-xr-x 64 cvmfs cvmfs 4096 Jan 11 2016 v01-17-08
drwxr-xr-x 65 cvmfs cvmfs 4096 May 3 2016 v01-17-09
drwxr-xr-x 57 cvmfs cvmfs 4096 Oct 14 02:42 v01-17-10
drwxr-xr-x 59 cvmfs cvmfs 4096 Nov 15 16:59 v01-17-11
drwxr-xr-x 56 cvmfs cvmfs 4096 Nov 25 17:14 v01-19
[@nafhh-ilc02] ls -all /cvmfs/ilc.desy.de/sw/x86_64_gcc49_sl6
drwxr-xr-x 57 cvmfs cvmfs 4096 Jan 20 09:30 v01-19-01
For NAF2.0 ILC users (the NAF2.0 machines are 64-bit now), please use:
qlogin -q login.q
source /cvmfs/ilc.desy.de/sw/x86_64_gcc49_sl6/v01-19-01/init_ilcsoft.sh
Please check out ILDConfig and follow the README.md to run the sim/reco jobs.
git clone https://github.com/iLCSoft/ILDConfig.git
less ILDConfig/StandardConfig/lcgeo_current/README.md
Additional information can be found here: http://ilcsoft.desy.de, and https://confluence.desy.de/display/ILD+Software+Working+Group
For CALICE, the DESY HCAL group provides the CALICE software.
More information about the CALICE software: https://twiki.cern.ch/twiki/bin/view/CALICE/SoftwareNews
- How do you know whether you are properly registered for access to the BIRD resource?
Please check whether you are properly registered with qconf -suserl | grep yourusername. If nothing shows up, you need to add the resource "batch" in the registry. Please contact the FLC namespace administrator or the UCO for this.
For more information about the UCO, please check this link: http://it.desy.de/services/uco/index_eng.html.
- Please follow the links below if you want to read more.
Modules on SLD6: