Main page for the documentation of the Test-Beam Analysis Framework (TBAF)
The TBAF is a software project initiated in the FLC group to create a structured and reusable set of software components to use in the analysis of the test-beam data for the large prototype using the ILCSoft packages and ROOT.
The TBAF depends on ROOT, LCIO, GEAR, LCCD, Marlin and MarlinTPC. Before starting to develop this library it is advisable to be reasonably familiar with these packages. At the moment the library is built using a custom Makefile, but it is planned to move to CMake to generate the build files. To limit the trouble caused by ROOT dictionaries it was decided to avoid them unless strictly necessary, and to use them mainly for classes external to the library itself, in particular dictionaries of the STL containers.
Getting familiar with ILCSoft - Hints, Tips, Tricks, How-To, Examples and Exercises
Assuming you have almost no previous knowledge of the tools that this library depends on, I include some links to introductory material and some practical hints and tips on how to use the software components and practice with them.
LCIO is the software that defines the persistence model for the data you use in your analysis. Its main page is here. For the test-beam analysis we usually use the following LCIO classes and their implementations. You can find a small User Manual and the full code reference in the main LCIO page. For quick reference:
- TrackerRawData: The source data from the ADC
- TrackerData: The raw data from the ADC after calibration, where the internal ADC spectrum is cast from int to float
- TrackerPulse: Data of the pulses reconstructed from the raw data using some pulse finder
- TrackerHit: Position of the hits in the tracker, obtained by merging pulses together using a center of gravity algorithm
- Track: Description of a track in the tracker, obtained from the hits using some tracking algorithm
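The center of gravity step mentioned for TrackerHit can be illustrated as a charge-weighted mean of the pulse positions. This is only a minimal sketch: the Pulse struct and the centerOfGravity function are hypothetical names introduced for illustration, not LCIO or MarlinTPC classes, and a real hit finder is likely to do more (clustering, error estimates).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a reconstructed pulse; NOT the LCIO TrackerPulse class.
struct Pulse {
    double position;  // pad coordinate along the row (arbitrary units)
    double charge;    // integrated pulse charge (arbitrary units)
};

// Charge-weighted mean position of a group of pulses.
// Returns false (and leaves 'result' untouched) when the total charge is not positive.
bool centerOfGravity(const std::vector<Pulse>& pulses, double& result) {
    double sumCharge = 0.0;
    double sumWeighted = 0.0;
    for (const Pulse& p : pulses) {
        sumCharge += p.charge;
        sumWeighted += p.charge * p.position;
    }
    if (sumCharge <= 0.0) {
        return false;
    }
    result = sumWeighted / sumCharge;
    return true;
}
```

Calling centerOfGravity on pulses at positions 1, 2 and 3 with charges 1, 2 and 1 gives a hit position of 2.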
An LCIO file is organized as a list of events, each of which is composed of a set of collections; a collection is a vector-like container of one or more objects of the types listed above.
Some instructions for installing LCIO on your own machine are in this wiki page. If you are using a DESY computer you can just use the preinstalled AFS installation.
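The file, event and collection hierarchy can be pictured with a few plain C++ structs. The sketch below is a simplified model for illustration only: Collection, Event, File and countItems are hypothetical names and do not correspond to the LCIO API, and the payload is reduced to a vector of numbers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical simplified model, NOT the LCIO API:
// a collection is a vector-like container of objects of one type.
struct Collection {
    std::string typeName;       // e.g. "TrackerRawData" or "TrackerHit"
    std::vector<double> items;  // placeholder payload, one entry per object
};

// An event holds a set of collections addressed by name.
struct Event {
    int eventNumber;
    std::map<std::string, Collection> collections;
};

// A file is an ordered list of events.
typedef std::vector<Event> File;

// Count, per collection name, how many items the whole file contains.
std::map<std::string, std::size_t> countItems(const File& file) {
    std::map<std::string, std::size_t> totals;
    for (const Event& evt : file) {
        for (const auto& entry : evt.collections) {
            totals[entry.first] += entry.second.items.size();
        }
    }
    return totals;
}
```

Looping over events first and collections second mirrors the event-wise organization of the file: everything belonging to one event is reached together.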
There are two basic programs in the bin folder of the LCIO installation that are very useful for taking a quick look at a data file and checking that everything works:
- anajob prints a list of all the events in the file, with a list of the collections in each event and the number of items in each collection
- dumpevent prints the detailed content of a single event
The first step to get used to the software is to get a sample data file and take a look at its content using these two programs. Then you could build a quick test program that opens a file, prints some internal data on screen and saves some parameters in the event headers before closing the file.
Marlin is a framework for operating on LCIO data that keeps a kind of "history" of the processing done on the data and provides the environment in which the algorithms run. The basic building block is the "Processor", which performs these basic steps:
- Initialize all the necessary algorithms
- Apply some algorithm to the whole run and store the results in the run header
- Take one or more input collections from the LCIO file
- Apply some algorithm to each event in the run
- If necessary do some final computation, plotting ...
- Write one or more output collections to the same file
This doesn't mean that the algorithm must be defined in the Marlin Processor. In fact, as is nicely worded in slide 22 of this talk, the Marlin processor is merely a container that allows your algorithm to run in a Marlin program. This means that, for an algorithm of any real complexity, it is usually better to develop your code independently of Marlin, but in such a way that you can plug it in if necessary; this decreases the maintenance effort and increases the flexibility.
The home page for the Marlin package links to the documentation of the package. There is also a wiki page containing some useful instructions on how to install and compile Marlin on your own PC. If you use a DESY computer it is much better to just use the available AFS installation.
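The separation between algorithm and processor can be sketched as follows. The example is a simplified illustration under stated assumptions: MiniProcessor is a hypothetical stand-in whose callbacks follow an init / processEvent / end pattern, it is not the Marlin Processor class, and events are reduced to plain vectors of charges instead of LCIO objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Framework-independent algorithm: knows nothing about Marlin or LCIO.
class MeanChargeCalculator {
public:
    MeanChargeCalculator() : sum_(0.0), count_(0) {}
    void add(double charge) { sum_ += charge; ++count_; }
    std::size_t count() const { return count_; }
    double mean() const { return count_ == 0 ? 0.0 : sum_ / count_; }
private:
    double sum_;
    std::size_t count_;
};

// Hypothetical stand-in for a framework base class; NOT the Marlin API.
class MiniProcessor {
public:
    virtual ~MiniProcessor() {}
    virtual void init() = 0;
    virtual void processEvent(const std::vector<double>& charges) = 0;
    virtual void end() = 0;
};

// Thin wrapper: only forwards the framework callbacks to the algorithm.
class MeanChargeProcessor : public MiniProcessor {
public:
    MeanChargeProcessor() : result_(0.0) {}
    void init() override { calc_ = MeanChargeCalculator(); result_ = 0.0; }
    void processEvent(const std::vector<double>& charges) override {
        for (double q : charges) calc_.add(q);
    }
    void end() override { result_ = calc_.mean(); }
    double result() const { return result_; }
private:
    MeanChargeCalculator calc_;
    double result_;
};

// Minimal event loop standing in for the framework's steering.
void runOverEvents(MiniProcessor& proc,
                   const std::vector<std::vector<double> >& events) {
    proc.init();
    for (const std::vector<double>& evt : events) proc.processEvent(evt);
    proc.end();
}
```

Because MeanChargeCalculator has no dependency on the wrapper, the same class can be exercised in a standalone test program and later plugged into a processor without changes.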
A Marlin program is steered by an XML file that describes which processors to load, in which sequence and with which parameters. The parameters are described by a predefined unique name and one or more values of a predefined type. There is also a GUI to easily generate a steering file by interactively adding the necessary processors and configuring them. The GUI program can be launched with the MarlinGUI command, once the Marlin framework has been initialized.
The first step to get used to the software is to generate a simple steering file using the GUI and run it on the data file that you worked on in the previous step. Then you should try to create a simple processor, as explained in the wiki page, to perform the same type of operation you did in the previous step using your test program. Finally you should try to separate the algorithm of your program from the Marlin processor by designing a small class that performs your computation and using it in the processor.
Why do we need this framework
The ILCSoft framework provides the tools to analyze data sets stored in LCIO formatted files using a well defined processing flow. In particular this framework provides the tools to perform the reconstruction of TPC test-beam data to create several predefined objects out of the raw data obtained by the DAQ system.
During the test-beam campaign of 2011 we realized that we also needed software tools to quickly review basic features of the acquired and reconstructed data in order to monitor their quality. Additionally we needed to develop the software to analyze the data at different reconstruction levels and extract the performance of the GridGEM module. Finally, to avoid duplicating efforts within the LCTPC collaboration, we wanted to develop these tools so that they could be integrated within the common ILCSoft framework and eventually used to analyze and compare the data obtained with other readout systems as well. To achieve all these goals we developed a new software library called TBAF.
At that time three common paradigms were used within the collaboration to perform this type of task:
- The first approach is to develop the analysis of the reconstructed data contained within the LCIO file in a Marlin processor shared through MarlinTPC. This simplifies the sharing of the analysis algorithms within the collaboration and their integration with the other common tools of ILCSoft. The most relevant drawback of this system is that it does not allow the user to interactively review, analyze and plot the data contained in the LCIO file, which is very easy to do using the ROOT framework. In fact the LCIO format is optimized to provide parallel access to all the variables in a single event, while the ROOT data format is very efficient at accessing the same variable in all the entries of the same tree. Moreover ROOT already provides an integrated system to interactively analyze the data contained in a ROOT Tree. For this reason many developers in the collaboration use a second processing paradigm.
- The second type of processing flow, used by many people within the LCTPC collaboration, consists of creating a ROOT tree out of the data contained in the LCIO file and then using this tree to review and analyze the data. This system solves the previous issue but makes the sharing of the code through ILCSoft practically impossible. Moreover one important feature of many LCIO objects is that they often contain pointers to the objects used to create them. While this is possible within a ROOT Tree as well, it is more complicated and error prone.
- The third commonly used approach to the analysis of the data contained in an LCIO file is to access the data directly in a standalone C++ program, possibly using ROOT for plotting. This approach allows greater freedom in the definition of the analysis algorithms, because the user is not constrained by the Marlin processing flow and can use all the features of the LCIO framework to access the data. On the other hand it combines the drawbacks of the other two methods: it is practically impossible to bring the algorithms back into the ILCSoft framework, and it does not allow the quick and interactive access to the data that is provided by the ROOT framework.
In designing the TBAF we aimed to create a system that offers the advantages of all three paradigms at the same time while limiting their drawbacks. In particular, the framework allows the user to create a set of ROOT trees from the LCIO files, which can then be reviewed interactively. The same objects used to populate the ROOT Trees can be used to develop analysis routines that can be packaged and steered in a standalone program or in a Marlin processor, so that they can be integrated within the ILCSoft framework. Decoupling the data sources from the data types, and the processing routines from the steering system, makes it possible to reuse most of the code; this eases the maintenance of the system and helps to develop the analysis routines in such a way that they can be easily shared afterwards. Within this system the developer can work interactively on the data through ROOT, then refine and test the analysis routine freely in a standalone program, depending either on the ROOT tree or on the LCIO file, and finally reuse the same algorithms in a Marlin processor without incurring a heavy overhead.
To further simplify the development of new analysis routines the system includes a set of predefined interfaces to process, analyze, filter and present the data. When an algorithm is defined within these interfaces it is possible to use the additional tools contained within the framework, which make it possible, for example, to create a list of summary plots for each run taken during the test-beam.
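The difference between event-wise storage and the column-wise layout of a tree can be illustrated with a small conversion routine. This is only a sketch of the idea: Hit, EventRecord, HitColumns, flatten and totalCharge are hypothetical names, and neither the LCIO nor the ROOT API is used here, nor is this the way the TBAF converters are actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event-wise record: all the hits of one event are stored together.
struct Hit {
    double x;
    double charge;
};

struct EventRecord {
    int number;
    std::vector<Hit> hits;
};

// Hypothetical column-wise layout: one vector per variable, one entry per hit,
// similar in spirit to the branches of a flat tree.
struct HitColumns {
    std::vector<int> eventNumber;
    std::vector<double> x;
    std::vector<double> charge;
};

// Convert the event-wise records into columns.
HitColumns flatten(const std::vector<EventRecord>& events) {
    HitColumns cols;
    for (const EventRecord& evt : events) {
        for (const Hit& h : evt.hits) {
            cols.eventNumber.push_back(evt.number);
            cols.x.push_back(h.x);
            cols.charge.push_back(h.charge);
        }
    }
    return cols;
}

// Column-wise access: sum one variable over all entries without reading the others.
double totalCharge(const HitColumns& cols) {
    double sum = 0.0;
    for (std::size_t i = 0; i < cols.charge.size(); ++i) {
        sum += cols.charge[i];
    }
    return sum;
}
```

Once the data are in columns, a quantity such as the total charge is computed from a single vector, which is the access pattern that makes interactive plotting of one variable over a whole run fast.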
Framework implementation and its basic components
The main page where these are described and listed is: TestBeamAnalysisConverters
Configuration file, Python and Bash Scripts
There is also a set of useful scripts.
Inventory of the available tools
The first group of tools is the summaries. The list and description of the available summaries and control plots are contained in this page.