AOD to HepMC tool #12038

Merged
sawenzel merged 60 commits into AliceO2Group:dev from
cholmcc:cholmcc_aod_to_hepmc
Nov 10, 2023

@cholmcc
Contributor

@cholmcc cholmcc commented Oct 7, 2023

The class o2::eventgen::AODToHepMC converts events stored in AOD
tables to HepMC3::GenEvent objects. The objects are kept in-memory
so that clients (e.g., a future o2::analysis::RivetWrapper or similar)
can pick up that event structure and process it.

The converter class will convert o2::aod::McCollisions and
o2::aod::McParticles information, as well as auxiliary information in
o2::aod::HepMCXSections, o2::aod::HepMCPdfInfos, and
o2::aod::HepMCHeavyIons.

The built HepMC3::GenEvent data structures can optionally be stored on
disk in HepMC3 ASCII files.

The application o2-aod-mc-to-hepmc uses this converter class to
convert AOD files to HepMC files.

Note that the converter only depends on
Framework/AnalysisDataModel.h and not any other AOD information.

Compilation of Generators/src/AODToHepMC.cxx is contingent on HepMC3
being found at configure time.

The code is heavily documented and commented.

cholmcc and others added 30 commits September 19, 2023 13:53
A number of keys into the MC event header information
mapping are defined.   This is to ensure that code will
use the same keys whenever information is set.

Additional, non-predefined keys are still possible.

This makes it much more robust when we ask for specific
MC information from the event header, such as

- cross-section(s)
- weight(s)
- Heavy-ion "geometry" parameters
  - Npart in projectile and target
  - Ncoll in various views
    - Overall
    - Hard
    - wounded-nucleon on nucleon
    - nucleon on wounded-nucleon
    - wounded on wounded
- Parton distribution function parameters

This is crucial for building a HepMC event structure which
can be passed on to, for example, Rivet analyses.
The generator has been changed so that it exports
_all_ relevant and available information from Pythia to
the MC event header, including heavy-ion "geometry"
parameters.  In particular, the information is stored
in a HepMC-compatible way for later use by, e.g.,
Rivet.

Note, the current code counts up the number of collisions
by itself.  However, the authors of Pythia have another
way of doing that.   The code is now there to do it the
same way as the Pythia authors, but is currently disabled.

We should decide which is the appropriate way to count
Ncoll.  I would recommend to follow how the Pythia
authors do it.
This change does two things:

**Full header**

_All_ information available in the HepMC event header is
propagated to the MC event header information map.  This
includes

- Heavy-ion "geometry" parameters (b,Ncoll,Npart,...)
- Cross-section(s)
- Weight(s)
- PDF information
- and other attributes defined

This is so that we can build a full HepMC event structure later,
for example to pass to Rivet analyses.

**External program**

The functionality of the generator is expanded so that it may
spawn an event generator program, say `eg`.

- The generator opens a FIFO
- The generator then executes the program `eg` in the background
  - The `eg` program is assumed to write HepMC event records on
    standard output, which is then redirected to the FIFO
- The generator reads events from the FIFO
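
The FIFO hand-off described in the list above can be tried out in plain shell. This is only a sketch of the mechanism, not the O2 implementation: the `eg` function is a hypothetical stand-in that prints made-up event lines rather than real HepMC records, and the temporary file names are arbitrary.

```shell
#!/bin/sh
# Hypothetical stand-in for an event generator program: called as
# "eg -n NEVENTS", it prints NEVENTS fake event lines on standard output.
eg() {
  n=0
  while [ "$n" -lt "$2" ]; do
    echo "E $n"
    n=$((n + 1))
  done
}

# 1. Open a FIFO (in a private temporary directory).
tmpdir=$(mktemp -d)
fifo="$tmpdir/events.fifo"
mkfifo "$fifo"

# 2. Run the program in the background, standard output redirected
#    to the FIFO.
eg -n 3 > "$fifo" &

# 3. Read events from the FIFO until the writer closes it.
nread=0
while read -r line; do
  nread=$((nread + 1))
done < "$fifo"

# Reap the background job and clean up.
wait
rm -rf "$tmpdir"
echo "read $nread event lines"
```

The reader sees end-of-file as soon as the background program exits and closes its end of the FIFO, which is why the program must write nothing but event records to standard output.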

For this to work, a number of conditions _must_ be met by the
`eg` program:

- It _must_ write events in the HepMC event format
- It _must_ write the HepMC event records to standard output
- It _cannot_ write anything else but the HepMC event record to
  standard output
- It _must_ accept the command line option `-n NEVENTS` to
  set the number of events to generate.

If a particular `eg` program does not meet these requirements, then
a simple shell script can be defined to wrap the `eg` appropriately.
For example, the CRMC program `crmc` _can_ write HepMC events to
standard output, but it will also dump other stuff there.  Thus,
we can provide the script

    #!/bin/sh

    crmc "$@" -o hepmc3 -f /dev/stdout | \
       sed -n 's/^\(HepMC::\|[EAUWVP] \)/\1/p'

which simply filters the output of `crmc`.  Another EG program
may not accept the `-n NEVENTS` command line option, but rather has
the command line option `--nevents`, so then we would do something
like

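    #!/bin/sh
    # Wrapper: translate "-n NEVENTS" into "--nevents NEVENTS" and pass
    # all other options through unchanged.
    cmdline="eg-program -o /dev/stdout "

    while test $# -gt 0 ; do
       case x$1 in
       x-n) cmdline="$cmdline --nevents $2"; shift ;;
       *)   cmdline="$cmdline $1" ;;
       esac

       shift
    done

    # Run the translated command line
    $cmdline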
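
The filtering idea behind the `crmc` wrapper can be checked on a small stream without CRMC installed. The sketch below is an assumption-laden illustration: the sample lines are made up, and `grep -E` is swapped in for the `sed` expression as a variant that does not rely on the GNU `\|` extension.

```shell
#!/bin/sh
# Made-up generator output: HepMC-like records mixed with chatter.
sample() {
  printf '%s\n' \
    'HepMC::Version 3.02.05' \
    'Initialising model, please wait' \
    'E 0 1 2' \
    'P 1 0 2212 0 0 6500 6500 0.938 4' \
    'warning: this line is not part of the event record'
}

# Keep only lines that start with "HepMC::" or with one of the record
# tags E, A, U, W, V, P followed by a space (same pattern as the sed
# filter, written as an extended regular expression).
filter_hepmc() {
  grep -E '^(HepMC::|[EAUWVP] )'
}

# Show the surviving lines and count them.
sample | filter_hepmc
kept=$(sample | filter_hepmc | grep -c '')
echo "kept $kept of 5 lines"
```

Only the header line and the two record lines survive; the banner and the warning are dropped, so the reader on the other end of the FIFO sees a clean event stream.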

The command line to run is specified as

    --configKeyValues "HepMC.progCmd=<program and options>"

and can include not only the program name but also other
options to the program.  For example

    --configKeyValues "HepMC.progCmd=crmc -m 5 -i 20800820 -I 20800820"

for Pb-Pb collisions with Hijing.

With this change, we can use _any_ event generator which is capable of
writing out its event records in the HepMC format.
The generator `GeneratorTParticle` will read in particles
from a `TChain` containing a branch with a `TClonesArray` of
`TParticle` objects.

The generator can operate in two modes

- Data is read from one or more files
- Data is read from a file being generated by a child
  program

The first mode is selected by

    -g tparticle --configKeyValues "TParticle.fileNames=foo.root,bar.root"

The second mode is selected by

    -g tparticle --configKeyValues "TParticle.progCmd=<program and options>"

For this latter mode, see also the recent commit to `GeneratorHepMC`.

Above, `<program and options>` specifies a program to spawn in the
background which will write to a specified file (temporary file).
Suppose the program is called `eg`, then the following _must_ be
possible

    eg -n NEVENTS -o OUTPUT_FILENAME

That is, `eg` _must_ accept the option `-n` to set the number of
events to produce, and the option `-o` to set the output file name
(a ROOT file).
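
A wrapper or generator script that honours this `-n`/`-o` contract could parse its options along the following lines. This is a hypothetical skeleton only: the function name `parse_args`, the default values, and the output path are invented for illustration, and no ROOT file is actually written.

```shell
#!/bin/sh
# Defaults (arbitrary values for this sketch).
nevents=1
output=events.root

# Parse "-n NEVENTS -o OUTPUT_FILENAME" with the POSIX getopts builtin.
parse_args() {
  OPTIND=1
  while getopts 'n:o:' opt; do
    case $opt in
      n) nevents=$OPTARG ;;
      o) output=$OPTARG ;;
      *) echo "usage: eg -n NEVENTS -o OUTPUT_FILENAME" >&2; return 1 ;;
    esac
  done
}

# Example invocation, as the spawning generator would issue it.
parse_args -n 5 -o /tmp/eg_output.root || exit 1
echo "would generate $nevents events into $output"
```

A real program would replace the final `echo` with the actual event generation and write the `TTree` to the file named by `-o`.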

The name of the `TTree` object in the file(s) can be set with

    --configKeyValues "TParticle.treeName=<name>"

(defaults to `T`), and similar for the branch that contains the
`TClonesArray` of `TParticle`

    --configKeyValues "TParticle.branchName=<name>"

(defaults to `Particles`).

The generator `GeneratorTParticle` _does not_ import any header
information into the simulation event record.   Some proper
convention could be decided upon, e.g., one that tracks the
HepMC event record format.
Please consider the following formatting changes to AliceO2Group#11913
The classes `GeneratorHepMC` and `GeneratorTParticle` are refactored to
derive from the (second) base class `GeneratorFileOrCmd`.

`GeneratorFileOrCmd` provides common infrastructure to specify

- File(s) to read events from (ROOT files in case of
  `GeneratorTParticle`, and HepMC files in case of `GeneratorHepMC`),
  _or_
- Which command to execute and with which options.

It also provides infrastructure to make unique temporary names, a FIFO
to read from (child program writes to), and so on.

These are all configured through configuration keys prefixed by
`FileOrCmd.`.

Other changes include

- `GeneratorHepMC` will open _any_ file that HepMC supports - ASCII,
  compressed ASCII, HEPEVT, etc.
- Through the use of `GeneratorFileOrCmd`, the command-line option flags
  for specifying seed (default: `-s`), number of events (default: `-n`),
  largest impact parameter (default: `-b`), output (default: `>`), and
  so on can be configured via configuration key values.
- `GeneratorHepMC` and `GeneratorTParticle` are passed the
  `GeneratorFileOrCmdParam` as well as specific `GeneratorHepMCParam`
  and `GeneratorTParticleParam` objects, respectively, by
  `GeneratorFactory` and set their internal parameters accordingly. This
  hides the specifics of the parameters from `GeneratorFactory`.
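
To illustrate why configurable switches are useful, the sketch below assembles a command line from a program string and a set of switches. It is an assumption, not the O2 code: the variable names, the `build_cmd` helper, and the order in which the options are appended are all made up; only the default switch values (`-s`, `-n`, `-b`, `>`) come from the list above.

```shell
#!/bin/sh
# Switches with the defaults listed above (variable names are made up).
seed_switch='-s'
nevents_switch='-n'
bmax_switch='-b'
output_switch='>'

# build_cmd CMD NEVENTS SEED BMAX OUTPUT
# Prints the command line that would be handed to the shell.
build_cmd() {
  echo "$1 $seed_switch $3 $nevents_switch $2 $bmax_switch $4 $output_switch $5"
}

cmd=$(build_cmd 'eg --some-option' 100 42 20 /tmp/events.fifo)
echo "$cmd"

# A program that spells its event-count option differently only needs
# a different switch value, not a wrapper script.
nevents_switch='--nevents'
cmd_alt=$(build_cmd 'eg' 10 1 5 out.hepmc)
echo "$cmd_alt"
```

With switches exposed as configuration, the second wrapper script shown in the earlier commit message (translating `-n` to `--nevents`) would no longer be needed for such a program.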
…nd makes life so much more difficult than it needs to be for very little gain
- Forgot to change key for test in `run/CMakeLists.txt`
- `crmc.sh` uses `-o hepmc` instead of `-o hepmc3` to accommodate
  older installations of CRMC with `aliBuild`.
@cholmcc cholmcc closed this Oct 23, 2023
@cholmcc cholmcc force-pushed the cholmcc_aod_to_hepmc branch from 494110c to fd03db0 Compare October 23, 2023 07:22
Added braces around single-statement branching constructs (sigh)

I'm not sure why

    if (foo)
      bar

is considered less clear than

    if (foo) {
      bar
    }

The latter adds superfluous spaces and braces and obscures the code.  Of
course, one _sometimes_ should disambiguate, like in

    if (foo)
      if (bar)
        baz
      else
        gnus

but it is really only needed in those cases, and only because humans
read it.  The compiler knows what to do.
@cholmcc cholmcc reopened this Oct 23, 2023
@cholmcc cholmcc marked this pull request as ready for review October 23, 2023 10:00
@cholmcc cholmcc changed the title WIP: AOD to HepMC tool AOD to HepMC tool Oct 23, 2023
@cholmcc
Contributor Author

cholmcc commented Oct 23, 2023

build/AliceO2/O2/o2/macOS fails somewhere deep when ROOT tries to ACLiC some scripts. This is not related to the changes of this MR. The same goes for build/AliceO2/O2/o2/macOS-arm. build/O2/fullCI failed because of missing braces around a single-statement branch, even though there is no ambiguity - sigh :-/

@alibuild
Collaborator

Error while checking build/O2/fullCI for c62aa7e at 2023-10-23 15:30:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
/sw/SOURCES/O2/12038-slc8_x86-64/0/Generators/src/AODToHepMC.cxx:97:30: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12038-slc8_x86-64/0/Generators/src/AODToHepMC.cxx:100:30: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12038-slc8_x86-64/0/Generators/src/AODToHepMC.cxx:269:12: error: use nullptr [modernize-use-nullptr]
/sw/SOURCES/O2/12038-slc8_x86-64/0/Generators/src/AODToHepMC.cxx:301:12: error: use nullptr [modernize-use-nullptr]
/sw/SOURCES/O2/12038-slc8_x86-64/0/Generators/src/AODToHepMC.cxx:378:42: error: statement should be inside braces [readability-braces-around-statements]
/sw/SOURCES/O2/12038-slc8_x86-64/0/Generators/src/AODToHepMC.cxx:379:12: error: use nullptr [modernize-use-nullptr]
++ [[ 0 == 0 ]]
++ exit 1
--

Full log here.

@cholmcc
Contributor Author

cholmcc commented Oct 24, 2023

Failures of build/AliceO2/O2/o2/macOS-arm and build/AliceO2/O2/o2/macOS are not related to this MR.

The problem is that Apache Arrow uses the symbol PREALLOCATE, which MacOSX defines as a preprocessor macro with the value 0x00000001. However, Apache Arrow uses that name in places where an identifier is expected.

MacOSX defines PREALLOCATE on line 126 of the header

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/sys/vnode.h

Apache Arrow uses it, for example, on line 535 of

include/arrow/compute/kernel.h

like

  MemAllocation::type mem_allocation = MemAllocation::PREALLOCATE;

which won't work, because the preprocessor rewrites it to MemAllocation::0x00000001. Perhaps a workaround is to introduce the lines

#ifdef PREALLOCATE
#undef PREALLOCATE
#endif

in the appropriate header of the ALICE Arrow fork.

The funny thing is that it only shows up when ROOT tries to ACLiC a script, which would indicate that ROOT's ACLiC includes headers implicitly, specifically the vnode.h header. Otherwise, if Arrow somehow included that header directly or indirectly, then compilation of the ALICE fork of Arrow would fail too.

In any case, these two failures are false negatives.

@alibuild
Collaborator

Error while checking build/O2/fullCI for 750baee at 2023-10-28 22:44:

## sw/BUILD/O2-latest/log
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'
c++: error: unrecognized command-line option '--rtlib=compiler-rt'


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
/sw/SOURCES/O2/12038-slc8_x86-64/0/GPU/GPUTracking/Global/GPUChainITS.cxx:27:9: error: annotate this function with 'override' or (rarely) 'final' [modernize-use-override]
++ [[ 0 == 0 ]]
++ exit 1
--

Full log here.

@cholmcc
Contributor Author

cholmcc commented Oct 30, 2023

Hi all,

Any news on this? Thanks.

@cholmcc
Contributor Author

cholmcc commented Nov 7, 2023

Hi all,

Please advise as to what to do next. Thanks.

Yours,
Christian

@cholmcc
Contributor Author

cholmcc commented Nov 10, 2023

I just resynced this branch against dev.

Please advise as to the next steps. Thank you.

Yours,
Christian

@sawenzel sawenzel requested a review from pzhristov November 10, 2023 15:01
@pzhristov
Contributor

The errors in the CI are not related to this PR, we can merge it.

@sawenzel sawenzel merged commit bfa7c72 into AliceO2Group:dev Nov 10, 2023
@ktf
Member

ktf commented Nov 14, 2023

The errors in the CI are actually related. ;-) This PR exposed some arrow header to cling, which is not actually able to digest it, due to some injected system header. Fixed by #12248.

@cholmcc cholmcc deleted the cholmcc_aod_to_hepmc branch November 21, 2023 15:48
5 participants