DPL: attempt at adding the run number to timers and enumerations #14135

ktf · 2025-04-01T21:46:48Z

No description provided.

github-actions · 2025-04-01T21:46:55Z

REQUEST FOR PRODUCTION RELEASES:
To request your PR to be included in production software, please add the corresponding labels called "async-" to your PR. Add the labels directly (if you have the permissions) or add a comment of the form (note that labels are separated by a ",")

+async-label <label1>, <label2>, !<label3> ...

This adds <label1> and <label2> and removes <label3>.

The following labels are available
async-2023-pbpb-apass4
async-2023-pp-apass4
async-2024-pp-apass1
async-2022-pp-apass7
async-2024-pp-cpass0
async-2024-PbPb-apass1
async-2024-ppRef-apass1
async-2024-PbPb-apass2
async-2023-PbPb-apass5

ktf · 2025-04-01T21:48:18Z

@shahor02 this should be close to what is needed. Not tested yet since I am rebuilding the world...
@knopers8 do you use the tfCounter of timers in any meaningful way? If not, any objections to make it seconds since EPOCH or something like that?
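To make the "seconds since EPOCH" idea concrete, here is a minimal sketch of how such a value could be computed. The helper name `secondsSinceEpoch` is hypothetical and not part of DPL, and the 32-bit width of the counter is an assumption made for illustration.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper, not part of DPL: converts a time point into whole
// seconds since the clock's epoch, truncated to 32 bits. The 32-bit width
// is an assumption about the counter field it would be stored in.
inline std::uint32_t secondsSinceEpoch(std::chrono::system_clock::time_point tp)
{
  // Drop the sub-second part of the time elapsed since the epoch.
  const auto secs = std::chrono::duration_cast<std::chrono::seconds>(tp.time_since_epoch());
  // The narrowing cast wraps once the count exceeds the 32-bit range.
  return static_cast<std::uint32_t>(secs.count());
}
```

A timer could then be stamped with `secondsSinceEpoch(std::chrono::system_clock::now())`. An unsigned 32-bit count of seconds from the Unix epoch does not overflow until the year 2106.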

alibuild · 2025-04-02T00:32:03Z

Error while checking build/O2/fullCI_slc9 for fcdd7f3 at 2025-04-02 13:43:

## sw/BUILD/O2-full-system-test-latest/log
Detected critical problem in logfile digi.log
digi.log-[20040:SimReader]: [13:43:34][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-sim-digitizer-workflow, device shutting down. Reason: Cannot find N2o29framework17DataTakingContextE service using a global salt.
[20040:SimReader]: [13:43:34][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-sim-digitizer-workflow, device shutting down. Reason: Cannot find N2o29framework17DataTakingContextE service using a global salt.
[ERROR] Workflow crashed - PID 20040 (SimReader) did not exit correctly however it's not clear why. Exit code forced to 128.


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
++ GRERR=1
++ [[ 1 == 0 ]]
++ mkdir -p /sw/INSTALLROOT/98c3703005fe7ad79bdd33137f30356664a8f9cc/slc9_x86-64/o2checkcode/1.0-local63/etc/modulefiles
++ cat
--

Full log here.

knopers8 · 2025-04-02T09:49:53Z

> @knopers8 do you use the tfCounter of timers in any meaningful way? If not, any objections to make it seconds since EPOCH or something like that?

We use it only with data inputs. It should be safe to change it with regard to QC. Unless it can break the latest possible timeframe computations?

ktf · 2025-04-02T10:25:24Z

In principle timers are already skipped for that...

davidrohr · 2025-04-02T12:20:01Z

Should we perhaps reserve special run numbers to put in there, instead of initializing to 0 in case of failure? Then we would at least know why we get invalid run numbers.

ktf · 2025-04-02T12:25:53Z

Yes, I was thinking about it; however, I wanted to minimise changes to the current behaviour.

ktf · 2025-04-02T12:35:19Z

Framework/Core/src/LifetimeHelpers.cxx

+    try {
+      dh.runNumber = atoi(services.get<DataTakingContext>().runNumber.c_str());
+    } catch (...) {
+      dh.runNumber = 0;


Suggested change:

-      dh.runNumber = 0;
+      dh.runNumber = -1;

@davidrohr something like this?

davidrohr · 2025-04-02

Probably I would rather reserve a range of ~100 invalid positive numbers with meaning, like the range we have for unanchored MC. You can discuss the range with @ehellbar and RC.
Then the invalid runNumber check would also need to check for that range. If we get an error with a number from that range, it is clear how it happened.
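The reserved-range idea can be sketched as follows. This is only an illustration under stated assumptions: the range start `kInvalidRunFirst`, the error codes and every function name are hypothetical, the real range would still have to be agreed on as described above, and the run number is assumed to be stored as an unsigned 32-bit value. The sketch uses `std::from_chars` rather than `atoi`, because `atoi` returns 0 for non-numeric input and therefore cannot distinguish a parse failure from a run number of 0.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical reserved range of run numbers that encode a failure reason.
constexpr std::uint32_t kInvalidRunFirst = 1000000000u;
constexpr std::uint32_t kInvalidRunCount = 100u;

enum class RunNumberError : std::uint32_t {
  ServiceMissing = 0, // the context providing the run number was unavailable
  NotANumber = 1,     // empty string or non-digit characters
  OutOfRange = 2      // does not fit, or collides with the reserved range
};

// Maps a failure reason to its reserved run number.
constexpr std::uint32_t invalidRun(RunNumberError e)
{
  return kInvalidRunFirst + static_cast<std::uint32_t>(e);
}

// The check that any "is this run number valid" test would also need.
constexpr bool isInvalidRun(std::uint32_t run)
{
  return run >= kInvalidRunFirst && run < kInvalidRunFirst + kInvalidRunCount;
}

// Parses a decimal run number, returning a reserved value on failure.
inline std::uint32_t parseRunNumber(std::string_view text)
{
  std::uint32_t value = 0;
  const char* first = text.data();
  const char* last = first + text.size();
  const auto [ptr, ec] = std::from_chars(first, last, value);
  if (ec == std::errc::result_out_of_range) {
    return invalidRun(RunNumberError::OutOfRange);
  }
  if (ec != std::errc{} || ptr != last) {
    return invalidRun(RunNumberError::NotANumber);
  }
  if (isInvalidRun(value)) {
    return invalidRun(RunNumberError::OutOfRange);
  }
  return value;
}
```

With something like this, the `catch (...)` branch in the diff could assign `invalidRun(RunNumberError::ServiceMissing)` instead of 0. If the header field is indeed unsigned, as assumed here, assigning -1 would wrap to the largest representable value rather than stay negative.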

alibuild · 2025-04-02T17:32:10Z

Error while checking build/O2/fullCI_slc9 for 3f8e858 at 2025-04-02 19:32:

## sw/BUILD/O2Physics-latest/log
Error in cling::AutoLoadingVisitor::InsertIntoAutoLoadingState: (line repeated 18 times)


## sw/BUILD/O2-full-system-test-latest/log
Detected critical problem in logfile digi.log
digi.log:[21522:internal-dpl-ccdb-backend]: [19:31:57][ERROR] Exception while running: Fatal error. Rethrowing.
digi.log-[21522:internal-dpl-ccdb-backend]: [19:31:57][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-sim-digitizer-workflow, device shutting down. Reason: Fatal error
[21522:internal-dpl-ccdb-backend]: [19:31:54][ERROR] CCDBDownloader CURL transfer error - Timeout was reached
[21522:internal-dpl-ccdb-backend]: [19:31:54][ERROR] CcdbDownloader finished transfer http://alice-ccdb.cern.ch/CTP/Calib/OrbitReset for 1550600800000 (agent_id: alimetal04.cern.ch-1743615103-qwInpc) with http code: 0
[21522:internal-dpl-ccdb-backend]: [19:31:54][ERROR] File CTP/Calib/OrbitReset could not be retrieved. No more hosts to try.
[21522:internal-dpl-ccdb-backend]: [19:31:54][FATAL] Unable to find CCDB object CTP/Calib/OrbitReset/1550600800000
[21522:internal-dpl-ccdb-backend]: [19:31:57][ERROR] Exception while running: Fatal error. Rethrowing.
[21522:internal-dpl-ccdb-backend]: [19:31:57][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-sim-digitizer-workflow, device shutting down. Reason: Fatal error
[ERROR] Workflow crashed - PID 21522 (internal-dpl-ccdb-backend) did not exit correctly however it's not clear why. Exit code forced to 128.


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
++ GRERR=1
++ [[ 1 == 0 ]]
++ mkdir -p /sw/INSTALLROOT/858be680255d42de56e650d92b49dec242332dbf/slc9_x86-64/o2checkcode/1.0-local66/etc/modulefiles
++ cat
--

Full log here.

ktf requested a review from a team as a code owner April 1, 2025 21:46

DPL: attempt at adding the run number to timers and enumerations

3f8e858

ktf force-pushed the pr14135 branch from fcdd7f3 to 3f8e858 Compare April 2, 2025 12:04

ktf changed the title ~~Attempt at adding the run number to timers and enumerations~~ DPL: attempt at adding the run number to timers and enumerations Apr 2, 2025

ktf commented Apr 2, 2025


ktf merged commit 75153a0 into AliceO2Group:dev Apr 3, 2025
19 checks passed

4 participants
