Wrong interaction of DefinePerSample with downstream varied action #22367

@vepadulano

Check duplicate issues.

  • Checked for duplicates

Description

The following high-level analysis steps lead to triggering an assert in RDefinePerSample:

  • Use DefinePerSample to define some metadata
  • Vary an existing dataset column
  • Define a downstream variable that depends on both the varied column and the column created with DefinePerSample
  • Book an action that depends on that downstream variable
  • Request the variations of that action and trigger one of the varied results

The reason is that the varied action requests column readers for its variation name. When a column comes from a previous Define, there is logic in place to return the correct define reader for that variation. But since a DefinePerSample column cannot depend on variations, it was assumed that it would never be asked for a varied reader, so that code path asserts instead of returning one.
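The dispatch described above can be illustrated with a small standalone model. This is only a sketch and not ROOT's actual code: `DefineBase`, `RegularDefine`, `PerSampleDefineStrict`, `PerSampleDefineLenient` and `ResolveReader` are hypothetical names invented for the illustration, and the strict variant throws an exception where the real code aborts on an assert. The lenient variant shows one conceivable behaviour (falling back to the nominal value, since a per-sample column has no variations), which is an assumption and not necessarily the fix that should be adopted.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a define node in the computation graph.
struct DefineBase {
  virtual ~DefineBase() = default;
  // A varied action asks every define it reads for the node matching its variation.
  virtual DefineBase &GetVariedDefine(const std::string &variation) = 0;
  virtual std::string Describe() const = 0;
};

// Models a regular Define: it owns one clone per variation it depends on.
struct RegularDefine : DefineBase {
  std::string label;
  std::map<std::string, std::unique_ptr<RegularDefine>> varied;

  explicit RegularDefine(std::string l) : label(std::move(l)) {}

  void AddVariation(const std::string &variation) {
    varied[variation] = std::make_unique<RegularDefine>(label + "[" + variation + "]");
  }

  DefineBase &GetVariedDefine(const std::string &variation) override {
    auto it = varied.find(variation);
    // Unknown variation: this define does not depend on it, use the nominal node.
    return it == varied.end() ? *this : *it->second;
  }

  std::string Describe() const override { return label; }
};

// Models the behaviour reported in this issue: the per-sample define assumes
// it is never asked for a variation (an exception stands in for the assert).
struct PerSampleDefineStrict : DefineBase {
  DefineBase &GetVariedDefine(const std::string &) override {
    throw std::logic_error("This should never be called");
  }
  std::string Describe() const override { return "xs"; }
};

// One conceivable alternative (an assumption): a per-sample value has no
// variations, so the nominal node is returned for any variation name.
struct PerSampleDefineLenient : DefineBase {
  DefineBase &GetVariedDefine(const std::string &) override { return *this; }
  std::string Describe() const override { return "xs"; }
};

// What a varied action conceptually does for each column it reads.
std::string ResolveReader(DefineBase &define, const std::string &variation) {
  return define.GetVariedDefine(variation).Describe();
}
```

In this model, resolving `"qcd_scale:up"` on a `RegularDefine` that registered that variation yields the varied clone, the lenient per-sample node yields the nominal `xs`, and the strict per-sample node fails, which mirrors the abort seen in the reproducer.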

The assert triggered is:

Fatal: false && "This should never be called" violated at line 88 of `/home/vpadulan/Programs/rootproject/rootbuild/rdf-range-overreading-distrdf-debug/include/ROOT/RDF/RDefinePerSample.hxx'
aborting

Reproducer

#include <ROOT/RDFHelpers.hxx>
#include <ROOT/RDataFrame.hxx>
#include <ROOT/RVec.hxx>
#include <TFile.h>
#include <TTree.h>

#include <memory>
#include <vector>

void write_ttree(const char *datasetname, const char *filename) {
  auto f = std::make_unique<TFile>(filename, "recreate");
  auto t = std::make_unique<TTree>(datasetname, datasetname);

  float qcd_scale{};
  float pt{};
  t->Branch("qcd_scale", &qcd_scale);
  t->Branch("pt", &pt);
  std::vector<float> vals{0.5f, 1.f, 1.5f, 2.f, 2.5f};
  for (auto val : vals) {
    qcd_scale = val;
    pt = val * 20.f;
    t->Fill();
  }
  f->Write();
}

void run_analysis() {
  auto datasetname{"events"};
  auto filename{"events.root"};
  write_ttree(datasetname, filename);
  ROOT::RDF::RNode df = ROOT::RDataFrame(datasetname, filename);
  df = df.DefinePerSample(
      "xs", [](unsigned, const ROOT::RDF::RSampleInfo &) { return 0.005f; });
  // Comment the line above and uncomment the next line to make the example work
  //  df = df.Define("xs", []() -> float { return 0.005f; });
  df = df.Vary("qcd_scale",
               [](float s) { return ROOT::RVecF{s * 1.1f, s * 0.9f}; },
               {"qcd_scale"}, {"up", "down"});
  df = df.Define("weight", [](float scale, float xs) { return scale * xs; },
                 {"qcd_scale", "xs"});
  auto nominal =
      df.Histo1D<float, float>({"h", "h", 100, 0, 50}, "pt", "weight");
  auto vars = ROOT::RDF::Experimental::VariationsFor(nominal);
  vars["qcd_scale:up"];
}

int main() { run_analysis(); }

ROOT version

Any

Installation method

Build from source

Operating system

Any

Additional context

No response

Metadata

Labels

bug, experiment (affects an experiment / reported by its software & computing experts), in:RDataFrame
