Skip to content

MGM: fix WeightedRandomPlacement::access selectedIndex and division by zero-weight disk possible#57

Open
bodoque-01 wants to merge 1 commit into
cern-eos:masterfrom
bodoque-01:fix/mgm-weighted-random-access
Open

MGM: fix WeightedRandomPlacement::access selectedIndex and division by zero-weight disk possible#57
bodoque-01 wants to merge 1 commit into
cern-eos:masterfrom
bodoque-01:fix/mgm-weighted-random-access

Conversation

@bodoque-01
Copy link
Copy Markdown

Fixes two bugs in WeightedRandomPlacement::access() that prevent it from being safely configured as a per-space access strategy.

Bug 1 selectedIndex semantics

access() stored the chosen fsid in args.selectedIndex and then validated it with best_index <= args.selectedfs.size(). The downstream consumer (mgm/ofs/XrdMgmOfsFile.cc:473-523) uses selectedIndex as a
position into arrays (firewalleps[fsIndex], proxys[fsIndex]) and iterated them in parallel.

The mismatch produces two failure modes:

  • Spurious ENOENT whenever the chosen fsid exceeds the replica list size (as far as I know should be the typical case?), weighted-random access silently fails.

  • Out-of-bounds read on the match case (small fsid in a small cluster), since the guard also uses <= where the position must be strictly less than the size, the same thing happens

Fix: iterate by index, store the loop position, and use std::numeric_limits<size_t>::max() as the "nothing picked" sentinel.

Bug 2 division by zero on zero-weight disks

score = hashFid(args.inode, fsid) / wt had no guard against wt == 0. Disk weight is std::atomic<uint8_t> default-initialised to 0 (mgm/placement/ClusterDataTypes.hh:66). validDiskPlct does not filter on weight (mgm/placement/PlacementStrategy.hh:318-339). A file with a replica on a zero-weight disk would SIGFPE the MGM.

Fix: skip zero-weight disks before the division. Treating weight 0 as "do not select" is consistent with how populateWeights already handles it for placement (discrete_distribution assigns it zero probability).

Regarding testing, I added a new case that covers both fixes, both that selectedIndex is a position and that a
zero-weight disk is considered skippable. Too, that a list of only zero-weight disks (don't know how extreme that case would be) returns ENOENT. I ran the tests on the prebuild-el9-diopside image from CERN, running on my Ubuntu as the Host.

…t crash

  WeightedRandomPlacement::access() had two bugs that made it unsafe to use
  as a per-space access strategy:

  1. selectedIndex was set to the chosen fsid instead of its position in
     args.selectedfs. The downstream consumer (XrdMgmOfsFile.cc) uses it
     to INDEX parallel arrays (firewalleps, proxys) sized like selectedfs,
     matching the contract set by RoundRobinPlacement::access. The guard
     `best_index <= args.selectedfs.size()` then rejected pretty much
     every selection as ENOENT, and on the rare match returned an
     out-of-bounds index. In the fix, we track the loop position instead
     and use a max() sentinel for the "nothing picked" path.

  2. score = hashFid(...) / wt divided by the disk weight with no guard.
     Disk weight is atomic<uint8_t> default-initialised to 0
     (ClusterDataTypes.hh), settable to 0 via `eos sched weight ... 0`
     (SchedCmd.cc), and validDiskPlct does not filter on weight. In the
     fix, we skip zero-weight disks before the division.

  Adds FlatScheduler.WeightedRandomAccessSelectedIndex covering both
  fixes (position semantics, zero-weight skip, only-zero-weight returns
  ENOENT) and tightens ASSERT_LE -> ASSERT_LT in TLSingleSiteWeighted.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant