
update_check: auto-prefix refs/tags/ for info/refs patterns; adapt 101 update files #60609

Draft
Rutpiv wants to merge 104 commits into void-linux:master from Rutpiv:update-check-refs-tags

Conversation

Contributor

Rutpiv commented May 18, 2026

@classabbyamp's PR #60025 migrated update_check.sh from HTML scraping to git info/refs?service=git-upload-pack. This avoids fragile HTML parsing and reduces bandwidth usage.

The info/refs endpoint emits tag entries in the form <sha>\trefs/tags/<tag>. While investigating why Vulkan-Headers and related packages weren't appearing in void-updates.txt, I found that custom pattern= entries in update files, originally written against HTML output, no longer match because they don't account for the refs/tags/ prefix. This PR fixes the 101 affected packages (of 152 with custom patterns reaching info/refs) and proposes two small additions to update_check.sh that aim to keep update files concise and resilient going forward.
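To illustrate the mismatch, here is a minimal sketch run against made-up data. The SHAs, tag names, and both patterns are hypothetical and are not taken from any update file; the sample lines only follow the `<sha>\trefs/tags/<tag>` shape described above. It assumes GNU grep with `-P` support.

```shell
# Hypothetical sample of tag lines in the form <sha>\trefs/tags/<tag>.
refs=$(printf '%s\trefs/tags/%s\n' \
    1111111111111111111111111111111111111111 v1.2.3 \
    2222222222222222222222222222222222222222 v1.3.0)

# A pattern written against HTML output: it expects an archive link.
old_pattern='archive/v\K[\d.]+(?=\.tar\.gz)'
# The same intent, anchored on the refs/tags/ prefix instead.
new_pattern='refs/tags/v\K[\d.]+'

# grep exits non-zero when nothing matches, hence the "|| true".
old_matches=$(printf '%s\n' "$refs" | grep -oP "$old_pattern" || true)
new_matches=$(printf '%s\n' "$refs" | grep -oP "$new_pattern" || true)

printf 'old pattern matched: [%s]\n' "$old_matches"
printf 'new pattern matched:\n%s\n' "$new_matches"
```

The HTML-era pattern finds nothing because the listing contains no archive links, while the prefixed pattern extracts both versions.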

Infrastructure changes

Two additions to update_check.sh:

  1. Auto-prefix refs/tags/ when (a) the URL is an info/refs endpoint and (b) the custom pattern doesn't already start with refs/tags/. Existing patterns with an explicit refs/tags/ anchor are left unchanged, so the change is fully backward compatible.

  2. distfiles_only=yes variable for update files. When set, homepage scanning is skipped and only distfile URLs are checked.

Manual.md is updated to describe both features and provide guidance on choosing between the default behavior, distfiles_only=yes, and site=URL.

Design rationale

Why auto-prefix refs/tags/?

The info/refs endpoint (used by update_check.sh since #60025) emits tag lines in a uniform format: every line ends with refs/tags/<tag>. Since the refs/tags/ prefix is a property of the wire format rather than something each package needs to express, it seemed reasonable to handle it at the script level.

The alternative would be adding refs/tags/ to each affected update file individually. That works, but spreads endpoint-format knowledge across ~100 packages. Centralizing it in the script keeps update files focused on what's package-specific (the tag naming convention upstream chose) rather than protocol-level detail.

The condition is narrow (URL contains info/refs AND pattern doesn't already start with refs/tags/), so explicit-anchor patterns continue working unchanged. The 44 untouched custom-pattern packages whose patterns were already compatible were cross-checked against master to confirm this; no regressions were observed.
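The condition can be sketched as a small shell function. This is an illustration only, not the code in update_check.sh: the function name `auto_prefix_pattern` and the example.org URL are hypothetical, and the pattern is the vulkan-sdk example from this description.

```shell
# Hypothetical helper illustrating the auto-prefix condition.
# $1: URL being fetched, $2: custom pattern from the update file.
auto_prefix_pattern() {
    url=$1
    pattern=$2
    case "$url" in
        *info/refs*)
            case "$pattern" in
                refs/tags/*) ;;                        # explicit anchor: leave as-is
                *) pattern="refs/tags/$pattern" ;;     # otherwise add the prefix
            esac
            ;;
    esac
    printf '%s\n' "$pattern"
}

# Example invocations (URLs are placeholders):
auto_prefix_pattern 'https://example.org/up/proj.git/info/refs?service=git-upload-pack' 'vulkan-sdk-\K[\d.]+'
auto_prefix_pattern 'https://example.org/up/proj.git/info/refs?service=git-upload-pack' 'refs/tags/v\K[\d.]+'
auto_prefix_pattern 'https://example.org/downloads/' 'proj-\K[\d.]+'
```

Only the first call changes its pattern. One thing a sketch like this leaves open is top-level alternation: prefixing `a|b` yields `refs/tags/a|b`, where the prefix binds only to the first alternative, so such patterns may need grouping or an explicit anchor.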

I'm happy to revisit this approach if you'd prefer the per-package path. The script change is small and easy to drop.

Why distfiles_only=yes?

The homepage scan is a useful heuristic when patterns live in HTML space, since a single broad pattern can catch versions in either the homepage or the distfiles page. With info/refs, the situation changes a bit: the endpoint already provides a uniformly-structured tag listing, while the homepage remains a human-facing page whose layout can change for unrelated reasons.

In practice, two patterns emerge:

  • Patterns written to match the homepage tend to anchor on decorative strings (e.g., "Latest release: X", <title>Release foo-X</title>). When upstream redesigns the site, these break even when the tags themselves are stable.
  • Patterns scoped to info/refs can stay simpler and more structural (e.g., vulkan-sdk-\K[\d.]+ against a refs/tags/ listing). Adding homepage-survival logic to them tends to bloat the regex.

distfiles_only=yes is meant as an opt-in for packages where the homepage adds noise rather than signal. Default behavior is unchanged.

Why a new variable instead of site=URL to info/refs?

site= works for this. It's used in this PR for libeot and libluv, where upstream's tag scheme requires it. The reason I went with a separate variable is that site=URL requires each package to spell out the full info/refs URL, duplicating information already derivable from the template's homepage / distfiles. distfiles_only=yes is one line and lets the framework keep deriving the endpoint from the template.

The Manual update describes this as a small spectrum: default for most packages, distfiles_only=yes when the homepage adds noise, site=URL when a different endpoint is genuinely required.
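As a rough illustration of that spectrum, the three variants of an update file might look like the fragments below. Every value is a placeholder (the tag prefixes and the example.org URL are hypothetical); only the variable names pattern=, distfiles_only=, and site= come from this PR. The fragments are shown back to back for comparison, whereas a real update file would contain just one of them.

```shell
# (a) Default: homepage and distfiles are both scanned; the pattern is
#     auto-prefixed with refs/tags/ when the URL is an info/refs endpoint.
pattern='v\K[\d.]+'

# (b) distfiles_only=yes: skip the homepage scan, keep deriving the
#     endpoint from the template's distfiles.
pattern='release-\K[\d.]+'
distfiles_only=yes

# (c) site=URL: spell out the endpoint when a different one is required;
#     the URL is used as-is.
site='https://example.org/upstream/project.git/info/refs?service=git-upload-pack'
pattern='refs/tags/project-\K[\d.]+'
```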

Package changes (101)

Adapted custom patterns so ./xbps-src update-check <pkg> returns the correct latest version via info/refs. Each was verified individually with ./xbps-src update-check, often with XBPS_VERBOSE=yes to confirm the URL and regex being used. Pre-release tags (rc, beta, pre) were caught by inspecting upstream tags via curl 'info/refs?service=git-upload-pack' and filtered with ignore= where needed.
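The pre-release filtering can be sketched as follows, assuming ignore= holds a space-separated list of shell globs matched against each candidate version. The tag list and the globs are made up for illustration, and the loop is a simplified stand-in for what update_check.sh does rather than a copy of it.

```shell
# Hypothetical versions extracted from an info/refs listing.
tags='2.0.0 2.1.0-rc1 2.1.0 2.2.0beta 2.2.0-pre1'
# Hypothetical ignore= value: shell globs for versions to skip.
ignore='*rc* *beta* *pre*'

set -f   # disable pathname expansion so the globs in $ignore stay literal
kept=
for v in $tags; do
    skip=
    for glob in $ignore; do
        case "$v" in
            $glob) skip=1 ;;   # unquoted on purpose: treated as a glob pattern
        esac
    done
    [ -n "$skip" ] || kept="${kept:+$kept }$v"
done
set +f

printf 'kept: %s\n' "$kept"
```

With this sample input, only 2.0.0 and 2.1.0 remain as candidate versions.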

34 of these also gained distfiles_only=yes.

Full list of touched packages

CImg, EternalTerminal, ImageMagick, Lucene++, OpenSubdiv, SPIRV-Headers, SPIRV-LLVM-Translator19, SPIRV-LLVM-Translator21, SPIRV-LLVM-Translator22, SPIRV-Tools, UEFITool, Vulkan-Headers, Vulkan-Tools, Vulkan-Utility-Libraries, Vulkan-ValidationLayers, asahi-uboot, atomicparsley, cgal, cronutils, discount, eigen3.2, faac, faad2, flintlib, fntsample, font-adobe-source-code-pro, gdmd, gnome-epub-thumbnailer, godot, gopls, gzdoom, incron, inih, intel-ucode, inxi, iverilog, j4-dmenu-desktop, kodi, lf, libaccounts-qt, libblockdev, libcgroup, libeot, libgit2-1.8, libgit2-1.9, libgme, libhomfly, libkeybinder3, libluv, libmygui, libui, libvidstab, lilypond-doc, linux-asahi, liteide, lm_sensors, lmdb, love, lsyncd, lua54-luafilesystem, mame, mathjax2, md4c, minio, miruo, moosefs, openimageio, openjdk8, otfcc, pgbackrest, poedit, python3-aioamqp, python3-hypothesis, python3-pyqtgraph, re2, redshift, redsocks, sane, sasm, sdl12-compat, slurm-wlm, spectrwm, swayr, swayrbar, taplo, tectonic, tinymist, tmux, vapoursynth, w3m, wire-desktop, wlroots0.18, wlroots0.19, wlroots0.20, x265, xdelta3, yaydl, zeux-volk, zfs-auto-snapshot, zfs-lts, zimg

Numbers

Of 1,574 unique update files:

  • 942 have no site= set, so the URL is transformed to info/refs when the template references a git forge.
    • 426 of these reach a git forge, meaning info/refs is actually used.
      • 152 have custom pattern=:
        • 101 had patterns incompatible with the refs/tags/ prefix and are updated here.
        • 44 of the remaining 51 were already compatible.
        • 7 of the remaining 51 return NO VERSION for unrelated reasons.
      • 274 use the default regex; 8 of these still return NO VERSION due to template peculiarities.
    • 516 reach other hosts (PyPI, etc.), so info/refs is not engaged.
  • 632 have site= set, so the URL is used as-is and info/refs is not engaged.
    • 85 of these still have legacy patterns (referencing tar.gz/archive/refs/tags); they continue working via HTML scraping, so migrating them is cosmetic.

Commands used to gather these numbers:

# Total unique update files
find srcpkgs -maxdepth 2 -name update -type f | wc -l                                # 1574

# With custom pattern
find srcpkgs -maxdepth 2 -name update -type f \
    -exec grep -l '^pattern=' {} \; | wc -l                                          # 682

# Touched in this PR
git diff --diff-filter=AM --name-only master..HEAD -- 'srcpkgs/*/update' \
    | grep -oP 'srcpkgs/\K[^/]+' | sort -u | wc -l                                   # 101

# distfiles_only=yes added in this PR
git diff master..HEAD -- 'srcpkgs/*/update' \
    | grep -c '^+distfiles_only=yes'                                                 # 34

Validation

Each of the 101 modified update files was individually verified with ./xbps-src update-check (often with XBPS_VERBOSE=yes) to confirm the URL fetched, the regex applied (including the auto-prefix), and the version detected.

In addition, the 51 untouched custom-pattern packages (44 already compatible with the refs/tags/ prefix, 7 with pre-existing NO VERSION causes) were cross-checked against the master baseline to confirm the update_check.sh changes introduce no regressions on the unmodified set. No regressions were observed.

Out of scope / follow-up

  • Packages with site= pointing to git-forge HTML pages (~85). These still work via HTML scraping but could be migrated to info/refs for consistency. This is a different class of change (modifying site= rather than pattern=), left as a potential cosmetic follow-up.

  • Templates with version-format mismatches. For some packages the migrated pattern detects versions correctly, but the template's version= still follows an older convention than current upstream (e.g., slurm-wlm template 19.05.5.1 vs upstream 25-11-6-1). The pattern migration is included because it makes detection work again; the version-format bump is left as separate per-package work.

  • Upstream-archived packages (cronutils, otfcc, minio, redshift). Patterns are migrated for consistency, but these may warrant removal or orphaning in separate cleanup PRs.

  • SPIRV-Tools tag format. The pattern migration here aligns detection with the other Khronos packages in this PR. The template currently consumes a release-candidate tag format; aligning with the Vulkan SDK release cycle would be a separate decision.

  • libgme upstream move. Upstream migrated from Bitbucket (mpyne/game-music-emu) to GitHub (libgme/game-music-emu). The template still points at the old Bitbucket location, which a future PR may want to address.

Feedback welcome, particularly on the auto-prefix condition scope and whether distfiles_only=yes reads well alongside the existing site= mechanism in Manual.md.

Rutpiv added 30 commits May 18, 2026 00:23
Rutpiv added 28 commits May 18, 2026 00:23
Contributor Author

Rutpiv commented May 18, 2026

Marking as draft. I spent a lot of time on this and want to step back and look at it from a different angle before continuing. Will come back when I have a clearer view of the scope.

@Rutpiv Rutpiv marked this pull request as draft May 18, 2026 20:24