[SPARK-56808][INFRA][3.5] Fix branch-3.5 base image build against Ubuntu focal archive rotation #55785

Closed
LuciferYang wants to merge 6 commits into apache:branch-3.5 from LuciferYang:fix-SPARK-3.5-base-image

LuciferYang (Contributor) commented May 9, 2026

What changes were proposed in this pull request?

Three small changes in dev/infra/Dockerfile to make the scheduled Base image build job on branch-3.5 green again:

  1. Add https://mirrors.edge.kernel.org/ubuntu as an additional APT source (focal, focal-updates, focal-security). This mirrors the pattern the master branch already uses and provides a stable fallback when archive.ubuntu.com / security.ubuntu.com rotate point-release packages out of the canonical archive.
  2. Merge apt-get update into the SPARK-39959 install step so its APT index is aligned with the archive at install time, instead of relying on an index cached many Docker layers earlier.
  3. Bump FULL_REFRESH_DATE from 20221118 to 20260510 so the GH Actions base-image cache is invalidated and this fix actually takes effect on the next run.

The base image itself (ubuntu:focal-20221019) is unchanged — branch-3.5 is in maintenance and not a good place to upgrade to jammy.
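
A minimal sketch of change 1, assuming the mirror entries live in a separate file such as `/etc/apt/sources.list.d/mirror.list` and use the usual `main restricted universe multiverse` components. The helper name `print_mirror_sources`, the file path, and the component list are illustrative assumptions, not the exact lines in dev/infra/Dockerfile.

```shell
# Hypothetical helper: print one `deb` line per suite for a given mirror.
# Arguments: $1 = mirror base URL, $2 = release codename (e.g. focal).
print_mirror_sources() {
    mirror="$1"
    codename="$2"
    for suite in "$codename" "$codename-updates" "$codename-security"; do
        # The component list is an assumption; adjust it to what the image needs.
        printf 'deb %s %s main restricted universe multiverse\n' "$mirror" "$suite"
    done
}

# Example (needs root inside the image, so it is left commented out):
# print_mirror_sources https://mirrors.edge.kernel.org/ubuntu focal \
#     > /etc/apt/sources.list.d/mirror.list
```

Generating the three entries from one codename keeps focal, focal-updates, and focal-security in sync if the mirror URL ever changes.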

Why are the changes needed?

The scheduled Build (branch-3.5, Scala 2.13, Hadoop 3, JDK 8) workflow on 2026-05-09 failed during Base image build with multiple 404 Not Found errors while installing -dev packages (libtiff5-dev, libharfbuzz-dev, libglib2.0-dev, libfreetype6-dev, libblkid-dev, libmount-dev, ...). See:

https://github.com/apache/spark/actions/runs/25599925191/job/75152057946

Root cause: Ubuntu 20.04 (focal) reached the end of standard support and moved to ESM in May 2025. Since then, security point releases have been rotating out of archive.ubuntu.com / security.ubuntu.com faster than before. When the Dockerfile's cached APT index (fetched many layers earlier) references a point-release version that has since been rotated out, apt-get install fails with 404 Not Found.

The fix avoids the race by (a) adding a reliably-synced additional mirror and (b) refreshing the APT index right before the failing install step.
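
Part (b) can be sketched as a small wrapper that always refreshes the index immediately before installing, so both commands run in the same Docker layer and share one view of the archive. The function name `install_with_fresh_index` and the `APT_GET` override (useful for dry runs) are hypothetical; the PR itself only describes merging `apt-get update` into the existing install step.

```shell
# Hypothetical wrapper: refresh the APT index, then install the given packages.
# Chaining with && means a failed update aborts the install instead of
# silently falling back to a stale cached index.
install_with_fresh_index() {
    "${APT_GET:-apt-get}" update &&
        "${APT_GET:-apt-get}" install -y --no-install-recommends "$@"
}

# Dry run: substitute `echo` to print the two commands instead of running them.
# APT_GET=echo install_with_fresh_index libtiff5-dev libharfbuzz-dev
```

In a Dockerfile the same effect comes from keeping `apt-get update && apt-get install ...` inside a single RUN instruction, because a separate `RUN apt-get update` layer can be served from cache long after the archive has moved on.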

Does this PR introduce any user-facing change?

No. Infra-only change to the CI base image on branch-3.5.

How was this patch tested?

  • Pass GitHub Actions

…ntu focal archive rotation

Three changes in dev/infra/Dockerfile:
1. Add mirrors.edge.kernel.org as an additional APT source (same approach
   as master), providing a stable fallback when archive.ubuntu.com /
   security.ubuntu.com rotate point-release packages out of the archive.
2. Merge `apt-get update` into the SPARK-39959 install step so its APT
   index is aligned with the archive at install time.
3. Bump FULL_REFRESH_DATE to 20260510 to invalidate the GH Actions
   base-image cache so this fix takes effect.

The scheduled Base image build on branch-3.5 has been failing with 404s
when fetching -dev packages (libtiff5-dev, libharfbuzz-dev, libglib2.0-dev,
libfreetype6-dev, ...). focal entered ESM in May 2025 and point releases
rotate out of the canonical archives faster than before.

Example failure:
https://github.com/apache/spark/actions/runs/25599925191/job/75152057946

LuciferYang marked this pull request as draft May 9, 2026 19:14

… avoid CA trust failure on focal-20221019

The previous commit added the kernel.org mirror over HTTPS, but the
ubuntu:focal-20221019 base image ships with a stale CA bundle, and the
ca-certificates package is not yet installed when the first
`apt-get update` runs, causing:

    Certificate verification failed: The certificate is NOT trusted.
    W: No system certificates available. Try installing ca-certificates.
    E: The repository 'https://mirrors.edge.kernel.org/ubuntu focal Release'
       does not have a Release file.

Switch the mirror.list entries from https:// to http://. APT verifies
Release/Packages indices via GPG signatures and each .deb via SHA256,
so HTTP is safe here and unblocks `apt-get update` on this older base
image.
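
The integrity argument rests on a chain of checks: the signed Release file lists checksums for the Packages indices, and each Packages entry carries the SHA256 of its .deb. The last link of that chain can be illustrated with a small sketch. This is not APT's actual implementation; the helper name `verify_sha256` is hypothetical, and it assumes `sha256sum` from GNU coreutils is available.

```shell
# Hypothetical helper: succeed only if the file's SHA256 digest matches the
# expected hex string, mirroring the per-package check APT performs.
verify_sha256() {
    file="$1"
    expected="$2"
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(sha256sum "$file" | cut -d' ' -f1)"
    [ "$actual" = "$expected" ]
}

# Example (file name and digest are placeholders):
# verify_sha256 some-package.deb "<sha256 from the Packages index>" || exit 1
```

Because the expected digest comes from a GPG-verified index rather than from the transport, a tampered download over plain HTTP would fail this comparison.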

sarutak (Member) commented May 9, 2026

FYI, I'm working on the same issue too with another approach.
#55740

…specific URLs

`https://bootstrap.pypa.io/get-pip.py` now requires Python 3.10+:

    ERROR: This script does not work on Python 3.9.
    The minimum supported Python version is 3.10.
    Please use https://bootstrap.pypa.io/pip/3.9/get-pip.py instead.

Since branch-3.5 installs pip into python3.9 and pypy3.8, switch to the
version-pinned installers that pypa still maintains:

- python3.9 -> https://bootstrap.pypa.io/pip/3.9/get-pip.py
- pypy3 (3.8) -> https://bootstrap.pypa.io/pip/3.8/get-pip.py
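
The mapping above can be sketched as a small lookup. The function name `get_pip_url` is hypothetical, and only the two versions named in this commit are pinned; anything else falls through to the default installer, which per the error message requires Python 3.10+.

```shell
# Hypothetical helper: choose the get-pip.py URL for a Python major.minor version.
get_pip_url() {
    case "$1" in
        # Version-pinned installers for the interpreters branch-3.5 uses.
        3.8|3.9) printf 'https://bootstrap.pypa.io/pip/%s/get-pip.py\n' "$1" ;;
        # Default installer; older versions not listed here are not handled.
        *)       printf 'https://bootstrap.pypa.io/get-pip.py\n' ;;
    esac
}

# Example (needs network and the interpreter, so it is left commented out):
# curl -sS "$(get_pip_url 3.9)" | python3.9
```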
…uild

pypy3.8 does not provide `ast.MatchStar` (a Python 3.10+ pattern-matching
AST node). When pip builds scipy from source on pypy3.8, it creates an
isolated build env that installs scipy's build dependencies
(meson-python -> pythran -> beniget -> gast). `beniget` contains:

    elif isinstance(self.node, (ast.MatchStar, ast.MatchAs)):

where `ast` is an alias for `gast`. If pip resolves gast < 0.5.3 into the
overlay, gast has no `MatchStar` class of its own and the lookup fails
with:

    AttributeError: module 'gast' has no attribute 'MatchStar'

aborting the scipy build.

gast >= 0.5.3 defines MatchStar/MatchAs as its own node classes,
independent of whether Python's native `ast` has them. Force gast >=
0.5.3 in the build-isolation overlay by exporting PIP_CONSTRAINT; the
overlay pip inherits the env var (pip >= 22.1) and respects the
constraint.

This preserves existing pypy3 test coverage (keeps scipy / numpy /
pandas / matplotlib) without changing the pypy version.
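
A minimal sketch of the PIP_CONSTRAINT workaround described above, assuming the constraint is written to a temporary file. The file location and the commented-out install command are illustrative assumptions; the commit only states that PIP_CONSTRAINT is exported.

```shell
# Write a constraints file that pins gast for every pip resolution, including
# the isolated build environments pip creates for source builds.
constraints_file="$(mktemp)"
printf 'gast>=0.5.3\n' > "$constraints_file"

# pip reads PIP_CONSTRAINT as the equivalent of --constraint <file>.
export PIP_CONSTRAINT="$constraints_file"

# Assumed invocation, left commented out because it needs pypy3 and network access:
# pypy3 -m pip install scipy
```

A constraints file only restricts versions of packages that are already being installed; it does not add gast as a new requirement, so the pin takes effect only where the build chain pulls gast in.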

holdenk (Contributor) commented May 10, 2026

Also predating both of those in #55432 :p

holdenk (Contributor) commented May 10, 2026

I think we do need to upgrade to Jammy, however, since focal is dead. Either way, any change to the Dockerfile breaks the cache, which means the PPAs flake until the Ubuntu DDoS is resolved.

…cal to jammy

Focal (20.04) reached end-of-standard-support, and its archive rotation,
stale CA bundle, and deprecated get-pip.py URLs repeatedly broke the
branch-3.5 base image build. Each focal-side workaround surfaced a new
failure (TLS on kernel.org mirror, version-specific get-pip.py, then
scipy/pythran/beniget requiring Python 3.10 AST nodes that pypy3.8 lacks).

Switch the base image to ubuntu:jammy-20240911.1 (same as master). Keep
branch-3.5 specifics unchanged: openjdk-8, python3.8 / python3.9 via
ppa:deadsnakes, and all numpy/pandas/pyarrow/scipy/grpcio/torch version
pins. Bump pypy3.8 to pypy3.10 so the scipy build chain works without
the PIP_CONSTRAINT gast>=0.5.3 workaround. Switch R repo to
jammy-cran40 and add libuv1-dev so the R `fs` package configures
cleanly.
…sitory on Jammy

The Jammy base image ships without gnupg, and --no-install-recommends
stops software-properties-common from pulling it in transitively. As a
result `add-apt-repository ppa:deadsnakes/ppa` fails while trying to
import the PPA key:

  subprocess.CalledProcessError: Command '['gpg', ..., '--import']'
  returned non-zero exit status 2.

Pull gnupg/ca-certificates/dirmngr into the first apt install so the
PPA key import succeeds. Drop the now-redundant later install of
gnupg/ca-certificates.
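
One way to surface this kind of failure earlier is a preflight check that the tools the PPA key import relies on are on PATH before `add-apt-repository` is called. The helper name `missing_tools` is hypothetical, and the tool list (`gpg`, `dirmngr`) is inferred from the packages named in this commit.

```shell
# Hypothetical helper: print the name of every listed tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: fail fast with a readable message instead of a gpg traceback.
# missing="$(missing_tools gpg dirmngr)"
# [ -z "$missing" ] || { echo "install first: $missing" >&2; exit 1; }
# add-apt-repository ppa:deadsnakes/ppa
```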

LuciferYang (Contributor, Author) commented

@holdenk @sarutak I agree with this upgrade as long as it doesn't introduce breaking changes. It seems you've both made solid progress. I'll close this PR for now; feel free to tag me for a review later.

holdenk (Contributor) commented May 10, 2026

Awesome, let me ping you once the other one is finished.
