32 commits
f02d4a0
feat: Implement Kubernetes job client and service
javanlacerda Dec 24, 2025
697cdd7
pipenv lock
javanlacerda Dec 24, 2025
3310027
install kubernetes
javanlacerda Dec 27, 2025
7f12a07
Update dependencies and fix linting
javanlacerda Jan 9, 2026
b986275
move use_batch
javanlacerda Jan 9, 2026
5b7b4d5
add todo
javanlacerda Jan 9, 2026
ca629cd
Pr/metrics logging (#5115)
javanlacerda Jan 9, 2026
234f0ac
fixes
javanlacerda Jan 9, 2026
6838442
fix lint
javanlacerda Jan 9, 2026
eec7e6b
mock gcloud auth default
javanlacerda Jan 9, 2026
110440b
feature flag
javanlacerda Jan 8, 2026
ef6e272
fix lint
javanlacerda Jan 9, 2026
636a3bf
rename create utask main jobs
javanlacerda Jan 10, 2026
7cd5067
delete out.og
javanlacerda Jan 10, 2026
f76c56f
fix lint
javanlacerda Jan 11, 2026
e9c7f7d
add ack postprocess
javanlacerda Jan 12, 2026
b837c06
rollback fuzz local executor
javanlacerda Jan 13, 2026
a498af0
fuzz
javanlacerda Jan 13, 2026
2d723a8
fix
javanlacerda Jan 13, 2026
d531174
fix test
javanlacerda Jan 14, 2026
172ee90
Fix k8s service tests to use feature flag for pending jobs limit
javanlacerda Jan 14, 2026
bd08a69
refactoring
javanlacerda Jan 14, 2026
3075c41
add string value to feature flag
javanlacerda Jan 14, 2026
81d8407
fix
javanlacerda Jan 14, 2026
e196477
extract k7s job templates
javanlacerda Jan 15, 2026
3692699
create adapters enum
javanlacerda Jan 15, 2026
ae2e936
lint
javanlacerda Jan 16, 2026
c108e35
set ttl for the job after completion
javanlacerda Jan 16, 2026
092e27f
revert Pipfile
javanlacerda Jan 18, 2026
9b3a503
fix JTW
javanlacerda Jan 18, 2026
941c7d1
use actions/checkout@v4
javanlacerda Jan 18, 2026
2114adf
fix single quotes usage
javanlacerda Jan 19, 2026
42 changes: 42 additions & 0 deletions .github/workflows/kubernetes-e2e-tests.yaml
@@ -0,0 +1,42 @@
+# Copyright 2025 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+name: Run Kubernetes e2e tests
+on: [pull_request]
+
+permissions: read-all
+
+jobs:
+  build:
+    runs-on: ubuntu-24.04
+
+    steps:
+      - uses: actions/checkout@v4
+      - run: | # Needed for git diff to work.
+          git fetch origin master --depth 1
+          git symbolic-ref refs/remotes/origin/HEAD refs/remotes/origin/master
+
+      - name: Setup python environment
+        uses: actions/setup-python@b55428b1882923874294fa556849718a1d7f2ca5
+        with:
+          python-version: 3.11
+
+      - name: Set up JDK 21
+        uses: actions/setup-java@v4
+        with:
+          java-version: '21'
+          distribution: 'temurin'
+
+      - name: Run Kubernetes e2e tests
+        run: ./local/tests/kubernetes_e2e_test.bash
24 changes: 13 additions & 11 deletions ...clusterfuzz/_internal/batch/kubernetes.py → local/tests/kubernetes_e2e_test.bash
100644 → 100755
@@ -1,3 +1,5 @@
+#!/bin/bash -ex
+#
 # Copyright 2025 Google LLC
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
@@ -11,17 +13,17 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-"""Kubernetes batch client."""
-from clusterfuzz._internal.remote_task import RemoteTaskInterface

+# This script is for running the Kubernetes end-to-end test in CI.

+pip install pipenv

+# Install dependencies.
+pipenv --python 3.11
+pipenv install

-class KubernetesJobClient(RemoteTaskInterface):
-  """A remote task execution client for Kubernetes.
-  This class is a placeholder for a future implementation of a remote task
-  execution client that uses Kubernetes. It is not yet implemented.
-  """
+./local/install_deps.bash
Comment on lines +18 to +25
Collaborator

This is only intended to be used in CI?

Contributor Author

Yes!


-  def create_job(self, spec, input_urls):
-    """Creates a Kubernetes job."""
-    raise NotImplementedError('Kubernetes batch client is not implemented yet.')
+# Run the test.
+export K8S_E2E=1
+pipenv run python butler.py py_unittest -t core -p k8s_service_e2e_test.py
5 changes: 3 additions & 2 deletions src/Pipfile
@@ -27,14 +27,16 @@ google-crc32c = "==1.5.0"
 grpcio = "==1.62.2"
 httplib2 = "==0.19.0"
 jira = "==2.0.0"
+Jinja2 = "==3.1.4"
+kubernetes = "==34.1.0"
 mozprocess = "==1.3.1"
 oauth2client = "==4.1.3"
 psutil = "==5.9.4"
 protobuf = "==4.23.4"
 pygithub = "==1.55"
 pyOpenSSL = "==22.0.0"
 python-dateutil = "==2.8.1"
-PyYAML = "==6.0"
+PyYAML = "==6.0.1"
 pytz = "==2023.3"
 redis = "==4.6.0"
 requests = "==2.21.0"
@@ -49,7 +51,6 @@ charset-normalizer = "==3.3.2"
 firebase-admin = "==6.2.0"
 Flask = "==2.2.2"
 itsdangerous = "==2.0.1"
-Jinja2 = "==3.1.4"
 PyJWT = "==2.7.0"
 requests-toolbelt = "==0.9.1"
 werkzeug = "==2.2.2"
1,740 changes: 1,061 additions & 679 deletions src/Pipfile.lock

Large diffs are not rendered by default.

14 changes: 12 additions & 2 deletions src/clusterfuzz/_internal/base/tasks/__init__.py
@@ -64,6 +64,12 @@
     'regression': 24 * 60 * 60,
 }

+
+def get_task_duration(command):
+  """Gets the duration of a task."""
+  return TASK_LEASE_SECONDS_BY_COMMAND.get(command, TASK_LEASE_SECONDS)
+
+
 TASK_QUEUE_DISPLAY_NAMES = {
     'LINUX': 'Linux',
     'LINUX_WITH_GPU': 'Linux (with GPU)',
@@ -503,6 +509,7 @@ def __init__(self, pubsub_message):
     }

     self.eta = datetime.datetime.utcfromtimestamp(float(self.attribute('eta')))
+    self.do_not_ack = False

   def attribute(self, key):
     """Return attribute value."""
@@ -550,7 +557,8 @@ def lease(self, _event=None): # pylint: disable=arguments-differ
     leaser_thread.join()

     # If we get here the task succeeded in running. Acknowledge the message.
-    self._pubsub_message.ack()
+    if not self.do_not_ack:
Collaborator

What is this change?

Contributor Author

It's part of the job limiter for the Kubernetes service. We can probably use this to implement the job limiter for Batch as well, using the new feature they implemented for us. The rationale is that if a task cannot be scheduled on Kubernetes because the job limit has already been reached, the message should not be acked, which allows another adapter, such as Batch, to process it.
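
The rationale in this reply can be illustrated with a minimal sketch. It is not the PR's implementation: `FakeMessage`, `Task`, `finish`, `schedule_on_kubernetes`, and `MAX_PENDING_JOBS` are hypothetical names chosen for illustration, and only the `do_not_ack` flag and the guarded `ack()` call mirror the diff.

```python
class FakeMessage:
  """Hypothetical stand-in for a Pub/Sub message; records whether ack() ran."""

  def __init__(self):
    self.acked = False

  def ack(self):
    self.acked = True


class Task:
  """Hypothetical task wrapper carrying the do_not_ack flag from the diff."""

  def __init__(self, message):
    self._pubsub_message = message
    self.do_not_ack = False

  def finish(self):
    # Mirrors the guarded ack in lease(): skip the ack when the flag is set.
    if not self.do_not_ack:
      self._pubsub_message.ack()


MAX_PENDING_JOBS = 2  # Assumed limit, for illustration only.


def schedule_on_kubernetes(task, pending_jobs):
  """Returns True if the task fits under the limit; otherwise flags it."""
  if pending_jobs >= MAX_PENDING_JOBS:
    # Limit reached: leave the message unacked so it can be redelivered.
    task.do_not_ack = True
    return False
  return True
```

With the flag set, the message stays unacknowledged, so Pub/Sub can redeliver it after the ack deadline expires and another adapter has a chance to process it.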

Contributor

IMO readability would be improved by using ack instead of do_not_ack (go/tott/764).
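
One possible reading of this suggestion, sketched under assumptions: a positively named flag avoids the double negative in `if not self.do_not_ack`. The attribute name `should_ack` is hypothetical (a bare `ack` attribute would collide with the existing `ack()` method on one of the task classes in this file), and `FakeMessage`, `Task`, and `finish` are illustrative stand-ins rather than code from the PR.

```python
class FakeMessage:
  """Hypothetical stand-in for a Pub/Sub message; records whether ack() ran."""

  def __init__(self):
    self.acked = False

  def ack(self):
    self.acked = True


class Task:
  """Hypothetical task wrapper using a positively named flag."""

  def __init__(self, message):
    self._pubsub_message = message
    self.should_ack = True  # Hypothetical name; defaults to acknowledging.

  def finish(self):
    # Reads without a double negative: ack only when asked to.
    if self.should_ack:
      self._pubsub_message.ack()
```

A caller that hits the job limit would then set `task.should_ack = False` instead of `task.do_not_ack = True`.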

+      self._pubsub_message.ack()
     track_task_end()

   def dont_retry(self):
@@ -587,7 +595,8 @@ def lease(self, _event=None): # pylint: disable=arguments-differ
     leaser_thread.join()

     # If we get here the task succeeded in running. Acknowledge the message.
-    self._pubsub_message.ack()
+    if not self.do_not_ack:
+      self._pubsub_message.ack()
     track_task_end()


@@ -730,6 +739,7 @@ def __init__(self,
     grandparent_class.__init__(command, output_url_argument, job_type, eta,
                                is_command_override, high_end)
     self._pubsub_message = pubsub_message
+    self.do_not_ack = False

   def ack(self):
     self._pubsub_message.ack()
200 changes: 0 additions & 200 deletions src/clusterfuzz/_internal/batch/gcp.py

This file was deleted.
