Butler.py: Adds `preprocess` command for local preprocess by IvanBM18 · Pull Request #5266 · google/clusterfuzz

IvanBM18 · 2026-05-05T04:01:56Z

Adds `preprocess` butler script

This command lets developers trigger the preprocess portion of a fuzz task locally. It generates the serialized and compressed uworker_input payload, uploads it to real GCS, and returns the signed download URL, exactly as happens remotely. The resulting URL can then be used to trigger a task in any backend we want:

- In Swarming, through a pRPC request
- In Batch, by manually posting the task to the utask_main queue

This accelerates local debugging of the tworker preprocessing phase without relying on remote execution queues, which have proven to take multiple hours to "ACK" a task request.
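The flow described above (serialize, compress, upload, sign) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: JSON and zlib are stand-ins for whatever serialization and compression ClusterFuzz actually uses, and `build_payload`, `preprocess_to_signed_url`, `upload`, and `sign` are hypothetical names. The upload and signing steps are passed in as callables so the sketch runs without GCS credentials.

```python
import json
import zlib
from typing import Callable


def build_payload(uworker_input: dict) -> bytes:
  """Serializes and compresses the input (JSON + zlib are stand-ins)."""
  serialized = json.dumps(uworker_input, sort_keys=True).encode('utf-8')
  return zlib.compress(serialized)


def preprocess_to_signed_url(
    uworker_input: dict,
    upload: Callable[[bytes], str],
    sign: Callable[[str], str],
) -> str:
  """Uploads the payload and returns a signed download URL for it."""
  object_path = upload(build_payload(uworker_input))
  return sign(object_path)


if __name__ == '__main__':
  # In-memory fakes stand in for GCS so the sketch runs offline.
  bucket = {}

  def fake_upload(data: bytes) -> str:
    bucket['uworker_input'] = data
    return 'gs://hypothetical-bucket/uworker_input'

  def fake_sign(path: str) -> str:
    return path.replace('gs://', 'https://storage.example/') + '?sig=fake'

  print(preprocess_to_signed_url({'job': 'some_job'}, fake_upload, fake_sign))
```

Whatever consumes the signed URL (Swarming or Batch) only needs the URL itself, which is why the command can stop once the URL is printed.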

Note: To use this command you need the Secret Manager Secret Accessor role for Dev, or you need to set up a service account key locally (using the gcloud auth CLI) that has said role and any other role required for a tworker's preprocess.

Changes

Added the preprocess subcommand.
- It interacts with the actual Datastore and GCS based on the provided configuration.
- It fetches and populates the uworker_env with:
  - Job-specific environment variables from the Datastore.
  - Fuzzer-specific environment variables (for blackbox fuzzers).
  - Required logging metadata (CF_TASK_NAME, CF_TASK_ARGUMENT, CF_TASK_JOB_NAME, CF_TASK_ID) to ensure logs in the subsequent uworker_main step have the correct context.
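The environment population listed above might look something like the sketch below. It only assumes what the list states: job variables, then fuzzer variables, then the four logging keys. The function name `build_uworker_env`, the choice of the fuzzer name as `CF_TASK_ARGUMENT`, and the random UUID used for `CF_TASK_ID` are assumptions made for illustration.

```python
import uuid


def build_uworker_env(job_env, fuzzer_env, task_name, fuzzer, job):
  """Merges job and fuzzer variables, then adds logging metadata."""
  uworker_env = dict(job_env)
  # Fuzzer-specific values are applied second, so they win on key conflicts.
  uworker_env.update(fuzzer_env)
  uworker_env.update({
      'CF_TASK_NAME': task_name,
      # Assumption: the task argument of a fuzz task is the fuzzer name.
      'CF_TASK_ARGUMENT': fuzzer,
      'CF_TASK_JOB_NAME': job,
      # Assumption: a random UUID is an acceptable task ID for local runs.
      'CF_TASK_ID': str(uuid.uuid4()),
  })
  return uworker_env


if __name__ == '__main__':
  env = build_uworker_env({'APP_NAME': 'target'}, {'FUZZER_FLAG': '1'},
                          'fuzz', 'some_fuzzer', 'some_job')
  for key in sorted(env):
    print(f'{key}={env[key]}')
```

Setting the four `CF_TASK_*` keys last keeps them from being overwritten by job or fuzzer variables that happen to share a name.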

Tests performed

Executed the following command in dev:

pipenv run python butler.py preprocess --fuzzer <fuzzer> --job <job> -c <config_dir>

This successfully creates and uploads the payload and returns a valid signed URL. The signed URL was later used to trigger a Swarming task through pRPC; here are the logs
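For readers unfamiliar with how such a subcommand is parsed, the invocation above could be wired up with `argparse` roughly like this. It is a standalone sketch, not butler's actual parser: only `--fuzzer`, `--job`, and `-c` appear in the command above, while the long form `--config-dir`, the `dest='command'` name, and the help strings are assumptions.

```python
import argparse


def build_parser():
  """Builds a parser with a single `preprocess` subcommand."""
  parser = argparse.ArgumentParser(prog='butler.py')
  subparsers = parser.add_subparsers(dest='command', required=True)

  preprocess = subparsers.add_parser(
      'preprocess', help='Run the preprocess step of a fuzz task locally.')
  preprocess.add_argument('--fuzzer', required=True, help='Fuzzer name.')
  preprocess.add_argument('--job', required=True, help='Job name.')
  # Assumption: `--config-dir` is the long form of `-c`.
  preprocess.add_argument(
      '-c', '--config-dir', required=True, help='Path to the config dir.')
  return parser


if __name__ == '__main__':
  args = build_parser().parse_args(
      ['preprocess', '--fuzzer', 'some_fuzzer', '--job', 'some_job',
       '-c', '/path/to/config'])
  print(args.command, args.fuzzer, args.job, args.config_dir)
```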

jardondiego · 2026-05-06T22:45:31Z

+  return uworker_env
+
+
+def _early_setup(args):


nit: any reason not to name it just setup?

Since most of this script is just env setup, I thought it would be easier to understand what makes this method different, or what its purpose is, if I called it early_setup.

I think setup by itself makes more sense, and it is also more common throughout the codebase, but I am also okay if we go with it as is.

jardondiego · 2026-05-06T22:47:11Z

Note: To use this command you need the Secret Manager Secret Accessor role for Dev, or you need to set up a service account locally that has those permissions.

Did you use a .json service account for this? It would be nice to have a few sentences/links to understand it better.

jardondiego · 2026-05-06T22:48:17Z

Does it need to be a subcommand of butler itself? Why not make it a standalone script? I think it's a good idea to not over-populate butler with subcommands. What do you think?

letitz · 2026-05-07T08:46:46Z

+  uworker_env = _get_job_environment(args.job)
+  uworker_env.update(_get_fuzzer_environment(args.fuzzer, args.job))
+
+  # Replicate what process_command_impl does in a real tworker


Could we use process_command_impl() then instead?

At the end of said method we call run_command():
https://github.com/google/clusterfuzz/blob/master/src/clusterfuzz/_internal/bot/tasks/commands.py#L482
That in turn triggers a workflow in which the preprocess step, once finished, either immediately queues the main task for remote execution or simply executes all 3 steps on the same machine (depending on setup). We don't want that: we want to stop right after preprocess finishes, so that we can manually trigger the main portion wherever and whenever we need to.
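The difference being described can be pictured with a small, purely conceptual sketch. None of these functions exist in ClusterFuzz: `run_all_phases` stands for the behavior the reply attributes to the normal workflow, and `run_preprocess_only` for what the new command aims to do. The phase callables are hypothetical placeholders.

```python
def run_all_phases(task_input, preprocess, main, postprocess):
  """Runs the three phases back to back, as on a single machine."""
  uworker_input = preprocess(task_input)
  uworker_output = main(uworker_input)
  return postprocess(uworker_output)


def run_preprocess_only(task_input, preprocess):
  """Stops after preprocess and hands the result back to the caller."""
  return preprocess(task_input)


if __name__ == '__main__':
  calls = []

  def fake_preprocess(task_input):
    calls.append('preprocess')
    return f'signed-url-for-{task_input}'

  result = run_preprocess_only('some_task', fake_preprocess)
  # Only the preprocess phase ran; main can be triggered elsewhere later.
  print(result, calls)
```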

IvanBM18 · 2026-05-07T16:56:22Z

@jardondiego

Note: To use this command you need the Secret Manager Secret Accessor role for Dev, or you need to set up a service account locally that has those permissions.

Did you use a .json service account for this? It would be nice to have a few sentences/links to understand it better.

Not in this case, but it's possible to use a service account: you just need to generate a key, save it locally, and set it up as the default credentials for any gcloud library and CLI operation. This is done using the gcloud auth subcommand.
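Before running the command, it can help to check which local credentials Google client libraries would pick up. The sketch below is a stdlib-only approximation of the first two lookups of Application Default Credentials: the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, then the file written by `gcloud auth application-default login` at its default location on Linux and macOS. The function name `find_adc_source` is hypothetical, and the check does not validate the credentials or their roles.

```python
import os
from pathlib import Path


def find_adc_source(environ, home):
  """Returns a description of the local credential source, or None."""
  key_path = environ.get('GOOGLE_APPLICATION_CREDENTIALS')
  if key_path:
    return f'service account key from env var: {key_path}'

  # Default location on Linux and macOS; other setups may differ.
  well_known = (Path(home) / '.config' / 'gcloud' /
                'application_default_credentials.json')
  if well_known.is_file():
    return f'gcloud application default credentials: {well_known}'
  return None


if __name__ == '__main__':
  source = find_adc_source(os.environ, Path.home())
  print(source or 'no local credentials found')
```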

Added more context in the description so future reviewers can easily understand this.

IvanBM18 · 2026-05-07T16:58:08Z

@jardondiego

Does it need to be a subcommand of butler itself? Why not make it a standalone script? I think it's a good idea to not over-populate butler with subcommands. What do you think?

Yes, it needs to be, as butler already handles a lot of the bootstrapping for this purpose. For example, if we didn't use butler, we would need to add methods to read, parse, and populate configurations based on the YAML files.

jardondiego · 2026-05-07T23:40:06Z

Yes, it needs to be, as butler already handles a lot of the bootstrapping for this purpose. For example, if we didn't use butler, we would need to add methods to read, parse, and populate configurations based on the YAML files.

I think we are referring to different things. What I mean is that we could have it as a standalone butler script, so that we can run it as

python butler.py run <name_of_script> --non-dry-run --config $MY_DIR

That way we don't have to handle all of that by ourselves. Does it make sense?

jardondiego · 2026-05-07T23:40:26Z

Added more context in the description so future reviewers can easily understand this

Thanks!

IvanBM18 added 3 commits April 30, 2026 18:32

Add butler command to run uworker preprocess locally

7222be2

Implement environment preparation in uworker preprocess

618c19f

Rename script to just "Preprocess"

a30f9a1

IvanBM18 requested review from ViniciustCosta, Xeicker, jardondiego, javanlacerda and letitz May 5, 2026 04:01

IvanBM18 self-assigned this May 5, 2026

IvanBM18 requested a review from a team as a code owner May 5, 2026 04:01

IvanBM18 changed the title ~~Butler.py: Adds uworker_preprocess command for local preprocess~~ Butler.py: Adds preprocess command for local preprocess May 5, 2026

Fixes Linter errors

51348d4

jardondiego reviewed May 6, 2026

letitz reviewed May 7, 2026

IvanBM18 wants to merge 4 commits into master from feature/butler/local_preprocess
