Skip to content

Add python script to launch vtk benchmark#1

Open
Lambourl wants to merge 41 commits into
mainfrom
feature/add-benchmark
Open

Add python script to launch vtk benchmark#1
Lambourl wants to merge 41 commits into
mainfrom
feature/add-benchmark

Conversation

@Lambourl
Copy link
Copy Markdown
Collaborator

No description provided.

Copy link
Copy Markdown
Contributor

@pysco68 pysco68 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start, but I don't this we're tracking all relevant data in the most useful way yet;

We need to time:

  • full rebuild (clean)
  • Initial full build
  • Configure

For "baselined cmake+ninja and for cmake-re" AFAICT

Comment thread vtk/benchmark.py Outdated
Comment on lines +104 to +105
tc_name = Path(toolchain).stem
output_file = f"cmake-run_{tc_name}.txt"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't be dealing with manually collecting that data and writing it to files especially since you're not using structured data anyway;

import logging
from datetime import datetime

log_filename = f"kitwarebench_{datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}.log"

# Configure the logging framework
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(log_filename), # Writes to the file
        logging.StreamHandler()            # Also prints to your console
    ]
)

# Use it!
logging.info("Application started successfully.")
logging.warning("This is a warning message.")
logging.error("An error occurred.")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That being said, you might want to go structured data and serialize to CSV or JSON both of which can be done using python built-ins

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't a completely finished script yet. But I always wanted to use JSON to store that information in the end.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so should be done with json

Comment thread vtk/benchmark.py Outdated
Comment on lines +42 to +98
def start_docker(image, source_dir, container_name=None):
"""Start a detached docker container with source_dir mounted. Returns the container name."""
if container_name is None:
container_name = str(uuid.uuid4())
uid = os.getuid()
gid = os.getgid()
username = getpass.getuser()
home = os.environ["HOME"]

print(f"Starting container {container_name}...")
subprocess.run([
"docker", "run",
"--platform", "linux/amd64",
"--rm", "--init",
"--name", container_name,
f"-u{uid}:{gid}", "--group-add", "tipi",
"--ulimit", "nofile=65535:65535", #
"-e", "TIPI_DISABLE_AR_RANLIB_DRIVER=ON",
"-e", "TIPI_CACHE_CONSUME_ONLY=ON",
"-e", "TIPI_CACHE_FORCE_ENABLE=OFF",
"-e", "HOME",
"-e", "RBE_service=kernite.cluster.engflow.com:443",
"-e", f"RBE_tls_client_auth_key={home}/engflow-mTLS/engflow.key",
"-e", f"RBE_tls_client_auth_cert={home}/engflow-mTLS/engflow.crt",
"-v", f"{home}:{home}:rw",
"-v", f"{source_dir}:{source_dir}:rw",
"-w", str(source_dir),
"-d",
image,
"sleep", "infinity",
], check=True)

# Create the user inside the container
subprocess.run([
"docker", "exec", "-u", "0", container_name,
"useradd", "-d", home, "-u", str(uid), username,
], check=False)

print(f"Container running: {container_name}")
return container_name


def stop_docker(container_name):
"""Stop a running docker container."""
print(f"Stopping container {container_name[:8]}...")
subprocess.run(["docker", "stop", "-t0", container_name], check=True)
print("Container stopped.")


def docker_exec(container_name, cmd):
"""Run a command inside a running docker container and return elapsed time."""
print(f" [{container_name[:8]}] Running: {cmd}")
start = time.perf_counter()
subprocess.run(["docker", "exec", container_name, "bash", "-c", cmd], check=True)
elapsed = time.perf_counter() - start
print(f" Done in {elapsed:.2f}s")
return elapsed
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: This should be in a Python RAII wrapper that allows managing the lifecycle of the container IMHO (so a class with __exit__(self, exc_type, exc_val, exc_tb) and __enter__(self) defined which then allows to do with ... as x: ... blocks)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment thread vtk/benchmark.py Outdated
"""Run a command inside a running docker container and return elapsed time."""
print(f" [{container_name[:8]}] Running: {cmd}")
start = time.perf_counter()
subprocess.run(["docker", "exec", container_name, "bash", "-c", cmd], check=True)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should capture the output and log it somewhere because we will certainly want to know about the error messages / details / RBE invocation ID for profiles ...

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment thread vtk/benchmark.py Outdated

# Second run: with warm RBE cache
print(" [with-cache] cmake-re configure + build...")
docker_exec(container, f"cmake-re -GNinja -S . -B ./build -DCMAKE_TOOLCHAIN_FILE={toolchain} --host --distributed")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

track the configure time too as this was part of the tracked data in Bill's initial benchmark

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment thread vtk/benchmark.py Outdated
Comment on lines +189 to +190
benchmark_vtk_project_cmake(source_dir, image, args.iterations, toolchains)
benchmark_vtk_project_cmake_re(source_dir, image, args.iterations, toolchains)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick; feels like the loop doing the toolchain and iterations could be defined here to not repeat yourself so often.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment thread vtk/benchmark.py Outdated
@Orphis
Copy link
Copy Markdown

Orphis commented Apr 29, 2026

Just a general comment about the Python code: you should now use type annotations on your functions whenever possible. It's been supported by Python 3 for a while now.

On the benchmark side, it might be interesting to check the cache hit rate on rebuild by extracting it from the reproxy log files. I don't think we have much cache invalidation in this project, but if we were to reuse some of this, we'd definitely need it.

@Lambourl
Copy link
Copy Markdown
Collaborator Author

On the benchmark side, it might be interesting to check the cache hit rate on rebuild by extracting it from the reproxy log files. I don't think we have much cache invalidation in this project, but if we were to reuse some of this, we'd definitely need it.

I'm currently working on a way to retrieve all the logs related to the Docker container at the end of the process, which will allow us to review them.

@Lambourl Lambourl force-pushed the feature/add-benchmark branch from ce105fc to 781b68b Compare April 30, 2026 09:43
@Lambourl Lambourl force-pushed the feature/add-benchmark branch from be2552f to 4249615 Compare May 4, 2026 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants