[benchmark] Add Video Benchmarks#1430

Merged
praateekmahajan merged 26 commits into main from aot/video-benchmark on Jan 27, 2026

Conversation

@suiyoubi
Contributor

Description

Usage

# Add snippet demonstrating usage

Checklist

  • I am familiar with the Contributing Guide.
  • New or Existing tests cover these changes.
  • The documentation is up to date with these changes.

Signed-off-by: Ao Tang <aot@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented Jan 26, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@greptile-apps
Contributor

greptile-apps Bot commented Jan 26, 2026

Greptile Overview

Greptile Summary

This PR adds comprehensive video processing benchmarks to the nightly benchmark suite. The implementation properly reuses existing tutorial code by extracting the argparser and pipeline creation functions into reusable components.

Key changes:

  • Created video_pipeline_benchmark.py that reuses the video splitting pipeline from tutorials with proper metrics collection (videos processed, clips generated, throughput)
  • Added 4 benchmark configurations testing different video processing scenarios: embedding generation, transcoding, captioning with enhancement, and TransNetV2 with motion/aesthetic filtering
  • Refactored video_split_clip_example.py to extract create_video_splitting_argparser() for reuse across scripts
  • Renamed argument from --output-clip-path to --output-path and updated all README examples accordingly
  • Added defensive checks for task.data attributes to prevent AttributeError exceptions
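The parser reuse described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the body of `create_video_splitting_argparser()` here is a stand-in that only includes flags quoted elsewhere in this conversation, `create_benchmark_argparser()` is a hypothetical helper name, and the `xenna` default for `--executor` is an assumption.

```python
import argparse


def create_video_splitting_argparser() -> argparse.ArgumentParser:
    """Stand-in for the tutorial's parser factory (only flags quoted in this PR)."""
    parser = argparse.ArgumentParser(description="Video splitting (illustrative)")
    parser.add_argument("--video-limit", type=int, default=None, help="Limit the number of videos to read")
    parser.add_argument("--verbose", action="store_true", default=False)
    parser.add_argument("--output-path", type=str, help="Path to output clips", required=True)
    return parser


def create_benchmark_argparser() -> argparse.ArgumentParser:
    """Hypothetical helper: extend the shared parser with benchmark-only flags."""
    parser = create_video_splitting_argparser()
    parser.add_argument("--benchmark-results-path", type=str, required=True)
    # The default executor name is an assumption made for this sketch.
    parser.add_argument("--executor", type=str, default="xenna")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = create_benchmark_argparser().parse_args(
    ["--output-path", "/tmp/clips", "--benchmark-results-path", "/tmp/results"]
)
print(args.output_path, args.benchmark_results_path, args.executor)
```

Building on the tutorial's parser means the benchmark accepts the same pipeline flags as the tutorial without redeclaring them.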

Minor issue:

  • One benchmark requirement has a placeholder value that needs updating after actual benchmarking (line 532)

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk
  • The implementation follows established benchmark patterns, properly reuses existing code, includes defensive error handling, and addresses previous review comments. The only concern is a placeholder value that should be updated after benchmarking.
  • No files require special attention - the placeholder value in benchmarking/nightly-benchmark.yaml:532 can be updated in a follow-up after benchmarking is complete

Important Files Changed

| Filename | Overview |
|---|---|
| benchmarking/scripts/video_pipeline_benchmark.py | new benchmark script that reuses video pipeline from tutorials with proper error handling and metrics collection |
| benchmarking/nightly-benchmark.yaml | adds video dataset configs and 4 benchmark entries (embedding, transcoding, captioning, transnetv2) |
| tutorials/video/getting-started/video_split_clip_example.py | refactored to extract argparser function for reuse, renamed --output-clip-path to --output-path |
| tutorials/video/getting-started/README.md | updated all examples to use --output-path instead of deprecated --output-clip-path |

Sequence Diagram

sequenceDiagram
    participant User
    participant Benchmark as video_pipeline_benchmark.py
    participant Utils as utils.py
    participant Tutorial as video_split_clip_example.py
    participant Pipeline as Video Pipeline
    participant Executor as Xenna/RayData Executor

    User->>Benchmark: Run benchmark with args
    Benchmark->>Tutorial: create_video_splitting_argparser()
    Tutorial-->>Benchmark: ArgumentParser
    Benchmark->>Benchmark: Add benchmark args (--benchmark-results-path, --executor)
    Benchmark->>Benchmark: parse_args()
    
    Benchmark->>Utils: setup_executor(args.executor)
    Utils-->>Benchmark: Executor instance
    
    Benchmark->>Tutorial: create_video_splitting_pipeline(args)
    Tutorial->>Pipeline: Create Pipeline("video_splitting")
    Tutorial->>Pipeline: Add VideoReader stage
    Tutorial->>Pipeline: Add splitting stage (FixedStride/TransNetV2)
    Tutorial->>Pipeline: Add ClipTranscodingStage
    
    alt Generate Embeddings
        Tutorial->>Pipeline: Add embedding stage (CosmosEmbed1/InternVideo2)
    end
    
    alt Generate Captions
        Tutorial->>Pipeline: Add VideoFrameCaptioningStage
        alt Enhance Captions
            Tutorial->>Pipeline: Add LLMCaptionImprovementStage
        end
    end
    
    alt Motion/Aesthetic Filtering
        Tutorial->>Pipeline: Add VideoMotionFilterStage
        Tutorial->>Pipeline: Add VideoAestheticFilterStage
    end
    
    Tutorial->>Pipeline: Add ClipWriterStage
    Tutorial-->>Benchmark: Pipeline object
    
    Benchmark->>Pipeline: pipeline.run(executor)
    Pipeline->>Executor: Process video tasks
    Executor-->>Pipeline: output_tasks
    Pipeline-->>Benchmark: output_tasks
    
    Benchmark->>Benchmark: Calculate metrics (videos processed, clips generated, throughput)
    Benchmark->>Utils: write_benchmark_results(results, path)
    Utils->>Utils: Write params.json, metrics.json, tasks.pkl
    Utils-->>Benchmark: Success
    
    Benchmark-->>User: Exit code (0=success, 1=failure)
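
The last two steps of the diagram (metric calculation and writing `params.json`, `metrics.json`, and `tasks.pkl`) might look roughly like the sketch below. The function names, the layout of the `results` dict, the `time_taken_s` and `throughput_videos_per_sec` keys, and the definition of throughput as videos per second are all assumptions made for illustration. The real `write_benchmark_results` in `utils.py` may differ.

```python
import json
import pickle
import tempfile
from pathlib import Path


def build_metrics(num_videos: int, num_clips: int, elapsed_s: float) -> dict:
    """Assemble a metrics dict; throughput is assumed to mean videos per second."""
    throughput = num_videos / elapsed_s if elapsed_s > 0 else 0.0
    return {
        "num_videos_processed": num_videos,
        "num_clips_generated": num_clips,
        "time_taken_s": elapsed_s,
        "throughput_videos_per_sec": throughput,
    }


def write_results_sketch(results: dict, results_path: str) -> None:
    """Hypothetical stand-in for the result writer shown in the diagram."""
    out_dir = Path(results_path)
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "params.json").write_text(json.dumps(results["params"], indent=2))
    (out_dir / "metrics.json").write_text(json.dumps(results["metrics"], indent=2))
    with open(out_dir / "tasks.pkl", "wb") as f:
        pickle.dump(results["tasks"], f)


metrics = build_metrics(num_videos=10, num_clips=300, elapsed_s=2.0)
with tempfile.TemporaryDirectory() as tmp:
    write_results_sketch({"params": {"executor": "xenna"}, "metrics": metrics, "tasks": []}, tmp)
    written_files = sorted(p.name for p in Path(tmp).iterdir())
    saved_metrics = json.loads((Path(tmp) / "metrics.json").read_text())
print(written_files, saved_metrics["throughput_videos_per_sec"])
```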

Contributor

greptile-apps Bot left a comment

1 file reviewed, 1 comment

Comment thread benchmarking/nightly-benchmark.yaml Outdated
Comment on lines +9 to +10
results_path: /raid/aot/output/curator_benchmark
datasets_path: /raid/aot/datasets
Contributor

Check that these paths (/raid/aot/...) are appropriate for the shared benchmark configuration, as they appear specific to a local development environment.

Signed-off-by: Ao Tang <aot@nvidia.com>
Contributor

greptile-apps Bot left a comment

2 files reviewed, 2 comments


# Calculate metrics from output tasks
# Count unique videos by their input_video path
unique_videos = {task.data.input_video for task in output_tasks if task.data and task.data.input_video}
Contributor

Potential AttributeError if task.data is None or doesn't have input_video attribute

Suggested change
- unique_videos = {task.data.input_video for task in output_tasks if task.data and task.data.input_video}
+ unique_videos = {task.data.input_video for task in output_tasks if task.data and hasattr(task.data, 'input_video') and task.data.input_video}

# Count unique videos by their input_video path
unique_videos = {task.data.input_video for task in output_tasks if task.data and task.data.input_video}
num_videos_processed = len(unique_videos)
num_clips_generated = sum(len(task.data.clips) for task in output_tasks if task.data and task.data.clips)
Contributor

Same defensive check needed here for clips attribute

Suggested change
- num_clips_generated = sum(len(task.data.clips) for task in output_tasks if task.data and task.data.clips)
+ num_clips_generated = sum(len(task.data.clips) for task in output_tasks if task.data and hasattr(task.data, 'clips') and task.data.clips)

Signed-off-by: Ao Tang <aot@nvidia.com>
Contributor

greptile-apps Bot left a comment

No files reviewed, no comments

Signed-off-by: Ao Tang <aot@nvidia.com>
Contributor

greptile-apps Bot left a comment

No files reviewed, no comments

Signed-off-by: Ao Tang <aot@nvidia.com>
Contributor

greptile-apps Bot left a comment

1 file reviewed, 1 comment

Comment on lines +78 to +80
unique_videos = {task.data.input_video for task in output_tasks if task.data and task.data.input_video}
num_videos_processed = len(unique_videos)
num_clips_generated = sum(len(task.data.clips) for task in output_tasks if task.data and task.data.clips)
Contributor

Need defensive checks for task.data, task.data.input_video, and task.data.clips to handle potential None values or missing attributes more robustly.

Suggested change
- unique_videos = {task.data.input_video for task in output_tasks if task.data and task.data.input_video}
- num_videos_processed = len(unique_videos)
- num_clips_generated = sum(len(task.data.clips) for task in output_tasks if task.data and task.data.clips)
+ unique_videos = {task.data.input_video for task in output_tasks if task.data and hasattr(task.data, 'input_video') and task.data.input_video}
+ num_videos_processed = len(unique_videos)
+ num_clips_generated = sum(len(task.data.clips) for task in output_tasks if task.data and hasattr(task.data, 'clips') and task.data.clips)
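
As an alternative to chaining `hasattr` checks inside the comprehensions, the same defensive counting can be written with `getattr` and a default. The sketch below is only an illustration: `count_videos_and_clips` is a hypothetical helper, and `SimpleNamespace` objects stand in for the real task types, which are not shown in this conversation.

```python
from types import SimpleNamespace


def count_videos_and_clips(output_tasks):
    """Return (unique input videos, total clips), tolerating missing data or attributes."""
    unique_videos = set()
    num_clips = 0
    for task in output_tasks or []:
        data = getattr(task, "data", None)
        if data is None:
            continue
        # getattr with a default covers both a missing attribute and a None value.
        input_video = getattr(data, "input_video", None)
        if input_video:
            unique_videos.add(input_video)
        clips = getattr(data, "clips", None)
        if clips:
            num_clips += len(clips)
    return len(unique_videos), num_clips


# Stand-in tasks: two from the same video, one with no data, one missing both attributes.
tasks = [
    SimpleNamespace(data=SimpleNamespace(input_video="a.mp4", clips=[1, 2])),
    SimpleNamespace(data=SimpleNamespace(input_video="a.mp4", clips=[3])),
    SimpleNamespace(data=None),
    SimpleNamespace(data=SimpleNamespace()),
]
num_videos_processed, num_clips_generated = count_videos_and_clips(tasks)
print(num_videos_processed, num_clips_generated)  # 1 3
```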

Signed-off-by: Ao Tang <aot@nvidia.com>
Contributor

greptile-apps Bot left a comment

1 file reviewed, 1 comment

parser.add_argument("--video-limit", type=int, default=None, help="Limit the number of videos to read")
parser.add_argument("--verbose", action="store_true", default=False)
- parser.add_argument("--output-clip-path", type=str, help="Path to output clips", required=True)
+ parser.add_argument("--output-path", type=str, help="Path to output clips", required=True)
Contributor

The argument was renamed from --output-clip-path to --output-path, but README.md in this directory still uses the old name in all examples (lines 20, 36, 47, 80). Update the README to use --output-path instead.
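
For context, `argparse` can also accept the old flag as an alias during a transition, so existing commands keep working while the docs are updated. This is a sketch of that option only. The PR itself renamed the flag and updated the README rather than keeping an alias, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that accepts both the new flag and the old one as an alias."""
    parser = argparse.ArgumentParser()
    # Both option strings write to the same destination, args.output_path.
    parser.add_argument(
        "--output-path",
        "--output-clip-path",
        dest="output_path",
        type=str,
        help="Path to output clips (--output-clip-path is the old name)",
        required=True,
    )
    return parser


new_style = build_parser().parse_args(["--output-path", "/tmp/clips"])
old_style = build_parser().parse_args(["--output-clip-path", "/tmp/clips"])
print(new_style.output_path == old_style.output_path)  # True
```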

Comment thread benchmarking/nightly-benchmark.yaml Outdated
timeout_s: 1800
ray:
  num_cpus: 64
  num_gpus: 1
Contributor

num_gpus = 4

Contributor

greptile-apps Bot left a comment

1 file reviewed, 1 comment

requirements:
  # ensure the total number of documents processed is correct
  - metric: num_clips_generated
    exact_value: 300 # TODO: update this value after benchmarking
Contributor

placeholder value (300) needs updating after actual benchmarking
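
To show why the placeholder matters, here is a minimal sketch of how an `exact_value` requirement could be evaluated against reported metrics. `check_requirements` is a hypothetical function that assumes only the `metric` and `exact_value` keys quoted above. It is not the benchmark framework's actual checker, and the value 287 is invented purely for the demonstration.

```python
def check_requirements(metrics: dict, requirements: list) -> list:
    """Return a list of human-readable failures; an empty list means all requirements pass."""
    failures = []
    for req in requirements:
        name = req["metric"]
        if name not in metrics:
            failures.append(f"{name}: metric missing")
        elif "exact_value" in req and metrics[name] != req["exact_value"]:
            failures.append(f"{name}: expected {req['exact_value']}, got {metrics[name]}")
    return failures


requirements = [{"metric": "num_clips_generated", "exact_value": 300}]

# Passes only if the run happens to produce exactly the placeholder count.
passing = check_requirements({"num_clips_generated": 300}, requirements)
# Any other count (287 is made up for this example) would be reported as a failure.
failing = check_requirements({"num_clips_generated": 287}, requirements)
print(passing, failing)
```

Under this reading, an exact-match requirement with a placeholder value would fail on any run whose real clip count differs from 300.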

praateekmahajan merged commit c022a7e into main on Jan 27, 2026
50 checks passed
sarahyurick mentioned this pull request on Feb 11, 2026
44 tasks