Conversation

@basilwong
Contributor

Tutorial Content Changes

1. Template Compliance
   - Added `# -*- coding: utf-8 -*-` encoding declaration
   - Added grid cards for "What you will learn" and "Prerequisites"
   - Added "Further Reading" section with 5 links
2. GPU/Environment Compatibility (see the sketch after this list)
   - Added `HAS_CUDA` flag to handle a missing GPU (for Colab users)
   - Added `HAS_MOSAIC_CLI` check using `shutil.which()`
   - Added `sys.modules["__main__"].__file__` fix for the sphinx-gallery environment
3. Images and Visual Documentation
   - Added figure directives for Mosaic profiling screenshots in the introduction
   - Added GPT-2 memory profile comparison images (with and without activation checkpointing)
4. Google Colab Support
   - Added a code block with download instructions so Colab users can retrieve the generated files
5. Subprocess Output Visibility
   - Fixed `subprocess.run()` calls to use `capture_output=True, text=True`
   - Now prints stdout/stderr so users can see the Mosaic CLI output
6. Code Block Organization
   - Split the Mosaic analysis (baseline vs. buggy) into separate code blocks for readability
7. Buggy Model Refactor (`GPT2WithDebugOverhead`)
   - Changed from subclassing `GPT2LMHeadModel` to wrapping it as a `torch.nn.Module`
   - Constructor now takes a `base_model` parameter instead of a config
   - Fixes transformers version compatibility issues
   - Removed the try/except workaround that was bypassing the tutorial's purpose
   - Removed the unused `GPT2Config` import
8. Text/Formatting Updates
   - Embedded the Mosaic repo link in its first mention
   - Embedded the LLaMA 3.1 blog link
   - Changed section headers from RST underlines to bold text
   - Removed "What this tells us" bold formatting
   - Various cleanup of the conclusion and key takeaways
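
A minimal sketch of what the compatibility guards in item 2 might look like; the `mosaic` executable name is an assumption based on the CLI references in this PR, not a confirmed binary name.

```python
import shutil

import torch

# Guard GPU-only code paths so the tutorial still runs on CPU-only
# runtimes such as a default Colab instance.
HAS_CUDA = torch.cuda.is_available()

# Only shell out to the Mosaic CLI when the executable is on PATH;
# shutil.which() returns None when it cannot be found.
HAS_MOSAIC_CLI = shutil.which("mosaic") is not None
```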

basilwong and others added 13 commits January 25, 2026 15:41
Introduces a beginner tutorial demonstrating how to use Mosaic for
GPU memory analysis in PyTorch. The tutorial covers:

- Analyzing memory savings from activation checkpointing
- Debugging unexpected memory usage from abandoned code
- Integrating Mosaic into training pipelines for CI/CD

Includes graceful handling for environments without GPU access.

Add HAS_MOSAIC_CLI check to skip Mosaic CLI subprocess calls
when the mosaic package is not installed. This prevents
FileNotFoundError in CI environments that have CUDA but
don't have Mosaic installed.

Remove check=True from subprocess.run calls to prevent
exceptions when Mosaic CLI commands fail. Instead, check
return codes and print informative messages. This allows
the tutorial to run in environments where Mosaic is
partially installed or configured differently.
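
The return-code handling this commit describes might look like the following sketch; the `mosaic` command line shown is a placeholder, not the tutorial's actual invocation.

```python
import subprocess

# Run the Mosaic CLI without check=True so a non-zero exit does not
# raise CalledProcessError; report the outcome instead.
result = subprocess.run(
    ["mosaic", "analyze", "profile.json"],  # placeholder arguments
    capture_output=True,
    text=True,
)
print(result.stdout)
if result.returncode != 0:
    print(f"Mosaic CLI exited with code {result.returncode}: {result.stderr}")
```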

Set __main__.__file__ to a valid file path if not present.
Transformers library reads this file to inspect source code,
so we provide the tutorial file path or fall back to the
transformers module path if __file__ is not available.
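
A minimal sketch of this fix, showing only the fallback branch; the commit also sets the tutorial's own file path when it is known.

```python
import sys

# sphinx-gallery executes tutorials in a __main__ module without a
# __file__ attribute, but transformers reads it to inspect source code.
main_module = sys.modules["__main__"]
if not getattr(main_module, "__file__", None):
    import transformers

    # Fall back to the transformers module path when the tutorial's
    # own file path is unavailable.
    main_module.__file__ = transformers.__file__
```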

Wrap buggy model instantiation in try/except to handle
ValueError from newer transformers versions that don't
support experts implementation on GPT2Model. Falls back
gracefully when the demo cannot run.
Co-authored-by: Svetlana Karslioglu <svekars@meta.com>

- Add tutorial to "What's new in PyTorch tutorials" section
- Add customcarditem in Profiling section of index.rst
- Add customcarditem and toctree entry in ecosystem.rst

…buggy model

- Add GPT-2 memory profiling images (with/without activation checkpointing)
- Add Google Colab download instructions for generated files
- Fix subprocess.run to capture and print Mosaic CLI output
- Split Mosaic analysis into separate code blocks for readability
- Refactor GPT2WithDebugOverhead to use wrapper pattern instead of
  subclassing, fixing transformers version compatibility issues
- Remove try/except workaround that was bypassing the tutorial's purpose
- Update section formatting (bold headers instead of RST underlines)
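
A hedged sketch of the wrapper pattern this commit describes; only the wrap-a-`base_model` shape comes from the PR text, and the debug-overhead body is a hypothetical stand-in for whatever the tutorial's buggy model actually allocates.

```python
import torch
from transformers import GPT2LMHeadModel


class GPT2WithDebugOverhead(torch.nn.Module):
    """Wraps a pre-built GPT-2 model instead of subclassing it, which
    insulates the constructor from transformers version changes."""

    def __init__(self, base_model: GPT2LMHeadModel):
        super().__init__()
        self.base_model = base_model

    def forward(self, *args, **kwargs):
        # Hypothetical bug: keep an extra tensor alive across calls so
        # the memory profile shows avoidable overhead.
        self._debug_cache = torch.empty(
            1024, 1024, device=next(self.base_model.parameters()).device
        )
        return self.base_model(*args, **kwargs)


# Usage: wrap an already-constructed model rather than passing a config.
# model = GPT2WithDebugOverhead(GPT2LMHeadModel.from_pretrained("gpt2"))
```
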
@pytorch-bot

pytorch-bot commented Jan 28, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3752

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the `cla signed` label Jan 28, 2026
@basilwong
Contributor Author

Superseded by #3753, which has a cleaner diff showing only the improvements on top of the merged #3744.

basilwong closed this Jan 28, 2026