[None][fix] only configure gc thresholds once #13910

Open
ixlmar wants to merge 1 commit into NVIDIA:main from ixlmar:fix/gc-config
Conversation

@ixlmar
Collaborator

@ixlmar ixlmar commented May 8, 2026

Description

Avoids latency spikes caused by repeated calls to gc.set_threshold.

Test Coverage

n/a

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Summary by CodeRabbit

  • Refactor

    • Enhanced garbage collection efficiency by optimizing how memory management thresholds are applied during executor result processing.
    • Improved thread context management to ensure garbage collection optimizations apply at the thread level rather than at operation level.
    • Consolidated internal utilities and simplified context handling in core threading components.
  • Chores

    • Removed internal helper class.

Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
@ixlmar ixlmar requested review from Funatiq and dcampora May 8, 2026 15:03
@ixlmar
Collaborator Author

ixlmar commented May 8, 2026

/bot run

@ixlmar
Collaborator Author

ixlmar commented May 8, 2026

/bot run

@tensorrt-cicd
Collaborator

PR_Github #47416 [ run ] triggered by Bot. Commit: 2cfda96 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #47417 [ run ] triggered by Bot. Commit: 2cfda96 Link to invocation

@ixlmar ixlmar marked this pull request as ready for review May 8, 2026 16:05
@ixlmar ixlmar requested a review from a team as a code owner May 8, 2026 16:05
@ixlmar ixlmar requested a review from nv-guomingz May 8, 2026 16:05
@coderabbitai
Contributor

coderabbitai Bot commented May 8, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c059d174-2103-4817-9307-eaaef51bc2d3

📥 Commits

Reviewing files that changed from the base of the PR and between 1651d1b and 2cfda96.

📒 Files selected for processing (2)
  • tensorrt_llm/executor/proxy.py
  • tensorrt_llm/llmapi/utils.py

📝 Walkthrough

The PR refactors garbage collection threshold customization in the executor's result dispatcher. Previously, the GC threshold context wrapped only the blocking queue-check operation; now it is passed to the ManagedThread as a context manager, applying the customization at the entire thread-execution level. The ManagedThread class gains an optional context parameter and wraps its main loop accordingly, replacing the now-removed ContextManager helper class.
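The difference between the two scopes can be illustrated with a minimal sketch. It does not reproduce the TRT-LLM code: `temporary_gc_threshold`, `per_operation`, `per_thread`, and the threshold value 20000 are hypothetical names and numbers chosen for illustration, and the loop body stands in for the blocking queue check. Only `gc.get_threshold` and `gc.set_threshold` are real standard-library calls.

```python
import gc
from contextlib import contextmanager

set_calls = 0  # demo counter: how often gc.set_threshold was invoked


@contextmanager
def temporary_gc_threshold(gen0: int):
    """Hypothetical helper: override the generation-0 threshold, then restore it."""
    global set_calls
    previous = gc.get_threshold()
    gc.set_threshold(gen0, *previous[1:])
    set_calls += 1
    try:
        yield
    finally:
        gc.set_threshold(*previous)
        set_calls += 1


def per_operation(iterations: int) -> int:
    """Context entered around every operation: two set_threshold calls per iteration."""
    global set_calls
    set_calls = 0
    for _ in range(iterations):
        with temporary_gc_threshold(20000):  # arbitrary illustrative value
            pass  # stand-in for the blocking queue check
    return set_calls


def per_thread(iterations: int) -> int:
    """Context entered once around the whole loop: two set_threshold calls in total."""
    global set_calls
    set_calls = 0
    with temporary_gc_threshold(20000):  # arbitrary illustrative value
        for _ in range(iterations):
            pass  # stand-in for the blocking queue check
    return set_calls
```

In this sketch, the per-operation shape calls `gc.set_threshold` twice per loop iteration, while the per-thread shape calls it twice over the lifetime of the loop. Note that in CPython `gc.set_threshold` changes a process-wide setting, so "thread level" here refers to how long the override stays active, not to which threads it affects.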

Changes

GC Threshold Thread Context

| Layer / File(s) | Summary |
|---|---|
| Imports & Helper Class Removal (tensorrt_llm/llmapi/utils.py) | nullcontext and ContextManager type imports are added; the ContextManager helper class definition is removed. |
| ManagedThread Context Support (tensorrt_llm/llmapi/utils.py) | ManagedThread.__init__() accepts an optional context parameter and stores it; the run() main loop is wrapped in `with self.context:` so thread execution occurs within the supplied context manager (or nullcontext() by default). |
| GC Threshold Thread Context (tensorrt_llm/executor/proxy.py) | The result dispatcher ManagedThread receives context=customized_gc_thresholds(...) instead of wrapping only the sentinel check, shifting the GC-threshold scope to the entire thread. |
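The thread-side pattern summarized above might look roughly like the following sketch. `LoopThread`, `Recorder`, and `run_demo` are hypothetical stand-ins written for this illustration and are not the actual `ManagedThread` implementation; in particular, the convention that the task returns `False` to stop the loop is an assumption.

```python
import threading
from contextlib import nullcontext
from typing import Callable, ContextManager, Optional, Tuple


class LoopThread(threading.Thread):
    """Hypothetical stand-in for ManagedThread, not the TRT-LLM class."""

    def __init__(self,
                 task: Callable[[], bool],
                 context: Optional[ContextManager] = None):
        super().__init__(daemon=True)
        self.task = task
        # Fall back to a no-op context when none is supplied.
        self.context = context if context is not None else nullcontext()

    def run(self) -> None:
        # The context is entered once; every loop iteration runs inside it.
        with self.context:
            while self.task():  # assumption: task returns False to stop
                pass


class Recorder:
    """Context manager that counts how often it is entered and exited."""

    def __init__(self) -> None:
        self.enters = 0
        self.exits = 0

    def __enter__(self) -> "Recorder":
        self.enters += 1
        return self

    def __exit__(self, *exc_info) -> bool:
        self.exits += 1
        return False  # do not suppress exceptions


def run_demo(iterations: int) -> Tuple[int, int]:
    """Run the task `iterations` times and report (enters, exits)."""
    recorder = Recorder()
    remaining = [iterations]

    def task() -> bool:
        remaining[0] -= 1
        return remaining[0] > 0

    thread = LoopThread(task, context=recorder)
    thread.start()
    thread.join()
    return recorder.enters, recorder.exits
```

However many iterations the task runs, the supplied context is entered and exited exactly once per thread, which is the property the PR relies on to avoid repeated `gc.set_threshold` calls.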

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The description explains the issue (latency spikes from repeated gc.set_threshold calls) but lacks details on the solution and affected components, and test coverage is marked as n/a without explanation. | Expand the description to explain how the solution works (context manager approach, thread-level configuration) and clarify why test coverage is not applicable for these changes. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: moving GC threshold configuration to occur once at thread level rather than repeatedly. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #47417 [ run ] completed with state SUCCESS. Commit: 2cfda96
/LLM/main/L0_MergeRequest_PR pipeline #37344 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

2 participants