feat: deep_ep v2 #1303

Open
blueswhen wants to merge 2 commits into main from ep_v2

Conversation

@blueswhen
Collaborator

No description provided.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request integrates ElasticBuffer from deep_ep to optimize MoE dispatching by separating prefill and decode logic. It introduces distinct buffers for low-latency and elastic operations, refines SM allocation for deep_gemm, and updates environment variable handling for token dispatch limits across various models. Feedback was provided regarding a potential NameError in type hinting for EventOverlap when the deep_ep library is missing.

w1_scale: Optional[torch.Tensor] = None,
w2_scale: Optional[torch.Tensor] = None,
- previous_event: Optional["EventOverlap"] = None,
+ previous_event: Optional[EventOverlap] = None,

high

Using EventOverlap directly in the type hint will cause a NameError at import time if deep_ep is not installed, as the import is wrapped in a try...except block. Please use a string literal for the type hint to maintain compatibility with environments where deep_ep might be missing.

Suggested change
- previous_event: Optional[EventOverlap] = None,
+ previous_event: Optional["EventOverlap"] = None,
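
The failure mode behind this suggestion can be reproduced in isolation. The sketch below is not taken from the PR: the module name `_optional_dep_that_is_missing` and the helper names are hypothetical stand-ins for the guarded deep_ep import. It assumes the `except` branch leaves `EventOverlap` unbound. On interpreters that evaluate annotations eagerly (Python before 3.14, without `from __future__ import annotations`), the bare-name annotation raises NameError when the function is defined; the string-literal form does not.

```python
from typing import Optional

try:
    # Hypothetical module name standing in for an optional dependency
    # such as deep_ep; it is expected to be missing.
    from _optional_dep_that_is_missing import EventOverlap
except ImportError:
    pass  # EventOverlap is left unbound, as assumed in the lead-in


def define_with_bare_name() -> Optional[str]:
    """Try to define a function annotated with the bare name.

    Returns the NameError message if the definition fails, or None if the
    interpreter defers annotation evaluation and the definition succeeds.
    """
    try:
        def forward(previous_event: Optional[EventOverlap] = None):
            return previous_event
    except NameError as exc:
        return str(exc)
    return None


def forward_safe(previous_event: Optional["EventOverlap"] = None):
    """String-literal annotation: never looked up at definition time."""
    return previous_event


bare_name_error = define_with_bare_name()
print("bare-name definition error:", bare_name_error)
print("string-literal call result:", forward_safe())
```

The string literal is stored as a forward reference, so the module stays importable without the optional package while type checkers can still resolve the name when the package is present.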
