[atom-sgl-benchmark] Debug benchmark timeout#977

Merged
zhuyuhua-v merged 1 commit into ROCm:main from junyyang-amd:fix-sglang-timeout-origin-main on Jun 2, 2026

@junyyang-amd
Contributor

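Cause of the error:
The exit code 255 comes from `podman exec` itself timing out while waiting for the container exec's exit file (vllm launch). In other words, the SGLang failure occurs exactly at the long-running exec `bash .github/scripts/atom_sglang_test.sh launch`.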

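Fix approach:
atom_sglang_test.sh: add a `start` mode so the service can be launched in the background and the script exits immediately;
atom-sglang-benchmark.yaml: for the SGLang-ATOM branch, replace the long-running launch exec with a background start plus readiness polling from the outer workflow.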
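
The background start plus outer polling pattern described above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code merged in this PR: the helper names `start_in_background` and `wait_until_ready`, the container variable, the port, the health endpoint, and the timeout values are all hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of "start in background, then poll for readiness".
# Function names, paths, and values are assumptions, not taken from
# atom_sglang_test.sh or atom-sglang-benchmark.yaml.

# Launch a command detached from the caller and return at once, so the
# enclosing `podman exec` finishes quickly instead of staying open.
# Usage: start_in_background LOG_FILE COMMAND [ARGS...]
start_in_background() {
  log_file="$1"; shift
  nohup "$@" >"$log_file" 2>&1 </dev/null &
  echo "$!"   # print the PID of the background process
}

# Re-run a readiness check until it succeeds or the timeout expires.
# Usage: wait_until_ready TIMEOUT_SECONDS INTERVAL_SECONDS CHECK [ARGS...]
# Returns 0 once the check succeeds, 1 if the deadline passes first.
wait_until_ready() {
  timeout_s="$1"; interval_s="$2"; shift 2
  deadline=$(( $(date +%s) + timeout_s ))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$interval_s"
  done
  return 1
}

# Example wiring, shown as comments because the container name, port,
# and health URL are assumptions:
#   podman exec "$CONTAINER" bash .github/scripts/atom_sglang_test.sh start
#   wait_until_ready 1800 10 curl -sf http://localhost:30000/health
```

Each polling step is a short command of its own, so no single exec has to stay open for the full server startup time, which is the situation the description identifies as the source of the 255 exit.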

@junyyang-amd junyyang-amd requested a review from zhuyuhua-v May 29, 2026 14:14
@junyyang-amd junyyang-amd force-pushed the fix-sglang-timeout-origin-main branch from 4211c3c to a5f1c8b Compare June 2, 2026 07:00
@junyyang-amd junyyang-amd force-pushed the fix-sglang-timeout-origin-main branch from a5f1c8b to f99d245 Compare June 2, 2026 08:20
@junyyang-amd junyyang-amd force-pushed the fix-sglang-timeout-origin-main branch from f99d245 to a4c7f6a Compare June 2, 2026 09:21
@zhuyuhua-v zhuyuhua-v merged commit 61b6080 into ROCm:main Jun 2, 2026
5 of 7 checks passed

3 participants