<feature>[sdk]: support dgpu sdk#3676
Conversation
Cherry-pick dgpu-related sdk changes from upstream/feature-5.5.12-dgpu. DBImpact Resolves: ZSTAC-83477 Change-Id: I61746e646972787a6f72757565776d6570617768
Walkthrough该变更通过数据库迁移扩展GPU设备管理和虚拟化功能,添加GPU模型规范化字段、评估任务排序字段、PCI设备虚拟化能力表,以及新的dGPU相关表结构。同时新增资源指标绑定扩展点接口、API辅助方法和错误代码常量。 Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ast-grep (0.42.0)utils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.javaComment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
header/src/main/java/org/zstack/header/zwatch/ResourceMetricBindingExtensionPoint.java (2)
22-28:requireText验证后返回的是未经 trim 的原始值。当前实现检查
value.trim().isEmpty()来验证非空,但最终返回的是原始的value。如果输入值包含前后空格(如" myMetric "),验证会通过,但返回的值仍带有空格,可能导致字符串比较或作为 key 使用时出现不一致的行为。根据项目规范,用户输入可能包含复制粘贴带来的空格/换行符,建议返回 trim 后的值:
♻️ 建议修改
private static String requireText(String fieldName, String value) { requireValue(fieldName, value); - if (value.trim().isEmpty()) { + String trimmed = value.trim(); + if (trimmed.isEmpty()) { throw new IllegalStateException(String.format("ResourceMetricBinding.%s must not be empty", fieldName)); } - return value; + return trimmed; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/zwatch/ResourceMetricBindingExtensionPoint.java` around lines 22 - 28, The requireText method currently validates emptiness using value.trim() but returns the original untrimmed value; update the implementation of requireText (in class ResourceMetricBindingExtensionPoint) to trim the input once (e.g., String trimmed = value == null ? null : value.trim()), call requireValue on the original or trimmed as appropriate, validate using the trimmed variable (throwing the same IllegalStateException if trimmed.isEmpty()), and return the trimmed string so callers receive the canonicalized value.
5-88: 建议添加 Javadoc 文档注释。根据编码规范,接口方法必须配有有效的 Javadoc 注释。建议为接口和方法添加文档说明其用途。
📝 建议添加文档
+/** + * Extension point for providing resource metric binding configurations. + * Implementations supply mappings between logical metrics and their source definitions. + */ public interface ResourceMetricBindingExtensionPoint { // ... nested class ... + /** + * Returns the list of resource metric bindings provided by this extension point. + * + * `@return` list of metric binding configurations + */ List<ResourceMetricBinding> getResourceMetricBindings(); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@header/src/main/java/org/zstack/header/zwatch/ResourceMetricBindingExtensionPoint.java` around lines 5 - 88, The interface ResourceMetricBindingExtensionPoint and its nested class ResourceMetricBinding lack Javadoc; add clear Javadoc comments to the interface, the nested class, and the public API method getResourceMetricBindings(), and also document the purpose of ResourceMetricBinding and each of its public getters/setters (getResourceType, getLogicalMetricName, getSourceNamespace, getSourceMetricName, getResourceField, getSourceLabel, isRequireUniqueSourceKey) explaining expected semantics, null/empty constraints and side effects so the API intent and usage are clear.utils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.java (1)
6283-6284: 建议保持命名模式一致性常量
ORG_ZSTACK_STORAGE_BACKUP_CANCEL_TIMEOUT使用了描述性后缀,而整个文件中其他所有错误码常量都遵循ORG_ZSTACK_[MODULE]_[数字]的模式(如ORG_ZSTACK_STORAGE_BACKUP_10133)。虽然描述性命名本身是清晰的,但打破统一模式可能导致:
- 开发人员在添加新错误码时不确定应遵循哪种模式
- 维护和自动化工具处理时的不一致性
建议后续将此常量重命名为类似
ORG_ZSTACK_STORAGE_BACKUP_10134的格式以保持一致性。注:考虑到这是 cherry-pick 操作,如果上游分支已使用此名称,可在后续统一优化时处理。
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@utils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.java` around lines 6283 - 6284, The constant ORG_ZSTACK_STORAGE_BACKUP_CANCEL_TIMEOUT in CloudOperationsErrorCode breaks the file's naming convention; rename this constant to a numeric-style identifier (e.g., ORG_ZSTACK_STORAGE_BACKUP_10134) and update all references/usages accordingly (ensure the constant value string matches the new identifier) so it follows the ORG_ZSTACK_[MODULE]_[NUMBER] pattern used by other error codes.conf/db/upgrade/V5.5.12__schema.sql (3)
37-48: TIMESTAMP 列建议添加显式默认值。根据编码规范,应使用
DEFAULT CURRENT_TIMESTAMP而非隐式依赖 MySQL 行为。在不同的 MySQL 配置(如explicit_defaults_for_timestamp)下,未指定默认值的TIMESTAMP NOT NULL列可能导致插入失败。建议的修复
CREATE TABLE IF NOT EXISTS `zstack`.`PciDeviceVirtCapabilityVO` ( `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT, `pciDeviceUuid` VARCHAR(32) NOT NULL, `capability` VARCHAR(32) NOT NULL, - `createDate` TIMESTAMP NOT NULL, - `lastOpDate` TIMESTAMP NOT NULL, + `createDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, + `lastOpDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`),As per coding guidelines: "Do not use
DEFAULT 0000-00-00 00:00:00, useDEFAULT CURRENT_TIMESTAMPinstead"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@conf/db/upgrade/V5.5.12__schema.sql` around lines 37 - 48, The TIMESTAMP columns in PciDeviceVirtCapabilityVO (createDate, lastOpDate) lack explicit defaults and should use CURRENT_TIMESTAMP per guidelines; change createDate to TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP and change lastOpDate to TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP (avoid DEFAULT '0000-00-00 00:00:00'), so inserts and automatic updates behave consistently across MySQL configurations.
126-168: TIMESTAMP 列需要添加显式默认值。与
PciDeviceVirtCapabilityVO相同的问题,createDate和lastOpDate列应添加DEFAULT CURRENT_TIMESTAMP。此问题影响DGpuProfileVO(第131-132行)和DGpuDeviceVO(第153-154行)。建议的修复
CREATE TABLE IF NOT EXISTS `zstack`.`DGpuProfileVO` ( ... - `createDate` TIMESTAMP NOT NULL, - `lastOpDate` TIMESTAMP NOT NULL, + `createDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, + `lastOpDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`uuid`),CREATE TABLE IF NOT EXISTS `zstack`.`DGpuDeviceVO` ( ... - `createDate` TIMESTAMP NOT NULL, - `lastOpDate` TIMESTAMP NOT NULL, + `createDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, + `lastOpDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`uuid`),As per coding guidelines: "Do not use
DEFAULT 0000-00-00 00:00:00, useDEFAULT CURRENT_TIMESTAMPinstead"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@conf/db/upgrade/V5.5.12__schema.sql` around lines 126 - 168, The TIMESTAMP columns createDate and lastOpDate in tables DGpuProfileVO and DGpuDeviceVO lack explicit defaults; update the table definitions for DGpuProfileVO (`createDate`, `lastOpDate`) and DGpuDeviceVO (`createDate`, `lastOpDate`) to use DEFAULT CURRENT_TIMESTAMP (and for lastOpDate consider DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP if matching PciDeviceVirtCapabilityVO behavior) so they no longer rely on implicit/zero timestamps and comply with the guideline to avoid DEFAULT '0000-00-00 00:00:00'.
170-190: TIMESTAMP 列需要添加显式默认值。
VmInstanceDGpuStrategyVO表的createDate和lastOpDate列(第179-180行)同样需要添加DEFAULT CURRENT_TIMESTAMP。建议的修复
`autoDetachOnStop` TINYINT(1) NOT NULL DEFAULT 1, - `createDate` TIMESTAMP NOT NULL, - `lastOpDate` TIMESTAMP NOT NULL, + `createDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP, + `lastOpDate` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`),As per coding guidelines: "Do not use
DEFAULT 0000-00-00 00:00:00, useDEFAULT CURRENT_TIMESTAMPinstead"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@conf/db/upgrade/V5.5.12__schema.sql` around lines 170 - 190, The TIMESTAMP columns in VmInstanceDGpuStrategyVO (createDate and lastOpDate) lack explicit defaults; update the CREATE TABLE DDL for VmInstanceDGpuStrategyVO to add DEFAULT CURRENT_TIMESTAMP to both createDate and lastOpDate (and optionally add ON UPDATE CURRENT_TIMESTAMP for lastOpDate if the intent is to auto-update it) so they no longer rely on zero-dates; modify the column definitions for createDate and lastOpDate accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@conf/db/upgrade/V5.5.12__schema.sql`:
- Around line 17-21: The UPDATE on table ModelEvaluationTaskVO sets totalScore
from Json_getKeyValue(opaque, 'total_score') but lacks a check for empty-string
results, which can cause CAST to yield 0; modify the WHERE clause (and/or the
expression) in the UPDATE that references Json_getKeyValue(`opaque`,
'total_score') to also require Json_getKeyValue(`opaque`, 'total_score') != ''
so only non-empty values are cast and written to totalScore (mirror the same !=
'' check used for endTime).
---
Nitpick comments:
In `@conf/db/upgrade/V5.5.12__schema.sql`:
- Around line 37-48: The TIMESTAMP columns in PciDeviceVirtCapabilityVO
(createDate, lastOpDate) lack explicit defaults and should use CURRENT_TIMESTAMP
per guidelines; change createDate to TIMESTAMP NOT NULL DEFAULT
CURRENT_TIMESTAMP and change lastOpDate to TIMESTAMP NOT NULL DEFAULT
CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP (avoid DEFAULT '0000-00-00
00:00:00'), so inserts and automatic updates behave consistently across MySQL
configurations.
- Around line 126-168: The TIMESTAMP columns createDate and lastOpDate in tables
DGpuProfileVO and DGpuDeviceVO lack explicit defaults; update the table
definitions for DGpuProfileVO (`createDate`, `lastOpDate`) and DGpuDeviceVO
(`createDate`, `lastOpDate`) to use DEFAULT CURRENT_TIMESTAMP (and for
lastOpDate consider DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP if
matching PciDeviceVirtCapabilityVO behavior) so they no longer rely on
implicit/zero timestamps and comply with the guideline to avoid DEFAULT
'0000-00-00 00:00:00'.
- Around line 170-190: The TIMESTAMP columns in VmInstanceDGpuStrategyVO
(createDate and lastOpDate) lack explicit defaults; update the CREATE TABLE DDL
for VmInstanceDGpuStrategyVO to add DEFAULT CURRENT_TIMESTAMP to both createDate
and lastOpDate (and optionally add ON UPDATE CURRENT_TIMESTAMP for lastOpDate if
the intent is to auto-update it) so they no longer rely on zero-dates; modify
the column definitions for createDate and lastOpDate accordingly.
In
`@header/src/main/java/org/zstack/header/zwatch/ResourceMetricBindingExtensionPoint.java`:
- Around line 22-28: The requireText method currently validates emptiness using
value.trim() but returns the original untrimmed value; update the implementation
of requireText (in class ResourceMetricBindingExtensionPoint) to trim the input
once (e.g., String trimmed = value == null ? null : value.trim()), call
requireValue on the original or trimmed as appropriate, validate using the
trimmed variable (throwing the same IllegalStateException if trimmed.isEmpty()),
and return the trimmed string so callers receive the canonicalized value.
- Around line 5-88: The interface ResourceMetricBindingExtensionPoint and its
nested class ResourceMetricBinding lack Javadoc; add clear Javadoc comments to
the interface, the nested class, and the public API method
getResourceMetricBindings(), and also document the purpose of
ResourceMetricBinding and each of its public getters/setters (getResourceType,
getLogicalMetricName, getSourceNamespace, getSourceMetricName, getResourceField,
getSourceLabel, isRequireUniqueSourceKey) explaining expected semantics,
null/empty constraints and side effects so the API intent and usage are clear.
In
`@utils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.java`:
- Around line 6283-6284: The constant ORG_ZSTACK_STORAGE_BACKUP_CANCEL_TIMEOUT
in CloudOperationsErrorCode breaks the file's naming convention; rename this
constant to a numeric-style identifier (e.g., ORG_ZSTACK_STORAGE_BACKUP_10134)
and update all references/usages accordingly (ensure the constant value string
matches the new identifier) so it follows the ORG_ZSTACK_[MODULE]_[NUMBER]
pattern used by other error codes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
Run ID: e1402635-d6da-4bf1-a62e-581c60a4bfc6
⛔ Files ignored due to path filters (31)
conf/i18n/globalErrorCodeMapping/global-error-en_US.jsonis excluded by!**/*.jsonconf/i18n/globalErrorCodeMapping/global-error-zh_CN.jsonis excluded by!**/*.jsonsdk/src/main/java/SourceClassMap.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/DGpuDeviceInventory.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/DGpuProfileInventory.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/DGpuSpecStatsInventory.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/DGpuStatus.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/DetachDGpuFromVmAction.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/DetachDGpuFromVmResult.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/DisableDGpuModeAction.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/DisableDGpuModeResult.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/EnableDGpuModeAction.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/EnableDGpuModeResult.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/GetDGpuSpecStatsAction.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/GetDGpuSpecStatsResult.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/GpuDeviceInventory.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/GpuDeviceSpecCandidateInventory.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/PciDeviceInventory.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/PciDeviceVirtMode.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/PciDeviceVirtState.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/PciDeviceVirtStatus.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/QueryDGpuDeviceAction.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/QueryDGpuDeviceResult.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/QueryDGpuProfileAction.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/QueryDGpuProfileResult.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/RemoveVmDGpuStrategyAction.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/RemoveVmDGpuStrategyResult.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/SetDGpuProfileAction.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/SetDGpuProfileResult.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/SetVmDGpuStrategyAction.javais excluded by!sdk/**sdk/src/main/java/org/zstack/sdk/SetVmDGpuStrategyResult.javais excluded by!sdk/**
📒 Files selected for processing (4)
conf/db/upgrade/V5.5.12__schema.sqlheader/src/main/java/org/zstack/header/zwatch/ResourceMetricBindingExtensionPoint.javatestlib/src/main/java/org/zstack/testlib/ApiHelper.groovyutils/src/main/java/org/zstack/utils/clouderrorcode/CloudOperationsErrorCode.java
| UPDATE `zstack`.`ModelEvaluationTaskVO` | ||
| SET `totalScore` = CAST(Json_getKeyValue(`opaque`, 'total_score') AS DECIMAL(20,6)) | ||
| WHERE `opaque` IS NOT NULL | ||
| AND `totalScore` IS NULL | ||
| AND Json_getKeyValue(`opaque`, 'total_score') IS NOT NULL; |
There was a problem hiding this comment.
建议增加空字符串检查以保持一致性。
endTime 的回填逻辑(第33行)检查了 != '',但 totalScore 的回填没有相同的检查。如果 Json_getKeyValue 返回空字符串,CAST 可能会产生意外的 0 值。
建议的修复
UPDATE `zstack`.`ModelEvaluationTaskVO`
SET `totalScore` = CAST(Json_getKeyValue(`opaque`, 'total_score') AS DECIMAL(20,6))
WHERE `opaque` IS NOT NULL
AND `totalScore` IS NULL
- AND Json_getKeyValue(`opaque`, 'total_score') IS NOT NULL;
+ AND Json_getKeyValue(`opaque`, 'total_score') IS NOT NULL
+ AND Json_getKeyValue(`opaque`, 'total_score') != '';🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@conf/db/upgrade/V5.5.12__schema.sql` around lines 17 - 21, The UPDATE on
table ModelEvaluationTaskVO sets totalScore from Json_getKeyValue(opaque,
'total_score') but lacks a check for empty-string results, which can cause CAST
to yield 0; modify the WHERE clause (and/or the expression) in the UPDATE that
references Json_getKeyValue(`opaque`, 'total_score') to also require
Json_getKeyValue(`opaque`, 'total_score') != '' so only non-empty values are
cast and written to totalScore (mirror the same != '' check used for endTime).
Cherry-pick dgpu-related sdk changes from upstream/feature-5.5.12-dgpu.
DBImpact
Resolves: ZSTAC-83477
Change-Id: I61746e646972787a6f72757565776d6570617768
sync from gitlab !9536