@@ -40,7 +40,11 @@ echo 'alias km=karmadactl' >>~/.bashrc
echo 'complete -F __start_karmadactl km' >>~/.bashrc
```

> **Note:** bash-completion sources all completion scripts in /etc/bash_completion.d.
:::note

bash-completion sources all completion scripts in /etc/bash_completion.d.

:::

Both approaches are equivalent. After reloading your shell, karmadactl autocompletion should be working.

@@ -21,4 +21,8 @@ It contains the following metrics:
| GPUDeviceSharedNum | Number of containers sharing this GPU | `{deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",nodeid="aio-node67",zone="vGPU"}` 1 |
| vGPUPodsDeviceAllocated | vGPU Allocated from pods | `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 |

> **Note** This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.
:::note

This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.

:::
@@ -40,7 +40,11 @@ echo 'alias km=karmadactl' >>~/.bashrc
echo 'complete -F __start_karmadactl km' >>~/.bashrc
```

> **Note:** bash-completion sources all completion scripts in /etc/bash_completion.d.
:::note

bash-completion sources all completion scripts in /etc/bash_completion.d.

:::

Both approaches are equivalent. After reloading your shell, karmadactl autocompletion should be working.

@@ -21,4 +21,8 @@ It contains the following metrics:
| GPUDeviceSharedNum | Number of containers sharing this GPU | `{deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",nodeid="aio-node67",zone="vGPU"}` 1 |
| vGPUPodsDeviceAllocated | vGPU Allocated from pods | `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 |

> **Note** This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.
:::note

This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.

:::
@@ -40,7 +40,11 @@ echo 'alias km=karmadactl' >>~/.bashrc
echo 'complete -F __start_karmadactl km' >>~/.bashrc
```

> **Note:** bash-completion sources all completion scripts in /etc/bash_completion.d.
:::note

bash-completion sources all completion scripts in /etc/bash_completion.d.

:::

Both approaches are equivalent. After reloading your shell, karmadactl autocompletion should be working.

@@ -31,11 +31,15 @@ Install only gpushare-device-plugin, don't install gpu-scheduler-plugin package.

:::

> **NOTE:** The default resource names are:
> - `enflame.com/vgcu` for GCU count, only support 1 now.
> - `enflame.com/vgcu-percentage` for the percentage of memory and cores in a gcu slice.
>
> You can customize these names by modifying `hami-scheduler-device` configMap above.
:::note

The default resource names are:
- `enflame.com/vgcu` for GCU count, only support 1 now.
- `enflame.com/vgcu-percentage` for the percentage of memory and cores in a gcu slice.

You can customize these names by modifying `hami-scheduler-device` configMap above.

:::

* Set 'devices.enflame.enabled=true' when deploy HAMi

@@ -105,7 +109,11 @@ spec:
# ... rest of pod spec
```

> **NOTE:** The device ID format is `{node-name}-enflame-{index}`. You can find the available device IDs in the node status.
:::note

The device ID format is `{node-name}-enflame-{index}`. You can find the available device IDs in the node status.

:::

### Finding Device UUIDs

@@ -20,7 +20,11 @@ spec:
# ... rest of pod spec
```

> **NOTE:** The device ID format is `{node-name}-iluvatar-{index}`. You can find the available device IDs in the node status.
:::note

The device ID format is `{node-name}-iluvatar-{index}`. You can find the available device IDs in the node status.

:::

### Finding Device UUIDs

@@ -35,12 +35,16 @@ title: Enable Iluvatar GCU sharing
helm install hami hami-charts/hami --set scheduler.kubeScheduler.imageTag={your kubernetes version} --set iluvatarResourceMem=iluvatar.ai/vcuda-memory --set iluvatarResourceCore=iluvatar.ai/vcuda-core -n kube-system
```

> **NOTE:** The default resource names are:
> - `iluvatar.ai/vgpu` for GPU count
> - `iluvatar.ai/vcuda-memory` for memory allocation
> - `iluvatar.ai/vcuda-core` for core allocation
>
> You can customize these names using the parameters above.
:::note

The default resource names are:
- `iluvatar.ai/vgpu` for GPU count
- `iluvatar.ai/vcuda-memory` for memory allocation
- `iluvatar.ai/vcuda-core` for core allocation

You can customize these names using the parameters above.

:::

## Device Granularity

@@ -21,4 +21,8 @@ It contains the following metrics:
| GPUDeviceSharedNum | Number of containers sharing this GPU | `{deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",nodeid="aio-node67",zone="vGPU"}` 1 |
| vGPUPodsDeviceAllocated | vGPU Allocated from pods | `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 |

> **Note** This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.
:::note

This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.

:::
@@ -21,4 +21,8 @@ It contains the following metrics:
| GPUDeviceSharedNum | Number of containers sharing this GPU | `{deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",nodeid="aio-node67",zone="vGPU"}` 1 |
| vGPUPodsDeviceAllocated | vGPU Allocated from pods | `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 |

> **Note** This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.
:::note

This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.

:::
@@ -32,11 +32,15 @@ Install only gpushare-device-plugin, don't install gpu-scheduler-plugin package.

:::

> **NOTE:** The default resource names are:
> - `enflame.com/vgcu` for GCU count, only support 1 now.
> - `enflame.com/vgcu-percentage` for the percentage of memory and cores in a gcu slice.
>
> You can customize these names by modifying `hami-scheduler-device` configMap above.
:::note

The default resource names are:
- `enflame.com/vgcu` for GCU count, only support 1 now.
- `enflame.com/vgcu-percentage` for the percentage of memory and cores in a gcu slice.

You can customize these names by modifying `hami-scheduler-device` configMap above.

:::

* Set 'devices.enflame.enabled=true' when deploy HAMi

@@ -106,7 +110,11 @@ spec:
# ... rest of pod spec
```

> **NOTE:** The device ID format is `{node-name}-enflame-{index}`. You can find the available device IDs in the node status.
:::note

The device ID format is `{node-name}-enflame-{index}`. You can find the available device IDs in the node status.

:::

### Finding Device UUIDs

@@ -40,12 +40,16 @@ Install only gpu-manager, don't install gpu-admission package.
helm install hami hami-charts/hami --set scheduler.kubeScheduler.imageTag={your kubernetes version} --set iluvatarResourceMem=iluvatar.ai/vcuda-memory --set iluvatarResourceCore=iluvatar.ai/vcuda-core -n kube-system
```

> **NOTE:** The default resource names are:
> - `iluvatar.ai/vgpu` for GPU count
> - `iluvatar.ai/vcuda-memory` for memory allocation
> - `iluvatar.ai/vcuda-core` for core allocation
>
> You can customize these names using the parameters above.
:::note

The default resource names are:
- `iluvatar.ai/vgpu` for GPU count
- `iluvatar.ai/vcuda-memory` for memory allocation
- `iluvatar.ai/vcuda-core` for core allocation

You can customize these names using the parameters above.

:::

## Device Granularity

@@ -130,7 +134,11 @@ spec:
# ... rest of pod spec
```

> **NOTE:** The device ID format is `{node-name}-iluvatar-{index}`. You can find the available device IDs in the node status.
:::note

The device ID format is `{node-name}-iluvatar-{index}`. You can find the available device IDs in the node status.

:::

### Finding Device UUIDs

@@ -21,4 +21,8 @@ It contains the following metrics:
| GPUDeviceSharedNum | Number of containers sharing this GPU | `{deviceidx="0",deviceuuid="GPU-00552014-5c87-89ac-b1a6-7b53aa24b0ec",nodeid="aio-node67",zone="vGPU"}` 1 |
| vGPUPodsDeviceAllocated | vGPU Allocated from pods | `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 |

> **Note** This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.
:::note

This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.

:::
@@ -108,7 +108,11 @@ spec:
# ... rest of pod spec
```

> **NOTE:** The device ID format is `{node-name}-AWSNeuron-{index}`. You can find the available device IDs in the node annotations.
:::note

The device ID format is `{node-name}-AWSNeuron-{index}`. You can find the available device IDs in the node annotations.

:::

### Finding Device UUIDs

@@ -32,11 +32,15 @@ Install only gpushare-device-plugin, don't install gpu-scheduler-plugin package.

:::

> **NOTE:** The default resource names are:
> - `enflame.com/vgcu` for GCU count, only support 1 now.
> - `enflame.com/vgcu-percentage` for the percentage of memory and cores in a gcu slice.
>
> You can customize these names by modifying `hami-scheduler-device` configMap above.
:::note

The default resource names are:
- `enflame.com/vgcu` for GCU count, only support 1 now.
- `enflame.com/vgcu-percentage` for the percentage of memory and cores in a gcu slice.

You can customize these names by modifying `hami-scheduler-device` configMap above.

:::

* Set 'devices.enflame.enabled=true' when deploy HAMi

@@ -106,7 +110,11 @@ spec:
# ... rest of pod spec
```

> **NOTE:** The device ID format is `{node-name}-enflame-{index}`. You can find the available device IDs in the node status.
:::note

The device ID format is `{node-name}-enflame-{index}`. You can find the available device IDs in the node status.

:::

### Finding Device UUIDs

@@ -40,12 +40,16 @@ Install only gpu-manager, don't install gpu-admission package.
helm install hami hami-charts/hami --set scheduler.kubeScheduler.imageTag={your kubernetes version} --set iluvatarResourceMem=iluvatar.ai/vcuda-memory --set iluvatarResourceCore=iluvatar.ai/vcuda-core -n kube-system
```

> **NOTE:** The default resource names are:
> - `iluvatar.ai/vgpu` for GPU count
> - `iluvatar.ai/vcuda-memory` for memory allocation
> - `iluvatar.ai/vcuda-core` for core allocation
>
> You can customize these names using the parameters above.
:::note

The default resource names are:
- `iluvatar.ai/vgpu` for GPU count
- `iluvatar.ai/vcuda-memory` for memory allocation
- `iluvatar.ai/vcuda-core` for core allocation

You can customize these names using the parameters above.

:::

## Device Granularity

@@ -130,7 +134,11 @@ spec:
# ... rest of pod spec
```

> **NOTE:** The device ID format is `{node-name}-iluvatar-{index}`. You can find the available device IDs in the node status.
:::note

The device ID format is `{node-name}-iluvatar-{index}`. You can find the available device IDs in the node status.

:::

### Finding Device UUIDs

@@ -175,7 +175,11 @@ spec:
# ... rest of Pod configuration
```

> **Note:** Device ID format is `{BusID}`. You can find available device IDs in the node status.
:::note

Device ID format is `{BusID}`. You can find available device IDs in the node status.

:::

### Finding Device UUIDs

@@ -24,4 +24,8 @@ It contains the following metrics:
| QuotaUsed | resourcequota usage for a certain device | `{quotaName="nvidia.com/gpucores", quotanamespace="default",limit="200",zone="vGPU"}` 100 |
| vGPUPodsDeviceAllocated | vGPU Allocated from pods (This metric will be deprecated in v2.8.0, use vGPUMemoryAllocated and vGPUCoreAllocated instead.)| `{containeridx="Ascend310P",deviceusedcore="0",deviceuuid="aio-node74-arm-Ascend310P-0",nodename="aio-node74-arm",podname="ascend310p-pod",podnamespace="default",zone="vGPU"}` 3.221225472e+09 |

> **Note** This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.
:::note

This is the overview about device allocation, it is NOT device real-time usage metrics. For that part, see real-time device usage.

:::
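
Every hunk above makes the same mechanical change: a `> **Note:**` blockquote becomes a `:::note` admonition block. The source does not say how the edits were produced, so the following is only a minimal Python sketch of how such a conversion could be scripted, not the method used for this PR. The names `convert_notes` and `NOTE_START` are hypothetical, and the regex only covers the note spellings that appear in this diff (`**Note:**`, `**Note**`, `**NOTE:**`).

```python
import re

# Hypothetical pattern: matches the first line of a blockquote note,
# e.g. "> **Note:** text", "> **Note** text" or "> **NOTE:** text".
NOTE_START = re.compile(r"^>\s*\*\*note:?\*\*:?\s*(.*)$", re.IGNORECASE)


def convert_notes(markdown: str) -> str:
    """Rewrite blockquote notes as ':::note' admonition blocks."""
    lines = markdown.split("\n")
    out = []
    i = 0
    while i < len(lines):
        match = NOTE_START.match(lines[i])
        if match is None:
            # Not the start of a note: copy the line through unchanged.
            out.append(lines[i])
            i += 1
            continue
        # Keep the text after the bold marker, then collect every
        # following "> ..." continuation line of the same blockquote.
        body = [match.group(1)]
        i += 1
        while i < len(lines) and lines[i].startswith(">"):
            text = lines[i][1:]
            if text.startswith(" "):
                text = text[1:]  # drop the single space after ">"
            body.append(text)
            i += 1
        # Emit the admonition with blank lines around the body,
        # matching the layout of the added lines in this diff.
        out.extend([":::note", "", *body, "", ":::"])
    return "\n".join(out)
```

Calling `convert_notes` on the text of a Markdown file returns the converted text; ordinary blockquotes that do not start with a bold `Note` marker are left untouched.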