9 changes: 2 additions & 7 deletions .pre-commit-config.yaml
@@ -30,6 +30,7 @@ repos:
rev: v4.0.0-alpha.8
hooks:
- id: prettier
args: [--cache-location, /tmp/prettier-cache]
stages:
- pre-commit
- repo: https://github.com/pre-commit/pre-commit-hooks
@@ -50,14 +51,8 @@ repos:
exclude: ^(i18n/en/code.json|i18n/en/docusaurus-plugin-content-blog/options.json|i18n/en/docusaurus-plugin-content-docs/current.json|i18n/en/docusaurus-theme-classic/footer.json|i18n/en/docusaurus-theme-classic/navbar.json)$
- id: trailing-whitespace
args: ["--markdown-linebreak-ext=md,mdx"]
- repo: https://github.com/pre-commit/mirrors-prettier
rev: v4.0.0-alpha.8
hooks:
- id: prettier
stages:
- pre-commit
- repo: https://github.com/crate-ci/typos
rev: v1
rev: v1.46.1
hooks:
- id: typos
- repo: https://github.com/milin/gitown
1 change: 0 additions & 1 deletion contents
Submodule contents deleted from 6209d4
9 changes: 7 additions & 2 deletions docs/common/ai/cubie/_g2d-usage-guide.mdx
@@ -2,10 +2,13 @@

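G2D is a 2D graphics hardware accelerator integrated into Allwinner SoCs. It handles image rotation, scaling, format conversion, color fill, and similar operations.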

:::info
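G2D is not enabled by default in the current release. If you need it, configure it with the commands below.
:::tip
Newer images enable G2D by default, so you can skip to [Verify driver status](#验证驱动状态) below to confirm. On an older image, expand the section below to install it manually.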
:::

<details>
<summary>Older images: install the G2D kernel manually</summary>

<NewCodeBlock tip="Device" type="device">

```bash
@@ -19,6 +22,8 @@ sudo reboot

</NewCodeBlock>

</details>

Typical use cases:

- Image preprocessing before and after video encoding/decoding (scaling, color space conversion)
185 changes: 54 additions & 131 deletions docs/nio/nio12l/ubuntu/npu-usage/env-setup.md
@@ -1,11 +1,25 @@
---
sidebar_position: 2
description: How to install the Neuron SDK
description: How to set up the NPU development environment
---

# Device environment setup
# NPU environment setup

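Setting up the on-device NPU environment on Ubuntu is simple. Run the following command in a terminal:
The NIO 12L is built on the MediaTek MT8395 (Genio 1200), which integrates an APU (AI Processor Unit) and an AI Accelerator (Aia). The NeuroPilot software stack lets Ubuntu use this hardware acceleration for AI inference.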

## Install the firmware and runtime

First, install the APU firmware and the Neuron runtime:

```bash
sudo apt install mediatek-apusys-firmware-genio1200
sudo apt install mediatek-libneuron mediatek-neuron-utils mediatek-libneuron-dev
sudo reboot
```
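
As a quick sanity check after these installs, the sketch below reports which of the packages named above are still missing. It assumes a Debian-style system where `dpkg-query` is available. `is_installed` and `report_missing` are hypothetical helper names chosen for illustration, not part of any MediaTek tooling.

```bash
# Hypothetical helper: succeeds if dpkg reports the package as fully installed.
is_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}

# Hypothetical helper: prints every package in the argument list that is not installed.
report_missing() {
    for pkg in "$@"; do
        is_installed "$pkg" || echo "$pkg"
    done
}

# Example (run on the device):
#   report_missing mediatek-apusys-firmware-genio1200 mediatek-libneuron \
#       mediatek-neuron-utils mediatek-libneuron-dev
```

Empty output means every listed package is installed.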

## Install the Neuron SDK

After rebooting, install the Neuron SDK:

<NewCodeBlock tip="Device" type="device">

@@ -15,150 +29,59 @@ snap install mtk-neuropilot --edge

</NewCodeBlock>

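The --edge channel publishes the latest packages and may be updated more often than the stable channel. For a stable production environment, check the Snap Store for each version's channels and official support status
The `--edge` channel publishes the latest packages and may be updated more often than the stable channel. For a stable production environment, check the Snap Store for each version's channels and official support status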

---
## Verify the installation

### vpu5_test

```bash
sudo vpu5_test -a ks -l 10
```

The test output should show `PASS`.
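
To script this check, one option is to pipe the test output through a small filter. This is a minimal sketch that assumes only what is stated above, namely that a passing run prints `PASS` somewhere in its output. The exact log layout of `vpu5_test` is not shown here, and `check_pass` is a hypothetical helper name.

```bash
# Hypothetical helper: reads a test log on stdin and succeeds
# only if at least one line contains the string PASS.
check_pass() {
    grep -q 'PASS'
}

# Example (run on the device, requires the APU hardware):
#   sudo vpu5_test -a ks -l 10 | check_pass && echo "APU OK"
```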

After mtk-neuropilot is installed, verify the installation in a terminal:
### ncc-tflite

`ncc-tflite` is a command-line tool for converting models from tflite format to dla format.
`ncc-tflite` converts models from tflite format to dla format:

<NewCodeBlock tip="Device" type="device">

```bash
$ ncc-tflite

Usage:
ncc-tflite [OPTION...] filename

--verify Force tflite model verification
--no-verify Bypass tflite model verification
-d, --dla-file <file> Specify a filename for the output DLA file
--check-target-only Check target support and exit
--resize <dims,...> Specify a list of input dimensions for resizing
(e.g., 1x3x5,2x4x6)
-s, --show-tflite Show tensors and nodes in the tflite model
--show-io-info Show input and output tensors of the tflite
model
--show-builtin-ops Show available builtin operations and exit
--show-mtkext-ops Show available MTKEXT operations and exit
--verbose Enable verbose mode
--version Output version information and exit
--help Display this help and exit
-e, --exec Enable execution (inference) mode
-i, --input <file,...> Specify a list of input files for inference
-o, --output <file,...> Specify a list of output files for inference
--arch <name,...> Specify a list of target architecture names
--platform <name> Platform preference as hint for compilation
-O, --opt <level> Specify which optimization level to use:
[0]: no optimization
[1]: enable basic optimization for fast codegen
[2]: enable most optimizations
[3]: enable -O2 with other optimizations that
take longer compilation time (default: 2)
--opt-accuracy Optimize for accuracy
--opt-aggressive Enable optimizations that may lose accuracy
--opt-bw Optimize for memory bandwidth
--opt-footprint Optimize for memory footprint
--opt-size Optimize for size, including code and static
data
--relax-fp32 Run fp32 models using fp16
--l1-size-kb <size> Hint the size of L1 memory (default: 0)
--l2-size-kb <size> Hint the size of L2 memory (default: 0)
--suppress-input Suppress input data conversion
--suppress-output Suppress output data conversion
--gen-debug-info Produce debugging information in the DLA file.
Runtime can work with this info for profiling
--show-exec-plan Show execution plan
--show-memory-summary Show memory allocation summary
--dla-metadata <key1:file1,key2:file2,...>
Specify a list of key:file pairs for DLA
metadata
--disallow-bridge Report error if bridging is needed
--avoid-reorder Keep execution order during graph optimization
if possible
--extract-static-data <filename>
Extract static parameters into file and make
them as input tensors
--intval-color-fast Disable exhaustive search in interval coloring
--show-l1-req Show the requirement for L1 without dropping.
Only effective when global buffer allocation is
in effect
--int8-to-uint8 Convert data types from INT8 to UINT8
--fc-to-conv Convert Fully Connected to Conv2D
--decompose-qlstmv2 Decompose QLSTM V2 to sub-OPs
--stable-linearize Stable linearize NIR (respect the input NIR
order), making layer order predictable
--rewrite-pattern <pattern1,pattern2,...>
Specify a list of patterns to be rewritten if
matched in a graph.
Use --rewrite-pattern=? to show available
patterns
--sink-concat Sink concat operations if possible
--reshape-to-4d Reshape tensor to 4D if possible

aps options:
--aps-cbfc-vids <vid,...>
Provide idle CBFC vids for APS internal use.
(e.g., 0,1)
--aps-ext-datatype Enable more datatype support for extension.

gno options:
--gno <opt1,opt2,...> Specify a list of graphite neuron optimizations.
Available options: NDF, SMP, BMP
--basic-tiling Enable basic tiling

gpu options:
--cltuner-file <path> An output file path for CL tuner that generates
optimization settings (default:
/vendor/etc/armnn_app.config)
--cltuning-mode <mode> Set the tuning level of CL tuner (default: -1)
--cmdl-dir <path> An output directory for CmdL that dumps infos
--clprofile Enable CmdL clprofile
--clfinish Enable CmdL clfinish

mdla options:
--num-mdla <num> Use numbers of MDLA cores (default: 1)
--mdla-bw <num> Hint MDLA bandwidth (MB/s) (default: 10240)
--mdla-freq <num> Hint MDLA frequency (MHz) (default: 960)
--mdla-wt-to-l1 Hint MDLA try to put weight into L1
--mdla-wt-pruned The weight of given model has been pruned
--prefer-large-acc <num> Use large accumulator to improve accuracy
--use-sw-dilated-conv Use software dilated convolution
--use-sw-deconv Convert DeConvolution to Conv2Ds
--req-per-ch-conv Requant invalid per-channel convs
--trim-io-alignment Trim the model IO alignment

mvpu options:
--mvpu-algo <bit_flags> The selection of MVPU algo libraries:
[bit 0]: NN
[bit 1]: CV
[bit 16~17]: custom (default: 0xFFFFFFFF)
--mvpu-l1-heuristic <percentage>
Hint the percentage of L1 memory usage:
The range is from 0 to 100 (default: 75)
--mvpu-l2-limit <size> Hint the maximum L2(TCM) usage size limit
(default: 1048576)
--mvpu-disable-cycle-mem-opt
Disable MVPU cycle memory optimization
--num-mvpu <num> Use numbers of MVPU cores (default: 1)

vpu options:
--dual-vpu Use dual VPU
ncc-tflite --help
```

</NewCodeBlock>

---
### neuronrt

`neuronrt` can run dla models to test model integrity.
`neuronrt` runs dla models and verifies their integrity:

<NewCodeBlock tip="Device" type="device">

```bash
$ neuronrt -v
INFO: dlopen libneuronusdk_runtime.mtk.so
Version: 6.3.3
neuronrt -v
```

</NewCodeBlock>

## Run the benchmark

```bash
sudo mkdir -p /usr/share/benchmark_dla
sudo cp /usr/share/neuropilot/benchmark_dla/* /usr/share/benchmark_dla/
sudo apt install python3-pip
sudo pip3 install numpy
sudo python3 /usr/share/benchmark_dla/benchmark.py --auto
```

## Supported model formats

- **TFLite** (`.tflite`)
- **ONNX** (`.onnx`): must be loaded through the Neuron API
- **Caffe** (`.caffemodel`)
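
The list above can be turned into a simple pre-flight check on a model file's extension. This is only a sketch: `model_format` is a hypothetical helper, and it looks at the file name alone, not at the file contents or at what the toolchain actually accepts.

```bash
# Hypothetical helper: maps a model file name to one of the formats listed above.
# Prints "unsupported" and returns non-zero for any other extension.
model_format() {
    case "$1" in
        *.tflite)     echo "TFLite" ;;
        *.onnx)       echo "ONNX" ;;
        *.caffemodel) echo "Caffe" ;;
        *)            echo "unsupported"; return 1 ;;
    esac
}

# Example:
#   model_format ssd_mobilenet_v1_coco_quantized.tflite
```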

For more information, see the official MediaTek documentation:

- [NeuroPilot home page](https://neuropilot-developer.mediatek.com/resources/public/latest/en/docs/readme)
- [Neuron SDK documentation](https://mediatek.gitlab.io/aiot/doc/aiot-dev-guide/release/v25.0/sw/yocto/ml-guide/ml-neuron-sdk.html)
67 changes: 1 addition & 66 deletions docs/nio/nio12l/ubuntu/ubuntu-user-guide.md
@@ -366,69 +366,4 @@ $ sudo apt update && sudo apt install -y qtcreator

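## Using NeuroPilot (APU)

The NIO 12L is built on the MediaTek MT8395 (Genio 1200), which integrates an APU (AI Processor Unit) and an AI Accelerator (Aia). The NeuroPilot software stack lets Ubuntu use this hardware acceleration for AI inference.

### Install the NeuroPilot components

First, install the firmware package for your platform: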

```bash
# Genio 1200 (used by the NIO 12L)
sudo apt install mediatek-apusys-firmware-genio1200
```

Then install the Neuron runtime packages:

```bash
sudo apt install mediatek-libneuron mediatek-neuron-utils mediatek-libneuron-dev
sudo reboot
```

### Verify that the APU works

After rebooting, run the following command to verify:

```bash
sudo vpu5_test -a ks -l 10
```

The test output should show `PASS`.

### Run the benchmark example

Run the benchmark example provided by MediaTek:

```bash
# Create the workaround directory
sudo mkdir -p /usr/share/benchmark_dla
sudo cp /usr/share/neuropilot/benchmark_dla/* /usr/share/benchmark_dla/

# Install dependencies
sudo apt install python3-pip
sudo pip3 install numpy

# Run the benchmark
sudo python3 /usr/share/benchmark_dla/benchmark.py --auto
```

Sample output:

```text
root@mtk-genio:/usr/share/benchmark_dla# python3 benchmark.py --auto
2023-07-31 07:04:19,029 [INFO] ssd_mobilenet_v1_coco_quantized.tflite, mdla3.0, avg inference time: 2.53
2023-07-31 07:04:24,499 [INFO] ssd_mobilenet_v1_coco_quantized.tflite, vpu, avg inference time: 46.14
...
```
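
To compare backends, log lines of this shape can be reduced to one row per model and backend. The sketch below assumes every result line follows the `<timestamp> [INFO] <model>, <backend>, avg inference time: <value>` layout of the sample above. `summarize_benchmark` is a hypothetical helper name, and the unit of the value is whatever the benchmark reports, since the sample does not state it.

```bash
# Hypothetical helper: reads benchmark log text on stdin and prints
# "model<TAB>backend<TAB>avg_time" for each result line.
summarize_benchmark() {
    awk -F', ' '/avg inference time:/ {
        n = split($1, parts, " ")      # model name is the last token before the first ", "
        split($3, kv, ": ")            # kv[2] holds the value after "avg inference time: "
        printf "%s\t%s\t%s\n", parts[n], $2, kv[2]
    }'
}

# Example (run on the device):
#   sudo python3 /usr/share/benchmark_dla/benchmark.py --auto 2>&1 | summarize_benchmark
```

Splitting on a comma followed by a space keeps the comma inside the timestamp (`19,029`) from being treated as a field separator.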

### Supported model formats

The NeuroPilot Neuron SDK mainly supports the following model formats:

- **TFLite** (.tflite)
- **ONNX** (.onnx): must be loaded through the Neuron API
- **Caffe** (.caffemodel)

For more details, see the official MediaTek documentation:

- [NeuroPilot home page](https://neuropilot-developer.mediatek.com/resources/public/latest/en/docs/readme)
- [Neuron SDK documentation](https://mediatek.gitlab.io/aiot/doc/aiot-dev-guide/release/v25.0/sw/yocto/ml-guide/ml-neuron-sdk.html)
See [NPU environment setup](./npu-usage/env-setup).
Original file line number Diff line number Diff line change
@@ -2,10 +2,13 @@

G2D is a 2D graphics hardware accelerator integrated into Allwinner SoCs, responsible for image rotation, scaling, format conversion, color filling and other operations.

:::info
G2D is not enabled by default in the current version. If needed, configure according to the commands below.
:::tip
New images already include G2D by default. Skip to [Verify Driver Status](#verify-driver-status) below to confirm. For older images, expand below for manual installation.
:::

<details>
<summary>Older images: Manual G2D kernel installation</summary>

<NewCodeBlock tip="Device" type="device">

```bash
@@ -19,6 +22,8 @@ sudo reboot

</NewCodeBlock>

</details>

Typical application scenarios:

- Image preprocessing (scaling, color space conversion) before and after video encoding/decoding