
Update: convert all paged_attention examples from float16 to bfloat16 #490

Open

hw-native-sys-bot wants to merge 1 commit into hw-native-sys:main from hw-native-sys-bot:sync/a5-pa-dtype-bfloat16

Conversation


@hw-native-sys-bot hw-native-sys-bot commented Apr 9, 2026

Summary

  • Convert all four paged_attention example variants from float16/half to bfloat16/bfloat16_t:
    • a2a3/tensormap_and_ringbuffer/paged_attention
    • a5/tensormap_and_ringbuffer/paged_attention
    • a2a3/host_build_graph/paged_attention
    • a5/host_build_graph/paged_attention
  • Change half/float16 types to bfloat16_t/bfloat16 in all kernel files
  • Update golden.py dtype from "float16" to "bfloat16"
  • Rename f16/fp16 variables and comments to bf16
  • Replace qualified pto::Stride with unqualified Stride (matching the reference style)
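
As background for the dtype switch, bfloat16 keeps float32's 8-bit exponent but only 7 mantissa bits, while float16 has a 5-bit exponent and 10 mantissa bits. A minimal pure-Python sketch of that trade-off (the `to_bfloat16` helper is illustrative only, not code from this PR):

```python
import struct

def to_bfloat16(x: float) -> float:
    # bfloat16 is the top 16 bits of the IEEE-754 float32 encoding;
    # apply round-to-nearest-even on the discarded low 16 bits
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits = (bits + (0x7FFF + ((bits >> 16) & 1))) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# bfloat16 inherits float32's 8-bit exponent, so large values survive:
print(to_bfloat16(1e38))           # ~1e38, still finite

# float16 (5-bit exponent) tops out around 65504:
try:
    struct.pack("<e", 1e38)        # '<e' is IEEE-754 binary16
except OverflowError:
    print("1e38 overflows float16")

# the trade-off: only 7 mantissa bits, so 1 + 1/256 is a tie at
# half an ulp (ulp at 1.0 is 1/128) and rounds to even, i.e. 1.0
print(to_bfloat16(1.0 + 1 / 256))  # 1.0
```

The wider exponent range is the usual reason attention kernels prefer bfloat16: intermediate values can exceed float16's maximum without overflowing to infinity, at the cost of coarser precision.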

Testing

  • Simulation tests pass
  • Hardware tests pass on a2a3 device
  • Hardware tests pass on a5 device


@hw-native-sys-bot force-pushed the sync/a5-pa-dtype-bfloat16 branch from 17ba6df to b7547dd on April 9, 2026 02:08
@hw-native-sys-bot changed the title from "Update: sync a5 paged_attention dtype from float16 to bfloat16" to "Update: convert all paged_attention examples from float16 to bfloat16" on Apr 9, 2026
@hw-native-sys-bot force-pushed the sync/a5-pa-dtype-bfloat16 branch from b7547dd to 2d76a78 on April 9, 2026 06:09
Convert all four paged_attention example variants to use bfloat16:
- a2a3/tensormap_and_ringbuffer/paged_attention
- a5/tensormap_and_ringbuffer/paged_attention
- a2a3/host_build_graph/paged_attention
- a5/host_build_graph/paged_attention

Changes across all variants:
- Change half/float16 to bfloat16_t/bfloat16 in kernel files
- Update golden.py dtype from "float16" to "bfloat16"
- Rename f16/fp16 variables and comments to bf16
- Align pto::Stride to Stride (matching reference style)
@hw-native-sys-bot force-pushed the sync/a5-pa-dtype-bfloat16 branch from 2d76a78 to 44d638e on April 10, 2026 01:27