[Bug][Frontend] uint8 and bool dtypes silently downcast to int8 via MLIR_TO_DTYPE round-trip #238

@YWHyuk

Summary

PyTorchSimFrontend/mlir/mlir_common.py maps torch.uint8, torch.bool, and torch.int8 all to the same MLIR type "i8". The inverse table MLIR_TO_DTYPE["i8"] only points back to torch.int8, so any code that does an MLIR -> torch round-trip silently demotes uint8 and bool to int8.

Where it bites

mlir_codegen_backend.py:1536 (indirect indexing path):

dtype = mlir_common.MLIR_TO_DTYPE[var_info[1]]

The resolved dtype then drives the DTYPE_TO_C lookup for the SRAM buffer declaration. A bool mask or uint8 index that was lowered as "i8" comes back as int8, so the buffer is declared with the C type and value range of int8 instead of those of the original dtype.
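To make the range problem concrete, here is a minimal sketch of what happens to a uint8 value once its byte is read as int8. It assumes DTYPE_TO_C maps torch.int8 to a signed 8-bit C type, which the issue does not show; `as_int8` is a hypothetical helper written for this illustration only.

```python
def as_int8(byte_value):
    """Reinterpret an unsigned byte (0..255) as a two's-complement int8."""
    return int.from_bytes(bytes([byte_value]), "little", signed=True)


# Values up to 127 survive; anything above wraps to a negative number.
for value in (1, 127, 128, 200, 255):
    print(value, "->", as_int8(value))
```

A uint8 index of 200 would be read as -56 under this assumption, which is the kind of silent change described above.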

Tables (mlir_common.py:56-65)

DTYPE_TO_MLIR = {
    torch.bool:  "i8",
    torch.uint8: "i8",
    torch.int8:  "i8",
    ...
}
MLIR_TO_DTYPE = {
    "i8":  torch.int8,
    ...
}
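The lossy round-trip can be reproduced with the three entries shown above. This is a sketch, not the project's code: plain strings stand in for the torch dtype objects so it runs without torch, the elided `...` entries are left out, and `round_trip` and `lossy_dtypes` are hypothetical helpers.

```python
# Stand-in strings replace the real torch dtype objects (assumption for this sketch).
DTYPE_TO_MLIR = {
    "torch.bool": "i8",
    "torch.uint8": "i8",
    "torch.int8": "i8",
}
MLIR_TO_DTYPE = {
    "i8": "torch.int8",
}


def round_trip(dtype):
    """Lower a dtype to its MLIR tag, then resolve the tag back to a dtype."""
    return MLIR_TO_DTYPE[DTYPE_TO_MLIR[dtype]]


def lossy_dtypes(forward, inverse):
    """Return the dtypes that do not map back to themselves."""
    return sorted(d for d in forward if inverse.get(forward[d]) != d)


for d in DTYPE_TO_MLIR:
    print(d, "->", DTYPE_TO_MLIR[d], "->", round_trip(d))

# Dtypes that are silently changed by the round-trip.
print(lossy_dtypes(DTYPE_TO_MLIR, MLIR_TO_DTYPE))
```

A check like `lossy_dtypes` could also serve as a regression test over the full tables, since it flags any forward entry whose inverse is missing or different.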

Suggested fix

Either:

  1. Track the original torch dtype directly and don't round-trip through the MLIR string, or
  2. Add dedicated MLIR tags ("ui8", "i1") in DTYPE_TO_MLIR and extend MLIR_TO_DTYPE to match.
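Option 1 could look roughly like the sketch below: the record that currently carries the MLIR string also carries the original dtype, so nothing has to be recovered from "i8". The `VarInfo` layout, its `size` field, and `make_var_info` are hypothetical; only the fact that index 1 holds the MLIR type comes from the `var_info[1]` access quoted above. Strings stand in for torch dtypes so the sketch runs without torch.

```python
from typing import NamedTuple

# Stand-in strings replace the real torch dtype objects (assumption for this sketch).
DTYPE_TO_MLIR = {
    "torch.bool": "i8",
    "torch.uint8": "i8",
    "torch.int8": "i8",
}


class VarInfo(NamedTuple):
    """Hypothetical record; index 1 stays the MLIR type, as in var_info[1]."""
    size: int          # hypothetical first field
    mlir_type: str     # MLIR tag used for code generation
    torch_dtype: str   # original dtype, kept alongside the tag


def make_var_info(size, torch_dtype):
    """Build the record once, at the point where the torch dtype is still known."""
    return VarInfo(size, DTYPE_TO_MLIR[torch_dtype], torch_dtype)


info = make_var_info(16, "torch.uint8")
# Read the dtype directly instead of MLIR_TO_DTYPE[info[1]].
dtype = info.torch_dtype
print(info[1], dtype)
```

The MLIR tables stay untouched with this approach; the cost is threading the extra field through every place that builds the record.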

Option 2 is closer in style to the existing code but needs DTYPE_TO_C updated too.
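A sketch of option 2, under the same stand-in convention (strings instead of torch dtypes). The "ui8" and "i1" tags are the ones proposed above; the DTYPE_TO_C entries and `c_type_for_tag` are assumptions, since the issue does not show that table.

```python
# Stand-in strings replace the real torch dtype objects (assumption for this sketch).
DTYPE_TO_MLIR = {
    "torch.bool": "i1",
    "torch.uint8": "ui8",
    "torch.int8": "i8",
}

# Derive the inverse from the forward table so the two cannot drift apart.
MLIR_TO_DTYPE = {tag: dtype for dtype, tag in DTYPE_TO_MLIR.items()}

# If two dtypes shared a tag, the inverse would be smaller than the forward table.
assert len(MLIR_TO_DTYPE) == len(DTYPE_TO_MLIR), "MLIR tags must be unique"

# Hypothetical C type names; the real DTYPE_TO_C contents are not shown in the issue.
DTYPE_TO_C = {
    "torch.bool": "bool",
    "torch.uint8": "uint8_t",
    "torch.int8": "int8_t",
}


def c_type_for_tag(tag):
    """Resolve an MLIR tag to a C type via the recovered dtype."""
    return DTYPE_TO_C[MLIR_TO_DTYPE[tag]]


for tag in ("i1", "ui8", "i8"):
    print(tag, "->", MLIR_TO_DTYPE[tag], "->", c_type_for_tag(tag))
```

Any code that relies on bool or uint8 being emitted as a plain "i8" would need review before switching tags; the issue does not say how many such sites exist.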

Why this matters

This is a silent-corruption-class bug: no error, just wrong simulated cycles / wrong functional results whenever a kernel ends up needing the MLIR -> torch direction for a uint8/bool tensor.

Labels: bug
