Offline language runtime on ESP32-C3: where does a flash-resident execution path fit relative to ExecuTorch? #18221
Alpha-Guardian started this conversation in Show and tell
Hi ExecuTorch folks,
I wanted to share a small on-device language-runtime experiment and ask how people here would think about it relative to the usual graph/runtime view of on-device AI.
We built a public demo line called Engram and deployed it on a commodity ESP32-C3.
Current public numbers:
Host-side benchmark capability:
- LogiQA = 0.392523
- IFEval = 0.780037

Published board proof:
- LogiQA 642: 249 / 642 = 0.3878504672897196
- host_full_match = 642 / 642
- 1,380,771 bytes

Important scope note:
This is not presented as unrestricted open-input native LLM generation on MCU.
The board-side path is closer to a flash-resident, table-driven runtime than to a general autoregressive decoder.
What makes this interesting to us is that it seems to sit somewhere between a standard dense-graph model executed by an operator runtime and a fully hand-specialized embedded program.
So I’m curious how people here would think about it in relation to ExecuTorch’s world.
If a language-task system is no longer best expressed as a standard dense graph executed by a familiar operator runtime, but instead as a highly specialized flash-resident execution path, does that still feel like “model deployment” in the usual sense? Or is it a different category entirely?
Repo:
https://github.com/Alpha-Guardian/Engram
Would love to hear any thoughts.