Axio is a modular compiler prototype for a C-like language, implemented in C++ with a build pipeline that targets LLVM IR, native object files, and final binaries.
The repository is structured so that lexer, parser, AST, metaprogramming passes, diagnostics, and LLVM lowering stay isolated. The goal is to make new language features land in focused modules instead of turning the compiler into a single giant file.
- CMake project with LLVM integration and optional re2c-based lexer generation
- C-like surface syntax for struct, extern, functions, local variables, integer arithmetic, string literals, line/block comments, pointer syntax, address-of, dereference, return, defer, and inline LLVM functions
- AST-driven parsing with a clean separation between lexing, parsing, meta passes, and code generation
- explicit module loading with public interfaces, selective imports, and re-exports
- semantic validation for const globals, locals, and parameters
- Diagnostic engine with source ranges, line and column rendering, and caret markers
- Simple metaprogramming hooks:
  - annotations such as @inline
- LLVM backend that emits .ll, .o, and a linked binary
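
The diagnostic rendering listed above (source ranges, line and column positions, caret markers) can be illustrated with a minimal C++ sketch. The names `SourceRange`, `LineColumn`, `locate`, and `render` are hypothetical and are not taken from the Axio sources; the sketch also assumes the range stays on one line and that tabs are not expanded.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical types for illustration; Axio's real diagnostic API may differ.
struct SourceRange {
    std::size_t begin; // byte offset of the first character, inclusive
    std::size_t end;   // byte offset one past the last character
};

struct LineColumn {
    std::size_t line;   // 1-based
    std::size_t column; // 1-based
};

// Convert a byte offset into a 1-based line/column pair by counting newlines.
LineColumn locate(const std::string& source, std::size_t offset) {
    assert(offset <= source.size());
    LineColumn pos{1, 1};
    for (std::size_t i = 0; i < offset; ++i) {
        if (source[i] == '\n') {
            ++pos.line;
            pos.column = 1;
        } else {
            ++pos.column;
        }
    }
    return pos;
}

// Render a header line, the offending source line, and a caret marker line.
std::string render(const std::string& file, const std::string& source,
                   SourceRange range, const std::string& message) {
    assert(range.begin < range.end && range.end <= source.size());
    const LineColumn pos = locate(source, range.begin);

    // Find the bounds of the line that contains range.begin.
    std::size_t lineStart = 0;
    if (range.begin > 0) {
        const std::size_t nl = source.rfind('\n', range.begin - 1);
        if (nl != std::string::npos) lineStart = nl + 1;
    }
    std::size_t lineEnd = source.find('\n', range.begin);
    if (lineEnd == std::string::npos) lineEnd = source.size();

    // Clamp the marker to the first line of the range; always draw one caret.
    const std::size_t stop = std::min(range.end, lineEnd);
    const std::size_t width = stop > range.begin ? stop - range.begin : 1;

    std::string out = file + ":" + std::to_string(pos.line) + ":" +
                      std::to_string(pos.column) + ": error: " + message + "\n";
    out += source.substr(lineStart, lineEnd - lineStart) + "\n";
    out += std::string(range.begin - lineStart, ' '); // pad up to the column
    out += '^';
    out += std::string(width - 1, '~'); // underline the rest of the range
    out += "\n";
    return out;
}
```

Calling `render` with a one-character range yields a three-line message: the `file:line:column` header, the source line, and a padded `^` under the offending character.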
- Comments:
  - // comments run until the next newline
  - /* ... */ comments can span multiple lines
- Functions without an explicit return type are treated as void
- return is optional
- a bare return is allowed
- defer call() runs the deferred call when the current scope exits, in LIFO order
- while, for, do { ... } while ..., and switch are supported
- switch cases do not fall through into the next case
- reaching the end of a case body exits the switch; use break only when you want to exit early from inside nested control flow in that case
- break exits the nearest loop or switch
- continue advances the nearest loop and is rejected inside switch
- for non-flag enums, a switch without default must cover every enum case
- enum-switch exhaustiveness diagnostics list the concrete missing enum members when they can be determined
- overlapping constant switch cases are rejected during semantic analysis
- switch patterns must be compile-time constants
- dense value-only switches lower through LLVM switch
- fn llvm name(...) ... { ... } lets you define a function body directly in LLVM IR
- the body must contain only LLVM IR, no Axio statements
- the inline LLVM body is parsed and verified before Axio accepts it
- inline LLVM currently supports exactly one return value
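
The enum-switch rules above (exhaustiveness without default, listing the missing members, rejecting overlapping constant cases) can be sketched in C++ roughly as follows. This is an illustration only: `EnumDecl`, `SwitchReport`, and `checkEnumSwitch` are hypothetical names, and the real semantic analysis works on AST nodes rather than plain integer lists.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of an enum declaration: member names with their values.
struct EnumDecl {
    std::vector<std::pair<std::string, std::int64_t>> members;
};

struct SwitchReport {
    std::vector<std::string> missing;     // enum members that no case covers
    std::vector<std::int64_t> duplicates; // case values that appear more than once
};

// Check the constant case values of a switch over a non-flag enum.
// A default arm makes the switch exhaustive regardless of the listed cases.
SwitchReport checkEnumSwitch(const EnumDecl& decl,
                             const std::vector<std::int64_t>& caseValues,
                             bool hasDefault) {
    assert(!decl.members.empty());
    SwitchReport report;
    std::set<std::int64_t> seen;
    for (std::int64_t value : caseValues) {
        // insert() reports whether the value was new; a repeat is an overlap.
        if (!seen.insert(value).second) report.duplicates.push_back(value);
    }
    if (!hasDefault) {
        // Collect the concrete member names so the diagnostic can list them.
        for (const auto& member : decl.members) {
            if (seen.count(member.second) == 0) report.missing.push_back(member.first);
        }
    }
    return report;
}
```

A switch is accepted under these assumptions when both `missing` and `duplicates` come back empty; otherwise each vector provides the details for one diagnostic.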
Requirements:
- CMake 3.28+
- LLVM development package with LLVMConfig.cmake
- C++20 compiler
- clang++ for final linking
- re2c optional for lexer regeneration
Build commands:
cmake -S . -B build
cmake --build build -j

Axio uses Doxygen-compatible inline comments in public headers and key internal workflow headers.
The generated docs are not limited to public facades. Private methods and core internal workflow helpers are documented too, so you can inspect what a method exists for even when it is not part of the external API.
If doxygen is installed, generate browsable HTML and machine-readable XML:
make docs

Outputs:
- docs/api/html/index.html
- docs/api/xml/
See docs/API_DOCS.md for details on coverage and intended usage.
Compile a source file to LLVM IR, object file, and native binary:
./build/axc examples/hello.ax -o build/hello

Then run the produced program:
./build/hello

Generate only LLVM IR:
./build/axc examples/hello.ax --emit-llvm-only -o build/hello

See examples/hello.ax and examples/assets/banner.txt.
- include/axc/Support, src/Support
  - source management and human-readable diagnostics
- include/axc/Lex, src/Lex
  - token model and lexer, with src/Lex/Grammar/Lexer.re as the re2c grammar seed
- include/axc/AST
  - syntax tree definitions
- include/axc/Parse, src/Parse
  - recursive descent parser split by declarations, statements, and expression families
- src/Driver/Module
  - single-file parsing, module interfaces, import binding rules, qualification, and recursive loading
- include/axc/Sema, src/Sema
  - semantic validation for symbols, const rules, ownership checks, and class/member usage
- include/axc/Meta, src/Meta
  - annotation validation pipeline and future pass insertion points
- include/axc/Codegen, src/Codegen
  - LLVM IR lowering split by declarations, expressions, statements, and module workflow
- include/axc/Driver, src/Driver
  - compiler orchestration and CLI-facing pipeline control
- each file declares its package path explicitly at the top:
package math.ops
- pub fn, pub struct, pub class, pub enum, pub const, pub let
  - exported as part of the package interface
- declarations without pub
  - private to their defining package file set
- bare import of a package injects its exported names into the current local scope
- import blocks are supported
- aliases are supported
- pub import foo.bar{Name} remains available as an explicit re-export form
Examples:
package app.main

import (
    math.ops
    pt geom.point
)

fn main() int {
    let p pt.Point = pt.Point(1, 2)
    return add(p.x, p.y)
}
The package interface fingerprint is derived from the exported API surface, so private implementation changes can stay below that boundary in future incremental builds.
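
One way to picture such a fingerprint is a hash over the sorted signatures of exported declarations only. The C++ sketch below is an assumption about how this could work, not Axio's actual scheme: `Decl`, `fnv1a`, and `interfaceFingerprint` are hypothetical names, and FNV-1a is used purely because it is short to write down.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical declaration record; the real compiler works on AST nodes.
struct Decl {
    std::string signature; // e.g. "fn add(int, int) int"
    bool isPublic;         // true for pub declarations
};

// 64-bit FNV-1a over a byte string, continuing from a running hash value.
std::uint64_t fnv1a(std::uint64_t hash, const std::string& bytes) {
    for (unsigned char c : bytes) {
        hash ^= c;
        hash *= 0x100000001b3ULL; // FNV prime
    }
    return hash;
}

// Hash only the exported surface, so private declarations cannot affect it.
std::uint64_t interfaceFingerprint(const std::vector<Decl>& decls) {
    std::vector<std::string> exported;
    for (const Decl& d : decls) {
        assert(!d.signature.empty());
        if (d.isPublic) exported.push_back(d.signature);
    }
    // Sort so the result does not depend on declaration order.
    std::sort(exported.begin(), exported.end());

    std::uint64_t hash = 0xcbf29ce484222325ULL; // FNV offset basis
    for (const std::string& sig : exported) {
        hash = fnv1a(hash, sig);
        hash = fnv1a(hash, "\n"); // separator keeps adjacent signatures distinct
    }
    return hash;
}
```

Under this sketch, editing or reordering private helpers leaves the fingerprint unchanged, while adding, removing, or changing a pub signature alters it, which is the property dependents would key their rebuilds on.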
- const globals are immutable
- const local bindings are immutable inside their scope
- const function parameters cannot be reassigned
- these checks run in semantic analysis before code generation
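
The const rules can be sketched as a scoped symbol table that records whether each binding is const and rejects assignments to const names. This C++ sketch is illustrative: the `Scopes` class and its methods are hypothetical, and it assumes inner scopes are searched before outer ones.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical scoped symbol table; each frame maps a name to its const flag.
class Scopes {
public:
    Scopes() { push(); } // frame 0 holds globals

    void push() { frames_.emplace_back(); }

    void pop() {
        assert(frames_.size() > 1); // the global frame is never popped
        frames_.pop_back();
    }

    // Record a global, local, or parameter binding in the innermost frame.
    void declare(const std::string& name, bool isConst) {
        frames_.back()[name] = isConst;
    }

    // Returns an empty string when the assignment is allowed,
    // otherwise the text of the diagnostic to report.
    std::string checkAssign(const std::string& name) const {
        // Search from the innermost frame outwards.
        for (auto it = frames_.rbegin(); it != frames_.rend(); ++it) {
            auto found = it->find(name);
            if (found == it->end()) continue;
            if (found->second) return "cannot assign to const binding '" + name + "'";
            return std::string();
        }
        return "use of undeclared name '" + name + "'";
    }

private:
    std::vector<std::unordered_map<std::string, bool>> frames_;
};
```

Because the check needs only names and flags, it can run entirely before code generation, which matches the ordering described above.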
The current codebase establishes the extension points for a multi-stage language model:
- preprocessing stage
- annotation-driven AST transforms
- LLVM IR transformation stage
Right now the repository contains the skeleton and first working examples for annotation validation and later pass insertion. The next step is to promote those hooks into a formal plugin/pass API.
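
As a rough picture of what a formal pass API might look like, the C++ sketch below registers passes against the three stages named above and runs them in stage order. Everything here is hypothetical: `Stage`, `Pass`, and `PassPipeline` are invented for illustration, and a log of strings stands in for the AST or LLVM module a real pass would transform.

```cpp
#include <cassert>
#include <functional>
#include <initializer_list>
#include <string>
#include <utility>
#include <vector>

// The three stages listed above, in execution order.
enum class Stage { Preprocess, AstTransform, LlvmTransform };

// Hypothetical pass record. A real pass would receive the AST or the LLVM
// module; a log of strings stands in for that state here.
struct Pass {
    std::string name;
    Stage stage;
    std::function<void(std::vector<std::string>&)> run;
};

class PassPipeline {
public:
    void add(Pass pass) {
        assert(static_cast<bool>(pass.run)); // a pass needs a callable body
        passes_.push_back(std::move(pass));
    }

    // Run every pass stage by stage; registration order is kept within a stage.
    void runAll(std::vector<std::string>& log) const {
        for (Stage stage : {Stage::Preprocess, Stage::AstTransform, Stage::LlvmTransform}) {
            for (const Pass& pass : passes_) {
                if (pass.stage == stage) pass.run(log);
            }
        }
    }

private:
    std::vector<Pass> passes_;
};
```

Keying passes by stage rather than by registration order means a plugin can be loaded at any time and still run at the right point in the pipeline.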
The performance target is realistic only if the pipeline is designed around parallel front-end work and incremental compilation. This repository lays the modular groundwork, but it is intentionally still a prototype.
The most important future upgrades are:
- parallel parsing and semantic analysis per file or per package
- incremental dependency graph and cached intermediate artifacts
- arena allocation and interned identifiers for lower front-end overhead
- richer recovery parser so diagnostics keep flowing after the first failures
- a dedicated semantic phase with type checking and symbol indexing
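
To illustrate the interned-identifier idea from the list above, here is a minimal C++ sketch of a string interner. It is not part of the current codebase: the `Interner` class is hypothetical, and it stores each name twice for simplicity where an arena-backed version would store it once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical identifier interner: each distinct spelling gets one small
// integer id, so later phases compare ids instead of comparing strings.
class Interner {
public:
    using Id = std::uint32_t;

    // Return the existing id for text, or assign the next free id.
    Id intern(const std::string& text) {
        auto found = ids_.find(text);
        if (found != ids_.end()) return found->second;
        const Id id = static_cast<Id>(names_.size());
        names_.push_back(text);
        ids_.emplace(text, id);
        return id;
    }

    // Map an id back to its spelling, for diagnostics and symbol dumps.
    const std::string& name(Id id) const {
        assert(id < names_.size());
        return names_[id];
    }

    std::size_t size() const { return names_.size(); }

private:
    std::unordered_map<std::string, Id> ids_;
    std::vector<std::string> names_;
};
```

With ids in place, symbol-table keys and AST name fields shrink to 32-bit integers, and equality checks on identifiers become integer comparisons.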
GPU acceleration is usually not the first thing that helps a compiler. For most compilers, data layout, caching, dependency scheduling, and parallel front-end execution matter far more.