This is a compiler for the Mx* language, which is a mixture of C++, Java and Python.
It is the course lab for CS2966@SJTU (Compiler Design, Summer 2024).
This project depends on pybind11, among other things.
In order for the dominator module to work correctly, you need to build it via pybind11.
As a result, one notable setup step is installing the pybind11 headers. On Ubuntu, run `sudo apt-get install pybind11-dev`. On Windows, install MSVC and then install pybind11 manually, as described in this part of the official docs. Afterward, the build scripts should work.
Alternatively, a Dockerfile is provided so that you can build the project in a containerized environment. To build the image, run `docker build -t mxstar-compiler .` (note the trailing dot). To run the container on a source file, run `docker run -i mxstar-compiler < /path/to/your/input/file.mx`.
With the dependencies in place, `make build` should be enough to build the project.
- Dead Code Elimination (Naive DCE)
- Memory-to-Register Promotion (mem2reg)
- Global Variable Inlining
- Remove Unreachable Blocks
- Remove Critical Edges
- Reverse Post-Order Block Rearrangement
- Sparse Conditional Constant Propagation (SCCP)
- Global Value Numbering with Partial Redundancy Elimination (GVN-PRE)
- Copy Propagation
- Liveness Analysis
- MIR Construction
- Many small optimizations during assembly generation
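To illustrate one of the CFG passes listed above, here is a minimal sketch of critical edge removal. It is not the code in cfg_transform.py. The CFG encoding (a dict from block name to successor list), the function name `split_critical_edges`, and the naming scheme for inserted blocks are all assumptions made for this example.

```python
from collections import defaultdict


def split_critical_edges(cfg):
    """Return a copy of `cfg` in which every critical edge is split.

    `cfg` maps each block name to the list of its successors (hypothetical
    encoding). An edge u -> v is critical when u has more than one successor
    and v has more than one predecessor. Each such edge is rerouted through
    a fresh, empty block.
    """
    # Collect the predecessors of every block.
    preds = defaultdict(list)
    for block, succs in cfg.items():
        for succ in succs:
            preds[succ].append(block)

    new_cfg = {block: list(succs) for block, succs in cfg.items()}
    for block, succs in cfg.items():
        if len(succs) < 2:
            continue  # a single outgoing edge is never critical
        for i, succ in enumerate(succs):
            if len(preds[succ]) < 2:
                continue  # a single incoming edge is never critical
            mid = f"{block}_to_{succ}"  # hypothetical name for the new block
            new_cfg[block][i] = mid
            new_cfg[mid] = [succ]
    return new_cfg


# An if-without-else: the edge entry -> merge is critical.
cfg = {"entry": ["then", "merge"], "then": ["merge"], "merge": []}
result = split_critical_edges(cfg)
print(result)
```

Splitting these edges gives later passes, such as phi elimination, a block of their own on each edge in which to place copy instructions.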
Specify optimization levels using the -O flag:
python main.py -O O1 input.mx -o output.s # Standard optimization
python main.py -O O0 input.mx -o output.s # Minimal optimization
python main.py -O gvn_pre input.mx -o output.s # GVN-PRE specific optimizations
For comprehensive testing instructions, including LLVM IR testing, assembly testing, semantic analysis, and optimization level usage, please refer to the Testing Guide.
ACM-Compiler/
├── main.py # Main compiler entry point
├── mxc/ # Core compiler package
│   ├── frontend/ # Frontend components
│   │   ├── parser/ # ANTLR grammar files (MxLexer.g4, MxParser.g4)
│   │   ├── semantic/ # Semantic analysis (scope, type checking, syntax validation)
│   │   └── ir_generation/ # IR generation (IR builder, block chain)
│   ├── middle_end/ # Optimization passes
│   │   ├── cfg_transform.py # Control Flow Graph transformations
│   │   ├── dce.py # Dead Code Elimination
│   │   ├── mem2reg.py # Memory-to-Register promotion
│   │   ├── sccp.py # Sparse Conditional Constant Propagation
│   │   ├── gvn_pre.py # Global Value Numbering with Partial Redundancy Elimination
│   │   ├── liveness_analysis.py # Liveness analysis for register allocation
│   │   └── mir.py # Machine IR construction
│   ├── backend/ # Code generation
│   │   ├── asm_builder.py # Assembly code generation
│   │   ├── asm_repr.py # Assembly representation
│   │   ├── regalloc.py # Register allocation
│   │   └── operand.py # Operand handling
│   ├── common/ # Shared utilities
│   │   ├── dominator/ # Dominator tree analysis (C++ module with Python bindings)
│   │   ├── ir_repr.py # IR representation classes
│   │   └── renamer.py # Variable renaming utilities
│   ├── runtime/ # Runtime support
│   │   └── builtin.c # Built-in functions implementation
│   └── test/ # Unit tests and test utilities
├── testcases/ # Test suite
│   ├── sema/ # Semantic analysis tests
│   ├── codegen/ # Code generation tests
│   ├── optim/ # Optimization tests
│   ├── optim2/ # Additional optimization tests with I/O
│   └── demo/ # Demo programs
├── scripts/ # Build and test scripts
│   ├── antlr-build.bash # ANTLR parser generation
│   ├── pybind11-build.bash # pybind11 module compilation
│   └── test_*.bash # Testing scripts
├── requirements.txt # Python dependencies
├── Dockerfile # Container setup
├── Makefile # Build automation
├── README.md # This file
└── TESTING.md # Comprehensive testing guide
- Frontend: Handles lexical analysis, parsing, and semantic analysis of Mx* source code
- Middle-end: Implements various optimization passes including DCE, SCCP, GVN-PRE, and more
- Backend: Generates assembly code with register allocation and instruction selection
- Runtime: Provides built-in function implementations for the Mx* language
- Dominator Module: High-performance C++ implementation for dominator tree analysis with Python bindings
- Test Suite: Comprehensive tests covering semantic analysis, code generation, and optimizations
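For readers unfamiliar with dominator analysis, the sketch below computes dominator sets with the classic iterative data-flow formulation, in pure Python. It is an illustration only: the project's dominator module is written in C++ and may use a different algorithm. The names `compute_dominators` and `immediate_dominators` and the dict-based CFG are assumptions, and the sketch assumes every block appears as a key and is reachable from the entry.

```python
def compute_dominators(cfg, entry):
    """Map each block to the set of blocks that dominate it.

    `cfg` maps each block name to the list of its successors
    (hypothetical encoding, not the project's IR classes).
    """
    blocks = list(cfg)
    preds = {b: [] for b in blocks}
    for block, succs in cfg.items():
        for succ in succs:
            preds[succ].append(block)

    # Start from "every block dominates every block" and shrink to a fixed point.
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            incoming = [dom[p] for p in preds[b]]
            new = set.intersection(*incoming) if incoming else set()
            new = new | {b}
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def immediate_dominators(dom, entry):
    """Derive the dominator tree (child -> parent) from the dominator sets."""
    idom = {}
    for block, doms in dom.items():
        strict = doms - {block}
        if block == entry or not strict:
            continue
        # The closest strict dominator is the one with the largest dominator set.
        idom[block] = max(strict, key=lambda d: len(dom[d]))
    return idom


# A diamond-shaped CFG: entry branches to a and b, which both reach merge.
cfg = {"entry": ["a", "b"], "a": ["merge"], "b": ["merge"], "merge": []}
dom = compute_dominators(cfg, "entry")
idom = immediate_dominators(dom, "entry")
print(idom)
```

Passes such as mem2reg rely on this information to decide where phi nodes are needed, so this analysis is a natural candidate for a native implementation.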