A Java-based interpreter for the Sparrow intermediate representation language - a low-level, assembly-like language designed for educational purposes in compiler construction.
The Sparrow interpreter is a two-pass execution engine that processes intermediate representation code. It simulates a runtime environment with:
- Stack-based function calls
- Heap memory allocation
- Jump-based control flow (goto/if-goto)
- First-class function pointers
- Pointer arithmetic for array operations
This project was developed as part of CMSI-585 (Compiler Construction).
- Two-Pass Execution: First pass collects metadata (labels, functions, parameters), second pass executes instructions
- Function Calling: Support for nested function calls with function pointers
- Memory Management: Separate stack (per-function scope) and heap (global allocation)
- Control Flow: Labels, unconditional jumps (goto), and conditional jumps (if-goto)
- Type System: INTEGER, STRING, FUNCTION, POINTER, NULL
- Error Handling: Null pointer checks and custom error messages
- Java 8 (JavaSE-1.8) or higher
- JavaCC (included in
misc/javacc.jar) - JTB (Java Tree Builder, included in
misc/jtb.jar) - Parser source code (in
src/parse/java/)
Programs consist of function declarations with instructions:
func FunctionName(param1 param2 ...)
instruction1
instruction2
...
return return_value
| Instruction | Syntax | Description |
|---|---|---|
| SetInteger | var = 5 |
Assign integer literal |
| SetFuncName | var = @FunctionName |
Store function pointer |
| Add | result = var1 + var2 |
Addition |
| Subtract | result = var1 - var2 |
Subtraction |
| Multiply | result = var1 * var2 |
Multiplication |
| LessThan | result = var1 < var2 |
Comparison (returns 0 or 1) |
| Load | result = [pointer + offset] |
Load from memory/array |
| Store | [pointer + offset] = value |
Store to memory/array |
| Move | var1 = var2 |
Variable assignment |
| Alloc | pointer = alloc(size) |
Heap allocation |
print(value) |
Output to stdout | |
| Error | error("message") |
Print error and exit |
| Goto | goto label |
Unconditional jump |
| IfGoto | if0 condition goto label |
Jump if condition is 0 |
| Call | result = call func(arg1 arg2) |
Function call |
Factorial computation (object-oriented style):
func Main()
v0 = 4
w0 = alloc(v0)
vmt_Fac = alloc(v0)
v0 = @FacComputeFac
[vmt_Fac + 0] = v0
v0 = vmt_Fac
[w0 + 0] = v0
if0 w0 goto null1
w1 = [w0 + 0]
w1 = [w1 + 0]
v0 = 6
w2 = call w1(w0 v0)
print(w2)
goto main_end
null1:
error("null pointer")
main_end:
return v0
func FacComputeFac(this num)
v0 = 1
w0 = num < v0
if0 w0 goto if1_else
num_aux = 1
goto if1_end
if1_else:
w1 = [this + 0]
w1 = [w1 + 0]
v0 = 1
w2 = num - v0
w3 = call w1(this w2)
num_aux = num * w3
if1_end:
return num_aux
Output: 720 (6! = 720)
┌─────────────────────────────────────────────────────────────┐
│ Interpreter.java │
│ (Entry Point) │
└───────────────┬─────────────────────────────────────────────┘
│
├──> SparrowParser.java (src/parse/java/SparrowParser.java)
│ Parses input → Abstract Syntax Tree (AST)
│
├──> Heap.java
│ Memory manager & metadata store
│
├──> LabelledInstructionGatherer.java (Pass 1)
│ Collects functions, labels, parameters
│
└──> Executor.java (Pass 2)
Executes instructions sequentially
│
├──> Scope.java
│ Function call frames (stack)
│
└──> MemoryUnit.java
Variable containers with types
Key Classes:
Interpreter.java: Entry point that orchestrates parsing and execution (located atsrc/main/java/Interpreter.java)Executor.java: Instruction execution engine with visitor pattern (src/main/java/Executor.java)LabelledInstructionGatherer.java: Preprocessing pass to build execution metadataHeap.java: Central memory and metadata managerScope.java: Represents a function's call frame with local variablesMemoryUnit.java: Container for variable values with type informationVariableType.java: Enum defining INTEGER, STRING, FUNCTION, POINTER, NULL
-
Parse Phase:
- Read Sparrow code from stdin
SparrowParsergenerates AST
-
Metadata Collection (Pass 1):
LabelledInstructionGatherertraverses AST- Collects all functions and their instructions
- Maps labels to instruction indices
- Extracts function parameters
- Populates
Heapwith metadata
-
Execution (Pass 2):
Executorstarts fromMain()function- For each instruction:
- Fetch at current program counter
- Execute via visitor pattern
- Increment program counter
- On
goto/if-goto: Jump to new location - On
call: Create new scope, execute function - On
return: Destroy scope, return to caller
Stack:
- Each function gets a
Scopeobject representing its call frame - Local variables stored with stack addresses starting at 4
- Automatically destroyed on function return
Heap:
- Allocated via
alloc(size)instruction - Returns pointer (memory address) as integer
- Used for arrays and object-like structures
- Persists across function calls
- Memory unit size: 4 bytes (
MemoryUnit.size)
Example:
v0 = 4 // v0 = 4
w0 = alloc(v0) // Allocate 4 bytes on heap
[w0 + 0] = v0 // Store 4 at address w0
w1 = [w0 + 0] // Load value from w0 into w1
- Factorial.sparrow - Recursive factorial with OO-style method calls
- QuickSort.sparrow - Quicksort algorithm with array operations
- ManyCalls.sparrow - Tests multiple function calls with many arguments
- CalleeSave.sparrow - Tests callee-save register conventions
- ManyArgs2.sparrow - Function calls with many arguments
- strech.sparrow - Stress test for complex scenarios
The Sparrow grammar is defined in misc/sparrow.jj (JavaCC format). To regenerate the parser:
- Modify
misc/sparrow.jj - Run JTB to generate syntax tree classes:
java -jar misc/jtb.jar misc/sparrow.jj - Run JavaCC on the JTB output:
java -jar misc/javacc.jar misc/sparrow.jj - Generated files will be placed in
src/parse/java/andsyntaxtree/
- Memory Unit Size: Fixed at 4 bytes
- Heap Start Address: 4 (address 0 reserved for null)
- Values: All stored as strings, parsed as needed
- Visitor Pattern: Extensive use for AST traversal
- Program Counter: Manual management for control flow