Forge compiles mathematical expressions to optimized x86-64 machine code with automatic gradient computation. It follows a record-once, compile-once, evaluate-many paradigm designed for workloads where the same computation is repeated with varying inputs.
- JIT Compilation: Generates native x86-64 machine code via AsmJit
- Reverse-mode AD: Automatic gradient computation for all recorded operations
- Graph Optimizations: Common subexpression elimination, constant folding, algebraic simplification
- Instruction Set Backends: SSE2 scalar (default) and AVX2 packed (4-wide SIMD), with extensible backend interface
- Branching Support: Record-time conditional evaluation via `fbool` and `If()` for data-dependent control flow
Forge is designed to be backend-agnostic — the core compiler is decoupled from specific instruction sets, number types, and hardware. The AVX2 backend demonstrates this: it can be bundled at compile time (FORGE_BUNDLE_AVX2=ON) or loaded dynamically at runtime via InstructionSetFactory::loadBackend(). This architecture enables custom backends with their own register allocation strategies, machine code generation, and memory layouts. The compilation policy (ICompilationPolicy) controls whether intermediate values are stored to memory or kept in registers — enabling forward-optimized execution when gradients aren't needed, or storing values for backward forging when they are. See backends/ for implementation details and a step-by-step guide to creating custom backends.
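For example, bundling the AVX2 backend at compile time uses the CMake option named above (a minimal sketch; only the `FORGE_BUNDLE_AVX2` flag is taken from the text, the rest is a standard CMake invocation):

```sh
# Bundle the AVX2 backend into the library at compile time
cmake .. -DFORGE_BUNDLE_AVX2=ON && cmake --build .
```

Leaving the flag off and calling InstructionSetFactory::loadBackend() at runtime is the dynamic alternative described above.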
Forge is designed for repeated evaluation scenarios:
- Monte Carlo methods: Pricing, XVA, path-dependent calculations
- Scenario analysis: Stress testing, what-if analysis, parameter sweeps
- Sensitivities: Fast gradient computation across input variations
- Model calibration: Repeated function/gradient evaluation during optimization
Trade-off: Forge incurs upfront compilation cost. For single evaluations, tape-based AD is faster. Break-even typically occurs after 10–50 evaluations depending on graph complexity.
| Phase | Description |
|---|---|
| 1. Graph API | Define the computation graph using the Direct API, operator overloading (fdouble), or a transformation from an external source (e.g., xad-forge) |
| 2. Graph Pre-processing | ForgeEngine applies graph optimizations: common subexpression elimination, constant folding, algebraic simplification, and stability cleaning |
| 3. Kernel Forging | ForgeEngine compiles optimized graph to native machine code via forward forging and optional backward forging (for gradients) using pluggable instruction set backends |
| 4. Execution | Execute the ForgedKernel repeatedly with varying inputs; retrieve computed values and gradients |
Extensibility: custom graph transformations (phase 1), optimization passes (phase 2), and instruction set backends with custom machine code generation and register management (phase 3).
```cpp
#include <graph/graph.hpp>
#include <compiler/forge_engine.hpp>
#include <compiler/interfaces/node_value_buffer.hpp>

using namespace forge;

int main() {
    // 1. Graph API — Define f(x) = x² + sin(x) using the Direct API
    Graph graph;
    NodeId x = graph.addInput();
    graph.diff_inputs.push_back(x);                        // Mark x for gradient computation
    NodeId x_squared = graph.addNode({OpCode::Mul, x, x}); // x²
    NodeId sin_x = graph.addNode({OpCode::Sin, x});        // sin(x)
    NodeId result = graph.addNode({OpCode::Add, x_squared, sin_x});
    graph.markOutput(result);

    // 2. Graph Pre-processing + 3. Kernel Forging — ForgeEngine compiles the graph
    ForgeEngine engine;
    auto kernel = engine.compile(graph);
    auto buffer = NodeValueBufferFactory::create(graph, *kernel);

    // 4. Execution — Run the ForgedKernel repeatedly with different inputs
    buffer->setValue(x, 2.0);
    kernel->execute(*buffer);
    double f_x = buffer->getValue(result); // f(2.0)
    double df_dx = buffer->getGradient(x); // f'(2.0)
}
```

Build from source:

```sh
git clone https://github.com/da-roth/forge.git
cd forge && mkdir build && cd build
cmake .. && cmake --build .
```

CMake integration:

```cmake
add_subdirectory(forge)
target_link_libraries(your_target PRIVATE forge::forge)
```

Requires C++17 and CMake 3.20+. All dependencies are fetched automatically.
Forge is licensed under the Zlib License. See LICENSE.md for details.
- xad-forge — Forge JIT backend for XAD
- QuantLib-Risks-Cpp-Forge — QuantLib-Risks with Forge JIT integration
- AsmJit — High-performance machine code generation
- MathPresso — Mathematical expression JIT compilation inspiration
- AutoDiffSharp — Automatic differentiation design influence
- SLEEF — Vectorized math functions for SIMD operations
