Hekbas/Luth

C++ game engine built to explore high-performance architecture.
Currently under active development, it serves as both a learning platform and a research project.

Or it might just be a playground to test my sanity.

Important

My original Bachelor's Thesis version is archived in the thesis branch.


Why Luth?

Honestly? I just really love this stuff.

It started with my Bachelor's Thesis, where I designed a dual-renderer engine to benchmark Vulkan path tracing against traditional OpenGL PBR. The focus was purely on real-time graphics, so the underlying architecture was single-threaded. It worked, and I had a blast building it!

Then I watched Christian Gyrling’s GDC talk on Parallelizing the Naughty Dog Engine Using Fibers. Seeing how they saturated every single CPU core made me realize that my "simple loop" was basically running with the parking brake on.

So, I started Luth from scratch to explore high-performance architecture: fiber-based job systems, lock-free memory models, and bindless Vulkan rendering. It is absolutely over-engineered for a solo project, but that’s the point.


Shuddup! how build??

Prerequisites:

  • OS: Windows 10 / 11
  • Compiler: MSVC (v143+) or Clang (C++20-compliant)
  • SDK: Vulkan SDK 1.3+. The GPU must support dynamicRendering, timelineSemaphore, and descriptor indexing (roughly any GPU from 2018 onward)

Steps:

  1. Clone with submodules
    git clone --recursive https://github.com/Hekbas/Luth.git
  2. Generate the VS solution
    scripts/setup/setup_windows.bat
  3. Build — either open Luth.sln in Visual Studio 2022, or run the headless script:
    scripts/build/build_windows.bat

The editor binary lands at bin/windows-x86_64/Debug/Runtime/Luthien.exe.


Technical Architecture

Luth moves away from standard C++ patterns (RAII everywhere, heavy STL usage, single-threaded contexts) in favor of Data-Oriented Design and Fiber-Based Concurrency.

1. The Fiber Job System

Instead of dedicated OS threads per task ("Render Thread", "Audio Thread"), Luth treats the CPU as a generic worker pool.

  • N:M Threading: One Worker Thread per CPU core. Logical tasks are wrapped in Fibers (lightweight user-mode stacks) that migrate freely between workers.
  • Zero Blocking: When a job waits on a dependency (or the GPU), it yields to the scheduler, which swaps in another fiber. CPU saturation stays near 100%.
  • Synchronization: SpinLocks (test-and-set + _mm_pause()) and Atomic Counters keep critical sections short and never block on the OS.
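
A minimal sketch of the synchronization primitives described above, not the engine's actual code. The names SpinLock, JobCounter, CpuRelax and SpinLockDemo are made up for illustration, and plain std::thread stands in for fibers so the example stays self-contained:

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
  #include <immintrin.h>
  inline void CpuRelax() { _mm_pause(); }               // hint to the core that we are spinning
#else
  inline void CpuRelax() { std::this_thread::yield(); } // fallback for non-x86 targets
#endif

// Hypothetical test-and-set spin lock: exchange(true) is the "test-and-set".
class SpinLock {
public:
    void lock() {
        while (locked_.exchange(true, std::memory_order_acquire)) {
            CpuRelax();
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical atomic counter: a waiting job is runnable once it reaches zero.
class JobCounter {
public:
    explicit JobCounter(std::uint32_t jobs) : remaining_(jobs) {}
    void Decrement() { remaining_.fetch_sub(1, std::memory_order_acq_rel); }
    bool IsDone() const { return remaining_.load(std::memory_order_acquire) == 0; }

private:
    std::atomic<std::uint32_t> remaining_;
};

// Each worker bumps a shared total under the lock, then signals the counter.
// Returns the total, or 0 if the counter did not reach zero.
inline std::uint64_t SpinLockDemo(int workers, int perWorker) {
    SpinLock lock;
    JobCounter counter(static_cast<std::uint32_t>(workers));
    std::uint64_t total = 0;

    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        threads.emplace_back([&] {
            for (int i = 0; i < perWorker; ++i) {
                lock.lock();
                ++total;          // the whole critical section is one increment
                lock.unlock();
            }
            counter.Decrement();
        });
    }
    for (auto& t : threads) t.join();
    return counter.IsDone() ? total : 0;
}
```

In a fiber scheduler the join would presumably be replaced by yielding until the counter hits zero; that part is left out here.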

2. Pipelined Frame Execution

Three stages run in parallel. At any given moment, the engine has three frames in flight:

time ──►   frame N          frame N-1        frame N-2
          ┌────────┐      ┌───────────┐     ┌─────────┐
   CPU →  │  Game  │  →   │  Render   │  →  │   GPU   │
          │  logic │      │ recording │     │ execute │
          └────────┘      └───────────┘     └─────────┘

  1. Game (N): Physics, AI, transform updates.
  2. Render (N-1): Read last frame's results, record Vulkan command buffers in parallel.
  3. GPU (N-2): Execute the commands submitted previously.
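
The frame offsets above boil down to a bit of index math. A rough sketch, assuming a three-slot ring of per-frame data (StageFrames, FramesInFlight, SlotFor and kFramesInFlight are illustrative names, not taken from the engine):

```cpp
#include <cstdint>

// Which frame each stage works on during one tick. -1 means "nothing yet",
// which only happens while the pipeline is still filling up.
struct StageFrames {
    std::int64_t game;    // frame N   : simulation
    std::int64_t render;  // frame N-1 : command recording
    std::int64_t gpu;     // frame N-2 : GPU execution
};

inline StageFrames FramesInFlight(std::int64_t tick) {
    StageFrames s;
    s.game   = tick;
    s.render = (tick >= 1) ? tick - 1 : -1;
    s.gpu    = (tick >= 2) ? tick - 2 : -1;
    return s;
}

// Assumption: each in-flight frame owns one of three data slots, so the
// three stages never touch the same slot during a tick.
constexpr std::int64_t kFramesInFlight = 3;

inline int SlotFor(std::int64_t frame) {
    return static_cast<int>(frame % kFramesInFlight);
}
```

At tick 5, for example, Game works on frame 5, Render on frame 4 and the GPU on frame 3, and the three frames map to three different slots.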

3. Memory Strategy

new / delete are forbidden in the hot path. Two allocators handle everything that churns:

Page Pool (2 MB virtual pages)
 ├── TaggedPageAllocator   ──  tagged lifetime, bulk free
 │   └── per-thread cache  ──  lock-free hot-path allocations
 └── LinearAllocator       ──  per-frame, reset on Begin()

  • Tagged Page Allocator: Naughty Dog–style. Allocations carry a tag (LevelGeometry, Frame_N, …) and are freed in bulk by tag.
  • Linear Allocator: bump-allocate transient frame data (command lists, UI state); resets each frame, no per-object destructors.
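
To make the bump-allocation idea concrete, here is a minimal sketch of a linear allocator. It is not Luth's implementation: the class layout, the Allocate/Reset/Used names and the std::vector backing store are all assumptions (the real one sits on top of the page pool and resets in Begin()):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bump allocator: allocating moves an offset forward,
// freeing everything is a single assignment.
class LinearAllocator {
public:
    explicit LinearAllocator(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // `alignment` must be a power of two. Returns nullptr when out of space.
    void* Allocate(std::size_t size, std::size_t alignment = alignof(std::max_align_t)) {
        const std::uintptr_t base    = reinterpret_cast<std::uintptr_t>(buffer_.data());
        const std::uintptr_t current = base + offset_;
        const std::uintptr_t mask    = static_cast<std::uintptr_t>(alignment) - 1;
        const std::uintptr_t aligned = (current + mask) & ~mask;   // round up

        const std::size_t newOffset = static_cast<std::size_t>(aligned - base) + size;
        if (newOffset > buffer_.size()) {
            return nullptr;
        }
        offset_ = newOffset;
        return reinterpret_cast<void*>(aligned);
    }

    // Called once per frame. No destructors run, so only trivially
    // destructible data belongs in here.
    void Reset() { offset_ = 0; }

    std::size_t Used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;  // stand-in for pages taken from the pool
    std::size_t offset_;
};
```

The trade-off is the usual one: allocation is a few arithmetic instructions, but individual objects cannot be freed early.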

4. Vulkan 1.3 Backend

Modern hardware, minimal driver overhead.

  • Bindless Descriptors: VK_EXT_descriptor_indexing binds all engine textures to one global array (Set 0). Materials store an integer index — any draw call can sample any texture without rebinding.
  • Dynamic Rendering: No VkRenderPass / VkFramebuffer — passes use vkCmdBeginRendering directly.
  • Timeline Semaphores: Replace vkWaitForFences. A dedicated Poller Job queries semaphore values and wakes dependent fibers only when the GPU finishes their workload.
  • VMA: Vulkan Memory Allocator handles all device-memory placement (buffers, images, staging).
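
A sketch of how the Poller Job idea could look, with loud assumptions: no Vulkan calls are made here. A std::atomic counter (FakeTimeline) stands in for the timeline semaphore so the example runs anywhere; in a real backend the value would come from vkGetSemaphoreCounterValue. TimelinePoller, Waiter and the wake callback are hypothetical names:

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Stand-in for a timeline semaphore: a monotonically increasing 64-bit value
// that the "GPU" bumps when a workload finishes.
struct FakeTimeline {
    std::atomic<std::uint64_t> value{0};
};

// A fiber parked until the timeline reaches `targetValue`.
struct Waiter {
    std::uint64_t targetValue;
    std::function<void()> wake;   // would resume the fiber in a real scheduler
};

class TimelinePoller {
public:
    void WaitFor(std::uint64_t targetValue, std::function<void()> wake) {
        waiters_.push_back(Waiter{targetValue, std::move(wake)});
    }

    // One polling step: wake every waiter whose target has been reached.
    // Returns how many waiters were woken.
    std::size_t Poll(const FakeTimeline& timeline) {
        const std::uint64_t reached = timeline.value.load(std::memory_order_acquire);
        std::size_t woken = 0;
        for (auto it = waiters_.begin(); it != waiters_.end();) {
            if (it->targetValue <= reached) {
                it->wake();
                it = waiters_.erase(it);
                ++woken;
            } else {
                ++it;
            }
        }
        return woken;
    }

    std::size_t Pending() const { return waiters_.size(); }

private:
    std::vector<Waiter> waiters_;
};
```

The point of the pattern is that nothing ever blocks: the poller is just another job that runs, checks a number, and reschedules whoever is ready.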

5. Render Graph

Each frame, Luth builds a DAG of render passes. Passes declare reads and writes through a RenderPassBuilder; the graph solves pipeline barriers, culls unused passes, and computes resource lifetimes automatically.

graph.AddPass<GeometryPassData>("GeometryPass",
    [&](GeometryPassData& data, RG::RenderPassBuilder& builder) {
        data.depthTex  = builder.WriteDepth(sceneDepth, ...);
        data.outputTex = builder.Write(sceneColor);
        data.indirect  = builder.ReadIndirectBuffer(indirectBuffer);
    },
    [=](GeometryPassData& data, RG::RenderPassContext& ctx) {
        // record draw commands on ctx.commandBuffer
    });

Passes execute in topological order; command-buffer recording inside each pass parallelizes across worker threads.
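
For the curious, culling plus topological ordering can be sketched in a few dozen lines. This is a simplified illustration, not the RG implementation: PassDesc and CompileGraph are invented names, resources are plain ints, every resource is assumed to have a single writer, and no pass reads a resource it also writes. Barriers and lifetimes are left out entirely.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <unordered_map>
#include <vector>

struct PassDesc {
    std::string name;
    std::vector<int> reads;   // resource ids this pass consumes
    std::vector<int> writes;  // resource ids this pass produces
};

// Returns the names of surviving passes in a valid execution order.
inline std::vector<std::string> CompileGraph(const std::vector<PassDesc>& passes,
                                             int finalResource) {
    const std::size_t n = passes.size();

    // Map each resource to the pass that writes it.
    std::unordered_map<int, std::size_t> producer;
    for (std::size_t i = 0; i < n; ++i)
        for (int w : passes[i].writes) producer[w] = i;

    // Cull: walk backwards from the final resource; unreached passes are dead.
    std::vector<bool> alive(n, false);
    std::vector<int> stack{finalResource};
    while (!stack.empty()) {
        const int res = stack.back();
        stack.pop_back();
        const auto it = producer.find(res);
        if (it == producer.end() || alive[it->second]) continue;
        alive[it->second] = true;
        for (int r : passes[it->second].reads) stack.push_back(r);
    }

    // Build producer -> consumer edges between surviving passes.
    std::vector<std::vector<std::size_t>> dependents(n);
    std::vector<int> pending(n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        if (!alive[i]) continue;
        for (int r : passes[i].reads) {
            const auto it = producer.find(r);
            if (it == producer.end()) continue;  // external input, no edge
            dependents[it->second].push_back(i);
            ++pending[i];
        }
    }

    // Kahn's algorithm: repeatedly emit passes with no unmet dependencies.
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < n; ++i)
        if (alive[i] && pending[i] == 0) ready.push(i);

    std::vector<std::string> order;
    while (!ready.empty()) {
        const std::size_t p = ready.front();
        ready.pop();
        order.push_back(passes[p].name);
        for (std::size_t d : dependents[p])
            if (--pending[d] == 0) ready.push(d);
    }
    return order;
}
```

Declaring the passes in any order gives the same result, and a pass whose outputs nobody consumes simply never shows up in the returned list.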


Features

Rendering

  • PBR: Cook-Torrance BRDF, metallic/roughness workflow, material SSBO with render mode variants (Opaque, Cutout, Transparent)
  • Lighting: 1 directional + up to 64 point lights from ECS, LightUBO (Set 3)
  • Shadows: 4-cascade PSSM (Sascha Willems bounding-sphere fit), per-cascade GPU cull, PCF 3×3 via sampler2DShadow, cascade blending + bias
  • Ambient Occlusion: GTAO half-res compute chain: depth prefilter → horizon integral → bilateral denoise (Jimenez 2016 slice integral, VS-normal reconstruction from depth)
  • GPU Culling: Compute frustum cull per cascade + main scene, GPUObjectData SSBO (Set 5), vkCmdDrawIndexedIndirect everywhere
  • IBL: HDR skybox, diffuse irradiance, pre-filtered specular (5 mips), BRDF LUT, split-sum ambient
  • Post-Processing: HDR pipeline, bloom, tonemapping (Reinhard/ACES/Uncharted 2/exposure), vignette, film grain, chromatic aberration
  • Shaders: Single-stage SPIR-V asset pipeline (.vert/.frag/.comp each one artifact + UUID), hot-reload on any stage via FileWatcher, SPIRV-Cross reflection
  • Pipeline Cache: Disk-persisted VkPipelineCache, lazy variant creation, targeted hot-reload invalidation
  • Mipmaps: Per-texture settings pipeline with sampler maxLod control
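
As a small illustration of the tonemapping operators listed under Post-Processing, here are per-channel CPU versions of three of them. These are textbook formulas, not the engine's shaders: the ACES curve below is the widely used Narkowicz fit, which is an assumption about which ACES approximation is meant, and the function names are made up.

```cpp
#include <algorithm>
#include <cmath>

// Reinhard: maps [0, inf) to [0, 1).
inline float TonemapReinhard(float x) {
    return x / (1.0f + x);
}

// Simple exposure curve: 1 - e^(-x * exposure).
inline float TonemapExposure(float x, float exposure) {
    return 1.0f - std::exp(-x * exposure);
}

// ACES filmic approximation (Narkowicz fit), clamped to [0, 1].
inline float TonemapAcesApprox(float x) {
    const float a = 2.51f, b = 0.03f, c = 2.43f, d = 0.59f, e = 0.14f;
    const float y = (x * (a * x + b)) / (x * (c * x + d) + e);
    return std::min(1.0f, std::max(0.0f, y));
}
```

Each function takes a linear HDR value for one colour channel; a shader would apply the same math to a vec3.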

Animation

  • Sampling: Fiber-parallel keyframe evaluation across worker threads
  • GPU Skinning: Bone matrix SSBO, vertex shader skinning
  • Blending: SQT interpolation, crossfade transitions, layered override with bone masks
  • Root Motion: Automatic extraction and application to entity transform
  • Debug: Bone overlay visualization in editor viewport
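
A rough sketch of what SQT (scale / quaternion / translation) interpolation means for a single bone. The types and function names are illustrative only, and the rotation uses normalized lerp along the shortest arc as a stand-in; whether the engine uses nlerp or slerp is not stated here.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Quat { float x, y, z, w; };

// One bone pose: Scale, Quaternion rotation, Translation.
struct SQT {
    Vec3 scale;
    Quat rotation;
    Vec3 translation;
};

inline Vec3 Lerp(const Vec3& a, const Vec3& b, float t) {
    return Vec3{a.x + (b.x - a.x) * t,
                a.y + (b.y - a.y) * t,
                a.z + (b.z - a.z) * t};
}

// Normalized lerp. Flips b when needed so the blend takes the shortest arc.
inline Quat Nlerp(const Quat& a, Quat b, float t) {
    const float dot = a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
    if (dot < 0.0f) {
        b = Quat{-b.x, -b.y, -b.z, -b.w};
    }
    const Quat q{a.x + (b.x - a.x) * t,
                 a.y + (b.y - a.y) * t,
                 a.z + (b.z - a.z) * t,
                 a.w + (b.w - a.w) * t};
    const float len = std::sqrt(q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w);
    return Quat{q.x / len, q.y / len, q.z / len, q.w / len};
}

// Blend two poses. t = 0 gives a, t = 1 gives b; a crossfade ramps t over time.
inline SQT Blend(const SQT& a, const SQT& b, float t) {
    return SQT{Lerp(a.scale, b.scale, t),
               Nlerp(a.rotation, b.rotation, t),
               Lerp(a.translation, b.translation, t)};
}
```

Blending components separately like this avoids the shearing you get from lerping matrices directly.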

Asset Pipeline

  • Asset Database: UUID-based registry with .meta sidecar files, importers for shaders/textures/models/materials
  • Smart Import: Multi-strategy texture discovery, drag-and-drop with eager import, texture remap dialog
  • Hot Reload: FileWatcher-based live reload for shaders, textures, and project files
  • Scene Format: Custom JSON .luth format with dirty tracking and native file dialogs

Editor

  • Scene Interaction: Mouse picking (ID buffer), selection outlines with occluded fade, shade modes (Lit/Wireframe/Unlit)
  • Inspector: Material editor, animation controls, light/shadow settings, Add Component workflow
  • Undo / Redo: Command pattern with 14 command types, UUID-based entity resolution, gizmo drag coalescing, compound commands, material snapshot undo
  • Frame Debugger: Trigger-based capture, frozen-state model with auto-recapture on camera move, hierarchical event tree (Group/Pass/Cascade/Draw), per-draw replay-then-copy, archive sink + per-pass image staging, CSM cascade detail panel
  • Project Panel: Folder navigation, search, hot reload, context menus for entity/primitive creation
  • Profiler: Per-system timing breakdown with fiber-aware instrumentation
  • Persistence: Window layouts, editor settings, and panel state saved across sessions
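
The Undo / Redo entry is the classic command pattern. A stripped-down sketch under heavy assumptions: the "scene" here is just a map from entity UUID to an X position, and ICommand, SetPositionCommand and CommandHistory are invented names. The one detail it tries to mirror is that commands store a UUID and resolve the entity on every call instead of holding a pointer.

```cpp
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

using EntityId = std::uint64_t;
using Scene    = std::unordered_map<EntityId, float>;  // toy scene: id -> x position

class ICommand {
public:
    virtual ~ICommand() = default;
    virtual void Execute(Scene& scene) = 0;
    virtual void Undo(Scene& scene) = 0;
};

// Looks the entity up by id each time, so the command stays valid
// even if the entity's storage moved in between.
class SetPositionCommand final : public ICommand {
public:
    SetPositionCommand(EntityId id, float newX) : id_(id), newX_(newX) {}

    void Execute(Scene& scene) override {
        oldX_ = scene[id_];
        scene[id_] = newX_;
    }
    void Undo(Scene& scene) override { scene[id_] = oldX_; }

private:
    EntityId id_;
    float newX_;
    float oldX_ = 0.0f;
};

class CommandHistory {
public:
    void Execute(Scene& scene, std::unique_ptr<ICommand> cmd) {
        cmd->Execute(scene);
        undo_.push_back(std::move(cmd));
        redo_.clear();  // a fresh edit invalidates the redo stack
    }

    bool Undo(Scene& scene) {
        if (undo_.empty()) return false;
        undo_.back()->Undo(scene);
        redo_.push_back(std::move(undo_.back()));
        undo_.pop_back();
        return true;
    }

    bool Redo(Scene& scene) {
        if (redo_.empty()) return false;
        redo_.back()->Execute(scene);
        undo_.push_back(std::move(redo_.back()));
        redo_.pop_back();
        return true;
    }

private:
    std::vector<std::unique_ptr<ICommand>> undo_;
    std::vector<std::unique_ptr<ICommand>> redo_;
};
```

Coalescing (merging a whole gizmo drag into one entry) and compound commands would layer on top of this; they are not shown.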

Roadmap

See the full development roadmap for completed phases and version history.

Future Ideas

Rendering — Deferred GBuffer, Forward+ clustered lighting, FXAA/TAA, global illumination, volumetric fog, SSR

Gameplay — Physics (Jolt, jobified), GPU particle system, animation blend trees & IK, prefab system, scripting (C#/Lua)

Editor — Play mode, asset streaming, visual shader editor


Dependencies

LUTH Engine is built on the shoulders of giants:

  • Vulkan SDK: Rendering backend
  • VMA: Vulkan memory allocator
  • shaderc: Runtime GLSL → SPIR-V compilation (ships with Vulkan SDK)
  • SPIRV-Cross: Shader reflection
  • EnTT: Entity-Component-System
  • ImGui: Editor GUI
  • ImGuizmo: Translate / rotate / scale gizmos
  • Tracy: Frame profiler
  • GLFW: Windowing + input
  • GLM: Math
  • spdlog: Logging
  • assimp: Model importing
  • stb_image: Image loading
  • nlohmann/json: JSON serialization

Planned integrations:

  • Jolt Physics — rigid body physics, jobified onto the fiber scheduler

License

Released under the MIT License.

About

Data-oriented C++20 game engine with a fiber-based job system and bindless Vulkan 1.3 renderer.
