Sovereign Lattice QCD — Pure Rust Gauge Theory on Consumer GPUs

A deployable lattice QCD stack replacing QUDA/MILC/Chroma C++/CUDA with pure Rust + WGSL — guideStone-certified, MILC-compatible output, runs on any Vulkan GPU.

Organization: sporeGarden (product name TBD)
License: scyBorg (AGPL-3.0-or-later + ORC + CC-BY-SA 4.0)
Status: Engine validated, product packaging in development


What It Is

A sovereign lattice QCD stack — pure Rust, zero C dependencies, no CUDA — that produces MILC-compatible gauge configurations on consumer GPUs. The physics engine already exists across hotSpring, barraCuda, coralReef, and ToadStool. The product is the packaging: a guideStone-certified deployment artifact that a lattice physicist can scp to any machine and run.

The established lattice QCD toolchain of QUDA (C++/CUDA, GPU), MILC (C, CPU), and Chroma (C++, Jefferson Lab) requires CUDA, vendor SDKs, MPI, and HPC cluster access. This product replaces all of it with a single static binary.


What Already Works

The physics engine is validated. The springs are the acceptance tests.

Capability | Primal | Evidence
Wilson gauge action + SU(3) | barraCuda | Plaquette at beta=6.0: 0.5929 (literature ~0.594)
Gradient flow (W6, W7, CK4, LSCFRK3) | barraCuda | Convergence orders 2.06/2.08/2.11, LSCFRK3 coefficients derived from first principles
Staggered fermions + HMC | barraCuda | Dynamical N_f=4 adaptive Omelyan (in progress)
f64 on consumer GPUs | barraCuda + coralReef | Vulkan SHADER_F64: native f64 at 1:2 throughput on RTX 4070
Sovereign GPU compiler | coralReef | WGSL to native GPU binary; no LLVM, no NVCC, no vendor SDK
Hardware dispatch | ToadStool | NVIDIA SM70-SM89, AMD RDNA2 (GFX1030), auto-detection
Cross-substrate parity | guideStone | 40/40 bit-identical across 5 substrates (x86_64, aarch64, NVIDIA, AMD, CPU-only)
guideStone certification | hotSpring-guideStone-v0.7.0 | 59/59 checks, 3 published papers reproduced, self-leveling benchmark
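
For readers unfamiliar with the plaquette figure quoted above, the following is a minimal CPU-only sketch of how an average plaquette, (1/3) Re Tr of the product of links around each elementary square, can be computed on a small periodic lattice. It is not barraCuda's implementation: every type and function name (C64, Su3, Lattice, twisted, and so on) is hypothetical, and the test field is a simple diagonal configuration chosen so the answer is known in closed form, not a thermalized beta=6.0 ensemble.

```rust
// Sketch only: all names are hypothetical and are not barraCuda's API.
use std::f64::consts::PI;

#[derive(Clone, Copy)]
struct C64 { re: f64, im: f64 }

impl C64 {
    const ZERO: C64 = C64 { re: 0.0, im: 0.0 };
    fn mul(self, o: C64) -> C64 {
        C64 { re: self.re * o.re - self.im * o.im, im: self.re * o.im + self.im * o.re }
    }
    fn add(self, o: C64) -> C64 { C64 { re: self.re + o.re, im: self.im + o.im } }
    fn conj(self) -> C64 { C64 { re: self.re, im: -self.im } }
}

type Su3 = [[C64; 3]; 3];

/// diag(e^{i theta}, e^{-i theta}, 1), a diagonal SU(3) element; theta = 0 is the identity.
fn diag_phase(theta: f64) -> Su3 {
    let mut m = [[C64::ZERO; 3]; 3];
    m[0][0] = C64 { re: theta.cos(), im: theta.sin() };
    m[1][1] = C64 { re: theta.cos(), im: -theta.sin() };
    m[2][2] = C64 { re: 1.0, im: 0.0 };
    m
}

fn matmul(a: &Su3, b: &Su3) -> Su3 {
    let mut c = [[C64::ZERO; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            for k in 0..3 {
                c[i][j] = c[i][j].add(a[i][k].mul(b[k][j]));
            }
        }
    }
    c
}

fn dagger(a: &Su3) -> Su3 {
    let mut c = [[C64::ZERO; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            c[i][j] = a[j][i].conj();
        }
    }
    c
}

/// Periodic L^4 lattice; links[site][mu] holds U_mu(x), with x[0] the fastest index.
struct Lattice { l: usize, links: Vec<[Su3; 4]> }

impl Lattice {
    /// Cold start: every link is the identity.
    fn cold(l: usize) -> Self {
        Lattice { l, links: vec![[diag_phase(0.0); 4]; l.pow(4)] }
    }
    fn index(&self, x: [usize; 4]) -> usize {
        x[0] + self.l * (x[1] + self.l * (x[2] + self.l * x[3]))
    }
    fn coords(&self, mut i: usize) -> [usize; 4] {
        let mut x = [0; 4];
        for d in 0..4 { x[d] = i % self.l; i /= self.l; }
        x
    }
    /// Neighbor of x one step in direction mu, with periodic wrap-around.
    fn shift(&self, mut x: [usize; 4], mu: usize) -> [usize; 4] {
        x[mu] = (x[mu] + 1) % self.l;
        x
    }
    /// Mean over all sites and the six planes of (1/3) Re Tr U_mu(x) U_nu(x+mu) U_mu(x+nu)^† U_nu(x)^†.
    fn average_plaquette(&self) -> f64 {
        let (mut sum, mut count) = (0.0, 0usize);
        for i in 0..self.links.len() {
            let x = self.coords(i);
            for mu in 0..4 {
                for nu in (mu + 1)..4 {
                    let a = matmul(&self.links[i][mu], &self.links[self.index(self.shift(x, mu))][nu]);
                    let b = matmul(&dagger(&self.links[self.index(self.shift(x, nu))][mu]), &dagger(&self.links[i][nu]));
                    let p = matmul(&a, &b);
                    sum += (p[0][0].re + p[1][1].re + p[2][2].re) / 3.0;
                    count += 1;
                }
            }
        }
        sum / count as f64
    }
}

/// Test field: U_1(x) = diag_phase(theta * x0) with theta = 2*pi/L, all other links identity.
/// Only the (0,1) plane is non-trivial, so the average is (5 + (2 cos(theta) + 1) / 3) / 6.
fn twisted(l: usize) -> Lattice {
    let mut lat = Lattice::cold(l);
    let theta = 2.0 * PI / l as f64;
    for i in 0..lat.links.len() {
        let x0 = lat.coords(i)[0];
        lat.links[i][1] = diag_phase(theta * x0 as f64);
    }
    lat
}

fn main() {
    println!("cold start:    {:.6}", Lattice::cold(4).average_plaquette());
    println!("twisted field: {:.6}", twisted(4).average_plaquette());
}
```

On a 4^4 lattice the cold start gives exactly 1 and the twisted field gives 8/9, which makes the routine easy to check by hand before pointing it at real configurations.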

Three published papers were independently validated by the original author (TC Chuna, MSU/Murillo Group):

Paper | Citation | Result
Gradient flow | Bazavov & Chuna, arXiv:2101.05320 | 14/14 checks: integrators, t0/w0 scale, convergence
BGK dielectric | Chuna & Murillo, PRE 111, 035206 | 25/25 checks: Mermin, f-sum, DSF, conductivity
Kinetic-fluid coupling | Haack et al., JCP (2024) | 20/20 checks: BGK relaxation, Sod shock, coupled interface
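
Convergence checks of the kind listed for the gradient-flow integrators usually come down to measuring an observed order of accuracy: run the integrator at two step sizes, compare each result to a reference, and take the log-ratio of the errors. The sketch below illustrates that measurement only. As a stand-in it uses the explicit midpoint (RK2) method on the scalar ODE y' = -y, not the W6/W7/CK4/LSCFRK3 integrators on SU(3), and the function names are hypothetical.

```rust
// Sketch only: a stand-in integrator (explicit midpoint, second order) on a
// scalar ODE, used to show how an observed convergence order is measured.

/// Integrate y' = f(y) from 0 to t_end with `steps` explicit-midpoint steps.
fn rk2_midpoint(f: impl Fn(f64) -> f64, y0: f64, t_end: f64, steps: usize) -> f64 {
    let h = t_end / steps as f64;
    let mut y = y0;
    for _ in 0..steps {
        let k1 = f(y);
        let k2 = f(y + 0.5 * h * k1);
        y += h * k2;
    }
    y
}

/// Observed order p from errors at step sizes h and h / ratio:
/// err(h) ~ C h^p  implies  p = ln(err_coarse / err_fine) / ln(ratio).
fn observed_order(err_coarse: f64, err_fine: f64, ratio: f64) -> f64 {
    (err_coarse / err_fine).ln() / ratio.ln()
}

fn main() {
    let exact = (-1.0f64).exp(); // y(1) for y' = -y, y(0) = 1
    let err_coarse = (rk2_midpoint(|y| -y, 1.0, 1.0, 50) - exact).abs();
    let err_fine = (rk2_midpoint(|y| -y, 1.0, 1.0, 100) - exact).abs();
    println!("observed order = {:.3}", observed_order(err_coarse, err_fine, 2.0));
}
```

For a second-order method the printed value should sit close to 2, drifting slightly above it at finite step size because of the next term in the error expansion.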

What the Product Adds

The engine does the physics. The product packages it for lattice physicists.

Feature | What It Does
ILDG-compatible output | Gauge configurations in the International Lattice Data Grid format, directly consumable by MILC, Chroma, and existing analysis tools
Measurement pipeline | Plaquette, Polyakov loop, topological charge, Wilson flow observables: the standard lattice measurements
Self-leveling benchmark | ./hotspring benchmark characterizes unknown hardware against published lattice results; the physics is the benchmark
Deploy Graph composition | hotSpring + barraCuda + coralReef composed via biomeOS as a single BYOB Niche
Portable artifact | Static musl binary, dual-arch (x86_64 + aarch64), OCI container, USB-deployable
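
To make the ILDG-compatible output concrete, here is a minimal sketch of serializing a single SU(3) link. It assumes the ILDG binary convention of big-endian IEEE 754 doubles, row-major color indices, and the real part written before the imaginary part. It does not reflect the product's writer, which is still in development: the LIME record framing, XML metadata, and site/direction ordering are omitted, and the names Link, link_to_be_bytes, and link_from_be_bytes are hypothetical.

```rust
// Sketch only: hypothetical helper names; covers the per-link payload,
// not the LIME container or metadata that a full ILDG file requires.

/// One SU(3) link as 3x3 complex entries, each stored as (re, im).
type Link = [[(f64, f64); 3]; 3];

/// Bytes per link: 3 rows x 3 columns x (re, im) x 8 bytes.
const LINK_BYTES: usize = 3 * 3 * 2 * 8;

/// Serialize a link as 18 big-endian f64 values, row-major, re before im.
fn link_to_be_bytes(u: &Link) -> Vec<u8> {
    let mut out = Vec::with_capacity(LINK_BYTES);
    for row in u {
        for &(re, im) in row {
            out.extend_from_slice(&re.to_be_bytes());
            out.extend_from_slice(&im.to_be_bytes());
        }
    }
    out
}

/// Read one big-endian f64 from exactly 8 bytes.
fn read_be_f64(bytes: &[u8]) -> f64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    f64::from_be_bytes(buf)
}

/// Inverse of `link_to_be_bytes`; returns None if the length is wrong.
fn link_from_be_bytes(bytes: &[u8]) -> Option<Link> {
    if bytes.len() != LINK_BYTES {
        return None;
    }
    let mut u: Link = [[(0.0, 0.0); 3]; 3];
    for (k, chunk) in bytes.chunks_exact(16).enumerate() {
        u[k / 3][k % 3] = (read_be_f64(&chunk[..8]), read_be_f64(&chunk[8..]));
    }
    Some(u)
}

fn main() {
    // Identity link: 1.0 on the diagonal, zero elsewhere.
    let mut u: Link = [[(0.0, 0.0); 3]; 3];
    for i in 0..3 {
        u[i][i] = (1.0, 0.0);
    }
    let bytes = link_to_be_bytes(&u);
    println!("{} bytes per link, leading bytes {:02X} {:02X}", bytes.len(), bytes[0], bytes[1]);
    assert_eq!(link_from_be_bytes(&bytes), Some(u));
}
```

Writing the byte order explicitly with to_be_bytes, instead of dumping memory, keeps the output identical on x86_64 and aarch64 hosts regardless of native endianness.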

How It Composes

Layer | What | Primal
Math | WGSL f64 shaders: gauge action, force, HMC, gradient flow, spectral | barraCuda
Compilation | WGSL to native GPU binary (NVIDIA + AMD) | coralReef
Dispatch | Hardware discovery, GPU scheduling, workload routing | ToadStool
Validation | hotSpring, the spring that proves the physics | hotSpring

Why It Matters

Aspect | QUDA + MILC + Chroma | This product
Language | C, C++, Fortran | Rust
GPU backend | CUDA (NVIDIA only) | Vulkan / WGSL (NVIDIA, AMD, Intel)
Precision | f64 on compute-class only | f64 on consumer GPUs ($600 RTX 4070)
Dependencies | CUDA SDK, MPI, autoconf, LLVM | Zero (static binary)
Installation | Days (build MILC, QUDA, configure MPI, test) | Minutes (tar xf && ./hotspring validate)
Cost | HPC cluster allocation | $4K basement workstation
Deployment | Cluster job scripts | USB drive
Memory safety | Manual C/C++ | Compiler-guaranteed

NVIDIA’s CUDA pricing model throttles consumer f64 to 1:64 throughput to protect the compute-class product line. Vulkan’s SHADER_F64 extension exposes the native 1:2 ratio. The $600 RTX 4070 does the same f64 physics as a $10,000 A100 — CUDA just doesn’t let you see it.


Current Status

  • Engine: Validated (59/59 checks, 3 papers, cross-vendor GPU parity)
  • ILDG output format: In development
  • Measurement pipeline: Plaquette and flow observables working; Polyakov loop and topological charge next
  • Product packaging: Pending (product name TBD, sporeGarden repo TBD)

See also: guideStone for the verification class, Paper 10 — First Dynamical QCD on Consumer GPU, Paper 07 — Sovereign WDM Simulation, Primal Catalog for barraCuda and coralReef details.