trust chain closed · v1.3 · source-available

A compiler that builds itself from 299 bytes of hand-typed hex.

Helix is a from-scratch, ML-native systems language — autodiff, tile types and GPU codegen built into the language, bootstrapped from raw binary with no trusted pre-built compiler. And it runs the real GPT-2, verified token-for-token.

Verifiable execution

Real neural networks. Proof, not promises.

GPT-2 — the real, unchanged public model — runs on this stack token-for-token-identical to an independent reference. So does a 2024 Llama-architecture model. Bring your weights; audit instead of trust.

The bootstrap chain

Nine rungs. 299 bytes you have to trust.

Each rung is built only by the rung before it — no pre-built compiler is ever trusted. Select any node to inspect it; one committed command reproduces the whole ladder.

Three commitments

A foundation, not another framework.

Helix exists to remove uncertainty wherever software honestly can — every byte inspectable, every step reproducible, every limit stated plainly.

01

Source-available & fully auditable.

The entire toolchain is inspectable, from the 299-byte hand-typed root to the PTX it emits. Free for non-commercial use — individuals, students, researchers, non-profits — under the Helix Non-Commercial License; commercial use requires the author's consent. Not open source yet: a restriction intended to be temporary while the project secures investors, revenue, and industry partners.

Helix NC License v1.0source-available
02

Bootstrapped from raw binary.

No trusted pre-built compiler, anywhere. The hex0 root is 299 hand-authored bytes of x86-64; nine rungs compile each other up to kovc, the Helix compiler written in Helix — with a byte-identical self-host fixpoint and no Python in the toolchain.

299 B → kovcK2 == K3 == K4
03

ML-first language design.

Forward- and reverse-mode autodiff are language features, not libraries. Tiles and tensors are types. Effects are verified at the IR level, reflection is verifier-gated, and the same compiler emits x86-64 ELF and PTX for real GPUs.

grad / grad_rev_alltile<f32, [N,N], REG>
autodiff.hx · grad
fn loss(x: f64, y: f64) -> f64 {
    x * y + x * x
}

struct Grad { dx: f64, dy: f64 }

fn main() -> f64 {
    // reverse-mode AD, all gradients in one sweep
    let g = grad_rev_all(loss)(2.0_f64, 3.0_f64);
    g.dx
}
kovc loss.hx -O2 → 7.0_f64
Autodiff, in the language

Gradients are a keyword, not a library.

Forward- and reverse-mode autodiff lower into the program at compile time — no runtime tape, no tracing, no Python overhead. Stdlib transcendentals carry analytic chain rules.

  • grad(f)(x)Forward-mode derivative. Symbolic, fully inlined.
  • grad_rev_all(f)(...)Reverse-mode adjoints; returns a struct of all partials.
  • @pureIR-level effect verification — pure functions are transitively barred from effectful code.
  • quote / modify_fVerifier-gated reflection: your verifier runs before any mutation commits.
In Helix, this compiles

A real systems language. With math notation.

Pattern matching, generics, traits, closures, tiles, reflection — every kovc build is gated by a 109-program feature corpus. Three samples:

By the numbers

Hard numbers, every one of them grounded.

Every figure here resolves to a committed file, gate output, or trust record — pull the repo and verify any of them.

299bytes
hex0 — the hand-authored trust root
9rungs
raw binary → kovc, each built only by the prior rung
1.py
exactly one committed Python file — a fenced oracle, never in the toolchain
109programs
feature corpus gating every kovc build
8kernels
kovc-emitted PTX set that runs GPT-2 124M → XL
25/25tokens
greedy continuation matches the independent oracle — GPT-2 & SmolLM2
4.9e-5max abs
SmolLM2-135M logit difference vs the oracle, across 49,152 logits
1.5Bparams
GPT-2-XL at fp32 on one 8 GB sm_86 GPU
Carbon meets silicon

Compile something honest today.

Rebuild the entire compiler from 299 hand-typed bytes with one committed command — then watch the real GPT-2 run on it, verified token-for-token against an independent reference.