Helix × GPT-2 COMPLETION PLAYGROUND
GPT-2-XL · 1.5B · 2019 base model · not an assistant
PREVIEW · MOCK TELEMETRY (no GPU connected)
Helix — the verifiable execution layer

Watch a 1.5-billion-parameter model think on a stack you can rebuild from 299 bytes.

GPT-2-XL is a 2019 base completion model. It continues text; it is not tuned to chat, follow instructions, or be factual. What is being demonstrated live here is the verified compute underneath: every layer and kernel comes from the from-raw Helix toolchain.

1.5B params · 48 layers · 8 kovc-emitted kernels · fp32 · forward-only · greedy · live XL ≈ 10 s/token (intentionally slow; the pitch is trust, not speed)
Conversation (text completion) · GPT-2-XL · fp32 · greedy
Give GPT-2 some text to continue
Pick a seed below or type your own. GPT-2-XL will continue it token-by-token while the 48 transformer layers and kovc kernels light up on the right. It is a base model — expect continuations, not answers.
Conversation = repeated completion with carried context. Each turn re-sends the conversation so far as one completion prompt — the model itself is stateless between requests: a 2019 base completion model, not an assistant. The live server caps the prompt at ~320 tokens (--max-ctx); when the carried text would blow that budget, the oldest text is cut first and the page says so.
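As a rough illustration of the carried-context rule above, the following Python sketch joins every turn into one completion prompt and cuts the oldest tokens when the budget is exceeded. It is only a sketch under stated assumptions: `build_prompt`, `whitespace_tokens`, and `MAX_CTX` are hypothetical names, the whitespace tokenizer is a stand-in for GPT-2's real BPE tokenizer, and the actual server may count and trim differently.

```python
from typing import Callable, List, Tuple

# Assumption: mirrors the approximate ~320-token budget set by --max-ctx.
MAX_CTX = 320


def whitespace_tokens(text: str) -> List[str]:
    """Stand-in tokenizer for illustration only (NOT GPT-2's BPE)."""
    return text.split()


def build_prompt(
    turns: List[str],
    max_ctx: int = MAX_CTX,
    tokenize: Callable[[str], List[str]] = whitespace_tokens,
) -> Tuple[str, bool]:
    """Re-send the whole conversation as one prompt, oldest text cut first.

    Returns (prompt, truncated) so the page can tell the user when
    earlier text was dropped.
    """
    tokens = tokenize("\n".join(turns))
    truncated = len(tokens) > max_ctx
    if truncated:
        # Keep only the newest max_ctx tokens.
        tokens = tokens[-max_ctx:]
    # Re-joining with spaces loses original line breaks; fine for a sketch.
    return " ".join(tokens), truncated


if __name__ == "__main__":
    prompt, truncated = build_prompt(["Once upon a time", "there was a GPU"], max_ctx=5)
    print(prompt, truncated)
```

Because the model keeps no state between requests, every call rebuilds the prompt from the full turn history; the `truncated` flag is what would let the page report that older text was cut.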
Enter ↵ to run · Shift+Enter for a newline
GPU busy — one generation at a time (single-flight; the server keeps no queue, so the page just waits politely and retries).
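The wait-and-retry behaviour described above could look roughly like this on the client side. This is a hedged sketch, not the page's actual code: the `Busy` exception, `run_when_free`, and the retry count and delay are all hypothetical, and how the real server signals that it is busy is not specified here.

```python
import time
from typing import Callable, Optional


class Busy(Exception):
    """Hypothetical signal that the GPU is already running a generation."""


def run_when_free(
    request: Callable[[], str],
    retries: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call `request`; if the server is busy, wait and try again.

    The server keeps no queue, so the client does the waiting.
    Returns the completion text, or None if every attempt found the GPU busy.
    """
    for attempt in range(retries):
        try:
            return request()
        except Busy:
            # Do not sleep after the final failed attempt.
            if attempt < retries - 1:
                sleep(delay_s)
    return None
```

A fixed delay keeps the sketch simple; a real client might prefer a backoff with jitter so that several waiting pages do not retry in lockstep.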
Honest residuals: fp32-only · complete-to-PTX-not-SASS · single GPU (sm_86) · base-model-not-assistant · oracle-shares-spec · never-claimed-AGI. This is a demonstration of verifiable execution — not a claim of model quality, speed records, or full-GPU verification. No live parity verdict appears in this chat.