Genesis 1B: Run 2 Extended - 60,000 Steps
Run 2 extended to 60,000 steps (~31.5B tokens). Step 42,292/60,000 (70.5%), loss ~1.94 avg. ETA April 16, 2026.
Research, engineering, and insights from Kroonen AI.
Same 1B parameters, 3x throughput. How torch.compile, real-valued RoPE, a deeper architecture (32 layers vs 20), and batch tuning tripled training speed on the same 2x RTX 4090 hardware.
Data sovereignty, constitutional alignment, and the case for training language models on consumer hardware. Why the future of AI is local, private, and personality-first.
A silent AdamW state bug during Run 1 that produced a false recovery on poisoned weights. The load path didn't crash or hang; it just silently ruined the model.
How we fixed FSDP checkpoint deadlocks during Run 1 on consumer GPUs without NVLink, using DCP sharded checkpoints and decoupled evaluation.