Carpathian Open Source
Veritate
A hand-coded INT8 byte-level language model inference engine. No framework. No wrapper. Every kernel custom.
PyTorch trains the model. Veritate runs it. The two halves share nothing but a .bin weight file.
80M
Parameters
256
Vocabulary
0.59
ms / token
7.88
Perplexity
What makes it unusual
Glass-box interpretability.
Full trace on every forward pass
Per-layer residual stream, FFN neuron activations, attention scores, logit lens, and direct logit attribution. A browser-based MRI dashboard reads these live.
No GPU required
One binary. No CUDA, no driver, no runtime. Batch=1 autoregressive decode is memory-bound, which makes it the CPU's home turf.
Byte-level vocabulary
256-entry byte-level vocab: one token per byte. No tokenizer, no vocabulary mismatch, no subword artifacts. Trains on the raw bytes of any corpus.
Architecture
Transformer, INT8, custom.
Layers
12
Hidden
768
FFN
3072
Heads
12
Seq length
256
Quantization
INT8 QAT
INT8 weights with per-channel quantization scales and an INT16 residual stream. Quantization-aware training bakes the rounding error into the loss, so the model learns around it at training time.
Decode budget
0.59 ms → 0.03 ms
Five compounding optimizations planned to cut decode time by 95%. Several are already underway.
INT4 / QuaRot
Weight quantization pass
Mixture of Depths
Adaptive layer skipping
Speculative decoding
5M draft model trained
Mamba-2 SSD
State-space backbone
BitNet b1.58 ternary
Ternary weight encoding
The long-range moonshot
Frontier-class reasoning anywhere.
The target
A 1.5B-parameter model, BitNet ternary quantized (300 MB on disk), with Mamba-2 backbone, Mixture of Experts, Mixture of Recursions adaptive depth, and reasoning-trace distillation from a teacher model.
Estimated behavior
~70B-class quality on hard reasoning tasks, running on any machine with an ALU. No FPU required. No datacenter required. The pitch is that reasoning capability should not have a hardware floor.
Currently in flight
- QAT mode 2 fine-tuning of the 80M model
- Speculative decoding end-to-end integration
- Mamba-2 SSD prototype
- Runtime shape refactor for multi-size coexistence
Open source. Read the code.
Veritate is developed by Carpathian and published on GitHub. The engine, plugins, and training scripts are all there. Fork it, run it, break it, build on top of it.