Models
Models are how the network converts data into embeddings. Not all embeddings are equal — some models understand certain domains better than others. SOMA creates a competition to find out which.
What Models Do
A model on SOMA does two things:
Predict: Given a sequence of bytes, predict the next one. Lower loss means better understanding.
Represent: Convert data into an embedding — a vector that captures what the data means.
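The two roles can be sketched as a minimal interface. This is a toy stand-in, not SOMA's actual API; the class and method names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

class ToyModel:
    """Toy stand-in for a SOMA model: maps bytes to next-byte logits and an embedding."""

    def __init__(self, dim=16, vocab=264):
        self.W_embed = rng.normal(size=(vocab, dim)) * 0.02
        self.W_out = rng.normal(size=(dim, vocab)) * 0.02

    def predict(self, data: bytes) -> np.ndarray:
        """Predict: return next-byte logits (lower cross-entropy = better understanding)."""
        h = self.W_embed[list(data)].mean(axis=0)  # crude pooled hidden state
        return h @ self.W_out

    def represent(self, data: bytes) -> np.ndarray:
        """Represent: return a unit-normalized embedding of the data."""
        h = self.W_embed[list(data)].mean(axis=0)
        return h / np.linalg.norm(h)

model = ToyModel()
logits = model.predict(b"hello worl")   # logits over all 264 tokens
emb = model.represent(b"hello world")   # embedding vector, unit norm
```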
Every model on SOMA uses the same architecture. Validators define this architecture and can upgrade it network-wide. The architecture is shared, but the weights are different for each model.
Each model also registers an embedding representing its specialization — the domains it understands best. This embedding is used by the network for routing: when targets are generated, models are assigned to them via stake-weighted KNN over these specialization embeddings.
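Stake-weighted KNN routing can be sketched as follows. The scoring rule here (cosine similarity scaled by stake) is an assumption for illustration; SOMA's exact weighting may differ:

```python
import numpy as np

def assign_models(target_emb, spec_embs, stakes, k=3):
    """Rank models by cosine similarity of their specialization embedding to the
    target, scaled by stake, and return the indices of the top-k models.
    (Illustrative scoring rule, not SOMA's exact formula.)"""
    spec = spec_embs / np.linalg.norm(spec_embs, axis=1, keepdims=True)
    t = target_emb / np.linalg.norm(target_emb)
    scores = (spec @ t) * stakes  # similarity weighted by stake
    return np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(1)
spec_embs = rng.normal(size=(5, 8))                # 5 models' specialization embeddings
stakes = np.array([1.0, 5.0, 2.0, 0.5, 3.0])       # each model's stake
target = rng.normal(size=8)                        # a target embedding
assigned = assign_models(target, spec_embs, stakes, k=2)
```

Higher stake boosts a model's ranking, so stake and specialization jointly decide which targets a model sees.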
How Models Compete
When targets are created each epoch, the network assigns models to each target via stake-weighted KNN. Submitters download those models’ weights, run them locally on their data, and the model with the lowest loss wins. The winning model’s embedding of the data becomes the submission’s embedding, which is scored against the target.
Models compete indirectly: better prediction → more wins → more stake → higher KNN priority → assigned to more targets → more reward opportunities. A model earns 50% of the target reward when its weights produce the winning submission.
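From a submitter's perspective, running one round of this competition locally might look like the sketch below. The stub model and function names are hypothetical; only the flow (score each assigned model on local data, lowest loss wins, winner's embedding becomes the submission) comes from the text above:

```python
import numpy as np

rng = np.random.default_rng(2)

class StubModel:
    """Stand-in for downloaded model weights (predict + represent)."""
    def __init__(self, dim=8, vocab=264):
        self.E = rng.normal(size=(vocab, dim))
        self.W = rng.normal(size=(dim, vocab))
    def predict(self, data: bytes) -> np.ndarray:
        return self.E[list(data)].mean(axis=0) @ self.W
    def represent(self, data: bytes) -> np.ndarray:
        v = self.E[list(data)].mean(axis=0)
        return v / np.linalg.norm(v)

def cross_entropy(logits: np.ndarray, target_byte: int) -> float:
    """Next-byte cross-entropy loss for one prediction."""
    z = logits - logits.max()
    return float(-(z - np.log(np.exp(z).sum()))[target_byte])

def run_locally(models, data: bytes):
    """Score every assigned model on the submitter's own data; the lowest
    next-byte loss wins, and the winner's embedding becomes the submission."""
    ctx, nxt = data[:-1], data[-1]
    losses = [cross_entropy(m.predict(ctx), nxt) for m in models]
    winner = int(np.argmin(losses))
    return winner, models[winner].represent(data)

models = [StubModel() for _ in range(3)]
winner, submission_emb = run_locally(models, b"private data never leaves")
```

Note the data itself is only ever an input to local computation, which is what makes the privacy property in the next section possible.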
V1 Architecture
The current network architecture is a byte-level pre-norm transformer:
- Layers: 24
- Embedding dimension: 2048
- Attention heads: 8 (256 head dim)
- FFN hidden dimension: 8192 (4× expansion)
- Vocabulary: 264 tokens — 256 byte values + 8 special tokens (PAD, EOS, etc.)
- Max sequence length: 1024 bytes
- Positional encoding: RoPE
- Loss: cross-entropy + SIGReg (a Gaussian uniformity regularizer that prevents embedding collapse)
- Serialization: safetensors (PyTorch and Flax implementations available)
This is the current architecture. Validators can upgrade it network-wide by consensus.
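The V1 hyperparameters above can be collected into a config sketch. The field names are illustrative, not SOMA's actual config schema; the values come from the list above:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class V1Config:
    """V1 byte-level pre-norm transformer hyperparameters (from the spec above)."""
    n_layers: int = 24
    d_model: int = 2048
    n_heads: int = 8           # head dim = d_model // n_heads = 256
    d_ffn: int = 8192          # 4x expansion over d_model
    vocab_size: int = 264      # 256 byte values + 8 special tokens
    max_seq_len: int = 1024
    pos_encoding: str = "rope"

cfg = V1Config()
# Internal consistency checks on the published numbers:
assert cfg.d_model // cfg.n_heads == 256
assert cfg.d_ffn == 4 * cfg.d_model
assert cfg.vocab_size == 256 + 8
```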
Why Weights Are Public
All model weights on SOMA are public. Anyone can download and use them.
This seems counterintuitive: why would a model publish its weights if competitors can copy them? Two reasons:
Privacy: Submitters download model weights and compute scores locally. The data never leaves the submitter’s machine unless it wins a target. Public weights enable private data.
Proof of training: If your weights are good, others copy them. If you stop improving, you get surpassed. There’s no moat from secrecy — only from continuous innovation. The best models stay ahead by getting better, not by hiding.
Publishing Weights
If weights are public, what stops a model from copying a competitor’s weights mid-round?
SOMA uses a two-phase commit-reveal:
Commit: Post a hash of your weights before the epoch starts.
Reveal: Publish the actual weights at the epoch boundary.
The hash locks you in. If you change your weights after seeing what others submitted, the hash won’t match. Models commit before they can see competitors’ weights, then reveal simultaneously.
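The commit-reveal check reduces to comparing hashes. A minimal sketch, assuming SHA-256 as the hash function (the actual algorithm SOMA uses is not specified above):

```python
import hashlib

def commit(weights: bytes) -> str:
    """Commit phase: publish only the hash of the weights before the epoch starts."""
    return hashlib.sha256(weights).hexdigest()

def verify_reveal(commitment: str, revealed_weights: bytes) -> bool:
    """Reveal phase: the revealed weights must hash to the earlier commitment."""
    return hashlib.sha256(revealed_weights).hexdigest() == commitment

weights = b"\x00\x01\x02\x03"  # stands in for serialized safetensors bytes
c = commit(weights)

honest = verify_reveal(c, weights)                  # unchanged weights: passes
cheating = verify_reveal(c, weights + b"\xff")      # altered after commit: fails
```

Changing even one byte of the weights after committing produces a different hash, so a late swap is always detectable.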
Models pay rent to publish new weights each cycle. This fee covers the cost validators incur to store the weights, which must be refreshed to stay active in the network.
Early on, rent is reduced to encourage participation. As the network matures and model rewards increase, rent adjusts to market rates set by validator consensus.