Data Submission

Submitters are the network’s producers. Everything else — models, validators, targets — exists to coordinate their work. Here’s what the job looks like.

Each epoch, the network generates targets across embedding space. A submitter browses open targets and picks one to aim for. Each target has a set of assigned models and a distance threshold (radius).
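The target-browsing step might look like the following sketch. The `Target` fields and the selection heuristic are assumptions for illustration, not the protocol's actual schema; it only assumes each target carries an embedding, a radius, and a list of assigned models, as described above.

```python
import math

# Hypothetical target record; field names are assumptions.
class Target:
    def __init__(self, target_id, embedding, radius, model_ids):
        self.target_id = target_id
        self.embedding = embedding   # point in embedding space
        self.radius = radius         # distance threshold
        self.model_ids = model_ids   # models assigned to this target

def pick_target(open_targets, expected_embedding):
    """Naive heuristic: pick the open target whose center is closest
    to where we expect our data to land."""
    return min(open_targets, key=lambda t: math.dist(t.embedding, expected_embedding))

targets = [
    Target("t1", [0.0, 0.0], 0.5, ["m1", "m2"]),
    Target("t2", [3.0, 4.0], 0.5, ["m2", "m3"]),
]
best = pick_target(targets, [2.5, 3.5])
print(best.target_id)  # t2
```

In practice a submitter would also weigh the radius and the competition on each target, not just raw distance.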

The submitter downloads model weights for the target's assigned models using get_model_manifests(target). Weights don't change mid-epoch, so this is a one-time cost per epoch. Having local copies means the submitter can run as many computations as they want without network overhead.
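A minimal caching sketch of that download step. `get_model_manifests(target)` comes from the text; the manifest fields, the cache shape, and the `download` callable are assumptions.

```python
# Stand-in for the real network call named in the text; the returned
# manifest fields are assumptions.
def get_model_manifests(target):
    return {"m1": {"url": "https://weights.example/m1.bin"},
            "m2": {"url": "https://weights.example/m2.bin"}}

_weights_cache = {}  # kept for the whole epoch

def fetch_weights(target, download):
    """Fetch each assigned model's weights once; later scoring runs
    hit the local cache with no network overhead."""
    for model_id, manifest in get_model_manifests(target).items():
        if model_id not in _weights_cache:
            _weights_cache[model_id] = download(manifest["url"])
    return _weights_cache
```

Calling `fetch_weights` a second time in the same epoch performs no downloads, which is what makes repeated local scoring runs free.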

Data can be anything: text, images, audio, code, research, proprietary datasets. The output must be reducible to bytes. This might mean generating synthetic data, structuring proprietary archives, or commissioning domain experts.

The submitter runs all assigned models against their data using the scoring service. Each model produces a loss score (cross-entropy + SIGReg) and an embedding.
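The scoring loop can be sketched as below. `score_with_model` stands in for the scoring service; its name and signature are assumptions, but the loss composition (cross-entropy plus SIGReg) follows the text.

```python
def score_submission(data_bytes, model_ids, score_with_model):
    """Run every assigned model over the data, recording a loss
    (cross-entropy + SIGReg) and an embedding per model."""
    results = {}
    for model_id in model_ids:
        cross_entropy, sigreg, embedding = score_with_model(model_id, data_bytes)
        results[model_id] = {"loss": cross_entropy + sigreg,
                             "embedding": embedding}
    return results
```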

The model with the lowest loss wins. That model’s embedding becomes the submission’s embedding. The distance from this embedding to the target determines how competitive the submission is — closer is better.
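Winner selection is a simple argmin over losses, assuming per-model results shaped like `{model_id: {"loss": ..., "embedding": ...}}` (that shape is an assumption for illustration):

```python
import math

def pick_winner(results, target_embedding):
    """Lowest loss wins; the winner's embedding becomes the
    submission's embedding, and its distance to the target
    measures how competitive the submission is."""
    winner = min(results, key=lambda m: results[m]["loss"])
    embedding = results[winner]["embedding"]
    distance = math.dist(embedding, target_embedding)
    return winner, embedding, distance
```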

Everything goes on-chain in a single transaction: target ID, data commitment (hash), data URL, checksum, size, winning model ID, embedding, distance score, and a bond proportional to data size.
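The transaction fields above can be collected into one record. The field names mirror the list in the text, but the exact on-chain schema, the SHA-256 choice, and the linear bond rate are assumptions:

```python
from dataclasses import dataclass
import hashlib

@dataclass
class SubmissionTx:
    target_id: str
    data_commitment: str   # hash of the raw data
    data_url: str
    checksum: str
    size_bytes: int
    model_id: str          # winning model
    embedding: list
    distance: float
    bond: int              # proportional to data size

def build_tx(target_id, data, data_url, model_id, embedding, distance, bond_rate):
    """Assemble the single on-chain submission transaction."""
    digest = hashlib.sha256(data).hexdigest()
    return SubmissionTx(
        target_id=target_id,
        data_commitment=digest,
        data_url=data_url,
        checksum=digest,       # same digest doubles as the file checksum here
        size_bytes=len(data),
        model_id=model_id,
        embedding=embedding,
        distance=distance,
        bond=len(data) * bond_rate,  # bonds scale with data size
    )
```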

The bond is a security deposit. If the submission is honest, the submitter gets it back. If it’s fraudulent, the bond is forfeited. Bonds scale with data size, making large submissions expensive to fake.

At the end of the epoch, the submission closest to each target wins. A challenge window follows — one full epoch during which anyone can dispute the result by posting their own bond and re-running the computation.
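The timing rule reduces to two checks, assuming integer epoch numbers and that the dispute window is exactly the one epoch after the submission's epoch closes (both assumptions for illustration):

```python
def can_challenge(submission_epoch, current_epoch):
    """A result from epoch N can be disputed during epoch N+1,
    by posting a bond and re-running the computation."""
    return current_epoch == submission_epoch + 1

def can_claim(submission_epoch, current_epoch):
    """Rewards are claimable only once the challenge window has closed."""
    return current_epoch >= submission_epoch + 2
```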

After the challenge window closes, the winner claims rewards: 50% to the submitter, 50% to the model whose weights were used.
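The 50/50 split is trivial arithmetic; in this sketch integer division hands any odd remainder to the model side, which is an assumption, not something the protocol specifies:

```python
def split_reward(total):
    submitter_share = total // 2           # 50% to the submitter
    model_share = total - submitter_share  # 50% to the model's owner
    return submitter_share, model_share

print(split_reward(1000))  # (500, 500)
```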