Data Submission

Data submission drives the benchmark forward as an evaluation process that never finishes.

Each submission tests models against unpredictable data in a live competition. The first submission to land within a target’s threshold earns the reward and splits it evenly with the model that achieved the lowest loss on that data.

Each epoch, the network generates targets across embedding space. A submitter browses open targets and picks one to aim for. Each target has a set of assigned models and a distance threshold.

The submitter downloads model weights for the target’s assigned models. Weights don’t change mid-epoch, so this is a one-time cost per cycle.

Data can be anything: text, images, audio, video. A submitter might index a reserve of pre-fetched data with their own small embedding model, or generate synthetic data.

The submitter runs all assigned models against their data. Each model produces a loss score and an embedding. From there:

  1. The model with the lowest loss wins
  2. That model’s embedding becomes the submission’s embedding
  3. If the distance from this embedding to the target is within the threshold, the submission is valid
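The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the network's implementation: the `ModelResult` type and the `evaluate` function are hypothetical names, Euclidean distance stands in for whatever metric the network actually uses, and "within the threshold" is read as less than or equal to.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class ModelResult:
    """Output of one assigned model on the submitter's data (hypothetical type)."""
    model_id: str
    loss: float
    embedding: Sequence[float]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    # Assumption: the source does not name the distance metric.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def evaluate(
    results: Sequence[ModelResult],
    target: Sequence[float],
    threshold: float,
    distance: Callable[[Sequence[float], Sequence[float]], float] = euclidean,
) -> Tuple[ModelResult, float, bool]:
    """Return (winning model result, distance to target, validity flag)."""
    if not results:
        raise ValueError("at least one model result is required")
    # Step 1: the model with the lowest loss wins.
    winner = min(results, key=lambda r: r.loss)
    # Step 2: the winner's embedding becomes the submission's embedding.
    d = distance(winner.embedding, target)
    # Step 3: valid if the distance is within the threshold (assumed inclusive).
    return winner, d, d <= threshold
```

Passing the metric as a parameter keeps the sketch usable if the real distance function differs from the Euclidean placeholder.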

Everything is submitted in a single transaction:

  • Target ID
  • Data URL, checksum, and size
  • Winning model ID and loss score
  • Embedding and distance to the target
  • Bond: bond = submission_bond_per_byte * data_size (returned if honest, forfeited if fraudulent)
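As a rough sketch, the transaction contents and the bond formula might be assembled as follows. Only `submission_bond_per_byte` and `data_size` come from the formula above. The `Submission` type, its field names, the helper `build_submission`, and the use of SHA-256 for the checksum are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Sequence


def compute_bond(submission_bond_per_byte: int, data_size: int) -> int:
    # bond = submission_bond_per_byte * data_size
    return submission_bond_per_byte * data_size


@dataclass(frozen=True)
class Submission:
    """Hypothetical layout of the single submission transaction."""
    target_id: str
    data_url: str
    data_checksum: str
    data_size: int
    model_id: str
    loss: float
    embedding: Sequence[float]
    distance: float
    bond: int


def build_submission(
    target_id: str,
    data_url: str,
    data: bytes,
    model_id: str,
    loss: float,
    embedding: Sequence[float],
    distance: float,
    submission_bond_per_byte: int,
) -> Submission:
    size = len(data)
    return Submission(
        target_id=target_id,
        data_url=data_url,
        # Assumption: the checksum algorithm is not specified in the text.
        data_checksum=hashlib.sha256(data).hexdigest(),
        data_size=size,
        model_id=model_id,
        loss=loss,
        embedding=tuple(embedding),
        distance=distance,
        bond=compute_bond(submission_bond_per_byte, size),
    )
```

Because the bond scales with data size, larger payloads put more of the submitter's funds at risk if the submission is later judged fraudulent.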

Once a valid submission is accepted for a target:

  1. The target is marked as ‘filled’ and a new one spawns immediately
  2. In the next epoch, a random validator audits the submission
  3. If the auditor deems the submission fraudulent, it escalates to the other validators. If validators holding 2/3 of the stake vote against it, the submission is ineligible for rewards and the bond is forfeited
  4. If the submission passes the audit period, the target’s reward may be claimed in the following epoch
    • 50% to the submitter
    • 50% to the model whose weights were used
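The audit outcome and the reward split can be illustrated with a small sketch. The function names are hypothetical, and two details are assumptions rather than facts from the text: "2/3 stake" is read as at least two thirds of total stake, and any odd unit left over from an integer split goes to the model.

```python
from fractions import Fraction
from typing import Dict, Tuple


def is_rejected(stake_against: int, total_stake: int) -> bool:
    """True if validators holding at least 2/3 of stake voted against."""
    if total_stake <= 0:
        raise ValueError("total_stake must be positive")
    # Fractions avoid floating-point error right at the 2/3 boundary.
    return Fraction(stake_against, total_stake) >= Fraction(2, 3)


def split_reward(reward: int) -> Tuple[int, int]:
    """Split a reward 50/50 between submitter and model."""
    submitter = reward // 2
    model = reward - submitter  # assumption: the odd remainder goes to the model
    return submitter, model


def settle(reward: int, bond: int, stake_against: int, total_stake: int) -> Dict[str, int]:
    """Compute payouts once the audit period has ended."""
    if is_rejected(stake_against, total_stake):
        # Fraudulent: no rewards are paid and the bond is forfeited.
        return {"submitter": 0, "model": 0, "bond_returned": 0}
    submitter, model = split_reward(reward)
    return {"submitter": submitter, "model": model, "bond_returned": bond}
```

For example, `settle(100, 50, 10, 100)` pays 50 to each party and returns the full bond, while `settle(100, 50, 70, 100)` pays nothing and returns no bond.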