Engineering · Attestation

Signing every benchmark with ML-DSA. What a receipt actually proves.


Every vector DB vendor publishes benchmarks. Most of them publish a PDF. A PDF is a thing you have to believe the vendor about. A signed manifest that hashes every artifact in the run is a thing you can verify yourself. This is the schema we use, why it is the way it is, and what it actually buys you as a reader.

The receipt, concretely

A receipt is a JSON document. It pins down the environment, hashes every artifact produced by the run, commits to those hashes, and signs the commitment. Current shape:

{
  "schema": "holonomx-receipt/v1",
  "run_id": "2026-04-11T21:43:00Z-hx-sdp-100m-h100",
  "hardware": {
    "gpu": "NVIDIA H100 80GB HBM3",
    "cuda": "12.8",
    "driver": "550.54.15"
  },
  "software": {
    "hx_engine_sha": "…",
    "hx_gate_sha": "…"
  },
  "workload": {
    "entries": 100000000,
    "dim": 384,
    "precision": "fp32",
    "path": "QTT-Native",
    "query_set_sha256": "…",
    "dataset_sha256": "…"
  },
  "arithmetic": "IEEE 754",
  "hash": { "alg": "SHA-256", "standard": "FIPS 180-4" },
  "results": {
    "recall_at_10": 1.0,
    "p50_ms": 20.4,
    "p99_ms": 21.5
  },
  "artifacts": [
    { "name": "query_latency.csv", "sha256": "…" },
    { "name": "recall_curve.csv", "sha256": "…" },
    { "name": "memory_profile.csv", "sha256": "…" }
  ],
  "signatures": [
    { "alg": "ML-DSA-65", "standard": "FIPS 204", "scope": "per-run",   "value": "…" },
    { "alg": "ML-DSA-65", "standard": "FIPS 204", "scope": "aggregate", "value": "…" },
    { "alg": "Ed25519",   "scope": "classical-backup", "value": "…" }
  ]
}
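The schema above does not say how the commitment is computed, so what follows is only a sketch of one way "commit, then sign the commitment" could work. It assumes the commitment is a SHA-256 digest over the receipt with its signatures field removed, serialized with sorted keys and no insignificant whitespace. The names canonical_bytes and commitment are hypothetical, and a cross-language implementation would want a specified canonical form such as RFC 8785 rather than one library's JSON output.

```python
import hashlib
import json


def canonical_bytes(receipt: dict) -> bytes:
    """Serialize the receipt deterministically, leaving out the signatures.

    Assumption: signatures cover everything except the "signatures" field.
    """
    body = {k: v for k, v in receipt.items() if k != "signatures"}
    text = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def commitment(receipt: dict) -> str:
    """Hex SHA-256 digest of the canonical receipt body (the value a signer would sign)."""
    return hashlib.sha256(canonical_bytes(receipt)).hexdigest()
```

Under these assumptions, reordering keys or attaching another signature leaves the commitment unchanged, while changing any result or artifact hash changes it.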

Why ML-DSA-65 on everything

ML-DSA is the FIPS 204 post-quantum signature standard based on CRYSTALS-Dilithium. The spec defines three parameter sets: ML-DSA-44, ML-DSA-65, ML-DSA-87. We picked one and use it everywhere:

  • ML-DSA-65 on every signed artifact. Per-run receipts, aggregate manifests, key-commitment artifacts. One parameter set across the entire chain so verification tooling never has to choose at runtime. Matches NIST Category 3.
  • SHA-256 (FIPS 180-4) hashes every artifact. IEEE 754 governs every numerical result in the manifest. Both standards are named explicitly in the receipt so a reviewer does not have to guess.
  • Ed25519 classical backup. Run alongside ML-DSA-65 for the window where reviewers are more comfortable verifying classical signatures. This goes away once PQC tooling is ubiquitous.
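One consequence of a single parameter set is that a verifier can reject anything unexpected before it touches any cryptography. A minimal sketch of that policy check follows. The names ALLOWED_ALGS, REQUIRED_PQ_SCOPES, and policy_violations are hypothetical, and requiring both scopes on every receipt is an assumption drawn from the example receipt, not a documented rule.

```python
# Allowed algorithms and the standard each one must name (None = no standard field expected).
ALLOWED_ALGS = {"ML-DSA-65": "FIPS 204", "Ed25519": None}

# Assumption: every receipt carries an ML-DSA-65 signature for both of these scopes.
REQUIRED_PQ_SCOPES = {"per-run", "aggregate"}


def policy_violations(signatures: list) -> list:
    """Return human-readable policy problems; an empty list means the entries look as expected."""
    problems = []
    for i, sig in enumerate(signatures):
        alg = sig.get("alg")
        if alg not in ALLOWED_ALGS:
            problems.append(f"signature {i}: unexpected alg {alg!r}")
            continue
        standard = ALLOWED_ALGS[alg]
        if standard is not None and sig.get("standard") != standard:
            problems.append(f"signature {i}: {alg} must name {standard}")
    pq_scopes = {s.get("scope") for s in signatures if s.get("alg") == "ML-DSA-65"}
    for scope in sorted(REQUIRED_PQ_SCOPES - pq_scopes):
        problems.append(f"missing ML-DSA-65 signature with scope {scope!r}")
    return problems
```

This only checks the shape of the signature entries. Verifying the signature values themselves still requires an ML-DSA implementation and the pinned public keys.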

What a signature does not prove

A valid signature only proves that someone holding our signing key vouched for the exact manifest content. It does not prove the code that produced those results was the code we described, that the GPU was configured the way we said, or that the dataset was what it claimed to be.

Independent verification closes those gaps: re-run the workload with the hx_engine_sha in the manifest against the dataset whose SHA-256 is in the manifest. If the results reproduce inside published tolerance, the signature was worth something. If they don't, the signature is a thumb on a scale. We sign so that this comparison is possible; we don't sign to avoid making it.
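The comparison step can be sketched as below. The tolerance values are placeholders, not published figures, and out_of_tolerance is a hypothetical helper: it treats each metric's tolerance as a relative bound and demands exact equality for any metric without one.

```python
import math


def out_of_tolerance(published: dict, observed: dict, rel_tol: dict) -> dict:
    """Map each metric that failed to reproduce to its (published, observed) pair.

    A metric with no entry in rel_tol must match exactly.
    A metric missing from the re-run counts as a failure.
    """
    failures = {}
    for metric, expected in published.items():
        got = observed.get(metric)
        tol = rel_tol.get(metric, 0.0)
        if got is None or not math.isclose(got, expected, rel_tol=tol, abs_tol=0.0):
            failures[metric] = (expected, got)
    return failures


# Values from the "results" block of the receipt shown earlier.
published = {"recall_at_10": 1.0, "p50_ms": 20.4, "p99_ms": 21.5}

# Placeholder tolerances for illustration only: 5% on latency, exact on recall.
tolerances = {"p50_ms": 0.05, "p99_ms": 0.05}
```

An empty result means every metric reproduced within its bound. Anything else names the metrics a reader should ask about.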

Verification in three steps

Each receipt bundle under /proof ships a verify_receipt.py script alongside the manifest, signatures, and artifact hashes. The loop is:

# 1. pull the receipt bundle
curl -L https://holonomx.com/proof/receipts/hx-sdp-100m-h100.tar.gz | tar xz
cd hx-sdp-100m-h100/

# 2. verify every signature against the pinned public keys
python verify_receipt.py --bundle .

# 3. recompute every artifact hash and compare to manifest
python verify_receipt.py --hashes-only --bundle .

Each verify_receipt.py invocation exits non-zero if a signature or hash check fails. There is no ceremony, no vendor portal, no login wall. If a reader wants to trust but verify, the tooling ships with the claim.
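For readers who would rather not run a vendor script at all, the hash check is small enough to rewrite. The sketch below is not the bundled verify_receipt.py. It is an independent illustration with hypothetical function names, and it assumes only the "artifacts" list shape from the receipt above.

```python
import hashlib
from pathlib import Path
from typing import Callable


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 (FIPS 180-4) digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def failed_artifacts(manifest: dict, load: Callable[[str], bytes]) -> list:
    """Return the names of artifacts that are missing or whose hash differs from the manifest."""
    failed = []
    for entry in manifest["artifacts"]:
        name = entry["name"]
        try:
            digest = sha256_hex(load(name))
        except (OSError, KeyError):
            failed.append(name)  # an artifact that cannot be read counts as a failure
            continue
        if digest != entry["sha256"].lower():
            failed.append(name)
    return failed


def exit_code_for(bundle_dir: str, manifest: dict) -> int:
    """Return 0 if every artifact in bundle_dir matches the manifest, 1 otherwise."""
    def load(name: str) -> bytes:
        return (Path(bundle_dir) / name).read_bytes()

    return 1 if failed_artifacts(manifest, load) else 0
```

Passing the loader as a function keeps the comparison logic separate from the file system, which makes it easy to exercise against in-memory data before pointing it at a real bundle.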

Where this is going

Two things on the roadmap are flagged as "planned" on /trust rather than shipped:

  • Lean 4 proofs of the conservation invariants the engine checks at run time, so "passes the harness" becomes "compiles against the formal spec."
  • Halo2 ZK circuits over a bounded subset of the query path, so a reader can verify specific properties without re-running the workload.

These don't exist yet. When they do, a new receipt field will carry the proof and this post will get an update with the verification loop for it. We would rather publish a small, real attestation story than a wide, aspirational one.