Label what you generate. Prove what you trained on.
Hashproof signs generated output with durable C2PA provenance, records training-data lineage you can answer membership queries against, and produces EU AI Act disclosure reports from your stored corpus. It builds on C2PA, the standard regulators and platforms are converging on.
What is going wrong
Three pressures arriving at once: regulation, rights, and durability.
Disclosure is becoming mandatory
EU AI Act Article 50 requires synthetic content to be marked as such in a machine-readable format, with obligations applying from 2 August 2026. Voluntary labeling that any tool can strip does not satisfy a regulator asking for a verifiable, auditable record.
Training-data lineage is opaque
When a rights holder asks whether their work was in a training set, or a downstream user asks what a model was trained on, most labs cannot answer with evidence. The provenance of training corpora is rarely captured in a checkable form.
Output labels do not survive the internet
A generated image typically loses its metadata the first time it is screenshotted or re-encoded by a platform. A label that lives only in embedded metadata such as EXIF is gone within one hop, exactly when downstream systems need to know the content is synthetic.
What Hashproof brings
Four primitives spanning output labeling and training-data lineage.
Store synthetic-content manifests
Store the C2PA manifests your generation pipeline signs, including AI-generation assertions such as c2pa.ai_generative. Stored manifests carrying those assertions are counted in EU AI Act compliance reports and stay retrievable by ID.
Prove training-data lineage
Register datasets and ingest fingerprints, then answer membership queries: does this asset appear in a training set? The training-data surface gives rights holders and auditors a checkable answer instead of a denial.
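As a rough illustration of the register, ingest, and query flow, the sketch below keeps everything in memory. It is not the Hashproof API: the `DatasetRegistry` class, its method names, the 64-bit fingerprint format, and the `max_distance` tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


@dataclass
class DatasetRegistry:
    """Hypothetical in-memory stand-in for a dataset registration service."""

    datasets: Dict[str, Set[int]] = field(default_factory=dict)

    def register(self, dataset_id: str) -> None:
        # Create an empty fingerprint set the first time a dataset is seen.
        self.datasets.setdefault(dataset_id, set())

    def ingest(self, dataset_id: str, fingerprints: List[int]) -> None:
        # Only fingerprints are stored, never the underlying assets.
        self.datasets[dataset_id].update(fingerprints)

    def membership(self, fingerprint: int, max_distance: int = 4) -> List[str]:
        """IDs of datasets holding a fingerprint within max_distance bits."""
        return sorted(
            dataset_id
            for dataset_id, prints in self.datasets.items()
            if any(hamming(fingerprint, p) <= max_distance for p in prints)
        )


registry = DatasetRegistry()
registry.register("corpus-2024")
registry.register("licensed-stock")
registry.ingest("corpus-2024", [0xFFFF0000FFFF0000, 0x0123456789ABCDEF])
registry.ingest("licensed-stock", [0x00000000FFFFFFFF])

near_copy = 0xFFFF0000FFFF0001  # one bit away from an ingested fingerprint
print(registry.membership(near_copy))           # ['corpus-2024']
print(registry.membership(0xAAAAAAAAAAAAAAAA))  # []
```

A linear scan like this only suits small examples. A production service would need an index built for nearest-neighbor lookups over fingerprints.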
Labels that survive re-encoding
Soft-binding resolution reconnects a screenshotted or re-compressed generation back to its manifest. Downstream platforms can still determine an asset is synthetic after it has been through the wringer.
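One common form of soft binding is a perceptual fingerprint that changes little when an image is re-encoded. The sketch below uses a simple average hash over an 8x8 grayscale grid to show the idea. The algorithm choice, the `resolve` helper, the manifest ID, and the distance threshold are assumptions for illustration, not a description of the fingerprint or watermark Hashproof actually uses.

```python
from typing import Dict, List, Optional


def average_hash(pixels: List[List[int]]) -> int:
    """64-bit average hash of an 8x8 grayscale grid: one bit per pixel above the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def resolve(fingerprint: int, index: Dict[int, str], max_distance: int = 10) -> Optional[str]:
    """Return the manifest ID of the nearest stored fingerprint, if it is close enough."""
    if not index:
        return None
    nearest = min(index, key=lambda stored: hamming(stored, fingerprint))
    return index[nearest] if hamming(nearest, fingerprint) <= max_distance else None


# A smooth gradient stands in for a generated image already downscaled to 8x8.
original = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
# Small per-pixel changes stand in for re-compression noise.
recompressed = [
    [max(0, min(255, original[r][c] + ((r + c) % 3) - 1)) for c in range(8)]
    for r in range(8)
]
# A checkerboard stands in for an unrelated image.
unrelated = [[255 if (r + c) % 2 == 0 else 0 for c in range(8)] for r in range(8)]

index = {average_hash(original): "manifest-001"}  # hypothetical manifest ID
print(resolve(average_hash(recompressed), index))  # manifest-001
print(resolve(average_hash(unrelated), index))     # None
```

The threshold trades recall against false matches: a looser value recovers more heavily degraded copies but raises the chance of linking an unrelated asset to a manifest.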
Disclosure reports from the corpus
Generate EU AI Act Article 50 reports directly from your stored manifests. Each finding cites its manifest, so disclosure becomes a query against the record, not a manual audit.
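To make "a query against the record" concrete, the sketch below builds a small report from a list of stored manifests. The manifest shape (an `id` plus a list of `assertions` with a `label`), the `build_report` function, and the report fields are assumptions for the example, and the actual report format may differ.

```python
from typing import Dict, List

# Assertion labels treated as AI-generation markers. The single entry is the
# label named in the text; treat the set as configurable.
AI_ASSERTION_LABELS = {"c2pa.ai_generative"}


def build_report(manifests: List[Dict]) -> Dict:
    """Count manifests carrying an AI-generation assertion and cite each by ID."""
    findings = []
    for manifest in manifests:
        labels = {a["label"] for a in manifest.get("assertions", [])}
        matched = sorted(labels & AI_ASSERTION_LABELS)
        if matched:
            findings.append({"manifest_id": manifest["id"], "assertions": matched})
    return {
        "total_manifests": len(manifests),
        "ai_labeled": len(findings),
        "findings": findings,
    }


# Hypothetical stored manifests, reduced to the fields the report needs.
stored = [
    {"id": "m-001", "assertions": [{"label": "c2pa.ai_generative"}]},
    {"id": "m-002", "assertions": [{"label": "c2pa.actions"}]},
    {"id": "m-003", "assertions": [{"label": "c2pa.ai_generative"}, {"label": "c2pa.actions"}]},
]

report = build_report(stored)
print(report["ai_labeled"], "of", report["total_manifests"])  # 2 of 3
print([f["manifest_id"] for f in report["findings"]])         # ['m-001', 'm-003']
```

Because every finding carries a manifest ID, a reviewer can fetch the cited manifest and check the assertion directly rather than trusting the summary count.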
Where it sits in the lifecycle
From training-data registration to disclosure. Your training and inference stacks stay in place.
01
Register data
Register training datasets and ingest perceptual fingerprints through the API. The corpus becomes queryable for membership without exposing the underlying assets.
02
Store output manifests
Your generation pipeline stores a C2PA manifest for each output, carrying the AI-generation assertions your signing tooling adds. The stored record is retrievable and verifiable through the API.
03
Answer queries
Rights holders and auditors check membership against your datasets, and verify any generated asset against its manifest, including re-encoded copies resolved by soft binding.
04
Report
The compliance endpoint produces Article 50 disclosure reports from the stored corpus, with each finding cited by manifest ID, ready for a regulator or a partner.
How Hashproof is different
Disclosure infrastructure that does not ask you to expose the model.
The standard regulators are converging on
C2PA is referenced across emerging AI-content rules and is developed under the Linux Foundation's Joint Development Foundation by a membership spanning model providers, platforms, and hardware makers. Building on it means your disclosures verify in the same tools regulators and platforms use.
Provenance without exposing the model
Manifests record that content is synthetic, and fingerprint-based membership queries answer dataset questions, without revealing weights, prompts, or proprietary pipeline detail. Selective disclosure lets verification reveal only the fields you choose and replace the rest with their hashes. A zero-knowledge proof that confirms valid provenance while revealing nothing else is still in development and is available today only as a labeled beta preview.
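The selective-disclosure idea can be sketched with salted hashes: every field is committed to by a digest, and only the chosen fields are later revealed alongside the salt needed to check them. This is a generic illustration under stated assumptions, not Hashproof's scheme. The function names, the record fields, and the per-field salting are hypothetical, and a real deployment would also sign the digests.

```python
import hashlib
import json
import secrets
from typing import Any, Dict, Set, Tuple


def field_digest(name: str, value: Any, salt: str) -> str:
    """Salted SHA-256 over a compact JSON encoding of one field."""
    payload = json.dumps([name, value, salt], separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def commit(record: Dict[str, Any]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (public digests, private salts) covering every field of the record."""
    salts = {name: secrets.token_hex(16) for name in record}
    digests = {name: field_digest(name, value, salts[name]) for name, value in record.items()}
    return digests, salts


def disclose(record: Dict[str, Any], salts: Dict[str, str], reveal: Set[str]) -> Dict[str, Dict]:
    """Reveal only the chosen fields, each with the salt needed to check it."""
    return {name: {"value": record[name], "salt": salts[name]} for name in reveal}


def verify(disclosure: Dict[str, Dict], digests: Dict[str, str]) -> bool:
    """Check every revealed field against its published digest."""
    return all(
        field_digest(name, item["value"], item["salt"]) == digests.get(name)
        for name, item in disclosure.items()
    )


record = {"synthetic": True, "generator": "model-x", "prompt": "a red bicycle"}
digests, salts = commit(record)

# The verifier sees digests for all fields but the value of only one.
disclosure = disclose(record, salts, {"synthetic"})
print(verify(disclosure, digests))  # True

# Changing a revealed value breaks the match with its digest.
tampered = {"synthetic": {"value": False, "salt": salts["synthetic"]}}
print(verify(tampered, digests))  # False
```

The per-field salt matters for low-entropy values: without it, a verifier could guess a hidden field such as a boolean by hashing each candidate and comparing against the digest.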
A substrate, not a content moderator
Hashproof labels and verifies. It does not judge outputs or gate generation. The provenance layer states facts about an asset; policy decisions stay yours.
Compliance fit
One provenance layer across disclosure, transparency, and interoperability requirements.
EU AI Act Article 50
Article 50 requires machine-readable marking of synthetic content and provider transparency. Hashproof produces disclosure reports from your stored manifests. Article 50 obligations apply from 2 August 2026.
Training-data transparency
Dataset registration and membership queries give a verifiable answer to what a model was trained on, supporting transparency obligations and rights-holder inquiries.
Provenance interoperability
Manifests follow the C2PA 2.x specification and can federate with peer registrars, so a disclosure made once is checkable everywhere the standard is read.