Abstract
We present DECORAIT, a decentralized registry through which content creators
may assert their right to opt in or out of AI training as well as receive
reward for their contributions. Generative AI (GenAI) enables images to be
synthesized using AI models trained on vast amounts of data scraped from public
sources. Model and content creators who may wish to share their work openly
without sanctioning its use for training are thus presented with a data
governance challenge. Further, establishing the provenance of GenAI training
data is important to creatives to ensure fair recognition and reward for
such use. We report a prototype of DECORAIT, which explores hierarchical
clustering and a combination of on/off-chain storage to create a scalable
decentralized registry to trace the provenance of GenAI training data in order
to determine training consent and reward creatives who contribute that data.
DECORAIT combines distributed ledger technology (DLT) with visual
fingerprinting, leveraging the emerging C2PA (Coalition for Content Provenance
and Authenticity) standard to create a secure, open registry through which
creatives may express consent and data ownership for GenAI.