What question did this study set out to answer?

This research aims to enhance data privacy in AI systems by introducing a cryptographic framework that safeguards personal information and proprietary ideas.

February 8, 2026 | Open Access

Tokenis: Cryptographic Pseudonymization and Idea Protection — Using EC-ElGamal for Privacy-Preserving AI Systems

Key Points

Developed Tokenis, a pseudonymization framework combining NER-based data selection with EC-ElGamal encryption.
Established a formal security model based on the Decisional Diffie-Hellman assumption.
Proved three essential security properties: PII confidentiality, idea reconstruction resistance, and adaptive query safety.
Integrated Tokenis with the TorusDB RAG platform.
Successfully demonstrated that Tokenis prevents reconstruction and replication of proprietary rules and algorithms.
Maintained AI-driven reasoning capabilities over protected knowledge without exposing it.
Validated the framework's effectiveness against current anonymization and tokenization techniques.
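
For readers unfamiliar with the Decisional Diffie-Hellman assumption named above, its standard statement for an elliptic-curve group can be written as follows. The notation (base point G of prime order q, secret scalars a, b, c) is assumed here for illustration and may differ from the formulation in the full paper.

```latex
% DDH: no efficient adversary can tell a real Diffie-Hellman triple
% from one whose last element is an unrelated random multiple of G.
\[
\bigl(G,\; aG,\; bG,\; abG\bigr)
\;\approx_c\;
\bigl(G,\; aG,\; bG,\; cG\bigr),
\qquad a, b, c \xleftarrow{\$} \mathbb{Z}_q
\]
```

Here \approx_c denotes computational indistinguishability. The semantic security of ElGamal encryption is conventionally reduced to this assumption.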

Abstract

As AI systems increasingly rely on private and proprietary data, conventional data anonymization and tokenization techniques have proven insufficient to prevent information leakage during retrieval-augmented generation (RAG) and autonomous AI agent execution. Surface-level identifier masking leaves the deeper semantic structure of domain knowledge (decision rules, scoring formulas, and operational procedures) fully exposed to reconstruction through iterative querying. This paper presents Tokenis, a cryptographic pseudonymization framework that combines Named Entity Recognition (NER)-based data selection with elliptic-curve ElGamal (EC-ElGamal) encryption to protect both personal data and high-value domain ideas. Unlike existing approaches that focus solely on personally identifiable information (PII), Tokenis introduces an idea protection protocol that prevents the reconstruction and replication of proprietary rules, strategies, and algorithms while still enabling AI-driven reasoning over the protected knowledge. We present a formal security model grounded in the Decisional Diffie-Hellman (DDH) assumption, prove three core security properties (PII confidentiality, idea reconstruction resistance, and adaptive query safety), and demonstrate end-to-end integration with the TorusDB RAG platform. Tokenis serves as a foundational privacy layer for encrypted RAG systems and autonomous AI agencies, enabling controlled utilisation of sensitive domain knowledge without exposing it to the model.
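
To make the EC-ElGamal building block concrete, the following is a minimal sketch of textbook EC-ElGamal over a deliberately tiny curve. It is not the Tokenis implementation. The curve (y^2 = x^3 + 2x + 2 over F_17, base point (5, 1) of order 19), the function names, and the idea of encoding a detected entity as a small integer multiple of G are all assumptions made for illustration. A real deployment would use a standard curve and a vetted cryptographic library.

```python
import secrets

# Toy parameters (illustrative only, NOT secure):
# curve y^2 = x^3 + 2x + 2 over F_17; G = (5, 1) generates a group of prime order 19.
P_MOD, A = 17, 2
G = (5, 1)
ORDER = 19
INF = None  # point at infinity (group identity)


def add(p1, p2):
    """Add two curve points using the affine group law."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF  # p2 is the inverse of p1
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)


def neg(p):
    """Return the additive inverse of a point."""
    return INF if p is INF else (p[0], -p[1] % P_MOD)


def mul(k, p):
    """Scalar multiplication by double-and-add."""
    k %= ORDER
    result = INF
    while k:
        if k & 1:
            result = add(result, p)
        p = add(p, p)
        k >>= 1
    return result


def keygen():
    """Return (secret scalar, public point)."""
    sk = secrets.randbelow(ORDER - 1) + 1
    return sk, mul(sk, G)


def encrypt(pk, message_point):
    """Encrypt a curve point: (rG, M + r*pk) with a fresh random r."""
    r = secrets.randbelow(ORDER - 1) + 1
    return mul(r, G), add(message_point, mul(r, pk))


def decrypt(sk, ciphertext):
    """Recover M = c2 - sk*c1."""
    c1, c2 = ciphertext
    return add(c2, neg(mul(sk, c1)))


# Hypothetical usage: an entity found by NER is mapped to index 7,
# encoded as the point 7*G, then encrypted to form its pseudonym.
sk, pk = keygen()
entity_point = mul(7, G)
pseudonym = encrypt(pk, entity_point)
recovered = decrypt(sk, pseudonym)
```

Because a fresh random r is drawn for every encryption, the same entity generally yields different ciphertexts on repeated runs. This is the property that DDH-based security arguments rely on.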
