What question did this study set out to answer?

The study asks how regulated enterprises can keep generative AI compliant when rules change quickly, given that traditional alignment methods cannot adapt without costly retraining.

May 17, 2026 | Open Access

Decoupling Intelligence from Governance: A Dynamic Bilateral Architecture for Real-Time Enterprise AI Compliance

Key Points

Targets the “Static Alignment Trap”: traditional safety methods such as Reinforcement Learning from Human Feedback (RLHF) cannot follow shifting compliance requirements without costly retraining.
Introduces the Agreement Validation Interface (AVI), a modular governance architecture that acts as a deterministic middleware layer.
Validates AVI on the FinanceBench benchmark: 150 queries, three repeated runs, 450 observations in total.
Adds a cross-domain analysis on a proprietary dataset of 201 Russian-language provocative queries.
AVI reached an 83.2% LLM-judge compliance rate versus a 63.7% unfiltered baseline (Δ = +19.5 pp, t = 4.02, p = 0.002).
The vector-based input filter achieved perfect detection (Precision = 1.000, Recall = 1.000).
Operational Time-to-Compliance for new rules fell from hours (model fine-tuning) to under five seconds (vector indexing).
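
For readers who want to sanity-check the headline figures, the following is a minimal sketch of the standard precision, recall, and F1 definitions, plus the arithmetic behind the reported improvement. The function name and the confusion counts are hypothetical; the source reports only the resulting rates, not the underlying counts.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard detection metrics from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: a filter with no false positives and no false negatives
# scores 1.0 on all three metrics, matching the "perfect detection" claim.
print(precision_recall_f1(tp=40, fp=0, fn=0))  # (1.0, 1.0, 1.0)

# Figures taken from the key points above.
total_observations = 150 * 3          # 150 queries x 3 runs = 450
delta_pp = round(83.2 - 63.7, 1)      # 19.5 percentage points
print(total_observations, delta_pp)   # 450 19.5
```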

Abstract

The widespread adoption of Generative Artificial Intelligence (GenAI) in regulated enterprises is currently hindered by the “Static Alignment Trap”: the inability of traditional safety methods, such as Reinforcement Learning from Human Feedback (RLHF), to adapt to rapidly shifting compliance landscapes without costly retraining. This paper introduces and evaluates the Agreement Validation Interface (AVI), a modular governance architecture that functions as a deterministic middleware layer. By decoupling governance from the core inference engine, AVI implements Dynamic Bilateral Alignment (DBA), enforcing policy constraints at both the input and output stages through vector-based semantic retrieval. Adopting a Design Science Research (DSR) methodology, we validated the system against the FinanceBench financial benchmark (N = 150 queries, three repeated runs, 450 total observations) and a proprietary Russian-language provocative content dataset developed internally at MWS AI (N = 201 queries; not publicly available). The empirical results demonstrate that the architecture achieves an 83.2% Large Language Model (LLM)-judge compliance rate (95% confidence interval, CI: 79.4–87.1%), statistically significantly exceeding the unfiltered baseline of 63.7% (Δ = +19.5 percentage points (pp), t = 4.02, p = 0.002). The vector-based input filter achieves perfect detection performance (Precision = 1.000, Recall = 1.000, F1 = 1.000). Cross-domain validation on 201 Russian-language provocative queries confirms generalizability (Recall = 0.985, LLM compliance among triggered queries = 0.977). The operational Time-to-Compliance for enforcing new rules was reduced from hours (model fine-tuning) to under five seconds (vector indexing). These findings suggest that enterprise AI safety requires an architectural shift from model-centric training to system-centric control, complemented by system-prompt-level anti-inference governance. We conclude that AVI offers a scalable, cost-effective, and statistically validated framework for auditable AI compliance, independent of the underlying model provider.
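
To make the idea of bilateral, retrieval-based governance concrete, here is a minimal sketch of a middleware layer that checks both the prompt and the model's answer against a policy index before anything is returned. It is an illustration under stated assumptions, not the AVI implementation: `embed`, `PolicyIndex`, `governed_call`, the 0.5 threshold, and the sample rules are all hypothetical, and the toy bag-of-words embedding stands in for whatever sentence encoder and vector store a real deployment would use.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field
from typing import Callable


def embed(text: str) -> dict[str, float]:
    """Toy bag-of-words vector; a stand-in for a real sentence encoder."""
    counts: dict[str, float] = {}
    for token in text.lower().split():
        token = token.strip(".,!?;:")
        if token:
            counts[token] = counts.get(token, 0.0) + 1.0
    return counts


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class PolicyIndex:
    """Hypothetical rule store: adding a rule only indexes a vector, no retraining."""
    threshold: float = 0.5  # assumed similarity cut-off
    rules: list[tuple[str, dict[str, float]]] = field(default_factory=list)

    def add_rule(self, text: str) -> None:
        self.rules.append((text, embed(text)))

    def violates(self, text: str) -> bool:
        vec = embed(text)
        return any(cosine(vec, rule_vec) >= self.threshold for _, rule_vec in self.rules)


def governed_call(
    prompt: str,
    model: Callable[[str], str],
    input_policy: PolicyIndex,
    output_policy: PolicyIndex,
) -> str:
    """Enforce policy on the way in and on the way out; the model is untouched."""
    if input_policy.violates(prompt):
        return "BLOCKED_INPUT"
    answer = model(prompt)
    if output_policy.violates(answer):
        return "BLOCKED_OUTPUT"
    return answer


if __name__ == "__main__":
    input_policy = PolicyIndex()
    output_policy = PolicyIndex()
    output_policy.add_rule("insider trading tip")

    def fake_model(prompt: str) -> str:
        # Placeholder for any LLM provider.
        return "revenue was stable"

    query = "share customer passport numbers"
    print(governed_call(query, fake_model, input_policy, output_policy))
    input_policy.add_rule("customer passport numbers")  # new rule takes effect at once
    print(governed_call(query, fake_model, input_policy, output_policy))
```

In this sketch the second call is blocked purely because a rule vector was appended to the index, which mirrors the abstract's point that enforcing a new rule is an indexing operation and not a fine-tuning run. The model callable is interchangeable, which is the sense in which governance is decoupled from the inference engine.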
