The widespread adoption of Generative Artificial Intelligence (GenAI) in regulated enterprises is currently hindered by the “Static Alignment Trap”: the inability of traditional safety methods, such as Reinforcement Learning from Human Feedback (RLHF), to adapt to rapidly shifting compliance landscapes without costly retraining. This paper introduces and evaluates the Agreement Validation Interface (AVI), a modular governance architecture that functions as a deterministic middleware layer. By decoupling governance from the core inference engine, AVI implements Dynamic Bilateral Alignment (DBA), enforcing policy constraints at both the input and output stages through vector-based semantic retrieval. Adopting a Design Science Research (DSR) methodology, we validated the system against the FinanceBench financial benchmark (N = 150 queries, three repeated runs, 450 total observations) and a proprietary Russian-language provocative content dataset developed internally at MWS AI (N = 201 queries; not publicly available). The empirical results demonstrate that the architecture achieves an 83.2% Large Language Model (LLM)-judge compliance rate (95% confidence interval, CI: 79.4–87.1%), a statistically significant improvement over the unfiltered baseline of 63.7% (Δ = +19.5 percentage points, pp; t = 4.02, p = 0.002). The vector-based input filter achieves perfect detection performance (Precision = 1.000, Recall = 1.000, F1 = 1.000). Cross-domain validation on 201 Russian-language provocative queries confirms generalizability (Recall = 0.985, LLM compliance among triggered queries = 0.977). The operational Time-to-Compliance for enforcing new rules was reduced from hours (model fine-tuning) to under five seconds (vector indexing). These findings suggest that enterprise AI safety requires an architectural shift from model-centric training to system-centric control, complemented by system-prompt-level anti-inference governance. We conclude that AVI offers a scalable, cost-effective, and statistically validated framework for auditable AI compliance, independent of the underlying model provider.
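To make the bilateral idea concrete, the following is a minimal sketch of a middleware layer that checks both the prompt and the model's answer against a policy index before anything reaches the user. It is not the authors' implementation: the names `embed`, `PolicyIndex`, `governed_call`, the 0.5 similarity threshold, and the blocked-response strings are all hypothetical, and a toy bag-of-words vector stands in for whatever embedding model and vector store a real deployment would use.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class Policy:
    rule_id: str
    text: str


class PolicyIndex:
    """In-memory policy store; adding a rule is an index write, not retraining."""

    def __init__(self, threshold: float = 0.5) -> None:  # threshold is an assumption
        self.threshold = threshold
        self._entries: list[tuple[Policy, Counter]] = []

    def add(self, policy: Policy) -> None:
        self._entries.append((policy, embed(policy.text)))

    def match(self, text: str) -> Optional[Policy]:
        """Return the most similar policy if it clears the threshold, else None."""
        query = embed(text)
        best: Optional[Policy] = None
        best_score = 0.0
        for policy, vector in self._entries:
            score = cosine(query, vector)
            if score > best_score:
                best, best_score = policy, score
        return best if best_score >= self.threshold else None


def governed_call(
    prompt: str,
    model: Callable[[str], str],
    input_index: PolicyIndex,
    output_index: PolicyIndex,
) -> str:
    """Enforce policies on the way in and on the way out of the model."""
    hit = input_index.match(prompt)
    if hit is not None:
        return f"[blocked at input by {hit.rule_id}]"
    answer = model(prompt)
    hit = output_index.match(answer)
    if hit is not None:
        return f"[blocked at output by {hit.rule_id}]"
    return answer
```

Because the model is passed in as a plain callable, the same wrapper could sit in front of any provider, and registering a new rule is a single `PolicyIndex.add` call. That mirrors, in miniature, the decoupling of governance from inference and the short Time-to-Compliance described above.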
Danila Katalshov
Ольга Швецова
Sang-Kon Lee
Electronics
Korea University of Technology and Education
Katalshov et al. studied this question.
www.synapsesocial.com/papers/6a095c2c7880e6d24efe2259 — DOI: https://doi.org/10.3390/electronics15102125