Custodial AI (V2): Mechanism-First Ethics as Policy-as-Code, Harm Budgets, and Assurance-Driven Compliance

Summary (HTML5 description for Zenodo)

This paper advances a mechanism-first approach to AI and software ethics: instead of judging systems primarily by content, declared intent, or downstream outcomes, it evaluates the technical mechanisms that shape user behavior and institutional power (e.g., engagement optimization, coercive choice architectures, strategic opacity, and extractive data pipelines). The core claim is that contemporary “AI ethics” often lacks executability: principles exist, but they rarely translate into constraints that can be enforced throughout the software lifecycle and verified through evidence.

The framework is grounded in Islamic legal philosophy while designed for universal translation. It formalizes Amānāh (custodianship/fiduciary duty) as the relational axiom for developer–system–user interactions, treats the Maqāṣid al-Sharīʿah (higher objectives of the law) as non-negotiable teleological goods, and operationalizes the principle of Ḥarām al-Wasīlah (illicit means) to classify mechanisms as impermissible when they exhibit predictable, systemic, and predominant pathways to violating protected human goods.

Key Contributions

- Custodial Policy Infrastructure (CPI): a four-layer architecture that turns teleological ethics into policy-as-code with enforceable controls across design, CI/CD, deployment, and runtime.
- Level-0 Harm Budgets: operational “inviolable constraints” for core human goods (e.g., agency/time integrity and non-manipulated cognition), preventing economic or engagement benefits from justifying Level-0 degradation.
- HARAMP taxonomy (mechanism anti-patterns): a systematic classification of harmful technical patterns, each mapped to specific custodial duty breaches (care, loyalty, non-manipulation, accountability) and accompanied by practical detection hooks.
- Assurance-driven compliance: a shift from checklists to assurance cases (Claim → Argument → Evidence), enabling auditable, evidence-based verification and reducing the space for “ethics-washing.”
- Proto-suite of technical specifications (RFC-style): versioned, implementable schemas for Feature Impact Declarations (FID), Lazy Granular Consent (LGCP), Bias-by-Design Audits (BDA), and Kill-Switch/Fallback requirements (KSF), plus a policy-bundle format to support modular profiles.

Why it Matters

The paper reframes “AI guardrails” away from superficial output filtering and toward constraints on objectives, architectures, and incentives. It argues that robust AI governance requires compilable constraints and evidence obligations that persist after deployment. While rooted in an Islamic duty-based teleology, the proposed infrastructure is designed to support pluralistic adoption via a kernel-and-profiles approach: a shared custodial kernel with modular policy bundles that can reflect different regulatory, sectoral, or cultural requirements.

Recommended Audience

AI/ML engineers, software architects, product and governance teams, HCI researchers, and scholars of responsible AI and applied ethics seeking an enforceable, auditable framework for mechanism-level AI governance.
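
The Level-0 harm budgets and assurance cases listed under Key Contributions could be expressed as a release gate along the following lines. This is only an illustrative sketch, not the paper's specification: the class names, the `evaluate_release` function, the metric names, and the ceiling values are all hypothetical and chosen for the example. It shows two ideas from the summary: a budget is a hard ceiling that projected benefit cannot offset, and each check yields a Claim → Argument → Evidence record that an auditor could inspect.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HarmBudget:
    """Hypothetical Level-0 budget: a hard ceiling on one measured harm metric."""
    good: str       # protected human good, e.g. "agency/time integrity"
    metric: str     # name of the harm indicator (assumed for illustration)
    ceiling: float  # maximum tolerated value; never traded against benefit


@dataclass
class AssuranceCase:
    """Claim -> Argument -> Evidence record produced by one budget check."""
    claim: str
    argument: str
    evidence: dict = field(default_factory=dict)
    satisfied: bool = False


def evaluate_release(budgets, measurements, projected_benefit=0.0):
    """Return (allowed, cases) for a candidate release.

    projected_benefit is accepted but deliberately never read: under the
    Level-0 idea, economic or engagement gains cannot justify a breach.
    A metric with no measurement counts as a failure (no evidence, no pass).
    """
    cases = []
    for budget in budgets:
        observed = measurements.get(budget.metric)
        within = observed is not None and observed <= budget.ceiling
        cases.append(AssuranceCase(
            claim=f"Release stays within the Level-0 budget for {budget.good}",
            argument=f"{budget.metric} must be measured and <= {budget.ceiling}",
            evidence={
                "metric": budget.metric,
                "observed": observed,
                "ceiling": budget.ceiling,
            },
            satisfied=within,
        ))
    return all(case.satisfied for case in cases), cases


if __name__ == "__main__":
    budgets = [
        HarmBudget("agency/time integrity", "compulsive_session_rate", 0.05),
        HarmBudget("non-manipulated cognition", "dark_pattern_count", 0),
    ]
    measurements = {"compulsive_session_rate": 0.08, "dark_pattern_count": 0}
    allowed, cases = evaluate_release(budgets, measurements, projected_benefit=1e6)
    print("release allowed:", allowed)
    for case in cases:
        print(case.satisfied, "|", case.claim, "|", case.evidence)
```

In a CI/CD pipeline, a gate of this kind would typically fail the build when `allowed` is false and archive the assurance records alongside the release artifacts. Treating a missing measurement as a failure reflects the evidence-obligation idea: a claim without evidence is not accepted.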
Mohamed Omar Bennouna (Tue,) studied this question.