Current Responsible AI metrics, such as truthfulness, bias, and toxicity scores, often reduce responsibility in large language models (LLMs) to static technical proxies, obscuring the contextual, ethical, and temporal dynamics through which accountability is produced in real-world settings. This study introduces Dynamic Contextual Responsibility (DCR), a conceptual and operational framework that defines responsibility as a dynamic, context-conditioned, socio-technical relation shaped by system behaviour, governance arrangements, and institutional norms. DCR integrates five dimensions (ethical foundations, contextual grounding, behavioural properties, governance mechanisms, and temporal dynamics) into a unified and interpretable construct. To illustrate its operational implications, the framework is examined through multi-model, multi-context, and multi-temporal evaluations using established benchmarks such as TruthfulQA, FEVER, and HotpotQA. The analysis shows that approximately 22% of outputs classified as responsible under static metrics are reclassified once contextual and temporal factors are considered, revealing latent ethical and governance risks. By foregrounding context, governance, and temporal change, DCR moves Responsible AI evaluation toward more dynamic, transparent, and plural forms of accountability, with direct relevance for emerging regulatory regimes, including the EU AI Act and the NIST AI Risk Management Framework.
Ibitoye et al. studied this question.
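
The reclassification figure reported above can be read as a simple ratio: of the outputs that static metrics label responsible, the share whose label changes once contextual and temporal factors are applied. The sketch below is a minimal illustration of that arithmetic only. The `Judgement` record, its field names, and the `reclassification_rate` function are hypothetical, and the toy data are invented for the example; none of them come from the study, and the sketch does not reproduce how DCR itself scores an output.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Judgement:
    """Hypothetical record pairing two verdicts on one model output."""
    output_id: str
    static_responsible: bool      # verdict from static metrics alone
    contextual_responsible: bool  # verdict after contextual/temporal review


def reclassification_rate(judgements: Iterable[Judgement]) -> float:
    """Share of statically 'responsible' outputs that lose that label."""
    static_ok = [j for j in judgements if j.static_responsible]
    if not static_ok:
        return 0.0  # nothing was labelled responsible, so nothing can flip
    flipped = sum(1 for j in static_ok if not j.contextual_responsible)
    return flipped / len(static_ok)


# Invented toy data, NOT results from the study.
toy = [
    Judgement("a", True, True),
    Judgement("b", True, False),   # reclassified after contextual review
    Judgement("c", True, True),
    Judgement("d", True, True),
    Judgement("e", False, False),  # excluded: never labelled responsible
]

rate = reclassification_rate(toy)
print(f"reclassified: {rate:.0%}")
```

Restricting the denominator to outputs that static metrics accept matches the wording of the claim ("outputs classified as responsible under static metrics"); a rate over all outputs would be a different quantity.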