This research essay investigates how to reduce hallucinations in Small Language Models (SLMs) deployed entirely on edge devices in tax-law settings. It introduces Epistemic Scaffolding, an ontology-guided knowledge architecture that organizes legal knowledge across three layers: formal ontology, operational heuristics, and user-centered phenomenology. The Qwen2.5-1.5B model was fine-tuned with LoRA and quantized to INT4 using the MLC-LLM pipeline, enabling browser-based inference via WebGPU with no data transmitted to the cloud. A key finding is that INT4 quantization acted as an implicit semantic regularization mechanism, improving groundedness (+38.8%) over FP16, whereas structural pruning caused catastrophic degradation. The work suggests that the epistemological organization of knowledge during training may be as consequential as parameter scale for trustworthy edge AI in legally sensitive domains.
Barros et al. studied this question.
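To make the INT4 step mentioned above more concrete, the following is a minimal sketch of symmetric group-wise 4-bit quantization in plain Python. It is a generic illustration only and is not the MLC-LLM implementation or the authors' configuration: the function names `quantize_int4` and `dequantize_int4`, the group size of 4, and the sample weights are all assumptions chosen for readability.

```python
from typing import List, Tuple

# Hypothetical helper names; this is NOT the MLC-LLM API.
QuantizedGroup = Tuple[float, List[int]]  # (scale, 4-bit integer codes)


def quantize_int4(weights: List[float], group_size: int = 4) -> List[QuantizedGroup]:
    """Symmetric group-wise quantization to signed 4-bit codes in [-8, 7]."""
    groups: List[QuantizedGroup] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # One floating-point scale per group; fall back to 1.0 for an all-zero group.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        # Round to the nearest integer code and clamp to the signed 4-bit range.
        codes = [max(-8, min(7, round(w / scale))) for w in group]
        groups.append((scale, codes))
    return groups


def dequantize_int4(groups: List[QuantizedGroup]) -> List[float]:
    """Reconstruct approximate weights from the scales and integer codes."""
    return [scale * code for scale, codes in groups for code in codes]


# Illustrative values only, not taken from the model.
weights = [0.7, -0.35, 0.1, 0.0, 1.4, -1.4, 0.2, 0.05]
groups = quantize_int4(weights, group_size=4)
restored = dequantize_int4(groups)
```

Within each group the reconstruction error of a weight is bounded by half the group's scale, so small weights that share a group with a large one are coarsened the most. That loss of fine-grained detail is one plausible reading of how quantization could behave like a regularizer, although the sketch itself does not demonstrate the groundedness effect reported above.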