This technical note presents the Boolean Algebra Engine, an open-source deterministic verification framework for evaluating Boolean reasoning in large language models (LLMs). The work combines a formal Boolean logic engine with an LLM-assisted translation layer, so that natural-language Boolean queries can be converted into machine-verifiable expressions and evaluated with exact correctness guarantees. The note includes a benchmark study of seven LLMs on Boolean satisfiability tasks with machine-verified ground truth, measuring hallucination rates across expressions of varying complexity. The results reveal consistent model-specific reasoning failures, including optimism and pessimism biases, and suggest that error rates remain relatively stable as the number of variables increases within the tested range. The repository includes the full paper, the benchmark methodology, the experimental results, and implementation details of the Boolean Algebra Engine.
Keywords: Large Language Models, Boolean Logic, Formal Verification, Symbolic Reasoning, Hallucination Analysis, Quine-McCluskey, SAT Reasoning, Neurosymbolic Systems.
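The idea of checking an LLM's satisfiability verdict against machine-verified ground truth can be illustrated with a minimal sketch. This is not the Boolean Algebra Engine's code: the function names `is_satisfiable` and `classify_verdict` are hypothetical, the ground truth is computed by plain truth-table enumeration rather than by whatever method the engine uses, and the reading of "optimism" as claiming satisfiability for an unsatisfiable expression (and "pessimism" as the reverse) is an assumption, since the abstract does not define the terms.

```python
from itertools import product

def is_satisfiable(expr, variables):
    """Hypothetical helper: brute-force ground truth over all 2^n assignments."""
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if expr(env):
            return True
    return False

def classify_verdict(llm_says_sat, ground_truth_sat):
    """Hypothetical helper: label a model verdict against the ground truth.

    Assumed definitions (not taken from the source):
      optimistic  = model claims satisfiable, expression is unsatisfiable
      pessimistic = model claims unsatisfiable, expression is satisfiable
    """
    if llm_says_sat == ground_truth_sat:
        return "correct"
    return "optimistic" if llm_says_sat else "pessimistic"

# (a OR b) AND NOT a  is satisfiable, e.g. with a=False, b=True
expr_sat = lambda e: (e["a"] or e["b"]) and not e["a"]
# a AND NOT a  is a contradiction
expr_unsat = lambda e: e["a"] and not e["a"]

truth_sat = is_satisfiable(expr_sat, ["a", "b"])
truth_unsat = is_satisfiable(expr_unsat, ["a"])

# Illustrative, made-up model verdicts (not benchmark data)
print(classify_verdict(True, truth_sat))     # model agrees with ground truth
print(classify_verdict(True, truth_unsat))   # model wrongly claims satisfiable
print(classify_verdict(False, truth_sat))    # model wrongly claims unsatisfiable
```

Enumeration is exponential in the number of variables, so it only suits small expressions; it is used here because it makes the "exact ground truth" property easy to see.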
Aditya Shrivastava (Wed,) studied this question.