This work investigates the use of Large Language Models (LLMs) to convert unstructured natural‑language instructions into structured JSON metadata suitable for scientific workflows. The system architecture combines a locally executed LLM via Ollama, a .NET Web API responsible for prompting and validation, and a lightweight console client. The processing pipeline operates in two stages: a linguistic normalization step that translates operator input into clear, unambiguous English, followed by schema‑guided extraction that enforces strict JSON structure. Through iterative prompt engineering, the approach achieves deterministic, schema‑compliant output while avoiding free text and hallucinated fields. Experiments show that smaller, instruction‑obedient models such as Phi‑3 provide the most reliable behavior under strong constraints. The resulting workflow is robust, offline‑capable, and well‑suited to institutional environments where reproducibility and metadata quality are essential. Future extensions may include schema expansion, ontology integration, and confidence scoring.
Francesco Carraro studied this question.
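The two-stage pipeline described above (linguistic normalization, then schema-guided extraction with strict validation) can be illustrated with a minimal sketch. This is not the authors' implementation, which uses a .NET Web API. The schema fields, the prompt wording, and the names `run_pipeline`, `validate`, and `fake_llm` are all assumptions made for illustration. The model is passed in as a plain callable, so a locally hosted model could be substituted for the stub used here.

```python
import json
from typing import Callable

# Hypothetical schema (field name -> expected type); not taken from the source.
SCHEMA = {"instrument": str, "sample_id": str, "temperature_c": float}

# Illustrative prompts; the wording is an assumption, not the authors' prompts.
NORMALIZE_PROMPT = (
    "Rewrite the following operator instruction in clear, unambiguous English:\n{text}"
)
EXTRACT_PROMPT = (
    "Return only a JSON object with exactly these keys: {keys}. "
    "No free text, no extra fields.\nInstruction: {text}"
)


def validate(raw: str, schema: dict) -> dict:
    """Parse model output and reject anything that deviates from the schema."""
    data = json.loads(raw)  # free text raises JSONDecodeError, a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != set(schema):
        # Symmetric difference lists both hallucinated and missing fields.
        raise ValueError(f"unexpected or missing fields: {sorted(set(data) ^ set(schema))}")
    for key, expected in schema.items():
        value = data[key]
        # JSON has a single number type, so accept integers where floats are expected.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
            data[key] = value
        if not isinstance(value, expected):
            raise ValueError(f"field {key!r} should be of type {expected.__name__}")
    return data


def run_pipeline(text: str, llm: Callable[[str], str], schema: dict = SCHEMA) -> dict:
    """Stage 1: normalize the instruction. Stage 2: extract and validate JSON."""
    normalized = llm(NORMALIZE_PROMPT.format(text=text))
    raw = llm(EXTRACT_PROMPT.format(keys=", ".join(schema), text=normalized))
    return validate(raw, schema)


def fake_llm(prompt: str) -> str:
    """Stand-in for a locally hosted model; returns canned answers for the demo."""
    if prompt.startswith("Rewrite"):
        return "Measure sample S-17 on the spectrometer at 21.5 degrees Celsius."
    return '{"instrument": "spectrometer", "sample_id": "S-17", "temperature_c": 21.5}'


if __name__ == "__main__":
    print(run_pipeline("misura il campione S-17 allo spettrometro, 21.5 gradi", fake_llm))
```

Keeping validation outside the model call means that a response containing free text or an extra field fails loudly instead of being passed downstream, which is one plausible way to obtain the strict, schema-compliant behavior the abstract describes.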