A Practical Architecture for Low-Latency LLM Inference in Educational Assistants | Synapse