Toward Multi-Trit Quantization for Large Language Models: A Theoretical Framework for Balanced N-Trit Weights, Trit-Plane Generalization, and Mixed-Layer Precision | Synapse