Evaluating large language models for endodontic case difficulty assessment: accuracy, consistency, temporal stability, and agreement | Synapse