Evaluating LLMs at Temporal Generalization | Synapse