Are Large Language Models Good at Utility Judgments?