Rethinking Reward Models for Multi-Domain Test-Time Scaling | Synapse