ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing | Synapse