We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation. FiLM layers influence neural network computation via a simple, feature-wise affine transformation based on conditioning information. We show that FiLM layers are highly effective for visual reasoning - answering image-related questions which require a multi-step, high-level process - a task which has proven difficult for standard deep learning methods that do not explicitly model reasoning. Specifically, we show on visual reasoning tasks that FiLM layers 1) halve state-of-the-art error for the CLEVR benchmark, 2) modulate features in a coherent manner, 3) are robust to ablations and architectural modifications, and 4) generalize well to challenging, new data from few examples or even zero-shot.
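The feature-wise affine transformation described in the abstract can be illustrated with a minimal NumPy sketch. The abstract only states that each feature is scaled and shifted based on conditioning information. The tensor shapes, the names `film`, `film_generator`, `gamma`, and `beta`, and the single linear map that produces the modulation parameters are assumptions made here for illustration, not the authors' implementation.

```python
import numpy as np


def film_generator(cond, weight, bias):
    """Hypothetical generator: one linear map from the conditioning vector
    to per-channel scale (gamma) and shift (beta).

    cond:   (batch, cond_dim)
    weight: (cond_dim, 2 * channels)
    bias:   (2 * channels,)
    """
    out = cond @ weight + bias               # (batch, 2 * channels)
    gamma, beta = np.split(out, 2, axis=1)   # each (batch, channels)
    return gamma, beta


def film(features, gamma, beta):
    """Feature-wise affine modulation: every feature map (channel) is
    scaled by its own gamma and shifted by its own beta.

    features:    (batch, channels, height, width)  -- assumed layout
    gamma, beta: (batch, channels)
    """
    # Broadcast the per-channel parameters over the spatial dimensions.
    return gamma[:, :, None, None] * features + beta[:, :, None, None]


rng = np.random.default_rng(0)
batch, channels, height, width, cond_dim = 2, 4, 3, 3, 5

features = rng.normal(size=(batch, channels, height, width))  # e.g. image features
cond = rng.normal(size=(batch, cond_dim))                     # e.g. an encoded question
weight = rng.normal(size=(cond_dim, 2 * channels))
bias = np.zeros(2 * channels)

gamma, beta = film_generator(cond, weight, bias)
modulated = film(features, gamma, beta)
```

In this sketch the conditioning input never mixes features with one another. It only rescales and offsets each channel, which is what makes the transformation feature-wise.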
Ethan Perez
Florian Strub
Harm de Vries
Université de Montréal
Rice University
Université de Lille
Perez et al. studied this question.
www.synapsesocial.com/papers/6a0a5197889486c184116737 — DOI: https://doi.org/10.48550/arxiv.1709.07871