A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs

Key Points

Key points are not available for this paper at this time.

Abstract

Learning from a few examples and generalizing to markedly different situations are capabilities of human visual intelligence that are yet to be matched by leading machine learning models. By drawing inspiration from systems neuroscience, we introduce a probabilistic generative model for vision in which message-passing-based inference handles recognition, segmentation, and reasoning in a unified way. The model demonstrates excellent generalization and occlusion-reasoning capabilities and outperforms deep neural networks on a challenging scene text recognition benchmark while being 300-fold more data efficient. In addition, the model fundamentally breaks the defense of modern text-based CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) by generatively segmenting characters without CAPTCHA-specific heuristics. Our model emphasizes aspects such as data efficiency and compositionality that may be important in the path toward general artificial intelligence.

Connected Papers

Building similarity graph...

Analyzing shared references across papers

Discussion

Authors

Dileep George

DeepMind (United Kingdom)

Wolfgang Lehrach

Union University

Ken Kansky

Union University

Journals

Science

Actions

Institutions

Union University

Clarion University

References and Citations

Connected Papers

Building similarity graph...

Analyzing shared references across papers

Discussion

Cite this study

George et al. (Thu,) studied this question.

synapsesocial.com/papers/69d97f580d540cafc5835e3f — DOI: https://doi.org/10.1126/science.aag2612

Also consider

Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context:

Compositionality, MDL Priors, and Object Recognition· 1996 · 115 citations
A Mean Field EM-algorithm for Coherent Occlusion Handling in MAP-Estimation Prob· 2006 · 22 citations
Receptive fields, binocular interaction and functional architecture in the cat's visual cortex· 1962 · 13,818 citations
Adam: A Method for Stochastic Optimization· 2014 · 84,773 citations
Concurrent processing streams in monkey visual cortex· 1988 · 1,129 citations

Also consider

Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context:

Compositionality, MDL Priors, and Object Recognition· 1996 · 115 citations
A Mean Field EM-algorithm for Coherent Occlusion Handling in MAP-Estimation Prob· 2006 · 22 citations
Receptive fields, binocular interaction and functional architecture in the cat's visual cortex· 1962 · 13,818 citations
Adam: A Method for Stochastic Optimization· 2014 · 84,773 citations
Concurrent processing streams in monkey visual cortex· 1988 · 1,129 citations

A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs

Key Points

Abstract

Citation Network

Connected Papers

Discussion

Authors

Journals

Actions

Institutions

References and Citations

Citation Network

Connected Papers

Discussion

Cite this study

Also consider

Also consider