We present a concept-centric paradigm for building agents that can learn continually and reason flexibly. The agent uses a vocabulary of neuro-symbolic concepts. These concepts, which describe objects, relations, and actions, are grounded in sensory inputs and actuation outputs. They are also compositional: novel concepts can be created by combining existing ones structurally. To facilitate learning and reasoning, the concepts are typed and represented using a combination of symbolic programs and neural network embeddings. By leveraging the complementary strengths of neural and symbolic representations, an agent can efficiently learn and recombine concepts to solve tasks across different domains, including 2D images, videos, 3D scenes, and robotic manipulation. This concept-centric framework offers several advantages, including data efficiency, compositional generalization, continual learning, and zero-shot transfer.
Mao et al. (Wed,) studied this question.
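
As a rough illustration of what "typed concepts represented by symbolic programs and embeddings" might look like, the sketch below pairs each concept with a type tag and a vector, and composes concepts through a small filtering program. It is not the authors' implementation: the names `Concept`, `Entity`, `filter_by`, and `compose`, the four-dimensional feature layout, and the cosine threshold are all hypothetical, and the embeddings are written by hand here, whereas in the framework described above they would be learned from data.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Concept:
    """A typed concept: a symbolic name and kind plus a vector embedding."""
    name: str
    kind: str                      # e.g. "object_property" or "relation"
    embedding: Sequence[float]     # hand-written here; assumed learned in practice


@dataclass(frozen=True)
class Entity:
    """An object in a scene, described by a feature vector."""
    name: str
    features: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_by(concept: Concept, entities: List[Entity],
              threshold: float = 0.5) -> List[Entity]:
    """Symbolic 'filter' step: keep entities whose features match the concept."""
    if concept.kind != "object_property":
        # The type tag restricts which programs a concept may appear in.
        raise TypeError(f"cannot filter by a concept of kind {concept.kind!r}")
    return [e for e in entities
            if cosine(concept.embedding, e.features) > threshold]


def compose(*concepts: Concept) -> Callable[[List[Entity]], List[Entity]]:
    """Build a program that applies each concept's filter in sequence."""
    def program(entities: List[Entity]) -> List[Entity]:
        for concept in concepts:
            entities = filter_by(concept, entities)
        return entities
    return program


# Assumed feature layout: [red, blue, cube, sphere].
RED = Concept("red", "object_property", [1.0, 0.0, 0.0, 0.0])
CUBE = Concept("cube", "object_property", [0.0, 0.0, 1.0, 0.0])
LEFT_OF = Concept("left_of", "relation", [0.0, 0.0, 0.0, 0.0])

SCENE = [
    Entity("red_cube", [1.0, 0.0, 1.0, 0.0]),
    Entity("blue_cube", [0.0, 1.0, 1.0, 0.0]),
    Entity("red_sphere", [1.0, 0.0, 0.0, 1.0]),
]

# "red cube" is a novel concept built from two existing ones.
red_cube = compose(RED, CUBE)
print([e.name for e in red_cube(SCENE)])
```

In this toy setup, the composed program `red_cube` is never given its own embedding; it reuses `RED` and `CUBE`, which is one simple reading of the compositional generalization claimed above. Passing `LEFT_OF` to `filter_by` raises a `TypeError`, showing how the type tag can rule out ill-formed programs.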