Learning to detect, identify, or select stimuli is an essential requirement of many behavioral tasks. In real-life situations, relevant and nonrelevant stimuli are often embedded in a continuous sensory stream, presumably represented by different segments of neural activity. Here we introduce a spiking network model that can discover action-relevant stimuli in an unsegmented sensory stream of spike trains. The model uses a biologically plausible plasticity rule and learns from the reinforcement of correct decisions taken at the right time. Learning is fully online and becomes faster with larger population sizes; it allows for a wide spectrum of neural-encoding strategies and can segment cortical spike patterns recorded from behaving animals. Based on these results, the proposed model provides a biologically plausible framework for reinforcement learning in the absence of prior information on the identity, relevance, and timing of input stimuli embedded in a continuous spatiotemporal stream.
Donne et al. (Sun,) studied this question.