The need for a theoretical understanding of what neural networks learn, and how they learn it, can hardly be overstated. In this work, we apply tools from statistical mechanics to the space of outputs of a generative network and study how high-level concepts, although not explicitly encoded, are implicitly learned. In particular, we examine how a music generator trained on polyphonic music extracts the concept of musical consonance. We do this by capturing a dissonance measure in an energy function (a Hamiltonian), in which lower energy corresponds to more musically consonant output. Though explicitly programmed only to minimize training error, the network also minimizes dissonance and, surprisingly, this minimization is accompanied by higher-order phase transitions. Moreover, these phase transitions co-occur with pleasing music and with distributions that better match the training set, suggesting a potentially deeper connection between learning and critical phenomena.
Paul et al. (Tue,) studied this question.
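
To make the idea of a dissonance energy concrete, the following is a minimal sketch of one possible pairwise form, in which the energy of a chord is the sum of a dissonance weight over all pairs of simultaneously sounding notes. The weight table `INTERVAL_DISSONANCE`, the helper names, and the pairwise form itself are illustrative assumptions and are not taken from the work summarized above, which may define its Hamiltonian differently.

```python
from itertools import combinations

# Hypothetical dissonance weights per interval class (semitones, folded to 0..6).
# These numbers are illustrative assumptions, not values from the paper.
INTERVAL_DISSONANCE = {
    0: 0.0,  # unison / octave
    1: 1.0,  # minor second / major seventh
    2: 0.8,  # major second / minor seventh
    3: 0.3,  # minor third / major sixth
    4: 0.2,  # major third / minor sixth
    5: 0.1,  # perfect fourth / perfect fifth
    6: 0.9,  # tritone
}


def interval_class(a, b):
    """Fold the distance between two MIDI pitches into an interval class 0..6."""
    d = abs(a - b) % 12
    return min(d, 12 - d)


def chord_energy(pitches):
    """Sum the assumed dissonance weight over every pair of distinct pitches."""
    unique = sorted(set(pitches))
    return sum(
        INTERVAL_DISSONANCE[interval_class(a, b)]
        for a, b in combinations(unique, 2)
    )


def sequence_energy(chords):
    """Total energy of a piece, modeled here as a list of chords (one per time step)."""
    return sum(chord_energy(chord) for chord in chords)


if __name__ == "__main__":
    c_major = [60, 64, 67]   # C, E, G
    cluster = [60, 61, 62]   # C, C#, D
    print(chord_energy(c_major), chord_energy(cluster))
```

Under these assumed weights a major triad receives lower energy than a chromatic cluster, which is the qualitative behavior the abstract describes: lower energy corresponds to more consonant output.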