Abstract
This work investigates why recurrent neural networks (RNNs) tend to learn phonological patterns that are unattested or dispreferred by humans. Specifically, we explore the hypothesis that this over-generation is caused by the networks’ excess expressive capacity: RNNs lie beyond the limited complexity class that contains the attested phonological patterns. We compare these over-expressive RNNs against the weaker convolutional neural networks (CNNs) on a battery of string recognition tasks. We find that the expressivity of a model’s architecture does not predict the string classes that it excels at recognizing. Instead, we suggest that CNNs’ position-invariant biases better explain their successes in our experiment.
Li et al. studied this question.