When measurement meets machine learning: interpretability and scalability in modelling item difficulty for language assessment | Synapse