This work presents a novel approach to modeling expression in guitar performance using transformer-based architectures. Motivated by the increased rhythmic complexity, tonal variation, and extended playing techniques enabled by guitar electrification, the paper introduces a specialized tokenisation scheme that explicitly represents a wide range of guitar-specific techniques. In addition, a sequence-level attention mechanism is incorporated into the transformer architecture to capture the segmentation between musical phrases. This design allows the model to maintain phrase-level contextual coherence while enabling efficient batching of variable-length musical sequences. The proposed method thus combines an explicit representation of expressive guitar performance with a framework for learning long-range musical structure.
Vishakh Begari studied this question.
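To make the idea of phrase-aware attention over batched, variable-length sequences more concrete, the following is a minimal sketch and not the paper's actual mechanism. It assumes that phrase boundaries are marked by a dedicated token and that attention is restricted to positions within the same phrase by a boolean mask. The token names (`NOTE_E4`, `BEND_UP`, `SLIDE`), the `<phrase>` and `<pad>` markers, and both helper functions are hypothetical and chosen only for illustration.

```python
from typing import List

# Hypothetical special tokens; the paper's actual vocabulary is not given here.
PHRASE_SEP = "<phrase>"  # assumed marker that opens a new musical phrase
PAD = "<pad>"            # assumed padding token used to equalise batch lengths


def phrase_ids(tokens: List[str]) -> List[int]:
    """Assign each token the index of the phrase it belongs to (-1 for padding)."""
    ids: List[int] = []
    current = 0
    for tok in tokens:
        if tok == PAD:
            ids.append(-1)
            continue
        if tok == PHRASE_SEP:
            current += 1  # the separator itself starts the next phrase
        ids.append(current)
    return ids


def phrase_attention_mask(tokens: List[str]) -> List[List[bool]]:
    """Build a causal, phrase-restricted mask.

    mask[i][j] is True when query position i may attend to key position j:
    both lie in the same phrase, j does not come after i, and neither is padding.
    """
    ids = phrase_ids(tokens)
    n = len(ids)
    return [
        [ids[i] != -1 and ids[i] == ids[j] and j <= i for j in range(n)]
        for i in range(n)
    ]


# A padded sequence containing two phrases with guitar-technique tokens.
tokens = ["NOTE_E4", "BEND_UP", "<phrase>", "NOTE_G4", "SLIDE", "<pad>"]
mask = phrase_attention_mask(tokens)
```

In a real model such a mask would typically be applied to the attention scores before the softmax, and rows belonging to padding positions (which are entirely masked here) would need separate handling. Whether the paper restricts attention strictly within phrases or combines it with cross-phrase context is not stated in the abstract, so this strict variant is only one possible reading.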