ABSTRACT Listening in a second language increasingly takes place in digital and multimodal environments, raising questions about how learners adapt their strategies across modalities and proficiency levels. This study investigates how Taiwanese undergraduates (N = 24) reported their use of cognitive and metacognitive listening strategies in audio‐only versus audiovisual online tasks in semi‐guided digital settings where materials were teacher‐selected and explicit instruction was constrained. Results from this exploratory sample suggest that inferencing was the only cognitive strategy robustly enhanced by multimodal input, while proficiency appeared to exert a stronger influence overall, with higher‐level learners tending to report broader use of inferencing and monitoring. These exploratory findings tentatively highlight a real‐world challenge: Multimodal input, now common in online platforms, may not automatically foster adaptive strategy use. The study points to a potential need for explicit scaffolding that helps learners notice and integrate visual cues, and for proficiency‐sensitive support that prevents reliance on less effective strategies such as translation. By situating a local EFL case within wider debates on digital language learning, this research contributes to understanding how applied linguistics can mediate between strategy theory and classroom practice.
Nguyen et al. (Sun,) studied this question.