We present a pipeline for deep-neural-network-assisted modeling and analysis of tube models of the vocal tract. Such models are composed of a series of cylindrical tube segments, each characterized by its length and cross-sectional area. We generate a large synthetic dataset of such tube configurations and use a circuit-theory-based algorithm to predict the corresponding formant frequencies. To explore the mapping between tube shapes and predicted resonance (formant) values, the pipeline integrates both linear regression and nonlinear machine learning models, including multi-layer perceptrons. Model interpretability is assessed using SHapley Additive exPlanations (SHAP), which quantify the contribution of each segment to the predicted formant frequencies. The proposed framework enables detailed exploration of the articulatory-acoustic relationships inherent to an acoustic tube serving as a vocal tract simulacrum. We describe the pipeline in the context of modeling the effects of perturbations on the first three predicted resonances of a 16-cm tube divided into 1-cm segments. The pipeline can be applied to any method that predicts the behavior of an acoustic tube conceived as a series of segmented units.
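The first two stages described above (synthetic tube generation and formant prediction) followed by a linear fit can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a lossless chain-matrix (transmission-line) model with a closed glottis and an ideal open lip end, a sound speed of 35000 cm/s, small random log-area perturbations around a uniform 3 cm² tube, and ordinary least squares as a stand-in for the regression stage. The names `chain_d`, `formants`, and `sensitivity` are hypothetical, and the multi-layer perceptron and SHAP stages are not shown.

```python
import numpy as np

C = 35000.0  # speed of sound in cm/s (assumed value)

def chain_d(freqs_hz, areas_cm2, seg_len_cm=1.0):
    """D element of the glottis-to-lips chain (ABCD) matrix of a lossless tube.

    Each segment contributes [[cos kl, j z sin kl], [j sin kl / z, cos kl]].
    The running product is stored as [[a, j b], [j c, d]] with real a, b, c, d.
    """
    kl = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float) / C * seg_len_cm
    cos, sin = np.cos(kl), np.sin(kl)
    a, b = np.ones_like(kl), np.zeros_like(kl)
    c, d = np.zeros_like(kl), np.ones_like(kl)
    for area in areas_cm2:  # ordered from glottis to lips
        z = 1.0 / area  # impedance up to the constant rho*c, which cancels in D
        a, b, c, d = (a * cos - b * sin / z, a * z * sin + b * cos,
                      c * cos + d * sin / z, d * cos - c * z * sin)
    return d

def formants(areas_cm2, seg_len_cm=1.0, n_formants=3, f_max=6000.0, step=1.0):
    """Lowest resonances in Hz, found as zeros of D (lip pressure set to zero)."""
    grid = np.arange(step, f_max, step)
    d = chain_d(grid, areas_cm2, seg_len_cm)
    i = np.nonzero(d[:-1] * d[1:] < 0.0)[0][:n_formants]  # sign changes of D
    # Linear interpolation of each zero crossing between neighboring grid points.
    return grid[i] - d[i] * (grid[i + 1] - grid[i]) / (d[i + 1] - d[i])

# Synthetic dataset: small log-area perturbations of a uniform 16-segment tube.
rng = np.random.default_rng(0)
n_tubes, n_segments = 300, 16
log_areas = np.log(3.0) + 0.1 * rng.standard_normal((n_tubes, n_segments))
targets = np.array([formants(np.exp(row)) for row in log_areas])  # F1..F3 in Hz

# Ordinary least squares with an intercept column; one coefficient per segment.
design = np.hstack([log_areas, np.ones((n_tubes, 1))])
coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
sensitivity = coef[:-1]  # shape (16, 3): Hz per unit log-area, per formant

print(np.round(sensitivity[:, 0], 1))  # per-segment effect on F1
```

Under these assumptions, the rows of `sensitivity` approximate how each 1-cm segment's area shifts each formant. For a linear model with independent inputs, SHAP attributions reduce to the coefficient times the input's deviation from its mean, so this fit gives a rough baseline against which attributions from a nonlinear model could be compared.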
Runhui Song
Uppsala University
Johan Sjons
Uppsala University
Axel G. Ekström
Stockholm University
Uppsala University
KTH Royal Institute of Technology
Song et al. studied this question.
synapsesocial.com/papers/68af4eb4ad7bf08b1ead73bf — DOI: https://doi.org/10.1101/2025.08.19.671092