As digital audio technology becomes widespread, balancing audio quality against file size has become an important research topic. This study establishes a systematic evaluation framework that uses a Cost-Performance Index (CPI) to quantify how sampling rate, bit depth, compression algorithm, and audio duration jointly affect audio quality and storage efficiency. It first constructs a theoretical model that measures audio quality by Signal-to-Noise Ratio (SNR) and Mean Squared Error (MSE), then combines these quality measures with the actual file size to compute the CPI. An analysis of two content categories, music and speech, finds that speech encoding parameter combinations offer a broader selection space in the quality/size trade-off, whereas music files generally have lower CPI values spread over a narrower range. Experimental results show that for music files, 44.1 kHz with MP3 at 64 kbps offers the highest compression efficiency; for speech applications, downsampling to 16 kHz combined with MP3 at 64 kbps offers the best value, delivering acceptable listening quality while keeping file size extremely low. Correlation analysis confirms a strong positive correlation between CPI and quality score (r ≈ +0.85) and a significant negative correlation between CPI and file size (r ≈ -0.65), which supports the “high quality + small size” premise. These results give concrete parameter guidance for audio encoding and storage strategies and have practical value for optimizing audio compression and storage solutions.
Gu et al. (Thu,) studied this question.
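The quantities named in the abstract can be illustrated with a minimal sketch. MSE, SNR, and the Pearson correlation coefficient follow their standard definitions. The abstract does not state the CPI formula, so the `cpi` function below is a hypothetical stand-in (quality score divided by file size in megabytes) and should not be read as the study's actual definition. All function and parameter names are illustrative.

```python
import math


def mse(reference, decoded):
    """Mean squared error between two equal-length sample sequences."""
    if not reference or len(reference) != len(decoded):
        raise ValueError("signals must be non-empty and of equal length")
    return sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)


def snr_db(reference, decoded):
    """SNR in decibels: reference signal power over error (noise) power."""
    noise_power = mse(reference, decoded)
    signal_power = sum(r * r for r in reference) / len(reference)
    if noise_power == 0:
        return math.inf   # decoded signal is identical to the reference
    if signal_power == 0:
        return -math.inf  # silent reference with a non-zero error
    return 10.0 * math.log10(signal_power / noise_power)


def cpi(quality_score, file_size_mb):
    """ASSUMED index: quality per megabyte. The study's real formula is not given."""
    if file_size_mb <= 0:
        raise ValueError("file size must be positive")
    return quality_score / file_size_mb


def pearson(xs, ys):
    """Pearson correlation coefficient r between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

Under this assumed definition, a configuration scores well only when its quality is high relative to its size, which is the direction the reported correlations (positive with quality score, negative with file size) point to. Applying `pearson` to per-file CPI values and quality scores, or CPI values and file sizes, would yield coefficients of the kind the abstract reports.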