What question did this study set out to answer?

This research aims to develop a composite model for recognizing and generating opera audio styles using neural networks.

March 1, 2026

Opera audio understanding and synthesis via neural network models: from recognition to generation

Key Points

This research aims to develop a composite model for recognizing and generating opera audio styles using neural networks.
Integration of chaotic fingerprint coding with deep neural networks for audio recognition.
Utilization of generative networks for style transfer.
Implementation of a 20-bit chaotic audio fingerprint based on logistic mapping.
DNN-LightGBM cascade structure for feature classification in 19 opera categories.
Adoption of generative adversarial networks with orthogonal style loss for improved style separation.
Recognition accuracy reaches 92.3% in noisy environments, 12.5% higher than traditional methods.
Feature classification accuracy ranges from 88% to 95% across 19 opera categories.
Style transfer reduces Mel cepstral distortion by 18.3%, from 5.24 to 4.87.
Robustness of style transfer increases by 23.6% under differing accompaniment conditions.
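
Mel cepstral distortion (MCD), the quality metric quoted above, is commonly computed as a scaled Euclidean distance between time-aligned mel-cepstral frames, averaged over frames. The sketch below uses that common definition. The function name, the plain-list input format, and the assumption that the energy coefficient has already been dropped are illustrative choices and are not taken from the study.

```python
import math


def mel_cepstral_distortion(ref_frames, conv_frames):
    """Mean MCD in dB between two time-aligned mel-cepstral sequences.

    Each argument is a list of frames, and each frame is a list of
    mel-cepstral coefficients. Assumption: the 0th (energy) coefficient
    has been removed and the sequences are already aligned.
    """
    if not ref_frames or len(ref_frames) != len(conv_frames):
        raise ValueError("sequences must be non-empty and equally long")

    scale = 10.0 / math.log(10.0)  # converts the distance to decibels
    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        if len(ref) != len(conv):
            raise ValueError("frames must have the same number of coefficients")
        squared_error = sum((r - c) ** 2 for r, c in zip(ref, conv))
        total += scale * math.sqrt(2.0 * squared_error)

    # Average the per-frame distortion over the whole utterance.
    return total / len(ref_frames)
```

Lower values mean the converted audio is spectrally closer to the reference, which is why a drop in MCD is reported as an improvement.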

Abstract

This study proposes a composite model for opera audio recognition and style generation. The model integrates chaotic fingerprint coding, deep neural networks, and generative networks for style transfer. It uses a 20-bit chaotic audio fingerprint based on logistic mapping and time-frequency peaks, which allows efficient compression and robust recognition. In a noisy environment, the method's accuracy is 92.3%, which is 12.5% higher than that of traditional methods. The DNN-LightGBM cascade structure models the features and classifies them into 19 opera categories with an accuracy of 88% to 95%. For style transfer, a generative adversarial network with an orthogonal style loss function separates timbre from style and reduces Mel cepstral distortion by 18.3%, from 5.24 to 4.87. In addition, a spectrum-based unsupervised linear style encoder improves the robustness of the transfer by 23.6% under various accompaniment conditions. Overall, the framework combines high recognition accuracy, high-quality style transfer, and strong adaptability.
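
To make the idea of a logistic-map fingerprint concrete, the sketch below derives a 20-bit code from a list of time-frequency peaks. It is only an illustration under stated assumptions: the abstract does not describe how peaks are turned into a seed or how bits are read out, so the seed-folding step, the parameter r = 3.99, the burn-in length, and the 0.5 threshold are all hypothetical. The sketch shows the bit-generation step only and does not reproduce the noise robustness reported in the study.

```python
def logistic_map(x, r=3.99):
    """One step of the logistic map x -> r * x * (1 - x)."""
    return r * x * (1.0 - x)


def chaotic_fingerprint(peaks, n_bits=20, r=3.99, burn_in=50):
    """Derive an n_bits-wide integer code from time-frequency peaks.

    peaks: iterable of (frame_index, frequency_bin) integer pairs.
    The seed folding, r, burn_in and the 0.5 threshold are assumptions
    made for illustration; they are not taken from the study.
    """
    # Fold the peak coordinates into one integer (hypothetical hashing step).
    acc = 0
    for frame_index, frequency_bin in peaks:
        acc = (acc * 31 + frame_index * 1009 + frequency_bin) % 999983

    # Map the integer to a seed strictly inside the open interval (0, 1).
    x = (acc + 1) / 999985.0

    # Discard the first iterations so the orbit moves away from the seed.
    for _ in range(burn_in):
        x = logistic_map(x, r)

    # Read one bit per iteration by thresholding the chaotic orbit.
    code = 0
    for _ in range(n_bits):
        x = logistic_map(x, r)
        code = (code << 1) | (1 if x > 0.5 else 0)
    return code


if __name__ == "__main__":
    example_peaks = [(0, 10), (3, 42), (7, 18)]  # made-up peaks
    print(format(chaotic_fingerprint(example_peaks), "020b"))
```

The same peak list always yields the same code, so codes can be stored as compact integers and compared directly.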
