Amid the growing prevalence of remote work and the digital transformation of business, the automation of routine processes, including the processing of audio recordings of meetings and conferences, is becoming increasingly important. Modern video conferencing systems offer automatic transcription features; however, these are often restricted to paid subscription plans, require an internet connection, and do not provide a sufficient level of confidentiality. Consequently, the development of a local, affordable, and secure solution for speech transcription has become highly relevant. This paper presents an automatic system for transcribing voice recordings of meetings within a project company. A distinctive feature of the developed system is the integration of open-source neural network models: Whisper (for speech recognition), pyannote.audio (for diarization, that is, attributing speech segments to speakers), and an open-source GPT model (for post-processing and text formatting). The system uses a hybrid architecture built on two programming languages, Python and C#, which combines high-performance audio processing with a user-friendly graphical interface. The key advantages of the solution include complete autonomy (no cloud connection required), support for the Russian language, scalability, and compliance with information security requirements. Testing on a control audio fragment yielded Word Error Rate (WER) and Character Error Rate (CER) values at levels acceptable for business use. To assess the accuracy of the system further, additional tests were conducted in various acoustic environments; they showed that the system delivers good transcription quality both under typical operating conditions and in the presence of background noise.
The implemented software product will enable companies to save on paid access to video conferencing systems and corporate subscriptions, while simultaneously increasing the transparency and efficiency of documenting meetings and conferences. This work holds both theoretical and practical significance for the development of domestic IT solutions in the field of corporate automation.
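The WER and CER metrics named in the abstract are conventionally defined as the edit distance between a reference transcript and the recognized text, divided by the reference length in words or characters. The following is a minimal sketch of that conventional definition, not the authors' evaluation code: the function names `edit_distance`, `wer`, and `cer` are hypothetical, and the normalization (lowercasing, whitespace tokenization) is an assumption, since the abstract does not say how the text was prepared before scoring.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i items of ref
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / reference word count."""
    # Assumption: lowercase and split on whitespace; punctuation is kept.
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance / reference length."""
    # Assumption: lowercase only; spaces count as characters.
    ref_chars = list(reference.lower())
    hyp_chars = list(hypothesis.lower())
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

For example, scoring the hypothesis "the meeting started at noon" against the reference "the meeting starts at noon" gives a WER of 0.2 (one substituted word out of five) and a CER of 2/26. Because both metrics divide by the reference length, a hypothesis with many insertions can score above 1.0.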
M. A. Polozov
L. A. Korobova
Intellekt Sist Proizv
Copiah-Lincoln Community College
Polozov et al. (Sat,) studied this question.
www.synapsesocial.com/papers/69db380f4fe01fead37c6281 — DOI: https://doi.org/10.22213/2410-9304-2026-1-52-63