This paper presents an advanced clinical interface built on the LattePanda Sigma, an embedded device designed for edge computing. The interface integrates OpenAI language models and Whisper for automated speech-to-text transcription, together with accurate speaker diarisation in clinical settings using the pyannote/speaker-diarization-3.1 model. A dataset of ten doctor–patient conversations in Spanish, translated and re-recorded to suit the local context, was used to evaluate the models. Automatic transcriptions were compared with the reference transcripts using the ROUGE metric. Average ROUGE scores of 0.9028 for the Whisper Small model and 0.9260 for the Whisper Medium model indicate high transcription accuracy. The reference transcripts were also used to assess the speaker segments identified by the pyannote model. Finally, the paper analyses the system’s usefulness and effectiveness in improving Spanish-language clinical records.
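To make the evaluation step concrete, the following is a minimal sketch of how a ROUGE score between a reference transcript and an automatic transcription could be computed. The abstract does not state which ROUGE variant, tokenisation, or tooling was used, so ROUGE-1 and ROUGE-L, the whitespace tokeniser, and the function names (`tokenize`, `rouge_1`, `rouge_l`) are all assumptions made for illustration. The two Spanish sentences are invented and do not come from the dataset.

```python
from __future__ import annotations

from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace (an assumed, deliberately simple tokeniser)."""
    return text.lower().split()


def _prf(overlap: int, ref_total: int, cand_total: int) -> dict[str, float]:
    """Turn an overlap count into precision, recall, and F1."""
    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def rouge_1(reference: str, candidate: str) -> dict[str, float]:
    """ROUGE-1: clipped unigram overlap between reference and candidate."""
    ref_counts = Counter(tokenize(reference))
    cand_counts = Counter(tokenize(candidate))
    overlap = sum((ref_counts & cand_counts).values())  # multiset intersection
    return _prf(overlap, sum(ref_counts.values()), sum(cand_counts.values()))


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference: str, candidate: str) -> dict[str, float]:
    """ROUGE-L: overlap measured by the longest common subsequence of tokens."""
    ref_tokens = tokenize(reference)
    cand_tokens = tokenize(candidate)
    return _prf(lcs_length(ref_tokens, cand_tokens), len(ref_tokens), len(cand_tokens))


# Invented example sentences (not taken from the paper's dataset).
reference = "el paciente refiere dolor de cabeza desde ayer"
candidate = "el paciente refiere dolor de cabeza desde hoy"

scores_1 = rouge_1(reference, candidate)
scores_l = rouge_l(reference, candidate)
```

Under these assumptions, a per-conversation score would be computed for each of the ten recordings and then averaged for each transcription model. Whether the reported averages are F1, recall, or precision values is not stated in the abstract.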
Jonathan et al. (Sun,) studied this question.