| dc.contributor.advisor | BEȘLIU, Corina | |
| dc.contributor.author | CERNEI, Ion | |
| dc.date.accessioned | 2026-01-15T06:25:19Z | |
| dc.date.available | 2026-01-15T06:25:19Z | |
| dc.date.issued | 2026 | |
| dc.identifier.citation | CERNEI, Ion. Evaluating Whisper’s speech to text performance for Romanian using audio from diverse domains. In: Conferinţa Tehnico-Ştiinţifică a Colaboratorilor, Doctoranzilor şi Studenţilor = The Technical Scientific Conference of Undergraduate, Master and PhD Students, 14-16 Mai 2025. Universitatea Tehnică a Moldovei. Chişinău: Tehnica-UTM, 2026, vol. 1, pp. 771-774. ISBN 978-9975-64-612-3, ISBN 978-9975-64-613-0 (PDF). | en_US |
| dc.identifier.isbn | 978-9975-64-612-3 | |
| dc.identifier.isbn | 978-9975-64-613-0 | |
| dc.identifier.uri | https://repository.utm.md/handle/5014/34447 | |
| dc.description.abstract | This study evaluates the Whisper model’s performance for Romanian speech-to-text transcription, investigating how transcription accuracy varies across diverse audio domains. Audio sources, including audiobooks, news broadcasts, and official public speeches, were selected for their verified textual references, ensuring robust evaluation through accurate alignment. Each domain presents distinct linguistic and acoustic characteristics, from the structured and clear narration of audiobooks to the dynamic and occasionally noisy environments of live news and the formal rhetoric of political discourse. The study uses standard evaluation metrics such as Word Error Rate (WER) and Character Error Rate (CER), enabling a consistent assessment of transcription performance. By focusing on Romanian, a low-resource language in automatic speech recognition, this study provides novel insights into Whisper’s effectiveness and the influence of the audio domain on transcription quality, contributing to advancements in speech recognition for under-resourced languages. Results show that Whisper performs best on scripted, high-quality audio such as audiobooks, while accuracy decreases in more variable and spontaneous contexts, highlighting the model’s sensitivity to content structure and recording conditions. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | Universitatea Tehnică a Moldovei | en_US |
| dc.relation.ispartofseries | Conferinţa tehnico-ştiinţifică a studenţilor, masteranzilor şi doctoranzilor = The Technical Scientific Conference of Undergraduate, Master and PhD Students: 14-16 mai 2025; | |
| dc.rights | Attribution-NonCommercial-NoDerivs 3.0 United States | * |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/us/ | * |
| dc.subject | automatic speech recognition | en_US |
| dc.subject | low-resource languages | en_US |
| dc.subject | error metrics | en_US |
| dc.subject | speech analysis | en_US |
| dc.subject | domain-specific evaluation | en_US |
| dc.title | Evaluating Whisper’s speech to text performance for Romanian using audio from diverse domains | en_US |
| dc.type | Article | en_US |
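
The abstract names Word Error Rate (WER) and Character Error Rate (CER) as the evaluation metrics. As a reading aid, the sketch below shows the usual definition of both: the Levenshtein edit distance between a reference and a hypothesis transcript, divided by the reference length (in words for WER, in characters for CER). The function names `edit_distance`, `wer`, and `cer` are illustrative, not taken from the paper, and the record does not say which tooling or text normalization (casing, punctuation, diacritics) the author used, so none is applied here.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the first i-1 reference tokens
    # and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference token missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis token)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Because both rates are normalized by the reference length, a hypothesis with many insertions can score above 1.0. Whitespace counts as a character in this CER sketch; whether the study treated it that way is not stated in the record.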