A project aims to improve speech technology for those with communication challenges.
Pan-Pan Jiang, Jimmy Tobin, Katrin Tomanek
― 5 min read
Cutting-edge science explained simply
Latest Articles
Marco Furio Colombo, Francesca Ronchini, Luca Comanducci
― 5 min read
Jinzuomu Zhong, Korin Richmond, Zhiba Su
― 5 min read
Florian Grötschla, Luca Strässle, Luca A. Lanzendörfer
― 6 min read
Yao-Fei Cheng, Li-Wei Chen, Hung-Shin Lee
― 7 min read
Ines Nolasco, Ilyass Moummad, Dan Stowell
― 5 min read
A new method simplifies siren detection for enhanced vehicle safety.
Stefano Damiano, Thomas Dietzen, Toon van Waterschoot
― 5 min read
A new approach combines sound event detection and speaker diarization for better audio understanding.
Yidi Jiang, Ruijie Tao, Wen Huang
― 5 min read
A new approach enhances ASR by focusing on specific speaker details.
Alexander Polok, Dominik Klement, Matthew Wiesner
― 5 min read
A study reveals how deep learning models recognize emotions in speech.
Satvik Dixit, Daniel M. Low, Gasser Elbanna
― 5 min read
An easy-to-use tool for fine-tuning speech models without complex code.
Masao Someki, Kwanghee Choi, Siddhant Arora
― 6 min read
New methods improve sound isolation from noisy environments without labeled data.
Hao Ma, Zhiyuan Peng, Xu Li
― 5 min read
A novel approach tackles channel variation in voice recognition systems.
Wenhao Yang, Jianguo Wei, Wenhuan Lu
― 5 min read
A new method improves machine voice recognition for speaker verification.
Wenhao Yang, Jianguo Wei, Wenhuan Lu
― 6 min read
A new model enhances audio generation using detailed text and sound prompts.
Chenxu Xiong, Ruibo Fu, Shuchen Shi
― 6 min read
Artificial intelligence is reshaping music with new tools and approaches.
Megan Wei, Mateusz Modrzejewski, Aswin Sivaraman
― 6 min read
MaskSR2 improves speech clarity and quality using innovative techniques.
Xiaoyu Liu, Xu Li, Joan Serrà
― 5 min read
A new method for generating accented speech using text transliteration.
Sho Inoue, Shuai Wang, Wanxing Wang
― 6 min read
E1 TTS transforms text into natural speech faster and more efficiently.
Zhijun Liu, Shuai Wang, Pengcheng Zhu
― 5 min read
Wave-U-Mamba enhances low-quality speech recordings for clearer communication.
Yongjoon Lee, Chanwoo Kim
― 5 min read
A new system predicts naturalness scores for synthetic speech using innovative methods.
Kaito Baba, Wataru Nakata, Yuki Saito
― 5 min read
A new method uses audio to enhance machine pronunciation accuracy.
Siqi Sun, Korin Richmond
― 5 min read
New methods improve audio synchronization with changing video scenes.
Mingjing Yi, Ming Li
― 4 min read
Exploring the GenSEC challenge to improve speech transcription accuracy.
Chao-Han Huck Yang, Taejin Park, Yuan Gong
― 4 min read
A novel assessment method for schizophrenia using multimodal data.
Gowtham Premananth, Carol Espy-Wilson
― 5 min read
New methods are helping machines better interpret individual sounds.
Sripathi Sridhar, Mark Cartwright
― 6 min read
An overview of keyword spotting technologies and their challenges with the Urdu language.
Syed Muhammad Aqdas Rizvi
― 6 min read
Research reveals the difficulties in speech recognition of police radio transmissions.
Tejes Srivastava, Ju-Chieh Chou, Priyank Shroff
― 7 min read
PDMX offers a vast collection of public domain symbolic music for AI development.
Phillip Long, Zachary Novack, Taylor Berg-Kirkpatrick
― 6 min read
A study shows i-vectors can compete with complex models in speaker recognition.
Zakaria Aldeneh, Takuya Higuchi, Jee-weon Jung
― 5 min read
A study on how design choices affect speech foundation models.
Li-Wei Chen, Takuya Higuchi, He Bai
― 7 min read
A new method assesses self-supervised speech models using rank measurement.
Zakaria Aldeneh, Vimal Thilak, Takuya Higuchi
― 5 min read
Study highlights advances in robot emotion recognition using Vision Transformers.
Ruchik Mishra, Andrew Frye, Madan Mohan Rayguru
― 6 min read
Research highlights the importance of fair diagnosis in respiratory illnesses.
Rachel Pfeifer, Sudip Vhaduri, James Eric Dietz
― 7 min read
MusicLIME helps explain AI's approach to analyzing music through audio and lyrics.
Theodoros Sotirou, Vassilis Lyberatos, Orfeas Menis Mastromichalakis
― 6 min read
Discover how quantum computing is reshaping musical creativity with the Variational Quantum Harmonizer.
Paulo Vitor Itaboraí, Peter Thomas, Arianna Crippa
― 11 min read
MCMamba model improves speech quality in noisy environments using spatial and spectral information.
Wenze Ren, Haibin Wu, Yi-Cheng Lin
― 4 min read
This study evaluates low-latency methods for improving speech quality in noisy conditions.
Haibin Wu, Sebastian Braun
― 6 min read
Examining how 2D and 3D gestures affect virtual character communication.
Téo Guichoux, Laure Soulier, Nicolas Obin
― 7 min read
A study on enhancing voice recognition systems for noisy settings.
Muhammad Sudipto Siam Dip, Md Anik Hasan, Sapnil Sarker Bipro
― 6 min read
Researchers use speech to identify and monitor various health conditions.
Catarina Botelho, Alberto Abad, Tanja Schultz
― 7 min read
RF-GML measures audio quality without needing a reference signal.
Arijit Biswas, Guanxin Jiang
― 5 min read
Learn how room equalization enhances audio experiences in various environments.
James Brooks-Park, Martin Bo Møller, Jan Østergaard
― 6 min read
StyleTTS-ZS offers efficient, high-quality speech synthesis without extensive speaker training.
Yinghao Aaron Li, Xilin Jiang, Cong Han
― 5 min read
A new method enhances synthesized ensemble singing by modeling singer interactions.
Hiroaki Hyodo, Shinnosuke Takamichi, Tomohiko Nakamura
― 5 min read
A new framework enhances speech recognition by modeling sound relationships effectively.
Zheng Nan, Ting Dang, Vidhyasaharan Sethu
― 4 min read