A study on how design choices affect speech foundation models.
Li-Wei Chen, Takuya Higuchi, He Bai
― 7 min read
Cutting-edge science explained simply
This article discusses methods to enhance speech recognition for accented speech.
Francesco Nespoli, Daniel Barreda, Patrick A. Naylor
― 6 min read
This study addresses challenges in audio language models for low-resource languages.
Potsawee Manakul, Guangzhi Sun, Warit Sirichotedumrong
― 5 min read
Enhancing speech synthesis in Indian languages using inter-pausal units.
Anusha Prakash, Hema A Murthy
― 6 min read
CADA-GAN enhances ASR systems' performance across various recording environments.
Chien-Chun Wang, Li-Wei Chen, Cheng-Kang Chou
― 6 min read
Llama-AVSR merges audio and visual inputs for enhanced speech recognition accuracy.
Umberto Cappellazzo, Minsu Kim, Honglie Chen
― 6 min read
A new method uses virtual shadowing to enhance language learners' pronunciation feedback.
Haopeng Geng, Daisuke Saito, Nobuaki Minematsu
― 6 min read
A new ASR method helps technology understand children's speech better.
Zhonghao Shi, Harshvardhan Srivastava, Xuan Shi
― 5 min read
YOSS uses audio to improve object identification in images.
Wenhao Yang, Jianguo Wei, Wenhuan Lu
― 4 min read
A project developing speech and text datasets for languages with limited resources.
Nikola Ljubešić, Peter Rupnik, Danijel Koržinek
― 5 min read
A new framework enhances voice recognition and adapts to various speech tasks.
Junyi Peng, Ladislav Mošner, Lin Zhang
― 4 min read
New methods improve speech recognition for low-resource languages without text.
Krithiga Ramadass, Abrit Pal Singh, Srihari J
― 4 min read
New methods enhance accuracy in speech recognition systems using phonetic understanding.
Leonid Velikovich, Christopher Li, Diamantino Caseiro
― 5 min read
New acoustic features enhance ASR systems' performance in noisy environments.
Muhammad A. Shah, Bhiksha Raj
― 4 min read
New model achieves faster speech transcription without sacrificing accuracy.
Yael Segal-Feldman, Aviv Shamsian, Aviv Navon
― 4 min read
Discover how Matryoshka embeddings improve speaker recognition efficiency and flexibility.
Shuai Wang, Pengcheng Zhu, Haizhou Li
― 4 min read
New model VoiceGuider improves TTS for diverse speakers.
Jiheum Yeom, Heeseung Kim, Jooyoung Choi
― 6 min read
A new method improves speech recognition for long recordings.
Hao Yen, Shaoshi Ling, Guoli Ye
― 5 min read
New method for speech language models reduces need for extensive data.
Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu
― 6 min read
How new methods are transforming speaker identification in audio recordings.
Petr Pálka, Federico Landini, Dominik Klement
― 6 min read
Learn how TSE improves speech recognition in crowded environments using text cues.
Ziyang Jiang, Xinyuan Qian, Jiahe Lei
― 6 min read
Voice assistants help identify early signs of memory issues in older adults.
Nana Lin, Youxiang Zhu, Xiaohui Liang
― 7 min read
Mamba enhances speech recognition with speed and accuracy, reshaping interaction with devices.
Yoshiki Masuyama, Koichi Miyazaki, Masato Murata
― 4 min read
New method enhances speech clarity using visual information from surroundings.
Xinyuan Qian, Jiaran Gao, Yaodan Zhang
― 5 min read
SAMOS offers a new way to measure speech quality, enhancing naturalness.
Yu-Fei Shi, Yang Ai, Ye-Xin Lu
― 6 min read
Tiny-Align enhances voice assistants for better personal interaction on small devices.
Ruiyang Qin, Dancheng Liu, Gelei Xu
― 6 min read
Introducing VQalAttent, a simpler model for generating realistic machine speech.
Armani Rodriguez, Silvija Kokalj-Filipovic
― 5 min read
A new ASR system enhances medical speech recognition for accurate patient care.
Sourav Banerjee, Ayushi Agarwal, Promila Ghosh
― 6 min read
Exploring how ASR models help identify speech deepfakes effectively.
Davide Salvi, Amit Kumar Singh Yadav, Kratika Bhagtani
― 7 min read
Efficiently tracks speakers in multilingual settings using automatic speech recognition.
Thai-Binh Nguyen, Alexander Waibel
― 6 min read
Improving machine transcription for better understanding of speech disorders.
Jiachen Lian, Xuanru Zhou, Zoe Ezzes
― 5 min read
New model improves Chinese speech recognition accuracy significantly.
Junhong Liang
― 6 min read
Noro enhances voice conversion, making it effective even in noisy settings.
Haorui He, Yuchen Song, Yuancheng Wang
― 6 min read
A new chatbot offering human-like conversations with emotional awareness.
Aohan Zeng, Zhengxiao Du, Mingdao Liu
― 3 min read
Discover how style-agnostic evaluation improves Automatic Speech Recognition systems.
Quinten McNamara, Miguel Ángel del Río Fernández, Nishchal Bhandari
― 7 min read
Learn how adaptive dropout improves efficiency in speech recognition systems.
Yotaro Kubo, Xingyu Cai, Michiel Bacchiani
― 7 min read
Research tests AI's ability to communicate with children like caregivers.
Jing Liu, Abdellah Fourtassi
― 6 min read
A speech-to-text tool transforms spoken math into LaTeX effortlessly.
Evangelia Gkritzali, Panagiotis Kaliosis, Sofia Galanaki
― 6 min read
Revolutionizing text-to-speech with improved efficiency and natural-sounding voices.
Haowei Lou, Helen Paik, Pari Delir Haghighi
― 6 min read
Speech recognition technology enhances digit recognition, especially in noisy environments.
Ali Nasr-Esfahani, Mehdi Bekrani, Roozbeh Rajabi
― 5 min read