This study evaluates low-latency methods for improving speech quality in noisy conditions.
Haibin Wu, Sebastian Braun
― 6 min read
Examining how 2D and 3D gestures affect virtual character communication.
Téo Guichoux, Laure Soulier, Nicolas Obin
― 7 min read
A study on enhancing voice recognition systems for noisy settings.
Muhammad Sudipto Siam Dip, Md Anik Hasan, Sapnil Sarker Bipro
― 6 min read
Researchers use speech to identify and monitor various health conditions.
Catarina Botelho, Alberto Abad, Tanja Schultz
― 7 min read
RF-GML measures audio quality without needing a reference signal.
Arijit Biswas, Guanxin Jiang
― 5 min read
Learn how room equalization enhances audio experiences in various environments.
James Brooks-Park, Martin Bo Møller, Jan Østergaard
― 6 min read
StyleTTS-ZS offers efficient, high-quality speech synthesis without extensive speaker training.
Yinghao Aaron Li, Xilin Jiang, Cong Han
― 5 min read
A new method enhances synthesized ensemble singing by modeling singer interactions.
Hiroaki Hyodo, Shinnosuke Takamichi, Tomohiko Nakamura
― 5 min read
A new framework enhances speech recognition by modeling sound relationships effectively.
Zheng Nan, Ting Dang, Vidhyasaharan Sethu
― 4 min read
New masking method improves voice conversion by separating speaker identity from phonetics.
Philip H. Lee, Ismail Rasim Ulgen, Berrak Sisman
― 5 min read
Innovative techniques enhance music-text model training with limited resources.
Ilaria Manco, Justin Salamon, Oriol Nieto
― 7 min read
New methods enhance audio tagging for diverse music styles and cultural preservation.
Charilaos Papaioannou, Emmanouil Benetos, Alexandros Potamianos
― 6 min read
A dataset of home sounds promotes safety and comfort for older adults.
Gabriel Bibbó, Thomas Deacon, Arshdeep Singh
― 5 min read
SD-Codec improves audio processing by separating different sound types effectively.
Xiaoyu Bie, Xubo Liu, Gaël Richard
― 5 min read
This article discusses methods to enhance speech recognition for accented speech.
Francesco Nespoli, Daniel Barreda, Patrick A. Naylor
― 6 min read
A new approach enhances the interpretability of spoof speech detection.
Manasi Chhibber, Jagabandhu Mishra, Hyejin Shim
― 5 min read
A look at the new single-stage TTS system improving speech generation.
Gerard I. Gállego, Roy Fejgin, Chunghsin Yeh
― 6 min read
This study addresses challenges in audio language models for low-resource languages.
Potsawee Manakul, Guangzhi Sun, Warit Sirichotedumrong
― 5 min read
This study enhances emotion recognition systems for less common languages using high-resource data.
Hsi-Che Lin, Yi-Cheng Lin, Huang-Cheng Chou
― 6 min read
A model improves speech tasks in multilingual settings, addressing code-switching challenges.
Jing Xu, Daxin Tan, Jiaqi Wang
― 5 min read
DeFT-Mamba improves sound separation and classification in noisy environments.
Dongheon Lee, Jung-Woo Choi
― 5 min read
CADA-GAN enhances ASR systems' performance across various recording environments.
Chien-Chun Wang, Li-Wei Chen, Cheng-Kang Chou
― 6 min read
EVA combines audio and visual signals for better speech recognition accuracy.
Yihan Wu, Yifan Peng, Yichen Lu
― 4 min read
A new framework simplifies speech recognition in busy environments.
Jinhan Wang, Weiqing Wang, Kunal Dhawan
― 5 min read
Llama-AVSR merges audio and visual inputs for enhanced speech recognition accuracy.
Umberto Cappellazzo, Minsu Kim, Honglie Chen
― 6 min read
WMCodec enhances audio watermarking for better security and authenticity.
Junzuo Zhou, Jiangyan Yi, Yong Ren
― 5 min read
New models tackle sound classification with limited training data.
Jin Jie Sean Yeo, Ee-Leng Tan, Jisheng Bai
― 5 min read
A new approach improves fake audio detection using pretrained models.
Zhiyong Wang, Ruibo Fu, Zhengqi Wen
― 5 min read
New method improves speech generation quality and efficiency.
Xin Qi, Ruibo Fu, Zhengqi Wen
― 4 min read
A method combining labeled and unlabeled data enhances sound source detection.
Vadim Rozenfeld, Bracha Laufer Goldshtein
― 5 min read
Discover how audio cues aid players in table tennis.
Thomas Gossard, Julian Schmalzl, Andreas Ziegler
― 6 min read
A system prioritizing melody while offering control over orchestral music generation.
Dinh-Viet-Toan Le, Yi-Hsuan Yang
― 5 min read
A new method uses virtual shadowing to enhance language learners' pronunciation feedback.
Haopeng Geng, Daisuke Saito, Nobuaki Minematsu
― 6 min read
New methods improve binaural audio quality in challenging sound environments.
Ami Berger, Vladimir Tourbabin, Jacob Donley
― 8 min read
A new ASR method helps technology understand children's speech better.
Zhonghao Shi, Harshvardhan Srivastava, Xuan Shi
― 5 min read
Composer uses text prompts to create complex music compositions in MIDI format.
Jakub Poćwiardowski, Mateusz Modrzejewski, Marek S. Tatara
― 5 min read
A resource for studying singing patterns in Japanese idol music.
Hitoshi Suda, Shunsuke Yoshida, Tomohiko Nakamura
― 6 min read
ViolinDiff enhances the realism of computer-generated violin music.
Daewoong Kim, Hao-Wen Dong, Dasaem Jeong
― 5 min read
Combining features enhances underwater sound classification accuracy.
Amirmohammad Mohammadi, Irené Masabarakiza, Ethan Barnes
― 6 min read
Transfer learning improves audio classification for underwater sound detection.
Amirmohammad Mohammadi, Tejashri Kelhe, Davelle Carreiro
― 6 min read