New methods improve music creation through audio analysis and user control.
Haonan Chen, Jordan B. L. Smith, Janne Spijkervet
― 6 min read
Latest Articles
Robin San Roman, Pierre Fernandez, Antoine Deleforge
― 4 min read
Yisi Liu, Bohan Yu, Drake Lin
― 6 min read
Byunggun Kim, Younghun Kwon
― 4 min read
Haoxuan Liu, Zihao Wang, Haorong Hong
― 5 min read
Yan Ru Pei, Ritik Shrivastava, FNU Sidharth
― 5 min read
Researchers explore ultrasonic echoes for accurate distance measurements in quiet indoor settings.
Junpei Honma, Akisato Kimura, Go Irie
― 6 min read
A new model improves vocal separation and melody transcription in music.
Ju-Chiang Wang, Wei-Tsung Lu, Jitong Chen
― 5 min read
Research reveals how neurons in speech models recognize key features of sound.
Tzu-Quan Lin, Guan-Ting Lin, Hung-yi Lee
― 7 min read
A new model streamlines audio production by automatically eliminating breath sounds.
Nidula Elgiriyewithana, N. D. Kodikara
― 6 min read
A self-supervised learning approach reduces the need for labeled audio data.
Chunxi Wang, Maoshen Jia, Meiran Li
― 6 min read
Study reveals voice data's role in recognizing emotions in Spanish speakers.
Elena Ortega-Beltrán, Josep Cabacas-Maso, Ismael Benito-Altamirano
― 5 min read
A new method improves speech clarity in loud environments.
Siyi Wang, Siyi Liu, Andrew Harper
― 5 min read
Innovative approaches aim to improve music quality for those with hearing loss.
Gerardo Roa Dabike, Michael A. Akeroyd, Scott Bannister
― 5 min read
GenRep offers a novel approach to identifying unusual machine sounds with limited data.
Phurich Saengthong, Takahiro Shinozaki
― 5 min read
TF-Mamba enhances sound localization using a novel approach integrating time and frequency data.
Yang Xiao, Rohan Kumar Das
― 5 min read
This article discusses efficient training methods for speech models using self-supervised learning.
Andy T. Liu, Yi-Cheng Lin, Haibin Wu
― 4 min read
A new architecture improves sound detection across diverse environments.
Zehao Wang, Haobo Yue, Zhicheng Zhang
― 5 min read
A new model improves music generation by focusing on individual instruments.
Zhongweiyang Xu, Debottam Dutta, Yu-Lin Wei
― 5 min read
Introducing DENSE, a method enhancing target speech extraction using dynamic embeddings.
Yiwen Wang, Zeyu Yuan, Xihong Wu
― 6 min read
A novel method improves audio transformation while preserving melody and sound quality.
Michele Mancusi, Yurii Halychanskyi, Kin Wai Cheuk
― 6 min read
This method enhances recognition accuracy for uncommon names in speech outputs.
Ernest Pusateri, Anmol Walia, Anirudh Kashi
― 6 min read
A new model improves detection of audio deepfakes with continuous learning.
Tuan Duy Nguyen Le, Kah Kuan Teh, Huy Dat Tran
― 5 min read
An overview of audio-visual speaker diarization methods, challenges, and systems.
Victoria Mingote, Alfonso Ortega, Antonio Miguel
― 5 min read
This study evaluates neural networks for replicating spring reverb characteristics.
Francesco Papaleo, Xavier Lizarraga-Seijas, Frederic Font
― 7 min read
BigCodec improves sound quality in low-bitrate audio transmission.
Detai Xin, Xu Tan, Shinnosuke Takamichi
― 4 min read
A new dataset enhances multilingual speech technology in India.
Ashwin Sankar, Srija Anand, Praveen Srinivasa Varadhan
― 5 min read
This article discusses the benefits of simplifying transformer models for speech tasks.
Teresa Dorszewski, Albert Kjøller Jacobsen, Lenka Tětková
― 4 min read
Sortformer integrates speaker diarization and ASR for improved audio processing.
Taejin Park, Ivan Medennikov, Kunal Dhawan
― 5 min read
A fresh approach to creating realistic piano sounds using sound component separation.
Riccardo Simionato, Stefano Fasciani
― 8 min read
ParaEVITS improves emotional expression in TTS through natural language guidance.
Xin Jing, Kun Zhou, Andreas Triantafyllopoulos
― 5 min read
Learn how audio inpainting restores missing parts of signals.
Ondřej Mokrý, Peter Balušík, Pavel Rajmic
― 5 min read
New methods improve understanding of spoken language through an innovative dataset.
Lennart Keller, Goran Glavaš
― 5 min read
A new framework enhances voice identity confirmation accuracy.
Massa Baali, Abdulhamid Aldoobi, Hira Dhamyal
― 5 min read
New methods improve human-robot conversation by enhancing speech clarity.
Yue Li, Koen V. Hindriks, Florian A. Kunneman
― 5 min read
New methods improve access to spoken news by segmenting topics more effectively.
Sakshi Deo Shukla, Pavel Denisov, Tugtekin Turan
― 6 min read
A study on LLMs' capabilities in understanding musical intervals, chords, and scales.
Anna Kruspe
― 8 min read
A new method for music tagging using few-shot learning shows promising results.
T. Aleksandra Ma, Alexander Lerch
― 6 min read
FlowSep introduces a fresh method for extracting sounds using language queries.
Yi Yuan, Xubo Liu, Haohe Liu
― 5 min read
SSR-Speech offers new solutions for speech generation and editing.
Helin Wang, Meng Yu, Jiarui Hai
― 5 min read
Advancements in AI make fake audio common, prompting the need for detection.
Hong-Hanh Nguyen-Le, Van-Tuan Tran, Dinh-Thuc Nguyen
― 6 min read
New model enhances speech generation in diverse dialects of pitch-accent languages.
Kazuki Yamauchi, Yuki Saito, Hiroshi Saruwatari
― 5 min read
A new method improves sound localization accuracy while ensuring data privacy.
Xinyuan Qian, Xianghu Yue, Jiadong Wang
― 4 min read
SoloAudio improves sound extraction using advanced techniques and synthetic data.
Helin Wang, Jiarui Hai, Yen-Ju Lu
― 5 min read
OpenACE provides a fair benchmark for assessing audio codecs across various conditions.
Jozef Coldenhoff, Niclas Granqvist, Milos Cernak
― 5 min read
A method to identify faults in electric motors through sound analysis and Bayesian neural networks.
Waldemar Bauer, Marta Zagorowska, Jerzy Baranowski
― 5 min read