Haizhou Li

Computation and Language Introducing the Comprehensive Medical Benchmark for LLMs in China

A new benchmark for evaluating language models in Chinese medical contexts.

2025-10-07T18:51:24+00:00 ― 9 min read

Audio and Speech Processing Advancements in Acoustic Word Embeddings

A new model improves how computers process spoken language.

2025-10-07T04:16:50+00:00 ― 4 min read

Computation and Language Harnessing Holistic Conversational Recommender Systems

A look at conversational recommenders using real dialogue for better suggestions.

2025-09-26T18:12:24+00:00 ― 6 min read

Artificial Intelligence Advancements in Emotion-Aware Text-to-Speech Technology

New model EmoPP enhances speech with emotional cues.

2025-09-24T07:13:12+00:00 ― 5 min read

Computation and Language AceGPT: Bridging Language and Culture for Arabic Speakers

AceGPT enhances Arabic language processing tailored for local culture and values.

2025-09-23T18:42:42+00:00 ― 5 min read

Sound New System Improves Voice Extraction from Unstable Head Positions

PIAVE helps machines extract voices clearly, even when speakers turn their heads.

2025-09-12T19:39:40+00:00 ― 6 min read

Sound Advancements in Text-Based Speech Editing

FluentEditor improves audio editing by focusing on natural flow and consistency.

2025-09-07T20:37:55+00:00 ― 4 min read

Neural and Evolutionary Computing Advancements in Spiking Neural Network Training

New learning methods enhance efficiency and accuracy of spiking neural networks.

2025-09-03T02:03:54+00:00 ― 6 min read

Audio and Speech Processing Advancements in Multimodal Processing with CoAVT

CoAVT integrates audio, visual, and text data for enhanced understanding.

2025-08-28T12:02:50+00:00 ― 7 min read

Audio and Speech Processing Advancing Active Speaker Detection Technology

New methods improve audio-visual speaker detection in challenging environments.

2025-08-14T01:29:10+00:00 ― 7 min read

Audio and Speech Processing Advancing Audio-Visual Target Speaker Extraction with SEANet

SEANet improves speaker isolation by reducing noise in audio processing.

2025-08-08T20:47:20+00:00 ― 6 min read

Computation and Language Assessing NLG Evaluation with AdvEval Framework

AdvEval exposes weaknesses in Natural Language Generation evaluation metrics.

2025-08-08T07:29:42+00:00 ― 6 min read

Computation and Language Enhancing Dialogue Systems Through Mutual Learning

A new approach improves dialogue systems by combining topic and rhetorical structures.

2025-08-04T06:19:30+00:00 ― 6 min read

Audio and Speech Processing Advancements in Speech Synthesis with ARDiT

New model ARDiT improves text-to-speech synthesis and speech editing.

2025-07-31T07:55:45+00:00 ― 5 min read

Audio and Speech Processing Advancements in Target Speech Diarization Technology

A look at new methods for understanding overlapping speech during conversations.

2025-07-30T14:06:55+00:00 ― 8 min read

Audio and Speech Processing Advances in Cross-Lingual Voice Conversion

A new method improves voice conversion between languages while preserving speaker traits.

2025-07-27T15:40:10+00:00 ― 4 min read

Computation and Language The Importance of Data Selection in Language Models

A review of how data selection improves language model performance.

2025-07-26T03:06:00+00:00 ― 4 min read

Audio and Speech Processing Enhancing Face and Voice Recognition Technology

A new framework improves the association between faces and voices, especially in noisy settings.

2025-07-10T17:11:20+00:00 ― 5 min read

Sound Advancements in Sound Source Localization with Incremental Learning

A new method improves sound localization accuracy while ensuring data privacy.

2025-06-14T07:59:10+00:00 ― 4 min read

Sound Advancements in Accent Conversion Techniques

A new method for generating accented speech using text transliteration.

2025-06-11T06:18:05+00:00 ― 6 min read

Audio and Speech Processing E1 TTS: A New Era in Text-to-Speech Technology

E1 TTS transforms text into natural speech faster and more efficiently.

2025-06-11T05:29:30+00:00 ― 5 min read

Audio and Speech Processing Matryoshka Speaker Embeddings: A Flexible Approach to Voice Recognition

Discover how Matryoshka embeddings improve speaker recognition efficiency and flexibility.

2025-06-02T20:40:50+00:00 ― 4 min read

Sound Advancing Multi-Audio Processing with MALLM

Introducing a new model and benchmark for evaluating multi-audio tasks.

2025-05-31T19:17:15+00:00 ― 5 min read

Sound Using Visual Cues to Clear Up Speech in Noise

New method enhances speech clarity using visual information from the surroundings.

2025-05-18T20:42:14+00:00 ― 5 min read

Sound Bringing Emotion to Machines: The Future of TTS

Discover how emotional TTS changes communication with machines, making them more relatable.

2025-02-23T02:25:48+00:00 ― 6 min read