Shinji Watanabe

Quasicrystals exhibit unique magnetic behaviors challenging our traditional understanding of magnetism.

2025-10-09T07:36:27+00:00 ― 5 min read

Computation and Language: Advancements in Speech Summarization with BASS

BASS improves summarization of long audio by processing in blocks.

2025-10-08T15:05:55+00:00 ― 5 min read

Computation and Language: Advancements in Speech Recognition Technology

The Bayes Risk Transducer improves speech recognition efficiency and accuracy.

2025-10-06T21:31:36+00:00 ― 5 min read

Audio and Speech Processing: VoxtLM: A Unified Approach to Speech and Text

VoxtLM combines speech recognition, synthesis, text generation, and continuation in one model.

2025-09-13T11:02:45+00:00 ― 4 min read

Audio and Speech Processing: Introducing AV-SUPERB: A New Benchmark for Audio-Visual Models

AV-SUPERB evaluates audio-visual models across a range of tasks.

2025-09-08T22:32:35+00:00 ― 5 min read

Computation and Language: Advancements in Simultaneous Speech Translation

Improving real-time translations through innovative methods and smart policies.

2025-09-07T17:23:35+00:00 ― 5 min read

Audio and Speech Processing: Improving Speech Recognition with New Techniques

A look at advancements in speech recognition to boost speed and accuracy.

2025-09-03T21:05:05+00:00 ― 5 min read

Computation and Language: Advancements in Speech Translation Through Context

New methods improve speech translation by focusing on contextual information.

2025-09-02T22:24:45+00:00 ― 5 min read

Sound: Advancing Voice Technology with Code-Switching Data

A new method improves voice recognition for code-switching users.

2025-09-02T21:36:10+00:00 ― 5 min read

Audio and Speech Processing: A Universal Approach to Speech Enhancement

This research presents a model for improving speech clarity across different conditions.

2025-09-02T02:10:10+00:00 ― 5 min read

Sound: The Rise of Automated Audio Captioning

Exploring advancements in automated audio captioning and its impact on accessibility.

2025-09-02T01:21:35+00:00 ― 5 min read

Computation and Language: Documenting Endangered Languages with IGT

A new method supports the preservation of at-risk languages through detailed documentation.

2025-08-27T17:35:42+00:00 ― 8 min read

Audio and Speech Processing: Evaluating Speech Processing Models with SUPERB

A new framework for assessing foundation models in speech tasks.

2025-08-11T09:31:05+00:00 ― 8 min read

Exploring hedgehog and antihedgehog states in unique magnetic materials.

2025-08-03T10:24:48+00:00 ― 5 min read

Audio and Speech Processing: Reducing Cross-Talk for Clearer Speech

A new system improves speech clarity in multi-speaker environments.

2025-08-02T14:10:50+00:00 ― 5 min read

Audio and Speech Processing: Introducing the 4D Model in Speech Recognition

A new model improves speech recognition using multiple decoding methods.

2025-08-01T01:44:35+00:00 ― 6 min read

Computation and Language: Advancements in Automatic Speech Recognition Technology

New methods improve accuracy and efficiency in speech recognition systems.

2025-07-22T03:41:05+00:00 ― 6 min read

Audio and Speech Processing: SynesLM: Advancing Audio-Visual Speech Technology

A new model integrates audio and visual data for speech recognition and translation.

2025-07-06T20:04:15+00:00 ― 6 min read

Computation and Language: Real-Time Translation: Bridging Language Gaps

This system translates English speech to German text instantly for seamless communication.

2025-06-27T20:53:06+00:00 ― 6 min read

Immunology: COVID-19 Variants and Vaccine Responses: What We Know

New variants of COVID-19 challenge current vaccines and highlight the need for ongoing research.

2025-06-15T20:20:03+00:00 ― 5 min read

Sound: ESPnet-EZ: Simplifying Speech Model Development

An easy-to-use tool for fine-tuning speech models without complex code.

2025-06-11T15:12:30+00:00 ― 6 min read

Computation and Language: Advancements in Speech Recognition with LLMs

Exploring the GenSEC challenge to improve speech transcription accuracy.

2025-06-10T18:57:55+00:00 ― 4 min read

Computation and Language: Advancements in Multilingual Speech Translation Systems

New methods enhance translation accuracy and efficiency for multiple languages.

2025-06-10T16:14:30+00:00 ― 6 min read

Computation and Language: Advances in Text-to-Speech Technology: Preference Alignment

Discover how preference alignment improves text-to-speech systems for better user experiences.

2025-06-10T06:53:36+00:00 ― 5 min read

Audio and Speech Processing: Advancements in Speaker Recognition Using i-Vectors

A study shows i-vectors can compete with complex models in speaker recognition.

2025-06-10T06:49:10+00:00 ― 5 min read

Audio and Speech Processing: Design Choices Impacting Speech Model Performance

A study on how design choices affect speech foundation models.

2025-06-10T06:00:35+00:00 ― 7 min read

Audio and Speech Processing: EVA: A New Era in Audiovisual Speech Recognition

EVA combines audio and visual signals for better speech recognition accuracy.

2025-06-07T22:08:20+00:00 ― 4 min read

Audio and Speech Processing: Evaluating Neural Audio Codecs: Insights from Codec-SUPERB Challenge

A look at the Codec-SUPERB challenge results and codec performance metrics.

2025-06-05T06:58:50+00:00 ― 5 min read

Audio and Speech Processing: Advancements in Neural Codecs with ESPnet-Codec

ESPnet-Codec enhances training and evaluation of neural codecs for audio and speech.

2025-06-03T03:09:30+00:00 ― 7 min read

Sound: Advancements in Automatic Speech Recognition

New methods improve how machines recognize spoken language.

2025-04-20T10:37:12+00:00 ― 8 min read

Sound: Meet VERSA: Your Audio Evaluation Companion

VERSA evaluates speech, audio, and music quality effectively.

2025-01-28T09:33:18+00:00 ― 9 min read

Audio and Speech Processing: Audiovisual Speech Recognition: A New Frontier

Learn how AV-ASR combines audio and visuals for better speech recognition.

2025-01-24T21:39:36+00:00 ― 6 min read