VoxtLM combines speech recognition, synthesis, text generation, and continuation in one model.
― 4 min read
Cutting-edge science explained simply
Exploring advancements in automated audio captioning and its impact on accessibility.
― 5 min read
An overview of advancements in speaker recognition through the VoxCeleb Challenge.
― 4 min read
A study shows i-vectors can compete with complex models in speaker recognition.
― 5 min read
ESPnet-Codec enhances training and evaluation of neural codecs for audio and speech.
― 7 min read