A study on combining updates for language models effectively.
― 7 min read
Cutting-edge science explained simply
How low-bit quantization affects large language models during training.
― 6 min read