New methods enhance performance of language models by optimizing memory usage.
― 5 min read
Cutting-edge science explained simply
Combining SmoothQuant and GPTQ improves the efficiency and performance of large language models.
― 6 min read