Ilya Soloveychik

New methods enhance the performance of language models by optimizing memory usage.

2025-08-29T22:15:42+00:00 ― 5 min read

Combining SmoothQuant and GPTQ improves the efficiency and performance of large language models.

2025-08-11T22:23:42+00:00 ― 6 min read