Learn how model compression improves efficiency of large language models.
― 5 min read
A method to enhance language models' efficiency and performance.
― 6 min read
New methods improve neural network performance on limited-resource devices.
― 6 min read
RC-FED reduces communication costs while maintaining model quality in federated learning.
― 5 min read
This study examines performance and conditions for quantized neural networks under fixed-point arithmetic.
― 6 min read
A new algorithm improves coordination among nodes under communication limits.
― 6 min read
This article discusses DilateQuant for improving diffusion models' speed and accuracy.
― 7 min read
AXE improves model performance while minimizing overflow in accumulator-aware quantization.
― 5 min read
A new chatbot assists students with STEM multiple-choice questions.
― 6 min read
P4Q combines fine-tuning and quantization for efficient visual-language model performance.
― 5 min read
Optimizing DNNs with power-of-two quantization for resource-limited devices.
― 5 min read
Innovative methods aim to make large language models more efficient and deployable.
― 5 min read
1-bit models show great potential in machine learning efficiency and performance.
― 5 min read
Discover how simple tweaks can trick chatbots into unexpected responses.
― 6 min read
Learn about quantization and its impact on language models.
― 6 min read
Precision impacts the effectiveness and cost of language model training.
― 6 min read
Examining how simplifying models affects decision-making clarity and performance.
― 7 min read
MicroScopiQ improves AI models' performance while consuming less energy.
― 5 min read
QuanCrypt-FL enhances security in Federated Learning using advanced techniques.
― 6 min read
A novel method enhances approximate k-nearest-neighbor (AKNN) search for better speed and accuracy.
― 5 min read
Learn how quantization helps optimize large language models for everyday use.
― 5 min read
Super weights are key to language model performance and efficiency.
― 5 min read
This study examines how large language models can misbehave and be manipulated.
― 5 min read
ASER offers a way to enhance quantized language models without losing performance.
― 5 min read
Innovative strategies for running advanced AI on mobile devices.
― 8 min read
ZipNN compresses AI models efficiently, keeping essential details intact.
― 5 min read
Smaller LLMs can assist with code generation but show significant quality issues.
― 5 min read
A new method speeds up AI processing without losing accuracy.
― 5 min read
Learn how ShiftQuant and L1 normalization improve neural network efficiency.
― 4 min read
Keeping AI conversations safe on the go with Llama Guard.
― 6 min read
Model compression techniques enable large models to run smoothly on smaller devices.
― 6 min read
A new method to optimize large language models efficiently.
― 7 min read
A study showcases a hybrid architecture that improves SNN performance and energy efficiency.
― 5 min read
Research shows how to compress diffusion models while maintaining quality.
― 6 min read
Learn about Anda, a new method for managing activation data in LLMs.
― 7 min read
Learn how reinforcement learning enhances machine communication and decision-making.
― 6 min read
A look into hadrons and their interactions using lattice quantum chromodynamics.
― 4 min read
QABBA streamlines time series data analysis for clearer insights.
― 6 min read
Discover how AI models can be fast and easy to understand.
― 8 min read
Learn how lossless compression is reshaping data storage and processing.
― 7 min read