Learn how to adjust weight decay for better model performance in AdamW.
― 7 min read
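As background for the AdamW entry above, here is a minimal single-parameter sketch of the standard decoupled AdamW update, showing where the `weight_decay` hyperparameter enters. The function name and constants are illustrative and not taken from the linked article.

```python
import math

def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update on a scalar parameter.

    Unlike classic Adam with L2 regularization, the decay term is
    decoupled: it is applied directly to the weight, scaled only by
    the learning rate, and never enters the moment estimates m and v.
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    w = w - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * w)
    return w, m, v

# With zero gradient, only the decay term moves the weight:
# a larger weight_decay pulls the parameter toward zero more strongly.
w_small, _, _ = adamw_step(1.0, 0.0, 0.0, 0.0, t=1, weight_decay=0.0)
w_large, _, _ = adamw_step(1.0, 0.0, 0.0, 0.0, t=1, weight_decay=0.1)
```

Because the decay is decoupled from the adaptive gradient scaling, tuning `weight_decay` in AdamW behaves more predictably across learning rates than L2 regularization inside Adam.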
New language models show promise in understanding and generating human language.
― 5 min read
Weak models can help strong AI models learn more effectively.
― 6 min read
Dynamic datasets enhance model learning and reduce resource needs.
― 6 min read
A new method, smup, improves efficiency in training sparse neural networks.
― 5 min read
Exploring the use of LLMs for enhancing low-level vision tasks like denoising and deblurring.
― 6 min read
This research focuses on generating pseudo-programs to enhance reasoning tasks in models.
― 5 min read
Exploring Task Groupings Regularization to manage model heterogeneity.
― 5 min read
A new method reduces time and cost in training diffusion models.
― 7 min read
FedHPL enhances federated learning efficiency while ensuring data privacy across devices.
― 5 min read
A new method enables the transfer of LoRA modules with synthetic data, minimizing reliance on original data.
― 6 min read
A new method improves model performance using data with noisy labels.
― 6 min read
Exploring efficient training methods for large machine learning models.
― 6 min read
Analyzing how LoRA affects knowledge retention in pretrained models during continual learning.
― 7 min read
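Several entries above refer to LoRA modules; as background, here is a generic plain-Python sketch of the low-rank adapter forward pass, where a frozen weight matrix is augmented by a scaled product of two small trainable matrices. The names and dimensions are illustrative and not drawn from either article.

```python
def matmul(X, Y):
    """Naive matrix product for lists-of-rows matrices."""
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner))
             for j in range(cols)] for i in range(rows)]

def lora_forward(W, A, B, x, alpha=16, r=2):
    """LoRA forward pass: h = W x + (alpha / r) * B A x.

    W is the frozen pretrained weight; only the low-rank factors
    A (r x d_in) and B (d_out x r) are trained.
    """
    base = matmul(W, x)              # frozen pretrained path
    delta = matmul(B, matmul(A, x))  # low-rank adapter path
    scale = alpha / r
    return [[base[i][0] + scale * delta[i][0]] for i in range(len(base))]

W = [[1.0, 2.0], [3.0, 4.0]]
A_adapter = [[1.0, 0.0], [0.0, 1.0]]  # r x d_in
B_adapter = [[0.0, 0.0], [0.0, 0.0]]  # d_out x r, zero-initialized
x = [[1.0], [1.0]]                    # input as a column vector
out = lora_forward(W, A_adapter, B_adapter, x)
```

With `B` zero-initialized (the standard LoRA initialization), the adapter contributes nothing at the start of training, so the model begins exactly at the pretrained weights.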
A new model concept shows how to test AI capabilities effectively.
― 7 min read
Examining the effects of outlier features on neural network training.
― 5 min read
This article details an innovative approach to improving language models using smaller models.
― 7 min read
This article discusses Domain-Inspired Sharpness-Aware Minimization for better model adaptation.
― 4 min read
A new method aims to address bias in language model outputs.
― 7 min read
A new method improves reward models using synthetic critiques for better alignment.
― 11 min read
Analyzing how AI learns from data reveals significant gaps in logic and reasoning.
― 6 min read
Skywork-MoE improves language processing with efficient techniques and innovative architecture.
― 6 min read
Introducing PART, a method to boost machine learning models' accuracy and robustness.
― 5 min read
DEFT enhances diffusion models for effective conditional sampling with minimal resources.
― 6 min read
This study examines how LLMs handle reasoning in abstract and contextual scenarios.
― 5 min read
A new method enhances privacy protection while training deep learning models.
― 5 min read
This article presents a new approach to improving language model training efficiency.
― 4 min read
Introducing a universal framework for sharpness measures in machine learning.
― 5 min read
A new method sheds light on how language models remember training data.
― 8 min read
Learn how to train models for text embeddings wisely and effectively.
― 5 min read
PairCFR improves training models using counterfactual data for better performance.
― 7 min read
Introducing ProFeAT to enhance model robustness against adversarial attacks.
― 6 min read
This article discusses how models can forget biases to improve predictions.
― 5 min read
A study revealing factors that influence in-context learning in Transformers.
― 7 min read
A new method enhances the empirical Fisher approximation for better model optimization.
― 5 min read
A method to enhance student models using insights from stronger teacher models.
― 5 min read
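The teacher-student entry above relies on knowledge distillation; as background, here is a generic sketch of the standard temperature-scaled distillation loss. The function names and constants are illustrative, not taken from the article.

```python
import math

def softmax(logits, T=1.0):
    """Softmax with temperature T; T > 1 softens the distribution,
    exposing the teacher's relative confidence in non-target classes."""
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between softened teacher and student outputs.

    Minimized when the student's softened distribution matches the
    teacher's, at which point it equals the teacher's entropy.
    """
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

In practice this term is usually mixed with the ordinary hard-label cross-entropy, weighted by a hyperparameter, so the student learns from both the data and the teacher.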
Customizing generative models to reflect unique identities through weight space.
― 7 min read
Examining how soft labels enhance machine learning through dataset distillation.
― 6 min read
Discussing methods to improve data management in training large AI models.
― 6 min read
Twin-Merging improves model merging efficiency and adaptability across various tasks.
― 4 min read