SparseGPT improves the speed and efficiency of large language models through parameter pruning.
Xiaoyu Li, Yingyu Liang, Zhenmei Shi
― 4 min read
A new approach speeds up gradient computation, improving transformer efficiency in machine learning.
Yingyu Liang, Zhizhou Sha, Zhenmei Shi
― 4 min read
A method to enhance the efficiency of language models on long text inputs.
Zhenmei Shi, Yifei Ming, Xuan-Phi Nguyen
― 5 min read
Exploring the capabilities and challenges of Transformers in language understanding.
Bo Chen, Xiaoyu Li, Yingyu Liang
― 6 min read
A closer look at how MHNs can enhance machine learning.
Xiaoyu Li, Yuanpeng Li, Yingyu Liang
― 6 min read
A look at the capabilities of Mamba and State-Space Models in AI.
Yifang Chen, Xiaoyu Li, Yingyu Liang
― 6 min read
Discover how tensor attention transforms AI language processing.
Xiaoyu Li, Yingyu Liang, Zhenmei Shi
― 7 min read
New methods improve RoPE attention, speeding up AI computations significantly.
Yifang Chen, Jiayan Huo, Xiaoyu Li
― 5 min read