A study on improving training efficiency for language models using the SlimPajama dataset.
― 7 min read
Cutting edge science explained simply
A new approach enhances the training of surrogate models in software development.
― 6 min read
This study examines how tokenization impacts gender bias in translation models.
― 6 min read
This method helps neural networks avoid local minima and learn more effectively.
― 6 min read
Two advanced language models for Modern and Rabbinic Hebrew are now available.
― 4 min read
Examining how machine learning models improve predictions in materials research.
― 7 min read
Introducing MetaCLIP for better image-text data collection.
― 7 min read
New methods improve image editing speed and quality using smaller models.
― 5 min read
A new model helps enhance recommendations by addressing noisy user feedback.
― 5 min read
A new approach enhances commit message quality by leveraging best practices.
― 7 min read
A look at how machine translation reflects and reinforces gender bias.
― 7 min read
CleanSheet advances model hijacking without altering training processes.
― 6 min read
Examining how deep neural networks learn and the challenges they face.
― 6 min read
New models aim to improve language tech for Brazilian Portuguese speakers.
― 6 min read
New unbounded language model improves predictions using extensive data.
― 6 min read
Examining harm amplification in text-to-image models and its societal impact.
― 6 min read
WiOpen efficiently recognizes both known and unknown gestures using Wi-Fi technology.
― 6 min read
A new open language model for research and innovation in natural language processing.
― 6 min read
Examining difficulties in recognizing languages in mixed-language communication.
― 7 min read
Learn how animated stickers are made from text and images.
― 4 min read
A new approach helps AI measure uncertainty and improve decision-making accuracy.
― 7 min read
Understanding unlearnable example attacks through game theory for better data protection.
― 6 min read
A study on enhancing language model learning using minimal style changes in training data.
― 11 min read
This article examines how language models can adopt ideological biases from training data.
― 5 min read
Examining biases and rationality in large language models used for financial analysis.
― 6 min read
A critical look at the true capabilities of Generative Adversarial Networks.
― 5 min read
Examining grokking in deep learning and its implications for performance.
― 5 min read
A study explores the Elo rating system's impact on medical student learning.
― 8 min read
MobiLlama offers efficient language processing for devices with limited resources.
― 5 min read
RoadRunner helps robots navigate difficult outdoor terrains safely and efficiently.
― 5 min read
New datasets improve depth estimation models for various environments.
― 5 min read
Examining how to fairly reward artists in the age of AI-generated art.
― 6 min read
Research shows how LLMs can expose training data, raising privacy concerns.
― 5 min read
The Yi model family showcases strong language and multimodal processing capabilities.
― 4 min read
New training framework enhances language model learning through structured data.
― 5 min read
A novel approach to find backdoor samples without needing clean data.
― 8 min read
New tools are helping scientists predict protein stability and its implications for health.
― 6 min read
A two-stage method enhances model performance across different data groups.
― 7 min read
New methods improve accuracy in predicting protein-ligand interactions.
― 7 min read
A guide to understanding music similarity in generative models.
― 9 min read