Latest Articles for Performance Evaluation

Machine Learning CDALBench: A New Benchmark for Active Learning

CDALBench offers a reliable testing ground for various Active Learning methods.

2025-07-03T16:11:18+00:00 ― 6 min read

Computer Vision and Pattern Recognition A New Method for Performance Estimation Without Training Data

Introducing a method to estimate model performance without relying on training data.

2025-07-02T15:41:54+00:00 ― 6 min read

Computation and Language Impact of Format Restrictions on LLM Performance

This article examines how structured generation affects language model reasoning and comprehension.

2025-07-02T00:09:42+00:00 ― 5 min read

Hardware Architecture Advancements in High-Frequency Computing Technologies

Exploring the benefits of cryogenic and superconducting computing for improved speed and efficiency.

2025-07-01T12:58:12+00:00 ― 5 min read

Computer Vision and Pattern Recognition Evaluating SAM2 in 3D Medical Imaging

A look into SAM2's performance and challenges in medical image segmentation.

2025-06-25T09:06:30+00:00 ― 5 min read

Computer Vision and Pattern Recognition Improving Long-term Action Quality Assessment

A new method improves the evaluation of performances in long videos.

2025-06-24T09:56:06+00:00 ― 6 min read

Machine Learning Challenges and Insights in Multi-Task Learning

Exploring how multi-task learning affects model performance and generalization.

2025-06-22T07:06:42+00:00 ― 6 min read

Machine Learning Analyzing Machine Learning vs. Deep Learning for Tabular Data

This study benchmarks machine learning against deep learning on tabular datasets to compare their effectiveness.

2025-06-21T06:37:18+00:00 ― 6 min read

Computers and Society A New Way to Rank Law Firms

Our ranking system uses real outcomes to better evaluate law firm performance.

2025-06-20T09:49:06+00:00 ― 12 min read

Distributed, Parallel, and Cluster Computing Managing Timing Anomalies in Self-Suspending Tasks

Strategies to handle timing issues in periodic task scheduling.

2025-06-18T18:42:48+00:00 ― 6 min read

Cryptography and Security Optimizing Machine Learning on Secure Hardware

Enhancing efficiency in secure processing of machine learning tasks.

2025-06-17T06:14:30+00:00 ― 6 min read

Computation and Language Predicting Language Model Performance with Fewer Evaluations

A new approach to evaluating language models efficiently, using fewer evaluations.

2025-06-16T14:26:30+00:00 ― 6 min read

Robotics Improving Robot Learning Evaluations for Better Insights

Enhancing robot evaluations can lead to deeper insights into their capabilities.

2025-06-12T03:00:06+00:00 ― 7 min read

Mathematical Software New Library Enhances Gradient-Based Multiobjective Optimization

A new library improves methods for handling complex multiobjective optimization problems.

2025-06-12T01:27:35+00:00 ― 5 min read

Software Engineering Evaluating OpenAI's New Coding Models

This article reviews OpenAI's new coding models and their performance in web applications.

2025-06-09T17:12:00+00:00 ― 5 min read

Neural and Evolutionary Computing Quality-Diversity Algorithms: Balancing Performance and Reproducibility

Examining the role of reproducibility in Quality-Diversity algorithms for real-world applications.

2025-06-08T15:47:18+00:00 ― 7 min read

Machine Learning New Method for Detecting Knee Points in Data

A deep learning approach improves knee point detection accuracy in noisy datasets.

2025-06-07T19:46:30+00:00 ― 8 min read

Artificial Intelligence Evaluating AI: Challenges and Methods

Assessing AI capabilities is essential for safety and effectiveness.

2025-06-06T19:09:12+00:00 ― 5 min read

Computation and Language Evaluating AI Agents in CRM Systems

A new benchmark tests AI agents in realistic CRM tasks.

2025-05-31T13:19:00+00:00 ― 6 min read

Machine Learning A New Approach to Evaluating Reinforcement Learning Algorithms

Introducing a reliable method for assessing RL algorithm performance through a gap function.

2025-05-31T01:55:44+00:00 ― 5 min read

Optimization and Control A New Approach to Set Optimization Problems

Introducing a method for finding weakly minimal solutions in set optimization.

2025-05-30T15:49:46+00:00 ― 3 min read

Databases Understanding Database Transactions and Their Importance

Learn how database transactions ensure data consistency and efficiency.

2025-05-25T16:37:39+00:00 ― 7 min read

Machine Learning Introducing Milabench: A New Benchmark Tool for AI Research

Milabench provides tailored benchmarks to improve AI performance evaluations.

2025-05-20T12:26:06+00:00 ― 5 min read

Robotics New Method for Testing Soft Robotic Grippers

SoGraB offers a standardized way to evaluate soft grippers' performance on fragile objects.

2025-05-03T02:33:20+00:00 ― 8 min read

Theoretical Economics Winning Strategies: The Role of Performance Standards in Tournaments

Explore how performance standards shape competition and prize distribution.

2025-04-13T15:26:15+00:00 ― 7 min read

Robotics The Impact of Human-Robot Collaboration

Examining how task difficulty affects robot assistance and user experience.

2025-04-08T03:37:48+00:00 ― 7 min read

Primary Care Research Introducing TAPP: A Game Changer in Primary Care

TAPP helps clinics assess their performance for better patient care.

2025-04-01T02:13:10+00:00 ― 7 min read

Machine Learning DISCO: Choosing the Best AI Models

A new method for selecting pre-trained AI models efficiently.

2025-01-23T21:16:39+00:00 ― 7 min read