Advancements in Competitive Co-evolutionary Algorithms
Examining the role of algorithms in agent training through competition.
― 7 min read
In recent years, machine learning has made significant progress. A key factor in this development is the use of large sets of training data. For agents that interact with their surroundings, it's important to place them in diverse and complex environments. However, creating such environments manually can be tough and costly.
A convenient way to address this challenge is to place adaptive agents in scenarios where they interact with other adaptive agents pursuing different goals. This method is known as competitive co-evolution, or self-play. In these situations, learning agents face constantly changing conditions created by the actions of the other agents, which allows significant amounts of training data to be generated automatically.
Benefits of Competitive Settings
Competitive environments offer several advantages. They can create effective learning pathways, where the challenges gradually increase as the skills of the agents improve. This means that agents become better at handling complex situations as they learn. Additionally, competitive settings can promote a type of adversarial learning, where the training data is designed to challenge the agents' weaknesses.
However, placing agents in a competitive setting does not guarantee that they will improve over time. The evolutionary process can end in several ways: one side may completely outcompete the other, driving it to extinction; one side may reach a level of performance that prevents the other from improving; both sides may settle on a temporary peak in performance; or the two sides may enter a cycle in which they keep switching between strategies without making real progress.
Many early attempts to evolve competitive robots ended in this last scenario. After some initial improvement, the agents became stuck in cycles, repeatedly readjusting their strategies without making real progress.
Requirements for True Progress
To encourage genuine progress in competitive co-evolution, specific algorithms must be employed. These algorithms should:
- Allow agents to compete against both current and previous opponents.
- Expose agents to a variety of opponents.
- Identify and keep only those variations that lead to real improvements.
Additionally, it's important to have proper measures to assess progress and evaluate the effectiveness of different solutions.
Measuring Progress
In experiments where agents operate alone, how well they perform can be directly measured. This is usually done by observing their fitness level, which may vary due to random changes in their environment. However, these changes aren't designed to be adversarial, making it easier for agents to adapt to them.
In competitive settings, an agent's fitness depends strongly on their opponents. This means the way opponents are chosen plays a significant role in how the agents evolve over time. This raises a few challenges:
- Identifying the best-performing solutions can be tricky since their success is tied to competitors.
- Estimating how effective a solution is can vary greatly depending on opponents.
- Comparing the effectiveness of different conditions can be challenging.
One way to address these issues is by selecting a specific group of strong opponents, often referred to as "champions." These champions are usually the best from independent experiments.
Another method is called "cross-test." This involves evaluating the top solutions from one experiment against the best opponents from another.
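As a concrete illustration, the sketch below shows how such a cross-test could be organized. The `play(agent, opponent)` function, the champion lists, and the score range are assumptions made for the example, not details taken from the paper.

```python
import itertools
import statistics

def cross_test(champions_a, champions_b, play):
    """Average score of the champions of run A against the champions of run B.

    `play(agent, opponent)` is assumed to return the agent's score for one
    match (e.g. in [0, 1]); names and signature are illustrative only.
    """
    scores = [play(a, b) for a, b in itertools.product(champions_a, champions_b)]
    return statistics.mean(scores)

# Usage: if cross_test(A, B, play) > cross_test(B, A, play), the champions of
# run A generalize better against run B's champions than vice versa.
```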
Since measuring progress can be complex in competitive settings, it's important to differentiate between different types of progress:
- Local progress: improvement against current opponents.
- Historical progress: improvement against opponents from earlier generations.
- Global progress: improvement against all possible opponents.
Local progress can be measured by evaluating agents against opponents from recent generations. Historical progress can be assessed by testing agents against opponents from all earlier generations, typically visualized as a matrix of generation-versus-generation performance. Global progress is estimated by testing agents against opponents that were never part of their training process.
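To make these three notions concrete, here is a minimal sketch, assuming a `play(agent, opponent)` match function and per-generation lists of agents and opponents; the names are illustrative rather than taken from the original code.

```python
import numpy as np

def historical_matrix(agents_by_gen, opponents_by_gen, play):
    """Average score of each generation's agents (rows) against each
    generation's opponents (columns); improvement along a row over earlier
    columns indicates historical progress."""
    n = len(agents_by_gen)
    matrix = np.zeros((n, len(opponents_by_gen)))
    for i, agents in enumerate(agents_by_gen):
        for j, opponents in enumerate(opponents_by_gen):
            matrix[i, j] = np.mean([play(a, o) for a in agents for o in opponents])
    return matrix

def global_progress(agents_by_gen, holdout_opponents, play):
    """Average score of each generation against opponents never met during
    evolution, used as an estimate of global progress."""
    return [np.mean([play(a, o) for a in agents for o in holdout_opponents])
            for agents in agents_by_gen]
```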
Competitive Evolutionary Algorithms
This section will look into various competitive co-evolutionary algorithms that have been developed. The focus will be on algorithms that aim to maximize expected performance against randomly chosen opponents or average performance against all potential opponents.
Achieving genuine progress requires specialized algorithms. Here are some notable methods:
Archive Algorithm: This keeps a record of the best-performing individuals from earlier generations. Agents are then evaluated against these past opponents, encouraging historical progress. While it doesn't always guarantee global progress, it can lead to more generalized strategies.
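A minimal sketch of this idea is shown below, assuming a `play(candidate, opponent)` match function; the sampling scheme and archive contents are simplified relative to the published algorithm.

```python
import random

class OpponentArchive:
    """Store the champion of each generation and evaluate new candidates
    against a mix of current and archived opponents (simplified sketch)."""

    def __init__(self, sample_size=10):
        self.archive = []
        self.sample_size = sample_size

    def add_champion(self, champion):
        self.archive.append(champion)

    def evaluate(self, candidate, current_opponents, play):
        pool = current_opponents + self.archive
        sample = random.sample(pool, min(self.sample_size, len(pool)))
        return sum(play(candidate, o) for o in sample) / len(sample)
```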
Maxsolve* Algorithm: This variation maintains a maximum number of opponents in the archive. It removes weaker opponents based on performance and seeks to keep the best ones that can push agents toward discovering high-quality solutions.
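The pruning step could look roughly like the following; the difficulty criterion used here is an illustrative simplification, not the exact scoring rule of the Maxsolve* method.

```python
def prune_archive(archive, current_agents, play, max_size):
    """Cap the archive at `max_size` opponents, dropping those the current
    agents already beat most easily (illustrative criterion only)."""
    if len(archive) <= max_size:
        return archive

    def difficulty(opponent):
        # Harder opponents are those against which current agents score least.
        return 1.0 - sum(play(a, opponent) for a in current_agents) / len(current_agents)

    return sorted(archive, key=difficulty, reverse=True)[:max_size]
```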
Archive* Algorithm: This is a newer approach that keeps multiple groups of agents, each helping to create a combined archive of strong opponents. These different agent groups ensure richer competition and more varied challenges.
Generalist Algorithm: Instead of relying on an archive, this algorithm identifies which variations lead to genuine progress and discards those that only exploit the current opponents. Agents are evaluated against a changing set of opponents to promote progress against opponents in general.
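One way to picture this filtering is the acceptance test sketched below: a mutated child replaces its parent only if it scores better against a shared sample of opponents. The function names and the acceptance rule are assumptions for illustration, not the exact published procedure.

```python
import random

def accept_variation(parent, child, opponent_pool, play, n_opponents=20):
    """Keep the mutated child only if it outperforms its parent against the
    same random sample of opponents; otherwise retain the parent."""
    opponents = random.sample(opponent_pool, min(n_opponents, len(opponent_pool)))
    parent_score = sum(play(parent, o) for o in opponents) / len(opponents)
    child_score = sum(play(child, o) for o in opponents) / len(opponents)
    return child if child_score > parent_score else parent
```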
Another approach involves using randomly generated opponents. While this can encourage global progress, it has a major drawback: these opponents do not improve over time, making it challenging for agents to truly develop robust strategies.
These methods must be embedded in an evolutionary algorithm that allows agents to change over time. Historically, traditional evolutionary strategies have been used for this purpose; here, a modern evolution strategy called OpenAI-ES is used. This method is particularly well suited to changing environments, because it allows the evolving population to accumulate information from past experiences while adjusting to new challenges.
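For reference, the core OpenAI-ES update estimates a gradient from the fitness of Gaussian perturbations of the current parameters. The sketch below assumes a `fitness(params)` function that measures performance against the current opponents; the hyperparameter values are placeholders.

```python
import numpy as np

def openai_es_step(theta, fitness, pop_size=50, sigma=0.02, lr=0.01, rng=None):
    """One OpenAI-ES update: sample Gaussian perturbations of the parameter
    vector, rank their fitness, and move along the estimated gradient."""
    rng = rng or np.random.default_rng()
    noise = rng.standard_normal((pop_size, theta.size))
    rewards = np.array([fitness(theta + sigma * eps) for eps in noise])
    # Rank-based reward shaping, commonly used with OpenAI-ES for robustness.
    ranks = rewards.argsort().argsort()
    shaped = ranks / (pop_size - 1) - 0.5
    gradient = noise.T @ shaped / (pop_size * sigma)
    return theta + lr * gradient
```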
The Predator and Prey Problem
To test these algorithms, a predator and prey problem is used. This scenario is widely recognized as challenging and suitable for studying competitive evolution. In this situation, agents must adapt to dynamic and unpredictable conditions.
The robots used in this study are simulated and controlled by neural networks. Predators are evolved to catch the prey as quickly as possible, while prey are evolved to avoid capture for as long as possible: a predator's success is measured by how quickly it touches the prey, and a prey's success by how long it evades capture.
Each algorithm goes through a large number of evaluation steps to determine its effectiveness.
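A simple way to score a single predator-prey episode is sketched below, assuming the simulator reports the time step at which capture occurred (or that no capture occurred within the episode); the exact fitness formulation in the paper may differ.

```python
def episode_fitness(capture_step, max_steps):
    """Complementary fitness for one episode: a quick capture rewards the
    predator, a long evasion rewards the prey (illustrative formulation)."""
    if capture_step is None:            # prey evaded capture for the whole episode
        return 0.0, 1.0                 # (predator_fitness, prey_fitness)
    predator_fitness = 1.0 - capture_step / max_steps
    prey_fitness = capture_step / max_steps
    return predator_fitness, prey_fitness
```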
Results of Different Algorithms
After completing the experiments, results were gathered from multiple independent runs of each method, measuring how well the evolved robots performed against opponents from various stages of evolution.
All methods showed some historical progress, meaning that the robots often performed better against older opponents than against newer ones. Notably, the Generalist algorithm consistently led to improved performance across all stages. In contrast, the other algorithms showed more variability and occasional regressions.
When specifically looking at the performance of robots from the latest generation against older opponents, the Generalist method stood out. It demonstrated a clear ability to adapt and improve consistently.
To further assess the effectiveness of each method, cross-tests were conducted, comparing the top-performing agents from each algorithm against each other. Results showed that the Generalist method consistently outperformed the others, establishing itself as the most effective approach.
Observations of Agent Behavior
The champions developed through these algorithms exhibited advanced behavior. For example, some were able to move forward and backward, adjusting their direction based on the situation. This adaptability allowed them to handle a variety of opponents effectively.
However, some champions showed vulnerabilities against specific strategies employed by their opponents. These interactions provided valuable insights into the strengths and weaknesses of the agents.
Conclusion
This analysis highlighted the factors that influence competitive evolution and genuine progress. Several methods for measuring both historical and global progress were introduced, along with discussions on what makes progress possible.
The four algorithms analyzed were: the Archive algorithm, the Maxsolve* algorithm, the Archive* algorithm, and the Generalist algorithm. All methods showed the potential to achieve global progress over the long term, but the rate of improvement differed significantly.
Among the methods, the Generalist algorithm proved to be the most effective, consistently producing agents that improved over time against various opponents. The Archive* algorithm also showed promise, outperforming some of the other methods.
Future work should focus on whether these findings hold in different settings and if continuous evolutionary progress can lead to solutions that keep advancing without bounds.
Title: Global Progress in Competitive Co-Evolution: a Systematic Comparison of Alternative Methods
Abstract: We investigate the use of competitive co-evolution for synthesizing progressively better solutions. Specifically, we introduce a set of methods to measure historical and global progress. We discuss the factors that facilitate genuine progress. Finally, we compare the efficacy of four qualitatively different algorithms. The selected algorithms promote genuine progress by creating an archive of opponents used to evaluate evolving individuals, generating archives that include high-performing and well-differentiated opponents, identifying and discarding variations that lead to local progress only (i.e. progress against a subset of possible opponents and retrogressing against a larger set). The results obtained in a predator-prey scenario, commonly used to study competitive evolution, demonstrate that all the considered methods lead to global progress in the long term. However, the rate of progress and the ratio of progress versus retrogressions vary significantly among algorithms.
Authors: Paolo Pagliuca, S. Nolfi
Last Update: 2024-06-08
Language: English
Source URL: https://www.biorxiv.org/content/10.1101/2024.06.06.597852
Source PDF: https://www.biorxiv.org/content/10.1101/2024.06.06.597852.full.pdf
Licence: https://creativecommons.org/licenses/by/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.
Thank you to biorxiv for use of its open access interoperability.