Simple Science

Cutting edge science explained simply

# Computer Science # Computer Vision and Pattern Recognition

Realistic Avatar Movement in Virtual Worlds

New method enables avatars to imitate human movements realistically in AR/VR.




Avatars play a key role in making experiences in virtual worlds feel interactive and engaging. However, making an avatar accurately reproduce a user's movements is challenging. Most current AR/VR devices provide only limited data about the user's pose, coming mainly from a headset and a pair of handheld controllers. Additionally, avatars often have body structures that differ from a human's, which makes it hard to translate human movements onto these characters.

This article explores a method developed to deal with these challenges, allowing avatars in virtual environments to mimic human movements in real time, even when the data available is limited. The approach utilizes a combination of techniques to enable avatars to move in ways that feel believable and natural.

Understanding the Challenges

One of the biggest issues in avatar animation is how little information is available about the user's movements. AR/VR users typically have only a head-mounted display and two handheld controllers, which provide no direct sensor data about the lower body. This lack of data can lead to unrealistic avatar animations.

Another significant challenge is that avatars might have different sizes, shapes, and skeletal structures compared to humans. With disparate structures, simply copying human movements onto an avatar is not straightforward. The transformation requires a clear understanding of how to map movements from one form to another.

Lastly, traditional animation methods sometimes produce movements that don't follow the laws of physics, so the actions seem unnatural. When a method does not account for how each part of an avatar's body should physically move, the resulting animations lack weight and realism.

A New Method to Retarget Movements

The proposed method addresses all three problems. It uses reinforcement learning to train a controller that drives the avatar from data about the user's movements. Instead of needing detailed full-body motion capture data for each avatar, the method trains on pre-existing human motion data and can then handle variations in avatar structure.

With this method, the training only requires human motion capture data. This means that creators do not need to set up individual animations for every type of avatar, which is often impractical given the large variety of characters.

How It Works

The core of the approach is based on a Physics Simulation that takes into account the specific characteristics of each avatar. For instance, a dinosaur might have a heavy tail, while a mouse-like character would have shorter legs. By training the system with diverse motions, it learns to adapt the control to different avatars in a way that respects their physical properties.
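To make the role of physical properties concrete, here is a minimal Python sketch of how a heavy tail shifts a character's center of mass, which is one reason the same motion cannot simply be copied between bodies. The `BodyPart` class, the part names, and all masses and offsets are made-up illustrative values, not data from the original work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BodyPart:
    name: str
    mass: float    # kilograms (illustrative value)
    offset: float  # metres along the forward axis, measured from the pelvis


def center_of_mass(parts: List[BodyPart]) -> float:
    """Mass-weighted average position of all parts along the forward axis."""
    total_mass = sum(part.mass for part in parts)
    return sum(part.mass * part.offset for part in parts) / total_mass


# Hypothetical characters: the only difference is the dinosaur's tail.
human_like = [BodyPart("torso", 40.0, 0.0), BodyPart("head", 5.0, 0.0)]
dinosaur = human_like + [BodyPart("tail", 15.0, -1.0)]

print(center_of_mass(human_like))  # centred on the pelvis
print(center_of_mass(dinosaur))    # pulled backwards by the tail
```

A physics simulator accounts for effects like this automatically, so a controller trained inside it has to learn how to keep such a body balanced.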

During training, the system uses human motion capture data to generate an initial estimate of the avatar's pose. The policy guiding the simulation is designed to make the avatar imitate this pose while also adhering to the physical laws programmed into the simulation.

After the training process, the avatars can be controlled solely with the head-mounted display and controllers, without needing additional full-body information. The characters respond in real time to the sparse input data, maintaining physical realism in their movements.

The Role of Reinforcement Learning

Reinforcement learning is used to train a policy, a function that maps observations to actions, which retargets the user's movements to the avatar. The policy learns by trial and error inside the simulated environment, improving its movements based on the feedback it receives.

At each step during training, the policy observes the state of the environment, which includes data from the user's controls and the current position of the simulated avatar. Based on this input, the policy takes action and receives a reward signal that informs it how well it performed.
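The observe, act, and reward cycle described above can be sketched in a few lines of Python. This is a toy illustration only: `ToyEnv` is a one-dimensional stand-in for the physics simulator, and the hand-written `policy` function stands in for a trained neural network. All names and numbers are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Observation:
    sensor: List[float]  # stand-in for headset and controller readings
    avatar: List[float]  # stand-in for the simulated avatar's state


class ToyEnv:
    """One-dimensional stand-in for the simulator: the avatar chases a target height."""

    def __init__(self, target: float = 1.0) -> None:
        self.target = target
        self.height = 0.0

    def observe(self) -> Observation:
        return Observation(sensor=[self.target], avatar=[self.height])

    def step(self, action: float) -> Tuple[Observation, float]:
        self.height += action                     # apply the action
        reward = -abs(self.target - self.height)  # closer to the target is better
        return self.observe(), reward


def policy(obs: Observation, gain: float = 0.5) -> float:
    """Placeholder policy: move a fraction of the way toward the target."""
    return gain * (obs.sensor[0] - obs.avatar[0])


def rollout(env: ToyEnv, steps: int = 10) -> List[float]:
    """Run the observe -> act -> reward loop and collect the rewards."""
    obs = env.observe()
    rewards = []
    for _ in range(steps):
        action = policy(obs)
        obs, reward = env.step(action)
        rewards.append(reward)
    return rewards


rewards = rollout(ToyEnv())
print(rewards[0], rewards[-1])  # the reward rises toward 0 as the avatar catches up
```

In actual training, the rewards collected in such rollouts would be used to adjust the policy's parameters rather than leaving the policy fixed as it is here.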

The aim is to optimize the policy to generate movements that are both realistic and appropriate for the character being controlled. The method uses an algorithm that updates the policy based on past experiences, gradually refining its ability to represent the user's movements accurately.

Training Data Generation

To train the avatars, it’s necessary to generate the input data that the policy will use. This data mimics the information that would be produced by a head-mounted display and controllers. The training process involves creating a rough mapping between the movements of a person and the corresponding positions in the avatar, which may include correcting for any differences in body structure.
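One simple way to picture the synthetic input is to keep only the joints that a headset and two controllers could observe and discard the rest of the motion capture frame. The sketch below does this in Python. The joint names and the dictionary layout are assumptions made for illustration, and a real device would also report orientations, which are left out here.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]

# Assumed joint names: the three points a headset and two controllers can track.
TRACKED_JOINTS = ("head", "left_hand", "right_hand")


def synthesize_sparse_input(frame: Dict[str, Vec3]) -> List[float]:
    """Flatten the tracked joints' positions into one observation vector."""
    observation: List[float] = []
    for name in TRACKED_JOINTS:
        observation.extend(frame[name])
    return observation


# A made-up full-body frame; the foot is present in the data but never exposed.
frame = {
    "head": (0.0, 1.7, 0.0),
    "left_hand": (-0.3, 1.1, 0.2),
    "right_hand": (0.3, 1.1, 0.2),
    "left_foot": (-0.1, 0.0, 0.0),
}
print(synthesize_sparse_input(frame))
```

Because the full-body frame is still available during training, it can serve as the reference the avatar is rewarded for imitating, even though the policy itself only sees the sparse vector.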

Using human motion capture data, the system offsets the positions of key joints to create an approximate pose for the avatar. While the initial mapping may contain artifacts, such as sliding feet due to differences in leg lengths, the physics simulation corrects these errors during the training phase.
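A rough version of this mapping can be sketched as a uniform scale plus a per-joint offset. This is only an illustration under stated assumptions: the `retarget_pose` function, the scale factor, and the offsets are hypothetical and are not taken from the original work.

```python
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


def retarget_pose(
    human_pose: Dict[str, Vec3],
    scale: float,
    offsets: Dict[str, Vec3],
) -> Dict[str, Vec3]:
    """Scale each human joint position, then shift it by that joint's offset."""
    avatar_pose = {}
    for joint, (x, y, z) in human_pose.items():
        dx, dy, dz = offsets.get(joint, (0.0, 0.0, 0.0))  # no offset by default
        avatar_pose[joint] = (scale * x + dx, scale * y + dy, scale * z + dz)
    return avatar_pose


# Made-up example: a character half the human's size with a raised head.
human_pose = {"head": (0.0, 1.7, 0.0), "left_foot": (0.1, 0.0, 0.0)}
avatar_pose = retarget_pose(human_pose, scale=0.5, offsets={"head": (0.0, 0.1, 0.0)})
print(avatar_pose)
```

A purely geometric mapping like this knows nothing about ground contact or balance, which is why artifacts such as sliding feet can appear and are left for the physics simulation to resolve.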

The final model trained does not require any artist-created animations, relying instead on the wide range of human motion capture data available. This flexibility makes it easier to apply this method to many different avatars without the need for extensive manual input.

Reward System Design

A crucial element of the method is the reward system, which guides the training process and influences how well the avatars imitate users. Different components of the reward function help the policy learn what aspects of the user's movement to prioritize.

The imitation reward encourages the avatar to match its pose with the reference pose, which is derived from the human motion data. By comparing joint angles, velocities, and positions, the system can assess how closely the avatar's movements correspond to those of the human.
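The sketch below shows one common way to turn such comparisons into a number between 0 and 1: an exponential of the negative squared error, combined as a weighted sum over joint angles, velocities, and positions. The exact formula, the weights, and the sharpness values are assumptions chosen for illustration and are not claimed to be those of the original work.

```python
import math
from typing import Dict, Sequence


def tracking_term(sim: Sequence[float], ref: Sequence[float], sharpness: float) -> float:
    """Return 1.0 for a perfect match, decaying toward 0.0 as the error grows."""
    squared_error = sum((s - r) ** 2 for s, r in zip(sim, ref))
    return math.exp(-sharpness * squared_error)


# Hypothetical weights and sharpness values, one per compared quantity.
WEIGHTS = {"angles": 0.5, "velocities": 0.2, "positions": 0.3}
SHARPNESS = {"angles": 2.0, "velocities": 0.1, "positions": 10.0}


def imitation_reward(
    sim: Dict[str, Sequence[float]],
    ref: Dict[str, Sequence[float]],
) -> float:
    """Weighted sum of the per-quantity tracking terms."""
    return sum(
        WEIGHTS[key] * tracking_term(sim[key], ref[key], SHARPNESS[key])
        for key in WEIGHTS
    )


reference = {"angles": [0.1, 0.2], "velocities": [0.0, 1.0], "positions": [0.0, 1.0, 0.0]}
perturbed = dict(reference, angles=[0.6, 0.2])  # one joint angle is off by 0.5 rad
print(imitation_reward(reference, reference))
print(imitation_reward(perturbed, reference))
```

The exponential form keeps each term bounded, so a single badly tracked quantity lowers the reward without dominating it.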

Additionally, the contact reward reinforces the importance of maintaining accurate foot placements; it checks if the avatar's feet are in contact with the ground at the right moments. This helps prevent common issues such as sliding or unnatural transitions between poses.
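As a minimal sketch of this idea, the Python code below labels a foot as touching the ground when its height falls under a threshold, then scores how many of the avatar's feet agree with the reference. The threshold value and the fraction-of-matches scoring are assumptions for illustration, not the formulation used in the original work.

```python
from typing import Sequence


def foot_in_contact(foot_height: float, threshold: float = 0.02) -> bool:
    """Treat a foot as grounded when it is within `threshold` metres of the floor."""
    return foot_height <= threshold


def contact_reward(sim_contacts: Sequence[bool], ref_contacts: Sequence[bool]) -> float:
    """Fraction of feet whose contact state matches the reference."""
    if not ref_contacts or len(sim_contacts) != len(ref_contacts):
        raise ValueError("contact lists must be non-empty and of equal length")
    matches = sum(1 for s, r in zip(sim_contacts, ref_contacts) if s == r)
    return matches / len(ref_contacts)


# Reference: left foot planted, right foot lifted mid-step (made-up heights).
ref_contacts = [foot_in_contact(0.0), foot_in_contact(0.15)]
print(contact_reward([True, False], ref_contacts))  # both feet agree
print(contact_reward([True, True], ref_contacts))   # right foot is wrongly grounded
```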

Lastly, the action reward regulates the overall energy expenditure of the avatar's movements. By minimizing sudden changes in torque, the policy is encouraged to produce smoother, more natural motions.
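A smoothness term of this kind can be sketched by penalizing the squared change in torque between consecutive steps, and the three reward components can then be blended with a weighted sum. The `scale` value, the blend weights, and the weighted-sum combination itself are assumptions made for this example rather than details confirmed by the original work.

```python
import math
from typing import Sequence, Tuple


def action_reward(
    torques: Sequence[float],
    previous_torques: Sequence[float],
    scale: float = 0.01,
) -> float:
    """Return 1.0 when the torques are unchanged, less when they jump abruptly."""
    change = sum((t - p) ** 2 for t, p in zip(torques, previous_torques))
    return math.exp(-scale * change)


def total_reward(
    imitation: float,
    contact: float,
    action: float,
    weights: Tuple[float, float, float] = (0.6, 0.3, 0.1),  # hypothetical weights
) -> float:
    """Blend the three reward components into a single training signal."""
    w_imitation, w_contact, w_action = weights
    return w_imitation * imitation + w_contact * contact + w_action * action


previous = [10.0, -5.0]
smooth = action_reward([10.5, -5.2], previous)  # small torque change
jerky = action_reward([25.0, 8.0], previous)    # large torque change
print(smooth, jerky)
print(total_reward(imitation=0.8, contact=1.0, action=smooth))
```

How the components are weighted sets the trade-off between following the reference closely and moving smoothly.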

Resulting Motion Quality

The method has been tested on various avatars, demonstrating its effectiveness across different character types, including mice, dinosaurs, and human-like figures. The resulting motions often match the user's actions quite closely, even when the system operates only with limited data from the headset and controllers.

The avatars are able to perform actions that feel realistic within their respective physical constraints, avoiding problems like jitter or unnatural sliding. The incorporation of physics into the movement control enables the avatars to display behaviors that resonate well with human movements, regardless of the disparities in body structure.

During tests, even avatars that differ significantly in size or morphology from the human user managed to replicate movements in a convincing way. Users who were not part of the training set could also be tracked in real time, underscoring the method's adaptability.

Addressing Limitations

While the method shows promise, it has limitations. When users perform quick, dynamic actions, or when their upper- and lower-body movements are uncoordinated, the system struggles to generate a high-quality response. Because the character is animated through a sequence of torque outputs, tracking errors can accumulate over time and lead to failures in motion generation.

To overcome this, future work could separate the task into two stages: first predicting a full-body pose and then refining it into torque outputs for character control. Such an approach could leverage the strengths of both reinforcement learning and traditional kinematic-based methods, improving overall performance in more complex scenarios.

Future Directions

This research paves the way for expanding avatar animations in virtual environments, allowing users to express themselves through a variety of character types. The method could be further developed to include more complex character designs, moving beyond simple bipedal forms.

Potential areas of exploration may include improving the policy's ability to adapt to even more diverse body types and skeletons. Techniques like graph neural networks may be employed to learn flexible policies that account for increased character complexity.

Moreover, integrating additional feedback mechanisms from the environment could provide better contextual understanding, enhancing the accuracy and realism of the avatar responses.

Conclusion

The proposed method represents an exciting step forward in the field of avatar animation within virtual environments. By utilizing reinforcement learning and physics-based simulation, it effectively bridges the gap between limited user input and complex character movements. The ability to control a vast range of characters while maintaining realism offers new possibilities for user interaction in AR/VR. Continued research and refinement of the technology could lead to even greater advancements, allowing for richer and more immersive experiences in virtual worlds.

Original Source

Title: Physics-based Motion Retargeting from Sparse Inputs

Abstract: Avatars are important to create interactive and immersive experiences in virtual worlds. One challenge in animating these characters to mimic a user's motion is that commercial AR/VR products consist only of a headset and controllers, providing very limited sensor data of the user's pose. Another challenge is that an avatar might have a different skeleton structure than a human and the mapping between them is unclear. In this work we address both of these challenges. We introduce a method to retarget motions in real-time from sparse human sensor data to characters of various morphologies. Our method uses reinforcement learning to train a policy to control characters in a physics simulator. We only require human motion capture data for training, without relying on artist-generated animations for each avatar. This allows us to use large motion capture datasets to train general policies that can track unseen users from real and sparse data in real-time. We demonstrate the feasibility of our approach on three characters with different skeleton structure: a dinosaur, a mouse-like creature and a human. We show that the avatar poses often match the user surprisingly well, despite having no sensor information of the lower body available. We discuss and ablate the important components in our framework, specifically the kinematic retargeting step, the imitation, contact and action reward as well as our asymmetric actor-critic observations. We further explore the robustness of our method in a variety of settings including unbalancing, dancing and sports motions.

Authors: Daniele Reda, Jungdam Won, Yuting Ye, Michiel van de Panne, Alexander Winkler

Last Update: 2023-07-04 00:00:00

Language: English

Source URL: https://arxiv.org/abs/2307.01938

Source PDF: https://arxiv.org/pdf/2307.01938

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arXiv for use of its open access interoperability.
