Bringing AI to Music Creation on Bela
A guide to using AI models for music on the Bela platform.
― 5 min read
Creating music with technology has always been a part of human expression. As new tools and platforms emerge, it becomes easier for musicians and developers to combine sound with innovative technology. One such platform is Bela, an embedded hardware platform for building interactive, real-time audio projects.
In recent years, there has been a rise in interest around using artificial intelligence (AI) in music-making. AI can assist with generating sounds and patterns, and even with interpreting gestures. However, running these advanced technologies on small embedded devices like Bela comes with its own challenges. This article outlines a method that makes it easier to use deep learning models on the Bela platform, allowing more people to experiment with music and technology.
The Challenge of Using AI in Music
When trying to run AI models on small devices, developers often face significant hurdles. Such devices typically have limited processing power and memory, so ordinary AI models, which can require a lot of resources, do not run well. In addition, musical tasks need to be performed in real time, meaning they must respond to user input without perceptible delay.
Because of these limitations, many potential users give up on deploying AI in music creation. The available guidelines for using such models on platforms like Bela are often complicated, making it difficult for non-experts to get started.
The Bela Platform
Bela is a platform specifically designed for creating audio projects. It allows users to record, process, and manipulate sound in real time. By providing a low-latency environment (meaning there is very little delay between input and output), Bela supports interactive projects that require immediate feedback.
Bela supports multiple inputs and outputs, making it well suited to capturing audio and sensor signals from various sources. It also leaves plenty of room for creativity: developers can build applications ranging from simple sound makers to complex installations.
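To give a flavour of what working on Bela looks like, here is a minimal sketch of a Bela C++ program, using the platform's setup/render/cleanup structure, that simply passes each audio input straight through to the output:

```cpp
#include <Bela.h>

// Called once before audio starts: allocate resources here.
bool setup(BelaContext *context, void *userData)
{
    return true;
}

// Called once per block of audio frames, in real time.
void render(BelaContext *context, void *userData)
{
    for(unsigned int n = 0; n < context->audioFrames; n++) {
        for(unsigned int ch = 0; ch < context->audioInChannels; ch++) {
            // Copy each input sample directly to the corresponding output.
            float in = audioRead(context, n, ch);
            audioWrite(context, n, ch, in);
        }
    }
}

// Called once after audio stops: free resources here.
void cleanup(BelaContext *context, void *userData)
{
}
```

Every Bela project follows this shape: setup() runs once before audio starts, render() is called for every block of frames in real time, and cleanup() runs at the end.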
Introducing the Pipeline
To help users deploy AI models on Bela more easily, a new pipeline has been created. This pipeline consists of a series of steps that guide users through recording data, training AI models, and running those models on Bela. Each step is designed to make the process smoother and more efficient.
Step 1: Recording Data
The first step is to gather data from various sensors, such as microphones, piezo sensors, and other analog or digital inputs. By using multiple Bela boards, users can record many channels of data at once, capturing a wide range of signals that can be used for training AI models.
Once the data is recorded, it is sent to a host computer for processing. This step involves aligning the signals from different boards and converting them into a format that AI models can easily use.
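The pipeline ships its own multichannel recording tool; the sketch below is not that tool, but a simplified, hypothetical illustration of the on-device side of recording, reading Bela's analog inputs each block into a buffer that was pre-allocated up front:

```cpp
#include <Bela.h>
#include <vector>

// Pre-allocated storage for a few seconds of sensor data.
std::vector<float> gBuffer;
unsigned int gWritePointer = 0;

bool setup(BelaContext *context, void *userData)
{
    // Reserve space up front, e.g. 10 seconds of all analog channels,
    // so that no allocation happens while audio is running.
    unsigned int analogRate = (unsigned int)context->analogSampleRate;
    gBuffer.resize(10 * analogRate * context->analogInChannels);
    return true;
}

void render(BelaContext *context, void *userData)
{
    for(unsigned int n = 0; n < context->analogFrames; n++) {
        for(unsigned int ch = 0; ch < context->analogInChannels; ch++) {
            // Store interleaved sensor samples; stop when the buffer is full.
            if(gWritePointer < gBuffer.size())
                gBuffer[gWritePointer++] = analogRead(context, n, ch);
        }
    }
    // In a real application, a lower-priority task would drain this
    // buffer to disk or stream it to the host while render() keeps running.
}

void cleanup(BelaContext *context, void *userData) {}
```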
Step 2: Training the AI Model
After the data has been collected and processed, the next step is training the AI model. This is where the real magic happens. Using tools like PyTorch, users can create models that learn from the data they gathered.
Once trained, the model is exported into a format that Bela understands, typically a TensorFlow Lite model. This makes it compatible with the Bela platform and ready to run in real time.
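On the Bela side, the exported model can be loaded with the standard TensorFlow Lite C++ API. The loadModel helper below is a minimal sketch of that loading step, not the pipeline's exact code:

```cpp
#include <memory>
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

std::unique_ptr<tflite::FlatBufferModel> gModel;
std::unique_ptr<tflite::Interpreter> gInterpreter;

bool loadModel(const char *path)
{
    // Map the .tflite file produced by the training/export step.
    gModel = tflite::FlatBufferModel::BuildFromFile(path);
    if(!gModel)
        return false;

    // Build an interpreter using the built-in operator set.
    tflite::ops::builtin::BuiltinOpResolver resolver;
    tflite::InterpreterBuilder builder(*gModel, resolver);
    if(builder(&gInterpreter) != kTfLiteOk)
        return false;

    // Allocate all tensors once, before real-time processing starts.
    return gInterpreter->AllocateTensors() == kTfLiteOk;
}
```

Calling AllocateTensors() once at load time means no tensor memory needs to be allocated later, during real-time processing.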
Step 3: Cross-Compiling the Code
Now that the model is ready, users need to prepare the code that will run it on Bela. Because compiling directly on Bela's limited hardware is slow, it is often preferable to cross-compile: the code is written and compiled on a more powerful laptop or desktop computer and then transferred to Bela.
A Docker container is used to simplify this process. Docker lets users package the cross-compilation toolchain and its dependencies together, so the same environment runs on different machines. The compiled binary can then be transferred directly to the Bela board, ready to run.
Step 4: Running the Model in Real-Time
With everything set up, it’s time to run the model in real time. This is where everything comes together: the AI model processes incoming sensor and audio signals and generates an output almost instantly. To keep this process running smoothly, it is essential to follow real-time programming best practices.
Memory allocation should be managed carefully, as allocating memory during audio processing can cause unpredictable delays. Instead, all memory should be pre-allocated before audio processing begins.
Additionally, tasks that require significant computation, such as the AI model's inference, should be handled in a separate, lower-priority thread. This allows the audio thread to keep processing without interruption, reducing the risk of audible dropouts.
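The sketch below combines both practices, assuming the interpreter from the previous sketch and hypothetical kInputSize/kOutputSize model dimensions. It pre-allocates its buffers in setup() and schedules inference on a Bela auxiliary task, so that render() never blocks:

```cpp
#include <Bela.h>
#include <algorithm>
#include <memory>
#include <vector>
#include "tensorflow/lite/interpreter.h"

// Assume gInterpreter was initialised as in the previous sketch.
extern std::unique_ptr<tflite::Interpreter> gInterpreter;

// Hypothetical model dimensions; set these to match your network.
const unsigned int kInputSize = 64;
const unsigned int kOutputSize = 64;

AuxiliaryTask gInferenceTask;
std::vector<float> gModelInput;   // filled in render()
std::vector<float> gModelOutput;  // read back in render()

// Runs on a lower-priority thread, so a slow inference call cannot
// block the audio callback. (A real implementation would double-buffer
// the input and output to avoid data races with render().)
void inferenceCallback(void *)
{
    std::copy(gModelInput.begin(), gModelInput.end(),
              gInterpreter->typed_input_tensor<float>(0));
    if(gInterpreter->Invoke() == kTfLiteOk) {
        const float *out = gInterpreter->typed_output_tensor<float>(0);
        std::copy(out, out + kOutputSize, gModelOutput.begin());
    }
}

bool setup(BelaContext *context, void *userData)
{
    // Pre-allocate everything here: no allocations inside render().
    gModelInput.resize(kInputSize);
    gModelOutput.resize(kOutputSize);
    gInferenceTask = Bela_createAuxiliaryTask(inferenceCallback, 90,
                                              "inference", nullptr);
    return gInferenceTask != nullptr;
}

void render(BelaContext *context, void *userData)
{
    // ... fill gModelInput from audio or sensor readings ...
    Bela_scheduleAuxiliaryTask(gInferenceTask); // request inference
    // ... use the most recent gModelOutput to drive the sound ...
}

void cleanup(BelaContext *context, void *userData) {}
```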
Benefits of the Pipeline
The pipeline outlined in this article provides several benefits for musicians and developers:
- Simplifies the process: a set of clear steps makes it easier for non-experts to start using AI in their projects.
- Reduces wait times: cross-compiling on a faster machine means users spend less time waiting for their code to build.
- Encourages experimentation: with easier access to AI capabilities, more people can try new ideas and create unique audio experiences.
- Real-time performance: thanks to optimized coding practices, applications stay responsive, with no perceptible delay.
Conclusion
As technology continues to develop, the intersection of music and AI presents exciting opportunities for creativity. Platforms like Bela open doors for musicians and developers alike. By providing a streamlined pipeline for using deep learning models on Bela, more people can dive into the world of embedded AI.
This approach not only makes embedded AI more accessible but also encourages experimentation and innovation. With the right tools and processes in place, the world of real-time audio interaction will continue to grow. Musicians can now explore new avenues of creativity, merging sound with advanced technology to produce something truly special. This is just the beginning, and the future looks bright for those willing to experiment and push the boundaries of music creation.
Title: Pipeline for recording datasets and running neural networks on the Bela embedded hardware platform
Abstract: Deploying deep learning models on embedded devices is an arduous task: oftentimes, there exist no platform-specific instructions, and compilation times can be considerably large due to the limited computational resources available on-device. Moreover, many music-making applications demand real-time inference. Embedded hardware platforms for audio, such as Bela, offer an entry point for beginners into physical audio computing; however, the need for cross-compilation environments and low-level software development tools for deploying embedded deep learning models imposes high entry barriers on non-expert users. We present a pipeline for deploying neural networks in the Bela embedded hardware platform. In our pipeline, we include a tool to record a multichannel dataset of sensor signals. Additionally, we provide a dockerised cross-compilation environment for faster compilation. With this pipeline, we aim to provide a template for programmers and makers to prototype and experiment with neural networks for real-time embedded musical applications.
Authors: Teresa Pelinski, Rodrigo Diaz, Adán L. Benito Temprano, Andrew McPherson
Last Update: 2023-06-20 00:00:00
Language: English
Source URL: https://arxiv.org/abs/2306.11389
Source PDF: https://arxiv.org/pdf/2306.11389
Licence: https://creativecommons.org/licenses/by/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.
Thank you to arXiv for use of its open access interoperability.
Reference Links
- https://dl.acm.org/ccs_flat.cfm
- https://bela.io/
- https://developer.nvidia.com/embedded/jetson-modules
- https://coral.ai/products/dev-board-mini/
- https://coral.ai/products/accelerator
- https://www.intel.com/content/www/us/en/developer/articles/tool/neural-compute-stick.html
- https://pytorch.org/docs/stable/jit.html
- https://www.tensorflow.org/lite
- https://onnxruntime.ai/
- https://github.com/acids-ircam/nn_tilde
- https://github.com/rodrigodzf/torchplugins
- https://neutone.space/
- https://youtu.be/jAIRf4nGgYI
- https://github.com/ninon-io/Neurorave-hardware
- https://github.com/cpmpercussion/empi
- https://github.com/domenicostefani/deep-classf-runtime-wrappers
- https://github.com/pelinski/bela-dl-pipeline
- https://numpy.org/
- https://github.com/alibaba/TinyNeuralNetwork
- https://github.com/pelinski/bela-tflite-example
- https://www.docker.com/
- https://cmake.org/
- https://www.youtube.com/watch?v=xGmRaTaBNZA&list=PLCrgFeG6pwQmdbB6l3ehC8oBBZbatVoz3&index=20
- https://www.youtube.com/watch?v=Cdh_BAzr8aE&t=21258s
- https://learn.bela.io/using-bela/languages/c-plus-plus/#the-bela-api-for-c
- https://github.com/acids-ircam/lottery_generative/tree/master/code/statistics
- https://www.rossbencina.com/code/real-time-audio-programming-101-time-waits-for-nothing