New Approach to Optical Accelerators for Neural Networks
Innovative architecture enhances efficiency in deep learning through optical computation.
Sijie Fei, Amro Eldebiky, Grace Li Zhang, Bing Li, Ulf Schlichtmann
Table of Contents
- The Problem with Current GOAs
- The Proposed Hybrid Architecture
- Structure of the New GOA
- Genetic Algorithm for Parameter Search
- How This Architecture Works
- Adjustments to Neural Networks
- Hardware-Aware Training
- Experimental Results
- Energy and Latency Improvements
- Accuracy Maintenance
- Comparison with Other Architectures
- Conclusion
- Original Source
Recent advances in deep neural networks (DNNs) have made them popular for solving complex problems. However, as these networks grow deeper, they demand ever more computation, in particular huge numbers of multiply-accumulate (MAC) operations. This has created a need for better hardware to speed up these calculations. One solution that has gained attention is the general-purpose optical accelerator (GOA). These devices use light to perform calculations, which can be much faster and more energy-efficient than traditional electronic hardware.
GOAs are built from components called Mach-Zehnder Interferometers (MZIs), which manipulate light signals to perform calculations. While promising, existing GOAs often struggle with efficiency when mapping neural networks of various sizes onto their fixed structure. This inefficiency stems mainly from the mismatch between the layout of the MZI arrays and the weight matrices of the neural networks.
The Problem with Current GOAs
Current GOAs use interleaving MZI arrays, in which many MZIs are connected into one large fixed mesh. These designs have a limitation: when a weight matrix is smaller than the array, many of the MZIs sit idle. Substantial hardware is wasted, and the potential speed and energy benefits of optical acceleration are not fully realized.
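A rough back-of-the-envelope model shows the effect. The sketch below is a simplification that counts utilization at the matrix-element level rather than modeling a real MZI mesh; the function name and the numbers are illustrative, not from the paper:

```python
def mzi_utilization(array_size: int, rows: int, cols: int) -> float:
    """Fraction of the array doing useful work when a rows x cols
    weight matrix is mapped onto a single array_size x array_size
    mesh. Illustrative element-level count only; a real mesh needs
    on the order of N^2 MZIs to realize an N x N unitary."""
    return (rows * cols) / (array_size * array_size)

# A 48x48 layer mapped onto a 64x64 array leaves ~44% of it idle:
print(f"{mzi_utilization(64, 48, 48):.0%}")  # -> 56%
```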
Another issue is area: because existing GOAs require many MZIs arranged in this interleaved manner, representing a weight matrix accurately takes significant chip space, which is not always feasible.
To address these challenges, researchers have proposed a new hybrid architecture for GOAs. This new design aims to improve mapping efficiency and reduce the area needed for the hardware.
The Proposed Hybrid Architecture
The proposed hybrid architecture consists of smaller, independent MZI modules that are linked together with Microring Resonators (MRRs). This structure allows these smaller modules to work together efficiently to handle larger neural networks.
Structure of the New GOA
Each MZI module in the new architecture can perform calculations that are adjusted with tunable coefficients. This means that the input for each module can be tailored based on the needs of the calculation. By using this method, the architecture can better utilize the available space and resources, improving overall efficiency.
The architecture also uses Singular Value Decomposition (SVD) to expand selected weight matrices into multiple unitary matrices, plus a diagonal scaling. Unitary matrices are exactly what MZI meshes implement natively, so this factorization helps maintain accuracy while keeping the calculations in a form the hardware can handle.
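To see what this factorization looks like numerically, here is a minimal sketch using NumPy. The matrix and its size are made up for illustration; the point is only that SVD yields unitary factors suitable for MZI meshes, with a diagonal that can be absorbed into tunable amplitude coefficients:

```python
import numpy as np

# Hypothetical 6x6 weight matrix standing in for a trained layer.
rng = np.random.default_rng(0)
W = rng.normal(size=(6, 6))

# SVD factors W into two unitary matrices U, Vh and a diagonal
# scaling S, so that W = U @ diag(S) @ Vh.
U, S, Vh = np.linalg.svd(W)

assert np.allclose(U @ np.diag(S) @ Vh, W)  # exact reconstruction
assert np.allclose(U @ U.T, np.eye(6))      # U is unitary (orthogonal)
assert np.allclose(Vh @ Vh.T, np.eye(6))    # Vh is unitary (orthogonal)
```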
Genetic Algorithm for Parameter Search
To optimize the design of the GOA, researchers used a genetic algorithm to search for the best parameters for the architecture. This algorithm considers multiple factors such as mapping efficiency, area, power consumption, and costs related to converting electrical signals to optical signals and vice versa.
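The summary does not spell out the algorithm's internals, so the sketch below is a generic genetic search over a single parameter (a module size) with a stand-in cost function. The selection, crossover, and mutation steps show the general shape of such a search, not the authors' exact procedure:

```python
import random

def cost(module_size: int) -> float:
    # Stand-in objective: the paper weighs mapping efficiency, area,
    # power, and electro-optic conversion cost; here we fake a convex
    # trade-off with a sweet spot near 16, purely for illustration.
    return (module_size - 16) ** 2 + 0.1 * module_size

def genetic_search(pop_size: int = 20, generations: int = 50) -> int:
    pop = [random.randint(2, 64) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        survivors = pop[: pop_size // 2]            # selection
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            child = (a + b) // 2                    # crossover (average)
            if random.random() < 0.3:               # mutation
                child = max(2, child + random.choice([-2, -1, 1, 2]))
            children.append(child)
        pop = survivors + children
    return min(pop, key=cost)

print(genetic_search())  # converges near 16 under this toy objective
```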
How This Architecture Works
The basic components of the new GOA architecture include:
- MZI Modules: These are the core calculation units that manipulate light to perform mathematical operations.
- Microring Resonators (MRRs): These elements connect the MZI modules and help in accumulating the results of the calculations.
- Tunable Coefficients: These allow the MZI modules to be adjusted dynamically based on the calculations needed.
- SVD Implementation: This factors weight matrices into unitary forms that the smaller MZI modules can implement directly.
This combination of components allows the new architecture to work with larger neural networks more effectively, maximizing the use of the optical accelerators.
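As a toy model of how the independent modules combine, the following sketch splits a matrix-vector product into fixed-size blocks, lets each block play the role of one MZI module, and sums the partial results the way the MRRs accumulate optical outputs. The block size and dimensions are illustrative assumptions:

```python
import numpy as np

def hybrid_matvec(W: np.ndarray, x: np.ndarray, m: int) -> np.ndarray:
    """Toy model of the hybrid GOA: split W into m x m blocks, let
    each "MZI module" multiply its block by its slice of the input,
    and accumulate partial results across blocks, as the MRRs would
    do optically. Assumes W's dimensions are multiples of m (the
    paper adjusts networks so such constraints hold)."""
    rows, cols = W.shape
    y = np.zeros(rows)
    for i in range(0, rows, m):
        for j in range(0, cols, m):
            block = W[i:i + m, j:j + m]       # one module's matrix
            y[i:i + m] += block @ x[j:j + m]  # MRR-style accumulation
    return y

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 8))
x = rng.normal(size=8)
assert np.allclose(hybrid_matvec(W, x, 4), W @ x)
```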
Adjustments to Neural Networks
To make the most of the new GOA architecture, the neural networks themselves may need adjustment, namely increasing the number of filters and the kernel depths so that layer dimensions match the module sizes. This ensures that all parts of the optical accelerator are used efficiently (see the sketch below).
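A minimal version of this sizing step, assuming the goal is simply to round channel counts up to a multiple of the module width (the paper adjusts kernels during training rather than only padding), might look like:

```python
import math

def pad_to_module(channels: int, module_size: int) -> int:
    """Round a channel count up to the next multiple of the MZI
    module size so every module is fully occupied. Illustrative
    assumption, not the paper's exact adjustment rule."""
    return math.ceil(channels / module_size) * module_size

# e.g. a layer with 100 filters on 16-wide modules grows to 112:
print(pad_to_module(100, 16))  # -> 112
```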
Hardware-Aware Training
The architecture also implements a method known as hardware-aware training. Essentially, this involves training the neural network while keeping in mind the specific quirks and limitations of the optical hardware. This way, the models can be fine-tuned to perform optimally on the GOA.
- Matrix Approximations: During training, some weight matrices are approximated, i.e., replaced with simplified forms the hardware can map efficiently. The method keeps each approximation as close as possible to the original matrix.
- Restoring Critical Matrices: Matrices that are essential for the network's accuracy can be restored to their original forms, bypassing the approximation when necessary.
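The sketch below combines these two ideas in simplified form: each weight matrix is replaced by a low-rank SVD approximation, and any matrix whose approximation error exceeds a tolerance is kept exact. The rank, the tolerance, and the use of low-rank truncation as the approximation are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np

def approximate(W: np.ndarray, rank: int) -> np.ndarray:
    """Low-rank SVD approximation, a cheap stand-in for a
    hardware-imposed matrix approximation."""
    U, S, Vh = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * S[:rank]) @ Vh[:rank, :]

def hardware_aware_pass(weights, rank: int, max_err: float = 0.05):
    """Approximate each weight matrix; if the relative error exceeds
    the tolerance, keep the exact matrix instead (the 'restoring
    critical matrices' step)."""
    out = []
    for W in weights:
        W_hat = approximate(W, rank)
        rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
        out.append(W if rel_err > max_err else W_hat)
    return out

rng = np.random.default_rng(2)
layers = [rng.normal(size=(8, 8)) for _ in range(3)]
new_layers = hardware_aware_pass(layers, rank=6)
```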
Experimental Results
The proposed GOA architecture has been tested using two well-known neural networks, VGG16 and Resnet18, on two datasets: Cifar10 and Cifar100.
Energy and Latency Improvements
Results showed clear improvements in mapping efficiency compared to the previous interleaving architecture. Across both networks and both datasets, the gains ranged from 21.20% to 25.52%, for example:
- For VGG16 on Cifar10, mapping efficiency improved by 21.87%.
- For Resnet18 on Cifar100, mapping efficiency improved by 25.52%.
These improvements resulted in significant decreases in energy consumption, with reductions over 67% noted in many cases. Furthermore, computation latency was lowered by more than 21% in various scenarios.
Accuracy Maintenance
In terms of maintaining accuracy while implementing these changes, the new GOA model did a good job. While some degradation was noted with specific configurations, the overall accuracy of the neural networks was preserved, and in some cases, it even improved.
After the adjustments and hardware-aware training, results showed that networks could outperform traditional setups in terms of speed and energy efficiency without sacrificing accuracy.
Comparison with Other Architectures
To evaluate how the proposed architecture stacks up against existing systems, comparisons were made with other optical accelerators. When tested against a traditional SVD interleaving accelerator, the new proposed architecture demonstrated notable efficiency gains.
- Area Efficiency: The area needed for the new structure shrank by 18% to 25% compared to older methods.
- Energy Consumption: Even accounting for the extra components the new structure requires, overall energy use was substantially lower, showing how the improved mapping efficiency outweighed the added power needs.
Conclusion
In summary, the hybrid architecture proposed for optical accelerators shows significant promise for improving the efficiency of deep neural networks. By using smaller, independent MZI modules and connecting them with microring resonators, the architecture can handle larger networks while utilizing space and resources more effectively. Through a combination of optimizing neural network structures and applying innovative training methods, notable advancements were demonstrated in terms of energy consumption, latency, and overall performance. This work paves the way for more efficient computing in the field of deep learning, showing how optical technologies can be harnessed to match and exceed the capabilities of traditional computing systems.
Title: An Efficient General-Purpose Optical Accelerator for Neural Networks
Abstract: General-purpose optical accelerators (GOAs) have emerged as a promising platform to accelerate deep neural networks (DNNs) due to their low latency and energy consumption. Such an accelerator is usually composed of a given number of interleaving Mach-Zehnder Interferometers (MZIs). This interleaving architecture, however, has a low efficiency when accelerating neural networks of various sizes due to the mismatch between weight matrices and the GOA architecture. In this work, a hybrid GOA architecture is proposed to enhance the mapping efficiency of neural networks onto the GOA. In this architecture, independent MZI modules are connected with microring resonators (MRRs), so that they can be combined to process large neural networks efficiently. Each of these modules implements a unitary matrix with inputs adjusted by tunable coefficients. The parameters of the proposed architecture are searched using genetic algorithm. To enhance the accuracy of neural networks, selected weight matrices are expanded to multiple unitary matrices applying singular value decomposition (SVD). The kernels in neural networks are also adjusted to use up the on-chip computational resources. Experimental results show that with a given number of MZIs, the mapping efficiency of neural networks on the proposed architecture can be enhanced by 21.87%, 21.20%, 24.69%, and 25.52% for VGG16 and Resnet18 on datasets Cifar10 and Cifar100, respectively. The energy consumption and computation latency can also be reduced by over 67% and 21%, respectively.
Authors: Sijie Fei, Amro Eldebiky, Grace Li Zhang, Bing Li, Ulf Schlichtmann
Last Update: 2024-09-02 00:00:00
Language: English
Source URL: https://arxiv.org/abs/2409.12966
Source PDF: https://arxiv.org/pdf/2409.12966
Licence: https://creativecommons.org/licenses/by-nc-sa/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.
Thank you to arxiv for use of its open access interoperability.