Using Tech for Faster Flood Rescue Efforts
A new dataset and models speed up search and rescue after floods.
Ibne Hassan, Aman Mujahid, Abdullah Al Hasib, Andalib Rahman Shagoto, Joyanta Jyoti Mondal, Meem Arafat Manab, Jannatun Noor
― 7 min read
Table of Contents
- The New Dataset: Your Friendly Neighborhood Flood Images
- The Super Smart Models
- Understanding the Flood Situation
- Learning from Past Works
- Vast Potential for Change
- A Closer Look at the Dataset
- Making More Images with Augmentation
- Keeping It Ethical
- Our Models: The Stars of the Show
- The Results Are In!
- The Impact of Our Research
- What’s Next?
- Original Source
- Reference Links
Floods can be a real pain, especially for countries in South Asia like Bangladesh, India, and Pakistan. They deal with floods so often that it’s like nature’s way of saying, “Surprise! Here’s some water!” But seriously, these floods can cause serious trouble, submerging homes and putting lives at risk.
Imagine having to search for survivors in a flooded area. It takes time, and every minute counts. Luckily, with some crafty tech skills, we can speed things up. By using aerial images and smart algorithms, we can tell where the floods are and exactly where people and houses are located. This means search and rescue teams can get to the right places faster and save more lives.
The New Dataset: Your Friendly Neighborhood Flood Images
To make this work, we created a new dataset full of aerial images from floods in South Asia. This collection is like a treasure chest for rescue missions. The dataset has images sorted into four categories:
- Just flood
- Flood with houses
- Flood with people
- No flood at all
We didn’t just click random photos. We carefully selected images that show the unique features of floods in South Asian countries. For instance, the house shapes and flood water colors are all similar in this region, making it easier for our tech tools to recognize patterns and differences.
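As a concrete sketch, the four categories above could be encoded as integer labels for training a classifier. The class names below mirror the dataset's categories, but the mapping and helper functions are purely illustrative, not the authors' actual code.

```python
# Illustrative label encoding for the four scene categories.
# The class names follow the dataset description; the mapping
# itself is an assumption for demonstration purposes.
CLASSES = ["flood", "flood_with_houses", "flood_with_people", "no_flood"]
CLASS_TO_INDEX = {name: i for i, name in enumerate(CLASSES)}

def encode_label(name: str) -> int:
    """Return the integer label for a class name."""
    return CLASS_TO_INDEX[name]

def one_hot(name: str) -> list:
    """Return a one-hot vector, as used by most classification losses."""
    vec = [0] * len(CLASSES)
    vec[CLASS_TO_INDEX[name]] = 1
    return vec
```

Keeping the mapping in one place like this makes it easy to stay consistent between training labels and the model's output indices.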
The Super Smart Models
To help classify these images, we used several advanced computer models. We tried out a special Compact Convolutional Transformer (CCT), along with a few other well-known models built on a similar foundation. Think of them as a group of superheroes, each with their own skills to tackle the flood scene classification challenge.
We also used a cool object detection model called YOLOv8 to locate houses and people in the images. It’s like having a pair of eagle eyes that can spot what’s important in the chaos of a flood. Then we compared how well these models worked, just like a friendly competition among superheroes.
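A detector like this returns a list of boxes with class labels and confidence scores, which are then filtered before counting what was found. The snippet below is a hypothetical sketch of that post-processing step; the detection tuples and threshold are made up for illustration and are not output from the actual YOLOv8 model.

```python
# Hypothetical post-processing of object-detection output.
# Each detection is (class_name, confidence, (x1, y1, x2, y2)).
def filter_detections(detections, threshold=0.5):
    """Keep only detections at or above the confidence threshold."""
    return [d for d in detections if d[1] >= threshold]

def count_by_class(detections):
    """Count kept detections per class, e.g. houses vs. people."""
    counts = {}
    for name, _conf, _box in detections:
        counts[name] = counts.get(name, 0) + 1
    return counts

# Example with made-up detections from a single flood image:
raw = [
    ("person", 0.91, (10, 20, 40, 80)),
    ("house", 0.77, (100, 30, 220, 160)),
    ("person", 0.32, (300, 40, 330, 90)),  # low confidence, dropped
]
kept = filter_detections(raw, threshold=0.5)
```

The confidence threshold trades off missed detections against false alarms, which matters a lot when the "detections" are stranded people.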
Understanding the Flood Situation
Floods are among the most frustrating and damaging natural disasters. South Asia is particularly vulnerable due to its geography. High precipitation, rising sea levels, and houses built with different materials can all contribute to the havoc wreaked by floods.
For example, in June 2024, a massive flood in Bangladesh left around 1.8 million people stranded. This shows how unprepared many communities are when such events strike. A similar situation happened in Pakistan in 2022, with floods covering one-third of the nation and affecting around 33 million people.
In times of disaster, various government and aid groups often use boats and aircraft to search for survivors, but this can take a lot of valuable time. Therefore, finding smarter ways to locate people quickly is crucial.
Learning from Past Works
Other researchers have tried to tackle the challenges of post-flood rescue operations too. For instance, some have used drones and neural networks to identify flooded areas. Using remote sensing and satellite images is one way to gather data, but this approach has its limits. Drones can get up close and personal, giving a much clearer picture of the current situation.
The main goal of our work is to speed up the rescue efforts and minimize casualties. By using aerial images, we can quickly pinpoint where the floods are, especially in South Asian countries where the geographic and cultural environment is similar.
Vast Potential for Change
Our work focuses on improving search and rescue initiatives in these flood-prone areas of South Asia. Employing drones for aerial imaging can give rescue teams the upper hand by helping them accurately map flooded zones and find people. With the introduction of transformer-based models into image classification, we can make this process even more effective.
A Closer Look at the Dataset
We call our dataset the AFSSA (Aerial Flood Scene South Asia). Unlike other datasets that include images from all over the world, ours is tailored specifically for South Asia. This gives it a better chance of doing well with flood classification tasks in the region.
To gather the images, we scoured YouTube for footage of real flood events captured by drones. This footage gave us a more authentic view of the situation. We collected videos from Bangladesh, India, and Pakistan to ensure we had a well-rounded dataset with varied flood scenes.
After collecting the footage, we extracted images and categorized them into the four classes we mentioned earlier. We gathered over 300 images for each category, ensuring we had enough data to work with.
Making More Images with Augmentation
To make our dataset even larger, we used a technique called image augmentation. This involves creating variations of our images by rotating, shifting, and flipping them. After this step, we ended up with over 8600 images, making our dataset quite robust.
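Flips and 90-degree rotations alone already multiply each image eightfold (the eight symmetries of a square). The pure-Python sketch below illustrates this on a tiny 2-D grid standing in for an image; a real pipeline would use an image library and also apply the shifts mentioned above.

```python
def rotate90(grid):
    """Rotate a 2-D grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def flip_h(grid):
    """Mirror a grid left-to-right."""
    return [row[::-1] for row in grid]

def augment(grid):
    """Generate the 8 flip/rotation variants of one image."""
    variants = []
    g = grid
    for _ in range(4):
        variants.append(g)
        variants.append(flip_h(g))
        g = rotate90(g)
    return variants

# A tiny 2x2 "image" with distinct pixels, so every variant differs:
tiny = [[1, 2], [3, 4]]
versions = augment(tiny)
```

Each original image becoming eight variants (before shifts are even counted) is how a set of roughly 1,200 collected images can grow into the 8,600-plus images mentioned above.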
We also enhanced the contrast of our images using a method called CLAHE. This helps bring out the important details, making it easier for our models to learn and make accurate predictions.
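CLAHE refines ordinary histogram equalization by working on local tiles with a clip limit. As a simplified stand-in, the sketch below implements plain global histogram equalization on a flat list of 8-bit pixels, so the core idea of spreading out pixel intensities is visible. It is not CLAHE itself, and a real pipeline would use an image library's CLAHE implementation.

```python
def equalize(pixels, levels=256):
    """Global histogram equalization for a flat list of 8-bit pixels.
    CLAHE refines this idea by equalizing local tiles and clipping
    the histogram; this simplified global version just stretches the
    intensity distribution across the full range."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution function (CDF) over intensity levels.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    scale = (levels - 1) / (n - cdf_min) if n > cdf_min else 1
    # Map each pixel through the normalized CDF.
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]

# A low-contrast "image" whose values sit in a narrow band:
flat = [100] * 4 + [110] * 4
stretched = equalize(flat)
```

After equalization, the narrow band of intensities is spread across the full 0 to 255 range, which is the kind of detail boost that helps a model pick out edges of houses and people in murky flood water.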
Keeping It Ethical
We made sure to follow ethical practices while collecting our images. All the YouTube videos we used were public, and we credited the content creators appropriately. No need to be sneaky when there’s a way to keep it all above board.
Our Models: The Stars of the Show
We implemented several different models for our classification tasks. Each model has its own number of parameters, which is basically a fancy way of saying how complicated the model is. The CCT model stood out, scoring an impressive accuracy of 98.62% while using comparatively fewer parameters than many other transformer-based vision architectures.
The other transformer-based models we tested, the Vision Transformer (ViT), Swin Transformer, and External Attention Transformer (EANet), reached 88.66%, 84.74%, and 66.56% accuracy respectively, so they couldn't keep up with the CCT.
Meanwhile, our CNN-based models showed varying levels of success. The ensemble model, DCECNN, which combines MobileNet, InceptionV3, and EfficientNetB0, achieved a very high accuracy of 98.78%.
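One common way to ensemble CNNs, and a plausible reading of how an ensemble like DCECNN combines its members, is to average each model's predicted class probabilities and pick the highest. The sketch below is an assumption about the mechanism for illustration, not the paper's actual implementation.

```python
def ensemble_average(prob_lists):
    """Average class-probability vectors from several models."""
    n = len(prob_lists)
    return [sum(ps) / n for ps in zip(*prob_lists)]

def predict(prob_lists):
    """Return the index of the highest averaged probability."""
    avg = ensemble_average(prob_lists)
    return max(range(len(avg)), key=avg.__getitem__)

# Made-up softmax outputs from three models for one image, over the
# four classes (flood, flood+houses, flood+people, no flood):
m1 = [0.70, 0.10, 0.10, 0.10]
m2 = [0.40, 0.35, 0.15, 0.10]
m3 = [0.55, 0.25, 0.10, 0.10]
winner = predict([m1, m2, m3])
```

Averaging tends to smooth out the individual models' mistakes, which is why ensembles often edge out any single member.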
The Results Are In!
After running all our models, we evaluated their performance using metrics like accuracy, precision, and recall. The transformer-based models generally performed better than the individual CNN-based ones, and CCT led the transformer pack, showcasing how effective it is in classifying flood scenes.
The confusion matrix is like a scoreboard that shows how well each model did. CCT had a great number of true positives – meaning it correctly identified flooded areas and human presence.
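Accuracy, per-class precision, and recall can all be read straight off a confusion matrix. The sketch below computes them for a tiny made-up 2-class matrix (flood vs. no flood); the numbers are illustrative, not the paper's results.

```python
def metrics(cm):
    """Compute accuracy and per-class precision/recall from a square
    confusion matrix where cm[i][j] = count of true class i predicted
    as class j (diagonal entries are the correct predictions)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    accuracy = correct / total
    # Precision: correct predictions of class i over all predictions of i.
    precision = [cm[i][i] / sum(cm[r][i] for r in range(n)) for i in range(n)]
    # Recall: correct predictions of class i over all true instances of i.
    recall = [cm[i][i] / sum(cm[i]) for i in range(n)]
    return accuracy, precision, recall

# Made-up matrix: rows = true (flood, no flood), cols = predicted.
cm = [[90, 10],
      [5, 95]]
acc, prec, rec = metrics(cm)
```

Here 185 of 200 images are classified correctly, so accuracy is 92.5%, while recall for the flood class is 90%, meaning one in ten real flood scenes was missed.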
The Impact of Our Research
This research isn’t just an academic exercise. It has real-world implications for people living in flood-prone regions. By enabling drones and other aerial systems to identify houses and people in flooded areas, we can help rescuers reach those in need far faster.
In a critical moment, this technology could be the difference between life and death for someone stranded due to flooding.
What’s Next?
Looking ahead, we plan to enhance our dataset further. We want to gather as many additional images as possible and dial up the complexity of our models. The more data we have, the better our models can learn and adapt.
We also want to explore the idea of integrating our classification models into existing UAV platforms. This way, we could have a powerful search and rescue toolset readily available for those who need it most in the midst of natural disasters.
In conclusion, our work offers a glimpse into how technology can help tackle the challenges posed by floods. With a little creativity and the right tools, we can make a difference, potentially saving countless lives in the process. It’s all about turning those floods from a disaster into a manageable situation, one image at a time.
Let’s keep our fingers crossed for fewer floods in the future and more tech solutions to help those affected!
Title: Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia
Abstract: Countries in South Asia experience many catastrophic flooding events regularly. Through image classification, it is possible to expedite search and rescue initiatives by classifying flood zones, including houses and humans. We create a new dataset collecting aerial imagery of flooding events across South Asian countries. For the classification, we propose a fine-tuned Compact Convolutional Transformer (CCT) based approach and some other cutting-edge transformer-based and Convolutional Neural Network-based architectures (CNN). We also implement the YOLOv8 object detection model and detect houses and humans within the imagery of our proposed dataset, and then compare the performance with our classification-based approach. Since the countries in South Asia have similar topography, housing structure, the color of flood water, and vegetation, this work can be more applicable to such a region as opposed to the rest of the world. The images are divided evenly into four classes: 'flood', 'flood with domicile', 'flood with humans', and 'no flood'. After experimenting with our proposed dataset on our fine-tuned CCT model, which has a comparatively lower number of weight parameters than many other transformer-based architectures designed for computer vision, it exhibits an accuracy and macro average precision of 98.62% and 98.50%. The other transformer-based architectures that we implement are the Vision Transformer (ViT), Swin Transformer, and External Attention Transformer (EANet), which give an accuracy of 88.66%, 84.74%, and 66.56% respectively. We also implement DCECNN (Deep Custom Ensembled Convolutional Neural Network), which is a custom ensemble model that we create by combining MobileNet, InceptionV3, and EfficientNetB0, and we obtain an accuracy of 98.78%. The architectures we implement are fine-tuned to achieve optimal performance on our dataset.
Authors: Ibne Hassan, Aman Mujahid, Abdullah Al Hasib, Andalib Rahman Shagoto, Joyanta Jyoti Mondal, Meem Arafat Manab, Jannatun Noor
Last Update: 2024-10-31 00:00:00
Language: English
Source URL: https://arxiv.org/abs/2411.00169
Source PDF: https://arxiv.org/pdf/2411.00169
Licence: https://creativecommons.org/licenses/by-nc-sa/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.
Thank you to arxiv for use of its open access interoperability.