Image Classification with Web App

Detecting Emergency Vehicles Using CNNs 

Motivation: I recently participated in the JanataHack: Computer Vision Hackathon hosted by Analytics Vidhya. The aim of the competition was to create a binary image classifier that could differentiate non-emergency vehicles (e.g. privately owned vehicles) from emergency vehicles (police vehicles, ambulances, etc.).

Problem Statement: 
We need to create a classifier that can differentiate between emergency and non-emergency vehicles. Emergency vehicles are labeled 1 and non-emergency vehicles are labeled 0. In this article I show the approach I followed to create a model that got me 147th place out of 10000.

The models shown in this article are Convolutional Neural Networks. I have tried to keep the code as simple as I can. Readers are expected to have some knowledge of neural networks.

Problem Solving Steps:  

  1. Loading and Visualizing the data
  2. Data Cleaning
  3. Modeling
  4. Transfer Learning
  5. Parameter Tuning
  6. Final Model.

Code: Loading And Visualizing Data 

Python3




# imports
import numpy as np
import os
import matplotlib.pyplot as plt
from PIL import Image, ImageOps, ImageFilter, ImageEnhance
import pandas as pd
  
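# importing the pytorch libraries.
# torch and torchvision.models are used by the later code blocks.
import torch
import torchvision.models as models
import torchvision.transforms as transforms
import torch.nn.functional as F
import torch.nn as nn
from torch.utils.data import Dataset, random_split, DataLoader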


We will be using:

  • numpy: to store the images into arrays,
  • matplotlib: to visualize the images,
  • Pillow (PIL): to load and transform images,
  • pandas: to read the CSV files,
  • PyTorch: as our deep learning framework.

Data Loading: 
The image above shows the dataset provided to us. All the images from the train and test sets are in the images folder, and the train and test CSV files contain the names of the images.

Code:  

Python3




# name of the image folder
imagePaths = 'images'
  
# reading the train.csv file using pandas
trainImages = pd.read_csv('train.csv')
# reading the test.csv file using pandas
testImages = pd.read_csv('test.csv')
# reading the submission file using pandas
samples = pd.read_csv('sample_submission.csv')


Code: Loading the images into numpy arrays 

Python3




# defining train and labels list to store images and labels respectively.
train = []
labels = []
  
for image, label in zip(trainImages.iloc[:, 0], trainImages.iloc[:, 1]):
    # build the image path and store it in the imgPath variable
    imgPath = os.path.join(imagePaths, image)
    # use the PIL Image class to load the image
    img = Image.open(imgPath)
  
    # apply a median filter to the image; this helps in reducing noise
    img = img.filter(ImageFilter.MedianFilter)
    # convert the image to a numpy array and store it in train
    train.append(np.asarray(img))
    # store the label into the labels list
    labels.append(label)
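
To see what the median filter in the loop above does, here is a minimal sketch on a synthetic image. The 9x9 gray image and the single noisy pixel are made up for illustration; only `ImageFilter.MedianFilter` comes from the code above, and the explicit `size=3` is an assumption.

```python
import numpy as np
from PIL import Image, ImageFilter

# Build a flat gray 9x9 image and corrupt one pixel with "salt" noise.
pixels = np.full((9, 9), 100, dtype=np.uint8)
pixels[4, 4] = 255
noisy = Image.fromarray(pixels)  # 2-D uint8 array becomes a grayscale image

# Each output pixel becomes the median of its 3x3 neighbourhood,
# so a single outlier is replaced by the surrounding value.
cleaned = np.asarray(noisy.filter(ImageFilter.MedianFilter(size=3)))

print(pixels[4, 4], "->", cleaned[4, 4])
```

An isolated bright pixel disappears because eight of the nine values in its neighbourhood are the background value.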


Code: opening and displaying Images. 

Python3




# create subplots using the plt.subplots function
# the number of subplots depend on the n_rows and n_cols
# all the subplots are stored in ax variables
_, ax = plt.subplots(nrows = 4, ncols = 7, figsize =(12, 12))
  
# iterate through the ax variable by flattening it
for index, i in enumerate(ax.flatten()):
    # the imshow is used to show the image
    i.imshow(train[index])
    # set the title
    i.set_title(index)
  
    # the lines below remove the axis ticks for a cleaner plot.
    i.set_xticks([])
    i.set_yticks([])


Output: 

Output of the above cell.

Now that we have the images stored in the train and output classes stored in labels we can move on to the next step.

Data Cleaning
In this section, we will look at mislabeled images and improper image samples. Removing these images increased my val_score by 2%: it went from 94% to 96%, and sometimes 97%.

Mislabeled images: the code used to visualize the data is the same as above.

Mislabeled images

Improper data: images of dashboards.

By removing these images, the accuracy became more stable (fewer oscillations). One thing to note: I was able to remove these dashboard images because I didn't find any similar images in the test data.
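
Once the bad samples have been spotted by eye, dropping them from the training table could look roughly like this. It is a sketch with a toy DataFrame: the file names and the `badImages` list are hypothetical, and the column names are the ones the dataset class in this article reads.

```python
import pandas as pd

# Toy stand-in for train.csv; the file names are made up for illustration.
trainImages = pd.DataFrame({
    "image_names": ["0.jpg", "1.jpg", "2.jpg", "3.jpg"],
    "emergency_or_not": [0, 1, 1, 0],
})

# Hypothetical list of files flagged during visual inspection.
badImages = ["1.jpg", "3.jpg"]

# Keep only the rows whose file name is not in the flagged list.
keep = ~trainImages["image_names"].isin(badImages)
cleaned = trainImages[keep].reset_index(drop=True)

print(len(trainImages), "->", len(cleaned))
```

The cleaned table can then be written back with `to_csv` so that the dataset class picks up the filtered file.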

Defining the Dataset class: To let the model load data from disk, PyTorch provides a Dataset class. Using it, we don't need to fit the entire dataset into memory.

Code:  

Python3




# Creating a VehicleDataset class for loading the images and labels .
# the following class needs to extend from the Dataset class
# provided by pytorch framework and implement the __len__ and __getitem__ methods.
  
class VehicleDataset(Dataset):
  
    def __init__(self, csv_name, folder, transform = None, label = False):
  
        self.label = label
  
        self.folder = folder
        print(csv_name)
        self.dataframe = pd.read_csv(self.folder+'/'+csv_name+'.csv')
        self.tms = transform
  
    def __len__(self):
        return len(self.dataframe)
  
    def __getitem__(self, index):
  
        row = self.dataframe.iloc[index]
  
        imgIndex = row['image_names']
  
        imageFile = self.folder + '/' + imgIndex
  
        image = Image.open(imageFile)
  
        if self.label:
            target = row['emergency_or_not']
  
            if target == 0:
                encode = torch.FloatTensor([1, 0])
            else:
                encode = torch.FloatTensor([0, 1])
  
            return self.tms(image), encode
  
        return self.tms(image)
  
  
# creating objects of VehicleDataset
  
# the deep learning models accepts the image to be in tensor format
# this is done using the transforms.ToTensor() methods
  
transform = transforms.Compose([transforms.ToTensor(),
                                ])
  
'''
arguments:
csv_name - name of the csv file without the extension, in our case train
folder - folder in which the images are stored 
transform - transforms the image to a tensor,
label - used to differentiate between the train and test set.
'''
  
trainDataset = VehicleDataset('train', 'images', label = True, transform = transform)


Now that we have our data pipeline ready we need to create the deep learning model.

CNN Model:
This post assumes that you have some knowledge of neural networks, as explaining them is out of the scope of this article. I am going to use a CNN (Convolutional Neural Network). The model is built from 3 main layer types, namely Conv2d, BatchNorm2d, and MaxPool2d, and the activation function used here is ReLU:

Code:  

Python




# the EmergencyCustomModel class defines our neural network.
# It inherits from the ImageClassificationBase class, which has helper methods
# for printing the loss and accuracy at each epoch.
  
  
class EmergencyCustomModel(ImageClassificationBase):
    def __init__(self):
        super().__init__()
        self.network = nn.Sequential(
  
            nn.Conv2d(3, 32, kernel_size = 3, padding = 1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
  
            nn.Conv2d(32, 64, kernel_size = 3, stride = 1, padding = 1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
  
            nn.Conv2d(64, 64, kernel_size = 3, stride = 1, padding = 1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
  
            nn.Conv2d(64, 128, kernel_size = 3, stride = 1, padding = 1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
  
            nn.Conv2d(128, 128, kernel_size = 3, stride = 1, padding = 1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
  
            nn.Conv2d(128, 256, kernel_size = 3, stride = 1, padding = 1),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
  
            nn.Flatten(),
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 2),
            # nn.Sigmoid(),
        )
  
    def forward(self, xb):
        return self.network(xb)
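
To check how the feature map shrinks through the network above, the spatial size can be worked out by hand. This is a small sketch: the 224x224 input size is an assumption (the article does not state the image size), and `pooled_size` is a helper written only for this illustration.

```python
def pooled_size(size, n_pools, kernel=2, stride=2):
    """Spatial size after n_pools MaxPool2d(kernel, stride) layers (no padding)."""
    for _ in range(n_pools):
        size = (size - kernel) // stride + 1
    return size

# 3x3 convolutions with padding=1 and stride=1 keep height and width unchanged,
# so only the five pooling layers shrink the feature map.
side = pooled_size(224, n_pools=5)   # 224 is an assumed input size
print(side)  # 7
```

Whatever size is left at this point, `nn.AdaptiveAvgPool2d(1)` reduces it to 1x1, so `nn.Flatten()` always yields 256 features and `nn.Linear(256, 128)` fits regardless of the input resolution.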


The entire model definition can be found in the notebook in my GitHub repo.

Training function: 

Code: the following function is used to train all the models in the post. 

Python3




# defining the training method.
# the evaluation method is used to calculate validation accuracy.
  
  
@torch.no_grad()
def evaluate(model, val_loader):
    model.eval()
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)
  
  
# The fit method is used to train the model
# parameters
'''
epochs: no. of epochs the model trains 
max_lr: maximum learning rate.
train_loader: here we pass the train dataset 
val_loader: here we pass the val_dataset 
opt_func : The learning algorithm that performs gradient descent.
model : the neural network to train on.
'''
  
def fit(epochs, max_lr, model, train_loader, val_loader,
        weight_decay = 0, grad_clip = None, opt_func = torch.optim.SGD):
    torch.cuda.empty_cache()
    history = []
  
    # Set up custom optimizer with weight decay
    optimizer = opt_func(model.parameters(), max_lr, weight_decay = weight_decay)
  
    # the loop iterates from 0 to the number of epochs.
    # the model needs to be put in training mode by calling model.train().
  
    for epoch in range(epochs):
        # Training Phase
        model.train()
        train_losses = []
  
        for batch in train_loader:
            loss = model.training_step(batch)
            train_losses.append(loss)
            loss.backward()
  
            # Gradient clipping
            if grad_clip:
                nn.utils.clip_grad_value_(model.parameters(), grad_clip)
  
            optimizer.step()
            optimizer.zero_grad()
  
        # Validation phase
        result = evaluate(model, val_loader)
        result['train_loss'] = torch.stack(train_losses).mean().item()
        model.epoch_end(epoch, result)
        history.append(result)
    return history
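
The `fit` and `evaluate` functions above call `training_step`, `validation_step`, `validation_epoch_end` and `epoch_end` on the model. These come from `ImageClassificationBase`, which is not shown in this article. Below is a rough sketch of what such a base class could look like. The choice of loss (binary cross-entropy on logits, matching the one-hot float targets) and the accuracy computation are assumptions, not necessarily what the notebook uses; `TinyModel` is a hypothetical stand-in used only to exercise the class.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ImageClassificationBase(nn.Module):
    """Assumed helper base class; the author's real version is in the notebook."""

    def training_step(self, batch):
        images, targets = batch
        # targets are one-hot float vectors, e.g. [1, 0] or [0, 1]
        return F.binary_cross_entropy_with_logits(self(images), targets)

    def validation_step(self, batch):
        images, targets = batch
        out = self(images)
        loss = F.binary_cross_entropy_with_logits(out, targets)
        # accuracy: predicted class index vs. position of the 1 in the target
        acc = (out.argmax(dim=1) == targets.argmax(dim=1)).float().mean()
        return {'val_loss': loss.detach(), 'val_score': acc}

    def validation_epoch_end(self, outputs):
        # average the per-batch values into one number per epoch
        return {
            'val_loss': torch.stack([o['val_loss'] for o in outputs]).mean().item(),
            'val_score': torch.stack([o['val_score'] for o in outputs]).mean().item(),
        }

    def epoch_end(self, epoch, result):
        print(f"Epoch [{epoch}] train_loss: {result['train_loss']:.4f}, "
              f"val_loss: {result['val_loss']:.4f}, "
              f"val_score: {result['val_score']:.4f}")


# Tiny usage check with a stand-in network and random data.
class TinyModel(ImageClassificationBase):
    def __init__(self):
        super().__init__()
        self.network = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))

    def forward(self, xb):
        return self.network(xb)


model = TinyModel()
images = torch.randn(4, 3, 8, 8)
targets = torch.tensor([[1., 0.], [0., 1.], [0., 1.], [1., 0.]])
result = model.validation_epoch_end([model.validation_step((images, targets))])
print(sorted(result))
```

The dictionary keys `val_loss` and `val_score` are chosen to match the ones read by the plotting code in this article.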


Before starting training we need to split our data into a train and a validation set. This lets us check how well the model generalizes to unseen data. We will do an 80-20 split: 80% train and 20% validation. After splitting the data we need to pass the datasets to a DataLoader, which is provided by PyTorch.

Code: Splitting and creating dataloaders. 

Python3




# the batchSize is the number of images passed by the loader at a time.
# reduce this number if there is an out of memory error.
batchSize = 32
valPct = 0.2
  
# code for splitting the data
# valPct variable is used to split dataset
  
valSize = int(valPct * len(trainDataset))
trainSize = len(trainDataset) - valSize
trainDs, valDs = random_split(trainDataset, [trainSize, valSize])
  
# Creating dataloaders.
# the names trainDl and valDl are the ones used in the fit() calls.
trainDl = DataLoader(trainDs, batchSize)
valDl = DataLoader(valDs, batchSize)
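
Since `random_split` shuffles the indices, the validation set changes from run to run, which makes scores harder to compare. One way around this, shown here as a sketch on a stand-in dataset, is to pass a seeded generator; the seed value 42 and the 100-item `TensorDataset` are arbitrary choices for illustration.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset of 100 items; in the article this would be trainDataset.
dataset = TensorDataset(torch.arange(100))

valPct = 0.2
valSize = int(valPct * len(dataset))
trainSize = len(dataset) - valSize

# Seeding the generator makes the same items land in the same split each run.
generator = torch.Generator().manual_seed(42)
trainDs, valDs = random_split(dataset, [trainSize, valSize], generator=generator)

print(len(trainDs), len(valDs))  # 80 20
```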


Now we are ready to start training by calling the fit() method.

Python3




customModel = EmergencyCustomModel()
epochs = 10
lr = 0.01
  
# save the history to visualize later.
history = fit(epochs, lr, customModel, trainDl, valDl)


Output of above code: 

output of fit function

The entire code is available in the GitHub repo; the link is provided below.

Code: The plot function is used for producing the loss and accuracy graphs shown below  

Python3




'''
parameters:
epochs = number of epochs the model was trained on
hist = the history returned by the fit function.
  
'''
  
def plot(hist, epochs = 10):
    trainLoss = []
    valLoss = []
    valScore = []
    for i in range(epochs):
  
        trainLoss.append(hist[i]['train_loss'])
        valLoss.append(hist[i]['val_loss'])
        valScore.append(hist[i]['val_score'])
  
    plt.plot(trainLoss, label ='train_loss')
    plt.plot(valLoss, label ='val_loss')
    plt.legend()
    plt.title('loss')
  
    plt.figure()
    plt.plot(valScore, label ='val_score')
    plt.legend()
    plt.title('accuracy')
  
# calling the function (outside the function body)
plot(history)


Output: Plotting the loss and accuracy plots. 

loss and accuracy graphs

There is very little overfitting, and the validation accuracy peaks at 90%. I would like to add that when I created a custom model in Keras, the highest val_score I was able to achieve was 83%; changing the framework gave an increase of 7%. One more thing is the size of the model: using PyTorch I am able to use a model with more than 3 Conv2d layers without overfitting, but in Keras I could only use 2 layers. Using more or fewer layers would just add to the training cost without improving the accuracy.

Transfer Learning: 
Using pre-trained models: I made use of two model architectures, ResNet and DenseNet. One thing to note: the DenseNet models produce almost the same results as the ResNet models in fewer epochs and, most importantly, the saved model takes half the space.

Code:  

Python3




# to use the pretrained model we make use of the torchvision.models library
  
  
class ResNet50(ImageClassificationBase):
  
    def __init__(self):
        super().__init__()
        # the following line downloads the resnet50 model if it doesn't exist
        # and stores it in pretrainedModel
        # (newer torchvision versions prefer the weights argument over pretrained)
        self.pretrainedModel = models.resnet50(pretrained = True)
        # this model was trained on ImageNet data, which has 1000 classes, but our
        # problem has only 2, so we need to replace the final layer of the model
        feature_in = self.pretrainedModel.fc.in_features
        self.pretrainedModel.fc = nn.Linear(feature_in, 2)
  
    def forward(self, x):
        return self.pretrainedModel(x)
  
  
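# Training the model.
# final learning rate, epochs and optimizer
lr = 1e-4
epochs = 5
optFunc = torch.optim.Adam
  
# Here I have made use of weight decay (wd) as a regularization parameter.
# It helps in preventing overfitting and helps our model to generalize.
  
bestWd = 1e-4
  
# to_device and device are not defined in this article
# (the entire code is in the notebook linked at the end).
custom_model = to_device(ResNet50(), device)
# weight_decay and opt_func are passed by keyword: passed positionally,
# optFunc would land in fit()'s grad_clip parameter.
hist = fit(epochs, lr, custom_model, trainDl, valDl,
           weight_decay = bestWd, opt_func = optFunc)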


Output: Plotting the loss and accuracy plots. 

Here one can see a lot of overfitting and no improvement in val_score. I decided to try the cyclic scheduler training strategy; here's the result. I still need to experiment more with this method, but as one can see, I have reduced the overfitting to some extent, although the val_accuracy is still low.
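
The scheduler code itself is not shown in this article. One common way to get a cyclic learning rate in PyTorch is `torch.optim.lr_scheduler.OneCycleLR`; whether this is the exact scheduler used here is an assumption. The sketch below only traces the learning rate over a run, with a stand-in model and a made-up `steps_per_epoch`.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                       # stand-in for the real network
epochs, steps_per_epoch, max_lr = 5, 20, 1e-4  # steps_per_epoch = len(train loader)

optimizer = torch.optim.Adam(model.parameters(), lr=max_lr, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=max_lr, epochs=epochs, steps_per_epoch=steps_per_epoch)

lrs = []
for _ in range(epochs * steps_per_epoch):
    # in real training, loss.backward() would come before optimizer.step()
    optimizer.step()
    lrs.append(optimizer.param_groups[0]['lr'])
    scheduler.step()      # the scheduler is stepped once per batch, not per epoch

print(f"start {lrs[0]:.2e}, peak {max(lrs):.2e}, end {lrs[-1]:.2e}")
```

The learning rate warms up from a small value to `max_lr` and then anneals to a value far below the starting point, which is why the training function takes a maximum learning rate rather than a fixed one.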

Using Densenet169: DenseNet is similar to ResNet, but instead of adding the skip connection to the block output, it concatenates the two; hence the blocks are called dense blocks.
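
The difference between the two kinds of skip connection is easiest to see in the tensor shapes. A minimal sketch, where the random tensors stand in for a feature map and the output of a convolutional block (the shapes are made up):

```python
import torch

x = torch.randn(1, 64, 56, 56)      # input feature map (batch, channels, H, W)
fx = torch.randn(1, 64, 56, 56)     # stand-in for the output of a conv block

residual = x + fx                   # ResNet-style skip: element-wise sum
dense = torch.cat([x, fx], dim=1)   # DenseNet-style skip: stack along channels

print(residual.shape, dense.shape)
```

Addition keeps the channel count at 64, while concatenation grows it to 128, so each later layer in a dense block sees the features of all earlier layers.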

Code:  

Python3




class Densenet169(ImageClassificationBase):
  
    def __init__(self):
        super().__init__()
        # the below statement is used to download and store the pretrained model.
        self.pretrained_model = models.densenet169(pretrained = True)
  
        feature_in = self.pretrained_model.classifier.in_features
        self.pretrained_model.classifier = nn.Linear(feature_in, 2)
  
    def forward(self, x):
        return self.pretrained_model(x)
  
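# Training the model
# final learning rate, epochs and optimizer
lr = 1e-4
epochs = 5
optFunc = torch.optim.Adam
bestWd = 1e-4
customModel2 = Densenet169()
  
# pass weight_decay and opt_func by keyword so they reach the right parameters
hist = fit(epochs, lr, customModel2, trainDl, valDl,
           weight_decay = bestWd, opt_func = optFunc)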


If you look at the loss and accuracy plots, the overfitting has decreased and the val accuracy is better, but this was done without the cyclic scheduler.

Code: Plotting the loss and accuracy plots. 

Using early stopping the training can be stopped at 5 epochs.
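
Early stopping is not part of the `fit` function shown in this article. A minimal sketch of the idea: stop once the validation loss has not improved for a given number of epochs. The `EarlyStopping` class, its `patience` parameter and the loss values are all made up for illustration.

```python
class EarlyStopping:
    """Signal a stop when val_loss has not improved for `patience` epochs."""

    def __init__(self, patience=2, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float('inf')
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # new best value: remember it and reset the counter
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses: improvement stalls after the third epoch.
val_losses = [0.60, 0.45, 0.40, 0.41, 0.42, 0.43]
stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
print(stopped_at)  # 5
```

Inside a training loop, the check would go right after the validation phase, using `result['val_loss']` from each epoch.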

Web APP: 

https://emervehicledetector.herokuapp.com/

Note: The web app only accepts jpg images.
Conclusion: I was able to get rank 200 out of 10000, so I made it into the top 2% using the above model. All the code is available in my GitHub repo: https://github.com/evilc3/EmergencyVehicleDetector
Entire notebook: https://colab.research.google.com/drive/13En-V2A-w2o4uXuDZk0ypktxzX9joXIY?usp=sharing
 



Last Updated : 08 Mar, 2024