Sunday, May 10, 2020

Deep Learning 5 - Object Detection with Region-CNN

Learn A-Z Deep Learning in 15 Days


In this lecture, we will learn about object detection with Region-CNN.


Object detection combines localization (finding where each object is, as a bounding box) and recognition (classifying what each object is).



Region-CNN or RCNN

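RCNN takes the input image and:

  • Proposes candidate bounding boxes (about 2000 per image) using the Selective Search algorithm, instead of an exhaustive sliding window
  • Warps each proposed region to a fixed size and extracts its features with a CNN
  • Classifies each proposed region from its CNN features using an SVM
  • Uses a linear regressor to generate a tighter bounding box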
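
The last step above can be sketched as follows. This is only a minimal illustration, assuming the box parameterization described in the R-CNN paper (center shifts scaled by the proposal size, width and height changed in log space). The function name `apply_box_deltas` and the sample numbers are hypothetical and do not come from any library.

```python
import math

def apply_box_deltas(box, deltas):
    """Refine a proposal (x1, y1, x2, y2) with regressor outputs (tx, ty, tw, th)."""
    x1, y1, x2, y2 = box
    tx, ty, tw, th = deltas
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    # Shift the center proportionally to the proposal size
    new_cx = cx + tx * w
    new_cy = cy + ty * h
    # Scale width and height in log space
    new_w = w * math.exp(tw)
    new_h = h * math.exp(th)
    return (new_cx - 0.5 * new_w, new_cy - 0.5 * new_h,
            new_cx + 0.5 * new_w, new_cy + 0.5 * new_h)

# Hypothetical proposal nudged to the right by 10% of its width
refined = apply_box_deltas((100, 100, 200, 200), (0.1, 0.0, 0.0, 0.0))
print(refined)
```

With all four deltas at zero the proposal comes back unchanged, which is why the regressor only has to learn small corrections.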

Disadvantages

  • Slow: a feature map must be calculated (one CNN forward pass) for each region proposal.
  • Hard to train: the R-CNN system has 3 different parts (CNN, SVM, bounding box regressor) that have to be trained separately. This makes training very difficult.
  • Large memory requirement: the feature map of every region proposal has to be saved, which needs a lot of memory.






Fast RCNN
  • The whole image is passed through a ConvNet once to produce a feature map. The Regions of Interest (RoIs), still proposed by Selective Search, are projected onto this feature map.
  • An RoI pooling layer is applied to each of these regions to reshape it to the fixed size expected by the fully connected layers. Then, each region is passed on to a fully connected network.
  • A softmax layer is used on top of the fully connected network to output classes. Along with the softmax layer, a linear regression layer is used in parallel to output bounding box coordinates for the predicted classes.
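
The RoI pooling step above can be sketched in plain Python. This is a simplified illustration, not the code of any framework: it assumes a single-channel feature map stored as a list of lists, RoI coordinates given in feature-map cells with exclusive right and bottom edges, and an RoI at least as large as the output grid. The name `roi_max_pool` is hypothetical.

```python
def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one RoI (x1, y1, x2, y2) into a fixed out_h x out_w grid."""
    x1, y1, x2, y2 = roi
    region = [row[x1:x2] for row in feature_map[y1:y2]]
    out_h, out_w = output_size
    height, width = len(region), len(region[0])
    # Integer bin edges that split the region into out_h x out_w cells
    row_edges = [i * height // out_h for i in range(out_h + 1)]
    col_edges = [j * width // out_w for j in range(out_w + 1)]
    pooled = []
    for i in range(out_h):
        pooled_row = []
        for j in range(out_w):
            cell = [value
                    for row in region[row_edges[i]:row_edges[i + 1]]
                    for value in row[col_edges[j]:col_edges[j + 1]]]
            pooled_row.append(max(cell))
        pooled.append(pooled_row)
    return pooled

# Toy 6x6 feature map; two RoIs of different shapes both become 2x2
fmap = [[6 * r + c for c in range(6)] for r in range(6)]
print(roi_max_pool(fmap, (0, 0, 4, 4)))   # 4x4 region
print(roi_max_pool(fmap, (1, 2, 6, 6)))   # 4 rows x 5 columns region
```

Regions of different shapes end up with the same output size, which is what lets a single fully connected network process every proposal.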
Disadvantage
  • Selective search is slow and hence computation time is still high.





Faster RCNN

Faster RCNN takes an input image and:

  • Passes it to the ConvNet, which returns feature maps for the image
  • Applies a Region Proposal Network (RPN) to these feature maps to get object proposals, scored and refined from a set of reference boxes called anchors
  • Applies an RoI pooling layer to bring all the proposals down to the same size
  • Finally, passes these proposals to a fully connected layer to classify them and predict the bounding boxes for the image
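
To make the anchors used by the RPN more concrete, here is a minimal sketch of how they can be laid out over a feature map. The stride of 16, the three scales and the three aspect ratios are assumptions based on the defaults described in the Faster R-CNN paper; the repository used later in this post may use different values, and `generate_anchors` is a hypothetical name.

```python
import math

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) in image pixels for every feature-map cell."""
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Center of this feature-map cell in image coordinates
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    # ratio = width / height; the area stays at scale ** 2
                    w = scale * math.sqrt(ratio)
                    h = scale / math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors

anchors = generate_anchors(feat_h=2, feat_w=3)
print(len(anchors))   # 2 * 3 cells * 9 anchors per cell = 54
```

The RPN then predicts, for each anchor, an object/background score and four box corrections, and the best-scoring corrected anchors become the proposals.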

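Total_loss = rpn_class + rpn_regressor

This is the loss used to train the RPN: rpn_class is its object/background classification loss and rpn_regressor is its bounding box regression loss.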
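
As a rough sketch of how the two terms can be computed, the example below uses binary cross-entropy for the classification term and smooth L1 for the regression term, as described in the Faster R-CNN paper. The averaging choices, the equal weighting of the two terms and all function names are assumptions made for illustration; they are not taken from the training code used in this post.

```python
import math

def smooth_l1(diff):
    """Smooth L1: quadratic near zero, linear for large errors."""
    d = abs(diff)
    return 0.5 * d * d if d < 1.0 else d - 0.5

def rpn_loss(pred_probs, labels, pred_deltas, target_deltas, eps=1e-7):
    """labels: 1 = object anchor, 0 = background anchor."""
    # rpn_class: binary cross-entropy averaged over all anchors
    rpn_class = 0.0
    for p, y in zip(pred_probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        rpn_class -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    rpn_class /= len(labels)
    # rpn_regressor: smooth L1 over the 4 box deltas, positive anchors only
    positives = [i for i, y in enumerate(labels) if y == 1]
    rpn_regressor = sum(smooth_l1(p - t)
                        for i in positives
                        for p, t in zip(pred_deltas[i], target_deltas[i]))
    rpn_regressor /= max(len(positives), 1)
    return rpn_class + rpn_regressor

# One object anchor and one background anchor with made-up predictions
print(rpn_loss([0.9, 0.2], [1, 0],
               [[0.1, 0.0, 0.2, 0.0], [0.0, 0.0, 0.0, 0.0]],
               [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]))
```

Background anchors contribute only to the classification term, since there is no ground-truth box to regress towards.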

Disadvantage

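  • Faster RCNN replaces Selective Search with the RPN, but it is still a two-stage detector (region proposal followed by classification), so it is generally slower than single-stage detectors.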








Comparison of RCNN, Fast RCNN & Faster RCNN



Data Preprocessing with the Berkeley (BDD100K) dataset


Category          Frequency

car               713211
traffic sign      239686
traffic light     186117
person            91349
truck             29971
bus               11672
bike              7210
rider             4517
motor             3002
train             136

  • Total annotation count = 1286871
  • Total images = 69863
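
Once the annotations have been exported to a CSV file with `label` and `name` columns, as the preprocessing code in this post does, counts like the ones above can be recomputed with a few lines. The sketch below uses only the standard library and a tiny made-up sample so that it runs anywhere; `summarize_annotations` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

def summarize_annotations(csv_file):
    """Count boxes per label and distinct images in a CSV with 'label' and 'name' columns."""
    label_counts = Counter()
    image_names = set()
    for row in csv.DictReader(csv_file):
        label_counts[row['label']] += 1
        image_names.add(row['name'])
    return label_counts, len(image_names)

# Tiny made-up sample; for the real data pass open('./bdd100k_train.csv', newline='')
sample = io.StringIO(
    "label,name,xmax,xmin,ymax,ymin\n"
    "car,a.jpg,50,10,60,20\n"
    "car,a.jpg,90,70,60,20\n"
    "bus,b.jpg,300,100,200,50\n"
)
counts, n_images = summarize_annotations(sample)
for label, n in counts.most_common():
    print(f"{label:<18}{n}")
print("Total annotation count =", sum(counts.values()))
print("Total images =", n_images)
```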


Run the code in Google Colab

import json
import pandas as pd

filepath = './bdd100k/labels/bdd100k_labels_images_train.json'
with open(filepath) as f:
    jsfile = json.load(f)

# Collect one row per 2D box and build the DataFrame once at the end
# (DataFrame.append was removed in pandas 2.0 and is slow inside a loop)
rows = []
for js in jsfile:
    for label_point in js['labels']:
        # Skip lane and drivable-area labels, which are not box annotations
        if label_point['category'] in ('lane', 'drivable area'):
            continue
        box = label_point['box2d']
        rows.append({'label': label_point['category'], 'name': js['name'],
                     'xmin': box['x1'], 'ymin': box['y1'],
                     'xmax': box['x2'], 'ymax': box['y2']})

# Keep the alphabetical column order that the append-based version produced
df = pd.DataFrame(rows, columns=['label', 'name', 'xmax', 'xmin', 'ymax', 'ymin'])
df.to_csv('./bdd100k_train.csv', index=False)


# Replace the label
df.loc[df.label == 'bike', 'label'] = 'Bicycle'
df.loc[df.label == 'rider', 'label'] = 'cyclist/rider'
# Drop the rows labelled 'train'
df = df[df['label'] != 'train']
# Save again so the CSV read in the next step contains the renamed labels
df.to_csv('./bdd100k_train.csv', index=False)


# Keep only the images that contain at least one 'motor', 'cyclist/rider', 'Bicycle' or 'bus' box
import pandas as pd
csvtrain = pd.read_csv('bdd100k_train.csv')
wanted = ['motor', 'cyclist/rider', 'Bicycle', 'bus']
keep_names = csvtrain.loc[csvtrain['label'].isin(wanted), 'name'].unique()
csvtrain = csvtrain[csvtrain['name'].isin(keep_names)]
csvtrain.to_csv('./bdd100k_train.csv', index=False)


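import pandas as pd
csvtrain = pd.read_csv('bdd100k_train.csv')
# Make the annotation txt file from the csv, one line per box: filepath,x1,y1,x2,y2,label
# Columns are selected by name so the result does not depend on the CSV column order
# Mode 'w' so that re-running the cell does not duplicate lines
with open('bk100_train.txt', 'w') as f:
    for label, name, x1, y1, x2, y2 in csvtrain[['label', 'name', 'xmin', 'ymin', 'xmax', 'ymax']].values:
        f.write('/C:/Users/mayank singh/Desktop/berekely/' + name + ',' + str(int(round(x1))) + ',' + str(int(round(y1))) + ',' + str(int(round(x2))) + ',' + str(int(round(y2))) + ',' + label + '\n')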
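
Before training, it can be worth checking that every line of the annotation file has the expected `filepath,x1,y1,x2,y2,label` layout written by the code above. The sketch below is one possible check, not part of the training repository; `parse_annotation_line` is a hypothetical helper name.

```python
def parse_annotation_line(line):
    """Split one 'filepath,x1,y1,x2,y2,label' line into typed fields."""
    # Split from the right so that commas inside the file path do not break parsing
    filepath, x1, y1, x2, y2, label = line.strip().rsplit(',', 5)
    box = tuple(int(v) for v in (x1, y1, x2, y2))
    # A valid box needs x1 < x2 and y1 < y2
    if not (box[0] < box[2] and box[1] < box[3]):
        raise ValueError(f"degenerate box in line: {line!r}")
    return filepath, box, label

# Made-up example line in the same layout as 'bk100_train.txt'
print(parse_annotation_line("/data/img 1.jpg,10,20,110,220,car\n"))
```

To check the real file, the same function can be called on every line of `bk100_train.txt`; a line with swapped or equal coordinates raises a `ValueError` that names the offending line.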


Faster RCNN Implementation

!git clone https://github.com/kentaroy47/frcnn-from-scratch-with-keras.git

Train RPN for Frcnn

  • Upload the 'bk100_train.txt' file into the 'frcnn-from-scratch-with-keras' directory
cd /content/frcnn-from-scratch-with-keras

Start the RPN Training

# --elen sets the epoch length; here it is set to the total number of annotations
!python train_rpn.py --network resnet50 -o simple -p bk100_train.txt --num_epochs 50 --elen 1286871 


Train Frcnn
  • Use the RPN weights 'rpn.resnet50.weights.88-0.63.hdf5' stored in the 'models/rpn' directory

!python train_frcnn.py --network resnet50 -o simple -p bk100_train.txt --num_epochs 50 --elen 1286871  --rpn models/rpn/rpn.resnet50.weights.88-0.63.hdf5

Test Frcnn

!python test_frcnn.py --network resnet50 -p test_image/ --load models/resnet50/voc.hdf5



In the next blog, we will cover Object Detection with RetinaNet.
https://sngurukuls247.blogspot.com/2020/05/deep-learning-6-object-detection-with.html

                                                                                                                                                  

Follow the link below to access Free Python Lectures-
https://www.youtube.com/sngurukul

Feel free to contact me on-
