Sunday, May 10, 2020

Deep Learning 7-Object Detection with YOLO

May 10, 2020

Learn A-Z Deep Learning in 15 Days


In this lecture, we will learn about Object Detection with YOLO.


You Only Look Once

YOLO first takes an input image. The framework then divides the image into an n × n grid, and each grid cell:


  • Predicts B bounding boxes, and each box has one box confidence score Pr(Object) ∗ IoU (see the IoU sketch after this list)
  • Detects only one object, regardless of the number of boxes B
  • Predicts C conditional class probabilities (one per class, the likelihood of each object class given that the cell contains an object)
  • Bounding boxes whose class probability is above a threshold value are selected and used to locate the object within the image
  • Total loss = classification loss + localization loss (errors between the predicted bounding box and the ground truth) + confidence loss (the objectness of the box)
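
To make the box confidence term concrete, here is a minimal sketch (not from the lecture code; the box coordinates and objectness value are made up) of computing IoU between a predicted box and a ground-truth box and combining it with the predicted objectness:

# IoU between two boxes given as (xmin, ymin, xmax, ymax)
def iou(box_a, box_b):
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

p_object = 0.9                          # predicted objectness (hypothetical)
predicted = (50, 50, 150, 150)          # predicted box (hypothetical)
ground_truth = (60, 60, 160, 160)       # ground-truth box (hypothetical)
box_confidence = p_object * iou(predicted, ground_truth)
print(box_confidence)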


Data Preprocessing with the KITTI dataset


Category            Frequency

Car                 28742
Pedestrian          4487
Van                 2914
Cyclist             1627
Truck               1094
Misc                973
Tram                511
Person_sitting      222

  • Total annotation count = 40570
  • Total images = 7481



import os
import pandas as pd

path = "kitti/training/label_2/"

# iterate over the list of label files in the directory 'label_2'
for filename in os.listdir(path):

    filepath = os.path.join(path, filename)
    # each KITTI label file is a space-separated text file
    pdtxtdata = pd.read_csv(filepath, delimiter=' ', header=None)

    # the folder 'updatelabel' must exist in the current folder
    with open('./updatelabel/' + filename, 'a') as f:

        # columns 0, 4, 5, 6, 7 are class_name, xmin, ymin, xmax and ymax
        for label, xmin, ymin, xmax, ymax in pdtxtdata.iloc[:, [0, 4, 5, 6, 7]].values:
            # keep only the categories 'Pedestrian' and 'Person_sitting' (mapped to class id 0)
            if label == 'Pedestrian' or label == 'Person_sitting':
                f.write(str(0)+' '+str(xmin)+' '+str(ymin)+' '+str(xmax)+' '+str(ymax)+'\n')


# Remove the files with 0 size
import os
path="./updatelabel/"
# Iterate the list of files in the directory 'updatelabel'
for filename in os.listdir(path):
    filepath= os.path.join(path,filename)
    if os.path.getsize(filepath)==0:
        os.remove(filepath)

# Copy the images corresponding to the remaining label files
# (the destination folder './yoloimagelabel/' must already exist)
import os
import shutil
path="./updatelabel/"
for filename in os.listdir(path):
    shutil.copy("/kitti/training/image_2/"+filename.split('.')[0]+'.png', "./yoloimagelabel/"+filename.split('.')[0]+'.png')

# Convert KITTI boxes (xmin, ymin, xmax, ymax) to YOLO format (x_center, y_center, w, h), normalized by image size
import os
import cv2
import pandas as pd

path = "./updatelabel/"
for filename in os.listdir(path):
    filepath = os.path.join(path, filename)
    image = cv2.imread("./image/" + filename.split('.')[0] + '.png')
    image_width = image.shape[1]
    image_height = image.shape[0]
    pdtxtdata = pd.read_csv(filepath, delimiter=' ', header=None)
    with open('./yoloimagelabel/' + filename, 'a') as f:
        for label, xmin, ymin, xmax, ymax in pdtxtdata.iloc[:, [0, 1, 2, 3, 4]].values:
            x_center = (xmin + xmax) / (2.0 * image_width)
            y_center = (ymin + ymax) / (2.0 * image_height)
            width = (xmax - xmin) / image_width
            height = (ymax - ymin) / image_height
            f.write('0 ' + '%.6f %.6f %.6f %.6f\n' % (x_center, y_center, width, height))
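
As a quick sanity check on the conversion above, here is a worked example with a made-up box on a typical 1242 x 375 KITTI image:

# Worked example of the KITTI -> YOLO conversion (hypothetical box on a 1242 x 375 image)
image_width, image_height = 1242, 375
xmin, ymin, xmax, ymax = 100, 50, 200, 150
x_center = (xmin + xmax) / (2.0 * image_width)    # 0.120773
y_center = (ymin + ymax) / (2.0 * image_height)   # 0.266667
width = (xmax - xmin) / float(image_width)        # 0.080515
height = (ymax - ymin) / float(image_height)      # 0.266667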

Total Images = 1796 & Total Annotation = 4709

# Split into train & test:
# generate train.txt and test.txt with one image path per line
import glob, os

current_dir = 'dataset'
# Percentage of images to be used for the test set
percentage_test = 20
# Create and/or truncate train.txt and test.txt
file_train = open('train.txt', 'w')
file_test = open('test.txt', 'w')
# Populate train.txt and test.txt: every index_test-th image goes to the test set
counter = 1
index_test = round(100 / percentage_test)
for pathAndFilename in glob.glob(os.path.join(current_dir, "*.*g")):
    if counter == index_test:
        counter = 1
        file_test.write(current_dir + "/" + os.path.basename(pathAndFilename) + "\n")
    else:
        file_train.write(current_dir + "/" + os.path.basename(pathAndFilename) + "\n")
        counter = counter + 1
file_test.close()
file_train.close()

Run the code below on Google Colab

!git clone https://github.com/AlexeyAB/darknet.git

cd /content/darknet


Makefile

  • Download the Makefile
  • Change GPU=1, CUDNN=1 & OPENCV=1
  • Upload the modified Makefile to the darknet folder





!make

!./darknet


Train & Test file

  • Upload the train.txt & test.txt files to the darknet folder

cd /content/darknet/cfg


Obj.data


  • Create the obj.data file with the following contents


  1. classes = 1
  2. train = train.txt
  3. valid = test.txt
  4. names = obj.names
  5. backup = backup/


  • Upload it to the cfg folder



Obj.names
  • Create an obj.names file and
  • write the names of all the categories, one per line:
    1. Person
  • Upload it to the cfg folder



Yolov3 Network
  • Download yolov3.cfg from the cfg folder
  • Change the classes and filters params of each [yolo] layer and of the [convolutional] layer just before each [yolo] layer (as in the excerpt below)
  • filters = (classes + 5) * 3. For a single class we should set filters = (1 + 5) * 3 = 18
    1. classes = 1
    2. filters = 18
  • Upload the modified yolov3.cfg to the cfg folder
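
For a single class, the relevant lines in yolov3.cfg end up looking like the excerpt below. Note that yolov3.cfg contains three [yolo] layers, so the classes line and the filters line of the [convolutional] layer just before it have to be changed in all three places (the other parameters stay as they are in the downloaded file):

[convolutional]
filters=18

[yolo]
classes=1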

Download Weight

cd /content/darknet
!wget https://pjreddie.com/media/files/darknet19_448.conv.23

!./darknet detector train cfg/obj.data cfg/yolov3.cfg darknet19_448.conv.23 -dont_show 0



In the next blog, we will cover Deep Learning NLP.
https://sngurukuls247.blogspot.com/2018/09/python-ninja-bootcamp-1-course.html

                                                                                                                                                

Follow the link below to access Free Python Lectures-
https://www.youtube.com/sngurukul

Feel free to contact me on-
Email - sn.gurukul24.7uk@gmail.com

Deep Learning 6-Object Detection with Retinanet

May 10, 2020

Learn A-Z Deep Learning in 15 Days


In this lecture, we will learn about Object Detection with RetinaNet.


Retinanet

The one-stage RetinaNet network architecture uses a Feature Pyramid Network (FPN) backbone on top of a feedforward ResNet architecture (a) to generate a rich, multi-scale convolutional feature pyramid (b). To this backbone, RetinaNet attaches two subnetworks, one for classifying anchor boxes (c) and one for regressing from anchor boxes to ground-truth object boxes (d). The network design is intentionally simple, which lets the work focus on a novel focal loss function that closes the accuracy gap between this one-stage detector and state-of-the-art two-stage detectors such as Faster R-CNN with FPN, while running at faster speeds.
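
The focal loss itself is simple to state: FL(p_t) = -α_t (1 − p_t)^γ log(p_t). It down-weights easy, well-classified examples (mostly background anchors) so that they do not swamp the loss during training. Below is a minimal NumPy sketch of the binary form with the paper's defaults α = 0.25 and γ = 2; this is an illustration only, not the keras-retinanet implementation used later:

import numpy as np

def focal_loss(y_true, y_pred, alpha=0.25, gamma=2.0, eps=1e-7):
    # binary focal loss; y_true in {0, 1}, y_pred is the predicted probability of the positive class
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    p_t = np.where(y_true == 1, y_pred, 1.0 - y_pred)
    alpha_t = np.where(y_true == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# an easy, well-classified background anchor contributes almost nothing to the loss...
print(focal_loss(np.array([0]), np.array([0.1])))
# ...while a hard, misclassified foreground anchor keeps a large loss
print(focal_loss(np.array([1]), np.array([0.1])))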




Data Preprocessing with the Berkeley DeepDrive (BDD100K) dataset


Category          Frequency

car               713211
traffic sign      239686
traffic light     186117
person            91349
truck             29971
bus               11672
bike              7210
rider             4517
motor             3002
train             136

  • Total annotation count = 1286871
  • Total images = 69863


Run the code below in Google Colab

import os
import pandas as pd
import json

df = pd.DataFrame()
filepath = './bdd100k/labels/bdd100k_labels_images_train.json'
jsfile = json.loads(open(filepath).read())

# keep only the 'bike' annotations and collect them into a dataframe
for js in jsfile:
    for label_point in js['labels']:
        if label_point['category'] == 'bike':

            df = df.append({'label': label_point['category'],
                            'xmin': str(label_point['box2d']['x1']), 'ymin': str(label_point['box2d']['y1']),
                            'xmax': str(label_point['box2d']['x2']), 'ymax': str(label_point['box2d']['y2']),
                            'name': js['name']
                            }, ignore_index=True)

df.to_csv('./bdd100k_train.csv', index=False, sep=",", header=False)




Create Mapping for class
  • The class-name-to-ID mapping file should contain one mapping per line
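
Since only the 'bike' category was kept in the preprocessing above, a minimal mapping_file.csv (the file name used in the training command below) would contain a single line in the class_name,id format that keras-retinanet's CSV generator expects:

bike,0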


Retinanet Implementation

!git clone https://github.com/fizyr/keras-retinanet.git

cd /content/keras-retinanet/

!python setup.py build_ext --inplace

!pip install keras_retinanet


Train Retinanet
  • Upload the 'bdd100k_train.csv' file to the 'keras-retinanet' folder
  • Upload the 'mapping_file.csv' file to the 'keras-retinanet' folder

!python keras_retinanet/bin/train.py --batch-size 1 --epochs 50 --tensorboard-dir ./Tensorboard_files/ --steps 171487 csv bdd100k_train.csv mapping_file.csv 


Convert the trained 'resnet50' snapshot to an 'inference' model
!python keras_retinanet/bin/convert_model.py snapshots/resnet50_csv_02.h5 inference/inference_v20.h5


Test Retinanet model
  • Use the link below to test the images
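
That link is not reproduced here. As a rough sketch (file paths and the 0.5 score threshold are placeholders, loosely following the keras-retinanet example notebook), testing the converted model can look like this:

# Rough inference sketch with the converted model (paths are placeholders, not from the blog)
import numpy as np
import cv2
from keras_retinanet import models
from keras_retinanet.utils.image import preprocess_image, resize_image

model = models.load_model('inference/inference_v20.h5', backbone_name='resnet50')

image = cv2.imread('test_image.jpg')                 # hypothetical test image
draw = image.copy()
image = preprocess_image(image)
image, scale = resize_image(image)

boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
boxes /= scale                                       # map boxes back to the original image size

for box, score, label in zip(boxes[0], scores[0], labels[0]):
    if score < 0.5:                                  # confidence threshold (assumed)
        break                                        # detections are sorted by score
    x1, y1, x2, y2 = box.astype(int)
    cv2.rectangle(draw, (x1, y1), (x2, y2), (0, 255, 0), 2)

cv2.imwrite('result.jpg', draw)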








In the next blog, we will cover  Object Detection with YOLO.
https://sngurukuls247.blogspot.com/2020/05/deep-learning-7-object-detection-with.html

                                                                                                                                                  

Follow the link below to access Free Python Lectures-
https://www.youtube.com/sngurukul

Feel free to contact me on-
Email - sn.gurukul24.7uk@gmail.com

Deep Learning 5-Object Detection with Region-CNN

May 10, 2020

Learn A-Z Deep Learning in 15 Days


In this lecture, we will learn about Object Detection with Region-CNN.


Object Detection is the process of localization and recognition.



Region-CNN or RCNN

RCNN takes an input image and-

  • Proposes candidate regions (roughly 2000 per image) using the Selective Search algorithm
  • Warps each proposed region to a fixed size and extracts features with a CNN
  • Classifies each proposed region (an SVM on top of the CNN features)
  • A linear regressor generates a tighter bounding box

Disadvantages

  • Slow: a feature map (one CNN forward pass) has to be computed for each region proposal.
  • Hard to train: the R-CNN system has 3 different parts (CNN, SVM, Bounding Box Regressor) that have to be trained separately, which makes training very difficult.
  • Large memory requirement: every feature map of each region proposal has to be saved, which needs a lot of memory.






Fast RCNN
  • The image is passed to a ConvNet, which in turn generates the Regions of Interest.
  • An RoI pooling layer is applied to all of these regions to reshape them as per the input of the ConvNet. Then, each region is passed on to a fully connected network.
  • A softmax layer is used on top of the fully connected network to output classes. Along with the softmax layer, a linear regression layer is also used in parallel to output bounding box coordinates for the predicted classes (see the sketch below).
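
As a rough illustration of these two parallel heads (a toy Keras sketch, not the course code; the RoI feature size, FC width and class count are assumptions):

from keras.layers import Input, Flatten, Dense
from keras.models import Model

num_classes = 21                      # hypothetical: 20 object classes + background

# assume each Region of Interest has already been RoI-pooled to a fixed 7x7x512 feature map
roi_features = Input(shape=(7, 7, 512))
x = Flatten()(roi_features)
x = Dense(4096, activation='relu')(x)

cls_head = Dense(num_classes, activation='softmax', name='cls_prob')(x)          # class scores
bbox_head = Dense(4 * num_classes, activation='linear', name='bbox_deltas')(x)   # box regression

head = Model(inputs=roi_features, outputs=[cls_head, bbox_head])
head.summary()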
Disadvantage
  • Selective search is slow, and hence computation time is still high.





Faster RCNN

Takes an input image


  • Pass it to the ConvNet, which returns feature maps for the image
  • Apply a Region Proposal Network (RPN) on these feature maps to get proposals (anchors)
  • Apply an RoI pooling layer to bring all the proposals down to the same size
  • Finally, pass these proposals to a fully connected layer in order to classify and predict the bounding boxes for the image

Total_loss = rpn_class + rpn_regressor (the loss for the RPN stage; the detection head adds its own classification and regression losses)

Disadvantage

  • Although the RPN replaces selective search, the two-stage design (propose, then classify) still makes inference slower than one-stage detectors.








Comparison of RCNN, Fast RCNN & Faster RCNN



Data Preprocessing with the Berkeley DeepDrive (BDD100K) dataset


Category          Frequency

car               713211
traffic sign      239686
traffic light     186117
person            91349
truck             29971
bus               11672
bike              7210
rider             4517
motor             3002
train             136

  • Total annotation count = 1286871
  • Total images = 69863


Run the code below in Google Colab

import os
import pandas as pd
import json

df = pd.DataFrame()
filepath='./bdd100k/labels/bdd100k_labels_images_train.json'
jsfile = json.loads(open(filepath).read())

# skip annotations that have no bounding box ('lane' and 'drivable area')
for js in jsfile:
    for label_point in js['labels']:
        if label_point['category'] == 'lane' or label_point['category'] == 'drivable area':
            continue

        df = df.append({'label': label_point['category'], 'name': js['name'],
                        'xmin': label_point['box2d']['x1'], 'ymin': label_point['box2d']['y1'],
                        'xmax': label_point['box2d']['x2'], 'ymax': label_point['box2d']['y2']
                        }, ignore_index=True)

df.to_csv('./bdd100k_train.csv', index=False)


# Replace the labels
df.loc[df.label == 'bike', 'label'] = 'Bicycle'
df.loc[df.label == 'rider', 'label'] = 'cyclist/rider'
# Drop the 'train' label in place
x = df.loc[df["label"] == 'train']
df.drop(x.index, axis=0, inplace=True)


# Keep only images that contain at least one 'motor', 'cyclist/rider', 'Bicycle' or 'bus' annotation
import pandas as pd
csvtrain = pd.read_csv('bdd100k_train.csv')
for img_name in list(set(csvtrain['name'])):
    x = csvtrain.loc[csvtrain['name'] == img_name]
    if len(x.loc[(x['label'] == 'motor') | (x['label'] == 'cyclist/rider') |
                 (x['label'] == 'Bicycle') | (x['label'] == 'bus'), :]) == 0:
        csvtrain.drop(x.index, axis=0, inplace=True)
csvtrain.to_csv('./bdd100k_train.csv', index=False)


import pandas as pd
csvtrain = pd.read_csv('bdd100k_train.csv')
# Make the annotation txt file from csv
with open('bk100_train.txt','a') as f:
    for label, name, x2, x1, y2, y1 in csvtrain.iloc[:,[0,1,2,3,4,5]].values:
        f.write(str('/C:/Users/mayank singh/Desktop/berekely/'+name)+','+str(round(x1))+','+str(round(y1))+','+str(round(x2))+','+str(round(y2))+','+label+'\n')


Faster RCNN Implementation

!git clone https://github.com/kentaroy47/frcnn-from-scratch-with-keras.git

Train RPN for Frcnn

  • Upload the 'bk100_train.txt' file to the 'frcnn-from-scratch-with-keras' folder
cd /content/frcnn-from-scratch-with-keras

Start the RPN Training

#elen is the total number of annotation
!python train_rpn.py --network resnet50 -o simple -p bk100_train.txt --num_epochs 50 --elen 1286871 


Train Frcnn
  • Use the RPN weight 'rpn.resnet50.weights.88-0.63.hdf5' stored in the 'models/rpn' directory

!python train_frcnn.py --network resnet50 -o simple -p bk100_train.txt --num_epochs 50 --elen 1286871  --rpn models/rpn/rpn.resnet50.weights.88-0.63.hdf5

Test Frcnn

!python test_frcnn.py --network resnet50 -p test_image/ --load models/resnet50/voc.hdf5



In the next blog, we will cover  Object Detection with Retinanet.
https://sngurukuls247.blogspot.com/2020/05/deep-learning-6-object-detection-with.html

                                                                                                                                                  

Follow the link below to access Free Python Lectures-
https://www.youtube.com/sngurukul

Feel free to contact me on-

Saturday, May 9, 2020

Deep Learning 4-Classification_with_Transfer_Learning

May 09, 2020

Learn A-Z Deep Learning in 15 Days


In this lecture, we will learn about Classification with Transfer Learning.

What is Transfer Learning?
Transfer learning is a machine learning method where a model developed for a task is reused as the starting point for a model on a second task. It is a popular approach in deep learning where pre-trained models are used as the starting point. The pre-trained model is a model created by someone else to solve a similar problem. Instead of building a model from scratch to solve a similar problem, you use the model trained on other problems as a starting point.


Imp: Use a Kaggle kernel to run the code. Download the Cats vs Dogs dataset from the link below.

import cv2
import os 
import numpy as np


Unzip  the Dataset

!unzip ../input/train.zip

!unzip ../input/test1.zip


Import the Training Data

IMAGE_WIDTH = 128
IMAGE_HEIGHT = 128
IMAGE_CHANNELS = 3   # VGG16 with ImageNet weights expects 3-channel (RGB) input
IMAGE_SIZE=(IMAGE_WIDTH, IMAGE_HEIGHT)

directory = "/kaggle/working/train"
data = []
label = []

for filename in os.listdir(directory):

    image = cv2.imread(directory+r'/'+filename)   # read in color (3 channels) for the VGG16 input
    
    if image is None:
        continue
    image = cv2.resize(image,IMAGE_SIZE)
    
    category = filename.split('.')[0]
    if category == 'dog':
        label.append(1)
    else:
        label.append(0)

    data.append(image/255)


List to Array Conversion

data = np.array(data)      # shape: (num_images, height, width, 3)
label = np.array(label)
print(data.shape)
print(label.shape)


Train Test Split

from sklearn.model_selection import train_test_split
x_train, x_val, y_train, y_val = train_test_split(data, label, test_size=0.3, random_state=42) 


One hot encoding


from keras.utils import np_utils
y_train = np_utils.to_categorical(y_train,num_classes=2)
y_val = np_utils.to_categorical(y_val,num_classes=2)


Transfer Learning


from keras import applications
# load the VGG16 network, ensuring the head FC layer sets are left off
baseModel = applications.VGG16(weights = 'imagenet',              
                               include_top = False,
                               pooling = None,
                               input_shape = (IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS))
   
## You can use applications.ResNet50, applications.VGG16 pooling='avg'
baseModel.summary()




from keras import optimizers
from keras.layers import Flatten, Dense, Dropout
from keras.models import Model

# construct the head of the model that will be placed on top of the base model
headModel = baseModel.output
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(512, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(2, activation="softmax")(headModel)

# place the head FC model on top of the base model (this will become
# the actual model we will train)
model = Model(inputs=baseModel.input, outputs=headModel)

# loop over all layers in the base model and freeze them so they will
# *not* be updated during the training process
for layer in baseModel.layers:
    layer.trainable = False

model.summary()
# note: lr=1e-8 is extremely small; learning rates around 1e-3 to 1e-4 are more typical for training a new head
model.compile(loss='categorical_crossentropy', optimizer=optimizers.Adam(lr=1e-8), metrics=['acc'])


Callbacks


from keras.callbacks import ModelCheckpoint, EarlyStopping

filepath = "./cp-{epoch:02d}.h5"

checkpoint = ModelCheckpoint(filepath,
                             monitor="val_loss",
                             mode="min",
                             save_best_only = True,
                             verbose=1)


earlystop = EarlyStopping(monitor = 'val_loss', 
                          min_delta = 0, 
                          patience =4,
                          verbose = 1,
                          restore_best_weights = True)

# put our call backs into a callback list
callbacks = [earlystop, checkpoint]


Start the Training

model.fit(x_train,y_train,validation_data=(x_val,y_val),epochs=50,batch_size=64, callbacks = callbacks)


Visualize Graph


### Graph Epoch vs acc AND Epoch vs Loss

import matplotlib.pyplot as plt

acc = model.history.history['acc']
val_acc = model.history.history['val_acc']
loss = model.history.history['loss']
val_loss = model.history.history['val_loss']
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, 'blue', label='Training acc')
plt.plot(epochs, val_acc, 'red', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'blue', label='Training loss')
plt.plot(epochs, val_loss, 'red', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()



Import test dataset


IMAGE_WIDTH = 128
IMAGE_HEIGHT = 128
IMAGE_CHANNELS = 3   # keep consistent with the training images (RGB)
IMAGE_SIZE=(IMAGE_WIDTH, IMAGE_HEIGHT)

directory = "/kaggle/working/test1"
data = []


for filename in os.listdir(directory):

    image = cv2.imread(directory+r'/'+filename)   # read in color, to match the training images
    if image is None:
        continue
    image = cv2.resize(image,IMAGE_SIZE)
    
    data.append(image/255)


List to Array Conversion


data = np.array(data)
test_image = data          # already shaped (num_images, height, width, 3)


Make the Prediction

predictions = model.predict(test_image)
results = np.argmax(predictions, axis = 1)


Convert the label into the category

key={0:'cat',1:'dog'}
label_prediction=[key[r] for r in results]


Display the Result

import matplotlib.image as img
import matplotlib.pyplot as plt

nb_rows = 3
nb_cols = 3
fig, axs = plt.subplots(nb_rows, nb_cols, figsize=(6, 6), dpi =100)

n = 0
for i in range(0, nb_rows):
    for j in range(0, nb_cols):
        axs[i,j].set_title(label_prediction[n])
        axs[i,j].imshow(data[n][:, :, ::-1])   # convert BGR (OpenCV order) to RGB for display
        n += 1  
        
plt.tight_layout()
plt.show()


Create the Dataframe


import pandas as pd

df=pd.DataFrame(data={'imagename':os.listdir(directory), 'predicted_labels': label_prediction})
df.head()


Save the dataframe in csv format


df.to_csv('submission_new_model.csv', index=False, header=True)



In the next blog, we will start Object Detection with Region-CNN.
https://sngurukuls247.blogspot.com/2020/05/deep-learning-5-objection-detection.html

                                                                                                                                                    

Follow the link below to access Free Python Lectures-
https://www.youtube.com/sngurukul

Feel free to contact me on-
Email - sn.gurukul24.7uk@gmail.com