CNN image classification

Introduction

Dogs vs. Cats is a standard computer vision dataset. The task is to classify photos as containing either a dog or a cat. Although the problem sounds simple, it was only effectively solved in the last few years using deep learning convolutional neural networks. While the dataset is now effectively solved, it remains a useful basis for learning and practicing how to develop, evaluate, and use convolutional deep learning neural networks for image classification from scratch.

In this article, we will walk through how to build an image classification model based on a Convolutional Neural Network (CNN), step by step.

Description

Prediction Problem

We are given a set of dog and cat images. The task is to build a model that predicts the category of the animal in each image: dog or cat.

Data overview

  • The dataset can be downloaded for free from the Kaggle website.
  • First, sign up for a Kaggle account.
  • Visit the Dogs vs. Cats Data page to download the dataset.
  • Click the Download All button.
  • Unzip the 850-megabyte file.
  • The data used here is a subset of the Kaggle dog and cat dataset.
  • There are 10,000 images in total: 80 percent in the training set and 20 percent in the test set.
  • 4,000 of the images in the training set are dogs.
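Before training, it can be worth checking that the unzipped images are split as described. The sketch below counts image files per class folder. It assumes a layout of one subfolder per class (for example cats and dogs) inside each split directory; those folder names and the helper names are assumptions for illustration, not something the dataset page guarantees.

```python
from pathlib import Path

# Assumed (hypothetical) layout: <split_dir>/<class_name>/<image files>
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(split_dir):
    """Return {class_name: number_of_images} for one split directory."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def train_fraction(n_train, n_test):
    """Share of all images that ended up in the training set."""
    return n_train / (n_train + n_test)
```

Calling count_images on the training directory and on the test directory, then passing the two totals to train_fraction, should give a value close to 0.8 if the split matches the description above.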

Develop a CNN Model

Generally, we need four steps to build a CNN model:

  • Convolution
  • Max pooling
  • Flattening, and
  • Full connection

  • An image is an array of pixel values, and a feature detector (also called a filter or kernel) is similarly a small array of numbers.
  • We slide each feature detector over the image and produce a new array of numbers that represents one feature of the image.
  • This operation between an input image and a feature detector, which results in a feature map, is called convolution, as shown in the figure below.
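To make the sliding operation concrete, here is a minimal plain-Python sketch. It is not how Keras implements convolution internally; it only illustrates the arithmetic, with no padding and a step of one pixel. As in most deep learning libraries, the detector is applied without being flipped. The toy image and the edge detector values are made up for illustration.

```python
def convolve2d(image, detector):
    """Slide `detector` over `image` (no padding, stride 1); return the feature map."""
    k_rows, k_cols = len(detector), len(detector[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the detector with the image patch under it and sum up.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * detector[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 4x4 greyscale "image" with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 detector: responds where brightness increases left to right.
edge_detector = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, edge_detector)
# feature_map is [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is largest exactly where the detector sits on the edge, which is the sense in which each feature map represents one feature of the image.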

Convolution

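from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

classifier = Sequential()
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
  • The first argument, 32, is the number of feature detectors, and (3, 3) is the size of each one.
  • input_shape = (64, 64, 3) means each input is a 64×64 image with 3 color channels.
  • The final argument is the activation function.
  • We use ReLU to replace any negative values in the feature maps with zero.
  • This is needed because, depending on the parameters used in convolution, the feature maps can contain negative values.
  • Removing the negative values adds non-linearity, which the network needs for a non-linear classification problem.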
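The effect of ReLU on a feature map can be shown in a couple of lines. This is a plain-Python illustration with made-up values, not the Keras implementation.

```python
def relu(feature_map):
    """Replace every negative value in a 2D feature map with zero."""
    return [[max(0, value) for value in row] for row in feature_map]


feature_map = [
    [-3, 2, 0],
    [5, -1, 4],
]
activated = relu(feature_map)
# activated is [[0, 2, 0], [5, 0, 4]]
```

Positive values pass through unchanged, so the strongest responses of each feature detector are preserved.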

Max pooling

  • Max pooling slides a small window over a feature map and keeps only the largest value in each window.
  • Applying max pooling to every feature map produces a pooling layer.
  • The purpose of max pooling is to reduce the number of nodes in the fully connected layers without losing the main features or the spatial structure of the images.
  • We use the MaxPooling2D() function to add the pooling layer.
  • In general, we use a 2×2 filter for pooling.
classifier.add(MaxPooling2D(pool_size = (2, 2)))
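As a rough illustration of what a 2×2 pooling window does, the sketch below pools a small made-up feature map in plain Python. It assumes non-overlapping windows and drops any leftover row or column, which matches the Keras defaults as far as we know.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    pooled = []
    for r in range(rows):
        pooled_row = []
        for c in range(cols):
            window = [
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 5, 7],
    [1, 1, 8, 3],
]
pooled = max_pool2d(feature_map)
# pooled is [[6, 2], [2, 8]]
```

The 4×4 map shrinks to 2×2, a quarter of the values, while the largest response in each region survives.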

Flattening

  • Flattening converts all pooled feature maps into a single vector that serves as the input for the fully connected layers, as shown in the figure below.
  • In Keras, this is done by adding a Flatten() layer.
classifier.add(Flatten())
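The sketch below shows flattening on a tiny made-up example and estimates how long the resulting vector is for a given input size. The helper names are hypothetical, and the size calculation assumes an unpadded 3×3 convolution with a step of one pixel followed by 2×2 pooling, which is what the layer settings in this article imply.

```python
def flatten(pooled):
    """Unroll a height x width x channels nested list into one flat list."""
    return [value for row in pooled for pixel in row for value in pixel]


def flattened_length(input_size, conv_blocks=1, filters=32, kernel=3, pool=2):
    """Length of the flat vector after `conv_blocks` rounds of conv + pooling."""
    size = input_size
    for _ in range(conv_blocks):
        size = size - kernel + 1  # unpadded convolution, stride 1
        size = size // pool       # non-overlapping max pooling
    return size * size * filters


# Tiny 2x2 pooled output with 2 channels (made-up values).
pooled = [
    [[1, 10], [2, 20]],
    [[3, 30], [4, 40]],
]
vector = flatten(pooled)
# vector is [1, 10, 2, 20, 3, 30, 4, 40]
```

For a 64×64 input and 32 feature detectors, flattened_length(64) gives 30,752 values. Passing conv_blocks=2 models a network with a second convolution and pooling block, which reduces the vector to 6,272 values.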

Full connection

  • The steps above turned an image into a one-dimensional vector.
  • Now we will build a classifier that uses this vector as the input layer.
  • First, we create a hidden layer. units is the number of nodes in the hidden layer (older Keras versions call this argument output_dim).
  • As is common practice, we start with 128 nodes and use ReLU as the activation function.
classifier.add(Dense(units = 128, activation = 'relu'))
  • Then we add an output layer. For binary classification, units is 1 and the activation function is sigmoid.
classifier.add(Dense(units = 1, activation = 'sigmoid'))
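To see what the hidden and output layers compute, here is a minimal plain-Python forward pass. Each node takes a weighted sum of its inputs, adds a bias, and applies the activation function. The weights, biases, and layer sizes are made up and far smaller than the 128-node layer used here; in the real model they are learned during training.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """Fully connected layer; weights[j] holds the weights of output node j."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs


# Made-up flattened vector and parameters, for illustration only.
flat_vector = [1.0, 2.0, 3.0]
hidden = dense(flat_vector, [[0.5, 0.5, 0.0], [-1.0, 0.0, 0.0]], [0.0, 0.5], relu)
probability = dense(hidden, [[2.0, 1.0]], [-3.0], sigmoid)
# hidden is [1.5, 0.0] and probability is [0.5]
```

The sigmoid squeezes the output into the range 0 to 1, so it can be read as the probability of the class labeled 1.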

Final model structure

  • With all layers added, let’s compile the CNN by selecting an optimizer (a stochastic gradient descent algorithm; here Adam), a loss function, and performance metrics.
  • We use binary_crossentropy for binary classification and categorical_crossentropy for multi-class classification problems.
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
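For intuition about the loss and the metric chosen in the compile step, the sketch below computes binary cross-entropy and accuracy by hand for a few made-up predictions. The small eps value that keeps the logarithm finite and the 0.5 decision threshold are assumptions of this sketch, chosen to resemble common defaults.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[t*log(p) + (1-t)*log(1-p)] over all samples."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if (p > threshold) == bool(t))
    return correct / len(y_true)


labels = [1, 0, 1, 0]
predictions = [0.9, 0.2, 0.4, 0.6]
loss = binary_crossentropy(labels, predictions)
acc = accuracy(labels, predictions)
# acc is 0.5: the first two predictions are right, the last two are wrong
```

The loss punishes confident wrong answers much more than hesitant ones, which is why it is a better training signal than accuracy alone.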
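  • Next, we load the images with ImageDataGenerator, which rescales the pixel values and augments the training images, and then train the model.
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
train_set = train_datagen.flow_from_directory('dataset/training_set', target_size=(64, 64), batch_size=32, class_mode='binary')
test_set = test_datagen.flow_from_directory('dataset/test_set', target_size=(64, 64), batch_size=32, class_mode='binary')
# Integer division keeps the step counts whole numbers.
# fit_generator is deprecated in recent Keras releases, where classifier.fit accepts the same generators.
classifier.fit_generator(train_set, steps_per_epoch=8000 // 32, epochs=25, validation_data=test_set, validation_steps=2000 // 32)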
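The step counts passed to training follow from the number of images and the batch size. The helper below is a hypothetical illustration of that arithmetic. Note that 2,000 test images do not divide evenly into batches of 32, so one has to decide whether to count the final partial batch.

```python
import math


def steps_for(num_images, batch_size, drop_partial_batch=False):
    """Number of batches needed to go through `num_images` once."""
    if drop_partial_batch:
        return num_images // batch_size
    return math.ceil(num_images / batch_size)


train_steps = steps_for(8000, 32)        # 250 full batches
val_steps_all = steps_for(2000, 32)      # 63: the last batch has only 16 images
val_steps_full = steps_for(2000, 32, drop_partial_batch=True)  # 62 full batches
```

Rounding up covers every test image once per validation pass, while rounding down uses only full batches.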
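  • To improve accuracy, we can add a second convolutional layer followed by another pooling layer, placed after the first pooling layer and before flattening.
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))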
  • Run the model on the training and test sets again. We end up with an improved training accuracy of 91 percent and a test accuracy of 82 percent.
  • We create a folder ‘single_prediction’ for the images to be predicted, as shown in the figure below.

  • We use the image module from Keras to load test images.
  • Set the target_size of the image to (64, 64).
import numpy as np
from keras.preprocessing import image
test_image = image.load_img('dataset/single_prediction/cat_or_dog_1.jpg', target_size = (64, 64))
  • Convert the loaded image to a 3D array so that it includes the channel dimension.
test_image = image.img_to_array(test_image)
  • Now add the batch dimension at index 0.
test_image = np.expand_dims(test_image, axis = 0)
  • Make the prediction.
result = classifier.predict(test_image)
  • We obtained a result of 1. To see the mapping between the animals and their numerical values, we use:
train_set.class_indices
  • The mapping shows that 0 is a cat and 1 is a dog, so our convolutional neural network made a correct prediction.
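Turning the numerical result into a readable label can be sketched as follows. The helper name, the 0.5 threshold, and the class names cats and dogs are assumptions for illustration; the actual keys of class_indices depend on the folder names in the training directory.

```python
def label_from_prediction(probability, class_indices, threshold=0.5):
    """Map a sigmoid output to the class name it most likely refers to."""
    # Invert {name: index} into {index: name}.
    index_to_label = {index: label for label, index in class_indices.items()}
    predicted_index = 1 if probability >= threshold else 0
    return index_to_label[predicted_index]


# Assumed mapping, consistent with 0 being a cat and 1 being a dog.
class_indices = {"cats": 0, "dogs": 1}
label = label_from_prediction(1.0, class_indices)
# label is "dogs"
```

With the Keras model, the probability would be read from result[0][0]. Since the training images were rescaled by 1/255, dividing test_image by 255 before calling predict would keep the inputs consistent with training; the steps above do not do this, so treat it as a suggestion to verify rather than part of the original method.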