In machine learning, classification is a predictive modelling problem in which each input is assigned to one of a set of predefined classes. Predicting Yes or No, or True or False, is binary classification because the output is limited to two labels.
When the output can take more than two classes, such as one of several age groups, the problem is called multiclass classification. Classification is one of the most common types of ML problem and appears in a wide range of use cases. Many machine learning models can be used for it, ranging from bagging to boosting techniques.
Although classical ML models handle many classification use cases well, neural networks come into the picture when there is a large number of output classes and a large amount of data to support the performance of the model. Going forward, we’ll look at how to implement a classification model using neural networks with Keras (Python).
Neural networks are loosely modelled on the way the human brain learns. An artificial neural network consists of neurons, which are arranged in layers. Each neuron holds weights and a bias, and these are the parameters tuned during training.
The output of each layer is passed on to the next layer. Each layer applies a nonlinear activation function to its output, which lets the network learn relationships that are not linear. The final layer, called the output layer, produces the prediction.
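The layer-to-layer flow described above can be sketched in plain Python. This is only an illustration of the idea, not how Keras computes it internally; the helper name `dense_forward` and all weight values are made up for the example.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_forward(inputs, weights, biases, activation):
    """Compute one fully connected layer: activation(w . x + b) for each neuron."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs

# Hypothetical values, chosen only for illustration.
x = [1.0, 2.0]                                        # two input features
hidden_w = [[0.5, 0.25], [-1.0, 0.25], [0.25, 0.25]]  # 3 neurons, 2 weights each
hidden_b = [0.0, 0.0, 0.5]
output_w = [[1.0, 1.0, -1.0]]                         # 1 output neuron, 3 weights
output_b = [0.0]

# The hidden layer output becomes the input of the output layer.
hidden = dense_forward(x, hidden_w, hidden_b, relu)
prediction = dense_forward(hidden, output_w, output_b, sigmoid)

print(hidden, prediction)
```

The second hidden neuron receives a negative sum, so ReLU sets its output to zero, while the sigmoid on the output neuron keeps the final value between 0 and 1.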
The weights associated with the neurons, which determine the overall predictions, are updated repeatedly during training, over many epochs. Each neural network is given a cost function, which is minimised as learning continues. An optimiser decides how the weights are adjusted, with the learning rate controlling the size of each adjustment. The weights that give the lowest value of the cost function are the ones the model ends up using.
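To make the idea of minimising a cost function concrete, here is a minimal sketch that trains a single sigmoid neuron with plain full-batch gradient descent. It is a simplification: the toy data, learning rate and epoch count are assumptions chosen for illustration, and real Keras training uses more elaborate optimisers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(w, b, data):
    """Average binary cross-entropy of a one-input sigmoid neuron."""
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(data)

# Toy data made up for illustration: the label is 1 when x is positive.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

w, b = 0.0, 0.0        # initial weight and bias
learning_rate = 0.5    # size of each adjustment
losses = []

for epoch in range(20):
    grad_w = grad_b = 0.0
    for x, y in data:
        error = sigmoid(w * x + b) - y   # gradient of the loss w.r.t. the pre-activation
        grad_w += error * x / len(data)
        grad_b += error / len(data)
    w -= learning_rate * grad_w          # step against the gradient
    b -= learning_rate * grad_b
    losses.append(mean_loss(w, b, data))

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}, w={w:.3f}")
```

Each pass over the data nudges the weight in the direction that lowers the cost, so the recorded loss shrinks from one epoch to the next.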
For this article, we will use Keras to build the neural network. Keras ships with TensorFlow and can be imported in Python using the following commands.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
Dataset and Target variable
We will use the Pima Indians Diabetes dataset, which has the following features:
Input Variables (X):
- Pregnancies: Number of times pregnant
- Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
- BloodPressure: Diastolic blood pressure (mm Hg)
- SkinThickness: Triceps skin fold thickness (mm)
- Insulin: 2-Hour serum insulin (mu U/ml)
- BMI: Body mass index (weight in kg/(height in m)^2)
- DiabetesPedigreeFunction: Diabetes pedigree function
- Age: Age (years)
Output Variables (y):
- Outcome: Class variable (0 or 1), indicating whether the patient has diabetes
# load the dataset (loadtxt comes from NumPy)
from numpy import loadtxt
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
# split data into X (input) and y (output)
X = dataset[:,0:8]
y = dataset[:,8]
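The column split can be tried without the real file. The sketch below uses two made-up rows in the same nine-column layout and plain Python lists instead of NumPy; the names `sample_csv`, `X_sample` and `y_sample` are hypothetical and the values are placeholders, not records from the dataset.

```python
import csv
import io

# Two placeholder rows: 8 feature columns followed by the class label.
# These values are invented for illustration only.
sample_csv = "1,2,3,4,5,6,7,8,0\n9,8,7,6,5,4,3,2,1\n"

rows = [[float(value) for value in row]
        for row in csv.reader(io.StringIO(sample_csv))]

X_sample = [row[0:8] for row in rows]   # columns 0..7 are the input features
y_sample = [row[8] for row in rows]     # column 8 is the class label

print(len(X_sample[0]), y_sample)
```

The slice `0:8` stops before index 8, so the label column never leaks into the inputs.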
Define Keras Model
We can start building the neural network with a Sequential model, in which layers are added one after another. This makes it easy to build a neural net architecture and experiment with its shape and layers. The first layer needs to know the number of input features, which can be set with the input_dim argument. We will set it to 8, one for each input variable.
Creating neural networks is not an easy process. It usually takes a good deal of trial and error before a good model is built. We will build a fully connected network structure using the Dense class in Keras. The first argument to a Dense layer is the number of neurons in that layer.
The activation function is set using the activation argument. We will use the Rectified Linear Unit (ReLU) as the activation function for the hidden layer. There are other options such as sigmoid or tanh, but ReLU is a common default for hidden layers and often trains well. For a binary classification problem, the output layer typically has a single neuron with a sigmoid activation so that the prediction falls between 0 and 1.
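# define the keras model
model = Sequential()
# hidden layer: 12 neurons, expects 8 input features
model.add(Dense(12, input_dim=8, activation='relu'))
# output layer: one sigmoid neuron, needed for the binary cross-entropy loss
model.add(Dense(1, activation='sigmoid'))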
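To see how the activation options differ, the following sketch evaluates ReLU, sigmoid and tanh on a few sample inputs. The input values are arbitrary, and these are plain-Python stand-ins written for illustration rather than the Keras implementations.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    return math.tanh(z)

# Arbitrary sample inputs spanning negative and positive values.
for z in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"z={z:+.1f}  relu={relu(z):.3f}  sigmoid={sigmoid(z):.3f}  tanh={tanh(z):+.3f}")
```

ReLU passes positive values through unchanged and zeroes out negative ones, while sigmoid squashes every input into the range 0 to 1 and tanh into the range -1 to 1.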
Compile Keras Model
Compiling the model is the next step after defining it. Keras relies on a backend, TensorFlow here, to carry out the numerical computation. Compilation is the step where the settings for training are specified: the loss function, the optimiser and the metrics to report. The backend decides how to run the model on the available hardware, such as a CPU, a GPU or a distributed setup.
We have to specify a loss function, which measures how far the predictions are from the true labels for a given set of weights. The optimiser uses that loss to search for better weights. In this case we will use binary cross-entropy as the loss function. For the optimiser, we will use Adam, an efficient variant of stochastic gradient descent that adapts the step size for each weight.
It is a popular default because it tends to work well with little tuning. Finally, because this is a classification problem, we will collect and report the classification accuracy, which is requested via the metrics argument.
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
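As a rough illustration of what the loss and the metric measure, the sketch below computes binary cross-entropy and accuracy by hand. The labels and predicted probabilities are hypothetical, and the clipping value `eps` and the 0.5 threshold are assumptions of this sketch rather than a statement of the exact Keras internals.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps clips probabilities away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that land on the correct side of the threshold."""
    correct = sum(1 for y, p in zip(y_true, y_prob) if (p > threshold) == (y == 1))
    return correct / len(y_true)

# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.1]

loss = binary_cross_entropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)

print(f"loss={loss:.4f} accuracy={acc:.2f}")
```

The third prediction (0.4 for a true label of 1) is on the wrong side of the threshold, so accuracy is 0.75. The loss goes further than accuracy because it also penalises predictions that are correct but not confident.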
Model fit and Evaluation
Fitting the model is another name for training it. After the model is compiled, it is ready to go over the data and train itself. The fit() function from Keras is used for model training. Two main parameters are set before training:
- Epochs: The number of passes through the whole dataset. One epoch is one complete pass.
- Batch Size: The number of samples processed before the weights are updated. Each epoch is divided into batches of this size, although the last batch may be smaller.
# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10)
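The relationship between epochs, batch size and weight updates is simple arithmetic. The sketch below assumes the dataset has 768 rows, which is the commonly cited size of this dataset; substitute the row count of your own file.

```python
import math

n_samples = 768    # assumed row count; check your own copy of the data
batch_size = 10
epochs = 150

# The last batch of an epoch may hold fewer than batch_size samples.
batches_per_epoch = math.ceil(n_samples / batch_size)
total_updates = batches_per_epoch * epochs

print(batches_per_epoch, total_updates)
```

Under these assumptions, each epoch performs 77 weight updates and the full run performs 11550. A larger batch size means fewer, smoother updates per epoch.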
Training runs on a CPU or a GPU. It can be a very long process depending on the number of epochs, the batch size and, most importantly, the size of the data.
We can evaluate the model using the evaluate() function. Evaluating on the training dataset only shows how well the model fits data it has already seen. For an estimate of performance on new data, the dataset can be divided into training and testing sets before fitting, with the testing X and y passed to evaluate().
For each input and output pair, evaluate() generates a prediction and collects scores, including the average loss and any metrics we have configured, such as accuracy.
The evaluate() function returns a list of two values. The first is the model’s loss on the dataset and the second is the model’s accuracy on the dataset. We are only interested in reporting the accuracy, so we will ignore the loss value.
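# evaluate the keras model
# X and y are the training data here; pass held-out test arrays instead if the data was split before fitting
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))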
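One possible way to hold out a test set is sketched below using only the standard library. The helper name `train_test_split_indices`, the 20% test fraction and the seed are assumptions for illustration; libraries such as scikit-learn provide a ready-made equivalent.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Return shuffled (train_indices, test_indices) with no overlap."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)      # seeded so the split is repeatable
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]

# Small demonstration with 10 rows; use len(X) for a real dataset.
train_idx, test_idx = train_test_split_indices(10, test_fraction=0.2, seed=0)

print(len(train_idx), len(test_idx))
```

With NumPy arrays, the rows selected by `X[train_idx]` and `y[train_idx]` would go to fit(), and those selected by `X[test_idx]` and `y[test_idx]` to evaluate(), so the reported accuracy reflects data the model has not seen.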
We created and evaluated a neural network for classification. Although the dataset used here was small, neural networks are generally better suited to large numerical datasets.
How can neural networks be used for classification?
Classification is about assigning objects to categories. When there are more than two categories, the problem is a multiclass one. In a neural network, neurons are organised into layers. The first layer processes the input and produces an output, which is then passed through the remaining layers to produce the final output. In the output layer, each neuron can represent one class, so the same input yields a score for every class. This arrangement is known as a multi-layer perceptron. The type of neural network best suited to classification depends on the dataset, but neural networks are widely used for classification problems.
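For the multiclass case, a common way to turn the raw outputs of the final layer into class probabilities is the softmax function. The sketch below is a plain-Python illustration with made-up scores; in Keras the usual counterpart is an output layer such as `Dense(n_classes, activation='softmax')` paired with a categorical cross-entropy loss.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]   # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]   # hypothetical raw outputs for three classes
probs = softmax(scores)
predicted_class = probs.index(max(probs))   # index of the most likely class

print([round(p, 3) for p in probs], predicted_class)
```

The class with the largest raw score also receives the largest probability, and that index is taken as the prediction.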
Why are artificial neural networks good for classification?
In order to answer this question, we need to understand the basic principle of neural networks and the problem that neural networks are designed to solve. As the name suggests, neural networks are a biologically inspired model of the human brain. The basic idea is that we want to model a neuron as a mathematical function. Every neuron takes inputs from other neurons and computes an output. Then we connect these neurons in a way that mimics the neural network in the brain. The objective is to learn a network that can take in some data and produce an appropriate output.
When should we use Artificial Neural Networks?
Artificial Neural Networks are used in situations where you’re trying to replicate tasks that living organisms perform well or to detect patterns in data. Medical diagnosis, speech recognition, data visualisation and handwritten digit recognition are all good use cases for an ANN. They are also used when there is a need to model complex relationships between inputs and outputs. For example, the variables may be noisy, making the relationships between them difficult to describe by hand. In such cases, an artificial neural network can learn these relationships directly from the data.