Stock Market Prediction Using Machine Learning [Step-by-Step Implementation]


Predicting and analysing the stock market is one of the most complicated tasks in finance. The market is volatile, and the value of a particular stock depends on many other dependent and independent factors. Together, these make it very difficult for any stock market analyst to predict rises and falls with a high degree of accuracy.


However, with the advent of Machine Learning and its robust algorithms, recent work in market analysis and stock market prediction has started to incorporate such techniques to understand stock market data.

In short, many organisations now use Machine Learning algorithms to analyse and predict stock values. This article walks through a simple implementation of analysing and predicting the stock values of Microsoft Corporation using Machine Learning in Python.


Problem Statement 

Before we get into the implementation, let us look at the data we will be working with. Here, we analyse the stock value of Microsoft Corporation (MSFT), which is listed on the NASDAQ (National Association of Securities Dealers Automated Quotations) exchange. The stock data comes as a comma-separated values (.csv) file, which can be opened and viewed in Excel or any other spreadsheet application.

MSFT is traded on NASDAQ, and its values are updated on every working day of the stock market. The market does not trade on Saturdays and Sundays, so there are gaps between some consecutive dates in the data. For each date, the file records the Opening Value of the stock, the Highest and Lowest values on that day, and the Closing Value at the end of the day.

The Adjusted Close Value is the closing value adjusted for corporate actions such as dividends. The total volume of shares traded is also given. With this data, it is up to the Machine Learning engineer or Data Scientist to study it and implement algorithms that can extract patterns from the historical data of the Microsoft Corporation stock.

Long Short-Term Memory 

To develop a Machine Learning model to predict the stock prices of Microsoft Corporation, we will use Long Short-Term Memory (LSTM). By definition, LSTM is an artificial recurrent neural network (RNN) architecture used in deep learning. An LSTM makes small modifications to the information it carries by means of multiplications and additions.

Unlike standard feed-forward neural networks, LSTM has feedback connections. It can process not only single data points (such as images) but also entire sequences of data (such as speech or video). To understand the concept behind LSTM, let us take a simple example of an online customer review of a mobile phone.


Suppose we want to buy a mobile phone. We usually refer to online reviews written by verified users. Depending on their opinions, we decide whether the phone is good or bad and then buy it. As we read the reviews, we look for keywords such as “amazing”, “good camera”, “best battery backup”, and many other terms related to a mobile phone.

We tend to ignore common English words such as “it”, “gave”, “this”, etc. Thus, when we decide whether to buy the phone, we remember only the keywords mentioned above and most probably forget the other words.

This is the same way in which the Long Short-Term Memory algorithm works. It retains the relevant information and uses it to make predictions, while ignoring the irrelevant data. In this way, we want to build an LSTM model that recognises the essential information about the stock and leaves out the noise.


Though the structure of an LSTM architecture may seem intricate at first, it is sufficient to remember that an LSTM is an advanced version of a Recurrent Neural Network that retains memory to process sequences of data. It can remove or add information to the cell state, carefully regulated by structures called gates.

The LSTM unit comprises a cell, an input gate, an output gate, and a forget gate. The cell remembers values over arbitrary time intervals, and the three gates regulate the flow of information into and out of the cell.
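As a rough illustration of the cell and gates described above, the following NumPy sketch computes a single LSTM step using the standard gate equations. The helper name `lstm_step`, the way the weights are stacked, and the random weights are assumptions made for this example only; Keras handles all of this internally.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step (hypothetical helper for illustration).

    W: (4*units, n_features), U: (4*units, units), b: (4*units,)
    Rows are stacked in the order: input gate, forget gate, candidate, output gate.
    """
    units = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:units])              # input gate: how much new information to write
    f = sigmoid(z[units:2 * units])      # forget gate: how much of the old cell state to keep
    g = np.tanh(z[2 * units:3 * units])  # candidate values for the cell state
    o = sigmoid(z[3 * units:4 * units])  # output gate: how much of the cell to expose
    c = f * c_prev + i * g               # new cell state
    h = o * np.tanh(c)                   # new hidden state (the unit's output)
    return h, c

rng = np.random.default_rng(0)
n_features, units = 4, 3                 # 4 inputs, as in this article; 3 units to keep it small
W = rng.normal(size=(4 * units, n_features))
U = rng.normal(size=(4 * units, units))
b = np.zeros(4 * units)

x = rng.normal(size=n_features)
h, c = lstm_step(x, np.zeros(units), np.zeros(units), W, U, b)
print(h.shape, c.shape)
```

The forget gate scales the previous cell state and the input gate scales the new candidate values, which is the "remove or add information" behaviour described above.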

Program Implementation

We shall move on to the part where we put the LSTM into use in predicting the stock value using Machine Learning in Python.

Step 1 – Importing the Libraries

The first step is to import the libraries needed to preprocess the Microsoft Corporation stock data, build the LSTM model, and visualise its outputs. For this, we use the Keras library, which runs on top of the TensorFlow framework. The required modules are imported from Keras individually.

#Importing the Libraries

import pandas as pd

import numpy as np

%matplotlib inline

import matplotlib.pyplot as plt

import matplotlib

import matplotlib.dates as mdates

from sklearn.preprocessing import MinMaxScaler

from sklearn.model_selection import TimeSeriesSplit

from sklearn.metrics import mean_squared_error, r2_score

from sklearn import linear_model

from keras.models import Sequential, load_model

from keras.layers import LSTM, Dense, Dropout

import keras.backend as K

from keras.callbacks import EarlyStopping

from keras.optimizers import Adam

from keras.utils import plot_model

Step 2 – Getting and Visualising the Data

Using pandas, we load the stock data from a local comma-separated values (.csv) file and store it in a pandas DataFrame. Finally, we also view the data.

#Get the Dataset

df = pd.read_csv("MicrosoftStockData.csv", na_values=['null'], index_col='Date', parse_dates=True)

df.head()
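To see what the `read_csv` arguments do, here is a minimal sketch that parses a tiny in-memory CSV with the same column layout. The two rows are taken from the start of the dataset, except that one `High` value has been replaced with `null` purely for illustration; the variable names `csv_text` and `sample` are assumptions for this example.

```python
import io
import pandas as pd

# Two rows in the dataset's layout; the "null" entry is inserted for illustration only
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
1990-01-02,0.605903,0.616319,0.598090,0.616319,0.447268,53033600
1990-01-03,0.621528,null,0.614583,0.619792,0.449788,113772800
"""

sample = pd.read_csv(
    io.StringIO(csv_text),
    na_values=['null'],   # treat the text "null" as a missing value
    index_col='Date',     # use the Date column as the index
    parse_dates=True,     # parse the index into datetime values
)

print(type(sample.index).__name__)
print(sample['High'].isnull().sum())
```

With the dates parsed into a DatetimeIndex, plots of the DataFrame get a proper time axis without further work.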



Step 3 – Print the DataFrame Shape and Check for Null Values. 

In this step, we first print the shape of the dataset and then check that the DataFrame contains no null values. Null values tend to cause problems during training, because missing inputs propagate through the model's calculations and corrupt the loss.

#Print Dataframe shape and Check for Null Values

print("Dataframe Shape: ", df.shape)

print("Null Value Present: ", df.isnull().values.any())

>> Dataframe Shape: (7334, 6)

>> Null Value Present: False


Date        Open      High      Low       Close     Adj Close  Volume
1990-01-02  0.605903  0.616319  0.598090  0.616319  0.447268   53033600
1990-01-03  0.621528  0.626736  0.614583  0.619792  0.449788   113772800
1990-01-04  0.619792  0.638889  0.616319  0.638021  0.463017   125740800
1990-01-05  0.635417  0.638889  0.621528  0.622396  0.451678   69564800
1990-01-08  0.621528  0.631944  0.614583  0.631944  0.458607   58982400

Step 4 – Plotting the True Adjusted Close Value 

The value that the Machine Learning model will predict is the Adjusted Close Value, which represents the stock's closing value on a particular day of trading. We plot it first to see how it has moved over time.

#Plot the True Adj Close Value

df['Adj Close'].plot()

Step 5 – Setting the Target Variable and Selecting the Features

In the next step, we assign the output column to the target variable. In this case, it is the Adjusted Close value of the Microsoft stock. We also select the features that act as the independent variables for the target (dependent) variable. For training, we choose four features:

  • Open
  • High
  • Low
  • Volume

#Set Target Variable

output_var = pd.DataFrame(df['Adj Close'])

#Selecting the Features

features = ['Open', 'High', 'Low', 'Volume']

Step 6 – Scaling

To keep training stable, we scale the feature values down to the range 0 to 1. The features sit on very different scales (Volume is in the tens of millions, while the early prices are below 1), and without scaling the largest values would dominate the learning process. Scaling also tends to help the model converge faster. This is performed by the MinMaxScaler class of the scikit-learn library.


#Scaling

scaler = MinMaxScaler()

feature_transform = scaler.fit_transform(df[features])

feature_transform = pd.DataFrame(columns=features, data=feature_transform, index=df.index)

feature_transform.head()



Date        Open      High      Low       Volume
1990-01-02  0.000129  0.000105  0.000129  0.064837
1990-01-03  0.000265  0.000195  0.000273  0.144673
1990-01-04  0.000249  0.000300  0.000288  0.160404
1990-01-05  0.000386  0.000300  0.000334  0.086566
1990-01-08  0.000265  0.000240  0.000273  0.072656

As expected, the feature values are now scaled down to small values between 0 and 1, compared with the raw values shown earlier.
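The scaling itself is simple arithmetic: each value x in a column becomes (x - min) / (max - min). A minimal sketch on a made-up column, comparing the manual formula with MinMaxScaler (the numbers and variable names are for illustration only):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy feature column (values made up for illustration)
values = np.array([[10.0], [20.0], [25.0], [50.0]])

# Manual min-max scaling: (x - min) / (max - min), computed per column
col_min = values.min(axis=0)
col_max = values.max(axis=0)
manual = (values - col_min) / (col_max - col_min)

# The same transformation using scikit-learn
scaled = MinMaxScaler().fit_transform(values)

print(manual.ravel())
print(scaled.ravel())
```

The smallest value in the column maps to 0 and the largest to 1, which is why the early 1990 prices end up so close to 0 in the table above.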

Step 7 – Splitting to a Training Set and Test Set.

Before feeding the data into the model, we need to split the entire dataset into a training set and a test set. The LSTM model is trained on the training set, where backpropagation adjusts its weights, and is then evaluated on the unseen test set.

For this, we use the TimeSeriesSplit class of the scikit-learn library with the number of splits set to 10. TimeSeriesSplit produces 10 successive splits in which the training set grows and the test set always comes after it in time. The loop below overwrites the variables on every iteration, so after it finishes they hold the last split: the final 666 rows (about 9% of the data) form the test set, and the preceding 6,668 rows (about 91%) form the training set. The advantage of a time series split is that the model is never trained on data from later than the period it is tested on.

#Splitting to Training set and Test set

timesplit = TimeSeriesSplit(n_splits=10)

for train_index, test_index in timesplit.split(feature_transform):
    X_train, X_test = feature_transform[:len(train_index)], feature_transform[len(train_index):(len(train_index) + len(test_index))]
    y_train, y_test = output_var[:len(train_index)].values.ravel(), output_var[len(train_index):(len(train_index) + len(test_index))].values.ravel()
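To check what TimeSeriesSplit actually produces, this short sketch prints the training and test sizes of the first and last split for a dataset of 7,334 rows. The zero-filled `dummy` array is a stand-in for the scaled features, used here only so the example runs on its own.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

n_samples = 7334                   # number of rows reported in Step 3
dummy = np.zeros((n_samples, 4))   # placeholder for the four scaled features

# Collect (training size, test size) for each of the 10 splits
sizes = [(len(train_idx), len(test_idx))
         for train_idx, test_idx in TimeSeriesSplit(n_splits=10).split(dummy)]

print(len(sizes))
print(sizes[0])    # first split: small training set
print(sizes[-1])   # last split: the one left in the variables after the loop
```

Every split has a test set of 666 rows, and the training set grows from 674 rows in the first split to 6,668 rows in the last.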

Step 8 – Processing the Data For LSTM

Once the training and test sets are ready, we need to convert them into a format that the LSTM model will accept before feeding them in. We first convert the training and test data to NumPy arrays and then reshape them to the format (Number of Samples, 1, Number of Features), as the LSTM requires 3D input of the form (samples, time steps, features). Here, each sample consists of a single time step. Since the training set has 6,668 samples and 4 features, it is reshaped to (6668, 1, 4). The test set is reshaped in the same way, to (666, 1, 4).

#Process the data for LSTM

trainX = np.array(X_train)

testX = np.array(X_test)

X_train = trainX.reshape(X_train.shape[0], 1, X_train.shape[1])

X_test = testX.reshape(X_test.shape[0], 1, X_test.shape[1])
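The reshape only adds a middle axis of length 1 for the time step; the values themselves are unchanged. A small sketch on a made-up 5 x 4 array (the names `X` and `X_3d` are assumptions for this example):

```python
import numpy as np

samples, n_features = 5, 4
X = np.arange(samples * n_features, dtype=float).reshape(samples, n_features)

# Insert a time-step axis of length 1: (samples, features) -> (samples, 1, features)
X_3d = X.reshape(X.shape[0], 1, X.shape[1])

print(X.shape)
print(X_3d.shape)
```

Each row of the original table becomes a sequence of length one, so the LSTM here sees one day's features per sample rather than a window of several days.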

Step 9 – Building the LSTM Model

Now we come to the stage where we build the LSTM model. Here, we create a Sequential Keras model with one LSTM layer. The LSTM layer has 32 units, and it is followed by one Dense layer of 1 neuron.

We use the Adam optimizer and Mean Squared Error as the loss function when compiling the model, a common combination for regression with an LSTM. The model architecture is also plotted.

#Building the LSTM Model

lstm = Sequential()

lstm.add(LSTM(32, input_shape=(1, trainX.shape[1]), activation='relu', return_sequences=False))

lstm.add(Dense(1))

lstm.compile(loss='mean_squared_error', optimizer='adam')

plot_model(lstm, show_shapes=True, show_layer_names=True)
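To get a feel for the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes the default Keras layer settings (one bias vector per gate) and uses two hypothetical helper functions; the totals should match what `lstm.summary()` reports under those assumptions.

```python
def lstm_param_count(units, n_features):
    # Four weight sets (input gate, forget gate, candidate, output gate),
    # each with input weights, recurrent weights and a bias
    return 4 * (units * (n_features + units) + units)

def dense_param_count(n_inputs, n_outputs):
    # One weight per input-output pair plus one bias per output
    return n_inputs * n_outputs + n_outputs

lstm_params = lstm_param_count(units=32, n_features=4)
dense_params = dense_param_count(n_inputs=32, n_outputs=1)

print(lstm_params, dense_params, lstm_params + dense_params)
```

With fewer than five thousand parameters, this is a very small model, which is one reason each training epoch takes only a second or two.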

Step 10 – Training the Model

Next, we train the LSTM model designed above on the training data for 100 epochs with a batch size of 8, using the fit function.

#Model Training

history = lstm.fit(X_train, y_train, epochs=100, batch_size=8, verbose=1, shuffle=False)

Epoch 1/100

834/834 [==============================] – 3s 2ms/step – loss: 67.1211

Epoch 2/100

834/834 [==============================] – 1s 2ms/step – loss: 70.4911

Epoch 3/100

834/834 [==============================] – 1s 2ms/step – loss: 48.8155

Epoch 4/100

834/834 [==============================] – 1s 2ms/step – loss: 21.5447

Epoch 5/100

834/834 [==============================] – 1s 2ms/step – loss: 6.1709

Epoch 6/100

834/834 [==============================] – 1s 2ms/step – loss: 1.8726

Epoch 7/100

834/834 [==============================] – 1s 2ms/step – loss: 0.9380

Epoch 8/100

834/834 [==============================] – 2s 2ms/step – loss: 0.6566

Epoch 9/100

834/834 [==============================] – 1s 2ms/step – loss: 0.5369

Epoch 10/100

834/834 [==============================] – 2s 2ms/step – loss: 0.4761

...

Epoch 95/100

834/834 [==============================] – 1s 2ms/step – loss: 0.4542

Epoch 96/100

834/834 [==============================] – 2s 2ms/step – loss: 0.4553

Epoch 97/100

834/834 [==============================] – 1s 2ms/step – loss: 0.4565

Epoch 98/100

834/834 [==============================] – 1s 2ms/step – loss: 0.4576

Epoch 99/100

834/834 [==============================] – 1s 2ms/step – loss: 0.4588

Epoch 100/100

834/834 [==============================] – 1s 2ms/step – loss: 0.4599

We see that the loss falls sharply during the first few epochs, from 67.1211 in the first epoch to below 1 by the seventh, and then levels off, ending at 0.4599 after 100 epochs.
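The 834 at the start of each progress line is the number of batches per epoch. As a quick check, assuming the 7,334 rows from Step 3 and the last split produced by TimeSeriesSplit with 10 splits, the arithmetic works out as follows:

```python
import math

n_samples = 7334                    # rows reported in Step 3
test_size = n_samples // (10 + 1)   # TimeSeriesSplit divides the data into n_splits + 1 folds
train_samples = n_samples - test_size
batch_size = 8

# The last batch of an epoch may be smaller, hence the ceiling
steps_per_epoch = math.ceil(train_samples / batch_size)

print(train_samples, steps_per_epoch)
```

A smaller batch size means more weight updates per epoch, at the cost of more steps to process.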

Step 11 – LSTM Prediction

With our model ready, it is time to use the trained LSTM network on the test set and predict the Adjusted Close value of the Microsoft stock. This is done by calling the predict function on the lstm model.

#LSTM Prediction

y_pred = lstm.predict(X_test)
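The mean_squared_error and r2_score functions imported in Step 1 can be used to put numbers on the quality of the predictions. The sketch below uses small made-up arrays in place of y_test and y_pred, so the values it prints say nothing about the actual model; the names ending in `_demo` are assumptions for this example.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up stand-ins for y_test and y_pred (not real model output)
y_true_demo = np.array([100.0, 102.0, 101.0, 105.0])
y_pred_demo = np.array([[99.0], [103.0], [100.0], [104.0]])  # predict() returns shape (n, 1)

# Flatten the predictions so both arrays are one-dimensional
y_pred_flat = y_pred_demo.ravel()

rmse = np.sqrt(mean_squared_error(y_true_demo, y_pred_flat))
r2 = r2_score(y_true_demo, y_pred_flat)

print("RMSE:", rmse)
print("R2 Score:", r2)
```

RMSE is expressed in the same unit as the target, so on the real data it can be read directly as a typical prediction error in US dollars.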

Step 12 – True vs Predicted Adj Close Value – LSTM

Finally, having predicted the values for the test set, we can plot a graph comparing the true Adj Close values with the values predicted by the LSTM Machine Learning model.

#True vs Predicted Adj Close Value – LSTM

plt.plot(y_test, label='True Value')

plt.plot(y_pred, label='LSTM Value')

plt.title("Prediction by LSTM")

plt.xlabel('Time Scale')

plt.ylabel('USD')

plt.legend()

plt.show()


The above graph shows that some pattern is detected by the very basic single LSTM network model built above. By fine-tuning several parameters and adding more LSTM layers to the model, we can achieve a more accurate representation of any given company’s stock value.


Can you predict the stock market using machine learning?

Today, a number of indicators are available to help predict market trends, and machine learning adds to them by searching large amounts of historical data for patterns. The stock market is an open system, and it can be viewed as a complex network made up of the relationships between stocks, companies, investors and trade volumes. With an algorithm such as a support vector machine, or the LSTM used in this article, you can model some of the relationships among these variables. Such models can detect patterns, but they cannot guarantee accurate predictions, because the market also reacts to events that are not present in historical data.

Which algorithm is best for stock market prediction?

Linear Regression is a simple place to start. It is a statistical approach used to determine the relationship between two variables; in this example, price and time. In stock market prediction, time is the independent variable and the price is the dependent variable. If an approximately linear relationship between the two holds, the fitted line can be used to estimate the value of the stock at a future point, although real prices rarely follow a straight line for long. For sequence data, models such as the LSTM used in this article are a common alternative.
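A minimal sketch of such a baseline, fitting a straight line to made-up prices with time as the input and price as the output (the data and variable names are assumptions for illustration, not part of the MSFT dataset):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

days = np.arange(5).reshape(-1, 1)                   # independent variable: day index 0..4
prices = np.array([10.0, 12.0, 14.0, 16.0, 18.0])    # dependent variable: made-up prices

model = LinearRegression().fit(days, prices)

# Extrapolate one day beyond the observed data
next_price = model.predict(np.array([[5]]))[0]

print(model.coef_[0], model.intercept_, next_price)
```

Because the toy prices rise by exactly 2 per day, the fit is perfect here; on real stock data the line would only capture the overall trend.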

Is stock market prediction a classification or regression problem?

It depends on what we want to predict. Suppose we want to predict the future of a stock, where future means the next day, week, month, or year. If the input is the past performance of the stock and the output is a numeric value, such as the future price, then it is a regression problem; the LSTM model in this article is of this kind. If the output is a category, such as whether the price will go up or down, then it is a classification problem.
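The same price series can be turned into either kind of target. The sketch below builds both from a few made-up closing prices; the series and the names `reg_target` and `clf_target` are assumptions for this example.

```python
import pandas as pd

close = pd.Series([10.0, 10.5, 10.2, 10.8, 10.8])  # made-up closing prices

# Regression target: the next day's closing price
next_close = close.shift(-1)

# Classification target: 1 if the next day's close is higher than today's, else 0
went_up = (next_close > close).astype(int)

# The last day has no "next day", so it is dropped from both targets
reg_target = next_close.iloc[:-1]
clf_target = went_up.iloc[:-1]

print(reg_target.tolist())
print(clf_target.tolist())
```

A regression model would be trained to output the values in the first list, while a classifier would be trained to output the labels in the second.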
