
DL_Lab_3_Wiki

Bhavya Teja edited this page Dec 12, 2017 · 1 revision

Welcome to the Python-DeepLearning_CS5590 Deep Learning - Lab 3 wiki!

The main objective of the task is to perform text classification on the same dataset using three different models (CNN, RNN and LSTM) and to decide which of the three is best. We divide the dataset into train and test samples, train each model on the train set, and then measure its accuracy on the test set. Based on the test accuracies, we decide which model is the best fit for text classification.

The workflow of the task is as follows.

  1. The task starts with importing the classes and the functions that are used in the program.
  2. The IMDB dataset is loaded with its vocabulary limited to the 5000 most frequent words. The dataset is split evenly into train and test sets (50% each).
  3. Each review is truncated or padded to 500 words, since the model inputs must all be of the same length.
  4. This restriction is applied to both the train and test sets.
  5. With the dataset prepared, we build the model, starting with an embedding layer that maps each word to a vector of length 32.
  6. The next layer varies from one model to another: ‘SimpleRNN’ for the RNN model, ‘Convolution2D’ for the CNN model and ‘LSTM’ for the LSTM model.
  7. Each model is trained for only 3 epochs, as training for more leads to overfitting.
  8. The efficient ‘Adam’ optimization algorithm is used for each model.
  9. We then estimate the accuracy of each model on the test set, which is used to evaluate its performance.
  10. Based on these accuracies, we can say which model is best for text classification.
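
The workflow above can be sketched in Keras roughly as follows. This is a minimal sketch and not the lab's actual code: it assumes the `tensorflow.keras` API, and the layer sizes (100 recurrent units, 32 convolution filters, kernel size 3) and the batch size of 64 are illustrative assumptions that do not come from this page. It also swaps in `Conv1D` with global max pooling for the CNN branch instead of the ‘Convolution2D’ layer named in step 6, because the embedding output is a sequence of vectors.

```python
# Values taken from the workflow steps above.
TOP_WORDS = 5000          # vocabulary size (step 2)
MAX_REVIEW_LENGTH = 500   # words per review after padding/truncation (step 3)
EMBEDDING_DIM = 32        # embedding vector length (step 5)
EPOCHS = 3                # training epochs (step 7)
MODEL_KINDS = ("rnn", "cnn", "lstm")


def load_data():
    """Load IMDB and pad/truncate every review to MAX_REVIEW_LENGTH (steps 2-4)."""
    from tensorflow.keras.datasets import imdb
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    (x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=TOP_WORDS)
    x_train = pad_sequences(x_train, maxlen=MAX_REVIEW_LENGTH)
    x_test = pad_sequences(x_test, maxlen=MAX_REVIEW_LENGTH)
    return (x_train, y_train), (x_test, y_test)


def build_model(kind):
    """Build one of the three classifiers; only the middle layer differs (step 6)."""
    if kind not in MODEL_KINDS:
        raise ValueError(f"unknown model kind: {kind!r}")
    from tensorflow.keras import layers, models

    model = models.Sequential()
    model.add(layers.Embedding(TOP_WORDS, EMBEDDING_DIM))
    if kind == "rnn":
        model.add(layers.SimpleRNN(100))   # 100 units is an assumption
    elif kind == "lstm":
        model.add(layers.LSTM(100))        # 100 units is an assumption
    else:
        # Assumption: Conv1D stands in for the Convolution2D named in step 6.
        model.add(layers.Conv1D(32, 3, padding="same", activation="relu"))
        model.add(layers.GlobalMaxPooling1D())
    model.add(layers.Dense(1, activation="sigmoid"))  # positive vs. negative review
    model.compile(loss="binary_crossentropy", optimizer="adam",
                  metrics=["accuracy"])               # Adam optimizer (step 8)
    return model


def train_and_evaluate(kind):
    """Train for EPOCHS epochs and return the test accuracy (steps 7 and 9)."""
    (x_train, y_train), (x_test, y_test) = load_data()
    model = build_model(kind)
    model.fit(x_train, y_train, epochs=EPOCHS, batch_size=64)
    _, accuracy = model.evaluate(x_test, y_test, verbose=0)
    return accuracy
```

Calling `train_and_evaluate("lstm")`, and likewise for `"rnn"` and `"cnn"`, would give one test accuracy per model. The first call downloads the IMDB dataset, and the exact numbers will vary from run to run.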

The screenshots of the accuracies generated by the three models are:

LSTM Model accuracy screenshot:

RNN Model accuracy screenshot:

CNN Model accuracy screenshot:

Looking at the accuracies of the three models, we can determine that the CNN model performs better than the other two. The LSTM model comes second, fairly close to the CNN model, whereas the RNN model falls well behind the top two.
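
The comparison itself is just a ranking by test accuracy, which could be sketched as below. The helper `rank_models` is a hypothetical name, and the numbers are placeholders chosen only to show the mechanics; they are not the accuracies measured in this lab.

```python
def rank_models(accuracies):
    """Return (name, accuracy) pairs sorted from best to worst test accuracy."""
    return sorted(accuracies.items(), key=lambda item: item[1], reverse=True)


# Placeholder values for illustration only; NOT the accuracies measured in this lab.
example_accuracies = {"LSTM": 0.8, "RNN": 0.6, "CNN": 0.9}

ranking = rank_models(example_accuracies)
best_name, best_accuracy = ranking[0]
print(f"Best model: {best_name} ({best_accuracy:.2%})")
```

With real results, the dictionary would hold the test accuracy that each of the three models reports after training.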
