Practical: RNN vs CNN with Fashion-MNIST¶

Anastasia Giachanou, Tina Shahedi

Machine Learning with Python - Utrecht Summer School

In this practical, we'll explore two families of advanced deep learning models, Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), using the Fashion-MNIST dataset.

In [1]:
!pip install scikeras[tensorflow] > /dev/null 2>&1     # scikeras with the TensorFlow (GPU) backend
!pip install scikeras[tensorflow-cpu] > /dev/null 2>&1 # CPU-only fallback
!pip install scikeras > /dev/null 2>&1
!pip install pydot graphviz > /dev/null 2>&1
!pip uninstall -y scikit-learn
!pip install scikit-learn==1.5.2
Found existing installation: scikit-learn 1.6.1
Uninstalling scikit-learn-1.6.1:
  Successfully uninstalled scikit-learn-1.6.1
Collecting scikit-learn==1.5.2
  Downloading scikit_learn-1.5.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB)
Requirement already satisfied: numpy>=1.19.5 in /usr/local/lib/python3.11/dist-packages (from scikit-learn==1.5.2) (2.0.2)
Requirement already satisfied: scipy>=1.6.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn==1.5.2) (1.16.0)
Requirement already satisfied: joblib>=1.2.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn==1.5.2) (1.5.1)
Requirement already satisfied: threadpoolctl>=3.1.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn==1.5.2) (3.6.0)
Downloading scikit_learn-1.5.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.3 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.3/13.3 MB 110.2 MB/s eta 0:00:00
Installing collected packages: scikit-learn
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
umap-learn 0.5.9.post2 requires scikit-learn>=1.6, but you have scikit-learn 1.5.2 which is incompatible.
Successfully installed scikit-learn-1.5.2
In [2]:
from scikeras.wrappers import KerasClassifier
In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
import seaborn as sns
import random
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import plot_model
from tensorflow.keras.layers import Dense, Flatten, BatchNormalization, Dropout, Conv2D, MaxPooling2D, LSTM
from tensorflow.keras.regularizers import l2
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.optimizers import Adam, SGD, RMSprop
from sklearn.model_selection import RandomizedSearchCV
from sklearn.metrics import classification_report, confusion_matrix

Run the following lines to prepare the data for the models (the code was also used in the previous practical session).

In [4]:
# Load the dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(sample_images, sample_labels), (test_images, test_labels) = fashion_mnist.load_data()

# Set a random seed for reproducibility
np.random.seed(100)

# Randomly choose 15,000 indices from the full training set (sample_images)
indices = np.random.choice(sample_images.shape[0], 15000, replace=False)

# Use these indices to sample images and labels
train_images = sample_images[indices]
train_labels = sample_labels[indices]

# Now train_images and train_labels contain the 15,000 sampled examples
print("train_images shape:", train_images.shape)
print("train_labels shape:", train_labels.shape)

# Flatten the image data and convert it to a DataFrame
# The images are reshaped from 28x28 to 784 per image
train_images_flattened = train_images.reshape(train_images.shape[0], -1)
test_images_flattened = test_images.reshape(test_images.shape[0], -1)

# Create DataFrames
train_df = pd.DataFrame(train_images_flattened)
test_df = pd.DataFrame(test_images_flattened)

# Add labels to the DataFrames
train_df['label'] = train_labels
test_df['label'] = test_labels

# Normalize the pixel values to be between 0 and 1
X_train = train_images.astype('float32') / 255.0
X_test = test_images.astype('float32') / 255.0

# One-hot encode the labels
y_train = tf.keras.utils.to_categorical(train_labels, num_classes=10)
y_test = tf.keras.utils.to_categorical(test_labels, num_classes=10)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
29515/29515 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26421880/26421880 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
5148/5148 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4422102/4422102 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
train_images shape: (15000, 28, 28)
train_labels shape: (15000,)

Let's refresh our memory regarding the structure and characteristics of the dataset.

In [5]:
fig, ax = plt.subplots(6, 6, figsize=(8, 8))
fig.suptitle('Fashion images and labels', fontsize=14)
ax = ax.ravel()  # Flatten the 6x6 grid of axes into a 1-D array

for i in range(36):
    sample_n = random.randint(0, X_train.shape[0] - 1)
    ax[i].imshow((X_train[sample_n]).reshape(28, 28), cmap='Greys')
    ax[i].get_xaxis().set_visible(False)
    ax[i].get_yaxis().set_visible(False)
    label_index = np.argmax(y_train[sample_n])
    ax[i].set_title(label_index, fontsize=12)


plt.subplots_adjust(hspace=0.3)

As we can see, each training example is assigned one of the following labels (the numeric index is what appears in the plot titles):

  0. T-shirt/top
  1. Trouser
  2. Pullover
  3. Dress
  4. Coat
  5. Sandal
  6. Shirt
  7. Sneaker
  8. Bag
  9. Ankle boot
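
If you prefer class names over numeric indices, here is a minimal sketch that maps a one-hot label vector back to its name (class_names follows the standard Fashion-MNIST ordering above):

In [ ]:
# Map a one-hot encoded label back to its Fashion-MNIST class name
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

def label_name(one_hot_label):
    return class_names[int(np.argmax(one_hot_label))]

print(label_name(y_train[0]))  # name of the first sampled training image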

Let's begin!¶

Recurrent Neural Networks¶

A Recurrent Neural Network (RNN) is typically used for sequential data such as text. However, it can be adapted to image data such as the Fashion-MNIST dataset by processing the rows (or columns) of pixels as a sequence, which lets the network capture spatial structure one row at a time. Long Short-Term Memory (LSTM) networks, a specialized kind of RNN, are particularly effective for this purpose: they selectively remember patterns over long sequences, which can be useful for learning the nuances of fashion item images.
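
To make this sequence view concrete, here is a minimal sketch (using a random array as a stand-in for one 28x28 image) of how each row becomes one timestep with 28 features:

In [ ]:
# Stand-in for a single 28x28 Fashion-MNIST image
image = np.random.rand(28, 28).astype('float32')

# An RNN reads this as 28 timesteps, each a 28-dimensional feature vector
for t, row in enumerate(image[:3]):  # first three "timesteps"
    print(f"timestep {t}: feature vector of length {row.shape[0]}")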

1. Reshape the X_train and X_test variables. Use the shape (number of samples, 28, 28).

In [6]:
# Reshape for RNN input

x_train_rnn = X_train.reshape(X_train.shape[0], 28, 28)
x_test_rnn = X_test.reshape(X_test.shape[0], 28, 28)

2. Build a Recurrent Neural Network (RNN) using the Keras Sequential API to classify images from the Fashion-MNIST dataset. The model should include:

  1. Two LSTM layers: one with 64 units (returning sequences), the next with 32 units.
  2. A Dropout layer (rate 0.25) after each LSTM to prevent overfitting.
  3. A Dense layer with 16 units and ReLU activation.
  4. A final Dense output layer with 10 units (softmax) for multi-class classification.

Compile the model using the Adam optimizer and categorical crossentropy as the loss function.

In [7]:
# Define the RNN model architecture
# Creates a new Sequential model, where layers will be added one after another.
model_rnn = Sequential()

# Assuming x_train_rnn has shape (num_samples, 28, 28)
# Adds the first LSTM layer with: 64 units (memory cells).
# input_shape=(28, 28): treating each image as a sequence of 28 rows with 28 features (pixels).
# activation='relu'.
# return_sequences=True: necessary because another LSTM will follow.
model_rnn.add(LSTM(64, input_shape=(28, 28), activation='relu', return_sequences=True))

# Adds a Dropout layer that randomly sets 25% of the LSTM output units to zero during training — this helps prevent overfitting.
model_rnn.add(Dropout(0.25))
model_rnn.add(LSTM(32, activation='relu'))
model_rnn.add(Dropout(0.25))

# Adds a Dense (fully connected) layer with 16 units and ReLU activation — this acts as a hidden layer before the output.
model_rnn.add(Dense(16, activation='relu'))
model_rnn.add(Dropout(0.25))

# Final output layer with 10 units (one per Fashion-MNIST class).
# Uses softmax activation to produce a probability distribution over the classes.
model_rnn.add(Dense(10, activation='softmax'))

# Compile the model with the optimizer and define the loss and metrics
model_rnn.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
/usr/local/lib/python3.11/dist-packages/keras/src/layers/rnn/rnn.py:200: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)

3. Output the summary of the model and visualize the architecture of the RNN model using the plot_model function from keras.utils.

In [8]:
model_rnn.summary()
plot_model(model_rnn, to_file='model_rnn.png', show_shapes=True, show_layer_names=True, dpi=66)
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ lstm (LSTM)                     │ (None, 28, 64)         │        23,808 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 28, 64)         │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ lstm_1 (LSTM)                   │ (None, 32)             │        12,416 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout)             │ (None, 32)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 16)             │           528 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_2 (Dropout)             │ (None, 16)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 10)             │           170 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 36,922 (144.23 KB)
 Trainable params: 36,922 (144.23 KB)
 Non-trainable params: 0 (0.00 B)
Out[8]:

Model Parameter Breakdown (LSTM for Fashion-MNIST)¶

This cell explains how the number of parameters is calculated for each layer in the Sequential LSTM model.


Layer 1: LSTM(64) Output shape: (None, 28, 64)
Parameters: 23,808

  • Formula: 4 * units * (input_dim + units + 1)
  • Here:
    • units = 64
    • input_dim = 28 (each row of the image)
    • 4 comes from the internal gates in the LSTM (input, forget, cell, output)
  • Calculation:
    4 * 64 * (28 + 64 + 1) = 4 * 64 * 93 = 23,808

Layer 2: Dropout(0.25) Output shape: (None, 28, 64)
Parameters: 0

  • Dropout is a regularization layer. It has no learnable parameters.

Layer 3: LSTM(32) Output shape: (None, 32)
Parameters: 12,416

  • Formula: 4 * units * (input_dim + units + 1)
  • Here:
    • units = 32
    • input_dim = 64 (from previous LSTM)
  • Calculation:
    4 * 32 * (64 + 32 + 1) = 4 * 32 * 97 = 12,416

Layer 4: Dropout(0.25) Output shape: (None, 32)
Parameters: 0


Layer 5: Dense(16) Output shape: (None, 16)
Parameters: 528

  • Formula: (input_dim + 1) * output_units
  • Calculation: (32 + 1) * 16 = 33 * 16 = 528

Layer 6: Dropout(0.25) Output shape: (None, 16)
Parameters: 0


Layer 7: Dense(10) Output shape: (None, 10)
Parameters: 170

  • Formula: (input_dim + 1) * output_units
  • Calculation: (16 + 1) * 10 = 17 * 10 = 170

Total Parameters: 36,922

This is the number of values that the model will learn during training.
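
As a sanity check, here is a minimal sketch that recomputes these counts from the formulas above (the helper functions are ours, not Keras API):

In [ ]:
def lstm_params(units, input_dim):
    # 4 gates, each with input weights, recurrent weights, and a bias
    return 4 * units * (input_dim + units + 1)

def dense_params(units, input_dim):
    # one weight per input plus a bias, for each output unit
    return (input_dim + 1) * units

total = (lstm_params(64, 28)      # 23,808
         + lstm_params(32, 64)    # 12,416
         + dense_params(16, 32)   #    528
         + dense_params(10, 16))  #    170
print(total)  # 36922, matching model_rnn.summary()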

4. Now it's time to train! Train (with fit()) the RNN model for 5 epochs and a batch size of 64, using x_train_rnn and y_train, and validate using x_test_rnn and y_test. Save the training history.

In [9]:
# Train the model
history = model_rnn.fit(
    x_train_rnn, y_train,
    epochs=5,
    batch_size=64,
    validation_data=(x_test_rnn, y_test))

tf.keras.backend.clear_session()
Epoch 1/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 19s 30ms/step - accuracy: 0.2966 - loss: 1.9026 - val_accuracy: 0.6716 - val_loss: 0.9706
Epoch 2/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 11ms/step - accuracy: 0.5851 - loss: 1.1036 - val_accuracy: 0.7087 - val_loss: 0.7625
Epoch 3/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 11ms/step - accuracy: 0.6344 - loss: 0.9448 - val_accuracy: 0.7485 - val_loss: 0.6675
Epoch 4/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 5s 10ms/step - accuracy: 0.6868 - loss: 0.8066 - val_accuracy: 0.7690 - val_loss: 0.6103
Epoch 5/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 9ms/step - accuracy: 0.7102 - loss: 0.7588 - val_accuracy: 0.7661 - val_loss: 0.6118

5. Plot the model's accuracy and loss over the epochs. You can use the same function as in the previous practical.

In [10]:
def plot_training_history(history):

    plt.figure(figsize=(12, 5))

    # Plotting accuracy
    plt.subplot(1, 2, 1)
    plt.plot(history.history['accuracy'], label='Training Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.title('Model Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend(loc='lower right')

    # Plotting loss
    plt.subplot(1, 2, 2)
    plt.plot(history.history['loss'], label='Training Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title('Model Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend(loc='upper right')

    plt.show()

plot_training_history(history)

6. Calculate the RNN model's accuracy on the training and test datasets.

In [11]:
# Evaluate the model on the training and test datasets
loss, accuracy = model_rnn.evaluate(x_train_rnn, y_train, verbose=False)
print("Training Accuracy: {:.4f}".format(accuracy))
loss, accuracy = model_rnn.evaluate(x_test_rnn, y_test, verbose=False)
print("Testing Accuracy:  {:.4f}".format(accuracy))
Training Accuracy: 0.7736
Testing Accuracy:  0.7661

7. Train the RNN model for 10 epochs and compare the results.

In [12]:
# Define the RNN model architecture
model_rnn = Sequential()

# Adding LSTM layers and dropout to avoid overfitting
model_rnn.add(LSTM(64, input_shape=(28, 28), activation='relu', return_sequences=True))
model_rnn.add(Dropout(0.25))
model_rnn.add(LSTM(32, activation='relu'))
model_rnn.add(Dropout(0.25))
model_rnn.add(Dense(16, activation='relu'))
model_rnn.add(Dropout(0.25))
model_rnn.add(Dense(10, activation='softmax'))


# Compile the model with the optimizer and define the loss and metrics
model_rnn.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model_rnn.fit(
    x_train_rnn, y_train,
    epochs=10,
    batch_size=64,
    validation_data=(x_test_rnn, y_test))

tf.keras.backend.clear_session()
/usr/local/lib/python3.11/dist-packages/keras/src/layers/rnn/rnn.py:200: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
Epoch 1/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 13s 30ms/step - accuracy: 0.2878 - loss: 2.0091 - val_accuracy: 0.6446 - val_loss: 1.0349
Epoch 2/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 9ms/step - accuracy: 0.5664 - loss: 1.1923 - val_accuracy: 0.6928 - val_loss: 0.8227
Epoch 3/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 9ms/step - accuracy: 0.6367 - loss: 0.9776 - val_accuracy: 0.7311 - val_loss: 0.7197
Epoch 4/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step - accuracy: 0.6770 - loss: 0.8637 - val_accuracy: 0.7527 - val_loss: 0.6507
Epoch 5/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 9ms/step - accuracy: 0.7072 - loss: 0.7817 - val_accuracy: 0.7525 - val_loss: 0.6212
Epoch 6/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 9ms/step - accuracy: 0.7283 - loss: 0.7378 - val_accuracy: 0.7627 - val_loss: 0.6239
Epoch 7/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 9ms/step - accuracy: 0.7385 - loss: 0.7146 - val_accuracy: 0.7863 - val_loss: 0.5794
Epoch 8/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 9ms/step - accuracy: 0.7505 - loss: 0.6973 - val_accuracy: 0.7845 - val_loss: 0.5735
Epoch 9/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 13ms/step - accuracy: 0.7622 - loss: 0.6515 - val_accuracy: 0.7914 - val_loss: 0.5794
Epoch 10/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 9ms/step - accuracy: 0.7606 - loss: 0.6545 - val_accuracy: 0.8043 - val_loss: 0.5637

8. Plot the results of the model with the 10 epochs.

In [13]:
plot_training_history(history)

Convolutional neural networks¶

Convolutional Neural Networks (CNNs), often referred to as convnets, have significantly transformed the landscape of machine learning, especially in image classification and computer vision. Their ability to autonomously extract and learn features from images has set new benchmarks in how machines interpret visual information.
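
Before building the model, it helps to see how convolution and pooling change the spatial dimensions. A minimal sketch of the arithmetic ('valid' padding and stride 1 are assumed, matching the model we build below):

In [ ]:
def conv_out(size, kernel, stride=1):
    # 'valid' padding: the kernel must fit entirely inside the input
    return (size - kernel) // stride + 1

h = conv_out(28, 3)  # 26: a 3x3 kernel over a 28x28 image
h = h // 2           # 13: MaxPooling2D with pool size 2
print(h * h * 64)    # 10816: units after Flatten, with 64 filters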

9. Reshape Fashion MNIST images to include a channel dimension for CNN input, converting 2D arrays (28x28) into 3D arrays (28x28x1).

In [14]:
x_train_cnn = X_train.reshape(-1, 28, 28, 1)  # Add channel dimension for training set
x_test_cnn = X_test.reshape(-1, 28, 28, 1)   # Add channel dimension for test set

10. Build a CNN model with Keras for the Fashion-MNIST dataset. Starting from the reshaped images of task 9, add a Conv2D layer (64 filters, kernel size 3, ReLU), a MaxPooling2D layer (pool size 2) and a Dropout layer (rate 0.25), flatten for the dense layers, then use a 32-unit Dense layer and a 10-unit softmax output layer. Compile with the Adam optimizer and categorical_crossentropy.

In [15]:
# Define the model architecture
cnn_model = Sequential([
    Conv2D(filters=64, kernel_size=3, activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=2),
    Dropout(0.25),
    Flatten(),
    Dense(32, activation='relu'),
    Dropout(0.25),
    Dense(10, activation='softmax')
])

# Compile the model
cnn_model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)

11. Print the summary and use the plot_model function from keras.utils to create a visual representation of your CNN model's architecture.

In [16]:
cnn_model.summary()
# Visualize the CNN model architecture
plot_model(cnn_model, to_file='cnn_model.png', show_shapes=True, dpi=66)
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 26, 26, 64)     │           640 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 13, 13, 64)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 13, 13, 64)     │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 10816)          │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 32)             │       346,144 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout)             │ (None, 32)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 10)             │           330 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 347,114 (1.32 MB)
 Trainable params: 347,114 (1.32 MB)
 Non-trainable params: 0 (0.00 B)
Out[16]:
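
The same parameter bookkeeping as for the LSTM model applies here. A minimal sketch (the helper function is ours, not Keras API; 10816 is the Flatten size computed earlier):

In [ ]:
def conv2d_params(filters, kernel_size, in_channels):
    # one kernel_size x kernel_size window per input channel, plus a bias, per filter
    return (kernel_size * kernel_size * in_channels + 1) * filters

total = (conv2d_params(64, 3, 1)   #     640
         + (10816 + 1) * 32        # 346,144 for the 32-unit Dense layer
         + (32 + 1) * 10)          #     330 for the softmax output layer
print(total)  # 347114, matching cnn_model.summary()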

12. Fit the model for 5 epochs with a batch size of 64. Save the training process in a variable named history.

In [17]:
# Train the model
history = cnn_model.fit(x_train_cnn, y_train, epochs=5, batch_size=64, validation_data=(x_test_cnn, y_test))
tf.keras.backend.clear_session()
Epoch 1/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 8s 20ms/step - accuracy: 0.6101 - loss: 1.1235 - val_accuracy: 0.8263 - val_loss: 0.4991
Epoch 2/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8019 - loss: 0.5695 - val_accuracy: 0.8412 - val_loss: 0.4337
Epoch 3/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.8407 - loss: 0.4573 - val_accuracy: 0.8513 - val_loss: 0.4112
Epoch 4/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8512 - loss: 0.4118 - val_accuracy: 0.8640 - val_loss: 0.3752
Epoch 5/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8622 - loss: 0.3822 - val_accuracy: 0.8764 - val_loss: 0.3507

13. Evaluate the accuracy of the model on the training and test data.

In [18]:
# Evaluate the model on the reshaped training dataset
loss, accuracy = cnn_model.evaluate(x_train_cnn, y_train, verbose=False)
print("Training Accuracy: {:.4f}".format(accuracy))
print("Training Loss:     {:.4f}".format(loss))

# Evaluate the model on the reshaped test dataset
loss, accuracy = cnn_model.evaluate(x_test_cnn, y_test, verbose=False)
print("Testing Accuracy:  {:.4f}".format(accuracy))
print("Testing Loss:      {:.4f}".format(loss))
Training Accuracy: 0.9021
Training Loss:     0.2708
Testing Accuracy:  0.8764
Testing Loss:      0.3507

Hyperparameter Optimization¶

14. Define a function named create_model that constructs a CNN model for the Fashion-MNIST dataset. The function should accept the number of filters, kernel size, and dense layer size (embedding size) as input arguments. Reuse the structure of your previous CNN model, letting these parameters define the convolutional layer and the dense layer dynamically.

In [19]:
def create_model(num_filters, kernel_size, dense_size):
    model = Sequential([
        Conv2D(filters=num_filters, kernel_size=kernel_size, activation='relu', input_shape=(28, 28, 1)),
        MaxPooling2D(pool_size=2),
        Dropout(0.2),
        Flatten(),
        Dense(dense_size, activation='relu'),
        Dense(10, activation='softmax')  # Output layer for 10 classes
    ])

    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

15. Use a Python dictionary to define the hyperparameter search space for your CNN model: the number of filters, the kernel size, and the dense layer size. The keys must match the argument names of create_model so that each sampled configuration can be passed to it.

In [20]:
param_grid = dict(num_filters=[16, 32, 64],
                  kernel_size=[3, 5, 7],
                  dense_size=[16, 32])
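
This grid spans 3 x 3 x 2 = 18 possible configurations; the randomized search below will sample only a few of them. A quick sketch to count the combinations:

In [ ]:
from itertools import product

n_combos = len(list(product(*param_grid.values())))
print(n_combos)  # 18 possible configurations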

16. Use the KerasClassifier from scikeras and set model to the create_model function. Set epochs=5 and batch_size=64.

In [21]:
model = KerasClassifier(model=create_model,
                        epochs=5,
                        batch_size=64,
                        num_filters=32,
                        kernel_size=3,
                        dense_size=32,
                        verbose=True)

17. Use RandomizedSearchCV together with the KerasClassifier model you created and the predefined hyperparameter grid. Set it up for 5-fold cross-validation, and set n_iter to 3 (kept small so that fitting does not take too long during the practical).

In [22]:
grid = RandomizedSearchCV(
    estimator=model,
    param_distributions=param_grid,
    n_iter=3,
    cv=5,
    verbose=2,
    n_jobs=1  # run the fits sequentially (safer with TensorFlow models)
)

18. Fit the RandomizedSearchCV instance with x_train_cnn and y_train, initiating the search for the best hyperparameters.

In [23]:
grid_result = grid.fit(x_train_cnn, y_train)
Fitting 5 folds for each of 3 candidates, totalling 15 fits
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
Epoch 1/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 5s 10ms/step - accuracy: 0.5545 - loss: 1.2760
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.7886 - loss: 0.5964
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8165 - loss: 0.5156
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8355 - loss: 0.4663
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8460 - loss: 0.4319
47/47 ━━━━━━━━━━━━━━━━━━━━ 1s 9ms/step
[CV] END .......dense_size=32, kernel_size=7, num_filters=16; total time=  10.8s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 4s 10ms/step - accuracy: 0.6241 - loss: 1.1549
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.8097 - loss: 0.5156
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8376 - loss: 0.4499
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8530 - loss: 0.4172
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8632 - loss: 0.3878
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
[CV] END .......dense_size=32, kernel_size=7, num_filters=16; total time=   9.0s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 4s 10ms/step - accuracy: 0.5737 - loss: 1.1815
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8108 - loss: 0.5266
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8331 - loss: 0.4680
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8577 - loss: 0.4078
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8546 - loss: 0.4035
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
[CV] END .......dense_size=32, kernel_size=7, num_filters=16; total time=   8.3s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 3s 8ms/step - accuracy: 0.6268 - loss: 1.1111
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8105 - loss: 0.5120
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8387 - loss: 0.4515
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8538 - loss: 0.4075
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8670 - loss: 0.3795
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
[CV] END .......dense_size=32, kernel_size=7, num_filters=16; total time=   6.4s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 4s 11ms/step - accuracy: 0.5949 - loss: 1.1834
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8070 - loss: 0.5384
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8289 - loss: 0.4853
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8458 - loss: 0.4361
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8640 - loss: 0.3885
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
[CV] END .......dense_size=32, kernel_size=7, num_filters=16; total time=   7.5s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 4s 10ms/step - accuracy: 0.6292 - loss: 1.1017
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.8203 - loss: 0.4984
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.8604 - loss: 0.3969
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8689 - loss: 0.3678
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8863 - loss: 0.3209
47/47 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step
[CV] END .......dense_size=32, kernel_size=5, num_filters=32; total time=  10.8s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6362 - loss: 1.0497
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8240 - loss: 0.4845
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8535 - loss: 0.4120
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8731 - loss: 0.3591
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8758 - loss: 0.3447
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
[CV] END .......dense_size=32, kernel_size=5, num_filters=32; total time=   6.6s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 4s 10ms/step - accuracy: 0.6297 - loss: 1.0924
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8305 - loss: 0.4836
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8464 - loss: 0.4351
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8701 - loss: 0.3667
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8754 - loss: 0.3559
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
[CV] END .......dense_size=32, kernel_size=5, num_filters=32; total time=   9.2s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6337 - loss: 1.0726
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - accuracy: 0.8220 - loss: 0.4941
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8562 - loss: 0.4101
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8669 - loss: 0.3664
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8755 - loss: 0.3447
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
[CV] END .......dense_size=32, kernel_size=5, num_filters=32; total time=   9.1s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.6300 - loss: 1.0706
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8371 - loss: 0.4607
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8595 - loss: 0.4000
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8679 - loss: 0.3684
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8807 - loss: 0.3357
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
[CV] END .......dense_size=32, kernel_size=5, num_filters=32; total time=   6.6s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 5s 11ms/step - accuracy: 0.5650 - loss: 1.2079
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8141 - loss: 0.5132
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8547 - loss: 0.4116
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8646 - loss: 0.3681
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8785 - loss: 0.3356
47/47 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step
[CV] END .......dense_size=16, kernel_size=5, num_filters=64; total time=   9.2s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 5s 12ms/step - accuracy: 0.5177 - loss: 1.4369
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.7813 - loss: 0.5997
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8212 - loss: 0.5069
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8377 - loss: 0.4425
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8570 - loss: 0.4051
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
[CV] END .......dense_size=16, kernel_size=5, num_filters=64; total time=   9.9s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 4s 9ms/step - accuracy: 0.5793 - loss: 1.2147
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.8039 - loss: 0.5439
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8369 - loss: 0.4623
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8499 - loss: 0.4250
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8610 - loss: 0.3935
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
[CV] END .......dense_size=16, kernel_size=5, num_filters=64; total time=   9.6s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 5s 11ms/step - accuracy: 0.5603 - loss: 1.3054
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8203 - loss: 0.5049
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step - accuracy: 0.8426 - loss: 0.4338
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 2s 8ms/step - accuracy: 0.8681 - loss: 0.3760
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8802 - loss: 0.3392
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
[CV] END .......dense_size=16, kernel_size=5, num_filters=64; total time=  11.9s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
188/188 ━━━━━━━━━━━━━━━━━━━━ 4s 9ms/step - accuracy: 0.5696 - loss: 1.3112
Epoch 2/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8046 - loss: 0.5865
Epoch 3/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8548 - loss: 0.4193
Epoch 4/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8645 - loss: 0.3808
Epoch 5/5
188/188 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8818 - loss: 0.3373
47/47 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
[CV] END .......dense_size=16, kernel_size=5, num_filters=64; total time=   7.7s
Epoch 1/5
/usr/local/lib/python3.11/dist-packages/keras/src/layers/convolutional/base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
235/235 ━━━━━━━━━━━━━━━━━━━━ 5s 9ms/step - accuracy: 0.6242 - loss: 1.0858
Epoch 2/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8263 - loss: 0.4942
Epoch 3/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8536 - loss: 0.4119
Epoch 4/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8716 - loss: 0.3672
Epoch 5/5
235/235 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8778 - loss: 0.3397

19. Identify and note the best score and hyperparameters obtained from the RandomizedSearchCV results.

In [24]:
print(grid_result.best_score_)
print(grid_result.best_params_)
0.8741999999999999
{'num_filters': 32, 'kernel_size': 5, 'dense_size': 32}
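
To compare all sampled configurations rather than just the best one, you can inspect cv_results_ (a minimal sketch; pandas was imported earlier):

In [ ]:
cv_df = pd.DataFrame(grid_result.cv_results_)
cols = ['param_num_filters', 'param_kernel_size', 'param_dense_size',
        'mean_test_score', 'std_test_score']
print(cv_df[cols].sort_values('mean_test_score', ascending=False))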

20. Evaluate the tuned model on the test set (x_test_cnn, y_test) to measure its final performance.

In [25]:
test_accuracy = grid.score(x_test_cnn, y_test)
test_accuracy
157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step
Out[25]:
0.8522

Predict¶

21. Predict the labels for the test dataset using the trained model.

In [26]:
# Predict
predictions = grid_result.predict(x_test_cnn)
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step

Evaluating Model¶

22. Evaluate the performance of the model using a classification report and a confusion matrix. You can use the fashion labels if you want to show the item categories on the confusion matrix (fashion_labels = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]).

In [30]:
fashion_labels = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
y_pred = np.argmax(predictions, axis=1)
y_true = np.argmax(y_test, axis=1)

# Generate a classification report
print(classification_report(y_true, y_pred, target_names=fashion_labels))

# Optional: Confusion matrix
conf_matrix = confusion_matrix(y_true, y_pred)

# Plotting the confusion matrix
plt.figure(figsize=(12, 10))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues', xticklabels=fashion_labels, yticklabels=fashion_labels)
plt.title('Confusion Matrix')
plt.xlabel('Predicted Labels')
plt.ylabel('True Labels')
plt.show()
              precision    recall  f1-score   support

 T-shirt/top       0.83      0.80      0.82      1000
     Trouser       0.99      0.96      0.97      1000
    Pullover       0.63      0.90      0.74      1000
       Dress       0.86      0.89      0.88      1000
        Coat       0.80      0.69      0.74      1000
      Sandal       0.97      0.93      0.95      1000
       Shirt       0.73      0.47      0.57      1000
     Sneaker       0.89      0.97      0.93      1000
         Bag       0.90      0.98      0.94      1000
  Ankle boot       0.96      0.93      0.95      1000

    accuracy                           0.85     10000
   macro avg       0.86      0.85      0.85     10000
weighted avg       0.86      0.85      0.85     10000

Optional. Visualize the model's predictions on the Fashion-MNIST dataset by plotting images with their predicted and true labels, highlighting correct (blue) and incorrect (red) predictions.

In [32]:
def plot_image(predictions_array, true_label, img):
    true_label, img = np.argmax(true_label), img.reshape(28, 28)
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])

    plt.imshow(img, cmap=plt.cm.binary)

    predicted_label = np.argmax(predictions_array)
    if predicted_label == true_label:
        color = 'blue'
    else:
        color = 'red'

    plt.xlabel("{} {:2.0f}% ({})".format(fashion_labels[predicted_label],
                                         100*np.max(predictions_array),
                                         fashion_labels[true_label]),
               color=color)


# Plotting a few predictions
num_rows = 4
num_cols = 4
num_images = num_rows*num_cols
plt.figure(figsize=(1.5*1.7*num_cols, 2*num_rows))
for i in range(num_images):
    plt.subplot(num_rows, 2*num_cols, 2*i+1)
    plot_image(predictions[i], y_test[i], x_test_cnn[i])
plt.tight_layout()
plt.show()

End of practical!