Anastasia Giachanou, Tina Shahedi
Machine Learning with Python - Utrecht Summer School
In this practical, we'll focus on the Fashion-MNIST dataset, a collection of 70,000 grayscale images (60,000 for training and 10,000 for testing) representing 10 different categories of fashion items like T-shirts, trousers, and shoes. This dataset is well suited to understanding and implementing multiclass image classification using deep learning techniques. We'll use the Keras library, an API for neural networks that runs on top of TensorFlow (Google).
We will also construct and train neural network models to accurately classify the fashion images, and we will optimise their hyperparameters.
Learning Goals:
Let's start by installing TensorFlow using !pip install.
TensorFlow is an open-source machine learning library developed by Google. It provides tools to build and train machine learning models — especially deep learning models like neural networks.
Tensors are multi-dimensional arrays that generalize vectors and matrices. They can have any number of dimensions, which makes them suitable for representing diverse types of data — such as images, text, or audio. Tensors are the building blocks of data representation and computation in deep learning models.
They store:
!pip install scikeras[tensorflow] > /dev/null 2>&1 # gpu compute platform
!pip install scikeras[tensorflow-cpu] > /dev/null 2>&1
!pip install scikeras > /dev/null 2>&1
!pip uninstall -y scikit-learn
!pip install scikit-learn==1.5.2
Found existing installation: scikit-learn 1.5.2 Uninstalling scikit-learn-1.5.2: Successfully uninstalled scikit-learn-1.5.2 Collecting scikit-learn==1.5.2 Using cached scikit_learn-1.5.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (13 kB) Requirement already satisfied: numpy>=1.19.5 in /usr/local/lib/python3.11/dist-packages (from scikit-learn==1.5.2) (2.0.2) Requirement already satisfied: scipy>=1.6.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn==1.5.2) (1.15.3) Requirement already satisfied: joblib>=1.2.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn==1.5.2) (1.5.1) Requirement already satisfied: threadpoolctl>=3.1.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn==1.5.2) (3.6.0) Using cached scikit_learn-1.5.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.3 MB) Installing collected packages: scikit-learn ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. umap-learn 0.5.9.post2 requires scikit-learn>=1.6, but you have scikit-learn 1.5.2 which is incompatible. Successfully installed scikit-learn-1.5.2
We used >/dev/null 2>&1 to hide the output. Additionally, we can check the TensorFlow version we've installed.
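The version check mentioned above could look like the following. This is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper name, and it returns None instead of failing when a package is missing. Inside the notebook, `tf.__version__` gives the same information once TensorFlow has been imported.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report the versions of the packages installed above
for name in ["tensorflow", "scikeras", "scikit-learn"]:
    print(name, installed_version(name))
```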
As usual we will start with importing the required libraries and datasets.
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pprint as pp # for nicely formatting complex data structures
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, BatchNormalization, Dropout
from tensorflow.keras.regularizers import l2
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.optimizers import Adam, SGD
from sklearn.model_selection import RandomizedSearchCV
from scikeras.wrappers import KerasClassifier
scikeras.wrappers.KerasClassifier is a wrapper class that allows you to use Keras models inside scikit-learn tools like GridSearchCV, RandomizedSearchCV, or cross-validation. It makes a Keras model behave like a scikit-learn estimator.
# Set a random seed for reproducibility
np.random.seed(100)
tf.random.set_seed(221)
Let's load the dataset Fashion-MNIST first. Fashion-MNIST (https://www.tensorflow.org/datasets/catalog/fashion_mnist) is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.
1. Load the Fashion-MNIST dataset, which is part of the Keras datasets. First you need to access the fashion_mnist module from the package tensorflow.keras.datasets. Once you do that you can use the method load_data() to load the dataset.
The method load_data() will return a tuple that contains two tuples. The first tuple contains the training data and the second tuple contains the test data: test_images and test_labels.
You can load the data into the tuple (sample_images, sample_labels), (test_images, test_labels)
# Load the dataset
fashion_mnist = tf.keras.datasets.fashion_mnist
(sample_images, sample_labels), (test_images, test_labels) = fashion_mnist.load_data()
Because neural networks need some time to run and to be optimised, we decided to randomly select a part of the training images and work with that. This is a common strategy when we have to develop code and we have a lot of data.
Note: This is a strategy that you can use only while developing code, because the code will run faster; do NOT use it for model selection or evaluation.
2. Now randomly select 30,000 train images and train labels (from the tuple you made before) and save them into the train_images and train_labels variables. First, you can create random indices that then you can use to sample the data. For the random selection you can use np.random.choice (https://numpy.org/doc/stable/reference/random/generated/numpy.random.choice.html)
# Randomly choose 30,000 indices from the range of train_images length
# replace: Whether the sample is with or without replacement. Default is True, meaning that a value of a can be selected multiple times. In this case we need without replacement
indices = np.random.choice(sample_images.shape[0], 30000, replace=False)
# Use these indices to sample images and labels
train_images = sample_images[indices]
train_labels = sample_labels[indices]
# Now train_images and train_labels contain your 30,000 samples
print("train_images shape:", train_images.shape)
print("train_labels shape:", train_labels.shape)
train_images shape: (30000, 28, 28) train_labels shape: (30000,)
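The effect of sampling without replacement can be checked on a small scale. The sketch below is an assumption-laden stand-in: it uses the standard library's `random.sample` (which also samples without replacement) instead of `np.random.choice`, and toy lists instead of the real images.

```python
import random

random.seed(100)  # fixed seed so the draw is reproducible

population_size = 10   # stands in for the 60,000 training images
sample_size = 5        # stands in for the 30,000 images we keep

# random.sample never returns the same index twice (sampling without replacement)
indices = random.sample(range(population_size), sample_size)

# Apply the same indices to images and labels so that the pairs stay aligned
images = [f"image_{i}" for i in range(population_size)]
labels = [i % 3 for i in range(population_size)]
picked_images = [images[i] for i in indices]
picked_labels = [labels[i] for i in indices]

print(indices)
print(len(set(indices)) == sample_size)  # True: no index was drawn twice
```

Using one set of indices for both lists is what keeps each image paired with its own label.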
If we want to see what the data look like, we can plot some of the first images together with their labels.
3. Plot the first 9 images together with their labels. Note that you can use a for loop for that. To show the images, you can use the plt.imshow() inside the loop and set the parameter cmap='gray'. If you want to place the images in a 3x3 grid you can do it with plt.subplot(3, 3, i + 1) where i can be an iterator of a for loop. The subplot needs to run before the imshow
# As an example this will show just one image
# Display the first image and its label
plt.figure(figsize=(3, 3))
plt.imshow(train_images[0], cmap='gray')
plt.title(f'Label: {train_labels[0]}')
plt.axis('off') # Hide axis ticks
plt.show()
# Display the first few images and their labels
plt.figure(figsize=(5, 5))
for i in range(9):
    plt.subplot(3, 3, i + 1)
    plt.imshow(train_images[i], cmap='gray')
    plt.title(f'Label: {train_labels[i]}')
    plt.axis('off')
plt.show()
The original pixel values in train_images and test_images are stored as integers ranging from 0 to 255.
With the following lines we will normalise the data. We divide all pixel values by 255.0, which rescales them from the range [0, 255] → [0.0, 1.0]. Normalization helps neural networks:
Let's normalise the data using the following lines.
X_train = train_images.astype('float32') / 255.0
X_test = test_images.astype('float32') / 255.0
4. Use the tf.keras.utils.to_categorical() function to encode the categorical labels (both the train and test labels) and then split your training data into train and validation sets. Select the first 20,000 observations as the new training set (the code X_train[:20000] returns the first 20,000 observations from X_train) and the rest as the validation set.
# One-hot encode the labels: if train_labels[0] = 3, it becomes [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
y_train = tf.keras.utils.to_categorical(train_labels, num_classes=10)
y_test = tf.keras.utils.to_categorical(test_labels, num_classes=10)
# Split the training data into training and validation sets
X_train, X_val = X_train[:20000], X_train[20000:]
y_train, y_val = y_train[:20000], y_train[20000:]
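To see what one-hot encoding and the slice-based split do, here is a dependency-free sketch on toy labels. The `one_hot` helper is hypothetical and only mimics the idea behind `to_categorical`; it is not the Keras implementation.

```python
def one_hot(label, num_classes=10):
    """Return a list of zeros with a single 1.0 at position `label`."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
    encoded = [0.0] * num_classes
    encoded[label] = 1.0
    return encoded


labels = [3, 0, 9, 3, 1]                   # toy labels, not the real dataset
encoded = [one_hot(label) for label in labels]
print(encoded[0])  # the label 3 has its 1.0 at index 3

# Slice-based split, mirroring X_train[:20000] / X_train[20000:] on a small scale
split = 3
train_part, val_part = encoded[:split], encoded[split:]
print(len(train_part), len(val_part))
```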
We now finished with data preprocessing and preparation and we will move to the modeling part!
In this section, we will build a simple neural network using the Sequential API from Keras. Our goal is to see how well a basic fully connected model (also known as a dense neural network) can perform on the Fashion-MNIST image classification task.
What is the Sequential API? The Sequential API in Keras allows you to create models layer by layer, where each layer has exactly one input and one output (https://www.tensorflow.org/guide/keras/sequential_model)
What if I need more flexibility? The functional API (https://www.tensorflow.org/guide/keras/functional) allows you to create models that have a lot more flexibility as you can define models where layers connect to more than just the previous and next layers. In this way, you can connect layers to (literally) any other layer.
Let's start with a basic example. The following code defines a Sequential neural network for classifying Fashion-MNIST images:
This structure allows the model to learn increasingly abstract patterns from the image data and make predictions about which class each image belongs to.
model = Sequential([
    Flatten(input_shape=(28, 28)),    # Input layer to flatten the images
    Dense(256, activation='relu'),    # Hidden layer with considerable complexity
    Dense(128, activation='relu'),    # Subsequent hidden layer to further refine the learned features
    Dense(10, activation='softmax')   # Output layer with 10 units, one per category
])
/usr/local/lib/python3.11/dist-packages/keras/src/layers/reshaping/flatten.py:37: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(**kwargs)
5. Visualize the architecture of the neural network using keras.utils.plot_model function. The first parameter of the function is the model. Also you can use show_shapes=True to display shape information and dpi=66 to change the resolution
tf.keras.utils.plot_model(model, show_shapes=True, dpi=66)
Here, we've built a sequential neural network model for Fashion-MNIST, consisting of flattened input images passed through dense layers with ReLU activation. The flatten layer transforms the multi-dimensional input into a flat vector of 784 elements, preparing it for the network's learning process. Following this, the model features two fully connected layers, with the first comprising 256 neurons and the second 128 neurons, both instrumental in identifying complex data patterns. The final layer uses softmax for classifying into the 10 fashion categories.
We will now compile the model with an optimizer, loss function, and metrics for training. For classification problems, you can use categorical_crossentropy. categorical_crossentropy is a loss function used to measure the difference between the model’s predicted class probabilities and the actual (true) class labels. It is specifically designed for multi-class classification problems where each input belongs to exactly one of multiple categories and labels are one-hot encoded
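To make the definition concrete, here is a small sketch of the computation for a single sample, written in plain Python rather than Keras. The function name and the clipping constant `eps` are assumptions of this sketch; Keras applies its own numerical safeguards.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted class probabilities."""
    # Clip the probabilities so that log(0) can never occur
    clipped = [min(max(p, eps), 1.0 - eps) for p in y_pred]
    return -sum(t * math.log(p) for t, p in zip(y_true, clipped))


y_true = [0, 0, 1]                 # the true class is index 2
confident = [0.05, 0.05, 0.90]     # most probability on the true class
unsure = [0.40, 0.35, 0.25]        # little probability on the true class

print(categorical_crossentropy(y_true, confident))  # small loss
print(categorical_crossentropy(y_true, unsure))     # larger loss
```

Because the label is one-hot, only the probability assigned to the true class contributes to the loss.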
6. Compile the neural network model using the compile() function, with categorical_crossentropy as the loss function (loss='categorical_crossentropy') and Adam as the optimiser (optimizer='adam'). Also set the metrics to accuracy (metrics=['accuracy']).
Adam optimizer is a popular and efficient gradient descent method and is usually a good default for deep learning tasks
# model.compile(...) sets up the learning process before you start training with model.fit().
model.compile(optimizer='Adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
7. Use the summary() function to get a summary of the model. How many parameters does every layer have? How did we end up with those numbers?
# Print the model summary
model.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃ Layer (type) ┃ Output Shape ┃ Param # ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ flatten (Flatten) │ (None, 784) │ 0 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense (Dense) │ (None, 256) │ 200,960 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (Dense) │ (None, 128) │ 32,896 │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (Dense) │ (None, 10) │ 1,290 │ └─────────────────────────────────┴────────────────────────┴───────────────┘
Total params: 235,146 (918.54 KB)
Trainable params: 235,146 (918.54 KB)
Non-trainable params: 0 (0.00 B)
How did we end up with those parameters?
Here is how those numbers were estimated. For example, at the first dense layer, there are 784 inputs (from the flattened image) and 256 neurons in the first dense layer. Each neuron requires 784 weights and 1 bias. So 784×256(weights) + 256(biases) = 200,960
So total parameters for each layer and the model were estimated as follows:
| Layer | Input Units | Output Neurons | Parameters |
|---|---|---|---|
| Flatten | 28×28 | 784 | 0 |
| Dense1 | 784 | 256 | 784×256 + 256 = 200,960 |
| Dense2 | 256 | 128 | 256×128 + 128 = 32,896 |
| Dense3 | 128 | 10 | 128×10 + 10 = 1,290 |
| Total | — | — | 235,146 |
From the summary, we can also see the total number of parameters. This simple model has more than 235,000 parameters. (By the way, LLMs have millions or billions of parameters: GPT-3 has 175 BILLION parameters.)
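The arithmetic in the table can be reproduced with a few lines of plain Python. `dense_params` is a hypothetical helper written for this sketch; it simply applies the "weights plus biases" rule described above.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# Layer sizes of the model above: 784 inputs, then 256, 128 and 10 units
layer_sizes = [28 * 28, 256, 128, 10]

counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

print(counts)       # [200960, 32896, 1290]
print(sum(counts))  # 235146
```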
Selecting the appropriate neural network architecture and loss function is necessary for compiling the model successfully. The following table categorizes common tasks depending on the respective output types, activation functions, loss functions, and metrics, guiding model development.
| Task | Output Type | Last-layer Activation | Loss Function | Metric(s) |
|---|---|---|---|---|
| Regression | Numerical | Linear | mean_squared_error (MSE), mean_absolute_error (MAE) | Same as loss |
| Binary Classification | Binary | Sigmoid | binary_crossentropy | Accuracy, precision, recall, sensitivity, TPR, FPR, ROC, AUC |
| Classification: Single Label, Multiple Classes | Categorical | Softmax | categorical_crossentropy | Accuracy, confusion matrix |
| Classification: Multiple Labels, Multiple Classes | Categorical | Sigmoid | binary_crossentropy | Accuracy, precision, recall, sensitivity, TPR, FPR, ROC, AUC |
The task we work on in this practical belongs to the Classification: Single Label, Multiple Classes category. For such tasks, models employ softmax activation and categorical crossentropy loss.
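As an illustration of what the softmax activation does, the following sketch converts a few hypothetical raw scores into class probabilities. It is written in plain Python for readability and is not how Keras implements the layer internally.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]   # hypothetical raw outputs for three classes
probs = softmax(scores)

print(probs)
print(sum(probs))                 # 1.0 up to floating-point rounding
print(probs.index(max(probs)))    # index of the predicted class
```

The class with the largest raw score always receives the largest probability, so the predicted class is the index of the maximum.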
As you may have noticed, up to this point we haven't used the training set yet. In the next step, we will train the model; this involves fitting it to the training data for a specified number of epochs.
What is an Epoch? An epoch is one complete pass through the entire training dataset. When we train for 1 epoch, the model sees each training sample once and updates its internal parameters accordingly. Training for multiple epochs means the model gets multiple chances to learn from the same data
What is Batch Size? The batch size is a hyperparameter that defines the number of samples to work through before updating the internal model parameters.
In short, the batch size is a number of samples processed before the model is updated and the number of epochs is the number of complete passes through the training dataset.
The size of a batch must be more than or equal to one and less than or equal to the number of samples in the training dataset.
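The relation between batch size, epochs and the number of weight updates can be worked out directly. The sketch below assumes the 20,000-sample training split used in this practical; `steps_per_epoch` is a hypothetical helper name.

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of weight updates in one epoch; the last batch may be smaller."""
    if not 1 <= batch_size <= n_samples:
        raise ValueError("batch_size must be between 1 and n_samples")
    return math.ceil(n_samples / batch_size)


n_train = 20000   # size of the training split used in this practical
for batch_size in [64, 128, 256]:
    print(batch_size, steps_per_epoch(n_train, batch_size))

# Total number of updates when training for 10 epochs with batch size 64
print(10 * steps_per_epoch(n_train, 64))
```

With a batch size of 64 this gives 313 updates per epoch, which is the step count that Keras reports for each epoch in the training output.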
8. Train your neural network model on the training data using the fit() function. The first two parameters are the input predictors (X_train) and the target labels (y_train) of the training set. Set batch_size to 64 and epochs to 10. Set validation_data=(X_val, y_val).
model_history = model.fit(X_train, y_train, epochs=10, batch_size=64, validation_data=(X_val, y_val))
tf.keras.backend.clear_session()
Epoch 1/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 10s 18ms/step - accuracy: 0.7084 - loss: 0.8310 - val_accuracy: 0.8375 - val_loss: 0.4496 Epoch 2/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.8352 - loss: 0.4541 - val_accuracy: 0.8409 - val_loss: 0.4280 Epoch 3/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8555 - loss: 0.3920 - val_accuracy: 0.8576 - val_loss: 0.3927 Epoch 4/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8709 - loss: 0.3489 - val_accuracy: 0.8633 - val_loss: 0.3787 Epoch 5/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8796 - loss: 0.3237 - val_accuracy: 0.8659 - val_loss: 0.3787 Epoch 6/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.8857 - loss: 0.2993 - val_accuracy: 0.8704 - val_loss: 0.3727 Epoch 7/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8938 - loss: 0.2784 - val_accuracy: 0.8699 - val_loss: 0.3672 Epoch 8/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.9003 - loss: 0.2619 - val_accuracy: 0.8740 - val_loss: 0.3664 Epoch 9/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9053 - loss: 0.2470 - val_accuracy: 0.8682 - val_loss: 0.3866 Epoch 10/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9142 - loss: 0.2343 - val_accuracy: 0.8801 - val_loss: 0.3480
Each line represents one epoch (a full pass through the training data), and includes:
Some observations that we can do here:
Note: Be aware that if you run the fit() function again on the same model, it continues with the weights already learned from the prior training round. To begin training anew, rebuild and recompile the model so that its weights are re-initialised. In addition, you can call the clear_session() function from Keras' backend to release the global state left over from earlier models, like this:
from keras.backend import clear_session
clear_session()
Or
tf.keras.backend.clear_session()
Rebuilding the model after clearing the session ensures that it starts learning from scratch again.
9. Plot your model's training history to see its performance over epochs. Plot both the 'accuracy' and 'loss' metrics for the training phase, comparing these measures across epochs. Use the following code snippets to access accuracy and loss (assuming that you named the history object returned by fit() model_history):
# For plotting accuracy
model_history.history['accuracy']
# For plotting loss
model_history.history['loss']
plt.figure(figsize=(12, 5))
# Plotting model accuracy
plt.subplot(1, 2, 1)
plt.plot(model_history.history['accuracy'], label='Train Accuracy')
plt.plot(model_history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(loc='upper left')
# Plotting model loss
plt.subplot(1, 2, 2)
plt.plot(model_history.history['loss'], label='Train Loss')
plt.plot(model_history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(loc='upper left')
plt.show()
We will now create a function that plots the accuracy and loss, so that we can call it whenever we want to plot the training and validation accuracy and loss.
def plot_training_history(model_history):
    plt.figure(figsize=(12, 5))
    # Plotting accuracy
    plt.subplot(1, 2, 1)
    plt.plot(model_history.history['accuracy'], label='Training Accuracy')
    plt.plot(model_history.history['val_accuracy'], label='Validation Accuracy')
    plt.title('Model Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend(loc='lower right')
    # Plotting loss
    plt.subplot(1, 2, 2)
    plt.plot(model_history.history['loss'], label='Training Loss')
    plt.plot(model_history.history['val_loss'], label='Validation Loss')
    plt.title('Model Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend(loc='upper right')
    plt.show()
10. Evaluate the performance of the trained model. To get the performance on the training set you can use the history of the model (model_history.history['accuracy'][-1]). Then evaluate the model on the test data using the model.evaluate() function. This function returns the test loss and the test accuracy. Compare that with the accuracy on the training set.
# Evaluate the model on test dataset
print('Train accuracy:', model_history.history['accuracy'][-1])
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=False)
print('Test accuracy:', test_acc)
Train accuracy: 0.913349986076355 Test accuracy: 0.8676000237464905
You can see that the accuracy on the training set is about 91% versus about 87% on the test set. A drop like this is expected. However, we can also optimise some of the hyperparameters to see if the performance will change.
As we mentioned, there are two different ways to optimise hyperparameters. First we will use the train/val/test setup and then we will work again with GridSearchCV. We do that so we can show you both ways; when you work on a project, it is better to choose one of those options and stick to it.
We will now focus on optimising the learning rate. The learning rate in a deep learning model is a hyperparameter that regulates how much the model's weights are changed at each update during training.
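The influence of the learning rate is easiest to see on a toy problem. The sketch below is only an illustration: it uses plain gradient descent (not Adam) on the hypothetical function f(w) = w², whose minimum is at w = 0, and compares how close different learning rates get after the same number of steps.

```python
def gradient_descent(learning_rate, steps=20, w_start=1.0):
    """Minimise f(w) = w**2 with plain gradient descent; return the final w."""
    w = w_start
    for _ in range(steps):
        gradient = 2 * w                 # derivative of w**2
        w = w - learning_rate * gradient
    return w


# Distance from the minimum (w = 0) after 20 steps for each learning rate
for lr in [0.01, 0.1, 1.1]:
    print(lr, abs(gradient_descent(lr)))
```

A very small learning rate moves slowly towards the minimum, a moderate one gets close quickly, and a learning rate that is too large overshoots on every step and moves away from the minimum.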
11. Now we will find the optimal learning rate using the train/val/test setup. You can start by creating a list of different learning rates. Then you can use a for loop to iterate over the learning rate values. In the body of the loop, you can create your model, compile it (here you can set optimizer=Adam(learning_rate=learning_rate)) and then fit it. You can try learning_rates = [0.001, 0.02, 0.1]. Also, you need to store the histories so that you can compare the performance when using different learning rates.
# Experiment with different learning rates to find the optimal one
learning_rates = [0.001, 0.02, 0.1]
model_histories = {}
for lr in learning_rates:
    print(f"Training model with learning rate: {lr}")
    model = Sequential([
        Flatten(input_shape=(28, 28)),
        Dense(256, activation='relu'),
        Dense(128, activation='relu'),
        Dense(10, activation='softmax')
    ])
    # Compile the model with a specified learning rate
    model.compile(optimizer=Adam(learning_rate=lr),
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    # Train the model and save the history
    model_history = model.fit(X_train, y_train, epochs=10, batch_size=64, validation_data=(X_val, y_val))
    # Store the history
    model_histories[lr] = model_history
    # Clear the TensorFlow backend to reset model state
    tf.keras.backend.clear_session()
Training model with learning rate: 0.001 Epoch 1/10
/usr/local/lib/python3.11/dist-packages/keras/src/layers/reshaping/flatten.py:37: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(**kwargs)
313/313 ━━━━━━━━━━━━━━━━━━━━ 5s 9ms/step - accuracy: 0.7163 - loss: 0.8057 - val_accuracy: 0.8311 - val_loss: 0.4566 Epoch 2/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8372 - loss: 0.4487 - val_accuracy: 0.8443 - val_loss: 0.4224 Epoch 3/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8562 - loss: 0.3878 - val_accuracy: 0.8633 - val_loss: 0.3839 Epoch 4/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8706 - loss: 0.3497 - val_accuracy: 0.8709 - val_loss: 0.3692 Epoch 5/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8787 - loss: 0.3224 - val_accuracy: 0.8752 - val_loss: 0.3617 Epoch 6/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.8884 - loss: 0.2974 - val_accuracy: 0.8779 - val_loss: 0.3574 Epoch 7/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.8957 - loss: 0.2790 - val_accuracy: 0.8801 - val_loss: 0.3495 Epoch 8/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.9028 - loss: 0.2607 - val_accuracy: 0.8800 - val_loss: 0.3512 Epoch 9/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.9080 - loss: 0.2512 - val_accuracy: 0.8816 - val_loss: 0.3489 Epoch 10/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.9128 - loss: 0.2335 - val_accuracy: 0.8836 - val_loss: 0.3452 Training model with learning rate: 0.02 Epoch 1/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 3s 6ms/step - accuracy: 0.6305 - loss: 1.6214 - val_accuracy: 0.8147 - val_loss: 0.5256 Epoch 2/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7970 - loss: 0.5665 - val_accuracy: 0.8309 - val_loss: 0.4992 Epoch 3/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8265 - loss: 0.4996 - val_accuracy: 0.8342 - val_loss: 0.4761 Epoch 4/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8309 - loss: 0.4728 - val_accuracy: 0.7984 - val_loss: 0.5818 Epoch 5/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.8300 - loss: 0.4841 - val_accuracy: 0.8309 - val_loss: 0.4898 Epoch 6/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 
4ms/step - accuracy: 0.8419 - loss: 0.4455 - val_accuracy: 0.8406 - val_loss: 0.4813 Epoch 7/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8395 - loss: 0.4550 - val_accuracy: 0.8043 - val_loss: 0.5573 Epoch 8/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8370 - loss: 0.4590 - val_accuracy: 0.8380 - val_loss: 0.4970 Epoch 9/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.8414 - loss: 0.4388 - val_accuracy: 0.8243 - val_loss: 0.5658 Epoch 10/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.8421 - loss: 0.4436 - val_accuracy: 0.8414 - val_loss: 0.4809 Training model with learning rate: 0.1 Epoch 1/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 3s 6ms/step - accuracy: 0.4187 - loss: 19.3952 - val_accuracy: 0.4133 - val_loss: 1.5945 Epoch 2/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.4656 - loss: 1.3649 - val_accuracy: 0.5738 - val_loss: 1.1986 Epoch 3/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.4474 - loss: 1.4342 - val_accuracy: 0.5404 - val_loss: 1.1808 Epoch 4/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.4621 - loss: 1.3402 - val_accuracy: 0.5282 - val_loss: 1.2398 Epoch 5/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.4259 - loss: 1.4996 - val_accuracy: 0.4551 - val_loss: 1.2825 Epoch 6/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.4352 - loss: 1.3020 - val_accuracy: 0.4289 - val_loss: 1.3803 Epoch 7/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.4289 - loss: 1.3770 - val_accuracy: 0.4509 - val_loss: 1.2908 Epoch 8/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.4499 - loss: 1.2989 - val_accuracy: 0.4467 - val_loss: 1.2884 Epoch 9/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.4519 - loss: 1.2821 - val_accuracy: 0.4305 - val_loss: 1.3381 Epoch 10/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.3949 - loss: 1.9914 - val_accuracy: 0.4139 - val_loss: 1.2921
12. Find the model with the best performance and print its validation accuracy. You can do that with a for-loop that will iterate over the model_histories
best_acc = 0
for key, history in model_histories.items():
    val_acc = history.history['val_accuracy'][-1]  # validation accuracy after the last epoch
    print(key, f"{val_acc:.4f}")
    if val_acc > best_acc:
        best_acc = val_acc
        max_key = key
print("Validation accuracy for the best model is:", f"{best_acc:.4f}")
0.001 0.8836 0.02 0.8414 0.1 0.4139 Validation accuracy for the best model is: 0.8836
13. Plot the training and validation accuracy and loss for the model with the learning rate that achieved the best performance. Remember that you can use the function we created earlier, the plot_training_history() function
plot_training_history(model_histories[max_key])
At early epochs it is quite common to have a higher validation accuracy compared to the training accuracy. Eventually, training accuracy usually surpasses validation accuracy as the model learns and begins to overfit.
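This pattern can be checked against the numbers in the training output. The sketch below copies the per-epoch accuracies printed for learning rate 0.001 (four decimals, as shown in the log) and looks for the first epoch where training accuracy exceeds validation accuracy; `first_crossover_epoch` is a hypothetical helper name.

```python
# Accuracies copied from the training output for learning rate 0.001 (as printed)
train_acc = [0.7163, 0.8372, 0.8562, 0.8706, 0.8787,
             0.8884, 0.8957, 0.9028, 0.9080, 0.9128]
val_acc = [0.8311, 0.8443, 0.8633, 0.8709, 0.8752,
           0.8779, 0.8801, 0.8800, 0.8816, 0.8836]


def first_crossover_epoch(train, val):
    """Return the first 1-based epoch where training accuracy exceeds validation accuracy."""
    for epoch, (t, v) in enumerate(zip(train, val), start=1):
        if t > v:
            return epoch
    return None  # training accuracy never overtook validation accuracy


epoch = first_crossover_epoch(train_acc, val_acc)
print("Training accuracy overtakes validation accuracy at epoch", epoch)

# Gap between the two curves at the last epoch, a rough indicator of overfitting
final_gap = train_acc[-1] - val_acc[-1]
print("Final gap:", round(final_gap, 4))
```

A gap that keeps widening over the epochs is one sign that the model is starting to overfit the training data.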
If you want to practice more, you can also try to optimise batch sizes (remember that this is a parameter of the fit function.). The batch size is a hyperparameter that defines the number of samples (rows) to work through before updating the internal model parameters.
We suggest that you skip to question 16 where we are going to use the grid search cv to optimise multiple parameters before working on this.
14. We can do the same but now using different batch sizes during training. Try batch sizes of 64, 128 and 256
# Define different batch sizes to experiment with
batch_sizes = [64, 128, 256]
model_histories = {}
# Ensure clear separation in the output for each batch size experiment
print("\nStarting batch size experiments...\n" + "-"*50)
# Iterate over different batch sizes
for batch_size in batch_sizes:
    print(f"\nTraining model with batch size: {batch_size}")
    # Define the model architecture
    model = Sequential([
        Flatten(input_shape=(28, 28)),
        Dense(256, activation='relu'),
        Dense(128, activation='relu'),
        Dense(10, activation='softmax')
    ])
    # Compile the model
    model.compile(optimizer='Adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    # Fit the model
    model_history = model.fit(X_train, y_train,
                              batch_size=batch_size,
                              epochs=10,
                              validation_data=(X_val, y_val))
    # Store the history of each model training session
    model_histories[batch_size] = model_history
    # Clear the TensorFlow backend to reset model state
    tf.keras.backend.clear_session()
Starting batch size experiments... -------------------------------------------------- Training model with batch size: 64 Epoch 1/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.7090 - loss: 0.8417 - val_accuracy: 0.8372 - val_loss: 0.4454 Epoch 2/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8377 - loss: 0.4497 - val_accuracy: 0.8459 - val_loss: 0.4117 Epoch 3/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.8548 - loss: 0.3882 - val_accuracy: 0.8602 - val_loss: 0.3811 Epoch 4/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8697 - loss: 0.3507 - val_accuracy: 0.8693 - val_loss: 0.3705 Epoch 5/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8777 - loss: 0.3224 - val_accuracy: 0.8689 - val_loss: 0.3729 Epoch 6/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8891 - loss: 0.3007 - val_accuracy: 0.8758 - val_loss: 0.3633 Epoch 7/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.8977 - loss: 0.2802 - val_accuracy: 0.8771 - val_loss: 0.3581 Epoch 8/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.9022 - loss: 0.2663 - val_accuracy: 0.8785 - val_loss: 0.3524 Epoch 9/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9027 - loss: 0.2560 - val_accuracy: 0.8854 - val_loss: 0.3441 Epoch 10/10 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9107 - loss: 0.2368 - val_accuracy: 0.8849 - val_loss: 0.3502 Training model with batch size: 128 Epoch 1/10 157/157 ━━━━━━━━━━━━━━━━━━━━ 4s 10ms/step - accuracy: 0.6681 - loss: 0.9627 - val_accuracy: 0.8347 - val_loss: 0.4811 Epoch 2/10 157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8354 - loss: 0.4642 - val_accuracy: 0.8495 - val_loss: 0.4428 Epoch 3/10 157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8549 - loss: 0.4044 - val_accuracy: 0.8517 - val_loss: 0.4180 Epoch 4/10 157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8705 - loss: 0.3661 - val_accuracy: 0.8609 - val_loss: 0.3929 Epoch 5/10 157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 
4ms/step - accuracy: 0.8790 - loss: 0.3386 - val_accuracy: 0.8696 - val_loss: 0.3704 Epoch 6/10 157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8892 - loss: 0.3167 - val_accuracy: 0.8741 - val_loss: 0.3583 Epoch 7/10 157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8937 - loss: 0.2953 - val_accuracy: 0.8738 - val_loss: 0.3563 Epoch 8/10 157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.9002 - loss: 0.2789 - val_accuracy: 0.8787 - val_loss: 0.3463 Epoch 9/10 157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.9033 - loss: 0.2657 - val_accuracy: 0.8801 - val_loss: 0.3489 Epoch 10/10 157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.9073 - loss: 0.2530 - val_accuracy: 0.8767 - val_loss: 0.3649 Training model with batch size: 256 Epoch 1/10 79/79 ━━━━━━━━━━━━━━━━━━━━ 4s 22ms/step - accuracy: 0.6551 - loss: 1.0585 - val_accuracy: 0.8250 - val_loss: 0.5056 Epoch 2/10 79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8269 - loss: 0.4931 - val_accuracy: 0.8492 - val_loss: 0.4396 Epoch 3/10 79/79 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8507 - loss: 0.4225 - val_accuracy: 0.8596 - val_loss: 0.4106 Epoch 4/10 79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8638 - loss: 0.3876 - val_accuracy: 0.8635 - val_loss: 0.3892 Epoch 5/10 79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8707 - loss: 0.3626 - val_accuracy: 0.8678 - val_loss: 0.3786 Epoch 6/10 79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8796 - loss: 0.3366 - val_accuracy: 0.8727 - val_loss: 0.3647 Epoch 7/10 79/79 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8897 - loss: 0.3146 - val_accuracy: 0.8744 - val_loss: 0.3607 Epoch 8/10 79/79 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.8926 - loss: 0.2985 - val_accuracy: 0.8743 - val_loss: 0.3561 Epoch 9/10 79/79 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.8965 - loss: 0.2854 - val_accuracy: 0.8745 - val_loss: 0.3590 Epoch 10/10 79/79 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - accuracy: 0.9010 - loss: 0.2737 - val_accuracy: 
0.8746 - val_loss: 0.3602
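The step counts in the log above (313, 157 and 79 per epoch) follow from the batch size: one epoch needs the number of samples divided by the batch size, rounded up, in weight updates. The sketch below is only an illustration. It assumes about 20,000 training samples, a figure inferred from the step counts and not stated in this section.

```python
import math

# Assumption: roughly 20,000 training samples (inferred from the log, not given in the text)
n_train_samples = 20_000


def steps_per_epoch(n_samples, batch_size):
    """Number of mini-batches (weight updates) needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


steps = {bs: steps_per_epoch(n_train_samples, bs) for bs in (64, 128, 256)}
for batch_size, n_steps in steps.items():
    print(f"batch size {batch_size}: {n_steps} steps per epoch")
```

Larger batches mean fewer updates per epoch, which is one reason the larger batch sizes improve more slowly over the same 10 epochs.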
15. Find the best batch size and plot the training and validation accuracy and loss.
best_acc = 0  # avoid naming this variable `max`, which would shadow the built-in
for key, hist in model_histories.items():
    # history['accuracy'] holds the training accuracy per epoch; [-1] is the last epoch
    final_acc = hist.history['accuracy'][-1]
    print(key, final_acc)
    if final_acc > best_acc:
        best_acc = final_acc
        max_key = key
print("Final training accuracy for the best model is:", best_acc)
64 0.9122999906539917
128 0.9097999930381775
256 0.9027000069618225
Final training accuracy for the best model is: 0.9122999906539917
plot_training_history(model_histories[max_key])
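The loop above reads the `accuracy` key, which Keras uses for training accuracy. To pick the batch size by validation accuracy, read `val_accuracy` instead. The sketch below shows that variant. It does not use real Keras `History` objects: `histories_sketch` and `final_metric` are hypothetical names, and the numbers are the last-epoch values from the training log and printed output above, rounded to four decimals.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for Keras History objects, filled with the
# last-epoch values shown in the log above (rounded to four decimals).
histories_sketch = {
    64: SimpleNamespace(history={"accuracy": [0.9123], "val_accuracy": [0.8849]}),
    128: SimpleNamespace(history={"accuracy": [0.9098], "val_accuracy": [0.8767]}),
    256: SimpleNamespace(history={"accuracy": [0.9027], "val_accuracy": [0.8746]}),
}


def final_metric(batch_size, metric):
    """Last-epoch value of `metric` for the model trained with `batch_size`."""
    return histories_sketch[batch_size].history[metric][-1]


# Sort batch sizes by final validation accuracy and take the highest one.
ranked = sorted(histories_sketch, key=lambda bs: final_metric(bs, "val_accuracy"))
best_by_val = ranked[-1]

print("Best batch size by validation accuracy:", best_by_val)
print("Its validation accuracy:", final_metric(best_by_val, "val_accuracy"))
```

In this run both criteria select the same batch size, but that will not always be the case, so validation accuracy is the safer criterion for model selection.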
Fine-tuning hyperparameters is one of the main steps in deep learning. As you have noticed, there are several hyperparameters that we can optimise. Suppose we want to try the following values:
- Dropout rate: 0.3 and 0.2.
- Learning rate: 0.001 and 0.01.
- Number of neurons: 256 and 128.
What is Dropout? Dropout is a regularization technique that randomly “drops” (i.e., disables) a fraction of neurons during training (0.3 → 30% of neurons are turned off). It prevents overfitting by making the model less dependent on any one neuron, and it encourages the model to learn redundant and robust features.
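The effect of dropout can be illustrated without Keras. The sketch below is a plain-Python illustration of "inverted" dropout, where the surviving activations are scaled by 1 / (1 - rate) so that the expected activation stays the same. The function name `dropout`, the seed and the input values are assumptions made for this example.

```python
import random


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; rescale the survivors."""
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


rng = random.Random(0)       # fixed seed so the sketch is repeatable
activations = [1.0] * 10_000  # hypothetical layer output, all ones

dropped = dropout(activations, rate=0.3, rng=rng)
zero_fraction = dropped.count(0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)

print(f"fraction of neurons dropped: {zero_fraction:.3f}")
print(f"mean activation after dropout: {mean_after:.3f}")
```

About 30% of the values become zero, while the mean stays close to the original value of 1. Dropout is only active during training; at prediction time all neurons are used.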
What is Learning Rate? It controls how large a step the optimizer takes when updating the model’s weights.
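To see why the step size matters, the sketch below minimises the toy function f(w) = w² with plain gradient descent. This is a simplification: the practical uses the Adam optimizer, which adapts its steps, and the function, starting point and the learning rate 1.1 are assumptions chosen only to make the behaviour visible.

```python
def gradient_descent(learning_rate, steps=20, w=5.0):
    """Minimise f(w) = w**2 starting from w; returns the final w."""
    for _ in range(steps):
        grad = 2.0 * w                 # derivative of w**2
        w = w - learning_rate * grad   # step against the gradient
    return w


small = gradient_descent(0.001)   # tiny steps: barely moves in 20 updates
medium = gradient_descent(0.1)    # reasonable steps: gets close to the minimum at 0
too_large = gradient_descent(1.1)  # oversized steps: overshoots and diverges

print(f"lr=0.001 -> w={small:.4f}")
print(f"lr=0.1   -> w={medium:.4f}")
print(f"lr=1.1   -> w={too_large:.4f}")
```

A rate that is too small learns slowly, and one that is too large can make the loss grow instead of shrink, which is why the learning rate is worth tuning.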
What Are Neurons? Neurons are the units in a layer that process information (via weights and activation functions).
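As a small illustration of that definition, a single neuron computes a weighted sum of its inputs, adds a bias, and applies an activation function. The sketch below uses ReLU, the activation used in the hidden layers of this practical. The input values, weights and bias are made up for the example.

```python
def relu(z):
    """ReLU activation: pass positive values through, clip negatives to zero."""
    return z if z > 0 else 0.0


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, followed by ReLU."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return relu(z)


inputs = [0.5, -1.0, 2.0]  # hypothetical input values

active = neuron(inputs, weights=[0.4, 0.3, 0.2], bias=0.1)    # positive sum: neuron fires
silent = neuron(inputs, weights=[-0.4, 0.3, -0.2], bias=0.1)  # negative sum: output is 0

print("active neuron output:", round(active, 4))
print("silent neuron output:", silent)
```

A `Dense(num_units, ...)` layer holds `num_units` such neurons, each with its own weights and bias, all looking at the same inputs.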
Now we will fine-tune the hyperparameters of the neural network model (e.g., number of hidden layers, number of neurons per layer) to optimize performance on the validation set. Follow the steps below.
16. Create a function that takes those hyperparameters as input. In this function you can build the model and compile it as well.
Your function can start like this:
def create_model(num_units, dropout_rate, learning_rate):
from tensorflow.keras.optimizers import Adam

def create_model(num_units, dropout_rate, learning_rate):
    model = Sequential()
    model.add(Flatten(input_shape=(28, 28)))
    model.add(Dense(num_units, activation='relu'))
    model.add(Dropout(dropout_rate))
    model.add(Dense(num_units, activation='relu'))  # Reusing num_units for simplicity
    model.add(Dropout(dropout_rate))
    model.add(Dense(10, activation='softmax'))
    # Pass learning_rate to the optimizer; the string 'Adam' would ignore it and use the default rate
    model.compile(optimizer=Adam(learning_rate=learning_rate), loss='categorical_crossentropy', metrics=['accuracy'])
    return model
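To get a feeling for what `num_units` changes, the sketch below counts the trainable parameters of the architecture built in `create_model` (Flatten, two hidden Dense layers, one output Dense layer with 10 units). The helper names `dense_params` and `count_params` are made up for this illustration. In Keras, `model.summary()` reports the same kind of count.

```python
def dense_params(n_inputs, n_units):
    """A fully connected layer has one weight per input per unit, plus one bias per unit."""
    return n_inputs * n_units + n_units


def count_params(num_units, n_pixels=28 * 28, n_classes=10):
    """Trainable parameters for Flatten -> Dense -> Dense -> Dense(10).

    Flatten and Dropout layers have no trainable parameters.
    """
    return (
        dense_params(n_pixels, num_units)
        + dense_params(num_units, num_units)
        + dense_params(num_units, n_classes)
    )


for units in (128, 256):
    print(f"num_units={units}: {count_params(units):,} trainable parameters")
```

Doubling the number of neurons more than doubles the model size, so larger values of `num_units` give the network more capacity but also more room to overfit.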
17. Create a grid with the different values of the hyperparameters
param_grid = {
    'num_units': [128, 256],        # Different neuron counts
    'dropout_rate': [0.2, 0.3],     # Diverse dropout rates
    'learning_rate': [0.001, 0.01]  # Several learning rates
}
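It can help to see how many candidate models this grid describes. The sketch below enumerates every combination with `itertools.product` from the standard library. The variable names `names` and `combinations` are chosen for this illustration only.

```python
from itertools import product

param_grid = {
    'num_units': [128, 256],
    'dropout_rate': [0.2, 0.3],
    'learning_rate': [0.001, 0.01],
}

# Cartesian product of all value lists: one dict per candidate configuration
names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

print(len(combinations), "combinations in total")
for combo in combinations:
    print(combo)
```

With two values for each of three hyperparameters there are 2 × 2 × 2 = 8 candidates. Adding a fourth hyperparameter with two values would double that number.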
18. Then create a KerasClassifier based on the defined model (model = KerasClassifier(model=create_model, ...)). You can set epochs to 15 and batch_size to 64. Also add the hyperparameters that you will tune as arguments; the value you give each one can be the first from the param_grid.
# Wrap the Keras model so that the scikit-learn search utilities can use it as an estimator
# Hyperparameters to be tuned need to be added as arguments to KerasClassifier from scikeras (https://adriangb.com/scikeras/stable/migration.html#default-arguments-in-build-fn-model)
model = KerasClassifier(model=create_model,
                        epochs=15,
                        batch_size=64,
                        num_units=128,
                        dropout_rate=0.2,
                        learning_rate=0.01,
                        verbose=True)
19. Finally, perform the search using RandomizedSearchCV (https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html), set n_iter to 3, and fit on the training set.
Random search cross-validation looks for good hyperparameters by evaluating the model on random combinations of hyperparameter values. You define a set of hyperparameters and a range of values for each, and combinations are then sampled at random from these ranges. Each sampled combination is scored with cross-validation, and the combination with the best cross-validated performance is selected. The number of combinations that are tried is given by n_iter.
With the few values in our parameter grid, we could also use GridSearchCV. GridSearchCV performs an exhaustive search over all combinations of hyperparameters specified in param_grid and selects the best combination based on cross-validation performance.
However, RandomizedSearchCV is worth knowing, because you may build a model with many hyperparameters and want to try several values for each. An exhaustive search then becomes expensive, and in that case we suggest RandomizedSearchCV.
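The difference in cost between the two strategies can be sketched in plain Python. The code below only imitates the sampling idea and is not scikit-learn's implementation. The seed is an arbitrary assumption, and `n_iter=3` and `cv=5` are the settings used in this practical. To my understanding, when every entry in the grid is a list, RandomizedSearchCV samples combinations without replacement, which `random.sample` mimics here.

```python
import random
from itertools import product

param_grid = {
    'num_units': [128, 256],
    'dropout_rate': [0.2, 0.3],
    'learning_rate': [0.001, 0.01],
}
n_iter, cv = 3, 5

# Every combination an exhaustive grid search would evaluate
all_candidates = [dict(zip(param_grid, values)) for values in product(*param_grid.values())]

# Random search: draw n_iter distinct combinations (seed chosen arbitrarily)
rng = random.Random(42)
sampled = rng.sample(all_candidates, n_iter)

random_search_fits = len(sampled) * cv
grid_search_fits = len(all_candidates) * cv

print("sampled candidates:")
for candidate in sampled:
    print(" ", candidate)
print("model fits with random search:", random_search_fits)
print("model fits with grid search:  ", grid_search_fits)
```

With 5-fold cross-validation, 3 sampled candidates need 15 model fits, while the full grid of 8 candidates would need 40. The price is that the best combination may not be among the sampled ones.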
grid = RandomizedSearchCV(
    estimator=model,                 # This is the model you want to tune.
    param_distributions=param_grid,  # This is a dictionary of hyperparameters you want to search over
    n_iter=3,                        # Number of random combinations
    cv=5,                            # Sets 5-fold cross-validation
    verbose=2,                       # Controls the level of log output during training.
    n_jobs=1                         # Here is where n_jobs should be set, n_jobs=1: use only 1 core
)
grid_result = grid.fit(X_train, y_train)
Fitting 5 folds for each of 3 candidates, totalling 15 fits Epoch 1/15
/usr/local/lib/python3.11/dist-packages/keras/src/layers/reshaping/flatten.py:37: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead. super().__init__(**kwargs)
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.6572 - loss: 0.9639 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8194 - loss: 0.5072 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8335 - loss: 0.4520 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8499 - loss: 0.4071 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8546 - loss: 0.3866 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8647 - loss: 0.3670 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8735 - loss: 0.3465 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8762 - loss: 0.3401 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8782 - loss: 0.3279 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8834 - loss: 0.3134 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8843 - loss: 0.3037 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8863 - loss: 0.2919 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8883 - loss: 0.2912 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8968 - loss: 0.2750 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8953 - loss: 0.2778 63/63 ━━━━━━━━━━━━━━━━━━━━ 1s 6ms/step [CV] END dropout_rate=0.2, learning_rate=0.01, num_units=256; total time= 14.5s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - accuracy: 0.6586 - loss: 0.9627 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8172 - loss: 0.5165 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8418 - loss: 0.4465 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8493 - loss: 0.4106 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8595 - loss: 0.3819 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8691 - loss: 0.3563 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8735 - loss: 0.3465 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8742 - loss: 0.3370 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8807 - loss: 0.3249 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8862 - loss: 0.3071 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8904 - loss: 0.2994 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8931 - loss: 0.2888 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8926 - loss: 0.2823 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8970 - loss: 0.2810 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9008 - loss: 0.2681 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step [CV] END dropout_rate=0.2, learning_rate=0.01, num_units=256; total time= 12.9s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.6531 - loss: 0.9726 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8182 - loss: 0.5144 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8390 - loss: 0.4467 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8510 - loss: 0.4064 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8585 - loss: 0.3813 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8686 - loss: 0.3620 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8722 - loss: 0.3429 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8757 - loss: 0.3266 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8871 - loss: 0.3088 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8819 - loss: 0.3155 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8879 - loss: 0.2962 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8920 - loss: 0.2835 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8911 - loss: 0.2848 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8954 - loss: 0.2718 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8980 - loss: 0.2689 63/63 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step [CV] END dropout_rate=0.2, learning_rate=0.01, num_units=256; total time= 14.5s
Epoch 1/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.6586 - loss: 0.9534 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.8179 - loss: 0.5037 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8432 - loss: 0.4358 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8539 - loss: 0.3946 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8661 - loss: 0.3731 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8754 - loss: 0.3451 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8750 - loss: 0.3324 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8823 - loss: 0.3264 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8902 - loss: 0.3014 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8854 - loss: 0.3022 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8951 - loss: 0.2892 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8964 - loss: 0.2829 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8947 - loss: 0.2748 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9000 - loss: 0.2640 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9028 - loss: 0.2636 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step [CV] END dropout_rate=0.2, learning_rate=0.01, num_units=256; total time= 15.1s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.6570 - loss: 0.9741 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8128 - loss: 0.5160 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8371 - loss: 0.4448 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8524 - loss: 0.4022 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8622 - loss: 0.3734 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8704 - loss: 0.3543 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8766 - loss: 0.3381 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8771 - loss: 0.3303 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8827 - loss: 0.3116 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8856 - loss: 0.3047 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8899 - loss: 0.2944 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8919 - loss: 0.2867 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8957 - loss: 0.2775 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8948 - loss: 0.2715 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9042 - loss: 0.2616 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step [CV] END dropout_rate=0.2, learning_rate=0.01, num_units=256; total time= 14.2s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.6640 - loss: 0.9456 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8168 - loss: 0.5176 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8414 - loss: 0.4432 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8464 - loss: 0.4157 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8536 - loss: 0.3934 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8643 - loss: 0.3667 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8693 - loss: 0.3538 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8748 - loss: 0.3388 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8777 - loss: 0.3262 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8826 - loss: 0.3176 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8818 - loss: 0.3037 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8890 - loss: 0.2930 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8936 - loss: 0.2813 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8939 - loss: 0.2786 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8982 - loss: 0.2721 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step [CV] END dropout_rate=0.2, learning_rate=0.001, num_units=256; total time= 13.2s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - accuracy: 0.6549 - loss: 0.9676 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8157 - loss: 0.5099 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8390 - loss: 0.4440 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8475 - loss: 0.4089 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8623 - loss: 0.3809 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8689 - loss: 0.3605 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8747 - loss: 0.3460 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8768 - loss: 0.3379 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8853 - loss: 0.3119 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8887 - loss: 0.3064 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8932 - loss: 0.2930 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8971 - loss: 0.2837 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8986 - loss: 0.2752 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8954 - loss: 0.2712 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9043 - loss: 0.2610 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step [CV] END dropout_rate=0.2, learning_rate=0.001, num_units=256; total time= 14.0s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - accuracy: 0.6465 - loss: 0.9859 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8177 - loss: 0.5148 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8351 - loss: 0.4447 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8516 - loss: 0.4091 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8628 - loss: 0.3800 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8647 - loss: 0.3616 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8721 - loss: 0.3400 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8793 - loss: 0.3316 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8821 - loss: 0.3189 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8837 - loss: 0.3130 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8890 - loss: 0.2982 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8910 - loss: 0.2898 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8932 - loss: 0.2809 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8947 - loss: 0.2747 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9006 - loss: 0.2643 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step [CV] END dropout_rate=0.2, learning_rate=0.001, num_units=256; total time= 13.2s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.6585 - loss: 0.9721 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8236 - loss: 0.4986 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8449 - loss: 0.4343 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8542 - loss: 0.4050 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8613 - loss: 0.3724 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8674 - loss: 0.3530 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8747 - loss: 0.3392 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8801 - loss: 0.3211 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8815 - loss: 0.3121 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8880 - loss: 0.3004 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8899 - loss: 0.2880 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8919 - loss: 0.2824 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8979 - loss: 0.2726 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8996 - loss: 0.2619 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9022 - loss: 0.2541 63/63 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step [CV] END dropout_rate=0.2, learning_rate=0.001, num_units=256; total time= 14.2s
Epoch 1/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - accuracy: 0.6527 - loss: 0.9671 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8161 - loss: 0.5171 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8361 - loss: 0.4436 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8488 - loss: 0.4069 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8603 - loss: 0.3786 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8617 - loss: 0.3590 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8720 - loss: 0.3443 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8816 - loss: 0.3195 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8824 - loss: 0.3140 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8856 - loss: 0.3086 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8912 - loss: 0.2951 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8967 - loss: 0.2802 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8982 - loss: 0.2726 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9028 - loss: 0.2622 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8971 - loss: 0.2707 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step [CV] END dropout_rate=0.2, learning_rate=0.001, num_units=256; total time= 14.7s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.5623 - loss: 1.2127 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7856 - loss: 0.6047 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8144 - loss: 0.5221 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8289 - loss: 0.4760 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8383 - loss: 0.4567 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8417 - loss: 0.4364 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8458 - loss: 0.4113 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8552 - loss: 0.3932 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8610 - loss: 0.3790 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8586 - loss: 0.3751 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8709 - loss: 0.3581 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8726 - loss: 0.3518 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8718 - loss: 0.3429 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8725 - loss: 0.3387 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8813 - loss: 0.3229 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step [CV] END dropout_rate=0.3, learning_rate=0.01, num_units=128; total time= 14.8s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.5811 - loss: 1.1942 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7838 - loss: 0.5980 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8203 - loss: 0.5193 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8311 - loss: 0.4713 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8347 - loss: 0.4432 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8489 - loss: 0.4220 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8517 - loss: 0.4107 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8557 - loss: 0.3974 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8660 - loss: 0.3717 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8675 - loss: 0.3719 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8677 - loss: 0.3586 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8709 - loss: 0.3419 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8744 - loss: 0.3393 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8785 - loss: 0.3327 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8785 - loss: 0.3251 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step [CV] END dropout_rate=0.3, learning_rate=0.01, num_units=128; total time= 13.4s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.5591 - loss: 1.2297 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7867 - loss: 0.6000 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8156 - loss: 0.5222 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8335 - loss: 0.4775 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8380 - loss: 0.4470 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8488 - loss: 0.4234 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8543 - loss: 0.4022 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8556 - loss: 0.3967 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8674 - loss: 0.3733 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8665 - loss: 0.3665 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8683 - loss: 0.3533 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8690 - loss: 0.3521 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8716 - loss: 0.3453 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8758 - loss: 0.3402 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8739 - loss: 0.3380 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step [CV] END dropout_rate=0.3, learning_rate=0.01, num_units=128; total time= 14.1s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - accuracy: 0.5845 - loss: 1.1694 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7909 - loss: 0.5883 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8236 - loss: 0.5006 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8327 - loss: 0.4652 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8423 - loss: 0.4336 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8497 - loss: 0.4132 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8599 - loss: 0.3915 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8592 - loss: 0.3851 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8645 - loss: 0.3702 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8703 - loss: 0.3589 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8687 - loss: 0.3536 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8748 - loss: 0.3449 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8710 - loss: 0.3423 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8767 - loss: 0.3326 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8827 - loss: 0.3216 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step [CV] END dropout_rate=0.3, learning_rate=0.01, num_units=128; total time= 17.7s Epoch 1/15
250/250 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.5786 - loss: 1.1728 Epoch 2/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7861 - loss: 0.5920 Epoch 3/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8129 - loss: 0.5264 Epoch 4/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8281 - loss: 0.4717 Epoch 5/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8373 - loss: 0.4440 Epoch 6/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8390 - loss: 0.4259 Epoch 7/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8538 - loss: 0.4044 Epoch 8/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8576 - loss: 0.3841 Epoch 9/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8615 - loss: 0.3668 Epoch 10/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8654 - loss: 0.3632 Epoch 11/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8703 - loss: 0.3555 Epoch 12/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8733 - loss: 0.3449 Epoch 13/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8727 - loss: 0.3356 Epoch 14/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8767 - loss: 0.3289 Epoch 15/15 250/250 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8806 - loss: 0.3187 63/63 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step [CV] END dropout_rate=0.3, learning_rate=0.01, num_units=128; total time= 16.8s Epoch 1/15
313/313 ━━━━━━━━━━━━━━━━━━━━ 5s 9ms/step - accuracy: 0.6787 - loss: 0.9059 Epoch 2/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8254 - loss: 0.4849 Epoch 3/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8426 - loss: 0.4301 Epoch 4/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8566 - loss: 0.3944 Epoch 5/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8673 - loss: 0.3676 Epoch 6/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8674 - loss: 0.3511 Epoch 7/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8779 - loss: 0.3331 Epoch 8/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8816 - loss: 0.3216 Epoch 9/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8854 - loss: 0.3099 Epoch 10/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8857 - loss: 0.3019 Epoch 11/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.8893 - loss: 0.2932 Epoch 12/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8937 - loss: 0.2808 Epoch 13/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8952 - loss: 0.2706 Epoch 14/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8972 - loss: 0.2721 Epoch 15/15 313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8976 - loss: 0.2624
20. Print the optimal hyperparameters and the best score that was obtained with those parameters
print(grid_result.best_score_)
print(grid_result.best_params_)
0.873
{'num_units': 256, 'learning_rate': 0.01, 'dropout_rate': 0.2}
test_accuracy = grid.score(X_test, y_test)
test_accuracy
157/157 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step
0.8679
Optional. Now go back to question 16 and try to add more parameters and check if the optimal parameters change.
End of the Practical