
Introduction
Keras is one of the most popular go-to Python libraries/APIs for beginners and professionals in deep learning. Although it started as a stand-alone project by François Chollet, it has been integrated natively into TensorFlow starting in Version 2.0. Read more about it here.
As the official doc says, it is "an API designed for human beings, not machines" as it "follows best practices for reducing cognitive load".

One of the situations, where the cognitive load is sure to increase, is hyperparameter tuning. Although there are so many supporting libraries and frameworks for handling it, for simple grid searches, we can always rely on some built-in goodies in Keras.
In this article, we will quickly look at one such internal tool and examine what we can do with it for hyperparameter tuning and search.
Scikit-learn cross-validation and grid search
Almost every Python machine-learning practitioner is intimately familiar with the Scikit-learn library and its beautiful API, with simple methods like fit, get_params, and predict.
The library also offers extremely useful methods for cross-validation, model selection, pipelining, and grid search. If you look around, you will find plenty of examples of using these API methods for classical ML problems. But how do you use the same APIs for a deep learning problem?
When Keras enmeshes with Scikit-learn
Keras offers a couple of special wrapper classes – for both regression and classification problems – so that you can utilize the full power of the APIs that are native to Scikit-learn.
In this article, let me show you an example of using simple k-fold cross-validation and exhaustive grid search with a Keras classifier model. It utilizes an implementation of the Scikit-learn classifier API for Keras.
The Jupyter notebook demo can be found here in my Github repo.
Start with a model generating function
For this to work properly, we should create a simple function to synthesize and compile a Keras model with some tunable arguments built-in. Here is an example,
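A minimal sketch could look like this – the network architecture and the default argument values are my illustrative choices, not necessarily the exact ones from the notebook,

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def create_model(activation='relu', optimizer='adam'):
    # A small fully-connected network for the 8-feature Pima dataset;
    # the layer sizes are illustrative choices
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation=activation))
    model.add(Dense(8, activation=activation))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])
    return model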

Data
For this demo, we are using the popular Pima Indians Diabetes dataset. This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. So, it is a binary classification task.
- We create the feature and target vectors – X and Y
- We scale the feature vector using a scaling API from Scikit-learn, such as MinMaxScaler. We call this X_scaled. Both steps are sketched in the code below.
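A minimal version of this preprocessing could look like the following – the CSV file name and the column layout are assumptions on my part,

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Assumed file layout: 8 diagnostic measurements followed by the outcome column
data = pd.read_csv('pima-indians-diabetes.csv')
X = data.iloc[:, :-1].values
Y = data.iloc[:, -1].values

# Scale every feature to the [0, 1] range
X_scaled = MinMaxScaler().fit_transform(X)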
That’s it for data preprocessing. We can pass X_scaled and Y directly to the special classes we will build next.
The KerasClassifier class
This is the special wrapper class from Keras that enmeshes the Scikit-learn classifier API with Keras parametric models. We can pass on various model parameters corresponding to the create_model function, along with other hyperparameters like the number of epochs and the batch size, to this class.
Here is how we create it,
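(the fixed epoch count and batch size below are illustrative values)

from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

# Fix epochs and batch size for now; they become hyperparameters later
model = KerasClassifier(build_fn=create_model, epochs=100, batch_size=10, verbose=0)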

Note how we pass our model creation function as the build_fn argument. This is an example of using a function as a first-class object in Python, where you can pass functions as regular parameters to other classes or functions.
For now, we have fixed the batch size and the number of epochs we want to run our model for, because we just want to run cross-validation on this model. Later, we will turn these into hyperparameters and do a grid search to find the best combination.
10-fold cross-validation
Building a 10-fold cross-validation estimator is easy with the Scikit-learn API. Here is the code. Note how we import the estimators from the model_selection module of Scikit-learn.
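For instance (the shuffle and random seed settings are my choices),

from sklearn.model_selection import StratifiedKFold

# 10-fold stratified splitter; shuffling with a fixed seed for reproducibility
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)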

Then, we can simply run the model with this code, where we pass the KerasClassifier object we built earlier along with the feature and target vectors. The important parameter here is cv, where we pass the kfold object we built above. This tells the cross_val_score estimator to run the Keras model with the given data in a 10-fold stratified cross-validation setting.
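A sketch of that call,

from sklearn.model_selection import cross_val_score

# Runs 10 separate fit/evaluate cycles, one per stratified fold
cv_results = cross_val_score(model, X_scaled, Y, cv=kfold)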

The output cv_results is a simple NumPy array of all the accuracy scores. Why accuracy? Because that’s what we chose as the metric when compiling the model. We could have chosen any other classification metric, like precision or recall, and in that case that metric would have been calculated and stored in the cv_results array.
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
We can easily calculate the average and standard deviation of the 10-fold CV run to estimate the stability of the model predictions. This is one of the primary utilities of a cross-validation run.
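For example,

# Mean and spread of the 10 fold-wise accuracy scores
print('Mean CV accuracy:', cv_results.mean())
print('Std. dev. of CV accuracy:', cv_results.std())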

Beefing up the model creation function for grid search
Exhaustive (or randomized) grid search is often a common practice for hyperparameter tuning or to gain insights into the working of a machine learning model. Deep learning models, being endowed with a lot of hyperparameters, are prime candidates for such a systematic search.
In this example, we will search over the following hyperparameters,
- activation function
- optimizer type
- initialization method
- batch size
- number of epochs
Needless to say, we have to add the first three of these parameters to our model definition.
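A sketch of the beefed-up function – again, the network architecture itself is an illustrative choice,

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def create_model(activation='relu', optimizer='adam', init='uniform'):
    # The same small network, now with the activation, optimizer, and
    # weight initialization exposed as tunable arguments
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer=init, activation=activation))
    model.add(Dense(8, kernel_initializer=init, activation=activation))
    model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
    model.compile(loss='binary_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])
    return model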

Then, we create the same kind of KerasClassifier object as before,
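this time leaving out the fixed epochs and batch size, since those will come from the search grid,

from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

model = KerasClassifier(build_fn=create_model, verbose=0)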

The search space
We choose three options for each hyperparameter, making the exhaustive search space size 3×3×3×3×3=243.
Note that the actual number of Keras runs will also depend on the number of cross-validation folds we choose, as cross-validation will be run for each of these combinations.
Here are the choices,
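The exact values below are illustrative stand-ins for the ones used in the notebook,

# Three choices per hyperparameter: 3 x 3 x 3 x 3 x 3 = 243 combinations
activations = ['relu', 'tanh', 'sigmoid']
optimizers = ['adam', 'rmsprop', 'sgd']
inits = ['uniform', 'normal', 'glorot_uniform']
batch_sizes = [10, 20, 40]
epochs = [10, 50, 100]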

That’s a lot of dimensions to search over!

Enmeshing Scikit-learn GridSearchCV with Keras
We have to create a dictionary of search parameters and pass it on to the Scikit-learn GridSearchCV estimator. Here is the code,
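in sketch form, building on the choice lists defined above,

from sklearn.model_selection import GridSearchCV

# Dictionary keys must match the argument names of create_model
# (activation, optimizer, init) and of KerasClassifier (batch_size, epochs)
param_grid = dict(activation=activations,
                  optimizer=optimizers,
                  init=inits,
                  batch_size=batch_sizes,
                  epochs=epochs)

grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, verbose=2)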

By default, GridSearchCV runs a 5-fold cross-validation if the cv parameter is not specified explicitly (from Scikit-learn v0.22 onwards). Here, we keep it at 3 to reduce the total number of runs.
It is advisable to set the verbosity of GridSearchCV to 2 to keep a visual track of what’s going on. Remember to keep verbose=0 for the main KerasClassifier class, though, as you probably don’t want to display all the gory details of training individual epochs.
Then, just fit!
As we have all come to appreciate the beautifully uniform API of Scikit-learn, it is time to call upon that power and just say fit to search through the whole space!
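In sketch form, with the objects we built above,

# Exhaustive search: 243 combinations x 3 CV folds = 729 model fits
grid_result = grid.fit(X_scaled, Y)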


Grab a cup of coffee because this may take a while depending on the deep learning model architecture, dataset size, search space complexity, and your hardware configuration.
In total, there will be 729 fittings of the model, 3 cross-validation runs for each of the 243 parametric combinations.
If you don’t like full grid search, you can always try the randomized search (RandomizedSearchCV) from the same Scikit-learn stable!
What does the result look like? Just like you would expect from a Scikit-learn estimator, with all the goodies stored for your exploration.

What can you do with the result?
You can explore and analyze the results in a number of ways based on your research interest or business goal.
What’s the combination for the best accuracy?
This is probably at the top of your mind. Just print it using the best_score_ and best_params_ attributes of the GridSearchCV estimator.
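For example,

print('Best: %f using %s' % (grid_result.best_score_, grid_result.best_params_))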

We did the initial 10-fold cross-validation using ReLU activation and the Adam optimizer and got an average accuracy of 0.691. After doing an exhaustive grid search, we discovered that tanh activation and the rmsprop optimizer could have been better choices for this problem. We got better accuracy!
Extract all the results in a DataFrame
Many a time, we may want to analyze the statistical nature of the performance of a deep learning model under a wide range of hyperparameters. To that end, it is extremely easy to create a Pandas DataFrame from the grid search results and analyze them further.
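A sketch,

import pandas as pd

# cv_results_ is a dict of arrays with one entry per parameter combination
cv_results_df = pd.DataFrame(grid_result.cv_results_)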

The resulting DataFrame has one row per hyperparameter combination, with columns such as mean_test_score and std_test_score alongside the individual parameter settings.
Analyze visually
We can create beautiful visualizations from this dataset to examine and analyze what choice of hyperparameters improves the performance and reduces the variation.
Here is a set of violin plots of the mean accuracy created with Seaborn from the grid search dataset.
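Such a plot can be generated along these lines – the choice of param_activation for the x-axis is just one example; any of the param_* columns works,

import seaborn as sns
import matplotlib.pyplot as plt

# Distribution of mean CV accuracy for each activation choice
sns.violinplot(x='param_activation', y='mean_test_score', data=cv_results_df)
plt.show()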

Summary and further thoughts
In this article, we went over how to use the powerful Scikit-learn wrapper API, provided by the Keras library, to do 10-fold cross-validation and a hyperparameter grid search for achieving the best accuracy for a binary classification problem.
Using this API, it is possible to enmesh the best tools and techniques of a Scikit-learn-based general-purpose ML pipeline with Keras models. This approach has huge potential to save a practitioner a lot of time and effort that would otherwise be spent writing custom code for cross-validation, grid search, and pipelining with Keras models.
Again, the demo code for this example can be found here. Other related deep learning tutorials can be found in the same repository. Please feel free to star and fork the repository if you like.
You can check the author’s GitHub repositories for code, ideas, and resources in machine learning and Data Science. If you are, like me, passionate about AI/machine learning/data science, please feel free to add me on LinkedIn or follow me on Twitter.