
Hyperparameter tuning is an essential step in developing robust predictive models. After all, sticking with the default hyperparameters rarely gets a model to its peak performance.
This raises the question: which method is best suited to finding the optimal hyperparameters for a given model?
Here, we delve into 3 popular approaches for hyperparameter tuning and determine which one is superior.
Grid search
The grid search is the most common hyperparameter tuning approach given its simple and straightforward procedure. It is an uninformed search method, which means that it does not learn from its previous iterations.
Using this method entails testing every unique combination of hyperparameters in the search space to determine the combination that yields the best performance.
It’s easy to see the benefits of such a brute-force method; what better way to find the best solution than to try all of them out?
Unfortunately, this approach does not scale well; an increase in the size of the hyperparameter search space will result in an exponential rise in run time and computation.
Random search
The random search is also an uninformed search method that treats iterations independently.
However, instead of evaluating every hyperparameter set in the search space, it evaluates a user-specified number of hyperparameter sets chosen at random.
Since it performs fewer trials in hyperparameter tuning, the method requires less computation and run time than the grid search.
Unfortunately, since the random search tests hyperparameter sets at random, it runs the risk of missing the ideal set of hyperparameters and forgoing peak model performance.
Bayesian Optimization
Unlike the grid search and the random search, which treat iterations independently, Bayesian optimization is an informed search method, meaning that it learns from previous iterations. As with the random search, the number of trials is determined by the user.
As the name suggests, the process is based on Bayes’ theorem:

P(A | B) = P(B | A) · P(A) / P(B)
For this use case, the theorem can be modified to the following:

P(score | hyperparameters) = P(hyperparameters | score) · P(score) / P(hyperparameters)
Simply put, this method builds a probabilistic model that maps hyperparameters to the probability of achieving a given score.
Instead of painstakingly trying every hyperparameter set or testing hyperparameter sets at random, Bayesian optimization uses the results of previous iterations to decide which hyperparameter set to evaluate next, allowing it to converge toward the optimal hyperparameters. Thus, the best hyperparameters can be obtained without exploring the entire search space.
With the Bayesian optimization method, users do not have to endure long run times that come from evaluating every hyperparameter set. They also do not have to incorporate randomness and risk missing the optimal solution.
That being said, Bayesian optimization does have its own drawback. Since this is an informed search method, additional time is required to determine the next hyperparameters to evaluate based on the results of the previous iterations. In exchange for minimizing the number of trials, Bayesian optimization spends more time on each iteration.
Case Study
We have explored the ins and outs of the three hyperparameter tuning approaches. To consolidate our understanding of these methods, it is best to use an example.
Let’s fine-tune a classification model with all three approaches and determine which one yields the best results.
For this exercise, we will use the load_digits dataset from the Sklearn module.
The goal is to fine-tune a random forest model with the grid search, random search, and Bayesian optimization.
Each method will be evaluated based on:
- The total number of trials executed
- The number of trials needed to yield the optimal hyperparameters
- The score of the model (the F1 score in this case)
- The run time
The random forest classifier object and the search space are shown below:
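Here is a minimal sketch of that setup. The specific values in the search space below are illustrative choices on my part, picked over six random forest hyperparameters so that the grid sizes multiply out to 810 combinations:

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

# Load the digits dataset (1,797 8x8 images of handwritten digits)
X, y = load_digits(return_X_y=True)

# The model to be tuned
clf = RandomForestClassifier(random_state=42)

# Hyperparameter search space: 5 x 3 x 2 x 3 x 3 x 3 = 810 combinations
search_space = {
    "n_estimators": [50, 100, 150, 200, 250],
    "max_depth": [5, 10, None],
    "criterion": ["gini", "entropy"],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "max_features": ["sqrt", "log2", None],
}
```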
Altogether, there are 810 unique hyperparameter combinations.
1. Grid Search
First, let’s obtain the optimal hyperparameters using the grid search method and time the process. Of course, this means that we will test all 810 hyperparameter sets and pick out the one that yields the best results.
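A minimal sketch of this step with Scikit-learn’s GridSearchCV, reusing the clf and search_space objects from above and assuming 5-fold cross-validation with the weighted F1 score as the metric:

```python
import time

from sklearn.model_selection import GridSearchCV

start = time.time()

# Exhaustively evaluate all 810 combinations, scoring with the weighted F1 score
grid_search = GridSearchCV(
    estimator=clf,
    param_grid=search_space,
    scoring="f1_weighted",
    cv=5,
    n_jobs=-1,
)
grid_search.fit(X, y)

print(f"Best params: {grid_search.best_params_}")
print(f"Best F1 score: {grid_search.best_score_:.4f}")
print(f"Run time: {time.time() - start:.2f} s")
```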
2. Random Search
Next, we will use the random search to identify the optimal hyperparameters and time the process. The search is limited to 100 trials.
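A corresponding sketch with RandomizedSearchCV, again assuming 5-fold cross-validation and the weighted F1 score, with n_iter set to 100:

```python
import time

from sklearn.model_selection import RandomizedSearchCV

start = time.time()

# Randomly sample and evaluate 100 of the 810 combinations
random_search = RandomizedSearchCV(
    estimator=clf,
    param_distributions=search_space,
    n_iter=100,
    scoring="f1_weighted",
    cv=5,
    n_jobs=-1,
    random_state=42,
)
random_search.fit(X, y)

print(f"Best params: {random_search.best_params_}")
print(f"Best F1 score: {random_search.best_score_:.4f}")
print(f"Run time: {time.time() - start:.2f} s")
```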
3. Bayesian Optimization
Finally, we perform hyperparameter tuning with Bayesian optimization and time the process. In Python, this can be accomplished with the Optuna module.
Its syntax differs from that of Sklearn, but it performs the same operation.
For the sake of consistency, we will use 100 trials in this procedure as well.
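A minimal Optuna sketch, reusing the illustrative search space from above; the objective maximizes the mean cross-validated F1 score, and Optuna’s default sampler (TPE) handles the informed part of the search:

```python
import time

import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def objective(trial):
    # Suggest one hyperparameter set from the same (illustrative) search space
    params = {
        "n_estimators": trial.suggest_categorical("n_estimators", [50, 100, 150, 200, 250]),
        "max_depth": trial.suggest_categorical("max_depth", [5, 10, None]),
        "criterion": trial.suggest_categorical("criterion", ["gini", "entropy"]),
        "min_samples_split": trial.suggest_categorical("min_samples_split", [2, 5, 10]),
        "min_samples_leaf": trial.suggest_categorical("min_samples_leaf", [1, 2, 4]),
        "max_features": trial.suggest_categorical("max_features", ["sqrt", "log2", None]),
    }
    model = RandomForestClassifier(random_state=42, **params)
    # The mean cross-validated F1 score is the quantity Optuna tries to maximize
    return cross_val_score(model, X, y, scoring="f1_weighted", cv=5, n_jobs=-1).mean()


start = time.time()
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100)

print(f"Best params: {study.best_params}")
print(f"Best F1 score: {study.best_value:.4f}")
print(f"Run time: {time.time() - start:.2f} s")
```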
Now that we have executed hyperparameter tuning with all three approaches, let’s see how the results of each method compare to each other.
For convenience, we will store the results of all 3 hyperparameter tuning procedures in a single data frame.
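A sketch of how that table could be assembled, assuming the search objects from the snippets above (the recorded run times can be added as another column in the same way):

```python
import pandas as pd

# Pull the headline numbers from the three search objects into one comparison table
results = pd.DataFrame(
    {
        "method": ["Grid search", "Random search", "Bayesian optimization"],
        "total_trials": [810, 100, 100],
        "best_trial_index": [
            grid_search.best_index_,
            random_search.best_index_,
            study.best_trial.number,
        ],
        "best_f1": [
            grid_search.best_score_,
            random_search.best_score_,
            study.best_value,
        ],
    }
)
print(results)
```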

The grid search registered the highest score (jointly with the Bayesian optimization method). However, it required carrying out all 810 trials and only found the optimal hyperparameters at the 680th iteration. Its run time also far exceeded those of the random search and Bayesian optimization.
The random search ran only 100 trials and found its best hyperparameter set after just 36 iterations. It also took the least amount of time to execute. However, it registered the lowest score of the 3 methods.
Bayesian optimization also performed 100 trials but reached the highest score after only 67 iterations, far fewer than the grid search’s 680. Although it executed the same number of trials as the random search, it had a longer run time since it is an informed search method.
Which method is the best?
Given that the grid search, random search, and Bayesian optimization all have their own trade-off between run time, the number of iterations, and performance, is it really possible to come to a consensus on which method is the best?
Probably not.
After all, the ideal hyperparameter tuning method depends on the use case.
Ask yourself:
- What are the constraints of your machine learning task?
- Does your project prioritize maximizing performance or minimizing run time and/or the number of iterations?
Answering these questions will help decide the most suitable hyperparameter optimization approach.
The grid search is ideal if computational demand and run time are not limiting factors.
The random search is suitable if you’re willing to sacrifice some performance in exchange for fewer iterations and a shorter run time (theoretically, the random search could still land on the best hyperparameters, but that is left entirely up to chance).
Bayesian optimization is the best fit if you wish to obtain the optimal hyperparameters in fewer trials but are willing to accept a longer run time for each iteration.
My 2 cents
If you hate diplomatic answers and just want my personal opinion, I would say that I usually favor Bayesian optimization.
Given the run time needed to fine-tune models with larger training data sets and search spaces, I usually shun the grid search. The random search requires fewer iterations and is the fastest of all 3 methods, but its success depends on which hyperparameter sets happen to be selected at random. In some cases, it will pick the optimal hyperparameters; in other cases, it will miss them completely. Due to this inconsistency, I do not like relying on randomness for bigger machine learning tasks.
I prefer the Bayesian optimization approach for its ability to consistently attain the optimal hyperparameters with fewer iterations. Its individual iterations may take more time than those of the uninformed search methods, but that is rarely a deal-breaker for me.
Conclusion

The main takeaway from this analysis is that each hyperparameter tuning method has its own unique trade-off between run time, number of iterations, and performance.
Ultimately, the best approach for you will depend on your priorities and constraints. Having a strong understanding of the data as well as the objectives will ensure that you make the correct decision.
I wish you the best of luck in your Data Science endeavors!