Abstract

This article delves into various techniques for hyperparameter optimization, comparing the efficiency and effectiveness of Grid Search, Random Search, and Latin Hypercube Sampling in the context of gradient boosting models.

Code

import time
import warnings

import matplotlib.pyplot as plt
import numpy as np
import optuna
import pandas as pd
import seaborn as sns
from catboost import CatBoostClassifier, Pool
from matplotlib_inline.backend_inline import set_matplotlib_formats
from scipy.stats import qmc
from sklearn.datasets import fetch_openml
from sklearn.model_selection import ParameterGrid, train_test_split
from tqdm.notebook import tqdm

Code

set_matplotlib_formats("svg")
sns.set_style("darkgrid")
sns.set_context("paper")
sns.set_context(context="notebook", font_scale=1.5)


warnings.filterwarnings("ignore", category=FutureWarning)
FIGSIZE = (12, 6)
N_SAMPLES = 100

1 Introduction

We’ve all been in this situation. The point when we believe our features are good enough, and we’re ready to train our first model. Initially, we start with the default settings of the model, which yields acceptable performance. However, to achieve better results, we determine that hyperparameter tuning is necessary. According to the Scikit-Learn documentation, there are several options: GridSearchCV and RandomizedSearchCV.

Scikit-Learn

1.1 GridSearch

GridSearchCV is a hyperparameter tuning method in Scikit-Learn that utilizes exhaustive search to determine the optimal combination of estimator parameters. It evaluates each set of specified parameter values by trying all possible combinations using cross-validation. The advantages of this approach include exhaustive exploration, providing detailed performance information. However, it is time-consuming and memory-intensive since it tests all combinations. The choice of cross-validation method and the number of folds significantly impact the results.

1.2 RandomSearch

RandomizedSearchCV is a hyperparameter tuning method in Scikit-Learn that utilizes random sampling to find the optimal combination of estimator parameters. In contrast to GridSearchCV, which tests all possible combinations, RandomizedSearchCV samples a fixed number of parameter settings from specified distributions, making it more efficient in exploring the entire search-space. Cross-validation is employed to assess each sampled set, and continuous or categorical distributions can be defined. The chosen distributions, iterations, cross-validation method, and the number of folds all impact the results.

1.3 Something in Between

There are techniques, such as Latin Hypercube Sampling (LHS) or Poisson Disc sampling, that combine grid and random sampling methods.

Latin Hypercube Sampling is a deterministic, space-filling design technique that selects a representative sample from the parameter space. It does so by ensuring that every hyperrectangle enclosing each unique combination of parameters contains at least one sample. In simpler terms, LHS aims to provide a representative and uniform distribution of samples across the entire range of possible combinations.

Consider the analogy of a Sudoku puzzle. Each square on the board represents a unique combination of parameters. While you may be tempted to reveal all the numbers in one row or column, this would be unwise as it would limit the overall usefulness of your picks. LHS offers a framework for reducing the number of picks while maximizing their effectiveness. For instance, if you have selected a square from one row, it would favor other rows more, and the same logic applies to the columns and sub-squares.

In conclusion, when choosing between hyperparameter tuning methods, it is essential to consider factors like computational resources, time constraints, and the complexity of the problem at hand. GridSearchCV offers exhaustive exploration but comes with higher computational costs, while RandomizedSearchCV provides flexibility and efficiency but may not find the absolute best combination of parameters. Latin Hypercube Sampling offers a balance between these two extremes by providing representative samples while being more computationally efficient than GridSearchCV for large search spaces.

1.4 Visual Representation

To illustrate the distinction between the three sampling methods, let’s consider a hypothetical scenario in a two-dimensional space, ranging from 0 to 1 inclusively, and apply the following three sampling strategies: random, Latin Hypercube Sampling (LHS), and grid.

Code

np.random.seed(99)

uni_sample = np.random.uniform(0, 1, (100, 2))

sampler = qmc.LatinHypercube(d=2, optimization="lloyd")
lhs_sample = sampler.random(n=100)

grid_sample = np.array(
    [[i, j] for i in np.linspace(0, 1, 10) for j in np.linspace(0, 1, 10)]
)

Code

_, ax = plt.subplots(1, 3, figsize=(12, 4))


curr_ax = ax[0]
sns.histplot(
    x=uni_sample[:, 0], y=uni_sample[:, 1], bins=10, ax=curr_ax, cbar=True, thresh=None
)
sns.scatterplot(x=uni_sample[:, 0], y=uni_sample[:, 1], ax=curr_ax, color="tab:orange")
curr_ax.set_xticks([])
curr_ax.set_yticks([])
curr_ax.set_title("uniform sampling")


curr_ax = ax[1]
sns.histplot(
    x=lhs_sample[:, 0], y=lhs_sample[:, 1], bins=10, ax=curr_ax, cbar=True, thresh=None
)
sns.scatterplot(x=lhs_sample[:, 0], y=lhs_sample[:, 1], ax=curr_ax, color="tab:orange")
curr_ax.set_xticks([])
curr_ax.set_yticks([])
curr_ax.set_title("latin hypercube sampling")


curr_ax = ax[2]
sns.histplot(
    x=grid_sample[:, 0],
    y=grid_sample[:, 1],
    bins=10,
    ax=curr_ax,
    cbar=True,
    thresh=None,
)
sns.scatterplot(
    x=grid_sample[:, 0], y=grid_sample[:, 1], ax=curr_ax, color="tab:orange"
)
curr_ax.set_xticks([])
curr_ax.set_yticks([])
curr_ax.set_title("grid sampling")

plt.tight_layout()
plt.show()

The plots above illustrate that the random sampling method selects points randomly from the entire space, which can result in clustering or gaps. In contrast, Latin Hypercube Sampling ensures that each row and column contain a representative sample; however, it still generates a random pattern and disrupts the orthogonality of the grid sampling technique. Finally, the grid sampling method selects points at regular intervals, resulting in a uniform distribution across the space. Nevertheless, it may not be the most efficient method for large search spaces.

2 Apply LHS to a Real Application

Let’s put this knowledge into practice by performing some hyper-parameter tuning of a CatBoost model on the Portuguese Bank Marketing Dataset.

The dataset is related to direct marketing campaigns of a Portuguese banking institution, focusing on phone calls to promote term deposits. The data includes 17 input variables and one output variable. Input attributes consist of client demographics such as age, job type, marital status, education level, credit default status, average yearly balance, housing loan status, personal loan status, contact communication type, last contact day and month, duration of contact, number of contacts during the current campaign, number of days since last contact from a previous campaign, number of contacts before the current campaign, and the outcome of the previous marketing campaign. The goal is to predict if a client will subscribe to a term deposit based on these features.

This dataset represents a binary classification problem where the goal is to predict whether a client will subscribe to a term deposit based on multiple categorical and numerical features, making it an excellent use case for CatBoost, a gradient boosting library specifically designed to handle categorical data.

2.0.1 Dataset

We can utilize the fetch_openml function from scikit-learn to automatically fetch the appropriate dataset. Following this, the data will be split into train and test sets with stratification controlled on the target variable. This ensures that the distribution of the target variable (y) remains equal for both the training and testing sets.

Code

# fetch data set
ID = 1461
X, y = fetch_openml(
    data_id=ID,
    data_home=f"openml_download_{ID}",
    return_X_y=True,
)

# downsample and create a stratified train test split
X = X.sample(frac=0.25)  # control the size of the dataset
X = X.dropna(axis=0, how="any")

# y = y.astype(int) - 1
y = y.loc[X.index]  # align the datasets

X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,
    stratify=y,
)

2.1 Information on the Dataset

The dataset consists of 9 categorical and 7 numerical features, making it suitable for use with CatBoost. This algorithm uses a specialized technique called “ordered target encoding” to convert categorical features into numerical values, which can improve model performance and accuracy in gradient boosting models. This approach is particularly effective when managing datasets with numerous categorical features compared to other machine learning algorithms.

Code

X.info()

<class 'pandas.core.frame.DataFrame'>
Index: 11303 entries, 13121 to 3830
Data columns (total 16 columns):
 #   Column  Non-Null Count  Dtype   
---  ------  --------------  -----   
 0   V1      11303 non-null  int64   
 1   V2      11303 non-null  category
 2   V3      11303 non-null  category
 3   V4      11303 non-null  category
 4   V5      11303 non-null  category
 5   V6      11303 non-null  int64   
 6   V7      11303 non-null  category
 7   V8      11303 non-null  category
 8   V9      11303 non-null  category
 9   V10     11303 non-null  int64   
 10  V11     11303 non-null  category
 11  V12     11303 non-null  int64   
 12  V13     11303 non-null  int64   
 13  V14     11303 non-null  int64   
 14  V15     11303 non-null  int64   
 15  V16     11303 non-null  category
dtypes: category(9), int64(7)
memory usage: 807.6 KB

2.1.1 Creating Data Pools

CatBoost offers the capability to create data pools. This feature enables CatBoost to optimize the handling of large files internally, providing benefits such as defining categorical features in one place for improved reproducibility and subsequent reuse.

Code

train_pool = Pool(
    data=X_train,
    label=y_train,
    cat_features=list(X.select_dtypes(include="category").columns),
)

test_pool = Pool(
    data=X_test,
    label=y_test,
    cat_features=list(X.select_dtypes(include="category").columns),
)

2.2 Applying Three Sampling Techniques: Grid, Random, and Latin Hypercube

In the following section, we apply and compare the results of three sampling techniques: grid search, random search, and Latin Hypercube Sampling (LHS). Our objective is to demonstrate that LHS explores more of the design space than random search while maintaining sufficient randomness to mitigate the symmetry issues inherent in grid search.

The search space is defined by the following parameters:

Depth: 1 to 15
Iterations: 1 to 1024
Subsample: 0.1 to 1
Bagging Temperature: 1 to 1e6

We use 100 samples for each technique.

2.3 Grid Search

To adhere to the limitation of 100 samples, we use \(100^{1/4} \approx 3\), which results in \(3^{4}=81\) samples.

Code

n_grid_samples = np.power(N_SAMPLES, 1 / 4)
n_grid_samples = np.floor(n_grid_samples).astype(int)

param_grid = {
    "depth": np.linspace(1, 15, n_grid_samples, dtype=int),
    "iterations": np.linspace(1, 1024, n_grid_samples, dtype=int),
    "subsample": np.linspace(0.1, 1, n_grid_samples, endpoint=False),
    "bagging_temperature": np.linspace(1, 1000000, n_grid_samples, dtype=int),
}

param_combo = list(ParameterGrid(param_grid=param_grid))

cube_search = []
start = time.time()
for param in param_combo:
    cbc = CatBoostClassifier(
        **param,
        eval_metric="F1",
    )
    cbc.fit(train_pool, eval_set=(test_pool), verbose=False)

    score = cbc.get_best_score().get("validation").get("F1")
    param["score"] = score
    param["iterations"] = cbc.get_best_iteration()

    cube_search.append(param)
end = time.time()
print(f"time grid: {end - start:.2f} sec")
grid_results = pd.DataFrame(cube_search).sort_values(by="score", ascending=False)

time grid: 1538.35 sec

Code

grid_results.head()

	bagging_temperature	depth	iterations	subsample	score
70	1000000	8	258	0.4	0.569507
43	500000	8	258	0.4	0.569507
16	1	8	258	0.4	0.569507
40	500000	8	216	0.4	0.565611
13	1	8	216	0.4	0.565611

2.4 Random Search

For depth and iterations, the model expects integer values; therefore, randint is used. For subsample and bagging temperature, uniform is utilized since the model accepts float values for these parameters.

Code

random_sample = np.hstack(
    (
        np.random.randint(1, 15, (N_SAMPLES, 1)),
        np.random.randint(1, 1024, (N_SAMPLES, 1)),
        np.random.uniform(0, 1, (N_SAMPLES, 1)),
        np.random.uniform(1, 1_000_000, (N_SAMPLES, 1)),
    )
)

param_combo = list(
    pd.DataFrame(
        data=random_sample,
        columns=["depth", "iterations", "subsample", "bagging_temperature"],
    )
    .to_dict("index")
    .values()
)


cube_search = []
start = time.time()
for param in param_combo:
    cbc = CatBoostClassifier(
        **param,
        eval_metric="F1",
    )
    cbc.fit(train_pool, eval_set=(test_pool), verbose=False)

    score = cbc.get_best_score().get("validation").get("F1")
    param["score"] = score
    param["iterations"] = cbc.get_best_iteration()

    cube_search.append(param)

end = time.time()
print(f"time random: {end - start:.2f} sec")
random_results = pd.DataFrame(cube_search).sort_values(by="score", ascending=False)

time random: 572.61 sec

Code

random_results.head()

	depth	iterations	subsample	bagging_temperature	score
58	7.0	437	0.281857	582472.959943	0.576497
11	7.0	343	0.437535	187632.443518	0.575893
91	5.0	713	0.207018	544578.689749	0.568233
89	9.0	214	0.658629	129890.179531	0.566893
65	8.0	369	0.606488	759745.435830	0.565022

2.5 LHS Search

When applying the LHS (Latin Hypercube Sampling) technique, you need to specify the number of dimensions for the resulting sample and the number of samples you want. These parameters are on a different scale than the search space that you want to explore; therefore, a scaling is applied to adapt it to the desired space. Subsequently, a type conversion is applied.

Code

sampler = qmc.LatinHypercube(d=4, optimization="lloyd")
sample = sampler.random(n=N_SAMPLES)

l_bounds = [1, 1, 0, 1]
u_bounds = [15, 1024, 1, 1000000]
sample_scaled = qmc.scale(sample, l_bounds, u_bounds)

# convert to int
sample_scaled[:, :2] = sample_scaled[:, :2].astype(int)


param_combo = list(
    pd.DataFrame(
        data=sample_scaled,
        columns=["depth", "iterations", "subsample", "bagging_temperature"],
    )
    .to_dict("index")
    .values()
)

cube_search = []
start = time.time()
for param in param_combo:
    cbc = CatBoostClassifier(
        **param,
        eval_metric="F1",
    )
    cbc.fit(train_pool, eval_set=(test_pool), verbose=False)

    score = cbc.get_best_score().get("validation").get("F1")
    param["score"] = score
    param["iterations"] = cbc.get_best_iteration()

    cube_search.append(param)

end = time.time()
print(f"time lhs: {end - start:.2f} sec")
lhs_results = pd.DataFrame(cube_search).sort_values(by="score", ascending=False)

time lhs: 527.88 sec

Code

lhs_results.head()

	depth	iterations	subsample	bagging_temperature	score
71	7.0	858	0.183817	282489.633420	0.577007
72	8.0	404	0.949294	368410.292079	0.573913
94	9.0	148	0.490337	655506.162010	0.570787
25	7.0	655	0.760262	385807.469267	0.568966
26	10.0	244	0.710746	567170.475059	0.566210

2.6 Analysis of results

Code

df_all_sample = pd.concat(
    [
        lhs_results.assign(sampling="lhs"),
        random_results.assign(sampling="random"),
        grid_results.assign(sampling="grid"),
    ],
    axis=0,
)

df_all_sample = df_all_sample.sort_values(by="score", ascending=False)

df_all_sample.head(n=10)

	depth	iterations	subsample	bagging_temperature	score	sampling
71	7.0	858	0.183817	282489.633420	0.577007	lhs
58	7.0	437	0.281857	582472.959943	0.576497	random
11	7.0	343	0.437535	187632.443518	0.575893	random
72	8.0	404	0.949294	368410.292079	0.573913	lhs
94	9.0	148	0.490337	655506.162010	0.570787	lhs
16	8.0	258	0.400000	1.000000	0.569507	grid
43	8.0	258	0.400000	500000.000000	0.569507	grid
70	8.0	258	0.400000	1000000.000000	0.569507	grid
25	7.0	655	0.760262	385807.469267	0.568966	lhs
91	5.0	713	0.207018	544578.689749	0.568233	random

Method	Time (sec)	Best Score
Grid	1538	0.569
Random	573	0.576
LHS	528	0.577

Observing the time taken by each method, it is evident that grid search is the slowest, followed by random search and then LHS. This outcome is expected, as grid search is the most exhaustive search method, utilizing a large number of search points with high values for iterations (which control the number of trees in the model). In comparison, random search and LHS are faster than grid search due to having fewer points with high-value iterations. The difference is significant, with a reduction factor in runtime of around 3.

From a scoring perspective, the LHS method identified a set of hyperparameters that achieved the highest score. Random search followed closely behind, while grid search performed the worst.

3 Bonus: Optuna Hyperparameter Optimization Framework

Optuna provides a better way to perform hyperparameter optimization. With Optuna, you can define the search space and objective function, allowing Optuna to handle the rest. It utilizes a Bayesian optimization algorithm to find the best hyperparameters for your model. This dynamically adjust the search space, balancing exploration and exploitation. Resulting in faster convergence towards more optimal hyperparameters.

Code

def obj_all(trial) -> float:
    params = {
        "depth": trial.suggest_int(name="depth", low=1, high=15),
        "iterations": trial.suggest_int("iterations", 1, 1024),
        "subsample": trial.suggest_float("subsample", 0, 1.0),
        "bagging_temperature": trial.suggest_float(
            "bagging_temperature", 1e-10, 1_000_000, log=True
        ),
    }

    cbc = CatBoostClassifier(
        **params,
        eval_metric="F1",
    )
    cbc.fit(train_pool, eval_set=(test_pool), verbose=False)

    return cbc.get_best_score().get("validation").get("F1")


print("-" * 10)

start = time.time()
study = optuna.create_study(study_name="all-in", direction="maximize")
study.optimize(
    obj_all,
    n_trials=N_SAMPLES,
)

end = time.time()
print("-" * 10)
print(f"time all-in: {end - start:.2f} sec")
print(f"{study.best_params=}")
print(f"{study.best_value=}")

[I 2024-04-04 10:56:47,456] A new study created in memory with name: all-in

----------

[I 2024-04-04 10:56:50,953] Trial 0 finished with value: 0.5277777777777778 and parameters: {'depth': 11, 'iterations': 372, 'subsample': 0.6351383146436217, 'bagging_temperature': 2.0916398414638855e-07}. Best is trial 0 with value: 0.5277777777777778.
[I 2024-04-04 10:58:11,784] Trial 1 finished with value: 0.47117794486215536 and parameters: {'depth': 15, 'iterations': 783, 'subsample': 0.16383410905600349, 'bagging_temperature': 732719.8112932495}. Best is trial 0 with value: 0.5277777777777778.
[I 2024-04-04 10:58:14,578] Trial 2 finished with value: 0.5263157894736843 and parameters: {'depth': 4, 'iterations': 684, 'subsample': 0.5639112261921894, 'bagging_temperature': 2.322972999138206e-06}. Best is trial 0 with value: 0.5277777777777778.
[I 2024-04-04 10:58:14,640] Trial 3 finished with value: 0.3699421965317919 and parameters: {'depth': 1, 'iterations': 33, 'subsample': 0.2177368073237087, 'bagging_temperature': 4.4335773709225747e-07}. Best is trial 0 with value: 0.5277777777777778.
[I 2024-04-04 10:58:17,123] Trial 4 finished with value: 0.5150812064965197 and parameters: {'depth': 2, 'iterations': 831, 'subsample': 0.5078487155196426, 'bagging_temperature': 406904.6877287675}. Best is trial 0 with value: 0.5277777777777778.
[I 2024-04-04 10:58:18,638] Trial 5 finished with value: 0.48039215686274506 and parameters: {'depth': 1, 'iterations': 606, 'subsample': 0.9851126712362728, 'bagging_temperature': 0.011758988439886613}. Best is trial 0 with value: 0.5277777777777778.
[I 2024-04-04 10:58:20,206] Trial 6 finished with value: 0.4786729857819905 and parameters: {'depth': 13, 'iterations': 105, 'subsample': 0.036234431969518255, 'bagging_temperature': 5.972119355478038e-07}. Best is trial 0 with value: 0.5277777777777778.
[I 2024-04-04 10:58:22,815] Trial 7 finished with value: 0.5165876777251185 and parameters: {'depth': 2, 'iterations': 878, 'subsample': 0.6510808349126643, 'bagging_temperature': 1.353820597494594e-06}. Best is trial 0 with value: 0.5277777777777778.
[I 2024-04-04 10:58:22,879] Trial 8 finished with value: 0.36416184971098264 and parameters: {'depth': 1, 'iterations': 29, 'subsample': 0.20413071725527043, 'bagging_temperature': 0.11152861532596627}. Best is trial 0 with value: 0.5277777777777778.
[I 2024-04-04 10:58:26,250] Trial 9 finished with value: 0.545045045045045 and parameters: {'depth': 4, 'iterations': 825, 'subsample': 0.4960857700286204, 'bagging_temperature': 1948.5848661377586}. Best is trial 9 with value: 0.545045045045045.
[I 2024-04-04 10:58:31,741] Trial 10 finished with value: 0.5467289719626168 and parameters: {'depth': 7, 'iterations': 1012, 'subsample': 0.8569430773280887, 'bagging_temperature': 106.63598121124134}. Best is trial 10 with value: 0.5467289719626168.
[I 2024-04-04 10:58:37,328] Trial 11 finished with value: 0.5612472160356347 and parameters: {'depth': 7, 'iterations': 1022, 'subsample': 0.860476812328015, 'bagging_temperature': 173.26852844570692}. Best is trial 11 with value: 0.5612472160356347.
[I 2024-04-04 10:58:44,019] Trial 12 finished with value: 0.5707865168539326 and parameters: {'depth': 8, 'iterations': 1024, 'subsample': 0.9707152849954347, 'bagging_temperature': 5.006039579262278}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:58:50,434] Trial 13 finished with value: 0.5587583148558759 and parameters: {'depth': 8, 'iterations': 977, 'subsample': 0.8430383803857151, 'bagging_temperature': 3.6908961037530568}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:58:54,016] Trial 14 finished with value: 0.5345622119815668 and parameters: {'depth': 10, 'iterations': 492, 'subsample': 0.8097225582481731, 'bagging_temperature': 0.00023321465383614702}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:58:59,635] Trial 15 finished with value: 0.5644444444444445 and parameters: {'depth': 7, 'iterations': 1022, 'subsample': 0.9943179849824604, 'bagging_temperature': 5.678506734032542e-10}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:59:01,559] Trial 16 finished with value: 0.5507900677200903 and parameters: {'depth': 10, 'iterations': 258, 'subsample': 0.9989566132540533, 'bagging_temperature': 3.3960887183868437e-10}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:59:04,635] Trial 17 finished with value: 0.5318181818181817 and parameters: {'depth': 5, 'iterations': 689, 'subsample': 0.38157604979961235, 'bagging_temperature': 2.540549994339633e-10}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:59:09,189] Trial 18 finished with value: 0.5529953917050692 and parameters: {'depth': 6, 'iterations': 916, 'subsample': 0.7344330745439471, 'bagging_temperature': 0.0003981853381216049}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:59:12,857] Trial 19 finished with value: 0.5475113122171946 and parameters: {'depth': 10, 'iterations': 500, 'subsample': 0.9309557228022994, 'bagging_temperature': 1.2370252301330071}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:59:22,029] Trial 20 finished with value: 0.5197215777262181 and parameters: {'depth': 12, 'iterations': 674, 'subsample': 0.6943271856123738, 'bagging_temperature': 8.53440013996314e-09}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:59:28,675] Trial 21 finished with value: 0.5592841163310962 and parameters: {'depth': 8, 'iterations': 1015, 'subsample': 0.8946125216803167, 'bagging_temperature': 1776.078786754936}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:59:34,746] Trial 22 finished with value: 0.5498891352549888 and parameters: {'depth': 8, 'iterations': 934, 'subsample': 0.7885522714071026, 'bagging_temperature': 526.2152389556436}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:59:38,985] Trial 23 finished with value: 0.5669642857142857 and parameters: {'depth': 6, 'iterations': 768, 'subsample': 0.9281192174512158, 'bagging_temperature': 19.145347253477336}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:59:43,877] Trial 24 finished with value: 0.5416666666666666 and parameters: {'depth': 5, 'iterations': 756, 'subsample': 0.9340764856693228, 'bagging_temperature': 7.829358518781824}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:59:51,458] Trial 25 finished with value: 0.5661252900232019 and parameters: {'depth': 9, 'iterations': 892, 'subsample': 0.9995829327711736, 'bagging_temperature': 9.277559633633642e-05}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 10:59:59,130] Trial 26 finished with value: 0.5398230088495575 and parameters: {'depth': 9, 'iterations': 884, 'subsample': 0.747429503802476, 'bagging_temperature': 7.409176852450163e-05}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:00:04,029] Trial 27 finished with value: 0.5592841163310962 and parameters: {'depth': 9, 'iterations': 564, 'subsample': 0.9269526451688124, 'bagging_temperature': 0.05023437290000363}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:00:13,922] Trial 28 finished with value: 0.5188679245283019 and parameters: {'depth': 12, 'iterations': 740, 'subsample': 0.3446537828828158, 'bagging_temperature': 0.0012585308401226476}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:00:15,917] Trial 29 finished with value: 0.5518763796909493 and parameters: {'depth': 6, 'iterations': 412, 'subsample': 0.6485131634412891, 'bagging_temperature': 32461.475018296085}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:00:56,716] Trial 30 finished with value: 0.4842615012106538 and parameters: {'depth': 14, 'iterations': 853, 'subsample': 0.7775782857361064, 'bagging_temperature': 0.3978464063711761}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:01:01,892] Trial 31 finished with value: 0.5540540540540541 and parameters: {'depth': 7, 'iterations': 943, 'subsample': 0.9902431004358446, 'bagging_temperature': 1.437032108040093e-08}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:01:09,586] Trial 32 finished with value: 0.5573033707865168 and parameters: {'depth': 9, 'iterations': 914, 'subsample': 0.9071729012682805, 'bagging_temperature': 1.8166327987999917e-05}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:01:13,441] Trial 33 finished with value: 0.5466970387243736 and parameters: {'depth': 6, 'iterations': 796, 'subsample': 0.9997104537758995, 'bagging_temperature': 10.903223921346806}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:01:24,079] Trial 34 finished with value: 0.54337899543379 and parameters: {'depth': 11, 'iterations': 964, 'subsample': 0.9292827796374152, 'bagging_temperature': 2.328597403436014e-08}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:01:27,005] Trial 35 finished with value: 0.5388127853881279 and parameters: {'depth': 4, 'iterations': 742, 'subsample': 0.8579867441660263, 'bagging_temperature': 0.0023140222289245206}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:01:32,150] Trial 36 finished with value: 0.5545454545454545 and parameters: {'depth': 8, 'iterations': 855, 'subsample': 0.5794011855926096, 'bagging_temperature': 13146.25852984201}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:01:35,008] Trial 37 finished with value: 0.552808988764045 and parameters: {'depth': 5, 'iterations': 645, 'subsample': 0.9530172273922374, 'bagging_temperature': 41.50879346941638}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:01:38,517] Trial 38 finished with value: 0.5265588914549654 and parameters: {'depth': 3, 'iterations': 973, 'subsample': 0.8155746882637948, 'bagging_temperature': 0.021543131979309748}. Best is trial 12 with value: 0.5707865168539326.
[I 2024-04-04 11:01:42,700] Trial 39 finished with value: 0.5739910313901346 and parameters: {'depth': 7, 'iterations': 784, 'subsample': 0.8871028004808711, 'bagging_temperature': 2.7369135017547972e-05}. Best is trial 39 with value: 0.5739910313901346.
[I 2024-04-04 11:01:50,137] Trial 40 finished with value: 0.5466970387243736 and parameters: {'depth': 11, 'iterations': 791, 'subsample': 0.6943860230588396, 'bagging_temperature': 1.0518080851933587e-05}. Best is trial 39 with value: 0.5739910313901346.
[I 2024-04-04 11:01:54,955] Trial 41 finished with value: 0.5555555555555556 and parameters: {'depth': 7, 'iterations': 892, 'subsample': 0.8768399471185984, 'bagging_temperature': 1.760929287807736e-07}. Best is trial 39 with value: 0.5739910313901346.
[I 2024-04-04 11:01:58,885] Trial 42 finished with value: 0.53125 and parameters: {'depth': 6, 'iterations': 810, 'subsample': 0.02410586009232152, 'bagging_temperature': 0.003167073508053565}. Best is trial 39 with value: 0.5739910313901346.
[I 2024-04-04 11:02:00,963] Trial 43 finished with value: 0.5462753950338601 and parameters: {'depth': 9, 'iterations': 238, 'subsample': 0.9581305308581167, 'bagging_temperature': 0.16559149184009114}. Best is trial 39 with value: 0.5739910313901346.
[I 2024-04-04 11:02:04,846] Trial 44 finished with value: 0.567032967032967 and parameters: {'depth': 7, 'iterations': 715, 'subsample': 0.9501391280044423, 'bagging_temperature': 4.3292468921135704e-06}. Best is trial 39 with value: 0.5739910313901346.
[I 2024-04-04 11:02:09,355] Trial 45 finished with value: 0.5758241758241759 and parameters: {'depth': 8, 'iterations': 716, 'subsample': 0.10728942137717945, 'bagging_temperature': 2.4918303179984774e-05}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:11,660] Trial 46 finished with value: 0.543778801843318 and parameters: {'depth': 4, 'iterations': 553, 'subsample': 0.13933959652716674, 'bagging_temperature': 6.06097335547338e-06}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:15,586] Trial 47 finished with value: 0.5553047404063206 and parameters: {'depth': 8, 'iterations': 628, 'subsample': 0.33453118226195133, 'bagging_temperature': 6.48727834244256e-08}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:19,228] Trial 48 finished with value: 0.5538461538461539 and parameters: {'depth': 7, 'iterations': 692, 'subsample': 0.08326358239124756, 'bagging_temperature': 2.2765191941170745e-06}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:21,836] Trial 49 finished with value: 0.5243619489559165 and parameters: {'depth': 5, 'iterations': 590, 'subsample': 0.4835617505469966, 'bagging_temperature': 5.773484996896862e-07}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:25,374] Trial 50 finished with value: 0.5644444444444445 and parameters: {'depth': 6, 'iterations': 717, 'subsample': 0.8309884775442248, 'bagging_temperature': 4.87676999685545e-05}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:30,764] Trial 51 finished with value: 0.5538461538461539 and parameters: {'depth': 8, 'iterations': 836, 'subsample': 0.8809198876610866, 'bagging_temperature': 0.0005298175877202106}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:36,662] Trial 52 finished with value: 0.5650224215246638 and parameters: {'depth': 10, 'iterations': 802, 'subsample': 0.956352790943644, 'bagging_temperature': 9.38316845350047e-05}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:43,004] Trial 53 finished with value: 0.5720620842572062 and parameters: {'depth': 9, 'iterations': 757, 'subsample': 0.25887979328455496, 'bagging_temperature': 5.520053040481469e-06}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:45,360] Trial 54 finished with value: 0.5720620842572062 and parameters: {'depth': 7, 'iterations': 443, 'subsample': 0.16884923210918873, 'bagging_temperature': 2.5487835189361906e-09}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:47,586] Trial 55 finished with value: 0.5657015590200445 and parameters: {'depth': 7, 'iterations': 439, 'subsample': 0.24380708990252953, 'bagging_temperature': 1.6377029633084952e-09}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:50,728] Trial 56 finished with value: 0.5446428571428572 and parameters: {'depth': 10, 'iterations': 468, 'subsample': 0.1520967101078978, 'bagging_temperature': 1.6850753451125992e-07}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:52,688] Trial 57 finished with value: 0.5442477876106195 and parameters: {'depth': 8, 'iterations': 314, 'subsample': 0.08345587417343392, 'bagging_temperature': 3.4437327546544023e-09}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:54,700] Trial 58 finished with value: 0.5514223194748359 and parameters: {'depth': 7, 'iterations': 382, 'subsample': 0.24841989668366948, 'bagging_temperature': 2.8521688390437405e-06}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:02:59,121] Trial 59 finished with value: 0.5669642857142857 and parameters: {'depth': 9, 'iterations': 525, 'subsample': 0.19825846872043634, 'bagging_temperature': 1.126174652337146e-10}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:03,171] Trial 60 finished with value: 0.5545454545454545 and parameters: {'depth': 8, 'iterations': 650, 'subsample': 0.09648357277240804, 'bagging_temperature': 0.00795089888992053}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:06,608] Trial 61 finished with value: 0.5486725663716815 and parameters: {'depth': 6, 'iterations': 719, 'subsample': 0.3001098404369988, 'bagging_temperature': 1.375584838013753e-05}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:10,630] Trial 62 finished with value: 0.565121412803532 and parameters: {'depth': 7, 'iterations': 772, 'subsample': 0.4166450497377241, 'bagging_temperature': 0.6854601393850405}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:13,828] Trial 63 finished with value: 0.5446224256292906 and parameters: {'depth': 6, 'iterations': 670, 'subsample': 0.19254008092431113, 'bagging_temperature': 1.9396066973805937}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:16,542] Trial 64 finished with value: 0.5565610859728507 and parameters: {'depth': 5, 'iterations': 604, 'subsample': 0.12151092366143247, 'bagging_temperature': 9.98743250780862e-07}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:18,686] Trial 65 finished with value: 0.5462753950338601 and parameters: {'depth': 8, 'iterations': 340, 'subsample': 0.05436072908745202, 'bagging_temperature': 18.575765506700026}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:24,747] Trial 66 finished with value: 0.5683297180043384 and parameters: {'depth': 9, 'iterations': 703, 'subsample': 0.5810501543992559, 'bagging_temperature': 273.4685611857321}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:29,239] Trial 67 finished with value: 0.5518763796909493 and parameters: {'depth': 9, 'iterations': 528, 'subsample': 0.569960170762642, 'bagging_temperature': 2386.9023665379627}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:34,071] Trial 68 finished with value: 0.552808988764045 and parameters: {'depth': 10, 'iterations': 697, 'subsample': 0.44299987481025244, 'bagging_temperature': 0.0002603462604595885}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:34,699] Trial 69 finished with value: 0.5258215962441315 and parameters: {'depth': 9, 'iterations': 147, 'subsample': 0.28172571628340454, 'bagging_temperature': 210.5321765550445}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:38,363] Trial 70 finished with value: 0.5610859728506787 and parameters: {'depth': 8, 'iterations': 577, 'subsample': 0.5243231067025155, 'bagging_temperature': 165994.36082627063}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:42,294] Trial 71 finished with value: 0.5567928730512249 and parameters: {'depth': 7, 'iterations': 762, 'subsample': 0.004975496425702108, 'bagging_temperature': 53.64258328108033}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:46,214] Trial 72 finished with value: 0.5633187772925763 and parameters: {'depth': 7, 'iterations': 725, 'subsample': 0.897091278239565, 'bagging_temperature': 659.3339712670534}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:52,568] Trial 73 finished with value: 0.5650224215246638 and parameters: {'depth': 10, 'iterations': 856, 'subsample': 0.966026312828763, 'bagging_temperature': 5.1207939103984155e-08}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:03:55,661] Trial 74 finished with value: 0.5580357142857143 and parameters: {'depth': 6, 'iterations': 638, 'subsample': 0.1724944375662059, 'bagging_temperature': 3.285552519690026}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:04:02,270] Trial 75 finished with value: 0.5612472160356347 and parameters: {'depth': 9, 'iterations': 766, 'subsample': 0.7487955820659308, 'bagging_temperature': 4.861603870391591e-05}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:04:07,612] Trial 76 finished with value: 0.5363636363636363 and parameters: {'depth': 8, 'iterations': 833, 'subsample': 0.6015571960058017, 'bagging_temperature': 0.0011033139308208324}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:04:11,924] Trial 77 finished with value: 0.5475638051044083 and parameters: {'depth': 11, 'iterations': 453, 'subsample': 0.9208050253953769, 'bagging_temperature': 0.04022377547041282}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:05:16,343] Trial 78 finished with value: 0.496420047732697 and parameters: {'depth': 15, 'iterations': 667, 'subsample': 0.692410964054913, 'bagging_temperature': 2.7532897303151073e-05}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:05:19,620] Trial 79 finished with value: 0.5582417582417581 and parameters: {'depth': 7, 'iterations': 613, 'subsample': 0.7876695761469222, 'bagging_temperature': 0.20759936562575734}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:05:25,917] Trial 80 finished with value: 0.5585585585585585 and parameters: {'depth': 9, 'iterations': 740, 'subsample': 0.8375388858343029, 'bagging_temperature': 5678.842771193074}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:05:29,375] Trial 81 finished with value: 0.5342465753424659 and parameters: {'depth': 9, 'iterations': 414, 'subsample': 0.20264546422352994, 'bagging_temperature': 1.775120953630638e-10}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:05:32,657] Trial 82 finished with value: 0.5488372093023256 and parameters: {'depth': 8, 'iterations': 536, 'subsample': 0.21360479215417089, 'bagging_temperature': 7.583504889117864e-10}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:05:38,478] Trial 83 finished with value: 0.5605381165919283 and parameters: {'depth': 9, 'iterations': 700, 'subsample': 0.17378001789574854, 'bagging_temperature': 3.879127767317531e-09}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:05:40,841] Trial 84 finished with value: 0.5669642857142857 and parameters: {'depth': 6, 'iterations': 500, 'subsample': 0.24868683967212946, 'bagging_temperature': 7.057659431324152e-10}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:05:44,864] Trial 85 finished with value: 0.5580357142857143 and parameters: {'depth': 7, 'iterations': 785, 'subsample': 0.05652787986361946, 'bagging_temperature': 1.144451920483607e-10}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:05:50,256] Trial 86 finished with value: 0.5529953917050692 and parameters: {'depth': 10, 'iterations': 809, 'subsample': 0.11077955483939216, 'bagging_temperature': 2.7579781605278724e-07}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:05:56,029] Trial 87 finished with value: 0.5663716814159292 and parameters: {'depth': 8, 'iterations': 908, 'subsample': 0.8720359084601365, 'bagging_temperature': 3.2860015980118056e-06}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:02,643] Trial 88 finished with value: 0.5415676959619953 and parameters: {'depth': 12, 'iterations': 479, 'subsample': 0.9696727021508521, 'bagging_temperature': 303.8337638259729}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:06,788] Trial 89 finished with value: 0.5525114155251142 and parameters: {'depth': 5, 'iterations': 946, 'subsample': 0.3685001871339185, 'bagging_temperature': 6.663217473878182e-06}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:10,895] Trial 90 finished with value: 0.5550660792951543 and parameters: {'depth': 8, 'iterations': 663, 'subsample': 0.137825626794491, 'bagging_temperature': 5.400544338818074e-08}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:13,307] Trial 91 finished with value: 0.5701559020044543 and parameters: {'depth': 6, 'iterations': 505, 'subsample': 0.2355041659118694, 'bagging_temperature': 5.031551393216201e-10}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:18,056] Trial 92 finished with value: 0.5535714285714286 and parameters: {'depth': 6, 'iterations': 990, 'subsample': 0.30287309204928203, 'bagging_temperature': 4.038475214576096e-09}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:20,093] Trial 93 finished with value: 0.5587583148558759 and parameters: {'depth': 7, 'iterations': 375, 'subsample': 0.9400734264468863, 'bagging_temperature': 2.725879591509738e-10}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:22,724] Trial 94 finished with value: 0.54337899543379 and parameters: {'depth': 6, 'iterations': 553, 'subsample': 0.2318570947211355, 'bagging_temperature': 1.335683690193908e-09}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:26,754] Trial 95 finished with value: 0.5525114155251142 and parameters: {'depth': 7, 'iterations': 745, 'subsample': 0.9068142306452065, 'bagging_temperature': 8.885814723256658e-09}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:30,314] Trial 96 finished with value: 0.5707964601769911 and parameters: {'depth': 9, 'iterations': 430, 'subsample': 0.26599166716608225, 'bagging_temperature': 20.327784104189416}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:32,575] Trial 97 finished with value: 0.5409090909090908 and parameters: {'depth': 5, 'iterations': 509, 'subsample': 0.2650429850414709, 'bagging_temperature': 87.75393864322213}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:34,203] Trial 98 finished with value: 0.5404157043879908 and parameters: {'depth': 4, 'iterations': 407, 'subsample': 0.4135129244872762, 'bagging_temperature': 15.29352980205095}. Best is trial 45 with value: 0.5758241758241759.
[I 2024-04-04 11:06:36,847] Trial 99 finished with value: 0.5478841870824054 and parameters: {'depth': 8, 'iterations': 428, 'subsample': 0.3032409164770215, 'bagging_temperature': 1.1224643722316576e-06}. Best is trial 45 with value: 0.5758241758241759.

----------
time all-in: 589.39 sec
study.best_params={'depth': 8, 'iterations': 716, 'subsample': 0.10728942137717945, 'bagging_temperature': 2.4918303179984774e-05}
study.best_value=0.5758241758241759

Method	Time (sec)	Best Score
Grid	1538	0.569
Random	573	0.576
LHS	528	0.577
Optuna	589	0.575

In this case, when limiting Optuna to sample only 100 points, it runs in roughly the same time as Random and LHS, yielding a similar score. In theory, increasing the number of sampled points by Optuna should result in a higher best score. However, this was not observed in the present scenario.