Fitting Cosine (Sine) functions with machine learning in Python

Using a mixture of Bayesian optimization and linear regression to fit one of the most common functions in physics.

By: Carmen Martínez-Barbosa and José Celis-Gil

A couple of days ago, one of our colleagues asked us how to fit data from a harmonic signal whose behavior he knew reasonably well. This seemed like a standard problem, as many processes in physics are periodic. The piston of a motor; the pendulum of a watch; the motion of a wheel; even the movement of the Moon around the Earth follow a pattern that can be modeled as what in physics is called a harmonic oscillator, which mathematically can be expressed as:

f(x) = a₀ + a₁ cos(ωx + ϕ)

where a₀, a₁, ω, and ϕ are the displacement, amplitude, frequency, and phase of the signal, respectively. In the next picture we can see a plot of this function and the role that each of these components plays in the signal:

Parameters of a harmonic signal.

After receiving our colleague’s email, we thought that this was a problem easily solved with existing Python libraries like Scipy. This package has a method called optimize.curve_fit, which uses non-linear least squares to fit a function f to some input data (an example of a Sine function fit can be found here). The function provided by Scipy is quite fast; however, after some trials, we noticed that the user needs to have an idea of the value of each parameter for the code to give a good estimate. We explored other alternatives, such as Bayesian optimization and regression, and they turned out to be very good options for fitting data to harmonic signals.

In this blog, we explore how Scipy and Hyperopt can be used to fit a harmonic signal. We also introduce Python’s package HOBIT (Harmonic Oscillator hyBrid fIT): a hybrid method that combines Bayesian optimization with linear regression to efficiently fit Cosine (Sine) functions. Through this blog, we will explore the advantages and disadvantages of each of these methods.

All the code for this blog is available in this GitHub repository.

Let’s start with a toy dataset of n=100 observations that behaves according to f(x):
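A minimal sketch of such a dataset is shown below; the “true” parameter values (a₀ = 1, a₁ = 3, ω = 2, ϕ = 1) are arbitrary choices for illustration, not the values used to produce the figures in this post:

```python
import numpy as np

# Illustrative "true" parameters (assumptions for this sketch)
a0, a1, w, phi = 1.0, 3.0, 2.0, 1.0

def f(x, a0, a1, w, phi):
    """Harmonic signal: displacement + amplitude * cos(frequency * x + phase)."""
    return a0 + a1 * np.cos(w * x + phi)

n = 100
x = np.linspace(0, 10, n)
y = f(x, a0, a1, w, phi) + np.random.normal(size=n)  # noisy observations
```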

We use the function np.random.normal(size=n) to introduce noise to the signal, in order to make it more similar to real measurements.

We split the data into train and test sets, such that 80% of the observations are used for training and the rest are used for testing the fit (i.e., we do a single validation split):
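A sketch of that split using Scikit learn’s train_test_split (the random seed is an illustrative choice):

```python
from sklearn.model_selection import train_test_split

# Hold out 20% of the observations for testing the fit
X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42)
```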

Fitting with Scipy optimize

The function optimize.curve_fit of Scipy uses a non-linear least squares algorithm to find the optimal parameters of f(x). This function offers different optimization methods: the Levenberg-Marquardt algorithm ('lm'), the Trust Region Reflective algorithm ('trf'), and the dogleg algorithm ('dogbox'). We will not explain these techniques here; for more information, you can visit the documentation of Scipy, which explains the implementation of the least-squares algorithm.

The implementation of optimize.curve_fit in Python is shown below:
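The following is a minimal sketch of that call, continuing from the toy data above; the initial guess p₀ is an illustrative assumption:

```python
from scipy import optimize

# Initial guess for a0, a1, w, phi (illustrative values)
p0 = [1.0, 1.0, 1.0, 1.0]

# Non-linear least squares fit of f to the training data
params, covariance = optimize.curve_fit(f, X_train, y_train, p0=p0)
a0_fit, a1_fit, w_fit, phi_fit = params
```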

p₀ is an optional input of the optimize.curve_fit function: an initial guess of the parameters of f(x), provided by the user.

Running this code leads to the following result:

Note that a₁ and ϕ are far from the original parameters. This can be explained by the periodic behavior of f(x): the loss function (which in our case is the Mean Squared Error, MSE) has many local minima. This may lead to wrong fitted parameters, regardless of the optimization method. In other words, p₀ should be very close to the real parameters of the function, which is contradictory when fitting a function whose parameters are unknown.

Another consequence of the periodic behavior of f(x) is that Scipy may provide a reasonable fit even though the parameters are far from the real ones; see the figure below:

Fit of f(x) using optimize.curve_fit of Scipy. MSE on test set: 1.79

Despite Scipy’s limitations in fitting periodic functions, one of the biggest advantages of optimize.curve_fit is its speed: it shows results after 0.016 seconds. If there is a known estimate of the parameters’ domain, we recommend setting method='trf' or method='dogbox' in optimize.curve_fit and specifying the ranges by means of the bounds option, so that the user can still take advantage of the speed of this package.
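As an illustration, a sketch of such a bounded fit, continuing from the snippet above (the ranges shown are assumptions for this example):

```python
# Lower and upper bounds for a0, a1, w, phi (illustrative ranges)
bounds = ([-10.0, 0.0, 0.0, -np.pi], [10.0, 10.0, 10.0, np.pi])

params, _ = optimize.curve_fit(f, X_train, y_train,
                               method='trf', bounds=bounds)
```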

Fitting using Hyperopt

Hyperopt is a Python library that uses Bayesian optimization to minimize any function, from a simple polynomial to the complex loss functions used in machine learning algorithms. There are three basic parts required to formulate an optimization problem in Hyperopt:

  1. The objective function: This is the function we want to minimize. In our case, it is the MSE.
  2. Domain of parameters: the range where we know or we think the parameters might be located.
  3. Optimization algorithm: this is the method used to create a probability model of the objective function from which new parameters a₀, a₁, ω, and ϕ are selected at every iteration. The probability model is also called the surrogate surface.

In this post, we are not going to explain in detail how Hyperopt works. For a great introduction to this topic, you can go to this blog.

We define the objective function as it is shown below:
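What follows is a minimal sketch of these two functions, assuming the toy data from above; the exact bodies may differ in details:

```python
def objective(params, x, y):
    """MSE between the observations and f(x) for a candidate parameter set."""
    y_pred = f(x, params['a0'], params['a1'], params['w'], params['phi'])
    return np.mean((y - y_pred) ** 2)

def objective2(params):
    """Single-argument wrapper required by Hyperopt's fmin: the training
    data enter as fixed, additional arguments of the objective."""
    return objective(params, X_train, y_train)
```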

Because of the way Hyperopt is implemented, we need to define an auxiliary function, objective2, which lets us pass more than one argument (the training data) into the optimization.

The implementation of Hyperopt is the following:
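This is a sketch; the parameter ranges and the random seed are illustrative choices:

```python
import numpy as np
from hyperopt import fmin, tpe, hp

# Domain of the parameters (illustrative ranges)
space = {
    'a0':  hp.uniform('a0', -10, 10),
    'a1':  hp.uniform('a1', 0, 10),
    'w':   hp.uniform('w', 0, 10),
    'phi': hp.uniform('phi', 0, 2 * np.pi),
}

tpe_algo = tpe.suggest  # Tree-structured Parzen Estimator

# On older Hyperopt versions, rstate may need to be np.random.RandomState(42)
tpe_best = fmin(fn=objective2, space=space, algo=tpe_algo,
                max_evals=500, rstate=np.random.default_rng(42))
```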

The variable space defines the parameters’ domain. tpe_algo is one of the optimization algorithms used by Hyperopt, the Tree-structured Parzen Estimator (TPE). fmin is the Hyperopt function that minimizes the objective function over the defined space, according to tpe_algo, up to a certain number of evaluations. The variable rstate is set to produce reproducible results.

After running fmin for 500 iterations, the best parameters of f(x) are stored in the variable tpe_best. The results are the following:

Clearly, Hyperopt gives results closer to the real values than Scipy’s optimize.curve_fit, even though the MSE is bigger (see figure below). However, note that the user needs to know the parameters’ domain in advance, although the range can be broader than in Scipy’s approach.

Fit of f(x) using Bayesian optimization. MSE on test set: 3.37

For 500 iterations, Hyperopt takes 4.88 seconds to find the optimal parameters of f(x). If the user does not have any previous knowledge of a₀, a₁, ω, and ϕ, a broader parameter space has to be defined, and this will require more iterations to find an optimal solution. On the other hand, if the real values of a₀, a₁, ω, and ϕ are not within the defined domain, the results will be off.

Fitting using a hybrid method: HOBIT

We developed a Python library called Harmonic Oscillator hyBrid fIT (HOBIT). This library combines the power of Hyperopt with the flexibility of Scikit learn to fit harmonic signals accurately. HOBIT is designed to fit Sine (Cosine)-shaped signals. The user can define the domain of ω and ϕ, but HOBIT uses the following ranges by default:

which are typical for harmonic oscillators.

There is no need to have previous knowledge of a₀ and a₁, as they are determined by HOBIT.

The implementation of HOBIT is as simple as the following:
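A minimal sketch of that usage is shown below; the import path hobit is an assumption, so check the repository for the exact module name:

```python
import numpy as np
from hobit import RegressionForTrigonometric  # import path assumed

reg = RegressionForTrigonometric()

# Fit a Cosine function on the training data; on older Hyperopt versions,
# rstate may need to be np.random.RandomState(42)
reg.fit_cos(X_train, y_train, max_evals=500,
            rstate=np.random.default_rng(42))

print(reg.best_parameters)
```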

The class RegressionForTrigonometric has two fitting methods: fit_sin to fit Sine functions and fit_cos to fit Cosine functions. In either method, you need to include your train set (X_train, y_train) and the number of iterations you want to run the optimization for (given by max_evals). It is also possible to include other Hyperopt arguments; for instance, if you want to make the results reproducible, set a random seed via the variable rstate. HOBIT stores the resulting parameters in the variable best_parameters.

Running the code above gives the following results:

Similar to Scikit learn, we can use the predict function to apply the parameters found to unseen data:
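A minimal sketch of that call, continuing from the snippet above (the MSE computation is our own addition for illustration):

```python
y_pred = reg.predict(X_test)

# Evaluate the fit on the held-out observations
mse = np.mean((y_test - y_pred) ** 2)
```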

The fitted f(x) of the data looks like this:

Fit of f(x) using the new Python package HOBIT. MSE on test set: 2.21

For 500 iterations, HOBIT takes 3.05 seconds to find the optimal parameters of f(x), almost two seconds faster than Hyperopt! For the same number of iterations, HOBIT’s fit also has a lower MSE.

How does HOBIT work?

The aim of HOBIT is to provide the parameters a₀, a₁, ω, and ϕ that best fit the input data, in the same fashion as one of Scikit learn’s estimators. HOBIT makes a first estimate of a₀ and a₁ from the observations. The initial assumptions for these parameters are:
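As a rough sketch (the exact formulas HOBIT uses may differ), reasonable initial estimates can be computed from simple statistics of the observations:

```python
# Plausible initial estimates (assumptions; HOBIT's exact formulas may differ)
a0_init = y_train.mean()                       # displacement ~ mean signal level
a1_init = (y_train.max() - y_train.min()) / 2  # amplitude ~ half the peak-to-peak range
```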

We pass these initial values and the default ranges of ω and ϕ to Hyperopt. After the number of iterations specified in the fit function, we obtain the fitted ω and ϕ by means of the TPE algorithm.

We use the fitted ω and ϕ to transform the data in a way that allows us to carry out a polynomial regression. We use Scikit learn’s Linear Regression model on this transformation to get the final values of a₀ and a₁. (For more information on polynomial regression using Scikit learn, you can go to this blog.) With this hybrid approach, HOBIT fits the data faster than Hyperopt, since it only has to search for ω and ϕ, while a₀ and a₁ are obtained with a regression model.
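A sketch of that final step, assuming the fitted frequency and phase are stored in w_fit and phi_fit, and using the training data from before:

```python
from sklearn.linear_model import LinearRegression

# With w and phi fixed, f(x) = a0 + a1 * cos(w * x + phi) is linear in
# a0 and a1, so a single-feature linear regression recovers both.
z = np.cos(w_fit * X_train + phi_fit).reshape(-1, 1)

lin_reg = LinearRegression().fit(z, y_train)
a0_fit, a1_fit = lin_reg.intercept_, lin_reg.coef_[0]
```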

Once the parameters are found, the user can use the predict function to apply the model to other unseen data. The usage of HOBIT is easy and intuitive, especially if you already use Scikit learn. We recommend looking at the data in advance to make a realistic definition of the domain of ω.

You can find notebooks with examples here that will help you explore HOBIT in more detail.

Conclusions

Cosine (Sine) functions are widely used in mathematics and physics; however, fitting their parameters to data is not always an easy task, given their periodic behavior. In this blog, we reviewed some existing methods in Python that can be used to make such a fit.

We explored three tools: optimize.curve_fit from Scipy, Hyperopt, and HOBIT, fitting a dataset that behaves according to a Cosine function. Every package offers some advantages, and here are our recommendations for each of them. If the user has detailed previous knowledge of the ranges where the optimal parameters can be found, then optimize.curve_fit from Scipy is the best choice; however, without this knowledge, the results may be incorrect. In cases where the user has very limited knowledge of the function, Hyperopt can be helpful; the main issue is that a big domain has to be set for every parameter, which makes more iterations mandatory for the fit. Finally, HOBIT is a new tool that combines the best of both worlds, making the fit of Sine (Cosine)-shaped signals easier and faster while still giving the user the possibility to set ranges and iterations. As with Hyperopt, if the user does not have any previous knowledge of the parameters, broader ranges can be defined, but this will require more iterations to find an optimal solution. Even then, HOBIT will be faster than using Hyperopt alone, since only the domain of two parameters has to be searched.

Thanks for reading!

You can contact us via LinkedIn at:
