The world’s leading publication for data science, AI, and ML professionals.

Can LSTM be used to Forecast the Number of Sunspots?

I will be predicting the number of monthly Sunspots using time series analysis. The models that will be explored in-depth are the ARMA and…

Photo by NASA on Unsplash
Photo by NASA on Unsplash

Introduction

Firstly: what is a sunspot? Sunspots are a temporary phenomena on the Sun’s photosphere that appear darker than the surrounding areas. The reason why I have selected the sunspots dataset for time series analysis is sunspots appear on an 11-year solar cycle, meaning we should expect to see a seasonality component to the data. I will be modelling the seasonality trend using two different methods, the ARMA model and LSTM model.

Sun spots appear and disappear on a 11 year cycle

Image from [Wikipedia]
Image from [Wikipedia]

The data that will be used is from 1749 to 2013 and is the monthly average at each month. The data which was used for the analysis can be found from kaggle. Below I have a small snippet of the data.

Image by [author]
Image by [author]

One of the first steps in time series analysis is plotting both the data and the autocorrection. We do this to understand any underlaying trends or seasonality, by identifying any trends, the appropriate mathematical models can be used.

Sunspots Plot

I begin simply by plotting a block of 600 data points from the sunspots data.

Image by [author]
Image by [author]

From the graph a strong seasonality component can be viewed. However each peak and trough vary in size, this will limit which mathematical models can be used. For example using a sinusoidal model may not produce the best results.

Image by [Wikipedia]
Image by [Wikipedia]

A moving average or Data science approach will yield the best modelling results

Autocorrelation

Before I begin modelling the time series I first produce the autocorrelation graph to understand if any underlaying patterns are existent in the data. For time series modelling I should expect to see a stationary time series with no pattern.

Note that the autocorrection is the correlation of a data point with a lagged version of itself.

Image by [author]
Image by [author]

From the autocorrection plot a strong seasonality component exists. We will need to segment/decompose out the seasonality component.

Time Series Decomposition

For any given time series three layers can be segmented. It might help to think of any time series [Xt] as three components. Below I show the mathematical formulation of an additive model.

In our case there is only a seasonality component. With this in mind we can apply the method of differencing. If there is a seasonal component to an 11-year solar cycle, then we can remove it on an observation today by subtracting the value from 11 years ago.

Here is the python code for subtracting out a seasonality trend.

from matplotlib import pyplot
X = df.values
diff = list()
cycle = 132
for i in range(cycle, len(X)):
 value = X[i] - X[i - cycle]
 diff.append(value)
pyplot.plot(diff)
pyplot.title('Sunspots Dataset Differenced')
pyplot.show()
Image by [author]
Image by [author]

From the differenced plot I can see that there still exists some seasonality in the time series. Next I attempt to model the seasonality with a polynomial recurring on a 11 year cycle.

Image by [author]
Image by [author]

A second order polynomial does a poor Job of modelling our sunspots time series.

Some of the key problems of the time series we need to address:

  • Each 11 year cycle varies in magnitude
  • The cycles are on average 11 years apart but not exactly.

The Autoregressive moving average will address these issues well. One downfall will be the forward prediction will be limited in look ahead days. Another method that will be explored in-depth in this article will be the LSTM.

The LSTM performs well in forward prediction and addresses the downfall of the ARMA model.

Autoregressive moving average (ARMA) Modelling

The ARMA model is a powerful tool for modelling a time series. The ARMA(p,q) is defined where p is the order of the autoregressive polynomial and q is the order of the moving average polynomial.

The Autoregressive moving average (ARMA) model is given as

ARMA Model
ARMA Model

In python we are able to fit the parameters (p,q) using the ARIMA function.

The function ARIMA() accepts a two main arguments

  • x: a univariate time series
  • order: a vector of length 3 specifying the order of ARIMA(p,d,q) model

Here, d represents the order of a time series. For example if there is an underlaying linear trend than d would be equal to 1.

When fitting a ARMA model we set d equal to 0.

from statsmodels.tsa.arima_model import ARIMA
y = np.array(RNN_data.value)
model = ARIMA(y, order=(1,0,1)) #ARMA(1,1) model
model_fit = model.fit(disp = 0)
print(model_fit.summary())
# Plot residual errors
residuals = pd.DataFrame(model_fit.resid)
fig, ax = plt.subplots(1,2)
residuals.plot(title="Residuals", ax=ax[0])
residuals.plot(kind='kde', title='Density', ax=ax[1])
plt.show()
# Actual vs Fitted
cut_t = 30
predictions = model_fit.predict()
plot = pd.DataFrame({'Date':date,'Actual':abs(y[cut_t:]),"Predicted": predictions[cut_t:]})
plot.plot(x='Date',y=['Actual','Predicted'],title = 'ARMA(1,1) Sunspots Prediction',legend = True)
RMSE = np.sqrt(np.mean(residuals**2))

The ARMA(1,1) model produces the single day forward prediction. In other words we forecast one day ahead.

One of the downfalls of the ARMA model is the number of days for forward prediction.

Image by [author]
Image by [author]

LSTM Modelling

The LSTM is a type of model within the RNN family. It can be thought of as a neural network for time series data.

The LSTM differs from the neural network as follows; instead of single time instance for input into a neural network the LSTM takes x pervious days for input.

You may start to begin to understand why this will be useful for the Sunspots dataset. By using pervious data as input the model will attempt to capture the seasonality information on an 11-year cycle.

The main idea behind LSTM is in the forget gate where information can be passed through to the next neural network.

Here the green block represents a look-back day, for example if our LSTM model was using a look-back period of 3 days the below graphic would represent a single day in our model.

Image by [Wikipedia]
Image by [Wikipedia]

For an LSTM model the model structure is dictated by the structure of the input data and the features for prediction

The number of previous months for look-back and the number of input features will determine the number of neurons that require weights and bias.

LSTM Modelling in Python

I will be using the Keras libraries and packages for creating the LSTM model.

Creating the input data structure is the most important step in LSTM modelling in Python

The first step is simply splitting our data into the train and test datasets.

split = 0.7
#Split into test and training set (70/20 split)
length = len(dataset)
train_length = round(length*split)
test_length = len(dataset) - train_length
train = dataset[0:train_length,:]
test = dataset[train_length:length,:]

Let’s introduce the preprocessing functions for a single day ahead LSTM forecast.

def preprocessing(training_set_scaled,n,train_length,days):
    y_train = []
    for i in range(days, train_length -1 ):
        y_train.append(training_set_scaled[i, 0]) # note that our predictor variable is in the first column of our array 
    y_train = np.array(y_train)

    X = np.zeros(shape=[train_length - (days+1),days,n],dtype=float)
    for j in range(n):
        for i in range(days, train_length-1):
            X[i-days,:,j] = training_set_scaled[i-days:i, j]
    return X, y_train
Image by [author]
Image by [author]

Here the shape of our numpy array is as follows

  • 418: The number of days in our training data
  • 3: the number of Days we look-back for prediction
  • 1: the number of features used for prediction

Next I introduce the structure of the LSTM model

# Initialising the RNN
regressor = Sequential()
# Adding the first LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = nodes, return_sequences = True, input_shape = (new.shape[1], new.shape[2])))
regressor.add(Dropout(0.2))
# Adding a second LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = nodes, return_sequences = True))
regressor.add(Dropout(0.2))
# Adding a third LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = nodes, return_sequences = True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = nodes))
regressor.add(Dropout(0.2))
# Adding the output layer
regressor.add(Dense(units = t)) # this is the output layer so this repersetns a single node with our output value
# Compiling the RNN
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')
# Fitting the RNN to the Training set
model = regressor.fit(new, y_train, epochs = 50, batch_size = 22)

Single Day forward prediction with LSTM

Image by [author]
Image by [author]

The LSTM model preforms quite well, however part of this is simply because we are predicting one day ahead at a time. Next we look at multi-day forward prediction.

Multi-day LSTM Modelling in Python

Let’s introduce the preprocessing functions for a multi-day ahead LSTM forecasting.

  • Look-back is 3 Day
  • Forward Prediction is 5 days

I am using the previous 3 days to predict 5 days into the future at each time-step.

def preprocessing_muli(training_set_scaled,n,t,train_length,days):
    days_length = train_length - t
    y_train = np.zeros(shape=[days_length-t,t],dtype=float)
    #y_train[:,0] = training_set_scaled[t:days,0]
    #y_train = np.zeros(shape=[days_length - t,days],dtype=float)
    for i in range(t):
        y_train[:,i] = training_set_scaled[i+t:days_length+i,0] # Here each column repersents a time step.

    X = np.zeros(shape=[days_length-t,days,n],dtype=float)
    for j in range(n):
        for i in range(days):
            X[:,i,j] = training_set_scaled[i:days_length-t+i, j] 
    return X, y_train

Loss Function

I explore the loss function when fitting the LSTM model to both understand if the model converges and how fast does it converge. The loss function is a measurement of the distance our predictions are from the response variable. Below is an example of a linear regression model fit using MSE. Understand that the distance from each blue point to the line is measured then squared and finally divided by the total number of data points.

Image by [Wikipedia]
Image by [Wikipedia]

To keep things simple I will be using the mean squared error as the loss function. The formula for the mean squared error (MSE) is given as

From the MSE formula the hat represents the predicted or estimated values for the variable y. In mathematics the hat above the variable represents an estimation for a given random variable. n represents the number of total data points that we our estimating. The graphic above shows how the MSE would be calculated for a simple linear regression model in two dimensions.

The results for the MSE loss function for the Multi-day LSTM Model is shown below.

Image by [author]
Image by [author]

The results for the 3 day look-back and 5 day forward prediction.

Image by [author]
Image by [author]

LSTM – Sunspots 11 year cycle prediction in Python

Next I investigate if the LSTM model can predict one full cycle ahead. For our dataset each data point is representative of a single month and we know that we should expect a sunspot cycle each 11 years so we will be forward predicting 132 data points into the future. However to capture part of the next cycle as-well I will use 160 days for forward prediction.

  • Look-back is 200 Days
  • Forward Prediction is 160 days
Image by [author]
Image by [author]

The LSTM does properly capture the seasonality or cyclic component of the time series however it offset and the peak and trough are not aligned.

Image by [author]
Image by [author]

Attempting to predict an entire 11 year cycle ahead did not yield the best results so next I attempt to predict an half cycle ahead or about 60 days ahead.

LSTM Sunspots half cycle prediction in Python

Lastly I investigate if the LSTM model can predict a half cycle ahead. For our dataset each data point is representative of a single month and we know that we should expect a sunspot cycle each 11 years so we will be forward predicting about 60 data points into the future.

  • Look-back is 120 Days
  • Forward Prediction is 60 days
Image by [author]
Image by [author]

Here I create four blocks each 60 days long for predicting the test dataset.

Next I plot 60 days into the future.

Image by [author]
Image by [author]

Conclusion

Everything that was shown in this article was done to forecast the sunspots dataset using real data. Both ARMA and LSTM were evaluated in-depth. The LSTM far outperforms the ARMA model in terms of multi-day forward prediction, the ARMA model can capture the seasonality component while the LSTM is unable to correctly identify the start and end of a new cycle. Adding additional features to the LSTM model may yield better results when forecasting the number of sunspots.

My recommendation for modelling the sunspots dataset would be to use the LSTM model.


A-little bit about myself

I have recently completed a master’s in applied mathematics at Ryerson University in Toronto, Canada. I am part of the financial mathematics group specializing in statistics. I studied volatility modeling and algorithmic trading in-depth.

I previously worked as an engineering professional at Bombardier Aerospace, where I was responsible for modeling the life-cycle costs associated with aircraft maintenance.

I am actively training for triathlon and love fitness.

If you have any suggestions on the topics below let me know

  • Data Science
  • Machine Learning
  • Mathematics
  • Statistical Modelling

LinkedIn: https://www.linkedin.com/in/ethanjohnsonskinner/

Twitter : @Ethan_JS94

Join Medium with my referral link – Ethan Johnson-Skinner, MSc


Related Articles