Bitcoin Price Prediction Using Time Series Forecasting

Ayushi Asthana
Towards Data Science
7 min read · Jun 27, 2018


This article is about predicting the price of bitcoin using time series forecasting. Time series forecasting is quite different from other machine learning problems because:

1. The data is time dependent, so the basic assumption of a linear regression model, that the observations are independent, doesn't hold here.

2. Along with an increasing or decreasing trend, most time series show some form of seasonality, i.e. variations specific to a particular time frame.

Therefore, standard machine learning models cannot be applied directly, and time series forecasting is its own area of research. In this article, time series models, namely AR (Auto Regressive), MA (Moving Average) and ARIMA (Autoregressive Integrated Moving Average), are used to forecast the price of bitcoin.

The dataset contains the opening and closing prices of bitcoin from April 2013 to August 2017.

Importing necessary libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# importing packages for the prediction of time-series data
import statsmodels.api as sm
import statsmodels.tsa.api as smt
import statsmodels.formula.api as smf

from sklearn.metrics import mean_squared_error

The data is loaded from a csv file into the train dataframe, and train.head() shows the first five rows of the data.
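The loading step itself is not shown in the original snippets; here is a minimal sketch, assuming the csv file is named bitcoin_price.csv (the file name is an assumption, adjust it to your dataset):

# Hypothetical file name; replace with the path to your own dataset
train = pd.read_csv('bitcoin_price.csv')
print(train.head())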

Plotting the time series

Using the date as index, the series is plotted with Date on the x axis and the closing price on the y axis.

train1 = train[['Date', 'Close']]
# Setting the Date as index
train2 = train1.set_index('Date')
train2.sort_index(inplace=True)
print(type(train2))
print(train2.head())
plt.plot(train2)
plt.xlabel('Date', fontsize=12)
plt.ylabel('Price in USD', fontsize=12)
plt.title("Closing price distribution of bitcoin", fontsize=15)
plt.show()

Testing the Stationarity

Augmented Dickey-Fuller Test:

The Augmented Dickey-Fuller (ADF) test is a type of statistical test called a unit root test.

The intuition behind a unit root test is that it determines how strongly a time series is defined by a trend.

There are a number of unit root tests, and the ADF test is one of the most widely used. It frames stationarity as a pair of hypotheses:

1. Null Hypothesis (H0): the time series has a unit root, meaning it is non-stationary.

2. Alternative Hypothesis (H1): the time series has no unit root, meaning it is stationary.

Interpretation of the p-value

1. p-value > 0.05: fail to reject the null hypothesis (H0); the data has a unit root and is non-stationary.

2. p-value <= 0.05: reject the null hypothesis (H0); the data is stationary.

from statsmodels.tsa.stattools import adfuller

def test_stationarity(x):
    # Determine rolling statistics
    rolmean = x.rolling(window=22, center=False).mean()
    rolstd = x.rolling(window=12, center=False).std()

    # Plot rolling statistics
    plt.plot(x, color='blue', label='Original')
    plt.plot(rolmean, color='red', label='Rolling Mean')
    plt.plot(rolstd, color='black', label='Rolling Std')
    plt.legend(loc='best')
    plt.title('Rolling Mean & Standard Deviation')
    plt.show(block=False)

    # Perform the Augmented Dickey-Fuller test
    result = adfuller(x)
    print('ADF Statistic: %f' % result[0])
    print('p-value: %f' % result[1])
    if result[1] > 0.05:
        print("The series is non-stationary")
    else:
        print("The series is stationary")
    print('Critical values:')
    for key, value in result[4].items():
        print('\t%s: %.3f ' % (key, value))

ts = train2['Close']
test_stationarity(ts)

Since the p-value is greater than 0.05, the time series is non-stationary. Before we can apply a forecasting model, some work needs to be done: we use transformations to make the series stationary.

Log Transforming the series

A log transformation is used to reduce the skew of highly skewed data, which helps the forecasting process.

ts_log = np.log(ts)
plt.plot(ts_log, color="green")
plt.show()

test_stationarity(ts_log)

The series is still non-stationary, as the p-value remains greater than 0.05, so a further transformation is needed. Let's go ahead with differencing.

Remove trend and seasonality with differencing

In differencing, each value is replaced by the difference between it and the previous value. This stabilizes the mean of the series and therefore increases the chances of the time series being stationary.
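In symbols, first-order differencing computes

diff_t = x_t - x_(t-1)

which is exactly what subtracting the shift()-ed series does in the snippet below.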

ts_log_diff = ts_log - ts_log.shift()
plt.plot(ts_log_diff)
plt.show()
ts_log_diff.dropna(inplace=True)
test_stationarity(ts_log_diff)

Our time series is now stationary, as the p-value is less than 0.05, so we can apply time series forecasting models.

Auto Regressive model

An auto regressive (AR) model is a time series forecasting model in which the current value depends on past values.
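For reference, an AR model of order p writes the current value as a linear combination of the previous p values plus a noise term:

x_t = c + φ_1 * x_(t-1) + ... + φ_p * x_(t-p) + ε_t

Here c is a constant, the φ_i are the learned coefficients, and ε_t is white noise.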

from statsmodels.tsa.arima_model import ARIMA

# follow the lag: AR terms only, order (p, d, q) = (1, 1, 0)
model = ARIMA(ts_log, order=(1, 1, 0))
results_AR = model.fit(disp=-1)
plt.plot(ts_log_diff)
plt.plot(results_AR.fittedvalues, color='red')
plt.title('RSS: %.7f' % sum((results_AR.fittedvalues - ts_log_diff)**2))
plt.show()

Moving Average Model

In a moving average (MA) model, the series depends on past error terms.
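For reference, an MA model of order q writes the current value in terms of the current and previous q error terms:

x_t = μ + ε_t + θ_1 * ε_(t-1) + ... + θ_q * ε_(t-q)

where μ is the mean of the series and the θ_i are the learned coefficients.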

# follow the error: MA terms only, order (p, d, q) = (0, 1, 1)
model = ARIMA(ts_log, order=(0, 1, 1))
results_MA = model.fit(disp=-1)
plt.plot(ts_log_diff)
plt.plot(results_MA.fittedvalues, color='red')
plt.title('RSS: %.7f' % sum((results_MA.fittedvalues - ts_log_diff)**2))
plt.show()

Auto Regressive Integrated Moving Average Model

ARIMA is a combination of the AR and MA models. It makes the time series stationary by itself through the process of differencing (the d in the order (p, d, q)), so differencing need not be done explicitly for an ARIMA model.
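Putting the two together, an ARIMA(p, d, q) model applies d rounds of differencing and then fits both AR and MA terms to the differenced series x':

x'_t = c + φ_1 * x'_(t-1) + ... + φ_p * x'_(t-p) + ε_t + θ_1 * ε_(t-1) + ... + θ_q * ε_(t-q)

Note that the order (2, 1, 0) used below sets q = 0, so this particular fit relies on two AR terms and one round of differencing.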

model = ARIMA(ts_log, order=(2, 1, 0))
results_ARIMA = model.fit(disp=-1)
plt.plot(ts_log_diff)
plt.plot(results_ARIMA.fittedvalues, color='red')
plt.title('RSS: %.7f' % sum((results_ARIMA.fittedvalues - ts_log_diff)**2))
plt.show()

Thus we see that the RSS (Residual Sum of Squares) error is lowest for the ARIMA model. The ARIMA model is therefore chosen as the best of the three, since as a model class it can draw on both lagged values and past error terms, and it is used further to evaluate prediction error. In the code snippet below, the dataset is divided into train and test sets.

For every value in the test set we fit an ARIMA model on the history seen so far and calculate the prediction error; after iterating over all values in the test set, the mean error between predicted and expected values is calculated (a walk-forward validation).

# Divide into train and test; hold out the last 100 days for testing
size = int(len(ts_log) - 100)
train_arima, test_arima = ts_log[0:size], ts_log[size:len(ts_log)]
history = [x for x in train_arima]
predictions = list()
originals = list()
error_list = list()

print('Printing Predicted vs Expected Values...')
print('\n')

# Go over each value in the test set: fit an ARIMA model on the history
# seen so far, forecast one step ahead, then record the error between
# the predicted and expected value.
for t in range(len(test_arima)):
    model = ARIMA(history, order=(2, 1, 0))
    model_fit = model.fit(disp=-1)

    output = model_fit.forecast()
    pred_value = output[0]

    original_value = test_arima[t]
    history.append(original_value)

    # Convert back from the log scale to USD
    pred_value = np.exp(pred_value)
    original_value = np.exp(original_value)

    # Calculating the percentage error
    error = ((abs(pred_value - original_value)) / original_value) * 100
    error_list.append(error)
    print('predicted = %f, expected = %f, error = %f ' % (pred_value, original_value, error), '%')

    predictions.append(float(pred_value))
    originals.append(float(original_value))

# After iterating over the whole test set, the overall mean error is calculated.
print('\n Mean Error in Predicting Test Cases : %f ' % (sum(error_list) / float(len(error_list))), '%')

plt.figure(figsize=(8, 6))
test_day = [t for t in range(len(test_arima))]
labels = ['Predicted', 'Original']
plt.plot(test_day, predictions, color='green')
plt.plot(test_day, originals, color='orange')
plt.title('Expected vs Predicted Closing Price')
plt.xlabel('Day')
plt.ylabel('Closing Price')
plt.legend(labels)
plt.show()

predicted = 2513.745189, expected = 2564.060000, error = 1.962310 %
predicted = 2566.007269, expected = 2601.640000, error = 1.369626 %
predicted = 2604.348629, expected = 2601.990000, error = 0.090647 %
predicted = 2605.558976, expected = 2608.560000, error = 0.115045 %
predicted = 2613.835793, expected = 2518.660000, error = 3.778827 %
predicted = 2523.203681, expected = 2571.340000, error = 1.872032 %
predicted = 2580.654927, expected = 2518.440000, error = 2.470376 %
predicted = 2521.053567, expected = 2372.560000, error = 6.258791 %
predicted = 2379.066829, expected = 2337.790000, error = 1.765635 %
predicted = 2348.468544, expected = 2398.840000, error = 2.099826 %
predicted = 2405.299995, expected = 2357.900000, error = 2.010263 %
predicted = 2359.650935, expected = 2233.340000, error = 5.655697 %
predicted = 2239.002236, expected = 1998.860000, error = 12.013960 %
predicted = 2006.206534, expected = 1929.820000, error = 3.958221 %
predicted = 1942.244784, expected = 2228.410000, error = 12.841677 %
predicted = 2238.150016, expected = 2318.880000, error = 3.481421 %
predicted = 2307.325788, expected = 2273.430000, error = 1.490954 %
predicted = 2272.890197, expected = 2817.600000, error = 19.332404 %
predicted = 2829.051277, expected = 2667.760000, error = 6.045944 %
predicted = 2646.110662, expected = 2810.120000, error = 5.836382 %
predicted = 2822.356853, expected = 2730.400000, error = 3.367889 %
predicted = 2730.087031, expected = 2754.860000, error = 0.899246 %
predicted = 2763.766195, expected = 2576.480000, error = 7.269072 %
predicted = 2580.946838, expected = 2529.450000, error = 2.035891 %
predicted = 2541.493507, expected = 2671.780000, error = 4.876393 %
predicted = 2679.029936, expected = 2809.010000, error = 4.627255 %
predicted = 2808.092238, expected = 2726.450000, error = 2.994452 %
predicted = 2726.150588, expected = 2757.180000, error = 1.125404 %
predicted = 2766.298163, expected = 2875.340000, error = 3.792311 %

Mean Error in Predicting Test Cases : 3.593133 %

The original and predicted time series are plotted with a mean error of 3.59%. Thus we were able to use different transformations and models to predict the closing price of bitcoin.

Thank you for reading to the end. This is my first article on Towards Data Science and there are many more to come. If you find any mistakes or have suggestions, please do comment. If you liked the post, please don't forget to clap!


