
Predicting the Price of the Beyond Meat Stock Using Random Forest in Python


Photo by Pixabay on Pexels

In this post we will predict the price of the Beyond Meat stock using random forest. Beyond Meat is a producer of plant-based meat substitutes and was founded by Ethan Brown in Los Angeles in 2009.

The first thing we can do is import the necessary libraries. We will be using yfinance (a Python wrapper for the Yahoo Finance API), seaborn, matplotlib, pandas, numpy, and scikit-learn:

import yfinance as yf
import seaborn as sns
import numpy as np
import pandas as pd 
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

If you don’t have the Python wrapper for the Yahoo Finance API installed, you can type the following in a command line:

pip install yfinance

We can pull about six months of ‘BYND’ stock data from Yahoo Finance and print the first five rows. We request data from May 1, 2019 to November 2, 2019. Beyond Meat began trading on May 2, 2019, so that is the earliest row available:

data = yf.Ticker('BYND')
df = data.history(start="2019-05-01", end="2019-11-02")
print(df.head())

We can plot the open prices using seaborn. We also use matplotlib to modify the seaborn plot. To plot the time series, we copy the date index into a 'timestamp' column and make sure it holds datetime objects:

sns.set()
df['timestamp'] = df.index
df['timestamp'] = pd.to_datetime(df['timestamp'])
sns.lineplot(x='timestamp', y='Open', data=df)
plt.ylabel("Open Price")
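
As a small illustration of the datetime conversion used above, here is a minimal sketch with a few hypothetical date strings (made up for this example, such as you might read from a CSV file). It only shows what pd.to_datetime does to the column type:

```python
import pandas as pd

# Hypothetical date strings, e.g. as read from a CSV file.
raw = pd.Series(["2019-05-02", "2019-05-03", "2019-05-06"])
print(raw.dtype)    # plain string data, not dates

dates = pd.to_datetime(raw)
print(dates.dtype)  # a datetime64 dtype

# Datetime values support calendar operations that plain strings do not.
day_names = dates.dt.day_name().tolist()
print(day_names)
```

If the column already holds datetime values, pd.to_datetime leaves it unchanged, so the call is a safe guard before plotting.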

Next, we can calculate the daily open-to-close returns, that is, the fractional change between each day's open and close prices, and plot the results:

df['returns'] = (df['Close']-df['Open'])/df['Open']
sns.lineplot(x='timestamp', y='returns', data=df)
plt.ylabel("Returns")
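
To make the return calculation concrete, here is a small sketch on hand-made prices (illustrative numbers, not real BYND quotes). It compares the open-to-close return used above with the close-to-close return from pandas' pct_change, which is another common definition of a daily return:

```python
import pandas as pd

# Hand-made prices for illustration only (not real BYND quotes).
toy = pd.DataFrame(
    {"Open": [100.0, 104.0, 99.0], "Close": [102.0, 101.0, 99.0]},
    index=pd.to_datetime(["2019-05-02", "2019-05-03", "2019-05-06"]),
)

# Open-to-close return: the move within each trading session.
toy["open_to_close"] = (toy["Close"] - toy["Open"]) / toy["Open"]

# Close-to-close return: the move from the previous day's close,
# which also captures overnight gaps. The first row has no previous close.
toy["close_to_close"] = toy["Close"].pct_change()

print(toy)
```

On the first day the open-to-close return is (102 - 100) / 100 = 0.02, while the close-to-close return is undefined because there is no earlier close.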

Next we define a variable that specifies how far ahead we want to predict. Let’s predict 3 trading days out. We also create a new prediction column, which is the target variable shifted up by 3 rows. Here our target variable is the closing price, and the only input feature is the current day's closing price:

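forecast_out = 3
# Row i of 'prediction' holds the closing price from row i + forecast_out
df['prediction'] = df['Close'].shift(-forecast_out)
# Feature: today's close; drop the last rows, which have no future price yet
X = np.array(df['Close']).reshape(-1,1)
X = X[:-forecast_out]
# Target: the close 3 trading days later; the dropped rows are NaN
y = np.array(df['prediction'])
y = y[:-forecast_out]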
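
To see what the shift does, here is a minimal sketch on a six-row toy series (the values are made up for illustration). It mirrors the steps above and shows why the last three rows have to be dropped:

```python
import numpy as np
import pandas as pd

forecast_out = 3
toy = pd.DataFrame({"Close": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]})

# shift(-3) moves each value up three rows, so row i holds the close from row i + 3.
toy["prediction"] = toy["Close"].shift(-forecast_out)
print(toy)

# The last three rows have no future close yet (NaN), so they are dropped.
X = toy["Close"].to_numpy()[:-forecast_out].reshape(-1, 1)
y = toy["prediction"].to_numpy()[:-forecast_out]
print(X.ravel(), y)
```

Each remaining feature row is paired with the close three rows later: 10 with 13, 11 with 14, and 12 with 15.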

Next we split our data for training and testing, define a random forest object and train our model:

reg = RandomForestRegressor(n_estimators=300, max_depth=300, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=7)
reg.fit(X_train, y_train)
print("Performance (R^2): ", reg.score(X_test, y_test))
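
Note that train_test_split shuffles rows by default, so test days end up interleaved with training days. For time-ordered data, one common alternative is to hold out the most recent rows instead. The sketch below is not part of the original analysis. It assumes a synthetic random walk in place of the BYND prices (so that it runs offline) and passes shuffle=False to keep the rows in time order:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic random-walk "closing prices" stand in for the BYND data (assumption).
rng = np.random.default_rng(0)
close = 100 + np.cumsum(rng.normal(0, 2, size=130))

forecast_out = 3
X = close[:-forecast_out].reshape(-1, 1)  # price today
y = close[forecast_out:]                  # price 3 trading days later

# shuffle=False keeps time order: the earliest rows train, the latest rows test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

reg = RandomForestRegressor(n_estimators=300, random_state=42)
reg.fit(X_train, y_train)
chrono_r2 = reg.score(X_test, y_test)
print("Chronological-split R^2:", chrono_r2)
```

With this kind of split the model is scored only on days that come after everything it was trained on, which is closer to how a forecast would be used in practice. The score on synthetic data says nothing about BYND itself.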

Our model has R² = 0.84 on the held-out 20% of the data, meaning it explains roughly 84% of the variance in those closing prices. Thank you for reading and happy Machine Learning! The code from this post is available on GitHub.

