Stock price forecasting using FB’s Prophet: Part 2

Facebook’s Prophet module is a trend forecasting library for Python. We spent some time over the last week going over it via this awesome introduction on Medium, but decided to do some refactoring to make the code more reusable. Previously, we set up our pipenv virtual environment, separated sensitive data from our source code using dotenv, and started working with Alpha Vantage’s stock price and technical indicator API. In this post we’ll save our fetched data using pickle and do some dataframe manipulations in pandas. Part 3 is also available now.

Pickling our API results

When we left off, we had just written our get_time_series function, to which we pass a method name such as 'get_daily' and the symbol of the stock that we would like to retrieve. We also have our get_technical function, which we can use to pull any of the dozens of indicators available through Alpha Vantage’s API. Following the author’s original example, we can load Apple’s price history, simple moving average and RSI using the following calls:

symbol = 'AAPL'
ticker = get_time_series('get_daily', symbol, outputsize='full')
sma = get_technical('get_sma', symbol, time_period=60)
rsi = get_technical('get_rsi', symbol, time_period=60)

We’ve now got three dataframes. In the original piece, the author shows how you can export and import a dataframe using pandas’ to_csv and read_csv functions. Saving the data is a good idea, especially during this stage of development, because it lets us cache our data and reduce the number of API calls. (Alpha Vantage’s free tier allows 5 calls per minute, 500 a day.) However, using CSV to save pandas dataframes is not recommended, as you will lose index and column metadata such as dtypes. Python’s pickle module will serialize the dataframe and preserve it whole.
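
To make the difference concrete, here is a minimal sketch that round-trips a tiny dataframe through CSV and through pickle. The dataframe is made up for illustration (it only mimics the shape of the Alpha Vantage data), and it uses an in-memory buffer instead of a file on disk.

```python
import io
import pickle

import pandas as pd

# Illustrative stand-in for a price dataframe with a DatetimeIndex
df = pd.DataFrame(
    {"4. close": [100.0, 101.5]},
    index=pd.DatetimeIndex(["2019-10-09", "2019-10-10"], name="date"),
)

# CSV round trip with default arguments: the dates come back as a plain
# text column and the index becomes a default integer index
csv_text = df.to_csv()
from_csv = pd.read_csv(io.StringIO(csv_text))

# The index can be recovered, but only by telling read_csv how to do it
restored = pd.read_csv(io.StringIO(csv_text), index_col="date", parse_dates=True)

# Pickle round trip: the dataframe comes back exactly as it was saved
from_pickle = pickle.loads(pickle.dumps(df))

print(type(from_csv.index).__name__)
print(type(from_pickle.index).__name__)
```

So CSV can be made to work, but you have to remember the right read_csv arguments every time, while pickle needs no extra bookkeeping. Keep in mind that pickle files should only be loaded from sources you trust.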

For our implementation, we will create a get_symbol function, which checks a local cache folder for a copy of the ticker data and loads it if one exists. Otherwise it fetches the data from the API and saves it. Our file naming convention uses the symbol string plus today’s date, so a cached file is only reused on the day it was created. Additionally, we combine our three dataframes into one using pandas’ concat function:

import os
import pickle
from datetime import datetime

import pandas as pd

def get_symbol(symbol):
    CACHE_DIR = './cache'
    # check if cache exists
    symbol = symbol.upper()
    today = datetime.now().strftime("%Y_%m_%d")

    file = os.path.join(CACHE_DIR, symbol + '_' + today + '.pickle')
    if os.path.isfile(file):
        # load pickle
        print("{} found".format(file))
        with open(file, "rb") as f:
            result = pickle.load(f)
    else:
        # get data, save to pickle
        print("{} not found".format(file))
        ticker = get_time_series('get_daily', symbol, outputsize='full')
        sma = get_technical('get_sma', symbol, time_period=60)
        rsi = get_technical('get_rsi', symbol, time_period=60)

        frames = [ticker, sma, rsi]
        result = pd.concat(frames, axis=1)
        # make sure the cache folder exists before writing to it
        os.makedirs(CACHE_DIR, exist_ok=True)
        with open(file, "wb") as f:
            pickle.dump(result, f)
        print("{} saved".format(file))
    return result
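
The caching pattern itself does not depend on Alpha Vantage, so it can be tried out without spending any API calls. The sketch below is a hypothetical generalization, not part of our library: cached_fetch and fake_fetch are names made up for this example, and the fetch step is a stand-in function that simply records how often it was called.

```python
import pickle
import tempfile
from datetime import datetime
from pathlib import Path


def cached_fetch(symbol, fetch, cache_dir):
    """Return today's cached result for symbol, calling fetch only on a miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    symbol = symbol.upper()
    today = datetime.now().strftime("%Y_%m_%d")
    path = cache_dir / "{}_{}.pickle".format(symbol, today)

    if path.is_file():
        # cache hit: load the pickled result
        with path.open("rb") as f:
            return pickle.load(f)

    # cache miss: fetch, then save for the rest of the day
    result = fetch(symbol)
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result


calls = []

def fake_fetch(symbol):
    # stand-in for the real API calls; records each invocation
    calls.append(symbol)
    return {"symbol": symbol, "close": [1.0, 2.0]}


with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch("goog", fake_fetch, tmp)
    second = cached_fetch("goog", fake_fetch, tmp)

print(len(calls))  # the second call is served from the cache
```

Passing the fetch step in as a function keeps the cache logic testable on its own, which is handy while the API-facing code is still changing.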

Charts!

The original author left out all his chart code, so I had to figure things out on my own. No worries.

import matplotlib.pyplot as plt

result = get_symbol("goog")
plt.plot(result.index, result['4. close'], result.index, result.SMA, result.index, result.RSI)
plt.show()
Google stock price (blue), 60-day moving average (orange) and RSI (green)

Since the RSI is such a small number relative to the stock price, let’s chart it separately.

plt.subplot(211, title='Price')
plt.plot(result.index, result['4. close'], result.index, result.SMA)
plt.subplot(212, title="RSI")
plt.plot(result.index, result.RSI)
plt.show()
Much better.

We saved both of these in a plot_ticker function for reuse in our library.

Now I am no expert on matplotlib, and have only done some basic stuff with Plotly in the past. I’m probably spoiled by looking at TradingView’s wonderful chart tools and dynamic interface, so being able to drag and zoom around in the results is really important to me from a usability standpoint. So we’ll leave matplotlib behind from here, and I’ll show you how I used Bokeh in the next part.
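
Before moving on, here is roughly what a plot_ticker helper could look like. This is a sketch rather than the exact function from our library: it assumes the dataframe has the '4. close', 'SMA' and 'RSI' columns shown above, uses plt.subplots instead of the plt.subplot calls from earlier, and is fed a few made-up rows so it runs without any API access.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_ticker(df):
    """Plot close price and SMA in the top panel, RSI in the bottom panel."""
    fig, (price_ax, rsi_ax) = plt.subplots(2, 1, sharex=True)
    price_ax.set_title("Price")
    price_ax.plot(df.index, df["4. close"], label="close")
    price_ax.plot(df.index, df["SMA"], label="SMA")
    price_ax.legend()
    rsi_ax.set_title("RSI")
    rsi_ax.plot(df.index, df["RSI"])
    fig.tight_layout()
    return fig


# made-up values for illustration only, not real quotes
sample = pd.DataFrame(
    {"4. close": [100.0, 101.0, 102.0], "SMA": [99.0, 99.5, 100.0], "RSI": [55.0, 56.0, 57.0]},
    index=pd.DatetimeIndex(["2019-10-09", "2019-10-10", "2019-10-11"], name="date"),
)
fig = plot_ticker(sample)
fig.savefig("ticker.png")  # or call plt.show() with an interactive backend
```

Returning the figure instead of calling plt.show() inside the function leaves it up to the caller whether to display the chart or save it.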

Framing our data

We already showed how we concatenated our price, SMA and RSI data earlier. Let’s take a look at our dataframe metadata. Here ticker holds the combined dataframe returned by get_symbol. I want to show you the columns, the dtype of those columns, as well as that of the index. Tail is included just for illustration.

>>> ticker.columns
Index(['1. open', '2. high', '3. low', '4. close', '5. volume', 'SMA', 'RSI'], dtype='object')

>>> ticker.dtypes
1. open      float64
2. high      float64
3. low       float64
4. close     float64
5. volume    float64
SMA          float64
RSI          float64
dtype: object

>>> ticker.index
DatetimeIndex(['1999-10-18', '1999-10-19', '1999-10-20', '1999-10-21',
               '1999-10-22', '1999-10-25', '1999-10-26', '1999-10-27',
               '1999-10-28', '1999-10-29',

>>> ticker.tail()
            1. open  2. high  3. low  4. close   5. volume       SMA      RSI
date                                                                         
2019-10-09   227.03   227.79  225.64    227.03  18692600.0  212.0238  56.9637
2019-10-10   227.93   230.44  227.30    230.09  28253400.0  212.4695  57.8109
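
One detail worth knowing about the combined frame: pd.concat with axis=1 lines the frames up by their index rather than by row position. The sketch below shows this with made-up numbers. It assumes, for illustration, that the indicator frame is missing its earliest date, as happens with any indicator that needs a look-back window.

```python
import pandas as pd

dates = pd.DatetimeIndex(["2019-10-08", "2019-10-09", "2019-10-10"], name="date")

# made-up prices for three days
prices = pd.DataFrame({"4. close": [10.0, 11.0, 12.0]}, index=dates)

# made-up indicator that only starts on the second day
sma = pd.DataFrame({"SMA": [10.5, 11.5]}, index=dates[1:])

# axis=1 places the frames side by side, matching rows by index label;
# dates missing from one frame are filled with NaN
combined = pd.concat([prices, sma], axis=1)
print(combined)
```

This is why the early rows of a combined frame can contain NaN in the indicator columns while the price columns are complete.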

Now we don’t need all this for Prophet. In fact, it only looks at two columns: a datetime column labeled ‘ds’, and the values that you want to forecast, as floats, labeled ‘y’. In the original example, the author renames and recasts the data, but the recasting is likely only there because of the metadata loss when importing from CSV, and isn’t strictly needed. Additionally, we’d like to preserve our original dataframe as we test our procedure code, so we build a new dataframe instead of modifying the original in place.

def alpha_df_to_prophet_df(df):
    prophet_df = df.get('4. close')\
        .reset_index(level=0)\
        .rename(columns={'date': 'ds', '4. close': 'y'})

    # not needed since dtype is correct already
    # prophet_df['ds'] = pd.to_datetime(prophet_df['ds'])
    # prophet_df['y'] = prophet_df['y'].astype(float)
    return prophet_df

>>> alpha_df_to_prophet_df(ticker).tail()
             ds       y
5026 2019-10-09  227.03
5027 2019-10-10  230.09
5028 2019-10-11  236.21
5029 2019-10-14  235.87
5030 2019-10-15  235.32

In the first line of the prophet_df assignment we’re selecting only the ‘close’ price column, which is returned as a series that keeps the original DatetimeIndex. Resetting the index turns it into a regular ‘date’ column. Finally, we rename both columns to the ‘ds’ and ‘y’ names that Prophet expects.
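
The three steps are easier to follow when each intermediate result gets its own name. This sketch runs them one at a time on a tiny made-up dataframe that assumes the same layout as ours (a DatetimeIndex named 'date' and a '4. close' column).

```python
import pandas as pd

# made-up values for illustration only
df = pd.DataFrame(
    {"4. close": [100.0, 101.5], "SMA": [99.0, 99.5]},
    index=pd.DatetimeIndex(["2019-10-14", "2019-10-15"], name="date"),
)

# step 1: select one column; the result is a Series still indexed by date
close = df["4. close"]

# step 2: move the index into a regular column, giving a two-column DataFrame
flat = close.reset_index()

# step 3: rename to the column names Prophet expects
prophet_df = flat.rename(columns={"date": "ds", "4. close": "y"})

print(prophet_df.dtypes)
```

None of these steps modifies df, so the original dataframe with its SMA column is still available afterwards.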


And that’s it for today! Next time we will be ready to take a look at Prophet. We’ll process our data, use Bokeh to display it, and finally write a procedure which we can use to process data in bulk.