Stock Price Simulator


Image Source:Giphy.com

 

Hi! I'm Z, and this is a simple Stock Price Forecasting Web App that I built for fun using a combination of code from different online articles (sources listed below) and some of my own original code.

 

About This Web App

The web app allows you to analyze historical stock prices, perform short-term forecasts (~1-2 weeks), and compare results between different 'traditional' time series forecasting methods (ARIMA, Exponential Smoothing, etc.).

Pick an option from the list below and refer to the notes for more details.

Unless otherwise stated, all calculations reference the Adjusted Closing Prices, which are extracted from Yahoo Finance through Python's pandas-datareader library and processed in the backend by a Python script.

By default, this web app only plots and tabulates the last 60 "tail" values of any actual vs predicted stock prices. However, the full table is available in the Jupyter Notebook (.ipynb) version of this file on GitHub.

Results will open in a new tab to allow for easy comparison between different selections.

Be aware that some selections need more time for the results to be calculated (e.g. Auto Holt Winters, Efficient Portfolio Weights).

Please Note: I am not a financial expert of any sort, so any use of this tool for investment decisions is at your own risk!

 
Note - Please be patient as this needs about 1-2 minutes to run (1-2 secs per stock).

Works best for ~30-50 counters (stocks).

E.g. for a mix of 50 different S&P 500 tech stocks: AAPL, ACN, ADBE, ADSK, AKAM, AMD, AVGO, BR, CDW, CRM, CSCO, CTSH, CTXS, DXC, ENPH, FFIV, FIS, FISV, FLIR, FLT, FTNT, GLW, GPN, HPE, HPQ, IBM, INTC, INTU, IT, JNPR, MA, MCHP, MSFT, MSI, MU, NOW, NTAP, NVDA, ORCL, PYPL, QCOM, STX, SWKS, TER, TXN, V, VRSN, WU, XRX, ZBRA



Description | Select | Choose Hi/Lo | Factor Weight
Valuation Measures
Market Cap (intraday)
Enterprise Value
Trailing P/E
Forward P/E
PEG Ratio (5 yr expected)
Price/Sales (ttm)
Price/Book (mrq)
Enterprise Value/Revenue
Enterprise Value/EBITDA
Financial Highlights
Fiscal Year
Fiscal Year End Date
Most Recent Quarter (mrq) Date
Profitability
Profit Margin
Operating Margin (ttm)
Management Effectiveness
Return on Assets (ttm)
Return on Equity (ttm)
Income Statement
Revenue (ttm)
Revenue Per Share (ttm)
Quarterly Revenue Growth (yoy)
Gross Profit (ttm)
EBITDA
Net Income Avi to Common (ttm)
Diluted EPS (ttm)
Quarterly Earnings Growth (yoy)
Balance Sheet
Total Cash (mrq)
Total Cash Per Share (mrq)
Total Debt (mrq)
Total Debt/Equity (mrq)
Current Ratio (mrq)
Book Value Per Share (mrq)
Cash Flow Statement
Operating Cash Flow (ttm)
Levered Free Cash Flow (ttm)
Stock Price History
Beta (5Y Monthly)
52-Week Change
52 Week High
52 Week Low
50-Day Moving Average
200-Day Moving Average
Share Statistics
Avg Vol (3 month)
Avg Vol (10 day)
Shares Outstanding
Float
% Held by Insiders
% Held by Institutions
Dividends and Splits
Forward Annual Dividend Rate
Forward Annual Dividend Yield
Trailing Annual Dividend Rate
Trailing Annual Dividend Yield
5 Year Average Dividend Yield
Payout Ratio
Dividend Date
Ex-Dividend Date
Last Split Factor
Last Split Date

 

Type Title Description More Detailed References Source  Code 
Analysis Candlestick Charts

Candlestick charts show the daily open, high, low, and close prices and are used to gauge possible price movements based on past patterns.

Candlestick Charting (Investopedia.com)

Adapted from:

https://medium.com/swlh/generating-candlestick-charts-from-scratch-ef6e1d3cf0e9

 


Image Source:Quadency.com

 

Type Title Description More Detailed References Source  Code 
Analysis Log Daily Returns Chart

Log Returns are often used instead of simple returns to help normalize the underlying time series before applying other forecasting techniques.

Log Returns = ln(Price(t+1) / Price(t))

vs

Simple Arithmetic Returns = [Price(t+1) - Price(t)] / Price(t)

In addition, this option also:

  • plots the 'best fit' normal distribution for the log returns data;
  • plots a Q-Q chart for the log returns data;
  • checks the skew and kurtosis of the distribution;
  • checks whether the mean and variance are constant by splitting the training data into start, middle, and end segments and comparing them;
  • tests whether the log returns are normally distributed via a Kolmogorov-Smirnov (KS) test;
  • tests whether the log returns series is stationary via an Augmented Dickey-Fuller (ADF) test.
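A minimal sketch of the return calculations and the skew/kurtosis check described above, using only the Python standard library. It is an illustration rather than the app's own code, the price list is made up, and the helper names are hypothetical.

```python
import math
import statistics

def log_returns(prices):
    """ln(P[t+1] / P[t]) for each consecutive pair of prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def simple_returns(prices):
    """(P[t+1] - P[t]) / P[t] for each consecutive pair of prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def skew_and_excess_kurtosis(values):
    """Population skewness and excess kurtosis (both are 0 for a normal distribution)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    n = len(values)
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n - 3.0
    return skew, kurt

# Made-up closing prices, for illustration only
prices = [100.0, 102.0, 101.0, 105.0, 104.0, 108.0]
log_r = log_returns(prices)
simple_r = simple_returns(prices)
skew, kurt = skew_and_excess_kurtosis(log_r)
```

One practical property of log returns is that they add up over time: the sum of `log_r` equals the log of the total price change over the whole period.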

Why Log Returns? (Quantivity Blog)

Kolmogorov-Smirnov Test For Normality (Real-Statistics.com)

Augmented Dickey Fuller Test (StatisticsHowTo.com)

Q-Q Plots (StatisticsHowTo.com)

N/A - Original Content

 

Skewness & Kurtosis (Source: https://www.pinterest.com/pin/422281202809163/)

Q-Q Plot (Source: https://stats.stackexchange.com/questions/101274/how-to-interpret-a-qq-plot)

Examples of Stationary vs Non-Stationary Series (Source: https://stats.stackexchange.com/questions/182764/stationarity-tests-in-r-checking-mean-variance-and-covariance)

 

 

 

Type Title Description More Detailed References Source  Code 
Analysis Relative Strength Index Chart (RSI)

RSI is a technical indicator that charts the current and historical strength or weakness of a stock based on the closing prices of a recent trading period.

Values above 70 indicate overbought shares and values below 30 indicate oversold shares.

The RSI period is typically set to 14 days, with the average gains and losses calculated using Simple Moving Averages (SMA).
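A rough sketch of an SMA-based RSI, assuming the usual formula RSI = 100 - 100 / (1 + RS), where RS is the average gain divided by the average loss over the period. The function name is hypothetical and this is not the app's own code.

```python
def rsi_sma(prices, period=14):
    """RSI values using simple averages of gains and losses over `period` price changes."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    values = []
    for end in range(period, len(changes) + 1):
        window = changes[end - period:end]
        avg_gain = sum(c for c in window if c > 0) / period
        avg_loss = sum(-c for c in window if c < 0) / period
        if avg_loss == 0:
            # No losses in the window (this also covers a completely flat window)
            values.append(100.0)
        else:
            rs = avg_gain / avg_loss
            values.append(100.0 - 100.0 / (1.0 + rs))
    return values

# Made-up prices that alternate up and down by the same amount
example_rsi = rsi_sma([10.0 + (i % 2) for i in range(15)])
```

With equal average gains and losses, as in the alternating example, RS is 1 and the RSI sits at the neutral midpoint of the 0-100 scale.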

What Is RSI ? (Fidelity.com)

Adapted From:

https://stackoverflow.com/questions/20526414/relative-strength-index-in-python-pandas

 


Image Source:InvestorsUnderground.com

 

Type Title Description More Detailed References Source  Code 
Analysis Moving Average Convergence Divergence (MACD)

Moving average convergence divergence (MACD) is a trend-following momentum indicator that shows the relationship between two moving averages of a stock price.

The MACD line is typically calculated by subtracting the 26-period exponential moving average (EMA) from the 12-period EMA. A 9-period EMA of the MACD line is then used as the 'signal' line (i.e. set Periods 1, 2, and 3 to 12, 26, and 9 days respectively).
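The calculation above can be sketched as follows, assuming each EMA uses a smoothing factor of 2 / (span + 1) and is seeded with the first value. Both are common conventions but are assumptions here, and the function names are hypothetical.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1), seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def macd(prices, fast=12, slow=26, signal=9):
    """Return the MACD line, the signal line, and their difference (the histogram)."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram

# Made-up, steadily rising prices
macd_line, signal_line, histogram = macd([float(i) for i in range(1, 61)])
```

For a steadily rising series the fast EMA stays above the slow EMA, so the MACD line is positive.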

What Is MACD ? (Fidelity.com)

-NA-

 


Image Source:KLSEi3Investor.com

 

Type Title Description More Detailed References Source  Code 
Analysis Bollinger Bands Chart

Bollinger Bands show a 'band' around a simple moving average (SMA) of a stock price, with the upper and lower bands set a chosen number of standard deviations above and below the SMA. Typical values are a smoothing period of 20 days and 2 standard deviations.

What Are Bollinger Bands? (Fidelity.com)
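As a rough illustration of the band calculation for this option, here is a standard-library sketch. It assumes the population standard deviation of each window (some tools use the sample standard deviation instead), and the function name is hypothetical.

```python
import statistics

def bollinger_bands(prices, period=20, num_std=2.0):
    """Return (middle, upper, lower) lists, with one value per full window of `period` prices."""
    middle, upper, lower = [], [], []
    for end in range(period, len(prices) + 1):
        window = prices[end - period:end]
        sma = statistics.fmean(window)
        std = statistics.pstdev(window)  # assumption: population std of the window
        middle.append(sma)
        upper.append(sma + num_std * std)
        lower.append(sma - num_std * std)
    return middle, upper, lower

# Made-up prices, for illustration only
prices = [float(i) for i in range(1, 26)]
middle, upper, lower = bollinger_bands(prices)
```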

Adapted From:

https://tcoil.info/compute-bollinger-bands-for-stocks-with-python-and-pandas/

 


Image Source:TradingWithRayner.com

 

Type Title Description More Detailed References Source  Code 
Forecast Moving Average Forecast (Univariate)

Forecasts prices using either a Simple Moving Average (SMA) or an Exponentially Weighted Moving Average (EMA) for a selected smoothing period.

SMA and EMAs (SchoolStockCharts.com)
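One simple way to turn a moving average into a forecast is sketched below. This is an assumption about the general approach, not the app's actual method: the SMA version feeds each forecast back in as if it were an observation, and the exponentially weighted version repeats the last smoothed level.

```python
def sma_forecast(prices, period, horizon):
    """Each forecast is the mean of the last `period` values, including earlier forecasts."""
    history = list(prices)
    forecasts = []
    for _ in range(horizon):
        nxt = sum(history[-period:]) / period
        forecasts.append(nxt)
        history.append(nxt)
    return forecasts

def ewma_forecast(prices, period, horizon):
    """Flat forecast: the last exponentially weighted average, repeated `horizon` times."""
    alpha = 2.0 / (period + 1)  # assumption: the common span-based smoothing factor
    level = prices[0]
    for p in prices[1:]:
        level = alpha * p + (1 - alpha) * level
    return [level] * horizon

# Made-up prices, for illustration only
sma_out = sma_forecast([1.0, 2.0, 3.0], period=3, horizon=2)
ewma_out = ewma_forecast([10.0, 10.0, 10.0], period=2, horizon=3)
```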

Adapted From:

Intellipaat.com

 

Comparison: Exponential vs Simple Moving Averages (Source: https://medium.com/@ethan_johnson03/exploring-exponential-moving-average-56a3e89530c7)

Comparison of EMA vs SMA for a 10-Day Period (Source: https://www.researchgate.net/figure/Comparison-of-moving-averages-The-SMA-has-more-lag-than-the-EMA_fig1_235766539)

Comparison of moving averages: The SMA has more lag than the EMA.


 

Type Title Description More Detailed References Source  Code 
Forecast

Auto-ARIMA+GARCH Forecast

(ARIMA = Auto Regressive Integrated Moving Average

GARCH = Generalised Auto Regressive Conditional Heteroskedasticity)

An (S)ARIMA model is used to model the log RETURNS (which are assumed to be stationary). The inputs for SARIMA, i.e. the (p,d,q,P,D,Q) values, are AUTOMATICALLY set via pmdarima, a Python package that searches for the configuration with the lowest AIC (Akaike Information Criterion).

However, since the volatility of the returns is likely to be non-constant, a GARCH(1,1) model is used to improve forecasting performance.
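To show what GARCH(1,1) adds, here is a sketch of its conditional variance recursion in plain Python. The residuals and the parameters (omega, alpha, beta) are made-up values rather than fitted ones. In practice a fitting library estimates them, and this is not the app's own code.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variance path: sigma2[t] = omega + alpha * eps[t-1]**2 + beta * sigma2[t-1].

    The recursion starts from the long-run variance omega / (1 - alpha - beta),
    which only exists when alpha + beta < 1.
    """
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    sigma2 = [omega / (1.0 - alpha - beta)]
    for eps in residuals:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2  # the last element is the one-step-ahead variance forecast

# Hypothetical ARIMA residuals and GARCH parameters (not fitted to real data)
residuals = [0.01, -0.02, 0.005, 0.03, -0.01]
sigma2 = garch11_variance(residuals, omega=1e-5, alpha=0.1, beta=0.85)
```

A large residual (such as the 0.03 above) pushes the next variance estimate up, which is how the model captures volatility clustering.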

Please be patient as the algorithm needs ~1-2 min per stock to complete.

ARIMA GARCH Forecasting (Article from Medium.com)

Adapted From:

ARIMA GARCH Forecasting (Article from Medium.com)

 

The ARIMA model estimates the conditional mean (in the red box), while a subsequent GARCH model estimates the conditional variance present in the residuals of the ARIMA estimation (in the green box).

 

Type Title Description More Detailed References Source  Code 
Forecast

Auto Holt Winters

Triple Exponential Smoothing

Fits a Triple Exponential Smoothing model against historical PRICES.

It runs a 'brute-force' search over all combinations of trend and seasonality type (additive, multiplicative, or none) for a user-provided seasonal period and selects the configuration with the lowest RMSE on the training data.

Typically, seasonality can be set at 252 days (~1 year's worth of trading days) or 5 days (1 week).

When selecting Start/End Date for historical training data, please note that this algorithm must have AT LEAST 2 seasonal periods for it to work.
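The search space and the minimum-data rule above can be sketched as follows. This only enumerates the candidate configurations; the model fitting and RMSE scoring steps are left out, and the function name and the "add"/"mul" labels are illustrative assumptions.

```python
from itertools import product

def holt_winters_configs(n_obs, seasonal_period):
    """List (trend, seasonal) combinations to try; None means the component is switched off."""
    if n_obs < 2 * seasonal_period:
        raise ValueError("need at least 2 full seasonal periods of training data")
    options = ["add", "mul", None]
    return list(product(options, options))

# Example: 600 daily prices with a yearly (252 trading day) seasonal period
configs = holt_winters_configs(n_obs=600, seasonal_period=252)
```

Each configuration would then be fitted to the training data and scored, which is why the run time grows with the number of combinations.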

NOTE: Please be patient as the algorithm needs ~3-4 min per stock to complete its run.

Holt Winters Forecasting For Dummies (https://grisha.org)

Adapted From:

How to Grid Search Triple Exponential Smoothing for Time Series Forecasting in Python

(Machine Learning Mastery)


Source : https://kourentzes.com/forecasting/2014/11/09/additive-and-multiplicative-seasonality/ 

 

Type Title Description More Detailed References Source  Code 
Forecast

Vector Auto Regression Forecast (Multivariate)

Vector Auto Regression (VAR) is a MULTIVARIATE forecasting algorithm used when two or more time series influence each other. This forecast model applies VAR over the Log Returns data for multiple stocks.

Unlike ARMA/ARIMA, which are UNIVARIATE and 'one-directional', Vector Auto Regression (VAR) is bi-directional: each stock's returns can influence the others.

The algorithm automatically selects the best lag order, reflecting the auto-regressive relationship between the different time series, based on the lowest Akaike Information Criterion score.

In addition, this option also returns a number of statistical tests to help confirm the validity of the forecasting model:

  • Granger's causality test checks the null hypothesis that the coefficients of the past values of one series in the regression equation for another series are zero (i.e. that the first series does not help predict the second).
  • Johansen's cointegration test helps to establish the presence of a statistically significant connection between two or more time series.
  • The Durbin-Watson test checks for any serial correlation in the residuals (i.e. whether there is any leftover pattern in the residual errors).

(Note - Use the Analysis > Log Returns to check the ADF Statistic as Vector Auto Regression requires stationary data)
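Of the tests listed above, the Durbin-Watson statistic is simple enough to compute by hand. A minimal sketch using the standard formula follows; the function name is hypothetical and the residuals are made up.

```python
def durbin_watson(residuals):
    """DW = sum((e[t] - e[t-1])**2) / sum(e[t]**2).

    The statistic lies between 0 and 4; values near 2 suggest no serial correlation,
    values near 0 positive correlation, and values near 4 negative correlation.
    """
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Made-up residuals that flip sign every step (strong negative serial correlation)
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
```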

Multivariate Time Series Forecasting (AnalyticsVidhya.com)

Granger's Causality Test (StatisticsHowTo.com)

Johansen's Cointegration Test (StatisticsHowTo.com)

Durbin-Watson Test (Investopedia.com)

Adapted From:

https://www.machinelearningplus.com/time-series/vector-autoregression-examples-python

 

Type Title Description More Detailed References Source  Code 
Forecast

Geometric Brownian Motion (GBM) Forecast (Multivariate)

Geometric Brownian Motion (GBM) models stock prices as a random walk with a constant drift accompanied by random shocks.

It assumes that the LOG RETURNS are normally distributed, with the mean and variance of the LOG returns estimated from the historical price data and assumed to be constant.

This function also works for MULTIVARIATE inputs, as it estimates the covariance between the log returns of different stocks and incorporates it into the 'random shock' component.
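A single-stock sketch of the idea: estimate the mean and standard deviation of the historical log returns, then grow the last price by exp(mean + std * random normal shock) at each step. The prices are made up and the function name is hypothetical. The multivariate case, which draws correlated shocks from the covariance matrix, is not shown.

```python
import math
import random
import statistics

def simulate_gbm_path(prices, horizon, rng):
    """Simulate `horizon` future prices assuming independent, normally distributed log returns."""
    log_r = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mu = statistics.fmean(log_r)     # mean log return per step (the drift)
    sigma = statistics.stdev(log_r)  # sample std of log returns (the size of the shocks)
    path = [prices[-1]]
    for _ in range(horizon):
        shock = rng.gauss(0.0, 1.0)
        path.append(path[-1] * math.exp(mu + sigma * shock))
    return path[1:]

rng = random.Random(42)  # fixed seed so the run is repeatable
history = [100.0, 101.5, 100.8, 102.3, 103.1, 102.6, 104.0]  # made-up prices
paths = [simulate_gbm_path(history, horizon=10, rng=rng) for _ in range(500)]
p50_final = statistics.median(p[-1] for p in paths)  # median (P50) price at the horizon
```

Because prices are multiplied by an exponential at each step, simulated prices can never go negative.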

How To Use Monte Carlo Simulations With GBM (Investopedia.com)

Adapted and modified from:

https://towardsdatascience.com/how-to-simulate-financial-portfolios-with-python-d0dc4b52a278

 

Type Title Description More Detailed References Source  Code 
Forecast

Bootstrap Forecast (Multivariate)

Similar to the GBM model, the Bootstrap Sampling forecast model assumes that future stock returns are random but can be estimated using the distribution of historical stock returns. 

However, unlike the GBM model, it does not require the Log Returns to be normally distributed. Instead, future prices are 'back-calculated' by sampling from historical returns and translating these back into absolute price values.

This function also works for MULTIVARIATE inputs, as it samples 'sets' of returns from the same time slice, thereby capturing correlation effects between stocks.
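The 'same time slice' sampling can be sketched like this. It assumes simple (not log) daily returns and uses made-up numbers for two hypothetical stocks, and the function name is hypothetical too.

```python
import random

def bootstrap_path(returns_by_day, last_prices, horizon, rng):
    """Simulate one future path for several stocks at once.

    returns_by_day holds one tuple per historical day, with one simple return per stock,
    so sampling a whole tuple keeps each day's moves across stocks together.
    """
    prices = list(last_prices)
    path = []
    for _ in range(horizon):
        day = rng.choice(returns_by_day)  # sample one historical day, with replacement
        prices = [p * (1.0 + r) for p, r in zip(prices, day)]
        path.append(tuple(prices))
    return path

rng = random.Random(7)  # fixed seed so the run is repeatable
# Made-up daily simple returns for two hypothetical stocks
returns_by_day = [(0.01, 0.012), (-0.02, -0.015), (0.005, 0.0), (0.015, 0.02), (-0.01, -0.008)]
path = bootstrap_path(returns_by_day, last_prices=[100.0, 50.0], horizon=5, rng=rng)
```

Repeating this many times and taking percentiles of the simulated prices gives a forecast range without assuming any particular distribution.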

Monte Carlo vs Bootstrapping (PythonForFinance.net)

-NA-

 

Type Title Description More Detailed References Source  Code 
Forecast

Portfolio Weights For Optimal Risk-Returns USING HISTORICAL RETURNS

 

Modern Portfolio Theory assumes that there exists a set of optimal portfolios (Efficient Portfolios) that offer the highest expected return for a defined level of risk or the lowest risk for a given level of expected return

This takes HISTORICAL returns data for a given start-end date and uses Monte Carlo Simulation to calculate the Std Deviation and Mean Log Returns for random combinations of stock weights, then plots them against each other.

The Sharpe Ratio and Sortino Ratio for each stock weight configuration are calculated to find the "optimal" weights with the best returns (Mean) for a given risk (standard deviation).
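A compact sketch of the random-weights search, assuming long-only weights that sum to 1, a zero risk-free rate, and a zero Sortino target. The returns are made up for three hypothetical stocks, the function names are hypothetical, and this is not the app's own code.

```python
import math
import random
import statistics

def random_weights(n, rng):
    """Random long-only weights that sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [w / total for w in raw]

def portfolio_returns(returns_by_day, weights):
    """Weighted sum of the per-stock returns for each day."""
    return [sum(w * r for w, r in zip(weights, day)) for day in returns_by_day]

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the standard deviation of returns."""
    return (statistics.fmean(returns) - risk_free) / statistics.stdev(returns)

def sortino_ratio(returns, target=0.0):
    """Like Sharpe, but only returns below `target` count as risk."""
    downside = math.sqrt(sum(min(0.0, r - target) ** 2 for r in returns) / len(returns))
    return (statistics.fmean(returns) - target) / downside

rng = random.Random(0)  # fixed seed so the run is repeatable
returns_by_day = [(0.01, 0.012, -0.004), (-0.02, -0.015, 0.006), (0.005, 0.0, 0.002),
                  (0.015, 0.02, -0.001), (-0.01, -0.008, 0.003)]  # made-up returns
candidates = [random_weights(3, rng) for _ in range(1000)]
best = max(candidates, key=lambda w: sharpe_ratio(portfolio_returns(returns_by_day, w)))
```

Plotting the mean against the standard deviation for every candidate gives the familiar cloud of portfolios, with the efficient ones along its upper-left edge.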

NOTE: Please be patient as the algorithm takes time to run, e.g. ~1,000 iterations take about 3-5 minutes.

Efficient Frontier

Mean Variance Portfolio Analysis

Sortino Ratio

Sharpe Ratio

(Investopedia.com)

-NA-

 

Image Source:http://datascience.uconn.edu/index.php/projects/students-work/item/61-markowitz-portfolio-construction 

 

Type Title Description More Detailed References Source  Code 
Forecast

Portfolio Weights For Optimal Risk-Returns Using FUTURE FORECAST RETURNS

 

Similar to the previous option that used HISTORICAL RETURNS, with ONE KEY DIFFERENCE.

Instead of using HISTORICAL prices, the algorithm forecasts the P50 FUTURE returns using the GBM or Bootstrap Sampling method. Random combinations of stock weights are then tested to plot the Std Deviation and Mean Log Returns using the same Monte Carlo Simulation.

Note that the number of iterations used by GBM or Bootstrap to estimate the P50 returns is FIXED at 10 to avoid a time-out. Use the IPYNB version of the code if you want to experiment further.

The Sharpe Ratio and Sortino Ratio for each stock weight configuration are calculated to find the "optimal" weights with the best returns (Mean) for a given risk (standard deviation).

NOTE: Please be patient as the algorithm takes time to run, e.g. ~1,000 iterations take about 3-5 minutes.

Efficient Frontier

Mean Variance Portfolio Analysis

Sortino Ratio

Sharpe Ratio

(Investopedia.com)

-NA-

 

Image Source:http://datascience.uconn.edu/index.php/projects/students-work/item/61-markowitz-portfolio-construction 

 

Type Title Description More Detailed References Source Code
Analysis Multi Factor Ranking & Screening

The user selects the relevant factors (key statistics from Yahoo Finance), the factor weights, a normalization method (Z-Score, Min-Max, or Percentile), and an aggregation type (Additive or Multiplicative).

The code then calculates a composite score for each stock, ranks the stocks to determine the "Top Quartile", and compares them against the latest annual year-on-year stock return.
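A small sketch of one combination of those choices: Z-Score normalization with Additive aggregation. The tickers and factor values are made up, and the `higher_is_better` flag is a hypothetical stand-in for the Hi/Lo choice in the selection table.

```python
import statistics

def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def composite_scores(factor_table, weights, higher_is_better):
    """Additive composite: weighted sum of z-scores, sign-flipped where low values are preferred."""
    tickers = list(factor_table)
    totals = {t: 0.0 for t in tickers}
    for factor, weight in weights.items():
        zs = z_scores([factor_table[t][factor] for t in tickers])
        sign = 1.0 if higher_is_better[factor] else -1.0
        for t, z in zip(tickers, zs):
            totals[t] += weight * sign * z
    # Best score first
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Made-up numbers for hypothetical tickers
factor_table = {
    "AAA": {"Trailing P/E": 15.0, "Return on Equity (ttm)": 0.25},
    "BBB": {"Trailing P/E": 30.0, "Return on Equity (ttm)": 0.10},
    "CCC": {"Trailing P/E": 22.0, "Return on Equity (ttm)": 0.18},
}
ranking = composite_scores(
    factor_table,
    weights={"Trailing P/E": 0.5, "Return on Equity (ttm)": 0.5},
    higher_is_better={"Trailing P/E": False, "Return on Equity (ttm)": True},
)
```

The top quartile is then simply the first 25% of `ranking`.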

 

Factor Investing (Source: Investopedia)

Based on some original code hosted on Google Colab here: Colab Workbook

 

               

 


Troubleshooting Common Errors

If you encounter unusual or very poor forecast results, consider the following:


About Me

I'm Z (zhijingeu@yahoo.com), an amateur Data Science enthusiast based in Asia Pacific.

Connect With Me On LinkedIn: https://www.linkedin.com/in/eu-zhijing-25a4362 

Read Some Of My Medium Articles: https://zhijingeu.medium.com/

Visit My Github Page: https://github.com/ZhijingEu/


Feedback 

Like this site? Let me know!
