As a stock trader, you need to master technical analysis. Technical analysis is based on the premise that past prices can be used to predict future prices. Technical analysts hold that certain price patterns can predict future stock prices fairly accurately. Increasingly, traders are turning to quantitative analysis in addition to technical analysis. Quantitative analysis lets you use the power of machine learning and data science to predict stock price movement. Algorithmic trading is on the rise. Did you read the post on how to develop algorithmic trading strategies using Python that beat the S&P 500? In this post we are going to discuss how we can use the power of Python for machine learning and data science.
Before we can use the power of Python for machine learning and data science, we need to be able to load the data to be analyzed into a Python IDE. It is very important to use the right Python IDE. There are a number of Python IDEs (IDE stands for Integrated Development Environment). First you need to download the Python interpreter. You can do it directly from the official Python site.
We recommend that you use Anaconda. Spyder is an IDE developed specifically for quantitative and scientific analysis, and it comes by default with Anaconda. Spyder is lightweight and provides a good environment for quantitative analysis. You can also check out Quantopian. There is nothing to install. The compilers are hosted online, and Quantopian takes care of providing you with the right IDE when coding in a particular language.
There are dozens of languages that you can use. R is another powerful language for machine learning that many use to analyze financial data. Did you read the post on how to use the Quantmod R package to analyze daily stock market data? R is a powerful scripting language that is well developed for reading financial time series data from the web as well as from CSV files. Yahoo Finance is a major source of daily quotes for almost all stocks listed on the major stock exchanges. R can easily connect to Yahoo Finance and download the stock data for you if you provide the right ticker symbol.
Stock prices are one of the most important types of financial time series. Python previously lacked good tools for dealing with financial time series data. This problem was solved some years back by Wes McKinney when he was working at AQR Capital Management, a large hedge fund. He borrowed ideas from the R language and developed the Pandas library for Python, which can deal with financial time series data in much the same manner as R using the concepts of DataFrame and time Series classes. Watch the following 3 videos that give you an introduction to the Pandas library.
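To make the DataFrame and time series idea concrete, here is a minimal sketch of a date-indexed price series in Pandas. The closing prices are rounded from the S&P 500 figures quoted in this post, and the variable names are only illustrative.

```python
import pandas as pd

# Closing prices rounded from the S&P 500 figures quoted in this post.
dates = pd.to_datetime(
    ["2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07", "2010-01-08"]
)
close = pd.Series(
    [1132.99, 1136.52, 1137.14, 1141.69, 1144.98], index=dates, name="Close"
)

# Look up a single day by its timestamp label.
one_day = close.loc[pd.Timestamp("2010-01-06")]

# Slice a date range with plain strings; both endpoints are included.
window = close.loc["2010-01-05":"2010-01-07"]

# A DataFrame holds several aligned columns on the same date index.
frame = pd.DataFrame({"Close": close})
frame["Return"] = frame["Close"].pct_change()  # day-over-day fractional change

print(frame)
```

Because the index holds real dates rather than row numbers, selecting by day or by date range reads much like it would in R. The first value of the Return column is empty because there is no prior day to compare against.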
Most professional traders have been using Excel in their daily stock market analysis. As explained in the above video, Excel is pretty slow when it comes to data analysis. Excel also lacks the power to do machine learning. A few years back I tried to implement a neural network using Excel. I could only make a very basic and simple neural network. Excel cannot implement many of the powerful machine learning algorithms like neural networks, support vector machines, random forests, fuzzy logic, Kalman filtering, particle filtering, LASSO, AdaBoost etc. Without these powerful algorithms you are just not equipped to build a good predictive model for financial time series that you can use in your trading. We are traders, and all this machine learning and data analysis has only one objective: building a better and more accurate trading system.
Now we are particularly interested in downloading daily stock market data from Yahoo Finance. Yahoo Finance has comprehensive stock market financial data. Sometimes this stock data doesn't get updated in time, but most of the time you will find the stock market data on Yahoo Finance to be good enough for machine learning and data analysis. There are also other sites that you can use to download financial market data, including Google Finance.
Below is the Python code that will download the daily stock market data from Yahoo Finance. You just need to provide the ticker symbol and the start and end dates for the data.
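# Import stock market data from Yahoo Finance
import datetime
import pandas_datareader.data as web
import matplotlib.pyplot as plt
# Specify the dates for the stock quotes (chosen here to match the output shown below)
startDate = datetime.datetime(2010, 1, 1)
endDate = datetime.datetime(2016, 10, 21)
# Download data from Yahoo Finance
SP500 = web.DataReader("^GSPC", "yahoo", startDate, endDate)
# Summarize the dataframe (index range, columns, dtypes)
SP500.info()
# Show the first 5 rows of the dataframe
print(SP500.head())
# Show the last 5 rows of the dataframe
print(SP500.tail())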
Below is the output when you run the above Python code. This is a very basic Python script that downloads daily S&P 500 index data from January 2010 to October 2016. You can download daily data for any other stock using the same script. Just replace the ticker symbol ^GSPC with the appropriate stock ticker symbol. For example, if you want to download Google data, replace ^GSPC with GOOG or GOOGL, depending on which share class you want to analyze.
Python 3.5.2 |Anaconda 4.0.0 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> runfile('D:/Documents/Documents/script.py', wdir='D:/Documents/Documents')
DatetimeIndex: 1714 entries, 2010-01-04 to 2016-10-21
Data columns (total 6 columns):
Open 1714 non-null float64
High 1714 non-null float64
Low 1714 non-null float64
Close 1714 non-null float64
Volume 1714 non-null int64
Adj Close 1714 non-null float64
dtypes: float64(5), int64(1)
memory usage: 93.7 KB
                   Open         High          Low        Close      Volume  \
2010-01-04  1116.560059  1133.869995  1116.560059  1132.989990  3991400000
2010-01-05  1132.660034  1136.630005  1129.660034  1136.520020  2491020000
2010-01-06  1135.709961  1139.189941  1133.949951  1137.140015  4972660000
2010-01-07  1136.270020  1142.459961  1131.319946  1141.689941  5270680000
2010-01-08  1140.520020  1145.390015  1136.219971  1144.979980  4389590000
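If you swap ticker symbols often, the download step can be wrapped in a small helper. The sketch below is only one way to do it: the function names `download_daily` and `download_many` and the optional `reader` argument are hypothetical, and it assumes the `DataReader` call with the "yahoo" source still works in your installed version of pandas_datareader.

```python
from datetime import datetime


def download_daily(ticker, start, end, reader=None):
    """Return daily quotes for one ticker symbol from Yahoo Finance."""
    if reader is None:
        # Imported lazily so a stand-in reader can be passed in for offline testing.
        import pandas_datareader.data as web
        reader = web.DataReader
    return reader(ticker, "yahoo", start, end)


def download_many(tickers, start, end, reader=None):
    """Return a dict that maps each ticker symbol to its downloaded data."""
    return {t: download_daily(t, start, end, reader) for t in tickers}
```

With a working network connection, a call such as `download_many(["^GSPC", "GOOG", "GOOGL"], datetime(2010, 1, 1), datetime(2016, 10, 21))` would give you one dataframe per symbol, keyed by ticker.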
Did you read the post on how I made $500K in 1 year with machine learning and HFT trading? In the next post we are going to discuss how to download high frequency data.