Returns of financial stocks
We will use the tidyquant package to download historical data of stock prices, calculate returns, and examine the distribution of returns.
We must first identify which stocks we want to download data for, and for this we must know their ticker symbol; Apple is known as AAPL, Microsoft as MSFT, McDonald’s as MCD, etc. The file nyse.csv contains 508 stocks listed on the NYSE, their ticker symbol, name, the IPO (Initial Public Offering) year, and the sector and industry the company is in.
nyse <- read_csv(here::here("data","nyse.csv"))
glimpse(nyse)
## Rows: 508
## Columns: 6
## $ symbol <chr> "MMM", "ABB", "ABT", "ABBV", "ACN", "AAP", "AFL", "A", "~
## $ name <chr> "3M Company", "ABB Ltd", "Abbott Laboratories", "AbbVie ~
## $ ipo_year <chr> "n/a", "n/a", "n/a", "2012", "2001", "n/a", "n/a", "1999~
## $ sector <chr> "Health Care", "Consumer Durables", "Health Care", "Heal~
## $ industry <chr> "Medical/Dental Instruments", "Electrical Products", "Ma~
## $ summary_quote <chr> "https://www.nasdaq.com/symbol/mmm", "https://www.nasdaq~
Based on this dataset, a table and a bar plot was created to show the number of companies per sector, in descending order
# Create table by sector in descending order of number of companies
companiesPerSector <- nyse%>%
group_by(sector)%>%
count(sort = TRUE, name = "numOfComp")
companiesPerSector
## # A tibble: 12 x 2
## # Groups: sector [12]
## sector numOfComp
## <chr> <int>
## 1 Finance 97
## 2 Consumer Services 79
## 3 Public Utilities 60
## 4 Capital Goods 45
## 5 Health Care 45
## 6 Energy 42
## 7 Technology 40
## 8 Basic Industries 39
## 9 Consumer Non-Durables 31
## 10 Miscellaneous 12
## 11 Transportation 10
## 12 Consumer Durables 8
# Create bar plot
ggplot(companiesPerSector, aes(x = numOfComp, y = reorder(sector,numOfComp),fill=sector))+
geom_col()+
labs(x = "Number of Companies",
y = "Sector",
title = "Number of Companies in each Sector")+
theme_bw()+
theme(legend.position="none")+
NULL

6 stocks including the the SP500 ETF (Exchange Traded Fund) was selected to perform the analysis.
# Select prices of the 6 stocks with SPY with a time frame
myStocks <- c("AAPL","DIS","DPZ","ANF","TSLA","SPY" ) %>%
tq_get(get = "stock.prices",
from = "2011-01-01",
to = "2021-08-31") %>%
group_by(symbol)
# Examine the structure of the resulting data frame
glimpse(myStocks)
## Rows: 16,098
## Columns: 8
## Groups: symbol [6]
## $ symbol <chr> "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL~
## $ date <date> 2011-01-03, 2011-01-04, 2011-01-05, 2011-01-06, 2011-01-07, ~
## $ open <dbl> 11.63000, 11.87286, 11.76964, 11.95429, 11.92821, 12.10107, 1~
## $ high <dbl> 11.79500, 11.87500, 11.94071, 11.97321, 12.01250, 12.25821, 1~
## $ low <dbl> 11.60143, 11.71964, 11.76786, 11.88929, 11.85357, 12.04179, 1~
## $ close <dbl> 11.77036, 11.83179, 11.92857, 11.91893, 12.00429, 12.23036, 1~
## $ volume <dbl> 445138400, 309080800, 255519600, 300428800, 311931200, 448560~
## $ adjusted <dbl> 10.10622, 10.15897, 10.24207, 10.23379, 10.30708, 10.50119, 1~
Financial performance analysis depend on returns. Given the adjusted closing prices, daily, monthly, and yearly returns are calculated.
#calculate daily returns
myStocks_returns_daily <- myStocks %>%
tq_transmute(select = adjusted,
mutate_fun = periodReturn,
period = "daily",
type = "log",
col_rename = "daily_returns",
cols = c(nested.col))
#calculate monthly returns
myStocks_returns_monthly <- myStocks %>%
tq_transmute(select = adjusted,
mutate_fun = periodReturn,
period = "monthly",
type = "arithmetic",
col_rename = "monthly_returns",
cols = c(nested.col))
#calculate yearly returns
myStocks_returns_annual <- myStocks %>%
group_by(symbol) %>%
tq_transmute(select = adjusted,
mutate_fun = periodReturn,
period = "yearly",
type = "arithmetic",
col_rename = "yearly_returns",
cols = c(nested.col))
A table which summarises monthly returns for each of the stocks and SPY with min, max, median, mean, SD.
monthlyReturnSummary <- myStocks_returns_monthly%>%
summarise(monthly_max = max(monthly_returns),
monthly_min = min(monthly_returns),
monthly_mean = mean(monthly_returns),
monthly_median = median(monthly_returns),
monthly_sd = sd(monthly_returns))
monthlyReturnSummary
## # A tibble: 6 x 6
## symbol monthly_max monthly_min monthly_mean monthly_median monthly_sd
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 AAPL 0.217 -0.181 0.0245 0.0257 0.0785
## 2 ANF 0.507 -0.421 0.00922 0.00287 0.145
## 3 DIS 0.234 -0.179 0.0155 0.00946 0.0682
## 4 DPZ 0.342 -0.188 0.0314 0.0253 0.0746
## 5 SPY 0.127 -0.125 0.0123 0.0174 0.0381
## 6 TSLA 0.811 -0.224 0.0523 0.0148 0.176
Density plots for each of the stocks by faceting
ggplot(myStocks_returns_monthly, aes(monthly_returns))+
geom_density()+
facet_wrap(~symbol)+
labs(x = "Distribution of Monthly Return",
y = "Probability % ",
title = "Probability Distribution of Monthly Returns")+
theme_bw()+
NULL

Tesla is the riskiest stock as it has the highest spread, which means its monthly standard deviations are the highest. The least risky is SPY as it has the narrowest spread.
A plot that shows the expected monthly return (mean) of a stock on the Y axis and the risk (standard deviation) in the X-axis.
# ggrepel function helps repel overlapping text labels
library(ggrepel)
ggplot(monthlyReturnSummary, aes(x = monthly_sd, y = monthly_mean))+
geom_point(size = 4, aes(color = symbol))+
ggrepel::geom_text_repel(aes(label = symbol), size = 4)+
labs(x = "Standard Deviation of Monthly Returns",
y = "Average Monthly Returns",
title = "Risk vs Potential Return")+
theme_bw()+
NULL

ANF is much riskier than most, while delivering the lowest returns.