Gradient Boosting Machine Regression with R

Last Update: September 22, 2020

Algorithm learning consists of training the algorithm on a training data subset to estimate its optimal parameters, and then testing it on a testing data subset using the previously optimized parameters. This corresponds to a supervised regression machine learning task.

This topic is part of the Machine Trading Analysis with R course.

This tutorial has an educational and informational purpose and doesn’t constitute any type of trading or investment advice. All content, including code and data, is presented for personal educational use exclusively and with no guarantee of exactness or completeness. Past performance doesn’t guarantee future results. Please read the full Disclaimer.

An example of a supervised learning meta-algorithm is the gradient boosting machine [1], which predicts the output target feature by boosting optimally weighted, sequentially built decision trees. Boosting is used to lower the squared bias error of the sequentially built decision trees, while regularization such as shrinkage and subsampling helps keep their variance error under control.

1. Trees algorithm definition.

The classification and regression trees (CART) algorithm uses a greedy top-down approach to find recursive binary node splits. At each stage it chooses the split that locally minimizes the variance at the resulting terminal nodes, measured through the sum of squared errors function.

1.1. Trees algorithm formula notation.

min\left ( sse \right )=\sum_{t=1}^{n}\left ( y_{t}-\hat{y}_{t} \right )^{2}

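\hat{y}_{t}=\frac{1}{m}\sum_{j=1}^{m}y_{j}

Where y_{t} = output target feature data, \hat{y}_{t} = output target feature mean of the terminal node containing observation t, y_{j} = output target feature data of the observations in that terminal node, n = number of observations, m = number of observations in terminal node.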
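
The split search described above can be sketched in base R for a single predictor. This is only an illustrative sketch under simplifying assumptions: the helper names sse and best_split and the toy data are hypothetical, and a full CART implementation would apply the search recursively and across several predictors.

```r
# Sum of squared errors of a node around its own mean
sse <- function(y) sum((y - mean(y))^2)

# Hypothetical helper: exhaustive search for the best single binary split
best_split <- function(x, y) {
  xs <- sort(unique(x))
  # Candidate thresholds: midpoints between consecutive distinct values
  thresholds <- (head(xs, -1) + tail(xs, -1)) / 2
  # Total SSE of the two child nodes for each candidate threshold
  total <- sapply(thresholds, function(s) sse(y[x <= s]) + sse(y[x > s]))
  best <- which.min(total)
  s <- thresholds[best]
  list(threshold = s, sse = total[best],
       left_mean = mean(y[x <= s]), right_mean = mean(y[x > s]))
}

# Toy data with a jump between x = 3 and x = 4
x <- c(1, 2, 3, 4, 5, 6)
y <- c(1.0, 1.2, 0.8, 3.0, 3.2, 2.8)
split_fit <- best_split(x, y)
split_fit$threshold
split_fit$left_mean
split_fit$right_mean
```

On this toy data the search selects the threshold 3.5, and each terminal node predicts the mean of the observations it contains.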

2. Tree boosting algorithm.

The tree boosting algorithm predicts the output target feature as a weighted sum of sequentially built decision trees, where each new tree is fitted to the errors left by the trees built before it.

  • The gradient descent algorithm finds locally optimal weight coefficients for the sequentially built decision trees by locally minimizing a loss function such as the sum of squared errors, the sum of absolute errors or the Huber loss.
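
As a rough illustration of the gradient descent step, the sketch below computes the negative gradient of each loss function with respect to the current prediction, which is the quantity the next decision tree is fitted to. The function name neg_gradient, the Huber threshold delta and the example values are assumptions made for illustration, and the squared error loss is assumed to be scaled by one half.

```r
# Hypothetical helper: negative gradient of the loss with respect to prediction f
neg_gradient <- function(y, f, loss = c("squared", "absolute", "huber"), delta = 1) {
  loss <- match.arg(loss)
  r <- y - f  # current residuals
  switch(loss,
         squared = r,                 # from 0.5 * (y - f)^2
         absolute = sign(r),          # from abs(y - f)
         huber = ifelse(abs(r) <= delta, r, delta * sign(r)))  # quadratic near zero, linear beyond delta
}

y <- c(0.5, -2.0, 3.0)
f <- c(0.0, 0.0, 0.0)
g_squared <- neg_gradient(y, f, "squared")
g_absolute <- neg_gradient(y, f, "absolute")
g_huber <- neg_gradient(y, f, "huber", delta = 1)
g_squared   # 0.5 -2.0  3.0
g_absolute  # 1 -1  1
g_huber     # 0.5 -1.0  1.0
```

The absolute and Huber losses cap the influence of large residuals, which is why they are less sensitive to outliers than the squared error loss.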

2.1. Tree boosting algorithm formula notation.

min(sse)=\sum_{t=1}^{n}(y_{t}-\hat{y}_{k(t)})^2

\hat{y}_{k(t)}=\sum_{i=1}^{k}\gamma \omega_{i}\hat{y}_{i(t)}

Where y_{t} = output target feature data, \hat{y}_{k(t)} = weighted output target feature prediction of the sequentially built decision trees, \gamma = learning rate regularization coefficient, \omega_{i} = locally optimal weight coefficient of sequentially built decision tree i, \hat{y}_{i(t)} = output target feature prediction of sequentially built decision tree i, k = number of sequentially built decision trees.
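
The formula above can be sketched in base R with one-split decision trees (stumps) and the squared error loss. This is a minimal sketch, not the gbm package implementation. The helper names fit_stump, predict_stump and boost and the simulated data are hypothetical. Two assumptions are made: the weight coefficients \omega_{i} are taken as one, because under squared error the terminal node means are already the least squares optimum, and the prediction starts from the training mean, an initial constant that the formula does not show.

```r
# Hypothetical helper: fit a one-split regression tree by minimizing SSE
fit_stump <- function(x, y) {
  xs <- sort(unique(x))
  thresholds <- (head(xs, -1) + tail(xs, -1)) / 2
  total <- sapply(thresholds, function(s) {
    sum((y[x <= s] - mean(y[x <= s]))^2) + sum((y[x > s] - mean(y[x > s]))^2)
  })
  s <- thresholds[which.min(total)]
  list(threshold = s, left = mean(y[x <= s]), right = mean(y[x > s]))
}

predict_stump <- function(stump, x) {
  ifelse(x <= stump$threshold, stump$left, stump$right)
}

# Boosting loop: k trees, learning rate gamma, weights omega assumed equal to 1
boost <- function(x, y, k = 50, gamma = 0.1) {
  f <- rep(mean(y), length(y))        # initial constant prediction
  train_error <- numeric(k)
  stumps <- vector("list", k)
  for (i in seq_len(k)) {
    residuals <- y - f                # negative gradient of squared error loss
    stumps[[i]] <- fit_stump(x, residuals)
    f <- f + gamma * predict_stump(stumps[[i]], x)
    train_error[i] <- mean((y - f)^2) # training mean squared error after tree i
  }
  list(init = mean(y), stumps = stumps, gamma = gamma,
       fitted = f, train_error = train_error)
}

# Simulated data (assumption, unrelated to the SPY data used later)
set.seed(1)
x <- runif(200, -1, 1)
y <- sin(3 * x) + rnorm(200, sd = 0.1)
fit <- boost(x, y, k = 50, gamma = 0.1)
head(fit$train_error, 3)
tail(fit$train_error, 1)
```

The training error does not increase from one tree to the next, because each stump is fitted to the residuals left by the previous trees and only a fraction gamma of its prediction is added.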

3. R script code example.

3.1. Load R packages [2].

library('quantmod')
library('gbm')

3.2. Gradient boosting machine regression data reading, target and predictor features creation, training and testing ranges delimiting.

  • Data: S&P 500® index replicating ETF (ticker symbol: SPY) daily adjusted close prices (2007-2015).
  • Daily arithmetic returns are used as the target feature (current day) and the predictor feature (previous day).
  • The target and predictor features and the training and testing ranges are not fixed and are only included for educational purposes.
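# Read daily adjusted close prices (column 1: dates, column 2: prices)
data <- read.csv('Gradient-Boosting-Machine-Regression-Data.txt',header=TRUE)
spy <- xts(data[,2],order.by=as.Date(data[,1]))
# Daily arithmetic returns (target) and their one-day lag (predictor)
rspy <- dailyReturn(spy)
rspy1 <- lag(rspy,k=1)
rspyall <- cbind(rspy,rspy1)
colnames(rspyall) <- c('rspy','rspy1')
rspyall <- na.exclude(rspyall)
# Training range (up to 2014-01-01) and testing range (from 2014-01-01)
rspyt <- window(rspyall,end='2014-01-01')
rspyf <- window(rspyall,start='2014-01-01')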
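
To make the alignment of target and predictor features explicit, the same construction can be sketched with base R vectors. The price values and the two-row training range below are made up for illustration and are not the SPY data.

```r
# Made-up prices for five consecutive days (assumption)
prices <- c(100, 102, 101, 103, 106)

# Arithmetic return: r_t = p_t / p_(t-1) - 1
ret <- prices[-1] / prices[-length(prices)] - 1

# Target: current day return; predictor: previous day return
features <- data.frame(rspy = ret[-1], rspy1 = ret[-length(ret)])

# Chronological split: earlier rows for training, later rows for testing
n_train <- 2
train <- features[seq_len(n_train), ]
test <- features[-seq_len(n_train), ]
features
```

Each row pairs a return with the return of the day before it, so the predictor of one row equals the target of the previous row. Splitting by time rather than at random keeps testing data strictly after training data.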

3.3. Gradient boosting machine regression fitting and output.

  • Gradient boosting machine fitting within training range.
  • The gradient boosting machine probability distribution, number of sequentially built decision trees, shrinkage (learning rate regularization coefficient) and fraction of training range data randomly selected to propose the next decision tree in the expansion are not fixed and are only included for educational purposes.
  • Gradient boosting machine results might differ depending on the random number generation seed when the fraction of training range data randomly selected to propose the next decision tree in the expansion is less than one.
In:
gbmt <- gbm(rspy~rspy1,distribution='gaussian',data=rspyt,n.trees=2,shrinkage=0.1,bag.fraction=1)
gbmt$train.error
Out:
0.0002155838 0.0002152003
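
The fitting step above only reports the training error. The testing step can be sketched as follows, assuming the gbm package is installed. The simulated returns, the split at row 400, the number of trees and the comparison against the training mean are assumptions for illustration, not results from the SPY data.

```r
library(gbm)

# Simulated stand-in for the return series (assumption, not the SPY data)
set.seed(123)
n <- 600
r <- rnorm(n, mean = 0, sd = 0.01)
df <- data.frame(rspy = r[-1], rspy1 = r[-n])

# Chronological training and testing ranges
train <- df[1:400, ]
test <- df[401:nrow(df), ]

# Fit within the training range only
fit <- gbm(rspy ~ rspy1, distribution = "gaussian", data = train,
           n.trees = 50, shrinkage = 0.1, bag.fraction = 1)

# Predict within the testing range using the previously fitted trees
pred <- predict(fit, newdata = test, n.trees = 50)

# Compare testing mean squared error with a naive training mean forecast
test_mse <- mean((test$rspy - pred)^2)
naive_mse <- mean((test$rspy - mean(train$rspy))^2)
c(gbm = test_mse, naive = naive_mse)
```

Because the simulated returns are independent, the fitted model is not expected to beat the naive forecast here. The point of the sketch is only the mechanics of predicting within the testing range with parameters estimated within the training range.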

4. References.

[1] Jerome H. Friedman. “Greedy Function Approximation: A Gradient Boosting Machine”. The Annals of Statistics. 2001.

[2] Jeffrey A. Ryan and Joshua M. Ulrich. “quantmod: Quantitative Financial Modelling Framework”. R package version 0.4-15. 2019.

Brandon Greenwell, Bradley Boehmke, Jay Cunningham and GBM Developers. “gbm: Generalized Boosted Regression Models”. R package version 2.1.8. 2020.