Breusch-Pagan Test with Python

Last Update: December 15, 2020

Multiple regression assumptions include correct specification of the independent variables, no linear dependence among the independent variables, correct functional form of the regression, no residual autocorrelation, residual homoscedasticity and residual normality.

This topic is part of the Multiple Regression Analysis with Python course. Feel free to take a look at the Course Curriculum.

This tutorial has an educational and informational purpose and doesn’t constitute any type of business, forecasting, trading or investment advice. All content, including code and data, is presented for personal educational use exclusively and with no guarantee of exactness or completeness. Past performance doesn’t guarantee future results. Please read the full Disclaimer.

Residual homoscedasticity means that the regression residuals, or forecasting errors, have a constant variance.

This is evaluated through the Breusch-Pagan heteroscedasticity test [1], which regresses the squared residuals of the original regression on the original independent variables and assesses whether those independent variables are jointly statistically significant.

1. Formula notation.

1.1. Breusch-Pagan test formula notation.

Note: the number of independent or explanatory variables in the original regression is not fixed; two are shown here for educational purposes only.

$\hat{\varepsilon}_{t}^2=\alpha+\beta_{1}x_{1,t}+\beta_{2}x_{2,t}+e_{t}$

Where $\hat{\varepsilon}_{t}^2$ = squared estimated residuals or forecasting errors of the original regression, $\alpha$ = regression constant or intercept, $\beta_{1},\beta_{2}$ = regression coefficients, $x_{1,t},x_{2,t}$ = independent or explanatory variables data of the original regression, $e_{t}$ = residuals or forecasting errors of this auxiliary regression.
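The auxiliary regression above can be sketched by hand with NumPy. This is a minimal illustration under stated assumptions, not the tutorial's dataset: the data are synthetic, the variable names (`x1`, `x2`, `u2`, `lm_stat`) are hypothetical, and the statistic is computed in its $n \cdot R^2$ form, which follows a chi-square distribution with two degrees of freedom when there are two regressors.

```python
import numpy as np

# Synthetic data (an assumption for illustration, not the tutorial's dataset).
rng = np.random.default_rng(42)
n = 400
x1 = rng.uniform(1.0, 5.0, n)
x2 = rng.normal(0.0, 1.0, n)
# The error standard deviation grows with x1, so the errors are
# heteroscedastic by construction.
y = 1.0 + 0.5 * x1 - 0.3 * x2 + x1 * rng.normal(0.0, 1.0, n)

# Original regression: y on a constant, x1 and x2.
X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat

# Auxiliary regression: squared residuals on the same regressors.
u2 = resid ** 2
gamma_hat, *_ = np.linalg.lstsq(X, u2, rcond=None)
fitted = X @ gamma_hat
r_squared = 1.0 - np.sum((u2 - fitted) ** 2) / np.sum((u2 - u2.mean()) ** 2)

# LM statistic in its n * R^2 form.
lm_stat = n * r_squared
# With 2 degrees of freedom the chi-square upper-tail probability is exp(-x / 2).
lm_pvalue = np.exp(-lm_stat / 2.0)

print('LM statistic:', np.round(lm_stat, 6))
print('LM p-value:', np.round(lm_pvalue, 6))
```

Because the error variance was built to depend on `x1`, the auxiliary regression should explain a noticeable share of the squared residuals and the p-value should be small.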

1.2. Breusch-Pagan test.

The decision rule uses the $p$-value of the Breusch-Pagan Lagrange multiplier statistic, where $\alpha$ below denotes the chosen level of statistical significance and not the regression constant of section 1.1:

If $p\text{-value}<\alpha$, the null hypothesis of homoscedastic residuals is rejected, so the residuals are considered heteroscedastic at the $(1-\alpha)$ level of statistical confidence.
If $p\text{-value}>\alpha$, the null hypothesis of homoscedastic residuals is not rejected, so the test finds no evidence of heteroscedasticity at that level. This does not prove that the residuals are homoscedastic.
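The decision rule can be sketched in a few lines of standard-library Python. The helper names `chi2_sf_even_df` and `bp_decision` are hypothetical, and the 5% significance level is an assumption. The p-value uses the closed-form chi-square upper tail that holds for an even number of degrees of freedom, applied to the LM statistic reported in section 2.3 with eight degrees of freedom (eight regressors excluding the constant).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, valid only for positive even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError('this closed form requires a positive even df')
    half = x / 2.0
    # P(X > x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


def bp_decision(p_value, significance=0.05):
    """Map a Breusch-Pagan p-value to a test decision."""
    if p_value < significance:
        return 'reject homoscedasticity'
    return 'fail to reject homoscedasticity'


lm_stat = 49.943053  # LM statistic reported in section 2.3
df = 8               # regressors in the auxiliary regression, excluding the constant
p_value = chi2_sf_even_df(lm_stat, df)

print('p-value:', p_value)
print('decision at 5% significance:', bp_decision(p_value))
```

The resulting p-value is far below one millionth, which is why it displays as 0.0 once rounded to six decimals.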

2. Python code example.

2.1. Import Python packages [2].

import numpy as np
import pandas as pd
import statsmodels.regression.linear_model as rg
import statsmodels.tools.tools as ct
import statsmodels.stats.diagnostic as dg

2.2. Breusch-Pagan test data.

Data: arithmetic monthly returns of the adjusted close prices of the S&P 500® index replicating ETF (ticker symbol: SPY); effective monthly yields of the 1 Year U.S. Treasury Bill, the 10 Years U.S. Treasury Note and the Merrill Lynch U.S. High Yield Corporate Bond Index; monthly inflations or deflations of the U.S. Consumer Price Index and the U.S. Producer Price Index; arithmetic monthly returns of West Texas Intermediate Oil prices; arithmetic monthly changes of the U.S. Industrial Production Index value and U.S. Personal Consumption Expenditures (1997-2016).

data = pd.read_csv('Data//Breusch-Pagan-Test-Data.txt', index_col='Date', parse_dates=True)

2.3. Breusch-Pagan test calculation and output.

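# Add a 'const' column of ones so the regression includes an intercept.
data = ct.add_constant(data, prepend=True)
ivar = ['const', 't1y', 't10y', 'hyield', 'cpi', 'ppi', 'oil', 'indpro', 'pce']
# hasconst expects a boolean value, not the bool type itself.
reg = rg.OLS(data['stocks'], data[ivar], hasconst=True).fit()
res = reg.resid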

In:
print('== Residuals Homoscedasticity Breusch-Pagan Test ==')
print('')
bp_lm, bp_lm_pvalue = dg.het_breuschpagan(res, exog_het=data[ivar])[:2]
print('Breusch-Pagan LM Test Statistic:', np.round(bp_lm, 6))
print('Breusch-Pagan LM Test P-Value:', np.round(bp_lm_pvalue, 6))

Out:
== Residuals Homoscedasticity Breusch-Pagan Test ==

Breusch-Pagan LM Test Statistic: 49.943053
Breusch-Pagan LM Test P-Value: 0.0
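
Since the tutorial's data file is not reproduced here, the same workflow can be tried on simulated data. The sketch below is an illustration only: the data are synthetic, the names `x1`, `x2` and `exog` are hypothetical, and the error variance is made to depend on `x1` on purpose. It also unpacks all four values returned by `het_breuschpagan` (LM statistic, LM p-value, F statistic, F p-value).

```python
import numpy as np
import statsmodels.regression.linear_model as rg
import statsmodels.tools.tools as ct
import statsmodels.stats.diagnostic as dg

# Synthetic data (an assumption for illustration, not the tutorial's dataset).
rng = np.random.default_rng(0)
n = 500
x1 = rng.uniform(1.0, 5.0, n)
x2 = rng.normal(0.0, 1.0, n)
# Error standard deviation proportional to x1: heteroscedastic by construction.
y = 1.0 + 0.5 * x1 - 0.3 * x2 + x1 * rng.normal(0.0, 1.0, n)

# Design matrix with a leading column of ones for the intercept.
exog = ct.add_constant(np.column_stack([x1, x2]))
fit = rg.OLS(y, exog).fit()

# The test returns the LM statistic, its p-value, and the F-test equivalents.
lm_stat, lm_pvalue, f_stat, f_pvalue = dg.het_breuschpagan(fit.resid, exog_het=exog)

print('LM statistic:', np.round(lm_stat, 6))
print('LM p-value:', np.round(lm_pvalue, 6))
print('F statistic:', np.round(f_stat, 6))
print('F p-value:', np.round(f_pvalue, 6))
```

Both p-values should be small for this simulated sample, so the null hypothesis of homoscedasticity would be rejected at conventional significance levels.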

3. References

[1] Breusch, T. S.; Pagan, A. R. “A Simple Test for Heteroskedasticity and Random Coefficient Variation”. Econometrica. 1979.

[2] Travis E. Oliphant. “A guide to NumPy”. USA: Trelgol Publishing. 2006.

Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux. “The NumPy Array: A Structure for Efficient Numerical Computation”. Computing in Science & Engineering. 2011.

Wes McKinney. “Data Structures for Statistical Computing in Python.” Proceedings of the 9th Python in Science Conference. 2010.

Seabold, Skipper, and Josef Perktold. “Statsmodels: Econometric and statistical modeling with python.” Proceedings of the 9th Python in Science Conference. 2010.
