Dispersion Measures with Python

Last Update: January 14, 2021

Descriptive statistics summarizes a population or sample of quantitative or qualitative data through its frequency distribution, central tendency measures, dispersion measures, association measures and frequency distribution shape.

This topic is part of the Business Statistics with Python course. Feel free to take a look at the Course Curriculum.

This tutorial has an educational and informational purpose and doesn’t constitute any type of forecasting, business, trading or investment advice. All content, including code and data, is presented for personal educational use exclusively and with no guarantee of exactness or completeness. Past performance doesn’t guarantee future results. Please read the full Disclaimer.

1. Dispersion measures

Dispersion measures quantify the variability of a data population or sample. The main dispersion measures are standard deviation, variance and average deviation, also called mean absolute deviation.

  • Standard deviation is the square root of the average squared deviation of the population or sample data from its mean.

\sigma=\sqrt{\frac{1}{n}\sum_{t=1}^{n}(x_{t}-\mu)^2}\;\;or\;\;\bar{\sigma}=\sqrt{\frac{1}{\bar{n}-1}\sum_{t=1}^{\bar{n}}(x_{t}-\bar{\mu})^2}

Where \sigma = population standard deviation, x_{t} = data, \mu = population mean, n = population number of observations, \bar{\sigma} = corrected sample standard deviation, \bar{\mu} = sample mean, \bar{n} = sample number of observations.

  • Variance is the average squared deviation of the population or sample data from its mean.

\sigma^2=\frac{1}{n}\sum_{t=1}^{n}(x_{t}-\mu)^2\;\;or\;\;\bar{\sigma}^2=\frac{1}{\bar{n}-1}\sum_{t=1}^{\bar{n}}(x_{t}-\bar{\mu})^2

Where \sigma^2 = population variance, x_{t} = data, \mu = population mean, n = population number of observations, \bar{\sigma}^2 = corrected sample variance, \bar{\mu} = sample mean, \bar{n} = sample number of observations.

  • Mean absolute deviation or average deviation is the average absolute deviation of the population or sample data from its mean.

ad=\frac{1}{n}\sum_{t=1}^{n}\left | x_{t}-\mu \right |\;\;or\;\;\bar{ad}=\frac{1}{\bar{n}}\sum_{t=1}^{\bar{n}}\left | x_{t}-\bar{\mu} \right |

Where ad = population mean absolute deviation or average deviation, x_{t} = data, \mu = population mean, n = population number of observations, \bar{ad} = sample mean absolute deviation or average deviation, \bar{\mu} = sample mean, \bar{n} = sample number of observations.
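The three formulas above can be checked by hand on a small data set. The following is a minimal sketch, assuming a hypothetical toy array `x` that is not part of the tutorial data; it computes each measure directly from its definition and cross-checks the result against NumPy's `var` and `std`, where `ddof=0` corresponds to the population formula and `ddof=1` to the corrected sample formula.

```python
import numpy as np

# Hypothetical toy data for illustration only (not the SPY data used later)
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size
mu = x.mean()

# Population variance and standard deviation: divide by n
pop_var = np.sum((x - mu) ** 2) / n
pop_std = np.sqrt(pop_var)

# Corrected sample variance and standard deviation: divide by n - 1
sample_var = np.sum((x - mu) ** 2) / (n - 1)
sample_std = np.sqrt(sample_var)

# Mean absolute deviation: divide by n for both population and sample
mean_abs_dev = np.sum(np.abs(x - mu)) / n

# Cross-check against NumPy built-ins
assert np.isclose(pop_var, x.var(ddof=0))
assert np.isclose(pop_std, x.std(ddof=0))
assert np.isclose(sample_var, x.var(ddof=1))
assert np.isclose(sample_std, x.std(ddof=1))

print('Population variance:', pop_var)
print('Population standard deviation:', pop_std)
print('Corrected sample variance:', round(sample_var, 6))
print('Corrected sample standard deviation:', round(sample_std, 6))
print('Mean absolute deviation:', mean_abs_dev)
```

For this toy array the mean is 5, so the squared deviations sum to 32 and the absolute deviations sum to 12. The only difference between the population and corrected sample results is the divisor, n versus n - 1.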

2. Python code example.

2.1. Import Python packages [1].

import numpy as np
import pandas as pd

2.2. Data reading and daily arithmetic returns calculation.

• Data: S&P 500® index replicating ETF (ticker symbol: SPY) daily adjusted close prices (2007-2015).

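# Read SPY daily adjusted close prices, using the Date column as a DatetimeIndex
spy = pd.read_csv('Data//Dispersion-Measures-Data.txt', index_col='Date', parse_dates=True)
# Daily arithmetic returns: r_t = p_t / p_{t-1} - 1 (the first row is NaN)
rspy = spy.pct_change(1)
rspy.columns = ['rspy']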
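To make the return calculation concrete, here is a small sketch on a hypothetical four-row price table (the column name `price` and its values are assumptions for illustration, not the tutorial's SPY data). It shows that `pct_change(1)` matches the arithmetic return definition p_t / p_{t-1} - 1, that the first observation is NaN, and that pandas skips that NaN when computing the standard deviation.

```python
import numpy as np
import pandas as pd

# Hypothetical prices for illustration only
prices = pd.DataFrame({'price': [100.0, 102.0, 99.96, 104.958]})

# Arithmetic returns with pandas, as in the tutorial code
returns = prices.pct_change(1)
returns.columns = ['ret']

# The same returns from the definition r_t = p_t / p_{t-1} - 1
manual = prices['price'] / prices['price'].shift(1) - 1

# The first return is undefined because there is no previous price
first_is_nan = bool(np.isnan(returns['ret'].iloc[0]))

# pandas ignores the leading NaN, so only three returns enter the calculation
std_pandas = returns['ret'].std(ddof=1)
std_numpy = np.std(returns['ret'].dropna().to_numpy(), ddof=1)

print(returns)
print('First return is NaN:', first_is_nan)
print('Standard deviations agree:', bool(np.isclose(std_pandas, std_numpy)))
```

Because the leading NaN is skipped, the number of observations in the sample formulas is the number of returns, which is one less than the number of prices.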

2.3. Dispersion measures calculation and output.

In:
print('== Dispersion Measures ==')
print('')
print('Corrected Sample Standard Deviation:', np.round(rspy.std(ddof=1), 6))
print('Corrected Sample Variance:', np.round(rspy.var(ddof=1), 6))
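# DataFrame.mad() was removed in pandas 2.0, so compute the mean absolute deviation directly
print('Sample Mean Absolute Deviation:', np.round((rspy - rspy.mean()).abs().mean(), 6))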
Out:
== Dispersion Measures ==

Corrected Sample Standard Deviation: rspy    0.01357
dtype: float64
Corrected Sample Variance: rspy    0.000184
dtype: float64
Sample Mean Absolute Deviation: rspy    0.008667
dtype: float64

3. References.

[1] Travis E. Oliphant. “A Guide to NumPy”. USA: Trelgol Publishing. 2006.

Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux. “The NumPy Array: A Structure for Efficient Numerical Computation”. Computing in Science & Engineering. 2011.

Wes McKinney. “Data Structures for Statistical Computing in Python”. Proceedings of the 9th Python in Science Conference. 2010.