Example: Grubbs' Method for Detecting Outliers
Grubbs' Test Statistic
Calculate the Grubbs' test statistic, as used by the Grubbs function, to detect outliers. Compare the Grubbs' test statistic with the test statistics of the outliers.
1. Define a data set describing a heat flow experiment and plot it.
2. Define the critical value of the Student's t-distribution with N - 2 degrees of freedom and a significance level of alpha/(2N).
The function qt calculates the inverse cumulative distribution function (the quantile function) of the Student's t-distribution.
3. Define the Grubbs' test statistic as a function of alpha.
4. Define the level of significance for a confidence level of 90%.
5. Call the Grubbs function to detect outliers.
The Grubbs function can accept a matrix as an input, in which case it returns nested pairs of indices for the array locations of the outliers.
6. Compare the Grubbs' test statistic with the test statistics of the outliers.
The two outliers have a test statistic greater than the Grubbs' test statistic. Returning more than one index does not mean that every candidate is an outlier, because both the critical value and the test statistics depend on N and therefore change when a candidate is removed.
Because the Grubbs' test assumes that the data are normally distributed, it is worth checking that your data follow a normal distribution before proceeding. For example, you can use a visual test such as the normal probability plot.
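For readers working outside the worksheet, the steps above can be sketched in Python. This is a minimal illustration, not the implementation of the built-in Grubbs function: the readings in `heat_flow` are hypothetical (not the data set from this example), and the helper names `t_pdf`, `t_cdf`, `t_quantile`, `grubbs_critical`, and `grubbs_candidates` are introduced here for illustration only. The sketch assumes the usual two-sided form of the critical value, G = ((N - 1)/sqrt(N)) * sqrt(t^2 / (N - 2 + t^2)), where t is the Student's t quantile with N - 2 degrees of freedom at significance alpha/(2N). The quantile is found by numerical integration and bisection, standing in for qt.

```python
import math
import statistics


def t_pdf(x, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_cdf(x, nu, steps=200):
    """CDF for x >= 0: 0.5 plus a composite Simpson integral of the density on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3


def t_quantile(p, nu):
    """Inverse CDF for 0.5 < p < 1 by bisection; plays the role of qt here."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < p:  # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def grubbs_critical(alpha, n):
    """Two-sided Grubbs' critical value for n points at significance alpha."""
    t = t_quantile(1 - alpha / (2 * n), n - 2)
    return (n - 1) / math.sqrt(n) * math.sqrt(t * t / (n - 2 + t * t))


def grubbs_candidates(data, alpha):
    """Zero-based indices whose statistic |x - mean| / s exceeds the critical value."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    g_crit = grubbs_critical(alpha, len(data))
    return [i for i, x in enumerate(data) if abs(x - mean) / s > g_crit]


# Hypothetical readings, NOT the data set used in this example.
heat_flow = [20.1, 20.6, 19.5, 20.0, 20.4, 19.7, 20.2, 19.9, 21.6, 20.3]
candidates = grubbs_candidates(heat_flow, alpha=0.10)  # 90% confidence level
print(candidates)
```

Because every candidate is compared against a critical value computed for the full data set, a returned index is only a candidate, which matches the caution above about N changing when a point is removed.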
GrubbsClassic
Use the GrubbsClassic function to find the point that is most likely to be an outlier in a data set.
1. Calculate the greatest test statistic, Gmax, for the above data set.
2. Define alpha for a confidence level of 98%.
3. Compare the Grubbs' test statistic with Gmax.
No outliers are detected at this significance level.
4. Call the GrubbsClassic function.
The point returned by GrubbsClassic is not an outlier at this significance level, but it is the data point that is most likely to be an outlier.
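The idea of locating the most extreme point can be sketched in Python as follows. This is an illustration under stated assumptions rather than the GrubbsClassic implementation: the function name `most_extreme_point` and the readings in `heat_flow` are hypothetical, and the statistic is assumed to be |x - mean| / s with s the sample standard deviation.

```python
import statistics


def most_extreme_point(data):
    """Return (index, value, Gmax) for the point farthest from the mean in units of s."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    scores = [abs(x - mean) / s for x in data]
    idx = max(range(len(data)), key=scores.__getitem__)  # zero-based index of Gmax
    return idx, data[idx], scores[idx]


# Hypothetical readings, NOT the data set used in this example.
heat_flow = [20.1, 20.6, 19.5, 20.0, 20.4, 19.7, 20.2, 19.9, 21.6, 20.3]
idx, value, g_max = most_extreme_point(heat_flow)
print(idx, value, round(g_max, 3))
```

The sketch always returns a point, whether or not Gmax exceeds the critical value, which is why the returned point is only the most likely candidate and not necessarily an outlier.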
The Limiting Probability of Detecting Outliers
Use the special construct root to calculate the limiting probability at which outliers are detected.
Outliers are detected when alpha is greater than α_limit or, in other words, when the confidence level is lower than (1 - α_limit):
This is consistent with the above findings: no outliers were detected at a 98% confidence level, but two outliers were detected at a 90% confidence level.
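The root-finding step can be approximated in Python with a simple bisection standing in for root. This is a sketch only: the data are hypothetical (so the numbers differ from this example), the helper names are introduced for illustration, and the bracket [0.001, 1] for alpha is an assumption that happens to contain the root for these readings. The critical value is assumed to have the two-sided form G = ((N - 1)/sqrt(N)) * sqrt(t^2 / (N - 2 + t^2)), which decreases as alpha increases.

```python
import math
import statistics


def t_pdf(x, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_cdf(x, nu, steps=200):
    """CDF for x >= 0: 0.5 plus a composite Simpson integral of the density on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3


def t_quantile(p, nu):
    """Inverse CDF for 0.5 < p < 1 by bisection; plays the role of qt here."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, nu) < p:  # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def grubbs_critical(alpha, n):
    """Two-sided Grubbs' critical value for n points at significance alpha."""
    t = t_quantile(1 - alpha / (2 * n), n - 2)
    return (n - 1) / math.sqrt(n) * math.sqrt(t * t / (n - 2 + t * t))


def limiting_alpha(data, lo=0.001, hi=1.0, iterations=30):
    """Bisect for the alpha at which the critical value equals Gmax.

    Assumes the root lies inside [lo, hi] (a hypothetical bracket).
    """
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    g_max = max(abs(x - mean) / s for x in data)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if grubbs_critical(mid, n) > g_max:
            lo = mid  # critical value still above Gmax: nothing detected yet
        else:
            hi = mid  # Gmax exceeds the critical value: an outlier is detected
    return (lo + hi) / 2


# Hypothetical readings, NOT the data set used in this example.
heat_flow = [20.1, 20.6, 19.5, 20.0, 20.4, 19.7, 20.2, 19.9, 21.6, 20.3]
alpha_limit = limiting_alpha(heat_flow)
print(alpha_limit)
```

For these hypothetical readings the limiting value falls between 0.02 and 0.10, so the most extreme point is flagged at a 90% confidence level but not at 98%.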