Outlier Detection and Removal
The Grubbs, GrubbsClassic and ThreeSigma functions detect outliers in data sets. The trim function removes rows with specified indices from a data set.
Grubbs(v, a)—Returns the index of each suspected outlier, the test statistic for that outlier, and the distance of that statistic from the critical statistic, for the probability a that data randomly takes a given value.
GrubbsClassic(v, a)—Returns the index of the data point that is most likely to be an outlier, its test statistic, and the distance of that statistic from the critical statistic, for the probability a that data randomly takes a given value.
ThreeSigma(v)—Returns the indices of the points in v that have a test statistic greater than three, together with the test statistic of each such point.
trim(v, vindex)—Trims out the entries (rows) specified by vindex.
The test statistic used to detect outliers is the distance of a point to the mean of the data set, divided by the standard deviation.
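As a rough illustration of this test statistic outside Mathcad, the following Python sketch computes it for every point and then applies a three-sigma rule. It is not the implementation of ThreeSigma. The names deviation_scores and three_sigma and the origin parameter are hypothetical. The use of the sample standard deviation is an assumption, since the text above does not say which standard deviation is meant.

```python
from statistics import mean, stdev


def deviation_scores(v):
    """Return |x - mean| / standard deviation for every point in v.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    Requires at least two data points.
    """
    m = mean(v)
    s = stdev(v)
    if s == 0:
        # All points are identical, so no point deviates from the mean.
        return [0.0 for _ in v]
    return [abs(x - m) / s for x in v]


def three_sigma(v, origin=0):
    """Return (index, statistic) pairs for points whose statistic exceeds 3.

    The hypothetical origin argument shifts the reported indices, mimicking
    an index base such as ORIGIN.
    """
    return [(i + origin, z)
            for i, z in enumerate(deviation_scores(v))
            if z > 3]


data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9,
        10.0, 10.2, 9.8, 10.1, 25.0, 10.0, 9.9, 10.1, 10.0, 9.9]
flagged = three_sigma(data)
```

With this data, only the point 25.0 at zero-based index 14 is flagged. With the sample standard deviation, the statistic of a single point can never exceed (n - 1)/sqrt(n), so a data set needs at least 11 points before any of them can pass a threshold of three.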
When a real matrix is used in place of a vector, the functions that detect outliers return the pair of indices for each outlier candidate as nested matrices.
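The matrix case can be sketched in Python as follows. This is only an illustration of reporting outlier candidates by a pair of indices. It assumes that the mean and the sample standard deviation are taken over all entries of the matrix, which the text above does not state. The name three_sigma_matrix and the origin parameter are hypothetical.

```python
from statistics import mean, stdev


def three_sigma_matrix(m, origin=0):
    """Return ((row, col), statistic) for entries whose statistic exceeds 3.

    Assumption: the mean and sample standard deviation are computed over
    every entry of the matrix m, given here as a list of rows.
    """
    flat = [x for row in m for x in row]
    mu = mean(flat)
    s = stdev(flat)
    if s == 0:
        # Identical entries: nothing can be flagged.
        return []
    result = []
    for r, row in enumerate(m):
        for c, x in enumerate(row):
            z = abs(x - mu) / s
            if z > 3:
                result.append(((r + origin, c + origin), z))
    return result


matrix = [
    [10.0, 10.0, 10.0, 10.0, 10.0],
    [10.0, 10.0, 10.0, 10.0, 10.0],
    [10.0, 10.0, 10.0, 100.0, 10.0],
    [10.0, 10.0, 10.0, 10.0, 10.0],
]
candidates = three_sigma_matrix(matrix)
```

Here the single candidate is reported by the index pair (2, 3) when counting from zero, or (3, 4) when origin is set to 1.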
Arguments
v is a real vector or matrix representing data points.
a is a probability, 0 < a < 1.
vindex is an integer-valued vector. The indices specified in vindex are relative to ORIGIN.
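The effect of the index base on row removal can be illustrated with a short Python sketch. It is not the trim function itself. The origin parameter stands in for ORIGIN and is an assumption of this example, as is the choice to raise an error for an out-of-range index.

```python
def trim(v, vindex, origin=0):
    """Return a copy of v without the rows whose indices appear in vindex.

    Indices in vindex are interpreted relative to origin (hypothetical
    stand-in for ORIGIN). v may be a list of values or a list of rows.
    """
    # Convert the indices to zero-based positions.
    drop = {i - origin for i in vindex}
    for pos in drop:
        if not 0 <= pos < len(v):
            raise IndexError(f"index {pos + origin} is out of range")
    return [row for pos, row in enumerate(v) if pos not in drop]


values = [5, 6, 7, 8]
kept_zero_based = trim(values, [1, 3])           # drops 6 and 8
kept_one_based = trim(values, [1, 3], origin=1)  # drops 5 and 7
```

The same vindex removes different rows depending on the index base, which is why the indices must be read relative to ORIGIN.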