API

resp: the RobustLinResp structure.
pred: the predictor structure, of type DensePredChol, SparsePredChol, DensePredCG, SparsePredCG or RidgePred.
formula: either a FormulaTerm object or nothing
fitdispersion: if true, the dispersion is estimated; otherwise it is kept fixed
fitted: if true, the model was already fitted

source

RobustModels.RobustLinResp — Type

RobustLinResp

Robust linear response structure.

Solve the following minimization problem:

\[\min \sum_i \rho\left(\dfrac{r_i}{\hat{\sigma}}\right)\]

Fields

est: estimator used for the model
y: response vector
μ: mean response vector
offset: offset added to Xβ to form μ. Can be of length 0
wts: prior case weights. Can be of length 0.
σ: current estimate of the scale or dispersion
devresid: the deviance residuals
wrkwt: working case weights for the Iteratively Reweighted Least Squares (IRLS) algorithm
wrkres: working residuals for IRLS
wrkscaledres: scaled residuals for IRLS
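
As an illustration of the objective above, here is a minimal sketch that evaluates Σ ρ(rᵢ/σ̂) for a Huber-type ρ with unit tuning constant. The names `huber_rho` and `robust_objective` are hypothetical helpers, not part of RobustModels, and the scaling of ρ is an assumption.

```julia
# Hypothetical helpers for illustration only (not the RobustModels implementation).
# Huber ρ with unit tuning constant: quadratic for |r| ≤ 1, linear beyond.
huber_rho(r) = abs(r) <= 1 ? r^2 / 2 : abs(r) - 1 / 2

# Objective Σ ρ(rᵢ / σ) with residuals rᵢ = yᵢ - μᵢ and a fixed scale σ.
robust_objective(y, μ, σ) = sum(huber_rho((yi - μi) / σ) for (yi, μi) in zip(y, μ))

y = [1.0, 2.0, 3.0, 10.0]   # response, last point is an outlier
μ = [1.0, 2.0, 3.0, 4.0]    # mean response
obj = robust_objective(y, μ, 1.0)   # only the outlier contributes: 6 - 1/2
```

The outlier contributes linearly (5.5) rather than quadratically (18 for a least-squares loss r²/2), which is what limits its influence on the fit.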

source

GLM.LinPred — Type

LinPred

Abstract type representing a linear predictor

source

RobustModels.DensePredCG — Type

DensePredCG

A LinPred type with Conjugate Gradient and a dense X

Members

X: Model matrix of size n × p with n ≥ p. Should be full column rank.
beta0: base coefficient vector of length p
delbeta: increment to coefficient vector, also of length p
scratchbeta: scratch vector of length p, used in GLM.linpred! method
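
To make the role of conjugate gradient concrete, the following sketch solves the weighted normal equations X'WX β = X'Wy with a textbook CG loop. The function `cg_solve` is a hypothetical name, and this is not necessarily how DensePredCG organizes the computation internally.

```julia
using LinearAlgebra

# Textbook conjugate gradient for a symmetric positive definite system A x = b.
function cg_solve(A, b; tol=1e-10, maxiter=10 * length(b))
    x = zeros(eltype(b), length(b))
    r = copy(b)              # residual b - A*x for the start x = 0
    p = copy(r)              # first search direction
    rs = dot(r, r)
    for _ in 1:maxiter
        sqrt(rs) <= tol && break
        Ap = A * p
        α = rs / dot(p, Ap)  # step length along p
        x .+= α .* p
        r .-= α .* Ap
        rs_new = dot(r, r)
        p .= r .+ (rs_new / rs) .* p   # next conjugate direction
        rs = rs_new
    end
    return x
end

X = [1.0 0.0; 1.0 1.0; 1.0 2.0; 1.0 3.0]   # full column rank, n ≥ p
y = [1.0, 3.0, 5.0, 7.0]                   # exactly 1 + 2x
w = [1.0, 1.0, 0.5, 0.5]                   # row weights
W = Diagonal(w)
β = cg_solve(X' * W * X, X' * W * y)
```

CG only needs matrix-vector products, which is why the same approach carries over to a sparse X.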

source

RobustModels.SparsePredCG — Type

SparsePredCG

A LinPred type with Conjugate Gradient and a sparse X

Members

X: Model matrix of size n × p with n ≥ p. Should be full column rank.
beta0: base coefficient vector of length p
delbeta: increment to coefficient vector, also of length p
scratchbeta: scratch vector of length p, used in GLM.linpred! method

source

GLM.DensePredChol — Type

DensePredChol{T}

A LinPred type with a dense Cholesky factorization of X'X

Members

X: model matrix of size n × p with n ≥ p. Should be full column rank.
beta0: base coefficient vector of length p
delbeta: increment to coefficient vector, also of length p
scratchbeta: scratch vector of length p, used in linpred! method
chol: a Cholesky object created from X'X, possibly using row weights.
scratchm1: scratch Matrix{T} of the same size as X
scratchm2: scratch Matrix{T} of the same size as X'X
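
A minimal sketch of what the chol member represents, assuming row weights w: the Cholesky factorization of X'WX is computed once and reused to solve for the coefficients. Variable names are illustrative and this is not the GLM.jl implementation.

```julia
using LinearAlgebra

X = [1.0 0.0; 1.0 1.0; 1.0 2.0; 1.0 3.0]
y = [1.0, 3.0, 5.0, 7.0]
w = [1.0, 1.0, 0.5, 0.5]                    # row weights

# Factorize X'WX (symmetric positive definite when X has full column rank).
F = cholesky(Symmetric(X' * Diagonal(w) * X))

# Solve X'WX β = X'Wy by forward and back substitution.
β = F \ (X' * (w .* y))
```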

source

GLM.SparsePredChol — Type

SparsePredChol{T,M<:SparseMatrixCSC,C} where {T,C}

A LinPred type with a sparse Cholesky factorization of X'X. No pivot option.

Members

X: model matrix of size n × p with n ≥ p. Should be full column rank.
Xt: transpose of the model matrix.
beta0: base coefficient vector of length p
delbeta: increment to coefficient vector, also of length p
scratchbeta: scratch vector of length p, used in GLM.linpred! method
chol: a sparse Cholesky object created from X'X, possibly using row weights.
scratch: scratch SparseMatrixCSC{T} of the same size as X

source

GLM.DensePredQR — Type

DensePredQR

A LinPred type with a dense, unpivoted QR decomposition of X

Members

X: Model matrix of size n × p with n ≥ p. Should be full column rank.
beta0: base coefficient vector of length p
delbeta: increment to coefficient vector, also of length p
scratchbeta: scratch vector of length p, used in linpred! method
qr: a QRCompactWY object created from X, with optional row weights.

source

RobustModels.RidgePred — Type

RidgePred

Regularized predictor using ridge regression on the p features.

Members

X: model matrix
λ: shrinkage parameter of the regularizer
G: regularizer matrix of size p×p.
βprior: regularizer prior of the coefficient values. Default to zeros(p).
pred: the non-regularized predictor using an extended model matrix.
pivot: for DensePredChol, if the decomposition was pivoted.
scratchbeta: scratch vector of length p, used in GLM.linpred! method
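
The "extended model matrix" idea can be sketched as follows. The assumption here is a penalty of the form λ (β - βprior)' G (β - βprior) added to the residual sum of squares; the exact convention used by RidgePred (scaling factors, treatment of the intercept) may differ.

```julia
using LinearAlgebra

X = [1.0 0.0; 1.0 1.0; 1.0 2.0; 1.0 3.0]
y = [1.0, 3.0, 5.0, 7.0]
p = size(X, 2)
λ = 2.0
G = Matrix{Float64}(I, p, p)     # assumption: identity regularizer on all coefficients
βprior = zeros(p)

# Factor G = R'R so that the penalty becomes λ‖R(β - βprior)‖².
R = Matrix(cholesky(Symmetric(G)).U)

# Stack p pseudo-observations under the data: ordinary least squares on the
# extended system is equivalent to the ridge problem.
Xext = vcat(X, sqrt(λ) * R)
yext = vcat(y, sqrt(λ) * (R * βprior))
β_ext = Xext \ yext

# Direct ridge solution for comparison.
β_direct = (X' * X + λ * G) \ (X' * y + λ * (G * βprior))
```

Because the regularized problem is rewritten as a plain least-squares problem, any non-regularized predictor can be reused to solve it.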

source

RobustModels.AbstractRegularizedPred — Type

Abstract type for predictor with regularization

source

RobustModels.QuantileRegression — Type

QuantileRegression

Quantile regression representation

Fields

τ: the quantile value
X: the model matrix
β: the coefficients
y: the response vector
wts: the weights
wrkres: the working residuals
formula: either a FormulaTerm object or nothing
fitdispersion: if true, the dispersion is estimated; otherwise it is kept fixed
fitted: if true, the model was already fitted

source

Constructors for models

StatsAPI.fit — Method

fit(
    ::Type{M},
    X::AbstractMatrix{T},
    y::AbstractVector{T},
    est::AbstractMEstimator;
    method::Symbol       = :auto,  # :chol, :qr, :cg
    dofit::Bool          = true,
    wts::FPVector        = similar(y, 0),
    offset::FPVector     = similar(y, 0),
    fitdispersion::Bool  = false,
    ridgeλ::Real         = 0,
    ridgeG::Union{UniformScaling, AbstractArray} = I,
    βprior::AbstractVector = [],
    quantile::Union{Nothing, AbstractFloat} = nothing,
    dropcollinear::Bool = false,
    initial_scale::Union{Symbol, Real}=:mad,
    σ0::Union{Nothing, Symbol, Real}=initial_scale,
    initial_coef::AbstractVector=[],
    β0::AbstractVector=initial_coef,
    correct_leverage::Bool=false,
    fitargs...,
) where {M<:RobustLinearModel, T<:AbstractFloat}

Create a robust model with the model matrix (or formula) X and response vector (or dataframe) y, using a robust estimator.

Arguments

X: the model matrix (it can be dense or sparse) or a formula
y: the response vector or a table (dataframe, namedtuple, ...).
est: a robust estimator

Keywords

method::Symbol = :auto: the method to use for solving the weighted linear system, :chol (default), :qr or :cg. Use :auto to select the default method;
dofit::Bool = true: if false, return the model object without fitting;
dropmissing::Bool = false: if true, drop the rows with missing values (and convert to Non-Missing type). With dropmissing=true the number of observations may be smaller than the size of the input arrays;
wts::Vector = similar(y, 0): Prior probability weights of observations. Can be empty (length 0) if no weights are used (default);
offset::Vector = similar(y, 0): an offset vector, should be empty if no offset is used;
fitdispersion::Bool = false: reevaluate the dispersion;
ridgeλ::Real = 0: if positive, perform a robust ridge regression with shrinkage parameter ridgeλ. RidgePred object will be used;
ridgeG::Union{UniformScaling, AbstractArray} = I: define a custom regularization matrix. Defaults to the identity matrix (with 0 for the intercept);
βprior::AbstractVector = []: define a custom prior for the coefficients for ridge regression. Default to zeros(p);
quantile::Union{Nothing, AbstractFloat} = nothing: only for GeneralizedQuantileEstimator, define the quantile to estimate;
dropcollinear::Bool=false: controls whether or not a model matrix less-than-full rank is accepted;
contrasts::AbstractDict{Symbol,Any} = Dict{Symbol,Any}(): a Dict mapping term names (as Symbols) to term types (e.g. ContinuousTerm) or contrasts (e.g., HelmertCoding(), SeqDiffCoding(; levels=["a", "b", "c"]), etc.). If contrasts are not provided for a variable, the appropriate term type will be guessed based on the data type from the data column: any numeric data is assumed to be continuous, and any non-numeric data is assumed to be categorical (with DummyCoding() as the default contrast type);
initial_scale::Union{Symbol, Real}=:mad: the initial scale estimate; for non-convex estimators it helps to find the global minimum. It is computed automatically when given as :mad, :L1 or :extrema (non-robust);
σ0::Union{Nothing, Symbol, Real}=initial_scale: alias of initial_scale;
initial_coef::AbstractVector=[]: the initial coefficients estimate, for non-convex estimator it helps to find the global minimum.
β0::AbstractVector=initial_coef: alias of initial_coef;
correct_leverage::Bool=false: apply the leverage correction weights with leverage_weights.
fitargs...: other keyword arguments used to control the convergence of the IRLS algorithm (see RobustModels.pirls!).

Output

the RobustLinearModel object.
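
A short usage sketch with a model matrix and a response vector, assuming the RobustModels package is installed. Only names documented on this page are used; the data are made up for illustration and no particular output is implied.

```julia
using RobustModels

# Synthetic data: a line with a small perturbation and one gross outlier.
x = collect(0.0:9.0)
X = hcat(ones(10), x)                  # model matrix with an intercept column
y = 1 .+ 2 .* x .+ 0.1 .* sin.(x)
y[10] += 50

# Fit an M-estimator with the Huber loss; rlm is the alias for fit(RobustLinearModel, ...).
m = rlm(X, y, MEstimator{HuberLoss}(); initial_scale=:mad)

β = RobustModels.coef(m)               # estimated coefficients
w = RobustModels.workingweights(m)     # robust weights, lower for suspected outliers
```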

source

StatsAPI.fit — Method

fit(::Type{M}, X::Union{AbstractMatrix{T},SparseMatrixCSC{T}},
    y::AbstractVector{T}; quantile::AbstractFloat=0.5,
    dofit::Bool          = true,
    wts::FPVector        = similar(y, 0),
    fitdispersion::Bool  = false,
    fitargs...) where {M<:QuantileRegression, T<:AbstractFloat}

Fit a quantile regression model with the model matrix (or formula) X and response vector (or dataframe) y.

It is solved using the exact interior method.

Arguments

X: the model matrix (it can be dense or sparse) or a formula
y: the response vector or a dataframe.

Keywords

quantile::AbstractFloat=0.5: the quantile value for the regression, between 0 and 1.
dofit::Bool = true: if false, return the model object without fitting;
wts::Vector = similar(y, 0): a weight vector, should be empty if no weights are used;
fitdispersion::Bool = false: reevaluate the dispersion;
fitargs...: other keyword arguments like verbose to print iteration details.

Output

the QuantileRegression object.
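
For intuition about what quantile regression minimizes, the sketch below uses the standard check ("pinball") loss ρ_τ(r) = r (τ - 1[r < 0]) on an intercept-only problem. The helper names are hypothetical and the brute-force search stands in for the exact solver; it is not the algorithm used by the package.

```julia
# Check loss: weight τ on positive residuals, 1 - τ on negative ones.
check_loss(τ, r) = r * (τ - (r < 0))

y = [1.0, 2.0, 3.0, 4.0, 100.0]

# Total loss for a constant prediction q at quantile level τ.
total_loss(q, τ) = sum(check_loss(τ, yi - q) for yi in y)

# For an intercept-only model a minimizer can be found among the data points.
best = y[argmin([total_loss(q, 0.5) for q in y])]
```

With τ = 0.5 the minimizer is the sample median (3.0 here), which the outlier at 100 does not move.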

source

RobustModels.rlm — Function

rlm(X, y, args...; kwargs...)

An alias for fit(RobustLinearModel, X, y, est; kwargs...).

The arguments X and y can be a Matrix and a Vector or a Formula and a DataFrame.

source

RobustModels.quantreg — Function

quantreg(X, y, args...; kwargs...)

An alias for fit(QuantileRegression, X, y; kwargs...).

The arguments X and y can be a Matrix and a Vector or a Formula and a DataFrame.

source

StatsAPI.fit! — Function

Fit a statistical model in-place.

source

RobustModels.refit! — Function

refit!(m::RobustLinearModel, [y::FPVector];
                             wts::Union{Nothing, FPVector} = nothing,
                             offset::Union{Nothing, FPVector} = nothing,
                             quantile::Union{Nothing, AbstractFloat} = nothing,
                             ridgeλ::Union{Nothing, Real} = nothing,
                             kwargs...)

Refit the RobustLinearModel.

This function assumes that m was correctly initialized and the model is refitted with the new values for the response, weights, offset, quantile and ridge shrinkage.

Defining a new quantile is only possible for GeneralizedQuantileEstimator.

Defining a new ridgeλ is only possible for RidgePred objects.

source

refit!(m::QuantileRegression,
      [y::FPVector ;
       verbose::Bool=false,
       quantile::Union{Nothing,
       AbstractFloat}=nothing,
      ]
)

Refit the QuantileRegression model with the new values for the response, weights and quantile. This function assumes that m was correctly initialized.

source

Model methods

StatsAPI.coef — Function

coef(model::StatisticalModel)

Return the coefficients of the model.

source

StatsAPI.coeftable — Function

coeftable(model::StatisticalModel; level::Real=0.95)

Return a table with coefficients and related statistics of the model. level determines the level for confidence intervals (by default, 95%).

The returned CoefTable object implements the Tables.jl interface, and can be converted e.g. to a DataFrame via using DataFrames; DataFrame(coeftable(model)).

source

StatsAPI.coefnames — Function

coefnames(model::StatisticalModel)

Return the names of the coefficients.

source

StatsAPI.responsename — Function

responsename(model::RegressionModel)

Return the name of the model response (a.k.a. the dependent variable).

source

StatsAPI.confint — Function

confint(model::StatisticalModel; level::Real=0.95)

Compute confidence intervals for coefficients, with confidence level level (by default 95%).

source

StatsAPI.deviance — Function

deviance(model::StatisticalModel)

Return the deviance of the model relative to a reference, which is usually when applicable the saturated model. It is equal, up to a constant, to $-2 \log L$, with $L$ the likelihood of the model.

source

StatsAPI.nulldeviance — Function

nulldeviance(model::StatisticalModel)

Return the deviance of the null model, obtained by dropping all independent variables present in model.

If model includes an intercept, the null model is the one with only the intercept; otherwise, it is the one without any predictor (not even the intercept).

source

StatsAPI.dof — Function

dof(model::StatisticalModel)

Return the number of degrees of freedom consumed in the model, including when applicable the intercept and the distribution's dispersion parameter.

source

StatsAPI.dof_residual — Function

dof_residual(model::RegressionModel)

Return the residual degrees of freedom of the model.

source

StatsAPI.nobs — Function

nobs(model::StatisticalModel)

Return the number of independent observations on which the model was fitted. Be careful when using this information, as the definition of an independent observation may vary depending on the model, on the format used to pass the data, on the sampling plan (if specified), etc.

source

RobustModels.wobs — Function

wobs(obj::RobustResp)

For unweighted linear models, this equals nobs, the number of elements of the response. For models with prior weights, return the sum of the weights.

source

wobs(m::RobustLinearModel)

For unweighted linear models, this equals nobs, the number of elements of the response. For models with prior weights, return the sum of the weights.

source

wobs(m::QuantileRegression)

For unweighted linear models, this equals nobs, the number of elements of the response. For models with prior weights, return the sum of the weights.

source

StatsAPI.isfitted — Function

isfitted(model::StatisticalModel)

Indicate whether the model has been fitted.

source

StatsAPI.islinear — Function

islinear(model::StatisticalModel)

Indicate whether the model is linear.

source

StatsAPI.loglikelihood — Function

loglikelihood(model::StatisticalModel)
loglikelihood(model::StatisticalModel, observation)

Return the log-likelihood of the model.

With an observation argument, return the contribution of observation to the log-likelihood of model.

If observation is a Colon, return a vector of each observation's contribution to the log-likelihood of the model. In other words, this is the vector of the pointwise log-likelihood contributions.

In general, sum(loglikelihood(model, :)) == loglikelihood(model).

source

StatsAPI.nullloglikelihood — Function

nullloglikelihood(model::StatisticalModel)

Return the log-likelihood of the null model, obtained by dropping all independent variables present in model.

If model includes an intercept, the null model is the one with only the intercept; otherwise, it is the one without any predictor (not even the intercept).

source

StatsAPI.stderror — Function

stderror(model::StatisticalModel)

Return the standard errors for the coefficients of the model.

source

StatsAPI.vcov — Function

vcov(model::StatisticalModel)

Return the variance-covariance matrix for the coefficients of the model.

source

StatsAPI.weights — Function

weights(model::StatisticalModel)

Return the weights used in the model.

source

RobustModels.workingweights — Function

workingweights(m::RobustLinearModel)

The robust weights computed by the model.

This can be used to detect outliers, as outliers receive lower weights than valid data points.
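
A sketch of how robust weights can flag outliers, using the Tukey weight w(r) = ψ(r)/r. The helper `tukey_weight`, the tuning constant 4.685 (a commonly quoted value for 95% efficiency) and the 0.5 threshold are assumptions for illustration, not values taken from the package.

```julia
# Tukey biweight weight: smoothly decreasing, exactly zero beyond the cutoff c.
tukey_weight(r; c=4.685) = abs(r) <= c ? (1 - (r / c)^2)^2 : 0.0

scaled_res = [0.1, -0.3, 0.2, 8.0]     # residuals divided by the scale estimate
w = tukey_weight.(scaled_res)

# Flag observations whose weight falls below a chosen threshold.
outliers = findall(<(0.5), w)
```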

source

StatsAPI.fitted — Function

fitted(model::RegressionModel)

Return the fitted values of the model.

source

StatsAPI.predict — Function

predict(model::RegressionModel, [newX])

Form the predicted response of model. An object with new covariate values newX can be supplied, which should have the same type and structure as that used to fit model; e.g. for a GLM it would generally be a DataFrame with the same variable names as the original predictors.

source

StatsAPI.leverage — Function

leverage(model::RegressionModel)

Return the diagonal of the projection matrix of the model.

source

RobustModels.leverage_weights — Function

leverage_weights(m::RobustLinearModel, variant::Symbol)

Returns leverage_weights for the model using a different weights vector depending on the variant:

variant = :original: use the user-defined weights; if no weights were used (size of the weights vector is 0), no weights are used.
variant = :fitted: use the working weights of the fitted model from the IRLS procedure.

source

StatsAPI.modelmatrix — Function

modelmatrix(model::RegressionModel)

Return the model matrix (a.k.a. the design matrix).

source

RobustModels.projectionmatrix — Function

projectionmatrix(m::RobustLinearModel)

The robust projection matrix from the predictor: X (X' W X)⁻¹ X' W, where W are the working weights.
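
The formula can be checked numerically with a small sketch. The weights below are arbitrary stand-ins for the working weights, and the variable names are illustrative.

```julia
using LinearAlgebra

X = [1.0 0.0; 1.0 1.0; 1.0 2.0; 1.0 3.0]
w = [1.0, 1.0, 0.5, 0.1]               # stand-in for the working weights
W = Diagonal(w)

# H = X (X' W X)⁻¹ X' W, computed with a linear solve instead of an explicit inverse.
H = X * ((X' * W * X) \ (X' * W))

# The diagonal of H holds the leverage of each observation.
lev = diag(H)
```

H is idempotent (H * H ≈ H) and its trace equals the number of coefficients p, here 2.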

source

projectionmatrix(m::RobustLinearModel, variant::Symbol)

Returns projectionmatrix for the model using a different weights vector depending on the variant:

variant = :original: use the user-defined weights; if no weights were used (size of the weights vector is 0), no weights are used.
variant = :fitted: use the working weights of the fitted model from the IRLS procedure.

source

GLM.dispersion — Method

dispersion(m::RobustLinearModel, sqr::Bool=false)

The dispersion is the (weighted) sum of robust residuals. If sqr is true, return the squared dispersion.

source

StatsAPI.response — Function

response(model::RegressionModel)

Return the model response (a.k.a. the dependent variable).

source

StatsAPI.residuals — Function

residuals(model::RegressionModel)

Return the residuals of the model.

source

StatsModels.hasintercept — Function

Indicate if the model has an intercept

source

RobustModels.hasformula — Function

Indicate if the model is defined from a formula (and dataframe) or from arrays

source

StatsModels.formula — Function

The model formula. Throw an error if the model was defined by arrays.

source

RobustModels.scale — Function

scale(m::RobustLinearModel, sqr::Bool=false)

The robust scale estimate used for the robust estimation.

If sqr is true, the square of the scale is returned.

source

RobustModels.tauscale — Function

tauscale(m::RobustLinearModel, sqr::Bool=false; kwargs...)

The robust τ-scale that is minimized in τ-estimation.

If sqr is true, the square of the τ-scale is returned.

source

RobustModels.location_variance — Function

location_variance(r::RobustLinResp, sqr::Bool=false)

Compute the part of the variance of the coefficients β that is due to the uncertainty from the location. If sqr is false, return the standard deviation instead.

From Maronna et al., Robust Statistics: Theory and Methods, Equation 4.49

source

RobustModels.Estimator — Function

Estimator(m::RobustLinearModel)

The robust estimator object used to fit the model.

source

GLM.linpred! — Function

linpred!(out, p::RidgePred{T}, f::Real=1.0)

Overwrite out with the linear predictor from p with factor f.

The effective coefficient vector, p.scratchbeta, is evaluated as p.beta0 .+ f * p.delbeta, and out is updated to p.X * p.scratchbeta.

source

linpred!(out, p::LinPred, f::Real=1.0)

Overwrite out with the linear predictor from p with factor f

The effective coefficient vector, p.scratchbeta, is evaluated as p.beta0 .+ f * p.delbeta, and out is updated to p.X * p.scratchbeta
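
The update described above amounts to two in-place operations. The sketch uses plain arrays named after the predictor fields rather than an actual LinPred object.

```julia
using LinearAlgebra

X = [1.0 0.0; 1.0 1.0; 1.0 2.0]
beta0 = [1.0, 2.0]            # base coefficients
delbeta = [0.5, -1.0]         # proposed increment
scratchbeta = similar(beta0)  # preallocated work vector
out = zeros(3)
f = 0.5                       # step factor

scratchbeta .= beta0 .+ f .* delbeta   # effective coefficients
mul!(out, X, scratchbeta)              # out = X * scratchbeta, without allocating
```

A factor f < 1 takes only part of the step delbeta, which is how step halving can be applied without recomputing the increment.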

source

RobustModels.pirls! — Function

pirls!(m::RobustLinearModel{T}; verbose::Bool=false, maxiter::Integer=30,
       minstepfac::Real=1e-3, atol::Real=1e-6, rtol::Real=1e-6,
       beta0::AbstractVector=[], sigma0::Union{Nothing, T}=nothing)

(Penalized) Iteratively Reweighted Least Squares procedure for M-estimation. The Penalized aspect is not implemented (yet).
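
The general shape of an IRLS loop for M-estimation can be sketched as follows. This is a simplified illustration with Huber weights and a MAD scale recomputed at each iteration; `irls_sketch` and `huber_weight` are hypothetical names, and step halving (minstepfac) and the other controls of pirls! are omitted.

```julia
using LinearAlgebra, Statistics

# Huber weight ψ(r)/r with tuning constant c (1.345 is a commonly used value).
huber_weight(r; c=1.345) = abs(r) <= c ? 1.0 : c / abs(r)

function irls_sketch(X, y; maxiter=30, rtol=1e-6)
    β = X \ y                                          # least-squares start
    for _ in 1:maxiter
        res = y - X * β
        σ = 1.4826 * median(abs.(res .- median(res)))  # normalized MAD scale
        w = huber_weight.(res ./ σ)                    # working weights
        βnew = (X' * Diagonal(w) * X) \ (X' * (w .* y))  # weighted least squares
        converged = norm(βnew - β) <= rtol * max(norm(β), 1)
        β = βnew
        converged && break
    end
    return β
end

x = collect(0.0:9.0)
X = hcat(ones(10), x)
y = 1 .+ 2 .* x .+ 0.1 .* sin.(x)
y[10] += 50                                            # one gross outlier

β_robust = irls_sketch(X, y)
β_ols = X \ y
```

On this data the robust slope ends up closer to the underlying value 2 than the least-squares slope, because the outlier is progressively downweighted.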

source

RobustModels.pirls_Sestimate! — Function

pirls_Sestimate!(m::RobustLinearModel{T}; verbose::Bool=false, maxiter::Integer=30,
       minstepfac::Real=1e-3, atol::Real=1e-6, rtol::Real=1e-6,
       beta0::AbstractVector=T[], sigma0::Union{Nothing, T}=nothing)

(Penalized) Iteratively Reweighted Least Squares procedure for S-estimation. The Penalized aspect is not implemented (yet).

source

RobustModels.pirls_τestimate! — Function

pirls_τestimate!(m::RobustLinearModel{T}; verbose::Bool=false, maxiter::Integer=30,
       minstepfac::Real=1e-3, atol::Real=1e-6, rtol::Real=1e-6,
       beta0::AbstractVector=T[], sigma0::Union{Nothing, T}=nothing)

(Penalized) Iteratively Reweighted Least Squares procedure for τ-estimation. The Penalized aspect is not implemented (yet).

source

Estimators

RobustModels.MEstimator — Type

MEstimator{L<:LossFunction} <: AbstractMEstimator

M-estimator for a given loss function.

The M-estimator is obtained by minimizing the loss function:

\[\hat{\mathbf{\beta}} = \underset{\mathbf{\beta}}{\textrm{argmin}} \sum_{i=1}^n \rho\left(\dfrac{r_i}{\hat{\sigma}}\right)\]

with the residuals $\mathbf{r} = \mathbf{y} - \mathbf{X} \mathbf{\beta}$ , and a robust scale estimate $\hat{\sigma}$.

Fields

loss: the LossFunction used for the robust estimation.

source

RobustModels.L1Estimator — Type

L1Estimator is a shorthand name for MEstimator{L1Loss}. Using exact QuantileRegression should be preferred.

source

RobustModels.L2Estimator — Type

L2Estimator is a shorthand name for MEstimator{L2Loss}, the non-robust OLS.

source

RobustModels.SEstimator — Type

SEstimator{L<:BoundedLossFunction} <: AbstractMEstimator

S-estimator for a given bounded loss function.

The S-estimator is obtained by minimizing the scale estimate:

\[\hat{\mathbf{\beta}} = \underset{\mathbf{\beta}}{\textrm{argmin }} \hat{\sigma}^2\]

where the robust scale estimate $\hat{\sigma}$ is solution of:

\[\dfrac{1}{n} \sum_{i=1}^n \rho\left(\dfrac{r_i}{\hat{\sigma}}\right) = \delta\]

with the residuals $\mathbf{r} = \mathbf{y} - \mathbf{X} \mathbf{\beta}$ , $\rho$ is a bounded loss function with $\underset{r \to \infty}{\lim} \rho(r) = 1$ and $\delta$ is the finite breakdown point, usually 0.5.

Fields

loss: the LossFunction used for the robust estimation.

source

RobustModels.MMEstimator — Type

MMEstimator{L1<:BoundedLossFunction, L2<:LossFunction} <: AbstractMEstimator

MM-estimator for the given loss functions.

The MM-estimator is obtained using a two-step process:

compute a robust scale estimate with a high breakdown point using an S-estimate and the loss function L1.
compute an efficient estimate using an M-estimate with the loss function L2.

Fields

loss1: the BoundedLossFunction used for the high breakdown point S-estimation.
loss2: the LossFunction used for the efficient M-estimation.
scaleest: boolean specifying whether the estimation is in the S-estimation step (true) or the M-estimation step (false).

source

RobustModels.TauEstimator — Type

TauEstimator{L1<:BoundedLossFunction, L2<:BoundedLossFunction} <: AbstractMEstimator

τ-estimator for the given loss functions.

The τ-estimator corresponds to an M-estimation, where the loss function is a weighted sum of a high breakdown point loss and an efficient loss. The weight is recomputed at every step of the Iteratively Reweighted Least Squares algorithm, so the estimate is both robust (high breakdown point) and efficient.

Fields

loss1: the high breakdown point BoundedLossFunction.
loss2: the high efficiency BoundedLossFunction.
w: the weight in the sum of losses: w . loss1 + loss2.

source

RobustModels.GeneralizedQuantileEstimator — Type

GeneralizedQuantileEstimator{L<:LossFunction} <: AbstractQuantileEstimator

Generalized Quantile Estimator is an M-Estimator with asymmetric loss function.

For L1Loss, this corresponds to quantile regression (although it is better to use quantreg for quantile regression because it gives the exact solution).

For L2Loss, this corresponds to Expectile regression (see ExpectileEstimator).

Fields

loss: the LossFunction.
τ: the quantile value to estimate, between 0 and 1.

Properties

tau, q, quantile are aliases for τ.
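
To illustrate the asymmetric loss in the L2 case, the sketch below computes a sample expectile by iterating an asymmetrically weighted mean. The helpers `asym_weight` and `expectile` are hypothetical names and this is not the package's algorithm.

```julia
# Asymmetric weight: τ for non-negative residuals, 1 - τ for negative ones.
asym_weight(τ, r) = r < 0 ? 1 - τ : τ

# Expectile of a sample: fixed point of the asymmetrically weighted mean.
function expectile(y, τ; iters=100)
    e = sum(y) / length(y)                 # start from the ordinary mean
    for _ in 1:iters
        w = [asym_weight(τ, yi - e) for yi in y]
        e = sum(w .* y) / sum(w)
    end
    return e
end

y = [1.0, 2.0, 3.0, 4.0]
e50 = expectile(y, 0.5)   # symmetric weights recover the mean
e90 = expectile(y, 0.9)   # heavier weight above shifts the estimate upward
```

Replacing the squared loss by the absolute loss under the same asymmetric weighting gives the quantile instead of the expectile.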

source

RobustModels.ExpectileEstimator — Type

The expectile estimator is a generalization of the L2 estimator to other quantiles τ ∈ [0,1].

[1] Schnabel, Eilers - Computational Statistics and Data Analysis 53 (2009) 4168–4177 - Optimal expectile smoothing doi:10.1016/j.csda.2009.05.002

source

RobustModels.QuantileEstimator — Type

Non-exact quantile estimator, GeneralizedQuantileEstimator{L1Loss}. Prefer using QuantileRegression

source

Loss functions

RobustModels.BoundedLossFunction — Type

Bounded loss function type for hard rejection of outliers.

source

RobustModels.L2Loss — Type

The (convex) L2 loss function is that of the standard least squares problem. ψ(r) = r

source

RobustModels.L1Loss — Type

The standard L1 loss function takes the absolute value of the residual, and is convex but non-smooth. It is not a real L1 loss but a Huber loss with a very small tuning constant. ψ(r) = sign(r). Use QuantileRegression for a correct implementation of the L1 loss.

source

RobustModels.HuberLoss — Type

The convex Huber loss function switches between a quadratic and a linear cost/loss function at a certain cutoff. ψ(r) = (abs(r) <= 1) ? r : sign(r)

source

RobustModels.L1L2Loss — Type

The convex L1-L2 loss interpolates smoothly between L2 behaviour for small residuals and L1 for outliers. ψ(r) = r / √(1 + r^2)

source

RobustModels.FairLoss — Type

The (convex) "fair" loss switches between a quadratic and a linear cost/loss function at a certain cutoff, and is C3 but non-analytic. ψ(r) = r / (1 + abs(r))

source

RobustModels.LogcoshLoss — Type

The convex Log-Cosh loss function ψ(r) = tanh(r)

source

RobustModels.ArctanLoss — Type

The convex Arctan loss function ψ(r) = atan(r)

source

RobustModels.CatoniWideLoss — Type

The convex (wide) Catoni loss function. See: "Catoni (2012) - Challenging the empirical mean and empirical variance: A deviation study"

ψ(r) = sign(r) * log(1 + abs(r) + r^2/2)

source

RobustModels.CatoniNarrowLoss — Type

The convex (narrow) Catoni loss function. See: "Catoni (2012) - Challenging the empirical mean and empirical variance: A deviation study"

ψ(r) = (abs(r) <= 1) ? -sign(r) * log(1 - abs(r) + r^2/2) : sign(r) * log(2)

source

RobustModels.CauchyLoss — Type

The non-convex Cauchy loss function switches from quadratic behaviour to logarithmic tails. This rejects outliers but may result in multiple minima. For scale estimate, r.ψ(r) is used as a loss, which is the same as for Geman loss. ψ(r) = r / (1 + r^2)

source

RobustModels.GemanLoss — Type

The non-convex Geman-McClure loss strongly suppresses outliers and does not guarantee a unique solution. For the S-Estimator, it is equivalent to the Cauchy loss. ψ(r) = r / (1 + r^2)^2

source

RobustModels.WelschLoss — Type

The non-convex Welsch loss strongly suppresses outliers and does not guarantee a unique solution. ψ(r) = r * exp(-r^2)

source

RobustModels.TukeyLoss — Type

The non-convex Tukey biweight estimator completely suppresses the outliers, and does not guarantee a unique solution. ψ(r) = (abs(r) <= 1) ? r * (1 - r^2)^2 : 0

source

RobustModels.YohaiZamarLoss — Type

The non-convex (and bounded) optimal Yohai-Zamar loss function that minimizes the estimator bias. It was originally introduced in Optimal locally robust M-estimates of regression (1997) by Yohai and Zamar with a slightly different formula.

source

RobustModels.HardThresholdLoss — Type

The non-convex hard-threshold loss function, or saturated L2 loss. Non-smooth. ψ(r) = (abs(r) <= 1) ? r : 0

source

RobustModels.HampelLoss — Type

The 3-parameter non-convex bounded Hampel's loss function. ψ(r) = (abs(r) <= 1) ? r : ( (abs(r) <= l.ν1) ? sign(r) : ( (abs(r) <= l.ν2) ? (l.ν2 - abs(r)) / (l.ν2 - l.ν1) * sign(r) : 0))

source

Estimator and Loss functions methods

RobustModels.rho — Function

The loss function ρ for the M-estimator.

source

RobustModels.psi — Function

The influence function ψ is the derivative of the loss function for the M-estimator, multiplied by the square of the tuning constant.

source

RobustModels.psider — Function

The derivative of ψ, used for asymptotic estimates.

source

RobustModels.weight — Function

The weights for IRLS, the function ψ divided by r.
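
The relations between ρ, ψ and the IRLS weight can be checked on the Huber loss. In this sketch the tuning constant enters as ρ(r/c), which is an assumption about the convention; with it, ψ = c² ρ' gives the familiar clipped identity. The `huber_*` names are hypothetical.

```julia
c = 1.345                                    # tuning constant (illustrative value)

# Huber ρ evaluated at the rescaled residual r/c.
huber_rho(r) = abs(r) <= c ? (r / c)^2 / 2 : abs(r) / c - 1 / 2

# ψ = c² ρ': identity inside the cutoff, clipped at ±c outside.
huber_psi(r) = abs(r) <= c ? r : c * sign(r)

# IRLS weight ψ(r) / r.
huber_w(r) = abs(r) <= c ? 1.0 : c / abs(r)

# Numerical check of ψ = c² ρ' with a central finite difference at r = 3.
h = 1e-6
psi_numeric = c^2 * (huber_rho(3.0 + h) - huber_rho(3.0 - h)) / (2h)
```

Residuals inside the cutoff keep full weight 1, while larger residuals are downweighted in proportion to 1/|r|.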

source

RobustModels.estimator_values — Function

Faster version if you need ρ, ψ and w in the same call

source

RobustModels.estimator_norm — Function

The integral of exp(-ρ) used for calculating the full-loglikelihood

source

RobustModels.estimator_bound — Function

The limit at ∞ of the loss function. Used for scale estimation of bounded loss.

source

RobustModels.tuning_constant — Function

The tuning constant of the loss function, can be optimized to get efficient or robust estimates.

source

RobustModels.isconvex — Function

Return true if the estimator or loss function is convex.

source

RobustModels.isbounded — Function

Return true if the estimator or loss function is bounded.

source

RobustModels.estimator_high_breakdown_point_constant — Function

The tuning constant associated to the loss that gives a robust (high breakdown point) M-estimator.

source

RobustModels.estimator_high_efficiency_constant — Function

The tuning constant associated to the loss that gives an efficient M-estimator.

source

RobustModels.efficient_loss — Function

The loss initialized with an efficient tuning constant

source

RobustModels.robust_loss — Function

The loss initialized with a robust (high breakdown point) tuning constant

source

RobustModels.efficiency_tuning_constant — Function

The tuning constant c is computed so the efficiency for Normal distributed residuals is 0.95. The efficiency of the mean estimate μ is defined by:

eff_μ = (E[ψ'])²/E[ψ²]
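
The efficiency formula can be evaluated numerically. The sketch below does so for the Huber ψ under a standard Normal density, using a simple Riemann sum; all helper names are hypothetical and the integration grid is an arbitrary choice.

```julia
# Standard Normal density.
φ(x) = exp(-x^2 / 2) / sqrt(2π)

# Huber ψ with tuning constant c, and its derivative.
huber_psi(r, c) = clamp(r, -c, c)
huber_psider(r, c) = abs(r) <= c ? 1.0 : 0.0

# eff = (E[ψ'])² / E[ψ²], with expectations taken over N(0, 1).
function efficiency(c; lim=10.0, n=200_001)
    xs = range(-lim, lim; length=n)
    h = step(xs)
    Eder = h * sum(huber_psider(x, c) * φ(x) for x in xs)
    Esq = h * sum(huber_psi(x, c)^2 * φ(x) for x in xs)
    return Eder^2 / Esq
end

eff = efficiency(1.345)   # close to 0.95 for this commonly quoted constant
```

Finding the tuning constant then amounts to solving efficiency(c) = 0.95 for c with a one-dimensional root finder.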

source

RobustModels.mscale_loss — Function

mscale_loss(loss::L, x)

The rho-function that is used for M-scale estimation.

For monotone (convex) functions, χ(r) = r.ψ(r)/c^2.

For bounded functions, χ(r) = ρ(r)/ρ(∞) so χ(∞) = 1.

source

RobustModels.breakdown_point_tuning_constant — Function

The M-estimate of scale is computed by solving:

\[\dfrac{1}{n} \sum_i \chi\left( \dfrac{r_i}{\hat{\sigma}}\right) = \delta\]

For monotone (convex) functions, χ(r) = r.ψ(r) and δ is defined as E[χ(r)] = δ for the Normal distribution N(0,1).

For bounded functions, χ(r) = ρ(r)/ρ(∞) with χ(∞) = 1 and δ = E[χ]/χ(∞) with expectation w.r.t. Normal density.

The tuning constant c corresponding to a high breakdown point (0.5) is such that δ = 1/2, from 1/n Σ χ(r/ŝ) = δ

source

RobustModels.scale_estimate — Function

scale_estimate(loss, res; σ0=1.0, wts=[], verbose=false,
                         order=1, approx=false, nmax=30,
                         rtol=1e-4, atol=0.1)

Compute the M-scale estimate from the loss function. If the loss is bounded, ρ is used as the function χ in the sum; otherwise r.ψ(r) is used, to coincide with the Maximum Likelihood Estimator. Also, for bounded estimators, because f(s) = 1/(nδ) Σ ρ(ri/s) is decreasing, the iteration step does not use the weights but is multiplicative.
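
A multiplicative iteration for the bounded case can be sketched as σ ← σ √f(σ), with f(σ) = 1/(nδ) Σ χ(rᵢ/σ), which stops changing when f(σ) = 1. The Tukey-type χ, the constant 1.547 (a commonly quoted value for a 0.5 breakdown point) and the function names are assumptions; the package's update rule may differ in detail.

```julia
# Bounded χ (Tukey bisquare ρ normalized so that χ(∞) = 1).
chi(r; c=1.547) = abs(r) <= c ? 1 - (1 - (r / c)^2)^3 : 1.0

# Solve (1/n) Σ χ(rᵢ/σ) = δ for σ by a multiplicative fixed-point iteration.
function mscale_sketch(res; δ=0.5, σ0=1.0, nmax=200, rtol=1e-10)
    σ = σ0
    for _ in 1:nmax
        f = sum(chi(r / σ) for r in res) / (length(res) * δ)
        σnew = σ * sqrt(f)            # grow σ if f > 1, shrink it if f < 1
        abs(σnew - σ) <= rtol * σ && return σnew
        σ = σnew
    end
    return σ
end

res = [0.5, -1.2, 0.3, 2.0, -0.7, 100.0]   # residuals with one gross outlier
σ = mscale_sketch(res)
mean_chi = sum(chi.(res ./ σ)) / length(res)
```

Because χ is bounded by 1, the outlier at 100 contributes no more than any other large residual, so the scale stays on the order of the bulk of the data.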

source

RobustModels.tau_efficiency_tuning_constant — Function

tau_efficiency_tuning_constant(::Type{L1}, ::Type{L2}; eff::Real=0.95, c0::Real=1.0)
    where {L1<:BoundedLossFunction,L2<:BoundedLossFunction}

Compute the tuning constant that gives the τ-estimator the target efficiency eff (0.95 by default).

source

RobustModels.estimator_tau_efficient_constant — Function

The tuning constant associated to the loss that gives a robust τ-estimator.

source

RobustModels.loss — Function

The loss function used for the estimation

source

RobustModels.set_SEstimator — Function

MMEstimator, set to S-Estimation phase

source

RobustModels.set_MEstimator — Function

MMEstimator, set to M-Estimation phase

source

RobustModels.update_weight! — Function

update_weight!(E::TauEstimator, res::AbstractArray{T}; wts::AbstractArray{T}=T[])

Update the weight between the two estimators of a τ-estimator using the scaled residual.

source

RobustModels.tau_scale_estimate — Function

tau_scale_estimate!(E::TauEstimator, res::AbstractArray{T}, σ::Real, sqr::Bool=false;
                    wts::AbstractArray{T}=T[], bound::AbstractFloat=0.5) where {T<:AbstractFloat}

The τ-scale estimate, where σ is the scale estimate from the robust M-scale. If sqr is true, return the squared value.

source

RobustModels.quantile_weight — Function

quantile_weight(τ::Real, r::Real)

Wrapper function to compute quantile-like loss function.

source