API
Types defined in the package
RobustModels.AbstractRobustModel — Type
AbstractRobustModel
Abstract type for robust models.
RobustModels.jl implements one subtype: RobustLinearModel. See the documentation for each for more details.
RobustModels.AbstractEstimator — Type
General location or scale estimator.
RobustModels.AbstractQuantileEstimator — Type
Generalized M-Quantile estimator.
RobustModels.LossFunction — Type
An estimator needs a cost/loss function for the modified (weighted) least squares problems of the form:
\[\min \sum_i \rho\left(\dfrac{r_i}{\hat{\sigma}}\right)\]
RobustModels.RobustLinearModel — Type
RobustLinearModel
Robust linear model representation.
Fields
- resp: the RobustLinResp structure.
- pred: the predictor structure, of type DensePredChol, SparsePredChol, DensePredCG, SparsePredCG or RidgePred.
- formula: either a FormulaTerm object or nothing.
- fitdispersion: if true, the dispersion is estimated, otherwise it is kept fixed.
- fitted: if true, the model was already fitted.
RobustModels.RobustLinResp — Type
RobustLinResp
Robust linear response structure.
Solve the following minimization problem:
\[\min \sum_i \rho\left(\dfrac{r_i}{\hat{\sigma}}\right)\]
Fields
- est: estimator used for the model.
- y: response vector.
- μ: mean response vector.
- offset: offset added to Xβ to form μ. Can be of length 0.
- wts: prior case weights. Can be of length 0.
- σ: current estimate of the scale or dispersion.
- devresid: the deviance residuals.
- wrkwt: working case weights for the Iteratively Reweighted Least Squares (IRLS) algorithm.
- wrkres: working residuals for IRLS.
- wrkscaledres: scaled residuals for IRLS.
GLM.LinPred — Type
LinPred
Abstract type representing a linear predictor.
RobustModels.DensePredCG — Type
DensePredCG
A LinPred type with Conjugate Gradient and a dense X.
Members
- X: model matrix of size n × p with n ≥ p. Should be full column rank.
- beta0: base coefficient vector of length p.
- delbeta: increment to coefficient vector, also of length p.
- scratchbeta: scratch vector of length p, used in the GLM.linpred! method.
RobustModels.SparsePredCG — Type
SparsePredCG
A LinPred type with Conjugate Gradient and a sparse X.
Members
- X: model matrix of size n × p with n ≥ p. Should be full column rank.
- beta0: base coefficient vector of length p.
- delbeta: increment to coefficient vector, also of length p.
- scratchbeta: scratch vector of length p, used in the GLM.linpred! method.
GLM.DensePredChol — Type
DensePredChol{T}
A LinPred type with a dense Cholesky factorization of X'X.
Members
- X: model matrix of size n × p with n ≥ p. Should be full column rank.
- beta0: base coefficient vector of length p.
- delbeta: increment to coefficient vector, also of length p.
- scratchbeta: scratch vector of length p, used in the linpred! method.
- chol: a Cholesky object created from X'X, possibly using row weights.
- scratchm1: scratch Matrix{T} of the same size as X.
- scratchm2: scratch Matrix{T} of the same size as X'X.
GLM.SparsePredChol — Type
SparsePredChol{T,M<:SparseMatrixCSC,C} where {T,C}
A LinPred type with a sparse Cholesky factorization of X'X. No pivot option.
Members
- X: model matrix of size n × p with n ≥ p. Should be full column rank.
- Xt: transpose of the model matrix.
- beta0: base coefficient vector of length p.
- delbeta: increment to coefficient vector, also of length p.
- scratchbeta: scratch vector of length p, used in the GLM.linpred! method.
- chol: a sparse Cholesky object created from X'X, possibly using row weights.
- scratch: scratch SparseMatrixCSC{T} of the same size as X.
GLM.DensePredQR — Type
DensePredQR
A LinPred type with a dense, unpivoted QR decomposition of X.
Members
- X: model matrix of size n × p with n ≥ p. Should be full column rank.
- beta0: base coefficient vector of length p.
- delbeta: increment to coefficient vector, also of length p.
- scratchbeta: scratch vector of length p, used in the linpred! method.
- qr: a QRCompactWY object created from X, with optional row weights.
RobustModels.RidgePred — Type
RidgePred
Regularized predictor using ridge regression on the p features.
Members
- X: model matrix.
- λ: shrinkage parameter of the regularizer.
- G: regularizer matrix of size p × p.
- βprior: regularizer prior of the coefficient values. Defaults to zeros(p).
- pred: the non-regularized predictor using an extended model matrix.
- pivot: for DensePredChol, whether the decomposition was pivoted.
- scratchbeta: scratch vector of length p, used in the GLM.linpred! method.
RobustModels.AbstractRegularizedPred — Type
Abstract type for predictors with regularization.
RobustModels.QuantileRegression — Type
QuantileRegression
Quantile regression representation.
Fields
- τ: the quantile value.
- X: the model matrix.
- β: the coefficients.
- y: the response vector.
- wts: the weights.
- wrkres: the working residuals.
- formula: either a FormulaTerm object or nothing.
- fitdispersion: if true, the dispersion is estimated, otherwise it is kept fixed.
- fitted: if true, the model was already fitted.
Constructors for models
StatsAPI.fit — Method
fit(
::Type{M},
X::AbstractMatrix{T},
y::AbstractVector{T},
est::AbstractMEstimator;
method::Symbol = :auto, # :chol, :qr, :cg
dofit::Bool = true,
wts::FPVector = similar(y, 0),
offset::FPVector = similar(y, 0),
fitdispersion::Bool = false,
ridgeλ::Real = 0,
ridgeG::Union{UniformScaling, AbstractArray} = I,
βprior::AbstractVector = [],
quantile::Union{Nothing, AbstractFloat} = nothing,
dropcollinear::Bool = false,
initial_scale::Union{Symbol, Real}=:mad,
σ0::Union{Nothing, Symbol, Real}=initial_scale,
initial_coef::AbstractVector=[],
β0::AbstractVector=initial_coef,
correct_leverage::Bool=false,
fitargs...,
) where {M<:RobustLinearModel, T<:AbstractFloat}
Create a robust model with the model matrix (or formula) X and response vector (or dataframe) y, using a robust estimator.
Arguments
- X: the model matrix (it can be dense or sparse) or a formula
- y: the response vector or a table (dataframe, namedtuple, ...)
- est: a robust estimator
Keywords
- method::Symbol = :auto: the method to use for solving the weighted linear system, :chol (default), :qr or :cg. Use :auto to select the default method;
- dofit::Bool = true: if false, return the model object without fitting;
- dropmissing::Bool = false: if true, drop the rows with missing values (and convert to a non-missing type). With dropmissing=true the number of observations may be smaller than the size of the input arrays;
- wts::Vector = similar(y, 0): prior probability weights of observations. Can be empty (length 0) if no weights are used (default);
- offset::Vector = similar(y, 0): an offset vector, should be empty if no offset is used;
- fitdispersion::Bool = false: reevaluate the dispersion;
- ridgeλ::Real = 0: if positive, perform a robust ridge regression with shrinkage parameter ridgeλ. A RidgePred object will be used;
- ridgeG::Union{UniformScaling, AbstractArray} = I: define a custom regularization matrix. Defaults to the identity matrix (with 0 for the intercept);
- βprior::AbstractVector = []: define a custom prior for the coefficients for ridge regression. Defaults to zeros(p);
- quantile::Union{Nothing, AbstractFloat} = nothing: only for GeneralizedQuantileEstimator, define the quantile to estimate;
- dropcollinear::Bool = false: controls whether or not a model matrix less than full rank is accepted;
- contrasts::AbstractDict{Symbol,Any} = Dict{Symbol,Any}(): a Dict mapping term names (as Symbols) to term types (e.g. ContinuousTerm) or contrasts (e.g. HelmertCoding(), SeqDiffCoding(; levels=["a", "b", "c"]), etc.). If contrasts are not provided for a variable, the appropriate term type will be guessed based on the data type of the data column: any numeric data is assumed to be continuous, and any non-numeric data is assumed to be categorical (with DummyCoding() as the default contrast type);
- initial_scale::Union{Symbol, Real} = :mad: the initial scale estimate; for non-convex estimators it helps to find the global minimum. Automatic computation using :mad, :L1 or :extrema (non-robust);
- σ0::Union{Nothing, Symbol, Real} = initial_scale: alias of initial_scale;
- initial_coef::AbstractVector = []: the initial coefficients estimate; for non-convex estimators it helps to find the global minimum;
- β0::AbstractVector = initial_coef: alias of initial_coef;
- correct_leverage::Bool = false: apply the leverage correction weights with leverage_weights;
- fitargs...: other keyword arguments used to control the convergence of the IRLS algorithm (see RobustModels.pirls!).
Output
the RobustLinearModel object.
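For illustration, here is a minimal sketch of a matrix-based fit. The data are made up for the example, and the loss and keyword choices simply follow the defaults documented above.
using RobustModels

# Hypothetical data: 100 observations, an intercept column and 2 predictors.
X = hcat(ones(100), randn(100, 2))
y = X * [1.0, 2.0, -0.5] .+ randn(100)
y[1:5] .+= 20    # a few gross outliers

# M-estimation with a Huber loss; :mad is the documented default initial scale.
m = fit(RobustLinearModel, X, y, MEstimator{HuberLoss}();
        method=:chol, initial_scale=:mad)

coef(m)       # robust coefficient estimates
stderror(m)   # robust standard errors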
StatsAPI.fit — Method
fit(::Type{M}, X::Union{AbstractMatrix{T},SparseMatrixCSC{T}},
y::AbstractVector{T}; quantile::AbstractFloat=0.5,
dofit::Bool = true,
wts::FPVector = similar(y, 0),
fitdispersion::Bool = false,
fitargs...) where {M<:QuantileRegression, T<:AbstractFloat}
Fit a quantile regression model with the model matrix (or formula) X and response vector (or dataframe) y.
It is solved using an exact interior-point method.
Arguments
- X: the model matrix (it can be dense or sparse) or a formula
- y: the response vector or a dataframe
Keywords
- quantile::AbstractFloat = 0.5: the quantile value for the regression, between 0 and 1;
- dofit::Bool = true: if false, return the model object without fitting;
- wts::Vector = []: a weight vector, should be empty if no weights are used;
- fitdispersion::Bool = false: reevaluate the dispersion;
- fitargs...: other keyword arguments like verbose to print iteration details.
Output
the QuantileRegression object.
RobustModels.rlm — Function
rlm(X, y, args...; kwargs...)
An alias for fit(RobustLinearModel, X, y, est; kwargs...).
The arguments X and y can be a Matrix and a Vector or a Formula and a DataFrame.
RobustModels.quantreg — Function
quantreg(X, y, args...; kwargs...)
An alias for fit(QuantileRegression, X, y; kwargs...).
The arguments X and y can be a Matrix and a Vector or a Formula and a DataFrame.
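A sketch of the formula-based shorthands; the DataFrame, column names and estimator choices are hypothetical and only illustrate the documented call forms.
using DataFrames, StatsModels, RobustModels

# Hypothetical dataset with a contaminated response.
df = DataFrame(x = randn(200))
df.y = 3 .* df.x .+ 1 .+ randn(200)
df.y[1:10] .+= 50

# Robust M-estimation with Tukey's biweight, via the rlm shorthand.
m1 = rlm(@formula(y ~ x), df, MEstimator{TukeyLoss}(); initial_scale=:mad)

# Median (τ = 0.5) and upper-quartile fits via the quantreg shorthand.
m2 = quantreg(@formula(y ~ x), df; quantile=0.5)
m3 = quantreg(@formula(y ~ x), df; quantile=0.75)

coeftable(m1)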
StatsAPI.fit! — Function
Fit a statistical model in-place.
RobustModels.refit! — Function
refit!(m::RobustLinearModel, [y::FPVector];
wts::Union{Nothing, FPVector} = nothing,
offset::Union{Nothing, FPVector} = nothing,
quantile::Union{Nothing, AbstractFloat} = nothing,
ridgeλ::Union{Nothing, Real} = nothing,
kwargs...)
Refit the RobustLinearModel.
This function assumes that m was correctly initialized and the model is refitted with the new values for the response, weights, offset, quantile and ridge shrinkage.
Defining a new quantile is only possible for GeneralizedQuantileEstimator.
Defining a new ridgeλ is only possible for RidgePred objects.
refit!(m::QuantileRegression, [y::FPVector];
       verbose::Bool=false,
       quantile::Union{Nothing, AbstractFloat}=nothing)
Refit the QuantileRegression model with the new values for the response, weights and quantile. This function assumes that m was correctly initialized.
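A hedged sketch of refitting already-initialized models; m_exp, m_ridge and y_new are hypothetical objects standing in for models fitted as described above.
# m_exp is assumed to be a model fitted with a GeneralizedQuantileEstimator,
# m_ridge one fitted with ridgeλ > 0 (a RidgePred predictor).
refit!(m_exp; quantile=0.8)    # change the estimated quantile/expectile
refit!(m_ridge; ridgeλ=10.0)   # change the ridge shrinkage parameter

# Refit against a new response vector of the same length (hypothetical y_new):
refit!(m_exp, y_new)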
Model methods
StatsAPI.coef — Function
coef(model::StatisticalModel)
Return the coefficients of the model.
StatsAPI.coeftable — Function
coeftable(model::StatisticalModel; level::Real=0.95)
Return a table with coefficients and related statistics of the model. level determines the level for confidence intervals (by default, 95%).
The returned CoefTable object implements the Tables.jl interface, and can be converted e.g. to a DataFrame via using DataFrames; DataFrame(coeftable(model)).
StatsAPI.coefnames — Function
coefnames(model::StatisticalModel)
Return the names of the coefficients.
StatsAPI.responsename — Function
responsename(model::RegressionModel)
Return the name of the model response (a.k.a. the dependent variable).
StatsAPI.confint — Function
confint(model::StatisticalModel; level::Real=0.95)
Compute confidence intervals for coefficients, with confidence level level (by default 95%).
StatsAPI.deviance — Function
deviance(model::StatisticalModel)
Return the deviance of the model relative to a reference, which is usually when applicable the saturated model. It is equal, up to a constant, to $-2 \log L$, with $L$ the likelihood of the model.
StatsAPI.nulldeviance — Function
nulldeviance(model::StatisticalModel)
Return the deviance of the null model, obtained by dropping all independent variables present in model.
If model includes an intercept, the null model is the one with only the intercept; otherwise, it is the one without any predictor (not even the intercept).
StatsAPI.dof — Function
dof(model::StatisticalModel)
Return the number of degrees of freedom consumed in the model, including when applicable the intercept and the distribution's dispersion parameter.
StatsAPI.dof_residual — Function
dof_residual(model::RegressionModel)
Return the residual degrees of freedom of the model.
StatsAPI.nobs — Function
nobs(model::StatisticalModel)
Return the number of independent observations on which the model was fitted. Be careful when using this information, as the definition of an independent observation may vary depending on the model, on the format used to pass the data, on the sampling plan (if specified), etc.
RobustModels.wobs — Function
wobs(obj::RobustResp)
For unweighted linear models, equal to nobs: it returns the number of elements of the response. For models with prior weights, return the sum of the weights.
wobs(m::RobustLinearModel)
For unweighted linear models, equal to nobs: it returns the number of elements of the response. For models with prior weights, return the sum of the weights.
wobs(m::QuantileRegression)
For unweighted linear models, equal to nobs: it returns the number of elements of the response. For models with prior weights, return the sum of the weights.
StatsAPI.isfitted — Function
isfitted(model::StatisticalModel)
Indicate whether the model has been fitted.
StatsAPI.islinear — Function
islinear(model::StatisticalModel)
Indicate whether the model is linear.
StatsAPI.loglikelihood — Function
loglikelihood(model::StatisticalModel)
loglikelihood(model::StatisticalModel, observation)
Return the log-likelihood of the model.
With an observation argument, return the contribution of observation to the log-likelihood of model.
If observation is a Colon, return a vector of each observation's contribution to the log-likelihood of the model. In other words, this is the vector of the pointwise log-likelihood contributions.
In general, sum(loglikelihood(model, :)) == loglikelihood(model).
StatsAPI.nullloglikelihood — Function
nullloglikelihood(model::StatisticalModel)
Return the log-likelihood of the null model, obtained by dropping all independent variables present in model.
If model includes an intercept, the null model is the one with only the intercept; otherwise, it is the one without any predictor (not even the intercept).
StatsAPI.stderror — Function
stderror(model::StatisticalModel)
Return the standard errors for the coefficients of the model.
StatsAPI.vcov — Function
vcov(model::StatisticalModel)
Return the variance-covariance matrix for the coefficients of the model.
StatsAPI.weights — Function
weights(model::StatisticalModel)
Return the weights used in the model.
RobustModels.workingweights — Function
workingweights(m::RobustLinearModel)
The robust weights computed by the model.
This can be used to detect outliers, as outlier weights are lower than the weights of valid data points.
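A sketch of using the working weights to flag likely outliers; m is assumed to be a fitted RobustLinearModel, and the 0.1 cutoff is arbitrary and data-dependent.
w = workingweights(m)

# Observations that were strongly downweighted by the robust fit.
suspected_outliers = findall(<(0.1), w)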
StatsAPI.fitted — Function
fitted(model::RegressionModel)
Return the fitted values of the model.
StatsAPI.predict — Function
predict(model::RegressionModel, [newX])
Form the predicted response of model. An object with new covariate values newX can be supplied, which should have the same type and structure as that used to fit model; e.g. for a GLM it would generally be a DataFrame with the same variable names as the original predictors.
StatsAPI.leverage — Function
leverage(model::RegressionModel)
Return the diagonal of the projection matrix of the model.
RobustModels.leverage_weights — Function
leverage_weights(m::RobustLinearModel, variant::Symbol)
Return leverage_weights for the model, using a different weights vector depending on the variant:
- variant = :original: use the user-defined weights; if no weights were used (the weights vector has length 0), no weights are used.
- variant = :fitted: use the working weights of the fitted model from the IRLS procedure.
StatsAPI.modelmatrix — Function
modelmatrix(model::RegressionModel)
Return the model matrix (a.k.a. the design matrix).
RobustModels.projectionmatrix — Function
projectionmatrix(m::RobustLinearModel)
The robust projection matrix from the predictor: X (X' W X)⁻¹ X' W, where W are the working weights.
projectionmatrix(m::RobustLinearModel, variant::Symbol)
Return projectionmatrix for the model, using a different weights vector depending on the variant:
- variant = :original: use the user-defined weights; if no weights were used (the weights vector has length 0), no weights are used.
- variant = :fitted: use the working weights of the fitted model from the IRLS procedure.
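A brief sketch of the two variants, assuming m is a fitted RobustLinearModel from the examples above.
H = projectionmatrix(m)    # robust hat matrix X (X' W X)⁻¹ X' W
h = leverage(m)            # its diagonal

# Variants using either the user-supplied weights or the IRLS working weights.
H0 = projectionmatrix(m, :original)
Hf = projectionmatrix(m, :fitted)
lw = leverage_weights(m, :fitted)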
GLM.dispersion — Method
dispersion(m::RobustLinearModel, sqr::Bool=false)
The dispersion is the (weighted) sum of robust residuals. If sqr is true, return the squared dispersion.
StatsAPI.response — Function
response(model::RegressionModel)
Return the model response (a.k.a. the dependent variable).
StatsAPI.residuals — Function
residuals(model::RegressionModel)
Return the residuals of the model.
StatsModels.hasintercept — Function
Indicate if the model has an intercept.
RobustModels.hasformula — Function
Indicate if the model is defined from a formula (and dataframe) or from arrays.
StatsModels.formula — Function
The model formula. Throw an error if the model was defined by arrays.
RobustModels.scale — Function
scale(m::RobustLinearModel, sqr::Bool=false)
The robust scale estimate used for the robust estimation.
If sqr is true, the square of the scale is returned.
RobustModels.tauscale — Function
tauscale(m::RobustLinearModel, sqr::Bool=false; kwargs...)
The robust τ-scale that is minimized in τ-estimation.
If sqr is true, the square of the τ-scale is returned.
RobustModels.location_variance — Function
location_variance(r::RobustLinResp, sqr::Bool=false)
Compute the part of the variance of the coefficients β that is due to the uncertainty from the location. If sqr is false, return the standard deviation instead.
From Maronna et al., Robust Statistics: Theory and Methods, Equation 4.49.
RobustModels.Estimator — Function
Estimator(m::RobustLinearModel)
The robust estimator object used to fit the model.
GLM.linpred! — Function
linpred!(out, p::RidgePred{T}, f::Real=1.0)
Overwrite out with the linear predictor from p with factor f.
The effective coefficient vector, p.scratchbeta, is evaluated as p.beta0 .+ f * p.delbeta, and out is updated to p.X * p.scratchbeta.
linpred!(out, p::LinPred, f::Real=1.0)
Overwrite out with the linear predictor from p with factor f.
The effective coefficient vector, p.scratchbeta, is evaluated as p.beta0 .+ f * p.delbeta, and out is updated to p.X * p.scratchbeta.
RobustModels.pirls! — Function
pirls!(m::RobustLinearModel{T}; verbose::Bool=false, maxiter::Integer=30,
       minstepfac::Real=1e-3, atol::Real=1e-6, rtol::Real=1e-6,
       beta0::AbstractVector=[], sigma0::Union{Nothing, T}=nothing)
(Penalized) Iteratively Reweighted Least Squares procedure for M-estimation. The penalized aspect is not implemented (yet).
RobustModels.pirls_Sestimate! — Function
pirls_Sestimate!(m::RobustLinearModel{T}; verbose::Bool=false, maxiter::Integer=30,
       minstepfac::Real=1e-3, atol::Real=1e-6, rtol::Real=1e-6,
       beta0::AbstractVector=T[], sigma0::Union{Nothing, T}=nothing)
(Penalized) Iteratively Reweighted Least Squares procedure for S-estimation. The penalized aspect is not implemented (yet).
RobustModels.pirls_τestimate! — Function
pirls_τestimate!(m::RobustLinearModel{T}; verbose::Bool=false, maxiter::Integer=30,
       minstepfac::Real=1e-3, atol::Real=1e-6, rtol::Real=1e-6,
       beta0::AbstractVector=T[], sigma0::Union{Nothing, T}=nothing)
(Penalized) Iteratively Reweighted Least Squares procedure for τ-estimation. The penalized aspect is not implemented (yet).
Estimators
RobustModels.MEstimator — Type
MEstimator{L<:LossFunction} <: AbstractMEstimator
M-estimator for a given loss function.
The M-estimator is obtained by minimizing the loss function:
\[\hat{\mathbf{\beta}} = \underset{\mathbf{\beta}}{\textrm{argmin}} \sum_{i=1}^n \rho\left(\dfrac{r_i}{\hat{\sigma}}\right)\]
with the residuals $\mathbf{r} = \mathbf{y} - \mathbf{X} \mathbf{\beta}$, and a robust scale estimate $\hat{\sigma}$.
Fields
- loss: the LossFunction used for the robust estimation.
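A sketch of constructing M-estimators from loss types and passing them to rlm; the loss choices are examples, and X and y are the hypothetical data from the matrix example above.
est_huber = MEstimator{HuberLoss}()   # convex, monotone ψ
est_tukey = MEstimator{TukeyLoss}()   # bounded, redescending ψ

m_huber = rlm(X, y, est_huber; initial_scale=:mad)
m_tukey = rlm(X, y, est_tukey; initial_scale=:mad)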
RobustModels.L1Estimator — Type
L1Estimator is a shorthand name for MEstimator{L1Loss}. Using the exact QuantileRegression should be preferred.
RobustModels.L2Estimator — Type
L2Estimator is a shorthand name for MEstimator{L2Loss}, the non-robust OLS.
RobustModels.SEstimator — Type
SEstimator{L<:BoundedLossFunction} <: AbstractMEstimator
S-estimator for a given bounded loss function.
The S-estimator is obtained by minimizing the scale estimate:
\[\hat{\mathbf{\beta}} = \underset{\mathbf{\beta}}{\textrm{argmin }} \hat{\sigma}^2\]
where the robust scale estimate $\hat{\sigma}$ is the solution of:
\[\dfrac{1}{n} \sum_{i=1}^n \rho\left(\dfrac{r_i}{\hat{\sigma}}\right) = \delta\]
with the residuals $\mathbf{r} = \mathbf{y} - \mathbf{X} \mathbf{\beta}$, where $\rho$ is a bounded loss function with $\underset{r \to \infty}{\lim} \rho(r) = 1$ and $\delta$ is the finite breakdown point, usually 0.5.
Fields
- loss: the LossFunction used for the robust estimation.
RobustModels.MMEstimator — Type
MMEstimator{L1<:BoundedLossFunction, L2<:LossFunction} <: AbstractMEstimator
MM-estimator for the given loss functions.
The MM-estimator is obtained using a two-step process:
- compute a robust scale estimate with a high breakdown point using an S-estimate and the loss function L1;
- compute an efficient estimate using an M-estimate with the loss function L2.
Fields
- loss1: the BoundedLossFunction used for the high breakdown point S-estimation.
- loss2: the LossFunction used for the efficient M-estimation.
- scaleest: boolean specifying whether the estimation is in the S-estimation step (true) or the M-estimation step (false).
RobustModels.TauEstimator — Type
TauEstimator{L1<:BoundedLossFunction, L2<:BoundedLossFunction} <: AbstractMEstimator
τ-estimator for the given loss functions.
The τ-estimator corresponds to an M-estimation, where the loss function is a weighted sum of a high breakdown point loss and an efficient loss. The weight is recomputed at every step of the Iteratively Reweighted Least Squares, so the estimate is both robust (high breakdown point) and efficient.
Fields
- loss1: the high breakdown point BoundedLossFunction.
- loss2: the high efficiency BoundedLossFunction.
- w: the weight in the sum of losses: w * loss1 + loss2.
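S-, MM- and τ-estimators are used the same way as M-estimators. The single-loss-type constructors in this sketch are assumed (following typical package usage) to pick the companion loss and tuning constants automatically; X and y are the hypothetical data from the earlier examples.
# Bounded losses are required for S-, MM- and τ-estimation.
m_s   = rlm(X, y, SEstimator{TukeyLoss}(); initial_scale=:mad)
m_mm  = rlm(X, y, MMEstimator{TukeyLoss}(); initial_scale=:mad)
m_tau = rlm(X, y, TauEstimator{YohaiZamarLoss}(); initial_scale=:mad)

scale(m_s)       # robust scale estimate minimized by the S-estimator
tauscale(m_tau)  # τ-scale minimized by the τ-estimator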
RobustModels.GeneralizedQuantileEstimator — Type
GeneralizedQuantileEstimator{L<:LossFunction} <: AbstractQuantileEstimator
Generalized Quantile Estimator: an M-estimator with an asymmetric loss function.
For L1Loss, this corresponds to quantile regression (although it is better to use quantreg for quantile regression because it gives the exact solution).
For L2Loss, this corresponds to expectile regression (see ExpectileEstimator).
Fields
- loss: the LossFunction.
- τ: the quantile value to estimate, between 0 and 1.
Properties
- tau, q and quantile are aliases for τ.
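A sketch of M-quantile/expectile fits using the quantile keyword documented for fit above; the zero-argument estimator constructors are assumed to default to τ = 0.5, and the loss choices are examples.
# Expectile regression (asymmetric L2 loss) at τ = 0.8.
m_exp = rlm(X, y, ExpectileEstimator(); quantile=0.8)

# Non-exact M-quantile fit with an asymmetric Huber-type loss at τ = 0.25.
m_mq = rlm(X, y, GeneralizedQuantileEstimator{HuberLoss}(); quantile=0.25)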
RobustModels.ExpectileEstimator — Type
The expectile estimator is a generalization of the L2 estimator for other quantiles τ ∈ [0, 1].
[1] Schnabel, Eilers - Computational Statistics and Data Analysis 53 (2009) 4168–4177 - Optimal expectile smoothing doi:10.1016/j.csda.2009.05.002
RobustModels.QuantileEstimator — Type
Non-exact quantile estimator, GeneralizedQuantileEstimator{L1Loss}. Prefer using QuantileRegression.
Loss functions
RobustModels.BoundedLossFunction — Type
Bounded loss function type for hard rejection of outliers.
RobustModels.L2Loss — Type
The (convex) L2 loss function is that of the standard least squares problem. ψ(r) = r
RobustModels.L1Loss — Type
The standard L1 loss function takes the absolute value of the residual and is convex but non-smooth. It is not a true L1 loss but a Huber loss with a very small tuning constant. ψ(r) = sign(r)
Use QuantileRegression for a correct implementation of the L1 loss.
RobustModels.HuberLoss — Type
The convex Huber loss function switches between a quadratic and a linear cost/loss function at a certain cutoff. ψ(r) = (abs(r) <= 1) ? r : sign(r)
RobustModels.L1L2Loss — Type
The convex L1-L2 loss interpolates smoothly between L2 behaviour for small residuals and L1 for outliers. ψ(r) = r / √(1 + r^2)
RobustModels.FairLoss — Type
The (convex) "fair" loss switches between a quadratic and a linear cost/loss function at a certain cutoff, and is C3 but non-analytic. ψ(r) = r / (1 + abs(r))
RobustModels.LogcoshLoss — Type
The convex Log-Cosh loss function. ψ(r) = tanh(r)
RobustModels.ArctanLoss — Type
The convex Arctan loss function. ψ(r) = atan(r)
RobustModels.CatoniWideLoss — Type
The convex (wide) Catoni loss function. See: "Catoni (2012) - Challenging the empirical mean and empirical variance: A deviation study".
ψ(r) = sign(r) * log(1 + abs(r) + r^2/2)
RobustModels.CatoniNarrowLoss — Type
The convex (narrow) Catoni loss function. See: "Catoni (2012) - Challenging the empirical mean and empirical variance: A deviation study".
ψ(r) = (abs(r) <= 1) ? -sign(r) * log(1 - abs(r) + r^2/2) : sign(r) * log(2)
RobustModels.CauchyLoss — Type
The non-convex Cauchy loss function switches from quadratic behaviour to logarithmic tails. It rejects outliers but may result in multiple minima. For the scale estimate, r·ψ(r) is used as the loss, which is the same as for the Geman loss. ψ(r) = r / (1 + r^2)
RobustModels.GemanLoss — Type
The non-convex Geman-McClure loss strongly suppresses outliers but does not guarantee a unique solution. For the S-estimator, it is equivalent to the Cauchy loss. ψ(r) = r / (1 + r^2)^2
RobustModels.WelschLoss — Type
The non-convex Welsch loss strongly suppresses outliers but does not guarantee a unique solution. ψ(r) = r * exp(-r^2)
RobustModels.TukeyLoss — Type
The non-convex Tukey biweight estimator completely suppresses outliers and does not guarantee a unique solution. ψ(r) = (abs(r) <= 1) ? r * (1 - r^2)^2 : 0
RobustModels.YohaiZamarLoss — Type
The non-convex (and bounded) optimal Yohai-Zamar loss function that minimizes the estimator bias. It was originally introduced in "Optimal locally robust M-estimates of regression" (1997) by Yohai and Zamar, with a slightly different formula.
RobustModels.HardThresholdLoss — Type
The non-convex hard-threshold loss function, or saturated L2 loss. Non-smooth. ψ(r) = (abs(r) <= 1) ? r : 0
RobustModels.HampelLoss — Type
The 3-parameter non-convex bounded Hampel's loss function.
ψ(r) = (abs(r) <= 1) ? r : ( (abs(r) <= l.ν1) ? sign(r) : ( (abs(r) <= l.ν2) ? (l.ν2 - abs(r)) / (l.ν2 - l.ν1) * sign(r) : 0))
Estimator and Loss functions methods
RobustModels.rho — Function
The loss function ρ for the M-estimator.
RobustModels.psi — Function
The influence function ψ is the derivative of the loss function for the M-estimator, multiplied by the square of the tuning constant.
RobustModels.psider — Function
The derivative of ψ, used for asymptotic estimates.
RobustModels.weight — Function
The weights for IRLS, the function ψ divided by r.
RobustModels.estimator_values — Function
Faster version if you need ρ, ψ and w in the same call.
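A sketch of evaluating these functions pointwise on a scaled residual; the (loss, r) method signatures are assumptions inferred from the docstrings above, and the calls are module-qualified in case the functions are not exported.
l = TukeyLoss()
r = 0.8   # a scaled residual r_i / σ̂

RobustModels.rho(l, r)       # ρ(r), the loss value
RobustModels.psi(l, r)       # ψ(r), the influence
RobustModels.psider(l, r)    # ψ'(r)
RobustModels.weight(l, r)    # ψ(r) / r, the IRLS weight

# ρ, ψ and w in a single call (assumed to return them together).
RobustModels.estimator_values(l, r)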
RobustModels.estimator_norm — Function
The integral of exp(-ρ), used for calculating the full log-likelihood.
RobustModels.estimator_bound — Function
The limit at ∞ of the loss function. Used for scale estimation of bounded losses.
RobustModels.tuning_constant — Function
The tuning constant of the loss function; it can be optimized to get efficient or robust estimates.
RobustModels.isconvex — Function
Boolean indicating whether the estimator or loss function is convex.
RobustModels.isbounded — Function
Boolean indicating whether the estimator or loss function is bounded.
RobustModels.estimator_high_breakdown_point_constant — Function
The tuning constant associated to the loss that gives a robust (high breakdown point) M-estimator.
RobustModels.estimator_high_efficiency_constant — Function
The tuning constant associated to the loss that gives an efficient M-estimator.
RobustModels.efficient_loss — Function
The loss initialized with an efficient tuning constant.
RobustModels.robust_loss — Function
The loss initialized with a robust (high breakdown point) tuning constant.
RobustModels.efficiency_tuning_constant — Function
The tuning constant c is computed so that the efficiency for normally distributed residuals is 0.95. The efficiency of the mean estimate μ is defined by:
eff_μ = (E[ψ'])² / E[ψ²]
RobustModels.mscale_loss — Function
mscale_loss(loss::L, x)
The ρ-function that is used for M-scale estimation.
For monotone (convex) functions, χ(r) = r·ψ(r)/c².
For bounded functions, χ(r) = ρ(r)/ρ(∞), so χ(∞) = 1.
RobustModels.breakdown_point_tuning_constant — Function
The M-estimate of scale is computed by solving:
\[\dfrac{1}{n} \sum_i \chi\left( \dfrac{r_i}{\hat{\sigma}}\right) = \delta\]
For monotone (convex) functions, χ(r) = r·ψ(r) and δ is defined by E[χ(r)] = δ for the Normal distribution N(0,1). For bounded functions, χ(r) = ρ(r)/ρ(∞) with χ(∞) = 1 and δ = E[χ]/χ(∞), with the expectation taken w.r.t. the Normal density.
The tuning constant c corresponding to a high breakdown point (0.5) is such that δ = 1/2, from (1/n) Σᵢ χ(rᵢ/σ̂) = δ.
RobustModels.scale_estimate — Function
scale_estimate(loss, res; σ0=1.0, wts=[], verbose=false,
               order=1, approx=false, nmax=30,
               rtol=1e-4, atol=0.1)
Compute the M-scale estimate from the loss function. If the loss is bounded, ρ is used as the function χ in the sum; otherwise r·ψ(r) is used, to coincide with the Maximum Likelihood Estimator. Also, for bounded estimators, because f(s) = 1/(nδ) Σ ρ(rᵢ/s) is decreasing, the iteration step does not use the weights but is multiplicative.
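A sketch using the signature above; res is a hypothetical residual vector, the keyword value is the documented default, and the call is module-qualified in case the function is not exported.
res = randn(100)    # hypothetical residuals
res[1:5] .*= 10     # with some outliers

# M-scale of the residuals under a bounded loss, starting from σ0 = 1.0.
σ_hat = RobustModels.scale_estimate(TukeyLoss(), res; σ0=1.0)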
RobustModels.tau_efficiency_tuning_constant — Function
tau_efficiency_tuning_constant(::Type{L1}, ::Type{L2}; eff::Real=0.95, c0::Real=1.0)
        where {L1<:BoundedLossFunction, L2<:BoundedLossFunction}
Compute the tuning constant that corresponds to a high efficiency for the τ-estimator.
RobustModels.estimator_tau_efficient_constant — Function
The tuning constant associated to the loss that gives an efficient τ-estimator.
RobustModels.loss — Function
The loss function used for the estimation.
RobustModels.set_SEstimator — Function
Set the MM-estimator to the S-estimation phase.
RobustModels.set_MEstimator — Function
Set the MM-estimator to the M-estimation phase.
RobustModels.update_weight! — Function
update_weight!(E::TauEstimator, res::AbstractArray{T}; wts::AbstractArray{T}=T[])
Update the weight between the two estimators of a τ-estimator using the scaled residuals.
RobustModels.tau_scale_estimate — Function
tau_scale_estimate!(E::TauEstimator, res::AbstractArray{T}, σ::Real, sqr::Bool=false;
                    wts::AbstractArray{T}=T[], bound::AbstractFloat=0.5) where {T<:AbstractFloat}
The τ-scale estimate, where σ is the scale estimate from the robust M-scale. If sqr is true, return the squared value.
RobustModels.quantile_weight — Function
quantile_weight(τ::Real, r::Real)
Wrapper function to compute the quantile-like loss function.