direpack.sprm.sprm.sprm

class sprm(n_components=1, eta=0.5, fun='Hampel', probp1=0.95, probp2=0.975, probp3=0.999, centre='median', scale='mad', verbose=True, maxit=100, tol=0.01, start_cutoff_mode='specific', start_X_init='pcapp', columns=False, copy=True)

SPRM Sparse Partial Robust M Regression

Algorithm first outlined in: Sparse partial robust M regression, Irene Hoffmann, Sven Serneels, Peter Filzmoser, Christophe Croux, Chemometrics and Intelligent Laboratory Systems, 149 (2015), 50-59.

Parameters

eta (float) – sparsity parameter in [0,1)
n_components (int) – number of components, min 1. When applied to data, n_components must take a value <= min(x_data.shape)
fun (str) – downweighting function. ‘Hampel’ (recommended), ‘Fair’ or ‘Huber’
probp1 (float) – probability cutoff for start of downweighting (e.g. 0.95)
probp2 (float) – probability cutoff for start of steep downweighting (e.g. 0.975, only relevant if fun=’Hampel’)
probp3 (float) – probability cutoff for start of outlier omission (e.g. 0.999, only relevant if fun=’Hampel’)
centre (str) – type of centring: ‘mean’, ‘median’, ‘l1median’ or ‘kstepLTS’. ‘kstepLTS’ is recommended statistically; if it is too slow, switch to ‘median’.
scale (str) – type of scaling: ‘std’, ‘mad’, ‘scaleTau2’ (recommended) or ‘None’
verbose (bool) – whether to run in verbose mode
maxit (int) – maximal number of iterations in M algorithm
tol (float) – tolerance for convergence in M algorithm
start_cutoff_mode (str) – ‘specific’ will set starting value cutoffs specific to X and y (preferred); any other value will set the X and y starting cutoffs identically. The latter yields identical results to the SPRM R implementation available from CRAN.
start_X_init (str) – ‘pcapp’ will include a PCA/broken stick projection to calculate the starting weights; any other value will calculate the X starting values based on the X matrix itself. The latter is less stable for very flat data (p >> n), yet yields identical results to the SPRM R implementation available from CRAN.
columns (bool, list, numpy array or pandas Index; default False) – if False, no column names are supplied; if True and the X data are supplied as a pandas data frame, column names are extracted from the frame (an error is thrown for other data input types); if a list, array or Index (only length x_data.shape[1] is accepted), the column names of x_data supplied in it will be printed in verbose mode.
copy (bool, default True) – whether to copy the data
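
To make the roles of probp1, probp2 and probp3 concrete, here is a minimal sketch of a Hampel-type weight function. It is an illustration, not the library's code: the helper name hampel_weight is hypothetical, and it assumes the three cutoffs are standard normal quantiles of the three probabilities and that the input is an already standardized distance.

```python
from statistics import NormalDist


def hampel_weight(r, probp1=0.95, probp2=0.975, probp3=0.999):
    """Hampel-type case weight for a standardized distance r (assumed form)."""
    # Assumption: cutoffs are standard normal quantiles of the probabilities.
    q1, q2, q3 = (NormalDist().inv_cdf(p) for p in (probp1, probp2, probp3))
    a = abs(r)
    if a <= q1:
        return 1.0  # below probp1 cutoff: no downweighting
    if a <= q2:
        return q1 / a  # between probp1 and probp2: mild downweighting
    if a <= q3:
        # between probp2 and probp3: steep, linear descent towards zero
        return q1 * (q3 - a) / ((q3 - q2) * a)
    return 0.0  # beyond probp3 cutoff: case is omitted


# One distance from each of the four regions.
weights = [hampel_weight(r) for r in (0.5, 1.8, 2.5, 4.0)]
print(weights)
```

The weight is continuous at each cutoff and reaches exactly zero at the probp3 cutoff, which is why probp3 is described as the start of outlier omission.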

Attributes always provided

x_weights_: X block PLS weighting vectors (usually denoted W)
x_loadings_: X block PLS loading vectors (usually denoted P)
C_: vector of inner relationship between response and latent variables
x_scores_: X block PLS score vectors (usually denoted T)
coef_: vector of regression coefficients
intercept_: intercept
coef_scaled_: vector of scaled regression coefficients (when scaling option used)
intercept_scaled_: scaled intercept
residuals_: vector of regression residuals
x_ev_: X block explained variance per component
y_ev_: y block explained variance
fitted_: fitted response
x_Rweights_: X block SIMPLS style weighting vectors (usually denoted R)
x_caseweights_: X block case weights
y_caseweights_: y block case weights
caseweights_: combined case weights
colret_: names of variables retained in the sparse model
x_loc_: X block location estimate
y_loc_: y location estimate
x_sca_: X block scale estimate
y_sca_: y scale estimate
non_zero_scale_vars_: indicator vector of variables in X with nonzero scale

__init__(n_components=1, eta=0.5, fun='Hampel', probp1=0.95, probp2=0.975, probp3=0.999, centre='median', scale='mad', verbose=True, maxit=100, tol=0.01, start_cutoff_mode='specific', start_X_init='pcapp', columns=False, copy=True)
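
As an illustration of the signature above, the sketch below builds two keyword sets: one using the preferred starting-value options, and one meant to mirror the CRAN R implementation by passing values other than 'specific' and 'pcapp'. The strings 'identical' and 'xonly' are arbitrary placeholders chosen here (the documentation only says "any other value"), and the import is guarded so the snippet still runs where direpack is not installed.

```python
# Keyword arguments taken from the documented signature.
preferred = dict(
    n_components=2,
    eta=0.5,
    fun="Hampel",
    probp1=0.95,
    probp2=0.975,
    probp3=0.999,
    centre="median",
    scale="mad",
    verbose=False,
    start_cutoff_mode="specific",
    start_X_init="pcapp",
)

# Placeholder strings: any value other than 'specific' / 'pcapp' is
# documented to select the behaviour matching the R implementation.
cran_like = dict(preferred, start_cutoff_mode="identical", start_X_init="xonly")

try:
    from direpack.sprm.sprm import sprm  # module path as given in the page title
except ImportError:
    sprm = None  # direpack not available; only the keyword sets are built

model = sprm(**cran_like) if sprm is not None else None
```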

Methods

`__init__`([n_components, eta, fun, probp1, ...])
`fit`(X, y)	Fit a SPRM model.
`fit_transform`(X[, y])	Fit to data, then transform it.
`get_params`([deep])	Get parameters for this estimator.
`predict`(Xn)	Predict using a SPRM model.
`score`(X, y[, sample_weight])	Return the coefficient of determination of the prediction.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(Xn)	Transform input data.
`valscore`(Xn, yn, scoring)	Specific score function for validation data
`weightnewx`(Xn)	Calculate case weights for new data based on the projection in the SPRM score space
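
The methods listed above can be combined roughly as follows. This is a hedged usage sketch on synthetic data, assuming NumPy array inputs are accepted by fit and predict; the data, the planted outliers and the variable names are invented for illustration, and the import is guarded so the snippet runs without direpack.

```python
import numpy as np

# Synthetic regression data: only the first two of eight variables matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
beta = np.array([2.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta + 0.1 * rng.normal(size=60)
y[:3] += 15.0  # plant three vertical outliers

try:
    from direpack.sprm.sprm import sprm  # module path as given in the page title
except ImportError:
    sprm = None  # direpack not available; skip the fitting step

if sprm is not None:
    model = sprm(n_components=2, eta=0.5, verbose=False)
    model.fit(X, y)             # fit(X, y)
    y_hat = model.predict(X)    # predict(Xn)
    print(model.coef_)          # regression coefficients
    print(model.caseweights_)   # combined case weights
```

If the fit behaves as the parameter descriptions suggest, the planted outliers should receive small case weights, but that depends on the data and settings and is not guaranteed by this sketch.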

Attributes