direpack.sprm.sprm.sprm
- class sprm(n_components=1, eta=0.5, fun='Hampel', probp1=0.95, probp2=0.975, probp3=0.999, centre='median', scale='mad', verbose=True, maxit=100, tol=0.01, start_cutoff_mode='specific', start_X_init='pcapp', columns=False, copy=True)
SPRM Sparse Partial Robust M Regression
- Algorithm first outlined in:
Sparse partial robust M regression, Irene Hoffmann, Sven Serneels, Peter Filzmoser, Christophe Croux, Chemometrics and Intelligent Laboratory Systems, 149 (2015), 50-59.
- Parameters
eta (float) – sparsity parameter in [0,1)
n_components (int) – number of components, at least 1. When the model is fitted to data, n_components must not exceed min(x_data.shape)
fun (str) – downweighting function. ‘Hampel’ (recommended), ‘Fair’ or ‘Huber’
probp1 (float) – probability cutoff for start of downweighting (e.g. 0.95)
probp2 (float) – probability cutoff for start of steep downweighting (e.g. 0.975, only relevant if fun=‘Hampel’)
probp3 (float) – probability cutoff for start of outlier omission (e.g. 0.999, only relevant if fun=‘Hampel’)
centre (str) – type of centring: ‘mean’, ‘median’, ‘l1median’ or ‘kstepLTS’. ‘kstepLTS’ is recommended statistically; if it is too slow, switch to ‘median’
scale (str) – type of scaling: ‘std’, ‘mad’, ‘scaleTau2’ (recommended) or ‘None’
verbose (bool) – whether to run in verbose mode
maxit (int) – maximal number of iterations in M algorithm
tol (float) – tolerance for convergence in M algorithm
start_cutoff_mode (str) – ‘specific’ sets starting value cutoffs specific to X and y (preferred); any other value sets the X and y starting cutoffs identically. The latter yields results identical to the SPRM R implementation available from CRAN.
start_X_init (str) – ‘pcapp’ includes a PCA/broken stick projection to calculate the starting weights; any other value calculates the X starting values from the X matrix itself. The latter is less stable for very flat data (p >> n), yet yields results identical to the SPRM R implementation available from CRAN.
columns (bool, list, numpy array or pandas Index; default False) – if False, no column names are supplied. If True and the X data are supplied as a pandas data frame, the column names are extracted from the frame; an error is thrown for other input types. If a list, array or Index (which must have length x_data.shape[1]), it supplies the column names of x_data. The column names are printed in verbose mode.
copy (bool, default True) – whether to copy data
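The three probability cutoffs split standardized residuals into four zones: full weight, mild downweighting, steep downweighting and omission. The sketch below illustrates a Hampel weight function on the assumption that the cutoffs are standard normal quantiles. The reference distribution direpack actually uses for the X and y blocks may differ, and `hampel_weights` is a hypothetical helper, not part of the package.

```python
from statistics import NormalDist


def hampel_weights(residuals, probp1=0.95, probp2=0.975, probp3=0.999):
    """Illustrative Hampel case weights for standardized residuals.

    Assumption: cutoffs are standard normal quantiles of probp1..probp3.
    """
    nd = NormalDist()
    c1, c2, c3 = (nd.inv_cdf(p) for p in (probp1, probp2, probp3))
    weights = []
    for r in residuals:
        a = abs(r)
        if a <= c1:
            w = 1.0                                  # no downweighting
        elif a <= c2:
            w = c1 / a                               # mild downweighting
        elif a <= c3:
            w = c1 * (c3 - a) / ((c3 - c2) * a)      # steep downweighting
        else:
            w = 0.0                                  # treated as an outlier
        weights.append(w)
    return weights


weights = hampel_weights([0.5, 1.8, 2.5, 4.0])
```

The weight function is continuous at each cutoff, so cases moving slightly across a boundary do not change weight abruptly.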
- Attributes always provided
x_weights_: X block PLS weighting vectors (usually denoted W)
x_loadings_: X block PLS loading vectors (usually denoted P)
C_: vector of inner relationship between response and latent variables
x_scores_: X block PLS score vectors (usually denoted T)
coef_: vector of regression coefficients
intercept_: intercept
coef_scaled_: vector of scaled regression coefficients (when scaling option used)
intercept_scaled_: scaled intercept
residuals_: vector of regression residuals
x_ev_: X block explained variance per component
y_ev_: y block explained variance
fitted_: fitted response
x_Rweights_: X block SIMPLS style weighting vectors (usually denoted R)
x_caseweights_: X block case weights
y_caseweights_: y block case weights
caseweights_: combined case weights
colret_: names of variables retained in the sparse model
x_loc_: X block location estimate
y_loc_: y location estimate
x_sca_: X block scale estimate
y_sca_: y scale estimate
non_zero_scale_vars_: indicator vector of variables in X with nonzero scale
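To illustrate what the location, scale and nonzero-scale attributes describe under the default settings (centre='median', scale='mad'), here is a minimal column-wise sketch. The helper `median_mad_columns` is hypothetical, and the consistency factor 1.4826 is an assumption about how the MAD is normalized; direpack's own estimators may differ in detail.

```python
from statistics import median


def median_mad_columns(rows, consistency=1.4826):
    """Column-wise median and MAD for a list of equally long rows.

    Assumption: the MAD is multiplied by 1.4826 so that it estimates the
    standard deviation for normally distributed data.
    """
    cols = list(zip(*rows))
    loc = [median(c) for c in cols]
    sca = [consistency * median(abs(v - m) for v in c)
           for c, m in zip(cols, loc)]
    # Columns with zero scale cannot be autoscaled and would be flagged.
    nonzero = [s > 0 for s in sca]
    return loc, sca, nonzero


rows = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0], [100.0, 5.0]]
loc, sca, nonzero = median_mad_columns(rows)
```

The gross value 100.0 in the first column barely affects the median or the MAD, which is the reason robust estimates are the defaults here. The constant second column has zero scale and is flagged accordingly.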
- __init__(n_components=1, eta=0.5, fun='Hampel', probp1=0.95, probp2=0.975, probp3=0.999, centre='median', scale='mad', verbose=True, maxit=100, tol=0.01, start_cutoff_mode='specific', start_X_init='pcapp', columns=False, copy=True)
Methods
__init__([n_components, eta, fun, probp1, ...])
fit(X, y): Fit a SPRM model.
fit_transform(X[, y]): Fit to data, then transform it.
get_params([deep]): Get parameters for this estimator.
predict(Xn): Predict using a SPRM model.
score(X, y[, sample_weight]): Return the coefficient of determination of the prediction.
set_params(**params): Set the parameters of this estimator.
transform(Xn): Transform input data.
valscore(Xn, yn, scoring): Specific score function for validation data.
weightnewx(Xn): Calculate case weights for new data based on the projection in the SPRM score space.
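A possible way to combine the constructor, fit and the fitted attributes is sketched below. The import path follows the module name at the top of this page. The helpers `make_contaminated_data` and `fit_sprm_demo` and the simulated data are hypothetical and only illustrative, and the fitting step has not been verified against any particular direpack version.

```python
import random


def make_contaminated_data(n=60, p=5, n_outliers=5, seed=0):
    """Simulate a linear response with a few gross outliers in y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # Only the first two predictors carry signal; the rest are noise.
    y = [2.0 * row[0] - 1.0 * row[1] + rng.gauss(0.0, 0.1) for row in X]
    for i in range(n_outliers):
        y[i] += 50.0  # vertical outliers in the first rows
    return X, y


def fit_sprm_demo(X, y):
    """Fit SPRM if direpack is importable; otherwise return None."""
    try:
        import numpy as np
        from direpack.sprm.sprm import sprm
    except ImportError:
        return None
    # Keyword names are taken from the signature documented above.
    model = sprm(n_components=2, eta=0.5, fun='Hampel', verbose=False)
    model.fit(np.asarray(X, dtype=float), np.asarray(y, dtype=float))
    return model.coef_, model.caseweights_


X, y = make_contaminated_data()
```

With direpack installed, `fit_sprm_demo(X, y)` would be expected to return the regression coefficients together with combined case weights, where the first five rows should receive low weights if the contamination is detected.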
Attributes