Outlier in linear and non-linear PLS regression: an application in environmental field

Ida Camminatiello; Rosaria Lombardo
2015

Abstract

In the literature, the problem of detecting “outlying” observations in regression models whose predictors may be affected by multicollinearity has been addressed with robust procedures for Partial Least Squares (PLS) regression (Hubert et al., 2003; Serneels et al., 2005; Camminatiello, 2008). The basic approaches to outlier identification are distance-based and projection pursuit methods (Filzmoser et al., 2008). In this paper we explore several robust alternatives to PLS regression, with their advantages and disadvantages, and present a simulation plan that allows a clear comparison among the methods. The aim is to propose a robust approach that questions whether an observation is really an outlier, since its assessment can depend on 1) the appropriateness of the model formulated and 2) the suitability of the parametric tests considered. Among the approaches proposed in the literature for incorporating non-linear features into the linear PLS regression model, we consider PLS regression via spline transformation of the predictor variables (Durand, 2001). In addition, we discuss non-normal and non-parametric alternatives (Baringhaus and Franz, 2004) to the usual tests based on the squared Mahalanobis distance and Hotelling's T² (Mardia, 1975). After discussing the simulation plan, we illustrate the usefulness of the robust PLS methods with an analysis of environmental data.
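
As a rough illustration of the distance-based idea mentioned above, the following sketch flags bivariate observations whose squared Mahalanobis distance exceeds a chi-square cutoff. It is not the authors' procedure: the function names are hypothetical, the data are made up, and it assumes the classical sample mean and covariance together with approximate normality. Those classical estimates are themselves sensitive to outliers, which is the kind of limitation that robust and non-parametric alternatives aim to address.

```python
import math


def squared_mahalanobis_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean,
    computed with the classical (non-robust) sample covariance matrix."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Entries of the 2x2 sample covariance matrix (divisor n - 1).
    s_xx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    s_yy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)

    det = s_xx * s_yy - s_xy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular (collinear points)")

    distances = []
    for x, y in points:
        dx, dy = x - mean_x, y - mean_y
        # d^2 = (dx, dy) S^{-1} (dx, dy)^T with the explicit 2x2 inverse.
        distances.append((s_yy * dx * dx - 2 * s_xy * dx * dy + s_xx * dy * dy) / det)
    return distances


def chi2_quantile_2df(prob):
    """Quantile of the chi-square distribution with 2 degrees of freedom.
    For 2 degrees of freedom the quantile has the closed form -2 * ln(1 - prob)."""
    return -2.0 * math.log(1.0 - prob)


def flag_outliers(points, level=0.975):
    """Return one boolean per point: True if its squared distance exceeds the cutoff.
    The 0.975 level is a common convention, assumed here for illustration."""
    cutoff = chi2_quantile_2df(level)
    return [d2 > cutoff for d2 in squared_mahalanobis_2d(points)]


if __name__ == "__main__":
    # Made-up data: a small grid of regular points plus one distant observation.
    sample = [(float(i), float(j)) for i in range(5) for j in range(4)]
    sample.append((100.0, 100.0))
    flags = flag_outliers(sample)
    print([idx for idx, is_out in enumerate(flags) if is_out])
```

The closed-form cutoff holds only for two variables; with more variables a chi-square quantile function from a statistics library would be needed.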
ISBN: 978-88-88793-85-6

Use this identifier to cite or link to this document: https://hdl.handle.net/11591/383780