
Thursday, June 9, 2016

PLS-SEM performance with non-normal data

Many claims have been made in the past about the advantages of structural equation modeling employing the partial least squares method (PLS-SEM). While some of these claims may have been exaggerated, we continue to find that others are not. One such claim that is not an exaggeration is that PLS-SEM is robust to deviations from normality. In other words, PLS-SEM performs quite well with non-normal data.

The article below illustrates this advantage of PLS-SEM. Its reference, abstract, and link to full text follow.

Kock, N. (2016). Non-normality propagation among latent variables and indicators in PLS-SEM simulations. Journal of Modern Applied Statistical Methods, 15(1), 299-315.

Structural equation modeling employing the partial least squares method (PLS-SEM) has been extensively used in business research. Often the use of this method is justified based on claims about its unique performance with small samples and non-normal data, which call for performance analyses. How normal and non-normal data are created for the performance analyses are examined. A method is proposed for the generation of data for exogenous latent variables and errors directly, from which data for endogenous latent variables and indicators are subsequently obtained based on model parameters. The emphasis is on the issue of non-normality propagation among latent variables and indicators, showing that this propagation can be severely impaired if certain steps are not taken. A key step is inducing non-normality in structural and indicator errors, in addition to exogenous latent variables. Illustrations of the method and its steps are provided through simulations based on a simple model of the effect of e-collaboration technology use on job performance.

The article’s main goal is to discuss a method for creating non-normal data in which the data creator has full access to all data elements, including factor or composite scores and all error terms, and in which severe non-normality is extended to the error terms. In the process, the article demonstrates that PLS-SEM is very robust to severe deviations from normality, even when these deviations apply to all error terms. Non-normality in the error terms is an issue that is often glossed over in PLS-SEM performance tests with non-normal data.
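To give a rough feel for why the error terms matter, here is a minimal Python sketch of the general idea described in the abstract: create the exogenous latent variable and the errors directly, then derive the endogenous latent variable and an indicator from model parameters. This is not the article's code. The standardized chi-squared distribution, the path coefficient (0.4), the loading (0.7), the sample size, and all function and variable names (ect for e-collaboration technology use, jp for job performance) are assumptions made only for illustration.

```python
import math
import random


def standardized_chi2(n, rng):
    """Draw n chi-squared(1) values rescaled to mean 0 and variance 1 (strongly non-normal)."""
    return [(rng.gauss(0.0, 1.0) ** 2 - 1.0) / math.sqrt(2.0) for _ in range(n)]


def standard_normal(n, rng):
    """Draw n standard normal values."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def central_moment(values, order):
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def weighted_sum(weight, signal, noise):
    """Return weight*signal + sqrt(1 - weight^2)*noise.

    With independent, unit-variance inputs the result also has unit variance.
    """
    residual = math.sqrt(1.0 - weight ** 2)
    return [weight * s + residual * e for s, e in zip(signal, noise)]


def simulate(error_draw, n=20000, beta=0.4, loading=0.7, seed=1):
    """Simulate ect -> jp plus one jp indicator; error_draw sets the error distribution."""
    rng = random.Random(seed)
    ect = standardized_chi2(n, rng)                    # exogenous LV, always non-normal
    jp = weighted_sum(beta, ect, error_draw(n, rng))   # endogenous LV = path effect + structural error
    jp_indicator = weighted_sum(loading, jp, error_draw(n, rng))  # indicator = loading effect + indicator error
    return {"ect": ect, "jp": jp, "jp_indicator": jp_indicator}


normal_errors = simulate(standard_normal)
nonnormal_errors = simulate(standardized_chi2)

for label, data in (("normal errors", normal_errors), ("non-normal errors", nonnormal_errors)):
    for name, values in data.items():
        print(f"{label:18s} {name:13s} skewness={skewness(values):6.2f} "
              f"excess kurtosis={excess_kurtosis(values):6.2f}")
```

In this sketch, when the errors are normal, most of the non-normality in the exogenous latent variable is washed out by the time it reaches the endogenous latent variable and its indicator. When the errors are also non-normal, the skewness and excess kurtosis remain large throughout the model.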

Readers may also find the YouTube video linked below useful in the context of this discussion.

View Skewness and Kurtosis in WarpPLS



Anonymous said...

Hi Ned,

Thanks for this post. I was wondering if you have any thoughts on the performance or suitability of PLS-SEM when the data include many zero values, or are, effectively, 'zero-inflated'.


Ned Kock said...

Hi Anon. Zero-inflated data sets tend to lead to at least two issues: (a) the data tends to be non-normally distributed (e.g., zero-inflated Poisson distributions); and (b) nonlinear relationships tend to occur, particularly relationships of the “Warp2” type (see the most recent version of the WarpPLS User Manual, linked below). WarpPLS addresses both of these issues.

Anonymous said...

Excellent. Thank you!