Tuesday, June 28, 2011

WarpPLS’ treatment of formative latent variables: PLS regression is more conservative and stable

WarpPLS uses what is often referred to as Wold’s original “PLS regression” algorithm to calculate indicator weights, for both formative and reflective latent variables. PLS regression differs slightly from the modified versions often referred to as modes A and B, which are the ones normally used in other publicly available PLS-based structural equation modeling software. These modified versions implement an underlying algorithmic assumption that Lohmöller called the "good neighbor" assumption, whereby indicator weights are influenced by inner model links.

Generally speaking, the PLS regression algorithm generates coefficients that are more stable and robust – i.e., reliable for hypothesis testing. It also tends to minimize collinearity. On the other hand, it may lead to a higher demand for computational power in some cases, which may be the reason why the modified versions were implemented. Lohmöller discusses multiple algorithm versions, with some characteristics placing them within broad types called “modes” – see Lohmöller (1989), the PLS "bible", for more details. Personal computers were not that powerful in the 1980s.

Moreover, the type of nonlinear treatment employed by WarpPLS is difficult to perform with Lohmöller’s underlying algorithm (the "good neighbor" assumption), whereby the outer model is influenced by the inner model. The problem is that with Lohmöller’s algorithm, as a model changes, the weights and loadings also change, even if the latent variables do not change. That is, with Lohmöller’s algorithm, two models with the same latent variables but different structures (i.e., links among latent variables) will have different weights and loadings.

The weights of formative latent variables will be essentially the same in WarpPLS as they would be if the variables were defined as reflective. That is, they will be obtained by an iterative algorithm that stops when two conditions are met: (a) the weights between indicators and latent variable are standardized partial regression coefficients calculated with the indicators as independent variables and the latent variable as the dependent variable; and (b) the regression equation expressing the latent variable as a combination of the indicators has an error term of zero.
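The two stopping conditions can be illustrated with a simplified sketch in Python with NumPy. This is not WarpPLS’s actual implementation; the function name and starting weights are our own assumptions, chosen only to show how conditions (a) and (b) interact:

```python
import numpy as np

def estimate_weights(X, n_iter=100, tol=1e-8):
    """Simplified sketch of the iterative weight estimation described
    above; NOT WarpPLS's actual implementation.
    X: (n, k) matrix of centered (mean-zero) indicators."""
    n, k = X.shape
    w = np.ones(k) / k                      # equal starting weights (assumption)
    for _ in range(n_iter):
        lv = X @ w                          # latent variable scores
        lv = (lv - lv.mean()) / lv.std()    # standardize the LV
        # (a) weights = regression coefficients calculated with the
        # indicators as independent variables and the LV as dependent
        w_new, *_ = np.linalg.lstsq(X, lv, rcond=None)
        if np.max(np.abs(w_new - w)) < tol:
            return w_new                    # converged
        w = w_new
    return w
```

Because the latent variable is itself a combination of the indicators, the regression in step (a) reproduces it exactly, which is condition (b): the error term of the regression equation is zero.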

So why should the user define a latent variable as formative or reflective? The reason lies in the interpretation of the outputs generated by the software. When a latent variable is formative, both the P values for the weights and the variance inflation factors for the indicators should be generally low; ideally below 0.05 and 2.5, respectively.

True formative variables are fundamentally different from true reflective variables, although there are cases that can be seen as “in between” formative and reflective. True formative and reflective variables behave differently, whether the software treats them differently or not. For example, with true formative variables you would expect the indicators to be significantly associated with the scores of their respective latent variable, which is indicated by low P values for their weights. However, you would not normally expect the indicators to be redundant, which is indicated by low variance inflation factors for the indicators.
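The variance inflation factor criterion mentioned above can be computed from the indicator data alone. Below is a minimal sketch (the function name is ours for illustration, not a WarpPLS API); each indicator’s VIF is 1 / (1 − R²), where R² comes from regressing that indicator on the remaining ones:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column (indicator) of X.
    Illustrative helper, not a WarpPLS function."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize indicators
    k = Z.shape[1]
    out = []
    for j in range(k):
        y = Z[:, j]
        W = np.delete(Z, j, axis=1)            # all other indicators
        beta, *_ = np.linalg.lstsq(W, y, rcond=None)
        resid = y - W @ beta
        r2 = 1.0 - resid.var() / y.var()       # R^2 of indicator j
        out.append(1.0 / (1.0 - r2))
    return np.array(out)
```

For a formative latent variable, the values returned here would ideally all fall below the 2.5 threshold noted above.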

The way formative variables are treated in Lohmöller’s approach leads to unstable weights, with the signs of weights frequently changing in the resample set. See Temme et al. (2006) for a discussion of this phenomenon. Lohmöller’s approach also leads to “lateral” collinearity, or collinearity between predictor and criterion latent variables. This “stealth” type of collinearity often leads to inflated path coefficients for links involving formative latent variables.
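The sign instability noted above can be checked empirically with a simple bootstrap diagnostic. The sketch below is illustrative only: the toy estimator correlates each indicator with a unit-weighted composite, and is not Lohmöller’s algorithm or any WarpPLS routine.

```python
import numpy as np

def corr_weights(X):
    """Toy weight estimator: correlation of each standardized
    indicator with the unit-weighted composite (illustration only)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    comp = Z.mean(axis=1)
    comp = (comp - comp.mean()) / comp.std()
    return Z.T @ comp / len(comp)

def sign_stability(X, estimator, n_boot=500, seed=0):
    """Fraction of bootstrap resamples in which each weight keeps the
    sign of its full-sample estimate; values near 1 mean stable signs."""
    rng = np.random.default_rng(seed)
    signs = np.sign(estimator(X))              # full-sample signs
    n = X.shape[0]
    agree = np.zeros(X.shape[1])
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)       # resample rows with replacement
        agree += (np.sign(estimator(X[idx])) == signs)
    return agree / n_boot
```

Frequent sign flips across resamples (stability well below 1) would be a symptom of the kind of instability Temme et al. (2006) discuss.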

Formative variables don't "become reflective", or vice-versa, if one or another algorithm is used. This is a common misconception among users of PLS-based SEM software.


Lohmöller, J.-B. (1989). Latent variable path modeling with partial least squares. Heidelberg, Germany: Physica-Verlag.

Temme, D., Kreis, H., & Hildebrandt, L. (2006). PLS path modeling – A software review. Berlin, Germany: Institute of Marketing, Humboldt University Berlin.


Anonymous said...

In your article (Kock and Mayfield, 2015, p.125), you said that "this is why PLS regression does not have a 'formative mode' per se". Does it mean that PLS regression algorithms "read" all variables in a model as reflective variables? If so, should we simply examine the measurement model using the reflective measurement model criteria if PLS regression is used and if there are both formative and reflective variables in the model? Your comments on these questions are much appreciated.

Ned Kock said...

Hi Anon. In WarpPLS the PLS Regression algorithm (discussed in the article) treats reflective and formative LVs in the same way. This does not mean that formative LVs become reflective, or vice-versa. That is, indicators are still non-redundant with formative LVs, and redundant with reflective LVs, even when the PLS Regression algorithm is used. Therefore, different measurement assessment criteria should be used for formative and reflective LVs. As you may know, PLS Regression is only one of the component-based algorithms implemented by WarpPLS. I hope that the materials linked below can be of use in connection with this.

User Manual (links to specific pages):

The formative-reflective measurement dichotomy

The links above, as well as other links that may be relevant in this context, are available from:

Anonymous said...

Thanks so much for your reply.