Saturday, January 23, 2010

How is the warping done in WarpPLS?


WarpPLS does linear and nonlinear analyses. That is, users can set WarpPLS to estimate parameters based on a standard linear algorithm, and without any warping. They can also choose one of two nonlinear algorithms, thus taking advantage of the warping capabilities of the software.

In nonlinear analyses, what WarpPLS does is relatively simple at a conceptual level. It identifies a set of functions F1(LVp1), F2(LVp2) … that relate blocks of latent variable predictors (LVp1, LVp2 ...) to a criterion latent variable (LVc) in this way:

LVc = p1*F1(LVp1) + p2*F2(LVp2) + … + E.

In the equation above, p1, p2 ... are path coefficients, and E is the error term of the equation. All variables are standardized. Any model can be decomposed into a set of blocks relating latent variable predictors and criteria in this way.

In the Warp2 mode, the functions F1(LVp1), F2(LVp2) ... take the form of U curves (also known as J curves); defaulting to lines, if the relationships are linear. The term "U curve" is used here for simplicity, as noncyclical nonlinear relationships (e.g., exponential growth) can be represented through sections of straight or rotated U curves; the term "S curve" is also used here for simplicity.

In the Warp3 mode, the functions F1(LVp1), F2(LVp2) ... take the form of S curves; defaulting to U curves or lines, if the relationships follow U-curve patterns or are linear, respectively.

S curves are curves whose first derivative is a U curve. Similarly, U curves are curves whose first derivative is a line. U curves seem to be the most commonly found in natural and behavioral phenomena. S curves are also found, but apparently not as frequently as U curves.

U curves can be used to model most of the commonly seen functions in natural and behavioral studies, such as logarithmic, exponential, and hyperbolic decay functions. For these common types of functions, S-curve approximations will usually default to U curves.

Other types of curves beyond S curves might be found in specific types of situations, and require specialized analysis methods that are typically outside the scope of structural equation modeling. Examples are time series and Fourier analyses. Therefore these are beyond the scope of application of WarpPLS.

Typically, the more the functions F1(LVp1), F2(LVp2) ... look like curves, and unlike lines, the greater is the difference between the path coefficients p1, p2 ... and those that would have been obtained through a strictly linear analysis.

So, what WarpPLS does is not unlike what a researcher would do if he or she modified predictor latent variables prior to the calculation of path coefficients using a function like the logarithmic function. For example, as in the equation below, where a log transformation is applied to LVp1.

LVc = p1*log(LVp1) + p2*LVp2 + … + E.

However, WarpPLS does that automatically, and for a much wider range of functions, since a fairly wide range of functions can be modeled as U or S curves. Exceptions are complex trigonometric functions, where the dataset comprises many cycles. These require different methods to be properly modeled, such as the Fourier analyses methods mentioned above, and are usually outside the scope of structural equation modeling (SEM; which is the analysis method that WarpPLS automates).

Often the path coefficients p1, p2 ... will go up in value due to warped analysis, but that may not always be the case. Given the nature of multivariate analysis, an increase in a path coefficient may lead to a decrease in a path coefficient for an arrow pointing at the same criterion latent variable, because each path coefficient in a block is calculated in a way that controls for the effects of the other predictor latent variables.

3 comments:

Anonymous said...

How does WarpPLS calculate the original, unadjusted (or untransformed) values of the latent variables? .... that is, before any sort of linear or non-linear predictive model is applied between the latent variables?

Ned Kock said...

Hi Anon.

The answer is: PLS regression, up to version 3.0.

PLS modes A and B, and variations, will be available in version 4.0.

Still, PLS regression will remain the default, as it tends to minimize collinearity among LVs.

Ned

Anonymous said...

Thank you for responding. WarpPLS really is a fantastic tool, very innovative. I could be mistaken, but I believe that if you use the original PLS algorithm (weighting from the outer via regression and from the inner via either centroid, factor, or path weighting), then the values that you determine for the LV scores will be 'biased' in a 'linear fashion' (but certainly good enough to test non-linear relationships as you have demonstrated). What other approaches are possible to estimate the LV scores, I wonder?