Sunday, February 21, 2010

Nonlinearity and type I and II errors in SEM analysis


Many relationships between variables studied in the natural and behavioral sciences seem to be nonlinear, often following a J-curve pattern (a.k.a. U-curve pattern). Other common relationships include the logarithmic, hyperbolic decay, exponential decay, and exponential. These and other relationships are modeled by WarpPLS.

Yet, the vast majority of statistical analysis methods used in the natural and behavioral sciences, from simple correlation analysis to structural equation modeling, assume relationships to be linear in the estimation of coefficients of association (e.g., Pearson correlations, standardized partial regression coefficients).

This may significantly distort results, especially in multivariate analyses, increasing the likelihood that researchers will commit both type I and type II errors in the same study. In SEM analysis, a type I error occurs when an association that does not actually exist is estimated as being statistically significant (i.e., a “false positive”); a type II error occurs when an association that does exist is estimated as being non-significant (i.e., an existing association is “missed”).

The figure below shows a distribution of points typical of a J-curve pattern involving two variables, disrupted by uncorrelated error. The pattern, however, is modeled as a linear relationship. The line passing through the points is the best linear approximation of the distribution of points. It yields a correlation coefficient of .582. In this situation, the variable on the horizontal axis explains 33.9 percent of the variance of the variable on the vertical axis.


The figure below shows the same J-curve scatter plot pattern, but this time modeled as a nonlinear relationship. The curve passing through the points is the best nonlinear approximation of the distribution of the underlying J-curve, and excludes the uncorrelated error. That is, the curve does not attempt to model the uncorrelated error, only the underlying nonlinear relationship. It yields a correlation coefficient of .983. Here the variable on the horizontal axis explains 96.7 percent of the variance of the variable on the vertical axis.
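The contrast between the two fits can be reproduced in a few lines of Python. This is only a rough sketch, not the WarpPLS algorithm: it assumes simulated J-curve data (the range of x, the noise level, and the seed are arbitrary choices) and uses an ordinary least-squares quadratic as a stand-in for the warping function. The helper names `pearson` and `fit_quadratic` are made up for this illustration, and the numbers it prints will not match the .582 and .983 reported above, which come from a different data set.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def fit_quadratic(x, y):
    """Least-squares coefficients [b0, b1, b2] for y ~ b0 + b1*x + b2*x^2."""
    # Build the 3x3 normal equations from power sums of x.
    s = [sum(v ** k for v in x) for k in range(5)]
    rhs = [sum((v ** i) * w for v, w in zip(x, y)) for i in range(3)]
    m = [[s[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= f * m[col][k]
    # Back substitution.
    b = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        b[i] = (m[i][3] - sum(m[i][j] * b[j] for j in range(i + 1, 3))) / m[i][i]
    return b


# Assumed data: an asymmetric U (J-curve) plus uncorrelated error.
random.seed(42)
n = 200
x = [-1.0 + 2.5 * i / (n - 1) for i in range(n)]
y = [v ** 2 + random.gauss(0, 0.1) for v in x]

# Linear model: correlation between the raw x and y.
r_linear = pearson(x, y)

# Nonlinear model: "warp" x through the fitted curve, then correlate with y.
b0, b1, b2 = fit_quadratic(x, y)
x_warped = [b0 + b1 * v + b2 * v * v for v in x]
r_curve = pearson(x_warped, y)

print(f"linear:    r = {r_linear:.3f}, variance explained = {r_linear ** 2:.1%}")
print(f"nonlinear: r = {r_curve:.3f}, variance explained = {r_curve ** 2:.1%}")
```

In both cases the percentage of variance explained is simply the square of the correlation coefficient, which is how a correlation of .582 translates into 33.9 percent.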


WarpPLS transforms (or “warps”) J-curve relationship patterns like the one above BEFORE the corresponding path coefficients between each pair of variables are calculated. It does the same for many other nonlinear relationship patterns. In multivariate analyses, this may significantly change the values of the path coefficients, reducing the risk that researchers will commit type I and II errors.

The risk of committing type I and II errors is particularly high when: (a) a block of latent variables includes multiple predictor variables pointing at the same criterion variable; (b) one or more relationships between latent variables are significantly nonlinear; and (c) the predictor latent variables are correlated, even if they clearly measure different constructs (as suggested by low variance inflation factors).
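A small simulation can illustrate how these three conditions combine. The sketch below is a contrived example under stated assumptions, not output from WarpPLS: the criterion y depends only on x1, through a J-curve; the second predictor x2 has no direct effect on y but is correlated with x1; and the "warping" of x1 uses the known function x1 squared, whereas WarpPLS would estimate the function from the data. The standardized coefficients come from the textbook two-predictor formula, and the helper names `pearson` and `std_betas` are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def std_betas(r_y1, r_y2, r_12):
    """Standardized partial regression coefficients for two predictors."""
    denom = 1.0 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom


# Assumed data-generating process (all values are arbitrary choices).
random.seed(7)
n = 1000
x1 = [-1.0 + 2.5 * i / (n - 1) for i in range(n)]
x2 = [v ** 2 + random.gauss(0, 0.3) for v in x1]  # correlated with x1
y = [v ** 2 + random.gauss(0, 0.3) for v in x1]   # depends on x1 only

# Linear analysis: both predictors enter as they are.
r_12 = pearson(x1, x2)
b1_lin, b2_lin = std_betas(pearson(x1, y), pearson(x2, y), r_12)
vif_lin = 1.0 / (1.0 - r_12 ** 2)  # variance inflation factor, two predictors

# Warped analysis: x1 is transformed before the coefficients are calculated.
w1 = [v ** 2 for v in x1]
b1_warp, b2_warp = std_betas(pearson(w1, y), pearson(x2, y), pearson(w1, x2))

print(f"linear: beta(x1) = {b1_lin:.3f}, beta(x2) = {b2_lin:.3f}, VIF = {vif_lin:.2f}")
print(f"warped: beta(x1) = {b1_warp:.3f}, beta(x2) = {b2_warp:.3f}")
```

Under these assumptions, the linear analysis assigns a sizable coefficient to x2 (a likely false positive) and a small one to x1 (a likely miss), even though the variance inflation factor is low. Once x1 is warped, the coefficient for x2 drops to near zero and most of the effect is attributed to x1.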
