Friday, May 11, 2012

Simpson’s paradox and unexpected results


The algorithms used in version 3.0 and later versions of WarpPLS have been revised so as to pick up instances of what is known as “Simpson’s paradox”. As a result, there may be changes in some coefficients and P values, when compared with previous versions.

If instances of Simpson’s paradox are present in the model, the P value of the ARS (average R-squared) fit index will often go up.

Simpson’s paradox is characterized by the path coefficient and correlation for a pair of variables having different signs. In this situation, the contribution of a predictor variable to the explained variance of the criterion variable in a latent variable block is negative.

In other words, if the predictor latent variable were to be removed from the block, the R-squared for the criterion latent variable would go up. A similar effect would be observed if the direction of causality were reversed.
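The sign mismatch described above can be illustrated with a small numerical sketch. It assumes a purely linear block with standardized variables, where the R-squared equals the sum, over predictors, of path coefficient times correlation with the criterion. The correlation values are hypothetical, chosen only for illustration, and the code is not WarpPLS's own implementation.

```python
import numpy as np

# Hypothetical correlations (illustrative values, not from a real dataset).
R_xx = np.array([[1.0, 0.8],
                 [0.8, 1.0]])   # correlations between the two predictor LVs
r_xy = np.array([0.5, 0.2])     # correlation of each predictor with the criterion LV

# Standardized path coefficients for a linear block: solve R_xx @ beta = r_xy.
beta = np.linalg.solve(R_xx, r_xy)

# Contribution of each predictor to R-squared is beta_j * r_j;
# the contributions sum to the block's R-squared.
contrib = beta * r_xy
r_squared = contrib.sum()

# Simpson's paradox instance: path coefficient and correlation differ in sign.
paradox = np.sign(beta) != np.sign(r_xy)

for j in range(len(beta)):
    print(f"predictor {j + 1}: beta={beta[j]:+.3f}, r={r_xy[j]:+.3f}, "
          f"contribution={contrib[j]:+.3f}, paradox={paradox[j]}")
print(f"R-squared = {r_squared:.3f}")
```

Here the second predictor correlates positively with the criterion (0.2) but receives a negative path coefficient (about -0.56), so its term in the R-squared sum is negative (about -0.11). That is the pattern flagged as an instance of Simpson’s paradox.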

One widely held interpretation is that Simpson’s paradox could be an indication that the direction of a hypothesized relationship is reversed, or that the hypothesized relationship is nonsensical/improbable.

In the context of WarpPLS analyses, this is more likely to occur when nonlinear algorithms are used and/or full collinearity VIFs are high, but may also occur under other conditions.


2 comments:

Jose Luis said...

Dr. Kock

Simpson's paradox has to do also when you have opposite results with aggregated and disaggregated data...right?

Ned Kock said...

Hi Jose Luis. That is generally the context in which the paradox is discussed, often implying the use of nominal scales, but it can also take place in contexts where variables are measured via ratio scales.

In WarpPLS, the paradox is characterized by a path coefficient and the correlation between a pair of LVs having different signs.