Thursday, January 28, 2010

Bootstrapping or jackknifing (or both) in WarpPLS?

Arguably, jackknifing does a better job at addressing problems associated with the presence of outliers due to errors in data collection. Generally speaking, jackknifing tends to generate more stable resample path coefficients (and thus more reliable P values) with small sample sizes (lower than 100), and with samples containing outliers. With jackknifing, an outlier data point cannot appear more than once in any given resample, whereas a bootstrap resample, which is drawn with replacement, may contain the same outlier several times. This accounts for the better performance of jackknifing in these cases (see, e.g., Chiquoine & Hjalmarsson, 2009).

Bootstrapping tends to generate more stable resample path coefficients (and thus more reliable P values) with larger samples and with samples where the data points are evenly distributed on a scatter plot. The use of bootstrapping with small sample sizes (lower than 100) has been discouraged (Nevitt & Hancock, 2001).
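The difference between the two resampling schemes can be sketched in a few lines of Python. This is only an illustration, not WarpPLS's actual algorithm: it assumes a model with a single predictor, so that the standardized path coefficient reduces to a Pearson correlation, and the function names, the simulated data, and the injected outlier are all made up for the example.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation; stands in for a one-predictor path coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def jackknife_se(xs, ys):
    """Leave-one-out jackknife: each resample drops exactly one case,
    so no case (outlier or not) can appear twice in a resample."""
    n = len(xs)
    estimates = [
        pearson_r(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]) for i in range(n)
    ]
    mean = sum(estimates) / n
    return math.sqrt((n - 1) / n * sum((e - mean) ** 2 for e in estimates))


def bootstrap_se(xs, ys, n_resamples=500, seed=0):
    """Bootstrap: each resample draws n cases WITH replacement,
    so the same outlier may be drawn several times."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(pearson_r([xs[i] for i in idx], [ys[i] for i in idx]))
    mean = sum(estimates) / len(estimates)
    return math.sqrt(
        sum((e - mean) ** 2 for e in estimates) / (len(estimates) - 1)
    )


# Simulated small sample (n = 30) with one artificial outlier appended.
rng = random.Random(42)
xs = [rng.gauss(0, 1) for _ in range(29)]
ys = [0.5 * x + rng.gauss(0, 1) for x in xs]
xs.append(6.0)   # hypothetical data-entry error
ys.append(-6.0)

print("coefficient :", pearson_r(xs, ys))
print("jackknife SE:", jackknife_se(xs, ys))
print("bootstrap SE:", bootstrap_se(xs, ys))
```

The standard errors produced this way are what a P value for the coefficient would be based on; comparing the two for a given sample gives a feel for how differently the methods react to an outlier.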

Since the warping algorithms are also sensitive to the presence of outliers, in many cases it is a good idea to estimate P values with both bootstrapping and jackknifing, and use the P values associated with the most stable coefficients. An indication of instability is a high P value (i.e., statistically non-significant) associated with path coefficients that could be reasonably expected to have low P values. For example, with a sample size of 100, a path coefficient of .2 could be reasonably expected to yield a P value that is statistically significant at the .05 level. If that is not the case, there may be a stability problem. Another indication of instability is a marked difference between the P values estimated through bootstrapping and jackknifing.
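As a rough check on the example above, assume (this is a simplification, not something WarpPLS does) that the path has a single predictor, so that the standardized coefficient equals a Pearson correlation r and can be tested with a conventional parametric t test instead of resampling. With r = .2 and n = 100:

$$
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}} = \frac{0.2\,\sqrt{98}}{\sqrt{0.96}} \approx 2.02
$$

The two-tailed critical value of t at the .05 level with 98 degrees of freedom is about 1.98, so under this assumption a coefficient of .2 would fall just inside the .05 threshold. Resampling-based P values need not match this figure exactly, but a much higher P value for such a coefficient would be a sign of instability.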

P values can be easily estimated using both resampling methods, bootstrapping and jackknifing, by following this simple procedure. Run an SEM analysis of the desired model, using one of the resampling methods, and save the project. Then save the project again, this time with a different name, change the resampling method, and run the SEM analysis again. Then save the second project again. Each project file will now have results that refer to one of the two resampling methods. The P values can then be compared, and the most stable ones used in a research report on the SEM analysis.
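Once the two project files exist, the comparison itself can be done by hand or with a small script. The sketch below assumes the P values have been copied out of each project into Python dictionaries keyed by path; the function name, the path labels, the numbers, and the `max_gap` threshold are all hypothetical choices made for illustration, not WarpPLS output or settings.

```python
def flag_unstable_paths(p_boot, p_jack, alpha=0.05, max_gap=0.05):
    """Return the paths whose bootstrapping and jackknifing P values
    disagree about significance at `alpha`, or differ by more than
    `max_gap` (an arbitrary cutoff for a "marked difference")."""
    flagged = []
    for path in sorted(p_boot.keys() & p_jack.keys()):
        pb, pj = p_boot[path], p_jack[path]
        disagree = (pb < alpha) != (pj < alpha)
        if disagree or abs(pb - pj) > max_gap:
            flagged.append(path)
    return flagged


# Made-up P values, one dictionary per saved project.
p_bootstrap = {"A->B": 0.01, "B->C": 0.20, "A->C": 0.04}
p_jackknife = {"A->B": 0.02, "B->C": 0.03, "A->C": 0.03}

print(flag_unstable_paths(p_bootstrap, p_jackknife))
```

Paths that are flagged are the ones where it is worth checking which resampling method yields the more stable coefficient before reporting a P value.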

References:

Chiquoine, B., & Hjalmarsson, E. (2009). Jackknifing stock return predictions. Journal of Empirical Finance, 16(5), 793-803.

Nevitt, J., & Hancock, G.R. (2001). Performance of bootstrapping approaches to model test statistics and parameter standard error estimation in structural equation modeling. Structural Equation Modeling, 8(3), 353-377.

2 comments:

AndrewG said...

Negative Signs

Hi Ned, in Temme, Kreis and Hildebrandt (2009), the authors discuss how the different PLS software programs can produce "different solutions with respect to the parameter signs". Some programs "generate opposite signs for the weights of the indicators X1 to Xy". In another program, "the signs for the weights of the formative construct ξ1 and the path coefficient for its effect on η1 in LVPLS are reversed in the displayed path model".
How does WarpPLS handle this issue?

Ned Kock said...

Hi Andrew. If I recall correctly, the Temme et al. paper you are referring to was published in 2006, but I may be wrong.

Anyway, that problem does not seem to happen with WarpPLS. At least it hasn’t happened yet, even though thousands of analyses have already been conducted with WarpPLS.

There are two reasons for this: (a) WarpPLS sets the initial signs of the weights based on a multiple correlation sub-algorithm; and (b) WarpPLS’s underlying algorithm for calculation of weights is PLS regression.

In PLS regression, the inner model is not allowed to influence the outer model, which ends up leading to more stable estimates than the more commonly used algorithm – the so-called “PLS path modeling”.

So, the combination of (a) and (b) seems to eliminate the problem.