
Monday, October 15, 2012

Hands-On Workshop on WarpPLS; 11-12 January 2013; San Antonio, Texas


*** Two-Day Hands-On Workshop on WarpPLS: SEM Fundamentals with Linear and Nonlinear Applications ***

Structural equation modeling (SEM), or path analysis with latent variables, is one of the most general and comprehensive statistical analysis methods. Path analysis, multiple regression, ANCOVA, ANOVA and other widely used statistical analysis methods can be seen as special cases of SEM.

WarpPLS is a very user-friendly and powerful software tool that can be used for SEM, arguably being the first of its kind to implement linear and nonlinear algorithms. This software provides one of the most extensive sets of SEM outputs; among other things it is the first of its kind to automatically calculate indirect and total effects and respective P values, as well as to calculate full collinearity estimates.

This SEM fundamentals workshop (details below) is designed to be useful to beginners and intermediate SEM practitioners. Among possible participants are those who are interested in: (a) being productive co-authors or research collaborators, even if not doing SEM analyses themselves; (b) conducting basic SEM analyses occasionally in the future; (c) conducting SEM analyses of intermediate complexity on a regular basis.

*** Registration and additional details ***

http://bit.ly/X0IHMP

or

http://scriptwarp.com/warppls/prjs/2013_WarpPLSwkshp_Jan_SanAntonio

*** Instructor ***

Ned Kock, Ph.D.
WarpPLS Developer
http://nedkock.com

*** Location and dates ***

Our Lady of the Lake University
San Antonio, Texas
11-12 January 2013 (Fri-Sat), 8 am–5 pm

*** Workshop program at a glance ***

The main goal of this workshop is to give participants a practical understanding of how to use the software WarpPLS to conduct variance-based SEM. The workshop is very hands-on and covers linear and nonlinear applications.

Day 1 of workshop
Overview of workshop and formation of teams
Overview of web resources: Video clips, blog, publications, spreadsheets, and templates
Overview of steps 1 to 5 of a complete SEM analysis
Hands-on exercise: Complete SEM analysis
Resampling as shuffling multiple decks of cards
Choosing the right resampling method
Hands-on exercise: Resampling options
Choosing the right warping (i.e., nonlinear) algorithm
Viewing and interpreting plots of linear and nonlinear relationships
Hands-on exercise: Linear and nonlinear relationships
Charting non-standardized data
Reporting results in non-standardized terms
Hands-on exercise: Standardized to non-standardized results
Reading discussion: WarpPLS User Manual

Day 2 of workshop
Classical tests of mediating effects – Baron & Kenny and Preacher & Hayes
Using indirect and total effect outputs to test mediating effects
Hands-on exercise: Indirect/mediating and total effects
Reading discussion: Kock & Verville’s free questionnaire data article
Testing a moderating effect
Double, triple etc. moderation
Hands-on exercise: Moderating effects
Adding control variables into an analysis
Using second-, third- etc. order latent variables
Conducting a multi-group analysis
Hands-on exercise: Multi-group analysis
Reading discussion: Kock & Lynn’s lateral collinearity article
Conducting a full collinearity test
Hands-on exercise: Team project using participant’s own data
Presentation of results from team project

Thursday, August 2, 2012

Lateral collinearity and misleading results in variance-based SEM: An illustration and recommendations


A new article discussing methodological issues based on WarpPLS is available. The article is titled “Lateral collinearity and misleading results in variance-based SEM: An illustration and recommendations”. It has been recently published in the Journal of the Association for Information Systems. A full text version of the article is available here as a PDF file. Below is the abstract of the article.

Variance-based structural equation modeling is extensively used in information systems research, and many related findings may have been distorted by hidden collinearity. This is a problem that may extend to multivariate analyses, in general, in the field of information systems as well as in many other fields. In multivariate analyses, collinearity is usually assessed as a predictor-predictor relationship phenomenon, where two or more predictors are checked for redundancy. This type of assessment addresses vertical, or “classic”, collinearity. However, another type of collinearity may also exist, here called “lateral” collinearity. It refers to predictor-criterion collinearity. Lateral collinearity problems are exemplified based on an illustrative variance-based structural equation modeling analysis. The analysis employs WarpPLS 2.0, with the results double-checked with other statistical analysis software tools. It is shown that standard validity and reliability tests do not properly capture lateral collinearity. A new approach for the assessment of both vertical and lateral collinearity in variance-based structural equation modeling is proposed and demonstrated in the context of the illustrative analysis.

Thursday, July 26, 2012

View indirect and total effects in WarpPLS: YouTube video


A new YouTube video for WarpPLS is available; please see link below.

http://youtu.be/D9m4K_fv2vI

The video shows how to view and interpret indirect and total effects, as well as various related coefficients (e.g., P values), calculated through a structural equation modeling (SEM) analysis using the software WarpPLS.

Enjoy!

Wednesday, July 25, 2012

Create and use second order latent variables in WarpPLS: YouTube video


A new YouTube video for WarpPLS is available; please see link below.

http://youtu.be/bkO6YoRK8Zg

The video shows how to create and use second (and higher) order latent variables with the structural equation modeling (SEM) analysis software WarpPLS.

Enjoy!

Friday, May 11, 2012

Simpson’s paradox and unexpected results


The algorithms used in version 3.0 and later versions of WarpPLS have been revised so as to pick up instances of what is known as “Simpson’s paradox”. As a result, there may be changes in some coefficients and P values, when compared with previous versions.

Often the P value of the ARS fit index will go up, if instances of Simpson’s paradox are present in the model.

Simpson’s paradox is characterized by the path coefficient and correlation for a pair of variables having different signs. In this situation, the contribution of a predictor variable to the explained variance of the criterion variable in a latent variable block is negative.

In other words, if the predictor latent variable were to be removed from the block, the R-squared for the criterion latent variable would go up. A similar effect would be observed if the direction of causality were reversed.

One widely held interpretation is that Simpson’s paradox could be an indication that the direction of a hypothesized relationship is reversed, or that the hypothesized relationship is nonsensical/improbable.

In the context of WarpPLS analyses, this is more likely to occur when nonlinear algorithms are used and/or full collinearity VIFs are high, but may also occur under other conditions.
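
The sign mismatch described above can be illustrated with a small sketch. It is not WarpPLS code; it assumes a purely linear block with two standardized predictors, in which case the R-squared equals the sum of each path coefficient multiplied by the corresponding predictor-criterion correlation. The function name and the correlation values are hypothetical, chosen only to produce one instance of the paradox.

```python
def two_predictor_block(r_1y, r_2y, r_12):
    """Standardized path coefficients for a linear block with two predictors.

    r_1y, r_2y: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    Assumes standardized variables and a linear relationship.
    """
    denom = 1.0 - r_12 ** 2
    betas = [
        (r_1y - r_12 * r_2y) / denom,
        (r_2y - r_12 * r_1y) / denom,
    ]
    corrs = [r_1y, r_2y]
    # Contribution of each predictor to the R-squared: coefficient x correlation.
    contributions = [b * r for b, r in zip(betas, corrs)]
    # Flag a predictor when its path coefficient and correlation differ in sign.
    flagged = [b * r < 0 for b, r in zip(betas, corrs)]
    return betas, contributions, flagged


# Hypothetical correlations: both predictors correlate positively with the
# criterion, and strongly with each other.
betas, contributions, flagged = two_predictor_block(r_1y=0.6, r_2y=0.3, r_12=0.8)
r_squared = sum(contributions)

for i, (b, c, f) in enumerate(zip(betas, contributions, flagged), start=1):
    print(f"predictor {i}: beta={b:.2f} contribution={c:.2f} flagged={f}")
print(f"R-squared={r_squared:.2f}")
```

In this made-up case the second predictor has a positive correlation with the criterion but a negative path coefficient, so its contribution to the R-squared is negative, which is the pattern the post describes.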


Wednesday, March 7, 2012

Version 3.0 of WarpPLS is now available!

Version 3.0 of WarpPLS is now available! You can download and install it for a free 90-day trial from:

http://warppls.com

The full User Manual is also available from the web site above, as a download separate from the software.

Some important notes for users of previous versions:

- Version 2.0 users can use the same license information that they already have; it will work for version 3.0 for the remainder of their license periods.

- Project files generated with previous versions are automatically converted to version 3.0 project files. Users are notified of that by the software, and given the opportunity not to convert the files if they so wish.

- The MATLAB Compiler Runtime 7.14, used in this version, is the same as the one used in version 2.0. Therefore, if you already have WarpPLS 2.0 installed on your computer, you should uncheck the Runtime component on the installer (i.e., the self-installing .exe file). The same Runtime cannot be installed twice on the same computer.

WarpPLS is a powerful PLS-based structural equation modeling (SEM) software tool. Since its first release in 2009, its user base has grown steadily, with more than 5,000 users worldwide today.

Some of its most distinguishing features are the following:

- It is easy to use, with a step-by-step user interface guide.

- It identifies nonlinear relationships, and estimates path coefficients accordingly.

- It also models linear relationships, using a standard PLS regression algorithm.

- It models reflective and formative variables, as well as moderating effects.

- It calculates P values, model fit indices, and collinearity estimates.

Below is a list of new features in this version. The User Manual has more details on how these new features can be useful in SEM analyses.

- Addition of latent variables as indicators. Users now have the option of adding latent variable scores to the set of standardized indicators used in an SEM analysis.

- Blindfolding. Users now have the option of using a third resampling algorithm, namely blindfolding, in addition to bootstrapping and jackknifing.

- Effect sizes. Cohen’s f-squared effect size coefficients are now calculated and shown for all path coefficients.

- Estimated collinearity. Collinearity is now estimated before the SEM analysis is run. When collinearity appears to be too high, users are warned about it.

- Full collinearity VIFs. VIFs are now shown for all latent variables, separately from the VIFs calculated for predictor latent variables in individual latent variable blocks.

- Indirect and total effects. Indirect and total effects are now calculated and shown, together with the corresponding P values, standard errors, and effect sizes.

- P values for all weights and loadings. P values are now shown for all weights and loadings, including those associated with indicators that make up moderating variables.

- Predictive validity. Stone-Geisser Q-squared coefficients are now calculated and shown for each endogenous variable in an SEM model.

- Ranked data. Users can now select an option to conduct their analyses with only ranked data, whereby all the data is automatically ranked prior to the SEM analysis (the original data is retained in unranked format).

- Restricted ranges. Users can now run their analyses with subsamples defined by a range restriction variable, which may be standardized or unstandardized.

- Standard errors for all weights and loadings. Standard errors are now shown for all loadings and weights.

- VIFs for all indicators. VIFs are now shown for all indicators, including those associated with moderating latent variables.
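
As a rough illustration of the indirect and total effects listed above, the sketch below assumes a simple linear model without feedback loops. Under that assumption, the effect along a route is the product of the path coefficients on it, and the total effect is the sum over all routes from one variable to another. The variable names and coefficient values are hypothetical. This is not how WarpPLS is implemented internally, and it does not compute P values, standard errors, or effect sizes.

```python
def total_effect(paths, source, target):
    """Sum the products of path coefficients over every directed route
    from source to target. Assumes the model has no cycles."""
    if source == target:
        return 1.0
    return sum(
        coef * total_effect(paths, nxt, target)
        for nxt, coef in paths.get(source, {}).items()
    )


# Hypothetical model: X -> M -> Y, plus a direct link X -> Y.
paths = {
    "X": {"M": 0.5, "Y": 0.2},
    "M": {"Y": 0.4},
}

direct = paths["X"].get("Y", 0.0)       # direct effect of X on Y
total = total_effect(paths, "X", "Y")   # direct effect plus all indirect routes
indirect = total - direct               # effect mediated by M

print(f"direct={direct:.2f} indirect={indirect:.2f} total={total:.2f}")
```

With more mediators or longer chains, the same function adds up every route, which is where doing this by hand becomes tedious.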

Enjoy!

Thursday, March 1, 2012

Exploring free questionnaire data with anchor variables: An illustration based on a study of IT in healthcare


A new article discussing methodological issues based on WarpPLS is available. The article is titled “Exploring free questionnaire data with anchor variables: An illustration based on a study of IT in healthcare”. It has been recently published in the International Journal of Healthcare Information Systems and Informatics. A full text version of the article is available here as a PDF file. Below is the abstract of the article.

This paper makes an important methodological contribution regarding the use of free questionnaires, illustrated through a study that shows that a healthcare professional’s propensity to use electronic communication technologies creates opportunities for interaction with other professionals, which would not otherwise be possible only via face-to-face interaction. This in turn appears to increase mutual trust, and eventually improve the quality of group outcomes. Free questionnaires are often used by healthcare information management researchers. They yield datasets without clear associations between constructs and related indicators. If such associations exist, they must first be uncovered so that indicators can be grouped within latent variables referring to constructs, and structural equation modeling analyses be conducted. A novel methodological contribution is made here through the proposal of an anchor variable approach to the analysis of free questionnaires. Unlike exploratory factor analyses, the approach relies on the researcher’s semantic knowledge about the variables stemming from a free questionnaire. The use of the approach is demonstrated using the multivariate statistical analysis software WarpPLS 2.0. The study leads to a measurement model that passes comprehensive validity, reliability, and collinearity tests. It also appears to yield practically relevant and meaningful results.

Sunday, February 5, 2012

New PLS-based SEM email distribution list

A new email distribution list is available for those who share a common interest in partial least squares (PLS) regression and its use in structural equation modeling (SEM). To check it out click here.

Monday, January 23, 2012

Version 3.0 of WarpPLS is coming soon, with several new features

Version 3.0 of WarpPLS is currently undergoing a battery of tests, and will be made available soon. Among the new features is the calculation of indirect and total effects, which are exemplified in this health data analysis post based on the China Study II dataset. Here is a comprehensive list of new features in this version:

    - Addition of latent variables as indicators. Users now have the option of adding latent variable scores to the set of standardized indicators used in an SEM analysis. This option is useful in the removal of outliers, through the use of restricted ranges for latent variable scores, particularly for outliers that are clearly visible on the plots depicting associations among latent variables. This option is also useful in hierarchical analysis, where users define second-order (and higher order) latent variables, and then conduct analyses with different models including latent variables of different orders.

    - Blindfolding. Users now have the option of using a third resampling algorithm, namely blindfolding, in addition to bootstrapping and jackknifing. Blindfolding is a resampling algorithm that creates a number of resamples (a number that can be selected by the user), where each resample has a certain number of rows replaced with the means of the respective columns. The number of rows modified in this way in each resample equals the sample size divided by the number of resamples. For example, if the sample size is 200 and the number of resamples selected is 100, then each resample will have 2 rows modified. If a user chooses a number of resamples that is greater than the sample size, the number of resamples is automatically set to the sample size (as with jackknifing).

    - Effect sizes. Cohen’s (1988) f-squared effect size coefficients are now calculated and shown for all path coefficients. These are calculated as the absolute values of the individual contributions of the corresponding predictor latent variables to the R-squared coefficient of the criterion latent variable in each latent variable block. With these effect sizes users can ascertain whether the effects indicated by path coefficients are small, medium, or large. The thresholds usually recommended for small, medium, and large effects are 0.02, 0.15, and 0.35, respectively (Cohen, 1988). Values below 0.02 suggest effects that are too weak to be considered relevant from a practical point of view, even when the corresponding P values are statistically significant, a situation that may occur with large sample sizes.

    - Full collinearity VIFs. VIFs are now shown for all latent variables, separately from the VIFs calculated for predictor latent variables in individual latent variable blocks. These new VIFs are calculated based on a full collinearity test, which identifies not only vertical but also lateral collinearity, and allows for a test of collinearity involving all latent variables in a model. Vertical, or classic, collinearity is predictor-predictor latent variable collinearity in individual blocks. Lateral collinearity is a new term that refers to predictor-criterion latent variable collinearity, a type of collinearity that can lead to particularly misleading results. Full collinearity VIFs can also be used for common method bias tests (Lindell & Whitney, 2001) that are more conservative than, and arguably superior to, the traditionally used tests relying on exploratory factor analyses.

    - Incremental code optimization. At several points the code was optimized for speed, which led to incremental gains even as a significant number of new features were added. Several of these new features required new and complex calculations, mostly to generate coefficients that were not available before.

    - Indirect and total effects. Indirect and total effects are now calculated and shown, together with the corresponding P values, standard errors, and effect sizes. The calculation of indirect and total effects can be critical in the evaluation of downstream effects of latent variables that are mediated by other latent variables, especially in complex models with multiple mediating effects in concurrent paths. Indirect effects also allow for direct estimations, via resampling, of the P values associated with mediating effects that have traditionally relied on time-consuming and not fully automated calculations based on linear (Preacher & Hayes, 2004) and nonlinear (Hayes & Preacher, 2010) assumptions.

    - P values for all weights and loadings. P values are now shown for all weights and loadings, including those associated with indicators that make up moderating variables. With these P values, users can check whether moderating latent variables satisfy validity and reliability criteria for either reflective or formative measurement. This can help users demonstrate validity and reliability in hierarchical analyses involving moderating effects, where double, triple etc. moderating effects are tested. For instance, moderating latent variables can be created, added to the model as standardized indicators, and then their effects modeled as being moderated by other latent variables; an example of double moderation.

    - Predictive validity. Stone-Geisser Q-squared coefficients (Geisser, 1974; Stone, 1974) are now calculated and shown for each endogenous variable in an SEM model. The Q-squared coefficient is a nonparametric measure traditionally calculated via blindfolding. It is used for the assessment of the predictive validity (or relevance) associated with each latent variable block in the model, through the endogenous latent variable that is the criterion variable in the block. Sometimes referred to as a resampling analog of the R-squared, it is often similar in value to that measure; even though, unlike the R-squared coefficient, the Q-squared coefficient can assume negative values. Acceptable predictive validity in connection with an endogenous latent variable is suggested by a Q-squared coefficient greater than zero.

    - Ranked data. Users can now select an option to conduct their analyses with only ranked data, whereby all the data is automatically ranked prior to the SEM analysis (the original data is retained in unranked format). When data is ranked, typically the value distances that typify outliers are significantly reduced, effectively eliminating outliers without any decrease in sample size. A concomitant increase in collinearity is usually observed, but not to the point of threatening the credibility of the results. This option can be very useful in assessments of whether the presence of outliers significantly affects path coefficients and respective P values, especially when outliers are not believed to be due to measurement error.

    - Restricted ranges. Users can now run their analyses with subsamples defined by a range restriction variable, which may be standardized or unstandardized. This option is useful in multi-group analyses, whereby separate analyses are conducted for each subsample and the results then compared with one another. One example would be a multi-country analysis, with each country being treated as a subsample, but without separate datasets for each country having to be provided as inputs. This range restriction feature is also useful in situations where outliers are causing instability in a resample set, which can lead to abnormally high standard errors and thus inflated P values. Users can remove outliers by restricting the values assumed by a variable to a range that excludes the outliers, without having to modify and re-read a dataset.

    - Standard errors for all weights and loadings. Standard errors are now shown for all loadings and weights. Among other purposes, these standard errors can be used in multi-group analyses, with the same model but different subsamples. In these cases, users may want to compare the measurement models to ascertain equivalence, using a multi-group comparison technique such as the one documented by Keil et al. (2000), and thus ensure that any observed differences in structural model coefficients are not due to measurement model differences.

    - VIFs for all indicators. VIFs are now shown for all indicators, including those associated with moderating latent variables. With these VIFs, users can check whether moderating latent variables satisfy criteria for formative measurement, in case they do not satisfy validity and reliability criteria for reflective measurement. This can be particularly helpful in hierarchical analyses involving moderating effects, where formative latent variables are frequently employed, including cases where double, triple etc. moderating effects are tested. Here moderating latent variables can be created, added to the model as standardized indicators, and then their effects modeled as being moderated by other latent variables; with this process being repeated at different levels.
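
The blindfolding procedure described in the list above can be sketched in a few lines. This is only an illustration of the description, not the WarpPLS implementation. Two points the description leaves open are treated here as assumptions: each resample replaces a consecutive, non-overlapping block of rows, and the column means are taken from the full original dataset. The function name and the toy dataset are hypothetical.

```python
def blindfold_resamples(data, n_resamples):
    """Build blindfolding resamples from a list of numeric rows.

    Each resample is a copy of the data in which a block of rows is
    replaced by the column means. Which rows are replaced in which
    resample is an assumption of this sketch (consecutive blocks).
    """
    n_rows = len(data)
    # If more resamples are requested than there are rows, cap at the sample size.
    n_resamples = min(n_resamples, n_rows)
    # Rows modified per resample: sample size divided by number of resamples
    # (integer division is an assumption for sizes that do not divide evenly).
    rows_per_resample = n_rows // n_resamples

    n_cols = len(data[0])
    col_means = [sum(row[j] for row in data) / n_rows for j in range(n_cols)]

    resamples = []
    for k in range(n_resamples):
        start = k * rows_per_resample
        stop = start + rows_per_resample
        resample = [list(row) for row in data]   # copy, original stays intact
        for i in range(start, stop):
            resample[i] = list(col_means)        # "blindfold" this row
        resamples.append(resample)
    return resamples


# Toy dataset: 200 rows, 2 columns; 100 resamples, as in the example above.
data = [[float(i), float(2 * i)] for i in range(200)]
resamples = blindfold_resamples(data, 100)

modified = sum(new != old for new, old in zip(resamples[0], data))
print(len(resamples), "resamples;", modified, "rows modified in the first one")
```

With a sample size of 200 and 100 resamples, each resample in this sketch has 2 rows replaced, matching the arithmetic given in the blindfolding item.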

References

Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum.

Geisser, S. (1974). A predictive approach to the random effects model. Biometrika, 61(1), 101–107.

Hayes, A. F., & Preacher, K. J. (2010). Quantifying and testing indirect effects in simple mediation models when the constituent paths are nonlinear. Multivariate Behavioral Research, 45(4), 627–660.

Keil, M., Tan, B. C., Wei, K.-K., Saarinen, T., Tuunainen, V., & Wassenaar, A. (2000). A cross-cultural study on escalation of commitment behavior in software projects. MIS Quarterly, 24(2), 299–325.

Lindell, M., & Whitney, D. (2001). Accounting for common method variance in cross-sectional research designs. Journal of Applied Psychology, 86(1), 114–121.

Preacher, K. J., & Hayes, A. F. (2004). SPSS and SAS procedures for estimating indirect effects in simple mediation models. Behavior Research Methods, Instruments, & Computers, 36(4), 717–731.

Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society, Series B, 36(1), 111–147.