
Tuesday, September 13, 2016

Measurement invariance assessment in PLS-SEM

WarpPLS users can assess measurement invariance in PLS-SEM analyses in a way analogous to a multi-group analysis. That is, WarpPLS users can compare pairs of measurement models to ascertain equivalence, using one of the multi-group comparison techniques building on the pooled and Satterthwaite standard error methods discussed in the article below. By doing so, they will ensure that any observed between-group differences in structural model coefficients, particularly in path coefficients, are not due to measurement model differences.

Kock, N. (2014). Advanced mediating effects tests, multi-group analyses, and measurement model assessments in PLS-based SEM. International Journal of e-Collaboration, 10(3), 1-13.

For measurement invariance assessment, the techniques discussed in the article should be employed with weights and/or loadings. While with path coefficients researchers may be interested in finding statistically significant differences, with weights/loadings the opposite is typically the case – they will want to ensure that differences are not statistically significant. The reason is that significant differences between path coefficients can be artificially induced by significant differences between weights/loadings in different models.

A spreadsheet with formulas for conducting a multi-group analysis building on the pooled and Satterthwaite standard error methods is available from, under “Resources”. As indicated in the article linked above, this same spreadsheet can be used in the assessment of measurement invariance in PLS-SEM analyses.
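As a rough illustration of what such a spreadsheet computes, the sketch below compares one coefficient (a path coefficient, weight, or loading) across two groups. It assumes the commonly cited forms of the two formulas: the pooled version weights the squared standard errors by group size, and the Satterthwaite version simply adds them. The function names and example numbers are hypothetical, and the exact expressions should be checked against Kock (2014) before use.

```python
import math


def pooled_se(se1, se2, n1, n2):
    """Pooled standard error of the difference (assumed form, equal variances)."""
    w1 = (n1 - 1) ** 2 / (n1 + n2 - 2)
    w2 = (n2 - 1) ** 2 / (n1 + n2 - 2)
    return math.sqrt(w1 * se1 ** 2 + w2 * se2 ** 2) * math.sqrt(1 / n1 + 1 / n2)


def satterthwaite_se(se1, se2):
    """Satterthwaite standard error of the difference (assumed form, unequal variances)."""
    return math.sqrt(se1 ** 2 + se2 ** 2)


def t_ratio(coef1, coef2, se_diff):
    """T ratio for the between-group difference in a coefficient."""
    return (coef1 - coef2) / se_diff


# Hypothetical loadings for the same indicator in two groups.
loading_g1, se_g1, n_g1 = 0.82, 0.05, 150
loading_g2, se_g2, n_g2 = 0.78, 0.06, 120

t_pooled = t_ratio(loading_g1, loading_g2, pooled_se(se_g1, se_g2, n_g1, n_g2))
t_satt = t_ratio(loading_g1, loading_g2, satterthwaite_se(se_g1, se_g2))
print(f"T (pooled) = {t_pooled:.3f}, T (Satterthwaite) = {t_satt:.3f}")
```

For measurement invariance, small T ratios for weights and loadings are the desired outcome, since they suggest that the measurement models do not differ significantly between the groups.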

The menu options “Explore multi-group analyses” and “Explore measurement invariance”, available in WarpPLS starting in version 6.0, allow you to automatically conduct analyses like the ones above. Through these options the data is segmented into groups, all possible pairs of groups are generated, and each pair of groups is compared. As noted above, in multi-group analyses path coefficients are normally compared, whereas in measurement invariance assessment the foci of comparison are loadings and/or weights. The grouping variables can be unstandardized indicators, standardized indicators, and labels. These types of analyses can also be conducted via the new menu option “Explore full latent growth”, which presents several advantages (as discussed in the WarpPLS User Manual).
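The segmentation and pairing logic described above can be sketched in a few lines. This is only an illustration of the idea, not WarpPLS's internal implementation; the field name `country`, the sample rows, and the helper names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def segment(rows, group_key):
    """Split the rows into groups according to the value of a grouping variable."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    return dict(groups)


def group_pairs(groups):
    """Return every unordered pair of group labels, in sorted order."""
    return list(combinations(sorted(groups), 2))


# Hypothetical dataset with a label used as the grouping variable.
rows = [
    {"country": "BR", "x": 1.2},
    {"country": "US", "x": 0.7},
    {"country": "MX", "x": 0.9},
    {"country": "BR", "x": 1.5},
]

groups = segment(rows, "country")
for first, second in group_pairs(groups):
    # A real analysis would estimate the model in each group and compare coefficients here.
    print(f"compare {first} (n={len(groups[first])}) vs {second} (n={len(groups[second])})")
```

With k groups this yields k(k-1)/2 comparisons, so the number of tests grows quickly as the grouping variable takes on more values.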

Related YouTube videos:

Explore Multi-Group Analyses in WarpPLS

Explore Measurement Invariance in WarpPLS

Advantages of nonlinear over segmentation analyses in path models

Nonlinear analyses employing the software WarpPLS allow for the identification of linear segments emerging from a nonlinear analysis, but without the need to generate subsamples. A new article is available demonstrating the advantages of nonlinear over data segmentation analyses. These include a larger overall sample size for calculation of P values, and the ability to uncover very high segment-specific path coefficients. Its reference, abstract, and link to full text are available below.

Kock, N. (2016). Advantages of nonlinear over segmentation analyses in path models. International Journal of e-Collaboration, 12(4), 1-6.

The recent availability of software tools for nonlinear path analyses, such as WarpPLS, enables e-collaboration researchers to take nonlinearity into consideration when estimating coefficients of association among linked variables. Nonlinear path analyses can be applied to models with or without latent variables, and provide advantages over data segmentation analyses, including those employing finite mixture segmentation techniques (a.k.a. FIMIX). The latter assume that data can be successfully segmented into subsamples, which are then analyzed with linear algorithms. Nonlinear analyses employing WarpPLS also allow for the identification of linear segments mirroring underlying nonlinear relationships, but without the need to generate subsamples. We demonstrate the advantages of nonlinear over data segmentation analyses.

Among other things this article shows that identification of linear segments emerging from a nonlinear analysis with WarpPLS allows for: (a) a larger overall sample size for calculation of P values, which enables researchers to uncover actual segment-specific effects that could otherwise be rendered non-significant due to a combination of underestimated path coefficients and small subsample sizes; and (b) the ability to uncover very high segment-specific path coefficients, which could otherwise be grossly underestimated.
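Point (a) can be illustrated with a simple stand-in. The sketch below uses the textbook T ratio for a bivariate correlation, which is an assumption made here for illustration and not the calculation used in the article or in WarpPLS, to show how the same coefficient can be significant with the full sample and non-significant with a small subsample. The sample sizes and coefficient are hypothetical.

```python
import math


def correlation_t_ratio(r, n):
    """Textbook T ratio for a Pearson correlation r estimated from n cases."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


coefficient = 0.2          # hypothetical segment-specific association
full_sample = 300          # hypothetical overall sample size
subsample = 50             # hypothetical size of one segment after splitting the data

print(f"T with N={full_sample}: {correlation_t_ratio(coefficient, full_sample):.2f}")
print(f"T with N={subsample}: {correlation_t_ratio(coefficient, subsample):.2f}")
```

Under these assumed numbers the T ratio exceeds 1.96 with the full sample but not with the subsample, even though the coefficient itself is unchanged.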


Thursday, September 1, 2016

Hypothesis testing with confidence intervals and P values

While P values are widely used in PLS-based SEM, as well as in SEM in general, the statistical significance of path coefficients, weights and loadings can also be assessed employing T ratios and/or confidence intervals. These can be obtained in WarpPLS through the menu option “Explore T ratios and confidence intervals”, which also allows you to set the confidence level to be used. This menu option becomes available after Step 5 is completed.

Related YouTube video: Explore T Ratios and Confidence Intervals in WarpPLS

An article is also available explaining how WarpPLS users can test hypotheses based on confidence intervals, contrasting that approach with the one employing P values. A variation of the latter approach, employing T ratios, is also briefly discussed. Below are the reference, link to PDF file, and abstract for the article.

Kock, N. (2016). Hypothesis testing with confidence intervals and P values in PLS-SEM. International Journal of e-Collaboration, 12(3), 1-6.

PDF file

E-collaboration researchers usually employ P values for hypothesis testing, a common practice in a variety of other fields. This is also customary in many methodological contexts, such as analyses of path models with or without latent variables, as well as simpler tests that can be seen as special cases of these (e.g., comparisons of means). We discuss here how a researcher can use another major approach for hypothesis testing, the one building on confidence intervals. We contrast this approach with the one employing P values through the analysis of a simulated dataset, created based on a model grounded on past theory and empirical research. The model refers to social networking site use at work and its impact on job performance. The results of our analyses suggest that tests employing confidence intervals and P values are likely to lead to very similar outcomes in terms of acceptance or rejection of hypotheses.

Note 1:
In Table 1 of the article, each T ratio and each pair of confidence interval limits (lower and upper) is calculated through the formulas included below, where 1.96 is the critical value for a 95 percent confidence level. Normally a hypothesis will not be supported if the confidence interval includes the number 0 (zero).

T ratio = (path coefficient) / (standard error).

Lower confidence limit = (path coefficient) - 1.96 * (standard error).

Upper confidence limit = (path coefficient) + 1.96 * (standard error).
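The formulas above translate directly into code. The sketch below is a minimal example with hypothetical function names and input values; the multiplier 1.96 is kept as a default parameter so that a different confidence level can be tried.

```python
def t_ratio(path_coefficient, standard_error):
    """T ratio = path coefficient divided by its standard error."""
    return path_coefficient / standard_error


def confidence_interval(path_coefficient, standard_error, multiplier=1.96):
    """Lower and upper limits around the path coefficient (1.96 gives about 95%)."""
    half_width = multiplier * standard_error
    return path_coefficient - half_width, path_coefficient + half_width


def is_supported(path_coefficient, standard_error, multiplier=1.96):
    """A hypothesis is normally not supported if the interval includes zero."""
    lower, upper = confidence_interval(path_coefficient, standard_error, multiplier)
    return not (lower <= 0.0 <= upper)


# Hypothetical path coefficient and standard error.
beta, se = 0.30, 0.10
lower, upper = confidence_interval(beta, se)
print(f"T = {t_ratio(beta, se):.2f}, CI = [{lower:.3f}, {upper:.3f}], "
      f"supported = {is_supported(beta, se)}")
```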

Note 2:
Here is a quick note to technical readers. The P values reported in Table 1 in the article are calculated based on the T ratios using the incomplete beta function, which does not assume that the T distribution is exactly normal. In reality, T distributions have heavier tails than normal distributions, with the difference becoming less noticeable as sample sizes increase.
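For technical readers who want to see the mechanics, the sketch below obtains a two-tailed P value from a T ratio through the regularized incomplete beta function, evaluated with a standard continued-fraction expansion. It is not the WarpPLS code; the function names and the degrees of freedom passed in are assumptions made for illustration.

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """I_x(a, b), the regularized incomplete beta function."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def two_tailed_p(t, df):
    """Two-tailed P value for a T ratio with df degrees of freedom."""
    return regularized_incomplete_beta(df / 2.0, 0.5, df / (df + t * t))


# Heavier tails: with few degrees of freedom the P value exceeds the normal one.
normal_p = math.erfc(1.96 / math.sqrt(2.0))
print(f"normal: {normal_p:.4f}, T with df=30: {two_tailed_p(1.96, 30):.4f}, "
      f"T with df=1000: {two_tailed_p(1.96, 1000):.4f}")
```

A one-tailed P value, when the direction of the effect is hypothesized in advance, would be half of the two-tailed value returned here.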

Wednesday, June 15, 2016

Simpson’s paradox, moderation, and the emergence of quadratic relationships in path models

Among the many innovative features of WarpPLS are those that deal with identification of Simpson’s paradox and modeling of nonlinear relationships. A new article discussing various issues that are important for the understanding of the usefulness of these features is now available. Its reference, abstract, and link to full text are available below.

Kock, N., & Gaskins, L. (2016). Simpson’s paradox, moderation, and the emergence of quadratic relationships in path models: An information systems illustration. International Journal of Applied Nonlinear Science, 2(3), 200-234.

While Simpson’s paradox is well-known to statisticians, it seems to have been largely neglected in many applied fields of research, including the field of information systems. This is problematic because of the strange nature of the phenomenon, the wrong conclusions and decisions to which it may lead, and its likely frequency. We discuss Simpson’s paradox and interpret it from the perspective of path models with or without latent variables. We define it mathematically and argue that it arises from incorrect model specification. We also show how models can be correctly specified so that they are free from Simpson’s paradox. In the process of doing so, we show that Simpson’s paradox may be a marker of two types of co-existing relationships that have been attracting increasing interest from information systems researchers, namely moderation and quadratic relationships.

Among other things this article shows that: (a) Simpson’s paradox may be caused by model misspecification, and thus can in some cases be fixed by proper model specification; (b) a type of model misspecification that may cause Simpson’s paradox involves missing a moderation relationship that exists at the population level; (c) Simpson’s paradox may actually be a marker of nonlinear relationships of the quadratic type, which are induced by moderation; and (d) there is a duality involving moderation and quadratic relationships, which requires separate and targeted analyses for their proper understanding.
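Readers unfamiliar with the phenomenon may find a toy example useful. The sketch below shows the classical form of Simpson's paradox, in which the association between two variables is negative within each group but positive when the groups are pooled and the grouping variable is left out of the analysis. It is not the mathematical definition used in the article, and the data values are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up data: within each group, y decreases as x increases.
group_a = ([1, 2, 3], [3, 2, 1])
group_b = ([6, 7, 8], [8, 7, 6])

r_a = pearson(*group_a)
r_b = pearson(*group_b)
r_pooled = pearson(group_a[0] + group_b[0], group_a[1] + group_b[1])

print(f"group A: {r_a:+.2f}, group B: {r_b:+.2f}, pooled: {r_pooled:+.2f}")
```

The sign reversal disappears once the omitted grouping variable is brought into the analysis, which is in line with the article's argument that the paradox reflects model misspecification.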


Saturday, June 11, 2016

Interview video: Conference on Information Systems in Latin America

Recently an interview was conducted for the 3rd Conference on Information Systems in Latin America. In it, Dr. Ned Kock was interviewed by Dr. Alexandre Graeml. The topics covered include: structural equation modeling (SEM), partial least squares (PLS) and related techniques, PLS-based SEM, covariance-based SEM, factors versus composites, nonlinear analyses, and WarpPLS.

WarpPLS and its application to research in business and information systems

The link below is for the Conference’s web site.

ISLA 2016 - Information Systems in Latin America Conference


Thursday, June 9, 2016

PLS-SEM performance with non-normal data

Many claims have been made in the past about the advantages of structural equation modeling employing the partial least squares method (PLS-SEM). While some claims may have been exaggerated, we are continuously finding that others have not. One such claim, falling in the latter category (i.e., not an exaggeration), is that PLS-SEM is robust to deviations from normality. In other words, PLS-SEM performs quite well with non-normal data.

An article illustrating this advantage of PLS-SEM is available. Its reference, abstract, and link to full text are available below.

Kock, N. (2016). Non-normality propagation among latent variables and indicators in PLS-SEM simulations. Journal of Modern Applied Statistical Methods, 15(1), 299-315.

Structural equation modeling employing the partial least squares method (PLS-SEM) has been extensively used in business research. Often the use of this method is justified based on claims about its unique performance with small samples and non-normal data, which call for performance analyses. How normal and non-normal data are created for the performance analyses are examined. A method is proposed for the generation of data for exogenous latent variables and errors directly, from which data for endogenous latent variables and indicators are subsequently obtained based on model parameters. The emphasis is on the issue of non-normality propagation among latent variables and indicators, showing that this propagation can be severely impaired if certain steps are not taken. A key step is inducing non-normality in structural and indicator errors, in addition to exogenous latent variables. Illustrations of the method and its steps are provided through simulations based on a simple model of the effect of e-collaboration technology use on job performance.

The article’s main goal is to discuss a method to create non-normal data where the data creator has full access to all data elements, including factor or composite scores and all error terms, and where severe non-normality is extended to error terms. In the process of achieving this goal, the article also demonstrates that PLS-SEM is very robust to severe deviations from normality, even when these deviations apply to all error terms. This is an issue that is often glossed over in PLS-SEM performance tests with non-normal data.
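The general idea of the data creation method can be sketched as follows: non-normal data are generated directly for the exogenous latent variable and for all error terms, and the endogenous latent variable and its indicators are then derived from the model parameters. The choice of a rescaled chi-squared distribution with one degree of freedom, the parameter values, and all function names are assumptions made for this illustration and are not taken from the article.

```python
import math
import random


def standardized_chi2(rng):
    """One draw from a chi-squared(1) variable rescaled to mean 0 and variance 1."""
    z = rng.gauss(0.0, 1.0)
    return (z * z - 1.0) / math.sqrt(2.0)


def generate(n, beta=0.5, loading=0.8, n_indicators=3, seed=42):
    """Create exogenous scores, endogenous scores, and indicators of the endogenous variable."""
    rng = random.Random(seed)
    exo, endo, indicators = [], [], []
    for _ in range(n):
        x = standardized_chi2(rng)                    # exogenous latent variable
        zeta = standardized_chi2(rng)                 # non-normal structural error
        y = beta * x + math.sqrt(1 - beta ** 2) * zeta
        exo.append(x)
        endo.append(y)
        # Each indicator gets its own non-normal error term.
        indicators.append([loading * y + math.sqrt(1 - loading ** 2) * standardized_chi2(rng)
                           for _ in range(n_indicators)])
    return exo, endo, indicators


def skewness(values):
    """Sample skewness (third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum(((v - mean) / sd) ** 3 for v in values) / n


def correlation(xs, ys):
    """Pearson correlation between two lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in xs) * sum((b - my) ** 2 for b in ys))


exo, endo, indicators = generate(20000)
print(f"skewness: exogenous {skewness(exo):.2f}, endogenous {skewness(endo):.2f}")
print(f"correlation between latent variables: {correlation(exo, endo):.2f}")
```

Because the structural and indicator errors are themselves non-normal, the skewness carries through to the endogenous latent variable and its indicators instead of being diluted by normally distributed errors.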

Readers may also find the YouTube video linked below useful in the context of this discussion.

View Skewness and Kurtosis in WarpPLS


A thank you note to the participants in the 2016 PLS Applications Symposium

This is just a thank you note to those who participated, either as presenters or members of the audience, in the 2016 PLS Applications Symposium:

As in previous years, it seems that it was a good idea to run the Symposium as part of the Western Hemispheric Trade Conference. This allowed attendees to take advantage of a subsidized registration fee, and also participate in other Conference sessions and the Conference's social event.

I have been told that the proceedings will be available soon from the Western Hemispheric Trade Conference web site.

Also, the full-day workshop on PLS-SEM using the software WarpPLS was well attended. This workshop was fairly hands-on and interactive. Some participants had quite a great deal of expertise in PLS-SEM and WarpPLS. It was a joy to have conducted the workshop!

As soon as we define the dates, we will be announcing next year’s PLS Applications Symposium. Like this year’s Symposium, it will take place in Laredo, Texas, probably in mid-April as well.

Thank you and best regards to all!

Saturday, April 16, 2016

PLS Applications Symposium; 13 - 15 April 2016; Laredo, Texas

(Abstract submissions accepted until 4 March 2016)

*** Only abstracts are needed for the submissions ***

The partial least squares (PLS) method has increasingly been used in a variety of fields of research and practice, particularly in the context of PLS-based structural equation modeling (SEM). The focus of this Symposium is on the application of PLS-based methods, from a multidisciplinary perspective. For types of submissions, deadlines, and other details, please visit the Symposium’s web site:

*** Workshop on PLS-SEM ***

On 13 April 2016 a full-day workshop on PLS-SEM will be conducted by Dr. Ned Kock, using the software WarpPLS. This workshop will be hands-on and interactive. To participate in the workshop, please indicate your interest when making your registration for the Symposium.

The following topics, among others, will be covered:
- Running a Full PLS-SEM Analysis
- Conducting a Moderating Effects Analysis
- Viewing Moderating Effects via 3D and 2D Graphs
- Creating and Using Second Order Latent Variables
- Viewing Indirect and Total Effects
- Viewing Skewness and Kurtosis of Manifest and Latent Variables
- Conducting a Multi-group Analysis with Range Restriction
- Viewing Nonlinear Relationships
- Conducting a Factor-Based PLS-SEM Analysis
- Viewing and Changing Missing Data Imputation Settings
- Isolating Mediating Effects
- Identifying and Dealing with Outliers
- Solving Indicator Problems
- Solving Collinearity Problems

Ned Kock
Symposium Chair

Monday, February 8, 2016

Conducting a nonlinear robust path analysis

What if a researcher has only one measure for each latent variable, and still wants to perform a nonlinear “robust” analysis where no parametric assumptions (e.g., univariate or multivariate normality) are made beforehand?

This would call for a new nonlinear robust multivariate analysis approach – a nonlinear robust path analysis. Through this approach the variables in the structural model would not be “latent”, strictly speaking, and thus other assessments would have to be performed in place of a confirmatory factor analysis. That is, without multiple indicators per latent variable measurement, quality assessments must deviate somewhat from what would be used in a traditional structural equation modeling analysis.

An article illustrating a nonlinear robust path analysis with WarpPLS is available. To the best of our knowledge, this is one of the first published articles employing this type of analysis. The full reference, link to full text PDF file maintained by the University of California, and abstract for the article are available below.

Kock, N. (2015). Wheat flour versus rice consumption and vascular diseases: Evidence from the China Study II data. Cliodynamics, 6(2), 130–146.

PDF file:

Why does wheat flour consumption appear to be significantly associated with vascular diseases? To answer this question we analyzed data on rice consumption, wheat flour consumption, total calorie consumption, and mortality from vascular diseases obtained from the China Study II dataset. This dataset covers the years of 1983, 1989 and 1993; with data related to biochemistry, diet, lifestyle, and mortality from various diseases in 69 counties in China. Our analyses point at a counterintuitive conclusion: it may not be wheat flour consumption that is the problem, but the culture associated with it, characterized by: decreased levels of physical activity, decreased exposure to sunlight, increased consumption of processed foods, and increased social isolation. Wheat flour consumption may act as a proxy for the extent to which this culture is expressed in a population. The more this culture is expressed, the greater is the prevalence of vascular diseases.

While this is an academic article, I think that the main body of the article is fairly easy to read, which was one of the expectations communicated to us by the Editor and the reviewers. WarpPLS users may find themselves in this same situation – having to prevent more technical statistical material from “spoiling” the reading experience of a non-technical audience. In this case, more technical readers may want to check under “Supporting material”, which is one of the links on the left, where they will find a detailed description of the data used and the results of some specialized statistical tests.