Thursday, October 5, 2023

Using logistic regression in PLS-SEM: Dichotomous endogenous variables


The article below discusses how to use logistic regression with the probit approach to avoid the problems associated with dichotomous endogenous variables in structural equation modeling via partial least squares (PLS-SEM).

Kock, N. (2023). Using logistic regression in PLS-SEM: Dichotomous endogenous variables. Data Analysis Perspectives Journal, 4(4), 1-6.

Link to full-text file for this and other DAPJ articles:

https://scriptwarp.com/dapj/#Published_Articles

Abstract:

A dichotomous endogenous variable would be impossible to occur at the population level, which an empirical sample is assumed to represent, because the structural error term associated with the endogenous variable is expected to be a random variable with many distinct values. Consequently, the endogenous variable is also expected to have many distinct values. This paper discusses how to address this problem, using logistic regression with the probit approach, in the context of structural equation modeling via partial least squares (PLS-SEM). Our discussion is based on an illustrative model analyzed with the software WarpPLS.
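As a rough illustration of the abstract's argument (not the procedure implemented in WarpPLS), the sketch below simulates a continuous endogenous variable with a random error term, dichotomizes it, and then fits a probit model to recover a probability variable with many distinct values. The sample size, the path coefficient of 0.8, and the helper names `norm_cdf`, `norm_pdf`, and `fit_probit` are assumptions made for this example only.

```python
import math
import numpy as np

def norm_cdf(z):
    """Standard normal CDF, vectorized via math.erf."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal density."""
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def fit_probit(x, y, n_iter=25):
    """Fit P(y = 1) = Phi(b0 + b1 * x) by Fisher scoring (IRLS)."""
    X = np.column_stack([np.ones_like(x), x])
    b = np.zeros(2)
    for _ in range(n_iter):
        eta = X @ b
        p = np.clip(norm_cdf(eta), 1e-9, 1 - 1e-9)
        d = norm_pdf(eta)
        w = d ** 2 / (p * (1 - p))       # IRLS weights for the probit link
        z = eta + (y - p) / d            # working response
        b = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
    return b

# Hypothetical population model: continuous endogenous variable y_star.
rng = np.random.default_rng(0)
n = 2000
x = rng.standard_normal(n)
y_star = 0.8 * x + rng.standard_normal(n)   # many distinct values
y = (y_star > 0).astype(float)              # what a yes/no measure records

b = fit_probit(x, y)
p_hat = norm_cdf(b[0] + b[1] * x)           # probability variable

print("distinct values in y:    ", len(np.unique(y)))
print("distinct values in p_hat:", len(np.unique(p_hat)))
print("estimated coefficients:  ", b)
```

The dichotomous `y` takes only two values, while the fitted probabilities `p_hat` vary with each value of `x`, which is the property the abstract says an endogenous variable is expected to have.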

Best regards to all!

4 comments:

Zinnia said...

Dear Professor Kock,

I recently discovered WarpPLS while searching for a method to analyze binary variables in my model. Thank you for providing such a promising tool.
As this is my first time using it, I want to ensure that my model is compatible with WarpPLS.

My structural equation model is as follows:
DX (exogenous variable) → BI (mediation variable 1) → IP (dependent variable)
    ↘                               ↓                       ↗
        ↘               PI (mediation variable 2)       ↗


DX: 5 measured variables - dx1, dx2, dx3, dx4, dx5 (each measured on a 5-point Likert scale)
BI: 7 measured variables - bi1, bi2, bi3, bi4, bi5, bi6, bi7 (each yes/no binary)
PI: 2 measured variables - pi1, pi2 (each yes/no binary)
IP: 3 measured variables - ip1, ip2, ip3 (ip1 is a ratio scale, and ip2 & ip3 are yes/no binary)


My questions are:
1. Is it possible to use WarpPLS to analyze the structural equation model described above, in which the endogenous variable BI has 7 measured variables (each yes/no binary) and the endogenous variable PI has 2 measured variables (each yes/no binary)?
2. Can WarpPLS analyze a dependent variable that has 3 measured variables of different natures? (ip1 is a ratio scale, and ip2 and ip3 are binary variables)
3. If my model can be analyzed with WarpPLS, can I also test the mediating effects of BI and PI?

I'm currently living in South Korea. I would like to incorporate WarpPLS into my research to promote its usage in my country.
Thank you so much for your assistance!

Best regards,
Zinnia

Ned Kock said...

Hi Zinnia. Yes, I believe so. You may want to review the materials here:

warppls.com

Zinnia said...

Dear Professor Kock,

I have a question about the logistic transformation of endogenous variables with multiple dichotomous measures.

For example, consider a model like A -> B -> C :
- Measured variables B1 and B2 in B (latent variable) are both dichotomous (0, 1)
- Measured variables C1, C2, and C3 in C (latent variable) are all dichotomous (0, 1)

If I create B and C as new logistic regression variables in the above model, can I simply enter the latent variables B and C under the "Variables to be converted" section of the "Explore logistic regression" window? Or do I need to create two new variables (lr_B1, lr_B2) for B and three new variables (lr_C1, lr_C2, lr_C3) for C, using the measured variables B1, B2, C1, C2, and C3, respectively, and assign them as new measured variables for the latent variables B and C?

Thanks, as always, for your help.

Best regards,
Zinnia

Ned Kock said...

What I recommend is that you use the original indicators for B and C to generate LVs using one of the factor-based algorithms (e.g., REG2). Next, save B and C as new LVs, and use these in a new model. Then obtain an lr_C, and use it in a third model. Here you should use logit, because C will not be dichotomous (although it will have fewer distinct values than it should). I don't recommend doing the same with an lr_B, because you would then have a model with probabilities causing probabilities, unless theory tells you, for some reason, to have such a relationship in the model.
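The steps in this reply could be sketched outside WarpPLS roughly as follows. This is only an illustration under loud assumptions: the mean of the 0/1 indicators stands in for a factor-based LV score (it is not REG2), the logit step is done as a fractional logit fitted by IRLS (the thread does not say how WarpPLS computes lr_C internally), and the simulated data and the names `composite` and `fit_fractional_logit` are hypothetical.

```python
import numpy as np

def composite(indicators):
    """Stand-in LV score: mean of the 0/1 indicators (NOT WarpPLS's REG2)."""
    return indicators.mean(axis=1)

def fit_fractional_logit(x, y, n_iter=25):
    """Fit E[y] = 1 / (1 + exp(-(b0 + b1 * x))) by IRLS; y may be any value in [0, 1]."""
    X = np.column_stack([np.ones_like(x), x])
    coef = np.zeros(2)
    for _ in range(n_iter):
        eta = X @ coef
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)                # IRLS weights for the logit link
        z = eta + (y - p) / w            # working response
        coef = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
    return coef

# Hypothetical data for a model A -> B -> C with dichotomous indicators.
rng = np.random.default_rng(1)
n = 1000
a = rng.standard_normal(n)
b_latent = 0.6 * a + rng.standard_normal(n)
c_latent = 0.6 * b_latent + rng.standard_normal(n)
B_ind = (b_latent[:, None] + rng.standard_normal((n, 2)) > 0).astype(float)
C_ind = (c_latent[:, None] + rng.standard_normal((n, 3)) > 0).astype(float)

# Step 1: generate LV scores from the original indicators and save them.
B = composite(B_ind)                     # takes values in {0, 0.5, 1}
C = composite(C_ind)                     # takes values in {0, 1/3, 2/3, 1}

# Step 2: C is not dichotomous, so model it with logit and keep the
# predicted values as a new variable (called lr_C here).
coef = fit_fractional_logit(B, C)
lr_C = 1.0 / (1.0 + np.exp(-(coef[0] + coef[1] * B)))

print("distinct values in C:   ", len(np.unique(C)))
print("distinct values in lr_C:", len(np.unique(lr_C)))
print("logit coefficients:     ", coef)
```

With this simple stand-in score, C has at most four distinct values, in line with the remark that it will have fewer distinct values than it should, and lr_C has only as many distinct values as its predictor B. Scores from an actual factor-based algorithm may behave differently.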