Preface: I wrote this tutorial based on my own experience and it should be considered a work in progress. Also, it requires some familiarity with basic R and lavaan. If you are reading this and are an expert in ESEM, please send me an e-mail if you discover anything that needs improvement or is utterly wrong. Of course, feel free to reach out if you have any questions.

With the huge feature set of R, I was surprised that there is no satisfactory solution for ESEM yet. This method has so much potential, but it is barely used. One reason for this might be that you can only do it in MPlus. Now MPLus is certainly a great software for statistical purposes, but it is expensive.

Luckily, I found a conference paper on how to approximate an ESEM model in R. It consists of the following steps:

  1. Estimating factor loadings based on Exploratory Factor Analysis. In the paper, they use the factanal() command from base R, which does a maximum-likelihood EFA on a correlation matrix.
  2. “Study the rotation”. In the paper, they rotate the loading matrix manually, but don’t really give any details on their approach. I decided to rely on the methods provided by the psych library in their fa() command.
  3. Specify a SEM model based on EFA results. In the original, they use the sem library but I am more used to lavaan, so I will use that.

In this tutorial I compare the results Marsh, Nagenast and Morin (2013) reported in their ESEM of a larger British Big Five dataset with my method. This way I can show that this method works and that its results approximate those of MPlus.

The dataset used is from Wave 15 of the British Household Panel Survey. I obtained it from the UK data service, cleaned it up, recoded the reverse-scored items and loaded it into the object b5.

First, we perform a CFA and look at the fit measures and loadings. All observed variables are skewed and ordinal, so we will use a robust ML estimator (estimator = MLR in lavaan).

b5.model <- "
A =~ A1+A2+A3
C =~ C1+C2+C3
E =~ E1+E2+E3
N =~ N1+N2+N3
O =~ O1+O2+O3
" <- lavaan::cfa(b5.model, data = b5, estimator = "MLR")
# Marsh et al. (2013) fit measures:
# CFI: .761, TLI: .687, RMSEA: .076
# Our fit measures:
fitmeasures(, c("cfi.robust","tli.robust","rmsea.robust","srmr"))
##   cfi.robust   tli.robust rmsea.robust         srmr 
##        0.789        0.722        0.094        0.091

As we can see, the fit measures are far from being adequate. This is because traditional CFA is very restrictive: it is assumed that all items only load on one factor. Everything else is considered “bad fit”. Normally, we would now look at modification indices and loosen the cross-loading constraints for some items, greatly increasing our chance of aggravating reviewers in personality psych journals.

So let’s try ESEM! First, we conduct an exploratory FA on the dataset. We will assume five personality factors and choose an oblimin geomin rotation because that would be the default in MPlus. Very small loadings (less than 1^-7) will be reduced to zero. Then, we will extract lavaan-compatible equations from the loading matrix.

In the Marsh et al. (2013) paper, “correlated uniqueness” (CUs) parameters were added to the model. This was done to capture method factors resulting from negatively coded items. We try to mimic this approach by adding covariance formulas for those items.

b5.efa <- fa(b5[,1:15], nfact = 5, rotate = "geominQ", fm = "ml")
## Lade nötigen Namensraum: GPArotation

b5.loadmat <- zapsmall(matrix(round(b5.efa$loadings, 2), nrow = 15, ncol = 5))
rownames(b5.loadmat) <- colnames(b5[,1:15])

# This will turn the loading matrix into a lavaan-compatible equation set. 

terms <- vector()
for (i in 1:5) {
  terms[i] <-
    paste0("F",i,"=~ ", paste0(c(b5.loadmat[,i]), "*", names(b5.loadmat[,1]), collapse = "+"))

# "Correlated uniqueness"
terms[6] <- "A1 ~~ C2+E3+N3\n C2 ~~ E3+N3\n E3 ~~ N3"

b5.esem <- paste(terms, collapse = "\n")
## [1] "F1=~ -0.24*A1+0.11*C1+0.03*E1+-0.03*N1+0.67*O1+0.13*A2+-0.29*C2+0.13*E2+0.04*N2+0.57*O2+-0.01*A3+0*C3+-0.16*E3+-0.25*N3+0.62*O3\nF2=~ -0.17*A1+0.06*C1+0.2*E1+0.76*N1+-0.06*O1+0.03*A2+-0.19*C2+-0.03*E2+0.65*N2+0.05*O2+0.03*A3+-0.05*C3+-0.09*E3+0.53*N3+-0.07*O3\nF3=~ 0.01*A1+0.49*C1+0.05*E1+0.1*N1+0.07*O1+-0.01*A2+0.32*C2+0.01*E2+0*N2+-0.07*O2+0.27*A3+0.89*C3+-0.23*E3+-0.1*N3+0.04*O3\nF4=~ -0.06*A1+0.15*C1+0.72*E1+0.01*N1+0.06*O1+0.06*A2+0.03*C2+0.56*E2+-0.15*N2+-0.01*O2+0.05*A3+-0.03*C3+0.47*E3+0.01*N3+0.02*O3\nF5=~ 0.41*A1+-0.08*C1+0.07*E1+0.01*N1+-0.09*O1+0.51*A2+0.06*C2+0.21*E2+0.1*N2+0.07*O2+0.63*A3+0.03*C3+-0.1*E3+-0.16*N3+0.06*O3\nA1 ~~ C2+E3+N3\n C2 ~~ E3+N3\n E3 ~~ N3"

The EFA plot shows that there is a clear factor structure. Because we derived the lavaan syntax from the EFA results, we can use them for a CFA. This way, we can put constraints on the model. In this case, we will constrain intercepts and loadings to be equal for male and female participants. Of course, any other group will work.

b5.cfa2 <- lavaan::cfa(b5.esem, data = b5, verbose = F, estimator = "MLR")

Marsh et al. (2013) fit measures: CFI = .958, TLI = .948, RMSEA = .031. Our fit measures:

fitmeasures(b5.cfa2, c("cfi.robust","tli.robust","rmsea.robust","srmr"))
##   cfi.robust   tli.robust rmsea.robust         srmr 
##        0.974        0.968        0.032        0.028

That’s pretty good fit. Now let’s compare our results to those by Marsh et al. The following table shows the factor loadings from our ESEM model. The numbers in brackets are the loadings reported by Marsh et al. 

Item A C E N O
A1R .41 (.40) .01 -.06 -.17 -.24
A2 .51 (.47) -.01 .06 .03 .13
A3 .63 (.69) .27 .05 .03 -.01
C1 -.08 .49 (.51) .15 .06 .11
C2R .06 .32 (.34) .03 -.19 -.29
C3 .03 .89 (.72) -.03 -.05 .00
E1 .07 .05 .72 (.74) .20 .03
E2 .21 .01 .56 (.56) -.03 .13
E3R -.10 -.23 .47 (.39) -.09 -.16
N1 .01 .10 .01 .76 (.75) -.03
N2 .10 .00 -.15 .65 (.68) .04
N3R -.16 -.10 .01 .53 (.51) -.25
O1 -.09 .07 .06 -.06 .67 (.65)
O2 .07 -.07 -.01 .05 .57 (.56)
O3 .06 .04 .02 -.07 .62 (.55)

What about the factor correlations? Again, the numbers in brackets are the values reported by Marsh et al.

Factor A C E N O
A -
C .46 (.41) -
E .14 (.15) .27 (.19) -
N .05 (-.01) -.02 (-.05) -.19 (-.07) -
O .27 (.16) .39 (.25) .43 (.31) .06 (.01) -

Although we couldn’t reproduce exactly the same loadings, we’re pretty close. The differences between the two approaches can be due to many reasons. The extraction and rotation methods in explorative factor analysis may differ. I haven’t been able to figure out which method is used by default in MPlus. I also lack the possibility to specify the epsilon in the Geomin rotation. However, I suspect that this is how the differences in correlations come about. Furthermore, the robust ML estimators in lavaan are probably not equivalent to those offered by MPlus, so there may be differences in factor loadings.

Finally, I will briefly show you how to check the measurement invariance of the factor model. You can do this by specifying a group factor (in our case gender) and fitting a series of models with increasingly strict invariance constraints:

# No equality constraints (configural invariance)
  model = b5.esem,
  data = b5,
  group = "Gender",
  estimator = "MLR"), c("cfi.robust","tli.robust","rmsea.robust","srmr"))
##   cfi.robust   tli.robust rmsea.robust         srmr 
##        0.967        0.959        0.036        0.030
# Force equal loadings (metric invariance)
  model = b5.esem,
  data = b5,
  group = "Gender",
  group.equal = c("loadings"),
  estimator = "MLR"), c("cfi.robust","tli.robust","rmsea.robust","srmr"))
##   cfi.robust   tli.robust rmsea.robust         srmr 
##        0.967        0.959        0.036        0.030
# Force equal loadings and intercepts (scalar invariance)
  model = b5.esem,
  data = b5,
  group = "Gender",
  group.equal = c("loadings","intercepts"),
  estimator = "MLR"), c("cfi.robust","tli.robust","rmsea.robust","srmr"))
##   cfi.robust   tli.robust rmsea.robust         srmr 
##        0.960        0.952        0.039        0.032

As we can see, the model fit does not decrease substantially. The proposed cut-off values for rejecting measurement invariance are .01 for the CFI and .015 for the RMSEA. In this case, CFI and TLI only drop by .007 and the RMSEA increases by .003.

Some literature on ESEM in Psychology

Asparouhov, T., & Muthén, B. (2009). Exploratory Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 16(3), 397–438.

Herrmann, A., & Pfister, H.-R. (2013). Simple measures and complex structures: Is it worth employing a more complex model of personality in Big Five inventories? Journal of Research in Personality, 47(5), 599–608.

Guàrdia-Olmos, J., Peró-Cebollero, M., Benítez-Borrego, S., & Fox, J. (2013). Using SEM Library in R software to Analyze Exploratory Structural Equation Models. In Proceedings of the 59th ISI World Statistics Congress (pp. 4600–4605). Hong Kong. Retrieved from

Marsh, H. W., Morin, A. J. S., Parker, P. D., & Kaur, G. (2014). Exploratory Structural Equation Modeling: An Integration of the Best Features of Exploratory and Confirmatory Factor Analysis. Annual Review of Clinical Psychology, 10(1), 85–110.

Marsh, H. W., Muthén, B., Asparouhov, T., Lüdtke, O., Robitzsch, A., Morin, A. J. S., & Trautwein, U. (2009). Exploratory Structural Equation Modeling, Integrating CFA and EFA: Application to Students’ Evaluations of University Teaching. Structural Equation Modeling: A Multidisciplinary Journal, 16(3), 439–476.

Marsh, H. W., Nagengast, B., & Morin, A. J. S. (2013). Measurement invariance of big-five factors over the life span: ESEM tests of gender, age, plasticity, maturity, and la dolce vita effects. Developmental Psychology, 49(6), 1194–1218.