Structural equation model (SEM) trees combine SEM with decision-tree style recursive partitioning. Starting from a single template SEM, semtree repeatedly searches for subgroups in which the model parameters differ and yields a tree of locally homogeneous SEMs.

semtree(
  model,
  data = NULL,
  control = NULL,
  constraints = NULL,
  predictors = NULL,
  ...
)

Arguments

model

A template model specification from OpenMx using the mxModel function, or a lavaan model using the lavaan function (with option fit=FALSE). The model must be syntactically correct within the chosen framework and converge to a solution.

data

The data.frame used to create the model with mxModel or lavaan is input here. The order of modeled variables and predictors is not important when providing a dataset to semtree.

control

Tree-growing options created with semtree.control are input here. Any changes from the default settings can be specified here.

constraints

A semtree.constraints object that constrains model parameters from the beginning of the semtree computation. This includes options to impose equality constraints globally or locally and to specify focus parameters (i.e., parameter subsets that exclusively enter the function evaluating splits). Options for measurement invariance testing in trees are also included.

predictors

A vector of variable names matching column names in the dataset. If NULL (the default), all variables in the dataset that are not part of the model are treated as potential predictors. This argument can be used to restrict splitting to a subset of the unmodeled variables.

...

Optional arguments passed to the tree growing function.
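
For orientation, a minimal call might look like the following sketch. The data.frame dat, the indicators x1-x3, and the covariates age and group are hypothetical placeholders; the template is specified here with lavaan's cfa shorthand.

library(lavaan)
library(semtree)

## one-factor CFA template, fitted once to the full (hypothetical) dataset
template <- cfa("f =~ x1 + x2 + x3", data = dat)

tree <- semtree(model = template,
                data = dat,
                control = semtree.control())  # default tree-growing settings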

Value

A semtree object that can be examined with summary, plot, and print.
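
Continuing the sketch from the Arguments section, the returned object can be inspected with the usual generics:

summary(tree)  # overview of the fitted tree and its nodes
plot(tree)     # tree diagram
print(tree)    # textual representation of the splits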

Details

Calling semtree with an mxModel or lavaan model fits the template to the full dataset and then recurses over the following steps until no further meaningful partition is found:

  1. Fit the model on the current node's data (respecting any predefined constraints) and compute the baseline fit.

  2. For each predictor, generate candidate split points (or score tests) and quantify the improvement in model fit using the method chosen in semtree.control (a manual illustration of this comparison follows the list).

  3. Select the best-performing predictor/split combination and apply it when it passes the statistical threshold (alpha) and satisfies the size limitations (min.N, min.bucket, max.depth, or a custom stopping rule).

  4. Continue the procedure independently on each resulting child node so that terminal nodes (leaves) contain SEMs with subgroup-specific parameter estimates.
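
The fit comparison in step 2 can be illustrated by hand: under the likelihood-based ("naive") strategy it essentially asks how much the log-likelihood improves when the two candidate child nodes receive their own parameter estimates. The sketch below redoes this comparison with lavaan directly; the data.frame dat, the indicators x1-x3, and the binary covariate group are hypothetical placeholders, not semtree internals.

library(lavaan)

model  <- "f =~ x1 + x2 + x3"
parent <- cfa(model, data = dat)                        # model in the current node
left   <- cfa(model, data = subset(dat, group == "A"))  # candidate child nodes
right  <- cfa(model, data = subset(dat, group == "B"))

## likelihood-ratio statistic of the split: improvement from estimating
## the parameters separately in the two subgroups
lr <- -2 * (as.numeric(logLik(parent)) -
            (as.numeric(logLik(left)) + as.numeric(logLik(right))))
lr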

Predictors can be categorical (ordered or unordered) or continuous. When using unordered categorical predictors with many levels, the number of candidate partitions grows quickly, so limiting the predictor set can reduce computation and the number of multiple comparisons.
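
In practice, categorical candidates are coded as factors (unordered) or ordered factors before the call, and the predictors argument restricts the search to a named subset; the variable names below are hypothetical.

dat$agegroup <- ordered(dat$agegroup)  # ordered categorical predictor
dat$training <- factor(dat$training)   # unordered categorical predictor

tree <- semtree(model = template, data = dat,
                predictors = c("agegroup", "training", "income"))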

Splitting quality can be evaluated with three built-in strategies:

1. "naive" selection compares all possible split values across all predictors and chooses the best overall improvement.

2. "fair" selection uses a two-step procedure at each node: a first phase on half the sample identifies the best split value per predictor, and a second phase on the remaining data picks the most promising predictor among those candidates.

3. "score" relies on score-based statistics that provide faster evaluations while retaining favorable statistical properties for detecting parameter instabilities.

All other parameters controlling the tree growing process are available through a separate semtree.control object.
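
For example, the evaluation strategy and the stopping rules mentioned above can be set in one place; this sketch only uses options named on this page.

ctrl <- semtree.control(method    = "score",  # "naive", "fair", or "score"
                        alpha     = 0.01,     # significance threshold for accepting a split
                        min.N     = 50,       # minimum node size required to attempt a split
                        max.depth = 3)        # maximum tree depth

tree <- semtree(model = template, data = dat, control = ctrl)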

References

Brandmaier, A. M., von Oertzen, T., McArdle, J. J., & Lindenberger, U. (2013). Structural equation model trees. Psychological Methods, 18(1), 71-86.

Arnold, M., Voelkle, M. C., & Brandmaier, A. M. (2021). Score-guided structural equation model trees. Frontiers in Psychology, 11, Article 564403. https://doi.org/10.3389/fpsyg.2020.564403

See also

semtree.control, semtree.constraints

Author

Andreas M. Brandmaier, John J. Prindle, Manuel Arnold