Structural equation model (SEM) trees combine SEM with decision-tree-style recursive partitioning. Starting from a single template SEM fitted to the complete data set, semtree recursively searches for covariate-based subgroups in which the model parameters differ most.

semtree(
  model,
  data = NULL,
  control = NULL,
  constraints = NULL,
  predictors = NULL,
  ...
)

Arguments

model

A template model specification from OpenMx using the mxModel function, or a lavaan model using the lavaan function (with option do.fit=FALSE). The model must be syntactically correct within the chosen framework and converge to a solution.

data

A data.frame used for building the tree. Order of modeled variables and predictors is not important when providing a dataset to semtree.

control

A semtree.control object holding the tree-growing settings; any changes from the defaults returned by semtree.control can be specified here.

constraints

A semtree.constraints object that constrains model parameters from the start of the semtree computation. This includes options to set equality constraints globally or locally and to specify focus parameters (i.e., parameter subsets that exclusively enter the function evaluating splits), as well as options for measurement invariance testing in trees; a brief sketch follows the argument list below.

predictors

A vector of variable names matching variable names in the data set. If NULL (default), all variables in the data set that are not part of the model are treated as potential predictors. Providing this argument restricts the search to the named subset of unmodeled variables.

...

Optional arguments passed to the tree growing function.
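
As a brief illustration of the constraints argument, the following is a minimal sketch. The argument names global.invariance and focus.parameters follow semtree.constraints, while the parameter labels, template_model, and mydata are hypothetical placeholders.

cons <- semtree.constraints(
  global.invariance = c("load2", "load3"),    # hypothetical labels of parameters held equal in all nodes
  focus.parameters  = c("mean_i", "mean_s")   # hypothetical labels of parameters that drive split evaluation
)
tree <- semtree(model = template_model, data = mydata, constraints = cons)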

Value

A semtree object, which can be examined with summary, plot, and print.

Details

Calling semtree with an mxModel or lavaan model fits the template to the provided data set and then recurses over the following steps until no further meaningful partition into subgroups is found (a minimal code sketch follows the list):

  1. Fit the model on the current node's data and compute the model fit.

  2. For each predictor, generate candidate split points (or score tests) and estimate the improvement in model fit using the chosen method in semtree.control.

  3. Select the best-performing predictor/split combination and apply it when it passes the statistical threshold (alpha) and satisfies the size limitations (min.N, min.bucket, max.depth, or a custom stopping rule).

  4. Continue the procedure independently on each resulting child node so that terminal nodes (leaves) contain SEMs with subgroup-specific parameter estimates.
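
A minimal sketch of this workflow with a lavaan template is shown below; mydata and the indicators x1-x4 are hypothetical placeholders, and lavaan::cfa is used here as a convenience wrapper around lavaan with do.fit = FALSE.

library(lavaan)
library(semtree)

template <- "f =~ x1 + x2 + x3 + x4"                  # one-factor template model
fit <- cfa(template, data = mydata, do.fit = FALSE)   # set up the model without estimating it

tree <- semtree(model = fit, data = mydata)           # grow the SEM tree over all unmodeled covariates
summary(tree)                                         # subgroup-specific parameter estimates per leaf
plot(tree)                                            # tree diagram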

Predictors can be categorical (ordered or unordered) or continuous. When using unordered categorical predictors with many levels, the number of candidate partitions grows quickly, so limiting the predictor set can reduce computation and the number of multiple comparisons.
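
For example, the search can be restricted to a hand-picked covariate set via the predictors argument; this sketch reuses fit and mydata from above, and the covariate names are hypothetical.

tree <- semtree(model = fit, data = mydata,
                predictors = c("age", "education", "group"))  # only these covariates are considered for splits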

Splitting quality can be evaluated with three built-in strategies:

1. "naive" selection compares all possible split values across all predictors and chooses the best overall improvement.

2. "fair" selection uses a two-step procedure at each node: a first phase on half the sample identifies the best split value per predictor, and a second phase on the remaining data picks the most promising predictor among those candidates.

3. "score" relies on score-based statistics that provide faster evaluations while retaining favorable statistical properties for detecting parameter instabilities.

All other parameters controlling the tree growing process are adjusted in the semtree.control object.
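
A sketch of adjusting these settings via semtree.control follows; the values shown are illustrative rather than defaults, and fit and mydata are the placeholders from the sketch above.

ctrl <- semtree.control(
  method     = "score",   # split evaluation strategy: "naive", "fair", or "score"
  alpha      = 0.01,      # significance threshold a split must pass
  min.bucket = 50,        # minimum number of cases per terminal node
  max.depth  = 3          # maximum depth of the tree
)
tree <- semtree(model = fit, data = mydata, control = ctrl)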

References

Brandmaier, A. M., von Oertzen, T., McArdle, J. J., & Lindenberger, U. (2013). Structural equation model trees. Psychological Methods, 18(1), 71-86.

Arnold, M., Voelkle, M. C., & Brandmaier, A. M. (2021). Score-guided structural equation model trees. Frontiers in Psychology, 11, Article 564403. https://doi.org/10.3389/fpsyg.2020.564403

See also

semtree.control, semtree.constraints

Author

Andreas M. Brandmaier, John J. Prindle, Manuel Arnold