Structural equation model (SEM) trees combine SEM with decision-tree style
recursive partitioning. Starting from a single template SEM, semtree
repeatedly searches for subgroups in which the model parameters differ and
yields a tree of locally homogeneous SEMs.
semtree(
model,
data = NULL,
control = NULL,
constraints = NULL,
predictors = NULL,
...
)
model: A template model specification from OpenMx using the mxModel
function (or a lavaan model using the lavaan function with option
fit=FALSE). The model must be syntactically correct within the chosen
framework and must converge to a solution.
data: The data.frame that was used to create the model with mxModel or
lavaan. The order of modeled variables and predictors does not matter
when providing a dataset to semtree.
control: A control object created with semtree.control. Any changes from
the default settings can be specified here.
constraints: A semtree.constraints object that marks model parameters as
constrained from the beginning of the semtree computation. This includes
options to set equality constraints globally or locally and to specify
focus parameters (i.e., parameter subsets that exclusively go into the
function evaluating splits). It also includes options for measurement
invariance testing in trees.
predictors: A vector of variable names matching variable names in the
dataset. If NULL (default), all variables that are in the dataset and not
part of the model are potential predictors. Use this optional argument to
select a subset of the unmodeled variables as predictors in the semtree
function.
...: Optional arguments passed to the tree growing function.
Returns a semtree object, which can be examined with summary, plot, and
print.
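A minimal sketch of a call, assuming the OpenMx and semtree packages are installed. The simulated data, the variable names (noise, grp), and the object names (toy_data, template) are hypothetical and chosen only for illustration. The template is a one-variable model with a free mean and variance.

```r
# Hypothetical toy data: one modeled variable and one candidate predictor.
set.seed(123)
n <- 400
grp <- factor(sample(c("a", "b"), n, replace = TRUE))
noise <- rnorm(n, mean = ifelse(grp == "a", 0, 1), sd = 1)
toy_data <- data.frame(noise = noise, grp = grp)

# The tree is only grown when both packages are available.
if (requireNamespace("OpenMx", quietly = TRUE) &&
    requireNamespace("semtree", quietly = TRUE)) {
  library(OpenMx)
  library(semtree)

  # Template SEM: free mean ("mu") and free variance ("sigma2") of 'noise'.
  template <- mxModel(
    "Simple Model",
    type = "RAM",
    manifestVars = "noise",
    mxPath(from = "one", to = "noise", arrows = 1,
           free = TRUE, values = 0, labels = "mu"),
    mxPath(from = "noise", to = "noise", arrows = 2,
           free = TRUE, values = 1, labels = "sigma2"),
    mxData(toy_data[, "noise", drop = FALSE], type = "raw")
  )

  # 'grp' is not part of the model, so it is treated as a potential predictor.
  tree <- semtree(model = template, data = toy_data)
  print(tree)
  summary(tree)
}
```

With predictors left at NULL, every column of toy_data that is not a modeled variable is considered for splitting.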
Calling semtree with an mxModel or
lavaan model fits the template to the full dataset and
then recurses over the following steps until no further meaningful partition
is found:
1. Fit the model on the current node's data (respecting any predefined
constraints) and compute the baseline fit.
2. For each predictor, generate candidate split points (or score tests)
and quantify the improvement in model fit using the method chosen in
semtree.control.
3. Select the best-performing predictor/split combination and apply it
if it passes the statistical threshold (alpha) and satisfies the
size limitations (min.N, min.bucket, max.depth, or a
custom stopping rule).
4. Continue the procedure independently on each resulting child node so
that terminal nodes (leaves) contain SEMs with subgroup-specific parameter
estimates.
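The split search at a single node can be illustrated with a toy example in base R. This is a sketch under simplifying assumptions, not the semtree implementation: the "SEM" is replaced by a univariate normal model with free mean and variance, the improvement in fit is a likelihood ratio between the unsplit and split data, and the functions neg2ll and best_split are hypothetical helpers. The p-value shown is a plain chi-square reference and is not corrected for having searched over many cut points.

```r
# -2 log-likelihood of a normal model evaluated at its ML estimates.
neg2ll <- function(y) {
  s2 <- mean((y - mean(y))^2)          # ML estimate of the variance
  length(y) * (log(2 * pi * s2) + 1)
}

# Exhaustive search over cut points of one continuous predictor x.
best_split <- function(y, x, min_bucket = 20) {
  vals <- sort(unique(x))
  cuts <- (head(vals, -1) + tail(vals, -1)) / 2   # midpoints as candidates
  base <- neg2ll(y)                               # baseline fit of the node
  lr <- vapply(cuts, function(cp) {
    left <- y[x < cp]
    right <- y[x >= cp]
    # Skip candidates that would create a child node that is too small.
    if (min(length(left), length(right)) < min_bucket) return(NA_real_)
    base - (neg2ll(left) + neg2ll(right))         # likelihood ratio
  }, numeric(1))
  if (all(is.na(lr))) return(NULL)
  i <- which.max(lr)
  # Two free parameters per node, so the split adds 2 degrees of freedom.
  list(cut = cuts[i], lr = lr[i],
       p = pchisq(lr[i], df = 2, lower.tail = FALSE))
}

# Simulated node data with a mean shift at x = 0.5.
set.seed(1)
x <- runif(300)
y <- rnorm(300, mean = ifelse(x > 0.5, 1.5, 0))
res <- best_split(y, x)
str(res)
```

In a tree, the same search would be repeated for every predictor, the best candidate compared against alpha, and the procedure applied again to each child node.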
Predictors can be categorical (ordered or unordered) or continuous. When using unordered categorical predictors with many levels, the number of candidate partitions grows quickly, so limiting the predictor set can reduce computation and the number of multiple comparisons.
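To make the growth concrete, the following base R sketch counts candidate partitions under the assumption that each split is binary: an ordered predictor with k levels allows k - 1 cut points, while an unordered one allows 2^(k - 1) - 1 distinct two-group partitions. The function name n_binary_splits is hypothetical.

```r
# Number of distinct binary splits for a categorical predictor with k levels.
n_binary_splits <- function(k, ordered = FALSE) {
  if (ordered) k - 1 else 2^(k - 1) - 1
}

levels_k <- c(2, 4, 8, 12)
data.frame(
  levels    = levels_k,
  ordered   = n_binary_splits(levels_k, ordered = TRUE),
  unordered = n_binary_splits(levels_k)
)
```

An unordered predictor with 12 levels already yields 2047 candidate partitions, compared with 11 for an ordered predictor with the same number of levels.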
Splitting quality can be evaluated with three built-in strategies:
1. "naive" selection compares all possible split values across all predictors and chooses the best overall improvement.
2. "fair" selection uses a two-step procedure at each node: a first phase on half the sample identifies the best split value per predictor, and a second phase on the remaining data picks the most promising predictor among those candidates.
3. "score" relies on score-based statistics that provide faster evaluations while retaining favorable statistical properties for detecting parameter instabilities.
All other parameters controlling the tree growing process are available
through a separate semtree.control object.
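As a sketch of how these settings might be combined, the example below builds a control object that selects the score-based method and sets the stopping criteria named above. It assumes the semtree package is installed. The specific values are arbitrary illustrations, not recommendations, and my_model and my_data in the commented call are placeholders.

```r
# Pick one of the three strategies described above.
chosen_method <- match.arg("score", c("naive", "fair", "score"))

if (requireNamespace("semtree", quietly = TRUE)) {
  library(semtree)

  ctrl <- semtree.control(
    method     = chosen_method,  # split evaluation strategy
    alpha      = 0.05,           # significance threshold for accepting a split
    min.bucket = 50,             # minimum number of cases in a child node
    max.depth  = 3               # maximum depth of the tree
  )
  print(ctrl)

  # The control object is then passed to semtree, for example:
  # tree <- semtree(model = my_model, data = my_data, control = ctrl)
}
```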
Brandmaier, A.M., Oertzen, T. v., McArdle, J.J., & Lindenberger, U. (2013). Structural equation model trees. Psychological Methods, 18(1), 71-86.
Arnold, M., Voelkle, M. C., & Brandmaier, A. M. (2021). Score-guided structural equation model trees. Frontiers in Psychology, 11, Article 564403. https://doi.org/10.3389/fpsyg.2020.564403
semtree.control, summary.semtree,
parameters, se, prune.semtree,
subtree, OpenMx,
lavaan