PrInDT: Prediction and Interpretation in Decision Trees for
Classification and Regression
Optimization of conditional inference trees from the package 'party'
for classification and regression.
For optimization, the model space is searched by means of repeated subsampling for the tree
that performs best on the full sample. Restrictions can be imposed so that only trees without
pre-specified uninterpretable split results are accepted (cf. Weihs & Buschfeld, 2021a).
The function PrInDT() represents the basic resampling loop for 2-class classification (cf. Weihs
& Buschfeld, 2021a). The function RePrInDT() (repeated PrInDT()) allows for repeated
applications of PrInDT() for different percentages of the observations of the large and the
small classes (cf. Weihs & Buschfeld, 2021c). The function NesPrInDT() (nested PrInDT())
allows for an extra layer of subsampling for a specific factor variable (cf. Weihs & Buschfeld,
2021b). The functions PrInDTMulev() and PrInDTMulab() deal with multilevel and multilabel
classification. In addition to these PrInDT() variants for classification, the function
PrInDTreg() has been developed for regression problems. Finally, the function PostPrInDT()
allows for a posterior analysis of the distribution of a specified variable in the terminal
nodes of a given tree.
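The repeated-subsampling search described above can be illustrated with a small, self-contained sketch. It is not the package's implementation: it is written in Python rather than R, it uses a one-split decision stump as a stand-in for a conditional inference tree, and it assumes balanced accuracy as the criterion for "best on the full sample". All names (fit_stump, subsampling_search, frac_large, frac_small, is_interpretable) are hypothetical and chosen for illustration only.

```python
import random

def fit_stump(rows):
    """Fit a one-split tree (stand-in for a conditional inference tree)."""
    best = None
    for j in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[j]
            for left_label in (0, 1):
                # Count misclassifications of the rule "x[j] <= t -> left_label".
                errors = sum(
                    (left_label if xi[j] <= t else 1 - left_label) != yi
                    for xi, yi in rows
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left_label)
    return best[1], best[2], best[3]  # (feature index, threshold, left label)

def predict(stump, x):
    j, t, left_label = stump
    return left_label if x[j] <= t else 1 - left_label

def balanced_accuracy(stump, rows):
    """Mean of the per-class recalls (assumed evaluation criterion)."""
    recalls = []
    for label in (0, 1):
        members = [x for x, y in rows if y == label]
        recalls.append(sum(predict(stump, x) == label for x in members) / len(members))
    return sum(recalls) / 2

def subsampling_search(rows, n_rep, frac_large, frac_small, is_interpretable, seed=0):
    """Keep the admissible model that scores best on the full sample."""
    rng = random.Random(seed)
    large = [r for r in rows if r[1] == 0]
    small = [r for r in rows if r[1] == 1]
    best_stump, best_score = None, -1.0
    for _ in range(n_rep):
        # Draw a subsample from each class without replacement.
        sample = (rng.sample(large, max(1, int(frac_large * len(large))))
                  + rng.sample(small, max(1, int(frac_small * len(small)))))
        stump = fit_stump(sample)
        if not is_interpretable(stump):
            continue  # restriction: reject models with forbidden splits
        score = balanced_accuracy(stump, rows)  # evaluated on the full sample
        if score > best_score:
            best_stump, best_score = stump, score
    return best_stump, best_score

# Toy data: feature 0 separates the classes, feature 1 is noise.
data_rng = random.Random(1)
rows = [((0.5 * i / 40, data_rng.random()), 0) for i in range(40)]      # large class
rows += [((0.6 + 0.04 * i, data_rng.random()), 1) for i in range(10)]   # small class

# Hypothetical restriction: splits on feature 1 count as uninterpretable.
best_stump, best_score = subsampling_search(
    rows, n_rep=20, frac_large=0.5, frac_small=1.0,
    is_interpretable=lambda stump: stump[0] != 1,
)
```

Calling subsampling_search repeatedly with different values of frac_large and frac_small mirrors, in spirit, the variation of class percentages that RePrInDT() is described as performing.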
References are:
– Weihs, C., Buschfeld, S. (2021a) "Combining Prediction and Interpretation in
Decision Trees (PrInDT) - a Linguistic Example" <doi:10.48550/arXiv.2103.02336>;
– Weihs, C., Buschfeld, S. (2021b) "NesPrInDT: Nested undersampling in PrInDT"
<doi:10.48550/arXiv.2103.14931>;
– Weihs, C., Buschfeld, S. (2021c) "Repeated undersampling in PrInDT (RePrInDT): Variation
in undersampling and prediction, and ranking of predictors in ensembles" <doi:10.48550/arXiv.2108.05129>.
Linking:
Please use the canonical form
https://CRAN.R-project.org/package=PrInDT
to link to this page.