ctree function - RDocumentation (2024)

Description

Recursive partitioning for continuous, censored, ordered, nominal and multivariate response variables in a conditional inference framework.

Usage

ctree(formula, data, subset, weights, na.action = na.pass, offset, cluster, control = ctree_control(...), ytrafo = NULL, converged = NULL, scores = NULL, doFit = TRUE, ...)

Value

An object of class party.

Arguments

formula

a symbolic description of the model to be fit.

data

a data frame containing the variables in the model.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

weights

an optional vector of weights to be used in the fitting process. Only non-negative integer valued weights are allowed.

offset

an optional vector of offset values.

cluster

an optional factor indicating independent clusters. Highly experimental, use at your own risk.

na.action

a function which indicates what should happen when the data contain missing values.

control

a list with control parameters, see ctree_control.

ytrafo

an optional named list of functions to be applied to the response variable(s) before testing their association with the explanatory variables. Note that this transformation is only performed once for the root node and does not take weights into account. Alternatively, ytrafo can be a function of data and weights. In this case, the transformation is computed for every node with the corresponding weights. This feature is experimental and the user interface is likely to change.

converged

an optional function for checking user-defined criteria before splits are implemented. This is not to be used and very likely to change.

scores

an optional named list of scores to be attached to ordered factors.

doFit

a logical; if FALSE, the tree is not fitted.

...

arguments passed to ctree_control.

Details

Function partykit::ctree is a reimplementation of (most of) party::ctree employing the new party infrastructure of the partykit package. The vignette vignette("ctree", package = "partykit") explains internals of the different implementations.

Conditional inference trees estimate a regression relationship by binary recursive partitioning in a conditional inference framework. Roughly, the algorithm works as follows: 1) Test the global null hypothesis of independence between any of the input variables and the response (which may be multivariate as well). Stop if this hypothesis cannot be rejected. Otherwise select the input variable with the strongest association to the response. This association is measured by a p-value corresponding to a test for the partial null hypothesis of independence between a single input variable and the response. 2) Implement a binary split in the selected input variable. 3) Recursively repeat steps 1) and 2).
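The variable selection in step 1) can be illustrated with a rough base R sketch. This is not how ctree computes its tests: ordinary Pearson correlation tests (cor.test) stand in for the permutation tests, and the names p_univ, p_adj, alpha and selected are made up for the illustration.

```r
# Illustration only: cor.test() is swapped in for the conditional
# inference (permutation) tests that ctree actually uses.
airq <- subset(airquality, !is.na(Ozone))
inputs <- setdiff(names(airq), "Ozone")

# Univariate p-value for each input variable against the response
p_univ <- sapply(inputs, function(v) cor.test(airq[[v]], airq$Ozone)$p.value)

# Multiplicity adjustment for the global null hypothesis
p_adj <- p.adjust(p_univ, method = "bonferroni")

# Stop if the global null cannot be rejected; otherwise pick the
# input with the smallest univariate p-value
alpha <- 0.05
selected <- if (min(p_adj) < alpha) names(which.min(p_univ)) else NA_character_
selected
```

Step 2) would then search for the best binary split in the selected variable, and step 3) would repeat the procedure within each of the two resulting subsets.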

The implementation utilizes a unified framework for conditional inference, or permutation tests, developed by Strasser and Weber (1999). The stop criterion in step 1) is based either on multiplicity-adjusted p-values (testtype = "Bonferroni" in ctree_control) or on the univariate p-values (testtype = "Univariate"). In both cases, the criterion is maximized, i.e., 1 - p-value is used. A split is implemented when the criterion exceeds the value given by mincriterion as specified in ctree_control. For example, when mincriterion = 0.95, the p-value must be smaller than 0.05 in order to split this node. This statistical approach ensures that the right-sized tree is grown without additional (post-)pruning or cross-validation. The level of mincriterion can either be chosen to suit the size of the data set (0.95 is typically appropriate for small to moderately sized data sets) or be treated like a hyperparameter (see Section 3.4 in Hothorn, Hornik and Zeileis, 2006). The input variable to split on is selected based on the univariate p-values, which avoids a variable selection bias towards input variables with many possible cutpoints. The test statistics in each of the nodes can be extracted with the sctest method. (Note that the generic is in the strucchange package, so either this package needs to be loaded or sctest.constparty has to be called directly.) In cases where splitting stops due to the sample size (e.g., minsplit or minbucket), the test results may be empty.
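As a small sketch of how testtype and mincriterion interact, the following fits the same data under a strict and a looser stopping rule. The object names ct_strict and ct_loose and the value 0.80 are chosen only for the illustration.

```r
library("partykit")

airq <- subset(airquality, !is.na(Ozone))

# Bonferroni-adjusted p-values; split only if 1 - p-value exceeds 0.95
ct_strict <- ctree(Ozone ~ ., data = airq,
  control = ctree_control(testtype = "Bonferroni", mincriterion = 0.95))

# Unadjusted p-values and a lower threshold: splitting stops later
ct_loose <- ctree(Ozone ~ ., data = airq,
  control = ctree_control(testtype = "Univariate", mincriterion = 0.80))

# Compare the number of terminal nodes of the two trees
length(nodeids(ct_strict, terminal = TRUE))
length(nodeids(ct_loose, terminal = TRUE))
```

A lower mincriterion generally yields a larger tree, which is why it may be worth tuning when predictive performance matters more than the inferential interpretation of the splits.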

Predictions can be computed using predict, which returns predicted means, predicted classes or median predicted survival times and more information about the conditional distribution of the response, i.e., class probabilities or predicted Kaplan-Meier curves. For observations with zero weights, predictions are computed from the fitted tree when newdata = NULL.
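A brief sketch of the prediction types mentioned above, for a classification tree. The object new_obs is a hypothetical set of three rows picked for the illustration.

```r
library("partykit")

irisct <- ctree(Species ~ ., data = iris)
new_obs <- iris[c(1, 51, 101), ]

# Predicted classes
pred_class <- predict(irisct, newdata = new_obs, type = "response")

# Estimated class probabilities
pred_prob <- predict(irisct, newdata = new_obs, type = "prob")

# Identifiers of the terminal nodes the observations fall into
pred_node <- predict(irisct, newdata = new_obs, type = "node")
```

For a censored response, type = "response" gives median predicted survival times and type = "prob" gives the Kaplan-Meier curves, as described above.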

By default, the scores for each ordinal factor x are 1:length(x). This may be changed for variables in the formula using, for example, scores = list(x = c(1, 5, 6)).
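The following sketch shows the scores argument on simulated data. The data frame d, the ordered factor dose and the score values are invented for the illustration and carry no substantive meaning.

```r
library("partykit")

set.seed(1)
n <- 200
# Hypothetical ordered factor with three levels
dose <- ordered(sample(c("low", "mid", "high"), n, replace = TRUE),
  levels = c("low", "mid", "high"))
# Response whose mean jumps between "low" and "mid"
d <- data.frame(y = rnorm(n, mean = c(0, 2, 2.5)[as.integer(dose)]), dose = dose)

# Default scores
ct_default <- ctree(y ~ dose, data = d)

# User-defined scores reflecting unequal spacing of the levels
ct_scored <- ctree(y ~ dose, data = d, scores = list(dose = c(1, 5, 6)))
```

The scores enter the test statistic for the ordered factor, so unequal spacing changes how strongly each level contributes to the measured association.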

For a general description of the methodology see Hothorn, Hornik and Zeileis (2006) and Hothorn, Hornik, van de Wiel and Zeileis (2006).

References

Hothorn T, Hornik K, Van de Wiel MA, Zeileis A (2006). A Lego System for Conditional Inference. The American Statistician, 60(3), 257-263.

Hothorn T, Hornik K, Zeileis A (2006). Unbiased Recursive Partitioning: A Conditional Inference Framework. Journal of Computational and Graphical Statistics, 15(3), 651-674.

Hothorn T, Zeileis A (2015). partykit: A Modular Toolkit for Recursive Partytioning in R. Journal of Machine Learning Research, 16, 3905-3909.

Strasser H, Weber C (1999). On the Asymptotic Theory of Permutation Statistics. Mathematical Methods of Statistics, 8, 220-250.

Examples

library("partykit")

### regression
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq)
airct
plot(airct)
mean((airq$Ozone - predict(airct))^2)

### classification
irisct <- ctree(Species ~ ., data = iris)
irisct
plot(irisct)
table(predict(irisct), iris$Species)

### estimated class probabilities, a list
tr <- predict(irisct, newdata = iris[1:10,], type = "prob")

### survival analysis
if (require("TH.data") && require("survival") &&
    require("coin") && require("Formula")) {

  data("GBSG2", package = "TH.data")
  (GBSG2ct <- ctree(Surv(time, cens) ~ ., data = GBSG2))
  predict(GBSG2ct, newdata = GBSG2[1:2,], type = "response")
  plot(GBSG2ct)

  ### with weight-dependent log-rank scores
  ### log-rank trafo for observations in this node only (= weights > 0)
  h <- function(y, x, start = NULL, weights, offset, estfun = TRUE, object = FALSE, ...) {
    if (is.null(weights)) weights <- rep(1, NROW(y))
    s <- logrank_trafo(y[weights > 0,,drop = FALSE])
    r <- rep(0, length(weights))
    r[weights > 0] <- s
    list(estfun = matrix(as.double(r), ncol = 1), converged = TRUE)
  }

  ### very much the same tree
  (ctree(Surv(time, cens) ~ ., data = GBSG2, ytrafo = h))
}

### multivariate responses
airct2 <- ctree(Ozone + Temp ~ ., data = airq)
airct2
plot(airct2)
