Finding Subgroups with Conformal Trees

Usage

r2p(
  data,
  target,
  learner,
  cv_folds = 1,
  alpha = 0.1,
  gamma = 0.01,
  lambda = 0.5,
  max_groups = 5
)

Arguments

data

(data.frame)
data set for model training and uncertainty estimation.

target

(string)
name of the target variable. The target must be a numeric variable.

learner

(model_spec)
the learner for training the prediction model. See parsnip::model_spec() for details.

cv_folds

(count)
number of CV+ folds.

alpha

(proportion)
miscoverage rate; prediction intervals target a coverage of 1 - alpha.

gamma

(proportion)
regularization parameter ensuring that the reduction in the impurity of the confident homogeneity achieved by a split is sufficiently large.

lambda

(proportion)
balance parameter, quantifying the impact of the average interval length relative to the average absolute deviation.

max_groups

(count)
maximum number of subgroups.
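
To make the role of alpha concrete, the following is a minimal sketch of a plain split conformal interval in base R. It is not the CV+ procedure that r2p() runs internally; the simulated data, the lm() model, and all variable names are assumptions made purely for illustration.

```r
# Split conformal sketch in base R (illustrative; not the package's CV+ code).
set.seed(1)
n <- 2000
x <- runif(n)
y <- 2 * x + rnorm(n, sd = 0.5)
dat <- data.frame(x = x, y = y)

# Use the first half for training and the second half for calibration.
train_idx <- seq_len(n / 2)
train <- dat[train_idx, ]
calib <- dat[-train_idx, ]

fit <- lm(y ~ x, data = train)

# Absolute residuals on the calibration set serve as conformity scores.
scores <- abs(calib$y - predict(fit, newdata = calib))

alpha <- 0.1
m <- length(scores)
# Finite-sample corrected rank: the ceiling((m + 1) * (1 - alpha))-th smallest score.
k <- ceiling((m + 1) * (1 - alpha))
q_hat <- sort(scores)[k]

# Interval for new points: point prediction plus/minus q_hat.
new <- data.frame(x = c(0.2, 0.8))
pred <- predict(fit, newdata = new)
intervals <- data.frame(lower = pred - q_hat, upper = pred + q_hat)
intervals
```

A smaller alpha selects a higher-ranked score for q_hat and therefore widens the intervals, which is the trade-off the alpha argument controls.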

Value

The fitted conformal tree. Printing it lists the detected subgroups as terminal nodes.

Examples

library(tidymodels)
library(ranger)
data(bikes)
set.seed(1234)
# Initialize learner:
forest <- rand_forest() %>%
  set_mode("regression") %>%
  set_engine("ranger")
# Detect subgroups:
groups <- r2p(
  data = bikes,
  target = "count",
  learner = forest,
  cv_folds = 10,
  alpha = 0.1,
  gamma = 0.01,
  lambda = 0.5,
  max_groups = 2
)
groups
#> Conformal tree with 2 subgroups:
#> [1] root
#> |   [2] weekday in Sun, Sat: *
#> |   [3] weekday in Mon, Tue, Wed, Thu, Fri: *
#> ---
#> * terminal nodes (subgroups)