krtexas

R package for nonparametric regression via tree-guided feature aggregation

krtexas (Kernel Regression with Tree-EXploring AggregationS) is an R package implementing nonparametric regression for predictors organized in a known tree hierarchy, such as a taxonomic tree, a brain region hierarchy, or a geographical hierarchy.

Developed at the Department of Statistics and Data Science, Cornell University, with Martin T. Wells and Y. Samuel Wang.

The method, KR-TEXAS, simultaneously performs nonparametric regression and learns the correct level of feature aggregation, selecting relevant variables along the way (Manage et al., 2026). It’s built on a penalized Nadaraya–Watson estimator with adaptive weights, enabling joint model selection and aggregation in nonlinear settings.
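
For readers unfamiliar with the building block, the following is a minimal sketch of the classical (unpenalized) Nadaraya–Watson estimator with a Gaussian kernel. It is not the krtexas implementation: it has no penalty, no adaptive weights, and no tree structure. The function name `nw_predict` and its `bandwidth` argument are hypothetical and chosen only for illustration.

```r
# Classical Nadaraya-Watson estimator with a Gaussian kernel.
# Illustration only: no penalty, no adaptive weights, no tree structure.
# `nw_predict` is a hypothetical helper, not a function exported by krtexas.
nw_predict <- function(X_train, y_train, X_new, bandwidth = 1) {
  X_train <- as.matrix(X_train)
  X_new <- as.matrix(X_new)
  apply(X_new, 1, function(x0) {
    # Squared Euclidean distance from x0 to every training point
    d2 <- rowSums(sweep(X_train, 2, x0)^2)
    # Gaussian kernel weights
    w <- exp(-d2 / (2 * bandwidth^2))
    # Prediction is the kernel-weighted average of the responses
    sum(w * y_train) / sum(w)
  })
}

# Simulated one-dimensional example
set.seed(1)
X <- matrix(runif(200, -2, 2), ncol = 1)
y <- sin(X[, 1]) + rnorm(200, sd = 0.1)
grid <- matrix(c(-1, 0, 1), ncol = 1)
fit <- nw_predict(X, y, grid, bandwidth = 0.3)
print(fit)
```

Each prediction is a locally weighted average of the training responses, so the bandwidth controls how local the fit is. KR-TEXAS, as described above, adds a penalty and adaptive weights on top of this kind of estimator; those components are not reproduced in this sketch.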

The method is motivated by applications such as microbiome and genomic data analysis, where predictors naturally form a tree (e.g., species-level versus genus-level features) and fixing the resolution by hand can sacrifice interpretability, statistical efficiency, or predictive performance. KR-TEXAS instead learns the optimal resolution directly from the data.
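
To make "choosing a resolution" concrete, here is a small sketch of what aggregating features along a tree looks like when done by hand. It assumes that aggregation means summing the leaf features that share a parent, which is a common convention for count data but is an assumption here rather than something stated by the package. The toy matrix `counts` and the mapping `genus_of` are made up for illustration.

```r
# Toy abundance matrix: 4 samples (rows) by 5 species (columns).
counts <- matrix(
  c(3, 0, 2, 5, 1,
    1, 4, 0, 2, 2,
    0, 1, 1, 3, 0,
    2, 2, 4, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("sample", 1:4), paste0("sp", 1:5))
)

# Hypothetical one-level tree: each species (leaf) maps to its parent genus.
genus_of <- c(sp1 = "genusA", sp2 = "genusA", sp3 = "genusB",
              sp4 = "genusB", sp5 = "genusB")

# Sum the species columns that share a genus to get genus-level features.
# rowsum() groups rows, so transpose before and after.
genus_counts <- t(rowsum(t(counts), group = genus_of[colnames(counts)]))
print(genus_counts)
```

Fixing the analysis at the species level keeps all five columns, while fixing it at the genus level keeps only two. A tree-guided method can in principle aggregate some branches and leave others at the finer level, which is the choice KR-TEXAS is described as learning from the data.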

The package is co-authored with Y. Samuel Wang and Martin T. Wells and accompanies the manuscript "Nonparametric Regression via Tree-Guided Feature Aggregation" (Manage et al., 2026).

You can install the package directly from GitHub:

# install.packages("devtools")
devtools::install_github(
  "sithijamanage/krtexas",
  build_vignettes = TRUE
)

Source code, documentation, and a full vignette walkthrough are available on GitHub.

References

Sithija Manage, Y. Samuel Wang, and Martin T. Wells (2026). Nonparametric Regression via Tree-Guided Feature Aggregation. arXiv preprint arXiv:2605.26653.