ChiRP is an R package that implements Chinese Restaurant Process (CRP) mixture models for regression and clustering. The package currently supports continuous outcomes, zero-inflated continuous outcomes, and binary outcomes.
Install from GitHub as follows:
## install.packages('devtools') ## run this first if devtools is not installed
devtools::install_github('stablemarkets/ChiRP')
library(ChiRP)
Help documentation in R is also available. After installing the package and loading it with library(), use ? to access help documentation for specific functions:
?ChiRP::NDPMix # for continuous outcomes
?ChiRP::ZDPMix # for zero-inflated, semi-continuous outcomes
?ChiRP::PDPMix # for binary outcomes
?ChiRP::cluster_assign_mode # computes posterior mode cluster assignment
CRP models, also known as Dirichlet Process (DP) models, are a class of Bayesian nonparametric models. They provide a very flexible fit to complex data while also providing easy uncertainty estimates via posterior inference. They work by partitioning complex data into more homogeneous clusters and modeling each cluster with a locally parametric model. But don't be fooled! While they use locally parametric models, CRP models assume there are infinitely many clusters, so the parameter space is infinite-dimensional, making this a truly nonparametric method. Please see the examples page for uses.
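To make the partitioning idea concrete, here is a minimal base-R sketch of drawing cluster labels from a Chinese Restaurant Process prior. It is an illustration only and is not part of ChiRP: the function name simulate_crp and the arguments n and alpha (the concentration parameter) are hypothetical, and the package's own samplers additionally condition on the data.

```r
# Illustrative sketch, not a ChiRP function: draw cluster labels from a
# Chinese Restaurant Process prior with concentration parameter `alpha`.
simulate_crp <- function(n, alpha = 1) {
  stopifnot(n >= 1, alpha > 0)
  z <- integer(n)       # cluster label for each observation
  counts <- integer(0)  # current size of each occupied cluster
  for (i in seq_len(n)) {
    # Join existing cluster k with probability proportional to its size,
    # or open a new cluster with probability proportional to alpha.
    probs <- c(counts, alpha) / (i - 1 + alpha)
    k <- sample.int(length(probs), size = 1, prob = probs)
    if (k > length(counts)) {
      counts <- c(counts, 1L)      # a new cluster is opened
    } else {
      counts[k] <- counts[k] + 1L  # an existing cluster grows
    }
    z[i] <- k
  }
  z
}

set.seed(1)
z <- simulate_crp(n = 200, alpha = 1)
table(z)  # cluster sizes; the number of clusters is random, not fixed in advance
```

Because a new cluster can always be opened, the number of occupied clusters is not capped and tends to grow slowly with n. Larger values of alpha favor more clusters.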
The package is stored on GitHub, where you can report issues.
Package author and maintainer: Arman Oganisian. Contact via email (aoganisi@upenn.edu) or Twitter.
This package was written in conjunction with the Biometrics paper given in the first BibTeX entry below. Please cite that paper when using this package.
The software itself was published in the Journal of Open Source Software (second BibTeX entry).
@article{oganisian2020,
author = {Oganisian, Arman and Mitra, Nandita and Roy, Jason A.},
title = {A Bayesian nonparametric model for zero-inflated outcomes: Prediction, clustering, and causal estimation},
journal = {Biometrics},
year = {2020},
pages = {1-11},
keywords = {Bayesian, causal inference, clustering, Dirichlet process, healthcare costs, nonparametrics, zero inflation},
doi = {10.1111/biom.13244},
url = {https://onlinelibrary.wiley.com/doi/abs/10.1111/biom.13244},
eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1111/biom.13244}}
@article{Oganisian2019ChiRP,
journal = {Journal of Open Source Software},
doi = {10.21105/joss.01287},
issn = {2475-9066},
number = {35},
publisher = {The Open Journal},
title = {ChiRP: Chinese Restaurant Process Mixtures for Regression and Clustering},
url = {http://dx.doi.org/10.21105/joss.01287},
volume = {4},
author = {Oganisian, Arman},
pages = {1287},
date = {2019-03-26},
year = {2019},
month = {3},
day = {26},
}
ChiRP
You can contribute in two ways:
examples.Rmd. Use rmarkdown::render_site() to build the site. Submit a pull request in that same repository. The issue will be closed once updates are merged.

Thanks to Jason Roy for invaluable discussions regarding the underlying MCMC computations. Special thanks to Nick Illenberger and Carolyn Lou for designing the package hex and coming up with the creative name, ChiRP!
This work was supported in part by Grant R01GM112327 from the National Institute of General Medical Sciences.
A Bayesian Nonparametric Method for Estimating Causal Treatment Effects on Zero-Inflated Outcomes. A. Oganisian et al. 2018.
Bayesian nonparametric generative models for causal inference with missing at random covariates. Roy et al. 2018.
A Bayesian nonparametric approach to marginal structural models for point treatments and a continuous or survival outcome. Roy et al. 2016.
Dirichlet Process Mixtures of Generalized Linear Models. Hannah et al. 2011.
Markov Chain Sampling Methods for Dirichlet Process Mixture Models. Radford Neal. 2000.