Installation

ChiRP is an R package that implements Chinese Restaurant Process mixture models for regression and clustering. The package currently supports continuous outcomes, zero-inflated (semi-continuous) outcomes, and binary outcomes.

Install from GitHub as follows:

# install.packages('devtools')  # make sure devtools is installed
devtools::install_github('stablemarkets/ChiRP')
library(ChiRP)

Help documentation in R is also available. After installing the package and loading it with library(), use ? to access help documentation for specific functions:

?ChiRP::NDPMix  # for continuous outcomes
?ChiRP::ZDPMix  # for zero-inflated, semi-continuous outcomes
?ChiRP::PDPMix  # for binary outcomes
?ChiRP::cluster_assign_mode # computes posterior mode cluster assignment

What Are Chinese Restaurant Process Models?

CRP models, also known as Dirichlet Process (DP) models, are a class of Bayesian nonparametric models. They fit complex data flexibly while providing uncertainty estimates through posterior inference. They work by partitioning the data into more homogeneous clusters and modeling each cluster with a locally parametric model. But don’t be fooled! Although each cluster's model is parametric, CRP models assume there are infinitely many clusters, so the parameter space is infinite-dimensional, which makes this a truly nonparametric method. Please see the examples page for uses.
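To make the partitioning idea concrete, here is a minimal base-R sketch that draws one random partition from a CRP prior. It is an illustration only, not ChiRP's internal code: the function name `simulate_crp` and the argument `alpha` (the concentration parameter) are hypothetical names chosen for this example. Each observation joins an existing cluster with probability proportional to that cluster's current size, or opens a new cluster with probability proportional to `alpha`.

```r
# Illustrative sketch only: simulate_crp() and alpha are hypothetical names,
# not part of the ChiRP API.
simulate_crp <- function(n, alpha) {
  stopifnot(n >= 1, alpha > 0)
  assignments <- integer(n)
  counts <- integer(0)  # counts[k] = number of observations in cluster k

  for (i in seq_len(n)) {
    # Existing clusters are weighted by their size; a new cluster by alpha.
    weights <- c(counts, alpha)
    k <- sample.int(length(weights), size = 1, prob = weights)

    if (k > length(counts)) {
      counts <- c(counts, 1L)      # open a new cluster
    } else {
      counts[k] <- counts[k] + 1L  # join an existing cluster
    }
    assignments[i] <- k
  }

  list(assignments = assignments, counts = counts)
}

set.seed(1)
draw <- simulate_crp(n = 100, alpha = 1)
table(draw$assignments)  # cluster sizes for this single draw
```

The number of clusters is never fixed in advance: it is random and tends to grow slowly with `n`, and larger values of `alpha` tend to produce more clusters. In a full CRP mixture model, each cluster would additionally carry its own regression parameters.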

Contact, GitHub, and Issue Reporting

The package is stored on GitHub, where you can report issues.

Development Status:

License: MIT

Package author and maintainer: Arman Oganisian. Contact via email (aoganisi@upenn.edu) or Twitter.

How to Cite this Package

This package was written in conjunction with the Biometrics paper by Oganisian, Mitra, and Roy (2020), so please cite that paper when using the package.

The software itself was published in the Journal of Open Source Software. BibTeX entries for both are given below.

@article{oganisian2020,
author = {Oganisian, Arman and Mitra, Nandita and Roy, Jason A.},
title = {A Bayesian nonparametric model for zero-inflated outcomes: Prediction, clustering, and causal estimation},
journal = {Biometrics},
year = {2020},
pages = {1-11},
keywords = {Bayesian, causal inference, clustering, Dirichlet process, healthcare costs, nonparametrics, zero inflation},
doi = {10.1111/biom.13244},
url = {https://onlinelibrary.wiley.com/doi/abs/10.1111/biom.13244},
eprint = {https://onlinelibrary.wiley.com/doi/pdf/10.1111/biom.13244}}

@article{Oganisian2019ChiRP,
    journal = {Journal of Open Source Software},
    doi = {10.21105/joss.01287},
    issn = {2475-9066},
    number = {35},
    publisher = {The Open Journal},
    title = {ChiRP: Chinese Restaurant Process Mixtures for Regression and Clustering},
    url = {http://dx.doi.org/10.21105/joss.01287},
    volume = {4},
    author = {Oganisian, Arman},
    pages = {1287},
    date = {2019-03-26},
    year = {2019},
    month = {3},
    day = {26},
}

Contributing to ChiRP

You can contribute in two ways:

  1. Contribute to base code: First, start an issue in this repository with the proposed modification. Fork this repository, make changes/enhancements, then submit a pull request. The issue will be closed once the pull request is merged.
  2. Contribute an example: First, start an issue in the companion site’s repository. Fork the repository and add a new example to examples.Rmd. Use rmarkdown::render_site() to build the site. Submit a pull request in that same repository. The issue will be closed once updates are merged.

Acknowledgements

Thanks to Jason Roy for invaluable discussions regarding the underlying MCMC computations. Special thanks to Nick Illenberger and Carolyn Lou for designing the package hex and coming up with the creative name, ChiRP!

This work was supported in part by Grant R01GM112327 from the National Institute of General Medical Sciences.

References

A Bayesian Nonparametric Method for Estimating Causal Treatment Effects on Zero-Inflated Outcomes. A. Oganisian et al. 2018.

Bayesian nonparametric generative models for causal inference with missing at random covariates. Roy et al. 2018.

A Bayesian nonparametric approach to marginal structural models for point treatments and a continuous or survival outcome. Roy et al. 2016.

Dirichlet Process Mixtures of Generalized Linear Models. Hannah et al. 2011.

Markov Chain Sampling Methods for Dirichlet Process Mixture Models. Radford Neal. 2000.