Computation of confidence intervals for binomial proportions and for difference of binomial proportions.
[GitHub Pages] [Read the Docs]
NEW: Streamlit support! See here for an app deployed on Streamlit Community Cloud.
Run

```bash
python -m pip install diff-binom-confint
```

or install the latest version from GitHub using

```bash
python -m pip install git+https://github.com/DeepPSP/DBCI.git
```

or git clone this repository and install locally via

```bash
cd DBCI
python -m pip install .
```

To install with the `acc` extra, use

```bash
python -m pip install diff-binom-confint[acc]
```

Example usage:

```python
from diff_binom_confint import compute_difference_confidence_interval

n_positive, n_total = 84, 101
ref_positive, ref_total = 89, 105

confint = compute_difference_confidence_interval(
    n_positive,
    n_total,
    ref_positive,
    ref_total,
    conf_level=0.95,
    method="wilson",
)
```

Implemented methods for confidence intervals of a binomial proportion:

| Method (type) | Implemented |
|---|---|
| wilson | ✔️ |
| wilson-cc | ✔️ |
| wald | ✔️ |
| wald-cc | ✔️ |
| agresti-coull | ✔️ |
| jeffreys | ✔️ |
| clopper-pearson | ✔️ |
| arcsine | ✔️ |
| logit | ✔️ |
| pratt | ✔️ |
| witting | ✔️ |
| mid-p | ✔️ |
| lik | ✔️ |
| blaker | ✔️ |
| modified-wilson | ✔️ |
| modified-jeffreys | ✔️ |
Implemented methods for confidence intervals of the difference of binomial proportions:

| Method (type) | Implemented |
|---|---|
| wilson | ✔️ |
| wilson-cc | ✔️ |
| wald | ✔️ |
| wald-cc | ✔️ |
| haldane | ✔️ |
| jeffreys-perks | ✔️ |
| mee | ✔️ |
| miettinen-nurminen | ✔️ |
| true-profile | ✔️ |
| hauck-anderson | ✔️ |
| agresti-caffo | ✔️ |
| carlin-louis | ✔️ |
| brown-li | ✔️ |
| brown-li-jeffrey | ✔️ |
| miettinen-nurminen-brown-li | ✔️ |
| exact | ❌ |
| mid-p | ❌ |
| santner-snell | ❌ |
| chan-zhang | ❌ |
| agresti-min | ❌ |
| wang | ✔️ |
| pradhan-banerjee | ❌ |
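
As an illustration of how a score-type interval for the difference can be built, the sketch below combines two Wilson intervals in the way usually attributed to Newcombe (the "hybrid score" method). The names `_wilson_bounds` and `newcombe_score_difference` are hypothetical, and it is an assumption that this corresponds to the package's `wilson` method for differences; treat it as a reading aid rather than a reference implementation.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def _wilson_bounds(k: int, n: int, z: float) -> Tuple[float, float]:
    """Unclipped Wilson score bounds for k successes out of n trials."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def newcombe_score_difference(
    n_positive: int,
    n_total: int,
    ref_positive: int,
    ref_total: int,
    conf_level: float = 0.95,
) -> Tuple[float, float]:
    """Hybrid score interval for p1 - p2 (hypothetical helper, not the package API)."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    p1, p2 = n_positive / n_total, ref_positive / ref_total
    l1, u1 = _wilson_bounds(n_positive, n_total, z)
    l2, u2 = _wilson_bounds(ref_positive, ref_total, z)
    delta = p1 - p2
    # lower bound pairs the lower side of group 1 with the upper side of group 2
    lower = delta - sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    # upper bound pairs the upper side of group 1 with the lower side of group 2
    upper = delta + sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper


lower, upper = newcombe_score_difference(84, 101, 89, 105)
print(f"95% interval for 84/101 - 89/105: ({lower:.4f}, {upper:.4f})")
```

Swapping the two groups negates and mirrors the interval, which is a quick sanity check for any implementation of a difference interval.
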
One can use the `make_risk_report` function to create a report of the confidence intervals for the difference of binomial proportions.

```python
from diff_binom_confint import make_risk_report

# df_train and df_test are pandas.DataFrame objects providing the data
table = make_risk_report((df_train, df_test), target="binary_target")

# or, if df_data is a pandas.DataFrame containing both training and testing data
table = make_risk_report(df_data, target="binary_target")
```

For more details, see the corresponding documentation.

References:

1. SAS
2. PASS
3. statsmodels.stats.proportion
4. scipy.stats._binomtest
5. corplingstats
6. DescTools.StatsAndCIs
7. Newcombe

Reference 1 has errors in the description of the methods Wilson CC, Mee, and Miettinen-Nurminen.
The correct computation of Wilson CC is given in Reference 5.
The correct computations of Mee and Miettinen-Nurminen are given in the code blocks in Reference 1.

Test data are

- taken (with slight modification, e.g. the `upper_bound` of the `miettinen-nurminen-brown-li` method in the edge case file) from Reference 1 for automatic testing of the correctness of the implementation of the algorithms.
- generated using DescTools.StatsAndCIs via

  ```r
  library("DescTools")
  library("data.table")

  results = data.table()
  for (m in c("wilson", "wald", "waldcc", "agresti-coull", "jeffreys",
              "modified wilson", "wilsoncc", "modified jeffreys",
              "clopper-pearson", "arcsine", "logit", "witting", "pratt",
              "midp", "lik", "blaker")) {
    ci = BinomCI(84, 101, method = m)
    new_row = data.table("method" = m, "ratio" = ci[1], "lower_bound" = ci[2], "upper_bound" = ci[3])
    results = rbindlist(list(results, new_row))
  }
  fwrite(results, "./test/test-data/example-84-101.csv")
  # with manual slight adjustment of method names
  ```

- taken from Reference 7 (Table II).

The filenames have the following patterns:

```python
# for computing confidence interval for difference of binomial proportions
"example-(?P<n_positive>[\\d]+)-(?P<n_total>[\\d]+)-vs-(?P<ref_positive>[\\d]+)-(?P<ref_total>[\\d]+)\\.csv"

# for computing confidence interval for binomial proportions
"example-(?P<n_positive>[\\d]+)-(?P<n_total>[\\d]+)\\.csv"
```

Note that the out-of-range values (e.g. > 1) are left as empty values in the .csv files.

Known issues:

- Edge cases are incorrect for the method `true-profile`.
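
Returning to the test-data filename patterns described above, the following sketch shows one way to recover the counts from a filename and to read a possibly empty bound from a `.csv` cell. The helper names `parse_test_filename` and `parse_cell` are hypothetical, and `example-84-101-vs-89-105.csv` is an illustrative filename built from the usage example's numbers, not necessarily a file in the repository.

```python
import re
from typing import Dict, Optional

# Same patterns as in the README, written as raw strings
DIFF_PATTERN = re.compile(
    r"example-(?P<n_positive>[\d]+)-(?P<n_total>[\d]+)"
    r"-vs-(?P<ref_positive>[\d]+)-(?P<ref_total>[\d]+)\.csv"
)
SINGLE_PATTERN = re.compile(r"example-(?P<n_positive>[\d]+)-(?P<n_total>[\d]+)\.csv")


def parse_test_filename(filename: str) -> Optional[Dict[str, int]]:
    """Return the counts encoded in a test-data filename, or None if it does not match."""
    for pattern in (DIFF_PATTERN, SINGLE_PATTERN):
        match = pattern.fullmatch(filename)
        if match:
            return {name: int(value) for name, value in match.groupdict().items()}
    return None


def parse_cell(cell: str) -> Optional[float]:
    """Convert a .csv cell to float; an empty cell (out-of-range bound) becomes None."""
    return float(cell) if cell.strip() else None


print(parse_test_filename("example-84-101.csv"))
print(parse_test_filename("example-84-101-vs-89-105.csv"))
print(parse_cell(""), parse_cell("0.8922"))
```

Using `fullmatch` rather than `match` keeps the single-proportion pattern from accepting names with trailing text after `.csv`.
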
