blogs/test/package-review.qmd
+32 -29 (32 additions & 29 deletions)
@@ -25,43 +25,44 @@ When looking at a package the first place to start is the CRAN index page. You c
- Are there references to external peer reviewed papers?
-
- Is there a website / vignettes?
+
- Is there a website or vignettes?
- Is there a way to report bugs?
- Can the package handle different edge cases?
- Does the package have a lot of dependencies / unusual dependencies?
-
- Look at community adoption?
+
- How much has the statistical community or industry adopted it?
Using this checklist can help you quickly and consistently get a sense of a package before spending time looking into the code directly. Let's see how this works in practice.
## Worked Example: Wilcoxon Rank-Sum Test
-
For this, we are going to look at the Wilcoxon Rank-Sum test and the associated Hodges-Lehmann confidence interval. After googling a bit, I found three different packages that do a Wilcoxon Rank-Sum p-value and Hodges-Lehmann CI:
+
To think this through, we are going to look at a specific worked example: the Wilcoxon Rank-Sum test and the associated Hodges-Lehmann confidence interval.
+
After googling for a little while, I found three different packages that compute a Wilcoxon Rank-Sum p-value and Hodges-Lehmann CI:
-
1. {stats} (part of base R)
+
1. **stats** (part of base R)
-
2. {pairwiseCI}
+
2. **pairwiseCI**
-
3. {coin}
+
3. **coin**
-
Great! I might be kind of done, because I tend to favor base R stats functions, but as I start looking into this, I found the {stats} function can't handle ties if I want the exact methods. So I need to look into and compare the {pairwiseCI} and {coin} packages.
+
Great! I might be more or less done, because I tend to favor base R stats functions. But as I started looking into this, I found that the **stats** function can't handle ties when I want the exact method. So I need to look into and compare the **pairwiseCI** and **coin** packages.
::: callout-tip
You often find that differences between packages and software show up when there are ties, missing data, and/or extreme values, so it is good to try to include these in the dataset you are using to compare.
:::
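To make the tip concrete, here is a minimal sketch, assuming a small made-up dataset with deliberate ties (the values are illustrative and not from this post). It asks `stats::wilcox.test()` for an exact p-value and a Hodges-Lehmann confidence interval, and collects any warnings rather than assuming what they say, since the handling of ties may differ between R versions.

```r
# Two small illustrative samples; the repeated values create ties on purpose.
x <- c(1.2, 2.5, 2.5, 3.1, 4.0, 4.0)
y <- c(2.5, 3.1, 3.6, 4.0, 5.2, 5.9)

# Collect warnings instead of letting them scroll past, so we can check
# whether the exact method was actually used.
warnings_seen <- character()
result <- withCallingHandlers(
  wilcox.test(x, y, exact = TRUE, conf.int = TRUE),
  warning = function(w) {
    warnings_seen <<- c(warnings_seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

result$p.value   # p-value reported by stats
result$conf.int  # confidence interval for the location shift
result$estimate  # Hodges-Lehmann estimate of the location shift
warnings_seen    # any messages about ties or exactness
```

Running the same small dataset through each candidate package is a quick way to see where their results start to differ.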
-
Now I need to choose between {pairwaiseCI} and {coin}. I could just run the model in both and see if the results match, but that will be a lot of work. So before I get started I want to go through our checklist.
+
Now I need to choose between **pairwiseCI** and **coin**. I could just run the model in both and see if the results match, but that will be a lot of work. So before I get started I want to go through our checklist.
Let's pull up the CRAN index pages for each of these packages and see if we can figure out which package we should use for this analysis.
-
### {pairwiseCI}
+
### **pairwiseCI**
-
Starting with {pairwiseCI}, the [index](https://cran.r-project.org/web/packages/pairwiseCI/index.html) page looks like this:
+
Starting with **pairwiseCI**, the [index](https://cran.r-project.org/web/packages/pairwiseCI/index.html) page looks like this:
@@ -107,7 +108,7 @@ Now let's go down the checklist to see if there are any red flags for this packa
</p>
-
- Is there a website / Vignettes?
+
- Is there a website or vignettes?
<p style="color:blue;">
@@ -135,17 +136,17 @@ Now let's go down the checklist to see if there are any red flags for this packa
<p style="color:blue;">
-
It looks like this package only has two dependencies, {MCPAN} and, interestingly, {coin}, the other package we are looking at.
+
It looks like this package only has two dependencies, **MCPAN** and, interestingly, **coin**, the other package we are looking at.
</p>
-
Okay, having gone through all but the final question, I would say I feel not amazing about the package, but if it was my only option I would still try to use it. The author gives me confidence in the package, but other things like documentation and last update date, make me a bit nervous about this package.
+
Okay, having gone through all but the final question, I would say I do not feel great about the package, but if it were my only option I would still try to use it. The author gives me confidence in the package, but other things, like the documentation and the last update date, make me a bit nervous about it.
-
### {coin}
+
### **coin**
-
Now on to {coin} with the same questions. The [index](https://cran.r-project.org/web/packages/coin/index.html) page is as follows:
+
Now on to **coin** with the same questions. The [index](https://cran.r-project.org/web/packages/coin/index.html) page is as follows:
-
Having gone through most the questions, I am fairly confident in saying I want to use {coin} to investigate this method rather than {pairwiseCI}. For almost all the questions {coin} looks slightly better than {pairwiseCI} and really just has a larger accumulation of evidence of quality. But, I haven't answered the last question in my checklist for either these packages. What about community adoption? It can be a bit hard to look at directly, but I tend to use a few different ways.
+
Having gone through most of the questions, I am fairly confident in saying I want to use **coin** to investigate this method rather than **pairwiseCI**. For almost all of the questions, **coin** looks slightly better than **pairwiseCI** and simply has a larger accumulation of evidence of quality. But I haven't answered the last question in my checklist for either of these packages: how much has the statistical community or industry adopted it? Adoption can be a bit hard to measure directly, but I tend to investigate it in a few different ways.
-
First, staying on the CRAN index page for the package, I look at the Reverse Dependencies. This section gets split into three parts, "Reverse depends", "Reverse imports", and "Reverse suggests" which explains how the other packages are using the package. In terms of community adoption, it doesn't matter if other packages are depending, importing or suggesting the package, all that matters is they are using it. **Note:** This section only appears if other packages on CRAN use the package.
+
First, staying on the CRAN index page for the package, I look at the Reverse Dependencies. This section is split into three parts, "Reverse depends", "Reverse imports", and "Reverse suggests", which explain how other packages are using the package. In terms of community adoption, it doesn't matter whether other packages depend on, import, or suggest the package. All that matters is that they are using it. **Note:** This section only appears if other packages on CRAN use the package.
-
For these two packages, only {coin} has this section and we can see there are many other packages that use {coin}.
+
For these two packages, only **coin** has this section and we can see there are many other packages that use **coin**.
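If you would rather count reverse dependencies from R than read them off the index page, here is a rough sketch using `tools::package_dependencies()` with `reverse = TRUE`. The helper name `count_reverse_deps` is made up for illustration, and the default `db` assumes you have internet access to a CRAN mirror.

```r
# Count reverse dependencies of each type for one package.
# `count_reverse_deps` is a hypothetical helper, not part of any package.
# `db` is a package database such as the matrix returned by
# utils::available.packages(), which needs access to a CRAN mirror.
count_reverse_deps <- function(pkg, db = utils::available.packages()) {
  types <- c("Depends", "Imports", "Suggests")
  vapply(
    types,
    function(type) {
      rev_deps <- tools::package_dependencies(
        pkg, db = db, which = type, reverse = TRUE
      )
      length(rev_deps[[pkg]])
    },
    integer(1)
  )
}

# Example calls (need internet access, so they are left commented out):
# count_reverse_deps("coin")
# count_reverse_deps("pairwiseCI")
```

The result is a named vector with one count each for "Depends", "Imports", and "Suggests", which mirrors the three parts of the Reverse Dependencies section.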
-
And you can see {coin} is much more popular than {pairwiseCI}.
+
And you can see **coin** is much more popular than **pairwiseCI**.
-
So with all of this information, I think starting with {coin} is going to be the best use of my time.
+
So with all of this information, I think starting with **coin** is going to be the best use of my time.
-
When looking at the number of downloads, you can look over a longer period like over the last month (by using the `when` parameter) or you can look between specific dates (by using the `from` and `to` parameters). But, it will give you the download numbers for each day, which you will need to summaries. These day-by-day numbers can be very helpful to look at trends, especially when there is a new package that is getting rapidly adopted.
+
When looking at the number of downloads, you can look over a longer period, such as the last month (by using the `when` parameter), or between specific dates (by using the `from` and `to` parameters). Either way, you get the download numbers for each day, which you will need to summarise. These day-by-day numbers can be very helpful for spotting trends, especially when a new package is being rapidly adopted.
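As a rough sketch of that summarising step, the code below assumes the daily numbers come from `cranlogs::cran_downloads()`, which returns one row per package per day with `date`, `count`, and `package` columns. The helper names `total_downloads` and `compare_downloads` and the dates in the example calls are made up for illustration.

```r
# Sum day-by-day download counts into one total per package.
# `daily` is assumed to have `package` and `count` columns, the shape
# returned by cranlogs::cran_downloads().
total_downloads <- function(daily) {
  totals <- aggregate(count ~ package, data = daily, FUN = sum)
  totals[order(-totals$count), ]  # most downloaded package first
}

# Fetch and summarise in one step; `...` is passed on to cran_downloads(),
# so `when`, or `from` and `to`, can be supplied here.
compare_downloads <- function(pkgs, ...) {
  total_downloads(cranlogs::cran_downloads(packages = pkgs, ...))
}

# Example calls (need the cranlogs package and internet access):
# compare_downloads(c("coin", "pairwiseCI"), when = "last-month")
# compare_downloads(c("coin", "pairwiseCI"),
#                   from = "2024-01-01", to = "2024-03-31")
```

Keeping the fetch and the summary separate means you can still plot the daily counts when you want to look at trends instead of totals.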
-
The checklist isn't intended to replace a full review of the package for an GxP workflows. But, when just trying to decide which package to look into for a particular stats method it can be helpful.
+
Please note that this checklist isn't intended to replace a full review of the package for GxP workflows. But it is intended to be helpful when starting to think through the issues involved in package choice, particularly for statistical methods.
-
In summary, selecting the appropriate R package for statistical analyses is hard. Google, isn't perfect and so it worth finding a few packages and going through this checklist. By taking a few minutes to consider factors like maintenance, documentation, and community adoption can save you time in the long run.
+
## Summary
+
+
In summary, selecting the appropriate R package for statistical analyses is hard. Google isn't perfect, so it is worth finding a few packages and going through this checklist. Taking a few minutes to consider factors like maintenance, documentation, and community adoption can save you time in the long run.