TwoSampleMR 0.6.30 #669

Merged
remlapmot merged 34 commits into MRCIEU:master from remlapmot:devel-2026-02
Feb 6, 2026

@remlapmot
Contributor

Some optimizations including:

  • Vectorised mr_egger_regression_bootstrap()
  • Vectorised weighted_median_bootstrap()
  • Deleted duplicated weighted_median() function
  • Replaced plyr function calls with data.table function calls
    • plyr::rbind.fill(...) to data.table::rbindlist(..., fill = TRUE, use.names = TRUE)
    • plyr::ddply(dat, cols, func) to lapply() over unique combinations + data.table::rbindlist()
    • Added data.table::setDF() calls to convert back to data.frame for compatibility
    • Removed plyr from the Imports list
  • In flip_alleles(), use chartr() instead of 4 gsub() calls
  • In random_string(), use a single call to sample() instead of n calls
  • Optimized mr_mode()
  • Replaced apply(..., any(is.na())) with complete.cases()
  • Optimized the mr() function
  • Optimized the get_r_from_lor() function
  • Optimized the mr_rucker_bootstrap() and mr_rucker_jackknife_internal() functions
  • Replaced sapply() with vapply() in several cases
  • Optimized the simple_cap() function
  • And a few other minor optimizations
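
A rough sketch of three of the smaller replacements in the list above, for anyone wanting to check the equivalences. The helper names (`flip_gsub()`, `flip_chartr()`, `random_string_vec()`) are made up for illustration, and the "before" shape of the `gsub()` version is an assumption, not the package source.

```r
# Hypothetical "before" shape: four gsub() passes, using lower case as a
# temporary marker so that already-flipped bases are not flipped back.
flip_gsub <- function(x) {
  x <- toupper(x)
  x <- gsub("A", "t", x)
  x <- gsub("T", "a", x)
  x <- gsub("C", "g", x)
  x <- gsub("G", "c", x)
  toupper(x)
}

# One chartr() call maps each base to its complement in a single pass.
flip_chartr <- function(x) chartr("ACGT", "TGCA", toupper(x))

# One sample() call draws all n * len characters; each row becomes a string.
random_string_vec <- function(n = 1, len = 6) {
  pool <- c(0:9, letters, LETTERS)
  chars <- matrix(sample(pool, n * len, replace = TRUE), nrow = n)
  apply(chars, 1, paste, collapse = "")
}

# Row-wise NA check: apply() versus complete.cases().
m <- cbind(a = c(1, NA, 3), b = c(4, 5, NA))
na_apply <- apply(m, 1, function(r) any(is.na(r)))
na_cc <- !stats::complete.cases(m)

alleles <- c("A", "cg", "TTA", "-")
stopifnot(identical(flip_chartr(alleles), flip_gsub(alleles)))
stopifnot(identical(unname(na_apply), na_cc))
```

`complete.cases()` does the row scan in compiled code, so it avoids one R function call per row.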

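
For reference, a minimal sketch of the plyr to data.table pattern, assuming data.table is installed. The toy data and `summarise_group()` are hypothetical; only `rbindlist()`, `setDF()`, and the lapply-over-combinations idea come from this PR.

```r
# Two tables with different columns, as plyr::rbind.fill() would accept.
a <- data.frame(id = 1:2, b = c(0.1, 0.2))
b <- data.frame(id = 3L, se = 0.05)

# Stand-in for plyr::rbind.fill(a, b): missing columns are filled with NA.
stacked <- data.table::rbindlist(list(a, b), fill = TRUE, use.names = TRUE)
data.table::setDF(stacked)  # back to a plain data.frame, by reference

# Stand-in for plyr::ddply(dat, cols, func).
dat <- data.frame(
  id.exposure = c("e1", "e1", "e2", "e2"),
  id.outcome = c("o1", "o1", "o1", "o1"),
  beta = c(0.1, 0.3, 0.2, 0.4)
)
# Hypothetical per-group function.
summarise_group <- function(d) data.frame(n = nrow(d), mean_beta = mean(d$beta))

cols <- c("id.exposure", "id.outcome")
combos <- unique(dat[, cols, drop = FALSE])
pieces <- lapply(seq_len(nrow(combos)), function(i) {
  keep <- dat$id.exposure == combos$id.exposure[i] &
    dat$id.outcome == combos$id.outcome[i]
  # Keep the grouping columns next to the per-group result, as ddply() does.
  cbind(combos[i, , drop = FALSE], summarise_group(dat[keep, , drop = FALSE]))
})
res <- data.table::rbindlist(pieces, fill = TRUE, use.names = TRUE)
data.table::setDF(res)
```

The `setDF()` step matters for callers that rely on data.frame semantics, for example `res[, 1]` returning a vector.
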

**`mr_mode()`:**

**1. `beta()` function:**

- Replaced growing vector (`beta <- NULL; beta[length(beta) + 1] <- ...`) with pre-allocated `numeric(length(phi))`
- Used `which.max(densityIV$y)` instead of `densityIV$y == max(densityIV$y)` (safer, avoids floating-point equality issues)
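
A small sketch of the two `beta()` changes, with `mode_estimate()` as a hypothetical stand-in for the inner helper. The bandwidth rule and weights below are illustrative assumptions; the points being shown are the pre-allocated result vector and `which.max()`.

```r
mode_estimate <- function(beta_iv, se_iv, phi = c(1, 0.75, 0.5)) {
  # Assumed bandwidth rule and inverse-variance weights (illustrative only).
  s <- 0.9 * min(stats::sd(beta_iv), stats::mad(beta_iv)) / length(beta_iv)^(1 / 5)
  w <- se_iv^-2 / sum(se_iv^-2)

  out <- numeric(length(phi))  # pre-allocated instead of grown per iteration
  for (j in seq_along(phi)) {
    h <- max(1e-8, s * phi[j])
    dens <- stats::density(beta_iv, weights = w, bw = h)
    # which.max() returns exactly one index, even if the maximum is tied;
    # `dens$y == max(dens$y)` could select several points.
    out[j] <- dens$x[which.max(dens$y)]
  }
  out
}

set.seed(1)
b <- stats::rnorm(30, mean = 0.5, sd = 0.1)
se <- stats::runif(30, 0.05, 0.1)
est <- mode_estimate(b, se)
```

With the `==` form, a tie would yield more than one value and the single-element assignment would warn or fail, so `which.max()` also guards the assignment itself.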

**2. `boot()` function:**

- Pre-generate all random values as two matrices (2 `rnorm` calls instead of 2000)
- Pre-compute `ones <- rep(1, n)` and `weights` outside the loop instead of recomputing each iteration
- Cache `nphi <- length(phi)` to avoid repeated `length()` calls

The loop itself still iterates 1000 times calling `beta()` (which calls `stats::density()`), but the per-iteration overhead is lower now that random number generation and redundant computations have been moved out of the loop.
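
The pre-generation idea can be sketched as follows. `boot_sketch()`, its arguments, and the `ivw_mean()` estimator are hypothetical; the real `boot()` calls `beta()` inside the loop, which is replaced here by a cheap estimator to keep the example short.

```r
boot_sketch <- function(beta_iv, se_a, se_b, estimator, nboot = 1000) {
  n <- length(beta_iv)
  # Draw an nboot x n matrix in one rnorm() call; row i is replicate i.
  # The matrix is filled column by column, so each SNP's mean and sd are
  # repeated nboot times.
  draw <- function(se) {
    matrix(
      stats::rnorm(nboot * n, mean = rep(beta_iv, each = nboot), sd = rep(se, each = nboot)),
      nrow = nboot, ncol = n
    )
  }
  draws_a <- draw(se_a)  # rnorm() call 1
  draws_b <- draw(se_b)  # rnorm() call 2
  ones <- rep(1, n)      # loop-invariant, so computed once

  out <- matrix(NA_real_, nrow = nboot, ncol = 2)
  for (i in seq_len(nboot)) {
    out[i, 1] <- estimator(draws_a[i, ], ones)  # unweighted
    out[i, 2] <- estimator(draws_b[i, ], se_b)  # weighted
  }
  out
}

# Hypothetical estimator: inverse-variance weighted mean.
ivw_mean <- function(b, se) stats::weighted.mean(b, 1 / se^2)

set.seed(42)
beta_iv <- c(0.10, 0.12, 0.08, 0.11, 0.09)
se_iv <- c(0.02, 0.03, 0.02, 0.04, 0.03)
res <- boot_sketch(beta_iv, se_iv, se_iv, ivw_mean, nboot = 500)
```

One trade-off: the two matrices hold `nboot * n` doubles each, so memory use grows with the number of SNPs, in exchange for far fewer `rnorm()` calls.
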

**`get_r_from_lor()`:**

- **Moved `ve` computation outside the loop**: it was being recalculated identically on every iteration
- **Eliminated the `for` loop entirely**: the `prop` calculation, `vg`, `r`, `correction`, and `sign` operations are all vectorized now
- **Single call to `get_population_allele_frequency()`** with full vectors instead of element-by-element
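
A before/after sketch of this kind of vectorisation. Both functions are hypothetical simplifications: the population allele frequency `popaf` is passed in directly instead of calling `get_population_allele_frequency()`, and the variance formula and the 0.58 correction constant are assumptions made for illustration, not copied from the package.

```r
# Loop form: ve is recomputed on every pass although it never changes.
r_loop <- function(lor, popaf, model = "logit", correction = FALSE) {
  r <- rep(NA_real_, length(lor))
  for (i in seq_along(lor)) {
    ve <- if (model == "logit") pi^2 / 3 else 1
    vg <- lor[i]^2 * popaf[i] * (1 - popaf[i])
    r[i] <- vg / (vg + ve)
    if (correction) r[i] <- r[i] / 0.58  # assumed constant
    r[i] <- sqrt(r[i]) * sign(lor[i])
  }
  r
}

# Vectorised form: ve computed once, everything else on whole vectors.
r_vec <- function(lor, popaf, model = "logit", correction = FALSE) {
  ve <- if (model == "logit") pi^2 / 3 else 1
  vg <- lor^2 * popaf * (1 - popaf)
  r2 <- vg / (vg + ve)
  if (correction) r2 <- r2 / 0.58  # assumed constant
  sqrt(r2) * sign(lor)
}

lor <- c(0.2, -0.5, 0.05)
popaf <- c(0.3, 0.1, 0.45)
stopifnot(isTRUE(all.equal(r_loop(lor, popaf), r_vec(lor, popaf))))
```
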

**`mr_rucker_bootstrap()` and `mr_rucker_jackknife_internal()`:**

- Pre-generate all random values as two matrices (2 `rnorm` calls instead of 2000); `boot_exp` and `boot_out` are `nboot x nsnp` matrices
- Loop now indexes rows from the pre-generated matrices instead of calling `rnorm()` each iteration
- `sapply` → `vapply` for `model`, `Q`, `Qdash` extraction (4 calls)
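
A short illustration of why `vapply()` is the safer choice for this kind of extraction. The `fits` list is a hypothetical stand-in for a list of per-replicate results.

```r
# Hypothetical per-replicate results.
fits <- list(list(res = "A", Q = 1.2), list(res = "B", Q = 3.4))

# vapply() declares the type and length of each result up front.
model <- vapply(fits, function(x) x$res, character(1))
Q <- vapply(fits, function(x) x$Q, numeric(1))

# On empty input, sapply() returns list() while vapply() keeps the type.
empty_s <- sapply(list(), function(x) x$Q)
empty_v <- vapply(list(), function(x) x$Q, numeric(1))

# A wrong element type is an error instead of a silent coercion.
bad <- tryCatch(
  vapply(list(list(Q = "oops")), function(x) x$Q, numeric(1)),
  error = function(e) "error"
)
```
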

@remlapmot merged commit 07311dd into MRCIEU:master on Feb 6, 2026
12 checks passed