Lower default tolerance to find optimal matching #253

@nremenyi

Using optmatch, I got a very different solution from the one produced by another package (clue) for the same problem. The cause turned out to be the tolerance. Would it be wise to lower the default tolerance? The example below reproduces the difference.

### Simulate data set: trt (1) / ctrl (0) indicator and covariate x
N = 10^3
M = 9*10^3
set.seed(124)
data = rbind(
  data.frame(id = 1:N, group = 1, x = rnorm(N, mean = 0, sd = 5)),
  data.frame(id = (N+1):(N+M), group = 0, x = rnorm(M, mean = 0, sd = 5))
)

### Optmatch pairmatch
mhd = optmatch::match_on(group ~ x, data = data)
m1 = optmatch::pairmatch(mhd, data = data)
cat("Total distance for Optmatch pairmatch with default tol:", 
    sum(optmatch::matched.distances(m1, distance = mhd)), "\n")
m2 = optmatch::pairmatch(mhd, data = data, tol = 0.000001)
cat("Total distance for Optmatch pairmatch with tol = 0.000001:", 
    sum(optmatch::matched.distances(m2, distance = mhd)), "\n")

### Clue package (Assignment problem)
sol_h2 = clue::solve_LSAP(x = mhd, maximum = FALSE)
# Two-column index matrix: row i is (treated unit i, its assigned control)
match_mx_h2 = cbind(1:min(N, M), as.integer(sol_h2))
cat("Total distance for Clue Assignment:", sum(mhd[match_mx_h2]), "\n")
