This repository was archived by the owner on Jan 22, 2025. It is now read-only.
In the `inside_polygon` function of `cytopy/data/geometry.py`, the multiprocess implementation performs very poorly: filtering 10,000 rows of data takes nearly 30 seconds, while a plain single-threaded loop takes less than 0.04 seconds. For the time being I handle it by adding a row-count check, so multiprocessing is only used above a threshold. Maybe you have a better way to deal with it.
```python
from functools import partial
from multiprocessing import Pool, cpu_count

if len(df) < 100000:  # row-count check: below this, process overhead dominates
    # Single-threaded implementation
    xy = df[[x, y]].values
    (min_x, min_y, max_x, max_y) = poly.bounds
    mask = []
    for p in xy:
        bol = min_x <= p[0] <= max_x and min_y <= p[1] <= max_y and point_in_poly(p, poly) is True
        mask.append(bol)
else:
    # Multiprocess implementation
    if njobs < 0:
        njobs = cpu_count()
    xy = df[[x, y]].values
    f = partial(point_in_poly, poly=poly)
    with Pool(njobs) as pool:
        mask = list(pool.map(f, xy))
return df.iloc[mask]
```
My configuration:
- Python: 3.8.7
- Memory: 32 GB
- CPU: i7-1165G7 (4 cores, 8 threads)
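Another option worth considering is removing the per-point Python call entirely. A vectorized ray-casting test in NumPy handles all points in one pass with no worker processes; this is a sketch under the assumption that the polygon is a simple ring given as an (n, 2) vertex array, not CytoPy's actual `point_in_poly` API:

```python
import numpy as np

def points_in_poly(xy, vertices):
    """Ray-casting point-in-polygon test, vectorized over all query points.

    xy:       (m, 2) array of query points.
    vertices: (n, 2) array of polygon vertices (open ring; the closing
              edge back to vertices[0] is handled implicitly).
    Returns a boolean mask of length m.
    """
    x, y = xy[:, 0], xy[:, 1]
    vx, vy = vertices[:, 0], vertices[:, 1]
    inside = np.zeros(len(xy), dtype=bool)
    j = len(vertices) - 1
    # Horizontal edges produce harmless div-by-zero terms that the edge
    # condition masks out, so suppress the warnings.
    with np.errstate(divide="ignore", invalid="ignore"):
        for i in range(len(vertices)):
            # Toggle every point whose rightward horizontal ray crosses edge (j, i).
            crosses = ((vy[i] > y) != (vy[j] > y)) & (
                x < (vx[j] - vx[i]) * (y - vy[i]) / (vy[j] - vy[i]) + vx[i]
            )
            inside ^= crosses
            j = i
    return inside

# Unit square: one point inside, one outside.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
pts = np.array([[0.5, 0.5], [2.0, 0.5]])
print(points_in_poly(pts, square))  # [ True False]
```

The resulting boolean mask can be applied directly with `df.loc[mask]`, keeping the whole filter in a single process.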