Commit 1794e06

stitchraster added. movegroup made more efficient, added param to save csvs for individual core and general areas for stitchraster.
1 parent 3d238b2 commit 1794e06

9 files changed: 223 additions, 49 deletions

DESCRIPTION

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 Package: movegroup
 Title: Visualizing and Quantifying Space Use Data for Groups of Animals
-Version: 2025.01.28
+Version: 2025.05.13
 Authors@R: c(
 person("Simon", "Dedman", email = "simondedman@gmail.com", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-9108-972X")),
 person("Maurits", "van Zinnicq Bergmann", email = "mauritsvzb@gmail.com", role = c("aut"), comment = c(ORCID = "0000-0002-8414-5025")))

NAMESPACE

Lines changed: 3 additions & 0 deletions
@@ -5,13 +5,15 @@ export(moveLocErrorCalc)
 export(movegroup)
 export(plotraster)
 export(scaleraster)
+export(stitchraster)
 import(ggmap)
 import(ggplot2)
 import(utils)
 importFrom(beepr,beep)
 importFrom(dplyr,across)
 importFrom(dplyr,arrange)
 importFrom(dplyr,bind_cols)
+importFrom(dplyr,bind_rows)
 importFrom(dplyr,distinct)
 importFrom(dplyr,filter)
 importFrom(dplyr,group_by)
@@ -76,5 +78,6 @@ importFrom(stats,setNames)
 importFrom(stringr,str_remove)
 importFrom(terra,project)
 importFrom(tidyr,drop_na)
+importFrom(tidyselect,all_of)
 importFrom(tidyselect,everything)
 importFrom(viridis,scale_fill_viridis)

NEWS.md

Lines changed: 4 additions & 1 deletion
@@ -1,9 +1,12 @@
 ---
 title: "NEWS.md"
 author: "Simon Dedman"
-date: "2024-03-05"
+date: "2024-05-13"
 output: html_document
 ---
+# v2025.05.13
+* stitchraster added, movegroup made more efficient and updated for tidyselect 1.2.0
+
 # v2024.03.05
 * first CRAN release
 

R/movegroup.R

Lines changed: 40 additions & 45 deletions
@@ -122,6 +122,10 @@
 #' @param savedir Save outputs to a temporary directory (default) else change
 #' to desired directory e.g. "/home/me/folder". Do not use getwd() for this.
 #' Do NOT include terminal slash. Directory must exist. Default tempdir().
+#' @param saveAreaCT Save tiny individual core and general use areas tables to
+#' disk. These are the only things retained in the per-individual loop, so if
+#' your large dataset causes memory crashes, you can run it in chunks and stitch
+#' the results together later with stitchraster. Default FALSE.
 #' @param alerts Audio warning for failures. Default TRUE.
 #'
 #' @return Individual-level utilization distributions, saved as rasters, as well
@@ -130,8 +134,9 @@
 #' move::brownian.motion.variance.dyn. No processed object is returned, i.e. bad: "objectname <-
 #' movegroup()", good: "movegroup()".
 #' @details
-#' When used together, the order of functions would be: movegroup, scaleraster, alignraster if
-#' required, plotraster.
+#' When used together, the order of functions would be: movegroup, (stitchraster
+#' if required), scaleraster, (alignraster if required then stitchraster again)
+#' , plotraster.
 #'
 #' ## Errors and their origins:
 #'
@@ -158,6 +163,14 @@
 #' of that individual. Try lowering movemargin (default 11, has to be odd) and then lowering
 #' dbbmmwindowsize (default 23, has to be >=2*movemargin, has to be odd).
 #'
+#' 8. Error in validityMethod(as(object, superClass)): The used raster is not a
+#' UD (sum unequal to 1), sum is NaN. Potentially from memory overrun from large
+#' datasets. Close all other programs, restart your session, remove other
+#' objects from R, and try again, watching the RAM usage piechart by the
+#' Environment tab. If unsuccessful, run function in chunks of individuals until
+#' all (of length >= windowsize) are saved to disc as asc files, then produce
+#' final summary stats with stitchraster.
+#'
 #' @examples
 #' \donttest{
 #' # load data
@@ -216,6 +229,10 @@ movegroup <- function(
 absVolumeAreaSaveName = "VolumeArea_AbsoluteScale.csv",
 savedir = tempdir(), # save outputs to a temporary directory (default) else change to current
 # directory e.g. "/home/me/folder". Do not use getwd() here.
+saveAreaCT = FALSE, # save tiny individual core and general use areas tables
+# to disk. These are the only things retained in the per-individual loop, so
+# if your large dataset causes memory crashes, you can run it in chunks and
+# stitch the results together later with stitchraster.
 alerts = TRUE # audio warning for failures
 ) {
 # If savedir has a terminal slash, remove it, it's added later
@@ -259,13 +276,6 @@ movegroup <- function(
 "HFA" = ".img"
 )
 
-# data <- dplyr::rename(
-# .data = data, # Rename user entry to "ID", Ditto Datetime Lat & Lon
-# Datetime = .data[[Datetime]],
-# ID = .data[[ID]],
-# Lat = .data[[Lat]],
-# Lon = .data[[Lon]]
-# )
 data <- dplyr::rename(
 # Rename user entry to "ID", Ditto Datetime Lat & Lon
 .data = data,
@@ -299,23 +309,6 @@ movegroup <- function(
 data <- cbind(data, cord.UTM) # 1308 x 7
 rm(list = c("cord.UTM", "cord.dec"))
 
-# Construct movement models per individual.
-# Some notes regarding the 'move package and construction of movement models: Several arguments
-# need to be used to run the model. 1. Window size: corresponds with number of locations and moves
-# along a given trajectory to estimate the MA parameter within defined subsections of the path.
-# This increases the ability to detect breakpoints where changes in behaviour occur. The window
-# size should relate to what kind of behaviours the model is desired to identify e.g., a window
-# size of 23 means the sliding window is moved every 23 locations or every 23 hours (has to do
-# with sampling interval) 2. Margin: motion variance based on only the middle section of the
-# trajectory; the ends of the movement trajectory where no changes are allowed because at some
-# stage you want to have a few locations to base your estimation of the variance on and how many
-# locations in either side of the window we use for this, is called the margin. Smaller values for
-# window size and margin is expected to give a higher frequency of behavioural changes; make these
-# large for looking at migrations. 3. The raster dictates the grid cell size for the UD to be
-# calculated per grid cell per individual. Create the raster of certain size that matches with
-# coordinates used to make the move object 4. Extent: is incorporated if there are animal
-# locations that border the edges of the raster.
-
 # A dBBMM is not run if total detections of individual < window size (default value, 31).
 # Below code checks and filters out individuals with insufficient data.
 enoughrelocs <- data |>
@@ -325,24 +318,10 @@ movegroup <- function(
 pull(ID)
 data <- data |> filter(ID %in% enoughrelocs)
 rm(enoughrelocs)
-# check1 <- data |>
-# dplyr::group_by(.data$ID) |>
-# dplyr::summarise(relocations = length(.data$Datetime))
-# check2 <- dplyr::filter(check1, .data$relocations >= dbbwindowsize) # filter: removed 2 rows (14%), 12 rows remaining
-#
-# if (length(check1$ID) != length(check2$ID)) {
-# data <- dplyr::semi_join(data, check2) # Joining, by = "ID". semi_join: added no columns
-# check1 <- data |>
-# dplyr::group_by(.data$ID) |>
-# dplyr::summarise(relocations = length(.data$Datetime))
-# check2 <- dplyr::filter(check1, .data$relocations >= dbbwindowsize) # filter: no rows removed
-# length(check1$ID) == length(check2$ID)
-# }
 
 # Create per-individual move object, project it, construct a dBBMM, calculate the volume area
 # within the 50% and 95% contours, save as ASCII.
 
-bb <- list()
 bb.list <- list()
 
 data <- tidyr::drop_na(data = data, ID) |>
@@ -669,6 +648,15 @@ movegroup <- function(
 # Add ID id
 area.ct$ID <- i
 
+# Save area.ct's if requested due to memory issues
+if (exists("saveAreaCT") && saveAreaCT) {
+write.csv(
+area.ct,
+file = file.path(savedir, paste0("areact_", i, ".csv")),
+row.names = FALSE
+)
+}
+
 # Put in list
 bb.list[[counter]] <- area.ct
 
@@ -686,9 +674,16 @@ movegroup <- function(
 if (writeRasterFormat != "CDF") bylayer = TRUE, # bylayer kills ncdf4
 overwrite = TRUE
 )
+rm(bb)
 gc() # cleanup
 } # close for i in unique data$ID
 
+# MEMORY OVERRUN ISSUE ####
+# conceptually I could save area.ct
+# (and bb? bb is created as a blank list but then overwritten as a new UD then named (may do nothing) then reoverwritten in the next loop)
+# then load all area.ct objects in a loop
+# so users could break their runs into chunks
+
 # Put everything in a data.frame
 md <- dplyr::bind_rows(bb.list, .id = "column_label") |>
 dplyr::select(!"column_label") # remove column_label column
@@ -705,22 +700,22 @@ movegroup <- function(
 # 2023-10-04 Vital memory bug warning
 if (all(md$core.use == md$core.use[1]))
 message(
-"All core UDs identical. Possibly due to insufficient memory for raster calculations. Check rasterResolution"
+"All core UDs identical. Maybe insufficient memory for raster calcs - check rasterResolution"
 )
 if (all(md$general.use == md$general.use[1]))
 message(
-"All general UDs identical. Possibly due to insufficient memory for raster calculations. Check rasterResolution"
+"All general UDs identical. Maybe insufficient memory for raster calcs - check rasterResolution"
 )
 if (length(which(md$core.use == max(md$core.use, na.rm = TRUE))) > 1)
 message(
-"More than 1 individual share exactly the same max value for core use, possibly due to insufficient memory for raster calculations. Check rasterResolution"
+"More than 1 individual share exactly the same max value for core use, maybe insufficient memory for raster calcs - check rasterResolution"
 )
 if (length(which(md$general.use == max(md$general.use, na.rm = TRUE))) > 1)
 message(
-"More than 1 individual share exactly the same max value for general use, possibly due to insufficient memory for raster calculations. Check rasterResolution"
+"More than 1 individual share exactly the same max value for general use, maybe insufficient memory for raster calcs - check rasterResolution"
 )
 if (length(which((md$core.use - md$general.use) == 0)) > 0)
 message(
-"1 or more individuals have exactly the same value for core and general use, possibly due to insufficient memory for raster calculations. Check rasterResolution"
+"1 or more individuals have exactly the same value for core and general use, maybe insufficient memory for raster calcs - check rasterResolution"
 )
 } # close function
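
The `saveAreaCT` documentation in this diff suggests running large datasets in chunks of individuals. A minimal sketch of that batching step in base R follows. The `tracks` table, its column names, and `chunk_size` are made-up placeholders, and the commented-out `movegroup()` call shows an assumed argument layout rather than a confirmed one. Whether other per-run outputs in `savedir` are overwritten between chunks is not shown in this diff.

```r
# Hypothetical tracking table: 5 individuals, 3 fixes each (placeholder values).
tracks <- data.frame(
  tag = rep(c("a", "b", "c", "d", "e"), each = 3),
  lat = seq(25.0, 25.14, by = 0.01),
  lon = seq(-79.0, -78.86, by = 0.01),
  stringsAsFactors = FALSE
)

ids <- unique(tracks$tag)
chunk_size <- 2 # assumption: tune to available RAM

# Split the ID vector into consecutive batches of at most chunk_size IDs.
chunks <- split(ids, ceiling(seq_along(ids) / chunk_size))

# Count rows per batch; in a real run each batch would be passed to movegroup().
rows_per_chunk <- vapply(
  chunks,
  function(chunk_ids) sum(tracks$tag %in% chunk_ids),
  integer(1)
)

for (chunk_ids in chunks) {
  chunk_data <- tracks[tracks$tag %in% chunk_ids, ]
  # Assumed call shape, left commented because it needs the package and real data:
  # movegroup(data = chunk_data, ID = "tag", Datetime = "time", Lat = "lat",
  #           Lon = "lon", savedir = "/home/me/folder", saveAreaCT = TRUE)
}
```

Each batch then leaves one `areact_<ID>.csv` per individual in `savedir`, which is what `stitchraster` reads back.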

R/stitchraster.R

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
+#' Stitch together movegroup data individuals core and general use areas
+#'
+#' If over-large datasets cause RAM crashes for movegroup, one can run batches
+#' of individuals in movegroup then join the individual saved area.ct csv files.
+#'
+#' See www.GitHub.com/SimonDedman/movegroup for issues, feedback, and
+#' development suggestions. Install 'move' development version with:
+#' remotes::install_git('https://gitlab.com/bartk/move.git')
+#'
+#' @import utils
+#' @importFrom dplyr mutate rename select bind_rows pull
+#' @importFrom rlang .data
+#' @importFrom tidyselect all_of
+#'
+#' @export stitchraster
+#'
+#' @param data Data frame object containing the data. Requires columns Lat Lon DateTime ID and
+#' potentially a grouping column (not currently implemented, email to request). Column names
+#' specified in later parameters.
+#' @param ID Name of animal tag ID column in data. "Character".
+#' @param absVolumeAreaSaveName File name plus extension where UD estimates are saved. Default
+#' "VolumeArea_AbsoluteScale.csv".
+#' @param savedir Save outputs to a temporary directory (default) else change
+#' to desired directory e.g. "/home/me/folder". Do not use getwd() for this.
+#' Do NOT include terminal slash. Directory must exist. Default tempdir().
+#'
+#' @return Calculated volume area estimates for 50 and 95pct contours csv.
+#' @details
+#' Parameters values should match those used in movegroup.
+#'
+#' @author Simon Dedman, \email{simondedman@@gmail.com}
+#'
+stitchraster <- function(
+data = NULL, # Data frame object containing the data. Requires columns Lat Lon DateTime ID and optionally a grouping column.
+ID = NULL, # Name of animal tag ID column in data.
+absVolumeAreaSaveName = "VolumeArea_AbsoluteScale.csv",
+savedir = tempdir() # save outputs to a temporary directory (default) else change to current
+# directory e.g. "/home/me/folder". Do not use getwd() here.
+) {
+data <- dplyr::rename(
+# Rename user entry to "ID", Ditto Datetime Lat & Lon
+.data = data,
+ID = tidyselect::all_of(ID)
+) |>
+dplyr::mutate(ID = make.names(ID)) |>
+# remove all extraneous columns, massively reducing computational need
+dplyr::select(tidyselect::all_of(c("ID", "Datetime", "Lat", "Lon")))
+
+ID <- unique(data$ID)
+rm(data)
+bb.list <- list()
+counter <- 0
+
+for (i in ID) {
+counter <- counter + 1
+# read in area cts
+area.ct <- read.csv(file = file.path(savedir, paste0("areact_", i, ".csv")))
+# Put in list
+bb.list[[counter]] <- area.ct
+} # close for i in ID
+
+# Put everything in a data.frame
+md <- dplyr::bind_rows(bb.list, .id = "column_label") |>
+dplyr::select(!"column_label") # remove column_label column
+# 2023-08-30 quoted column_label to hopefully address gbm.factorplot: no visible binding for global variable ‘column_label’
+
+# read resterres from file
+rasterres <- read.csv(file = file.path(savedir, "Resolutions.csv")) |>
+dplyr::pull(rasterres)
+md$core.use <- (rasterres * md$core.use.new) / 1000000 # convert from cells/pixels to metres squared area based on cell size, then to kilometres squared area
+md$general.use <- (rasterres * md$general.use.new) / 1000000
+
+write.csv(
+md,
+file = file.path(savedir, absVolumeAreaSaveName),
+row.names = FALSE
+)
+
+# 2023-10-04 Vital memory bug warning
+if (all(md$core.use == md$core.use[1]))
+message(
+"All core UDs identical. Maybe insufficient memory for raster calcs - check rasterResolution"
+)
+if (all(md$general.use == md$general.use[1]))
+message(
+"All general UDs identical. Maybe insufficient memory for raster calcs - check rasterResolution"
+)
+if (length(which(md$core.use == max(md$core.use, na.rm = TRUE))) > 1)
+message(
+"More than 1 individual share exactly the same max value for core use, maybe insufficient memory for raster calcs - check rasterResolution"
+)
+if (length(which(md$general.use == max(md$general.use, na.rm = TRUE))) > 1)
+message(
+"More than 1 individual share exactly the same max value for general use, maybe insufficient memory for raster calcs - check rasterResolution"
+)
+if (length(which((md$core.use - md$general.use) == 0)) > 0)
+message(
+"1 or more individuals have exactly the same value for core and general use, maybe insufficient memory for raster calcs - check rasterResolution"
+)
+} # close function
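
The stitching step above reads one `areact_<ID>.csv` per individual, binds them, and rescales cell counts to square kilometres using the value in `Resolutions.csv`. A minimal base-R sketch of that read, bind, and convert sequence follows. The demo directory, IDs, cell counts, and the `rasterres` value of 2500 are invented for illustration, and real `area.ct` tables may carry more columns than the three assumed here. `do.call(rbind, ...)` stands in for `dplyr::bind_rows` so the sketch needs no extra packages.

```r
# Hypothetical demo directory standing in for movegroup's savedir.
demo_dir <- file.path(tempdir(), "stitch_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Made-up per-individual tables, shaped like the areact_*.csv files
# written when saveAreaCT = TRUE (column set is an assumption).
demo_ids <- c("A1", "B2")
demo_tables <- list(
  data.frame(core.use.new = 10, general.use.new = 40, ID = "A1"),
  data.frame(core.use.new = 25, general.use.new = 90, ID = "B2")
)
for (k in seq_along(demo_ids)) {
  write.csv(
    demo_tables[[k]],
    file = file.path(demo_dir, paste0("areact_", demo_ids[k], ".csv")),
    row.names = FALSE
  )
}
# Made-up resolution file; 2500 is a placeholder value.
write.csv(
  data.frame(rasterres = 2500),
  file = file.path(demo_dir, "Resolutions.csv"),
  row.names = FALSE
)

# Read every individual's table back and bind into one data frame.
area_list <- lapply(demo_ids, function(id) {
  read.csv(file.path(demo_dir, paste0("areact_", id, ".csv")))
})
md <- do.call(rbind, area_list)

# Same conversion as in the diff: scale by rasterres, then divide by 1e6 for km^2.
rasterres <- read.csv(file.path(demo_dir, "Resolutions.csv"))$rasterres
md$core.use <- (rasterres * md$core.use.new) / 1000000
md$general.use <- (rasterres * md$general.use.new) / 1000000
```

Because only the small per-individual tables are reloaded, this step stays cheap in memory even when the original `movegroup` runs had to be split.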

README.Rmd

Lines changed: 6 additions & 0 deletions
@@ -109,6 +109,12 @@ However please appraise yourself of the meaning of the various parameters as the
 
 ***
 
+### stitchraster
+
+If over-large datasets cause RAM crashes for movegroup, one can run batches of individuals in movegroup then join the individual saved area.ct csv files.
+
+***
+
 ### scaleraster
 
 Scales Individual Utilization Distribution Rasters and Volume Area Estimates

README.md

Lines changed: 9 additions & 0 deletions
@@ -14,6 +14,7 @@ downloads](https://cranlogs.r-pkg.org/badges/movegroup)](https://cran.r-project.
 <!-- badgeplacer(location = ".", status = "active", githubaccount = SimonDedman, githubrepo = movegroup, branch = master, name = "README.Rmd") -->
 
 <p align="center">
+
 <img src="man/figures/logo.png" width="200">
 </p>
 
@@ -225,6 +226,14 @@ adjust these elements after later seeing the resulting plots from
 
 ------------------------------------------------------------------------
 
+### stitchraster
+
+If over-large datasets cause RAM crashes for movegroup, one can run
+batches of individuals in movegroup then join the individual saved
+area.ct csv files.
+
+------------------------------------------------------------------------
+
 ### scaleraster
 
 Scales Individual Utilization Distribution Rasters and Volume Area
