25 changes: 22 additions & 3 deletions CLAUDE.md
@@ -6,11 +6,13 @@ R package with utility functions for dynamic reporting, spatial analysis, hydrol

```r
devtools::document()
devtools::test()
devtools::check()
devtools::test() # Use this for development - faster
devtools::check(vignettes = FALSE) # Only when needed; always skip vignettes
devtools::install()
```
Build documentation and run checks before committing.
- Prefer `devtools::test()` during development - it's faster
- Only run `devtools::check()` when preparing for release or CI
- Always skip vignettes during checks (`vignettes = FALSE`)

## Commit Style (fledge)

@@ -72,3 +74,20 @@ All exported functions use prefix `ngr_` followed by category:
- Keep Imports alphabetized
- Don't duplicate packages in both Imports and Suggests
- Add vignette-only packages to Suggests (e.g., mapview, rstac)

## GitHub CLI (gh) Idiosyncrasies

- **Use backticks for function names in issue titles** - makes them pop:
```bash
gh issue create --title "Add \`ngr_fs_type_write()\` for feature" # correct
```
- **No `gh milestone` command** - use `gh api` to create/manage milestones:
```bash
gh api repos/NewGraphEnvironment/ngr/milestones -X POST -f title="Milestone title" -f description="Description"
```
- **`--milestone` flag needs title, not number** - use the milestone name:
```bash
gh issue create --title "Issue" --milestone "Type-preserving flat file operations" # correct
gh issue create --title "Issue" --milestone 6 # won't work
```
- **Flag CLI limitations to user** - if you encounter unexpected CLI behavior, inform the user so they're aware of the limitation
1 change: 1 addition & 0 deletions DESCRIPTION
@@ -24,6 +24,7 @@ Config/testthat/edition: 3
URL: https://github.com/NewGraphEnvironment/ngr, https://newgraphenvironment.github.io/ngr/
BugReports: https://github.com/NewGraphEnvironment/ngr/issues
Imports:
arrow,
chk,
cli,
curl,
7 changes: 7 additions & 0 deletions NAMESPACE
@@ -7,6 +7,8 @@ export(ngr_dbqs_ltree)
export(ngr_dbqs_tbl_quote)
export(ngr_fs_copy_if_missing)
export(ngr_fs_id_missing)
export(ngr_fs_type_read)
export(ngr_fs_type_write)
export(ngr_git_issue)
export(ngr_git_issue_details)
export(ngr_hyd_q_daily)
@@ -44,6 +46,11 @@ export(ngr_tidy_type)
export(ngr_xl_map_colnames)
export(ngr_xl_map_formulas)
export(ngr_xl_read_formulas)
importFrom(arrow,read_csv_arrow)
importFrom(arrow,read_parquet)
importFrom(arrow,schema)
importFrom(arrow,write_csv_arrow)
importFrom(arrow,write_parquet)
importFrom(chk,abort_chk)
importFrom(chk,chk_character)
importFrom(chk,chk_data)
54 changes: 54 additions & 0 deletions R/ngr_fs_type_read.R
@@ -0,0 +1,54 @@
#' Read data from a flat file with type schema preservation
#'
#' Reads a data frame from a flat file format (CSV by default) using a companion
#' parquet schema file to restore column types. This enables type-preserving
#' round-trips for formats that don't natively preserve types.
#'
#' @param path Character. Path to the file to read. A companion schema file
#' (e.g. `data_schema.parquet` for `data.csv`) must exist alongside it.
#' @param format Character. File extension, without the leading dot, that is
#' replaced by `_schema.parquet` to locate the schema file. Default is "csv".
#'
#' @return A [tibble] with column types restored from the schema file.
#' @family fs
#' @family serialization
#' @seealso [ngr_fs_type_write()] for writing files with type preservation
#' @export
#' @importFrom arrow read_csv_arrow read_parquet schema
#' @importFrom chk chk_file chk_string
#'
#' @examples
#' \dontrun{
#' # Create example data with various types
#' df <- data.frame(
#' int_col = 1:3L,
#' dbl_col = c(1.1, 2.2, 3.3),
#' chr_col = c("a", "b", "c"),
#' date_col = as.Date(c("2024-01-01", "2024-01-02", "2024-01-03")),
#' lgl_col = c(TRUE, FALSE, TRUE)
#' )
#'
#' # Write to temporary file
#' path <- tempfile(fileext = ".csv")
#' ngr_fs_type_write(df, path)
#'
#' # Read back with types preserved
#' df2 <- ngr_fs_type_read(path)
#' str(df2)
#'
#' # Compare types
#' sapply(df, class)
#' sapply(df2, class)
#' }
ngr_fs_type_read <- function(path, format = "csv") {
  chk::chk_string(path)
  chk::chk_file(path)
  chk::chk_string(format)

  pattern <- paste0("\\.", format, "$")
  schema_path <- sub(pattern, "_schema.parquet", path, ignore.case = TRUE)
  chk::chk_file(schema_path)

  schema <- arrow::schema(arrow::read_parquet(schema_path))
  arrow::read_csv_arrow(path, schema = schema, skip = 1)
}
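
The reason a companion schema is needed can be seen without arrow at all. The sketch below is a base-R illustration only, not the PR's implementation: a named vector of column classes (`col_classes`, a hypothetical stand-in for the `_schema.parquet` file) is recorded at write time and handed back to the reader, which is roughly the role the schema plays in `ngr_fs_type_read()`.

```r
# Base-R illustration only; the PR itself uses arrow for both steps.
df <- data.frame(
  int_col = 1:3,
  date_col = as.Date(c("2024-01-01", "2024-01-02", "2024-01-03")),
  chr_col = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

path <- tempfile(fileext = ".csv")
utils::write.csv(df, path, row.names = FALSE)

# A plain read has no type information for the date column,
# so it comes back as character.
naive <- utils::read.csv(path, stringsAsFactors = FALSE)

# Record the column classes separately (hypothetical stand-in for the
# companion schema file), then pass them back to the reader.
col_classes <- vapply(df, function(col) class(col)[1], character(1))
restored <- utils::read.csv(
  path,
  colClasses = col_classes,
  stringsAsFactors = FALSE
)

class(naive$date_col)
class(restored$date_col)
```

Storing the schema in a separate file keeps the CSV itself free of type annotations, at the cost of having to keep the two files together.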
58 changes: 58 additions & 0 deletions R/ngr_fs_type_write.R
@@ -0,0 +1,58 @@
#' Write data to a flat file with type schema preservation
#'
#' Writes a data frame to a flat file format (CSV by default) and stores the
#' column type schema in a companion parquet file. This enables type-preserving
#' round-trips for formats that don't natively preserve types, while keeping
#' data in human-readable flat files suitable for GitHub collaboration.
#'
#' @param x A [data.frame] or [tibble] to write.
#' @param path Character. Path to write the file to.
#' @param format Character. File extension, without the leading dot, that is
#' replaced by `_schema.parquet` to name the schema file. Default is "csv".
#'
#' @return Invisibly returns the file path.
#' @family fs
#' @family serialization
#' @seealso [ngr_fs_type_read()] for reading files with preserved types
#' @export
#' @importFrom arrow write_csv_arrow write_parquet
#' @importFrom chk chk_data chk_string
#' @importFrom fs dir_create path_dir
#'
#' @examples
#' \dontrun{
#' # Create example data with various types
#' df <- data.frame(
#' int_col = 1:3L,
#' dbl_col = c(1.1, 2.2, 3.3),
#' chr_col = c("a", "b", "c"),
#' date_col = as.Date(c("2024-01-01", "2024-01-02", "2024-01-03")),
#' lgl_col = c(TRUE, FALSE, TRUE)
#' )
#'
#' # Write to temporary file
#' path <- tempfile(fileext = ".csv")
#' ngr_fs_type_write(df, path)
#'
#' # Schema file is created alongside
#' schema_path <- sub("\\.csv$", "_schema.parquet", path)
#' file.exists(schema_path)
#'
#' # Read back with types preserved
#' df2 <- ngr_fs_type_read(path)
#' str(df2)
#' }
ngr_fs_type_write <- function(x, path, format = "csv") {
  chk::chk_data(x)
  chk::chk_string(path)
  chk::chk_string(format)

  pattern <- paste0("\\.", format, "$")
  schema_path <- sub(pattern, "_schema.parquet", path, ignore.case = TRUE)

  fs::dir_create(fs::path_dir(path))
  arrow::write_csv_arrow(x, path)
  arrow::write_parquet(x[0, , drop = FALSE], schema_path)

  invisible(path)
}
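
One edge case in the path derivation may be worth a guard: when `path` does not end in `.<format>`, `sub()` returns its input unchanged, so `schema_path` would equal `path` and the parquet write would target the same file as the CSV. A minimal sketch of such a guard follows. The helper `schema_path_for()` is hypothetical and not part of this PR, and it uses base `stop()` rather than the package's `chk` conventions.

```r
# Hypothetical helper, not part of the PR: derive the companion schema path
# and fail early when `path` does not end in the expected extension.
schema_path_for <- function(path, format = "csv") {
  pattern <- paste0("\\.", format, "$")
  if (!grepl(pattern, path, ignore.case = TRUE)) {
    stop("`path` must end in '.", format, "': ", path, call. = FALSE)
  }
  sub(pattern, "_schema.parquet", path, ignore.case = TRUE)
}

# Matching extension: the schema path sits next to the data file.
ok <- schema_path_for("data/sites.csv")

# Without the guard, a non-matching extension passes through unchanged.
unguarded <- sub("\\.csv$", "_schema.parquet", "data/sites.txt")
```

The same check would apply on the read side, where an unchanged path currently passes `chk_file()` and only fails later when the CSV is parsed as parquet.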
4 changes: 3 additions & 1 deletion man/ngr_fs_copy_if_missing.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 3 additions & 1 deletion man/ngr_fs_id_missing.Rd

60 changes: 60 additions & 0 deletions man/ngr_fs_type_read.Rd

62 changes: 62 additions & 0 deletions man/ngr_fs_type_write.Rd
