age_adjust() with keep_age should try harder? #45

@gadenbuie

What should age_adjust() do when the age groups in the data don't match the age groups in the standard population?

library(tidyverse)
library(fcds)

fcds <- fcds_load()

# work with random subsample
fcds <- fcds %>% 
  group_by(!!!rlang::syms(fcds_vars("demo"))) %>% 
  sample_n(1) %>% 
  ungroup()

If we do the regrouping first, age_adjust() will ultimately fail.

fcds_regrouped <- 
  fcds %>% 
  separate_age_groups() %>% 
  mutate(
    age_group = case_when(
      age_high < 20 ~ "< 20",
      age_high < 50 ~ "20 - 49",
      age_high < 65 ~ "50 - 64",
      age_high < 85 ~ "65 - 84",
      TRUE ~ "85 +"
    ),
    age_group = fct_reorder(age_group, age_low)
  )

fcds_vars(.data = fcds_regrouped, "demo")
#> # A tibble: 14,815 x 8
#>    age_group race  sex   origin marital_status birth_country birth_state
#>    <fct>     <fct> <fct> <fct>  <fct>          <fct>         <fct>      
#>  1 < 20      White Male  Non-H… Married; Unma… US States an… Florida    
#>  2 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  3 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  4 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  5 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  6 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  7 < 20      White Male  Non-H… Single, Separ… US States an… Florida    
#>  8 < 20      White Male  Non-H… Single, Separ… US States an… Other US S…
#>  9 < 20      White Male  Non-H… Single, Separ… US States an… Other US S…
#> 10 < 20      White Male  Non-H… Single, Separ… US States an… Other US S…
#> # … with 14,805 more rows, and 1 more variable: primary_payer <fct>

fcds_regrouped %>% 
  count_fcds() %>% 
  age_adjust(keep_age = TRUE)
#> The age groups in `data` do not match any age groups in
#> `population_standard`.

The current way around this is to do the re-grouping after the age adjustment.

fcds %>% 
  count_fcds() %>% 
  age_adjust(keep_age = TRUE) %>% 
  separate_age_groups() %>%
  group_drop(age_group) %>% 
  mutate(
    age_group = case_when(
      age_high < 20 ~ "< 20",
      age_high < 50 ~ "20 - 49",
      age_high < 65 ~ "50 - 64",
      age_high < 85 ~ "65 - 84",
      TRUE ~ "85 +"
    ),
    age_group = fct_reorder(age_group, age_low)
  ) %>% 
  group_by(age_group, add = TRUE) 
#> # A tibble: 126 x 9
#> # Groups:   year, year_mid, age_group [35]
#>    year  year_mid age_group age_low age_high     n population std_pop
#>    <fct> <chr>    <fct>       <dbl>    <dbl> <int>      <dbl>   <dbl>
#>  1 1981… 1983     < 20            0        4    20     672372  1.90e7
#>  2 1981… 1983     < 20            5        9    17     605665  1.99e7
#>  3 1981… 1983     < 20           10       14    17     712639  2.01e7
#>  4 1981… 1983     < 20           15       19    28     789181  1.98e7
#>  5 1981… 1983     20 - 49        20       24    48     890738  1.83e7
#>  6 1981… 1983     20 - 49        25       29    41     892078  1.77e7
#>  7 1981… 1983     20 - 49        30       34    41     793533  1.95e7
#>  8 1981… 1983     20 - 49        35       39    36     686575  2.22e7
#>  9 1981… 1983     20 - 49        40       44    46     581196  2.25e7
#> 10 1981… 1983     20 - 49        45       49    50     514395  1.98e7
#> # … with 116 more rows, and 1 more variable: w <dbl>

But the boundaries of the re-grouped age groups line up with the underlying standard age groups, so age_adjust() could have called standardize_age_groups() on the population data, relative to the age groups in the input data, and done this for us.
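
As a rough sketch of the idea (not how fcds or standardize_age_groups() actually works), the standard population could be collapsed into the coarser age groups found in the input data before the weights are computed. Everything below is hypothetical and for illustration only: the `std` and `data_groups` tables, their column names, and the population counts are made up, and only dplyr and base R are used.

```r
library(dplyr)

# Hypothetical standard population in 5-year bands (made-up counts)
std <- data.frame(
  age_low  = c(0, 5, 10, 15, 20, 25),
  age_high = c(4, 9, 14, 19, 24, 29),
  std_pop  = c(100, 110, 120, 130, 140, 150)
)

# Hypothetical coarser age groups, as they might appear in the input data
data_groups <- data.frame(
  age_group = c("< 20", "20 - 29"),
  grp_low   = c(0, 20),
  grp_high  = c(19, 29)
)

# Pair every standard band with every data group (Cartesian product),
# then keep the pairs where the band sits entirely inside the group
matched <- merge(std, data_groups, by = NULL) %>%
  filter(age_low >= grp_low, age_high <= grp_high)

# Each standard band must land in exactly one data group;
# if not, the boundaries do not align and regrouping is not safe
stopifnot(
  nrow(matched) == nrow(std),
  !anyDuplicated(matched$age_low)
)

# Sum the standard population within each coarser group
std_regrouped <- matched %>%
  group_by(age_group) %>%
  summarise(std_pop = sum(std_pop), .groups = "drop")
```

The containment check is the important part: when a standard band straddles two of the data's age groups, there is no clean way to split its population, and failing with the existing error message would still be the right outcome.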
