diff --git a/posts/published-osdc/index.qmd b/posts/published-osdc/index.qmd
new file mode 100644
index 0000000..959b6b4
--- /dev/null
+++ b/posts/published-osdc/index.qmd
@@ -0,0 +1,75 @@
+---
+title: "Published our first externally collaborated R package, osdc"
+description: |
+    Over the last two years, we've collaborated with a researcher at Steno Aarhus on building an
+    R package called osdc. This package, titled
+    *Open Source Diabetes Classifier*, classifies diabetes status in the
+    Danish registers. And finally, we've published it to CRAN!
+author:
+- Luke W. Johnston
+date: "2025-12-18"
+categories:
+  - packaging
+  - publishing
+  - programming
+---
+
+On December 10th, 2025, we finally published our first R package to
+[CRAN](https://cran.r-project.org/)! :tada:
+
+The package is called
+[osdc](https://cran.r-project.org/web/packages/osdc/index.html), or
+"Open Source Diabetes Classifier", and it is the first package that
+we've built in collaboration with an external researcher, [Anders Aasted
+Isaksen](https://www.stenoaarhus.dk/kontakt/anders-aasted-isaksen/). He
+developed an algorithm to classify type 1 and type 2 diabetes using
+Danish registers as data sources, and we worked together to turn this
+algorithm into an R package that others can use. We started the
+collaboration back in 2023, and after a lot of work, we finally got it
+to a stage where we could publish a first version to CRAN.
+
+The package has two aims (as described in the package
+[documentation](https://steno-aarhus.github.io/osdc/)):
+
+1.  To provide an open-source, code-based algorithm to classify type 1
+    and type 2 diabetes using Danish registers as data sources. There
+    are other diabetes algorithms developed in Denmark for the
+    registers, but they are neither open source nor packaged into a
+    reusable format.
+2.  To inspire discussions within the Danish register-based research
+    space on the openness and ease of use of the existing tooling and
+    registers, and on the need for an official process for updating or
+    contributing to existing data sources.
+
+## Who is it for and why use it?
+
+The main reason for building the osdc package was to provide a tool for
+researchers doing diabetes research with Danish register data to
+classify diabetes. There are no Danish registers that fully capture the
+different ways that a person could be classified with diabetes, as
+administrative diagnosis data is not always complete or accurate. So,
+researchers have had to develop different algorithms to get a better
+idea of who has diabetes in the Danish registers.
+
+However, these algorithms have not been open source, and they have not
+been packaged into reusable tools. This has led to many researchers
+having different "in-house" solutions for their group or organisation
+that other groups can't really use effectively. We wanted to change
+that.
+
+So, we built the osdc package with all the necessary details for
+researchers to classify diabetes status in their own Danish register
+data. For example, the package provides a list of which registers and
+variables are needed through the `registers()` function. Apart from a
+few helper functions, the main function of the package is
+`classify_diabetes()`, which takes all the required registers and
+outputs a data frame with a list of individuals, their diabetes status,
+and the date when the classification was made.
+
+Aside from those functions, the package provides an `algorithm()`
+function that lists all the specific criteria used in the algorithm.
+This makes it easier for others to assess how exactly the algorithm
+classifies diabetes.
+
+The next step is to start using the osdc package in collaborative
+projects that use Statistics Denmark and register data :tada: