From d4ad089ef02fef43b344da75766d5b67f46668cb Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Mon, 29 Sep 2025 17:00:51 -0400 Subject: [PATCH 01/18] add skeleton for industry talk review blog post --- site/posts/industry-talk-review/index.ipynb | 64 +++++++++++++++++++ site/posts/industry-talk-review/index.qmd | 37 +++++++++++ .../posts/industry-talk-review/references.bib | 28 ++++++++ 3 files changed, 129 insertions(+) create mode 100644 site/posts/industry-talk-review/index.ipynb create mode 100644 site/posts/industry-talk-review/index.qmd create mode 100644 site/posts/industry-talk-review/references.bib diff --git a/site/posts/industry-talk-review/index.ipynb b/site/posts/industry-talk-review/index.ipynb new file mode 100644 index 0000000..eb80204 --- /dev/null +++ b/site/posts/industry-talk-review/index.ipynb @@ -0,0 +1,64 @@ +{ + "cells": [ + { + "cell_type": "raw", + "metadata": {}, + "source": [ + "---\n", + "title: \"Industry Talk Review\"\n", + "author: \"Rita Pecuch\"\n", + "date: \"2025-03-29\"\n", + "categories: [review, git, pharmaceuticals]\n", + "# format:\n", + "# html:\n", + "# toc: true\n", + "# code-fold: true\n", + "bibliography: references.bib\n", + "# TODO: add the csl file and uncomment\n", + "# csl: apa.csl # for APA style citation\n", + "toc: TRUE\n", + "toc-title: \"Table of Contents\"\n", + "toc-depth: 5\n", + "---" + ], + "id": "a8052c25" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Introduction\n", + "\n", + "Test reference @esteva2019guide\n", + "\n", + "# Search Strategy\n", + "\n", + "# Major Themes\n", + "\n", + "## Theme 1\n", + "\n", + "## Theme 2\n", + "\n", + "# Key Findings\n", + "\n", + "# Conclusion\n", + "\n", + "# References\n", + "\n", + "\n", + "```{bibliography}\n", + "```" + ], + "id": "f904cdc1" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} \ No newline at end of file 
diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd new file mode 100644 index 0000000..c0761d8 --- /dev/null +++ b/site/posts/industry-talk-review/index.qmd @@ -0,0 +1,37 @@ +--- +title: "Industry Talk Review" +author: "Rita Pecuch" +date: "2025-03-29" +categories: [review, git, pharmaceuticals] +# format: +# html: +# toc: true +# code-fold: true +bibliography: references.bib +# TODO: add the csl file and uncomment +# csl: apa.csl # for APA style citation +toc: TRUE +toc-title: "Table of Contents" +toc-depth: 5 +--- + +# Introduction + +Test reference @esteva2019guide + +# Search Strategy + +# Major Themes + +## Theme 1 + +## Theme 2 + +# Key Findings + +# Conclusion + +# References + +```{bibliography} +``` diff --git a/site/posts/industry-talk-review/references.bib b/site/posts/industry-talk-review/references.bib new file mode 100644 index 0000000..d426cf7 --- /dev/null +++ b/site/posts/industry-talk-review/references.bib @@ -0,0 +1,28 @@ +@article{esteva2019guide, + title={A guide to deep learning in healthcare}, + author={Esteva, Andre and Chou, Kangchen and Yeung, Serena and Naik, Nikhil and Madani, Ali and Mottaghi, Ali and Liu, Yun and Topol, Eric}, + journal={Nature medicine}, + volume={25}, + number={1}, + pages={24--29}, + year={2019}, + publisher={Nature Publishing Group} +} + +@article{rajkomar2018scalable, + title={Scalable and accurate deep learning with electronic health records}, + author={Rajkomar, Alvin and et al.}, + journal={NPJ Digital Medicine}, + volume={1}, + number={1}, + pages={18}, + year={2018}, + publisher={Nature Publishing Group} +} + +@article{holzinger2017we, + title={What do we need to build explainable AI systems for the medical domain?}, + author={Holzinger, Andreas and et al.}, + journal={arXiv preprint arXiv:1712.09923}, + year={2017} +} From 70d8259e10a9355363387487fbc8c58eaeda3aea Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Tue, 30 Sep 2025 08:39:08 -0400 Subject: [PATCH 
02/18] added introduction draft and start of search strategy --- site/posts/industry-talk-review/index.ipynb | 8 ++++---- site/posts/industry-talk-review/index.qmd | 12 ++++++------ 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/site/posts/industry-talk-review/index.ipynb b/site/posts/industry-talk-review/index.ipynb index eb80204..acf1126 100644 --- a/site/posts/industry-talk-review/index.ipynb +++ b/site/posts/industry-talk-review/index.ipynb @@ -13,7 +13,7 @@ "# html:\n", "# toc: true\n", "# code-fold: true\n", - "bibliography: references.bib\n", + "# bibliography: references.bib\n", "# TODO: add the csl file and uncomment\n", "# csl: apa.csl # for APA style citation\n", "toc: TRUE\n", @@ -21,7 +21,7 @@ "toc-depth: 5\n", "---" ], - "id": "a8052c25" + "id": "18a4a0be" }, { "cell_type": "markdown", @@ -29,7 +29,7 @@ "source": [ "# Introduction\n", "\n", - "Test reference @esteva2019guide\n", + "Test reference \n", "\n", "# Search Strategy\n", "\n", @@ -49,7 +49,7 @@ "```{bibliography}\n", "```" ], - "id": "f904cdc1" + "id": "858e852d" } ], "metadata": { diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index c0761d8..5aaf504 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -7,9 +7,6 @@ categories: [review, git, pharmaceuticals] # html: # toc: true # code-fold: true -bibliography: references.bib -# TODO: add the csl file and uncomment -# csl: apa.csl # for APA style citation toc: TRUE toc-title: "Table of Contents" toc-depth: 5 @@ -17,10 +14,14 @@ toc-depth: 5 # Introduction -Test reference @esteva2019guide +The usage of Git in clinical programming within the pharmaceutical industry has been steadily increasing over the last several years. Beyond the discussions of our PHUSE working group, several talks at recent industry events have showcased how various companies are leveraging this technology. 
# Search Strategy +The search for materials spanned a few major conferences over the past few years. The PHUSE archive was the primary resource utilized, as heavy representation from pharmaceutical companies is customary for PHUSE events. In addition, talks from the Posit Conference were evaluated, as this is a major open-source technology event which typically includes at least a small degree of representation from the pharmaceutical industry. + +R/Pharma + # Major Themes ## Theme 1 @@ -33,5 +34,4 @@ Test reference @esteva2019guide # References -```{bibliography} -``` + From 443a82b63bb1c3e9df7be614bd2989e69ad9777c Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Tue, 30 Sep 2025 14:44:17 -0400 Subject: [PATCH 03/18] added citations for papers that will be reviewed --- site/posts/industry-talk-review/index.qmd | 34 +++++++++++++++++++---- 1 file changed, 28 insertions(+), 6 deletions(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index 5aaf504..62dac71 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -3,10 +3,6 @@ title: "Industry Talk Review" author: "Rita Pecuch" date: "2025-03-29" categories: [review, git, pharmaceuticals] -# format: -# html: -# toc: true -# code-fold: true toc: TRUE toc-title: "Table of Contents" toc-depth: 5 @@ -18,9 +14,9 @@ The usage of Git in clinical programming within the pharmaceutical industry has # Search Strategy -The search for materials spanned a few major conferences over the past few years. The PHUSE archive was the primary resource utilized, as heavy representation from pharmaceutical companies is customary for PHUSE events. In addition, talks from the Posit Conference were evaluated, as this is a major open-source technology event which typically includes at least a small degree of representation from the pharmaceutical industry. 
+The search for materials spanned the 2025 PHUSE archive, as heavy representation from pharmaceutical companies is customary for PHUSE events. -R/Pharma +The aim is not only to summarize how companies are leveraging Git, but to evaluate applications of open-source technology and propose how Git can be useful. # Major Themes @@ -34,4 +30,30 @@ R/Pharma # References +In paper reference template - white papers: +(Furst & DeMillo, 2006) +Andhale, S.H., Sood, S. (2025). Harnessing the Power of R Shiny in a GxP compliant and validated manner for clinical trials use [White paper OS3464]. Sycamore Informatics India Private Limited. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS10.pdf + +Arancibia, B. (2025). What R We Counting? The Quest to Quantify R Usage in Biostatistics Organizations [White paper OS02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS02.pdf + +Ching, E. (2025). Learning to skate, pedal, and drive – Change leadership in the world of Agile delivery [White paper PD02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PD02.pdf + +Hume, S. (2025). A Technical Roadmap Defining CDISC's Path to End-to-End Automation [White paper DS05]. CDISC. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_DS05.pdf + +Kim, J. (2025). The changes in the job of statistical programmer in the +pharmaceutical industry [White paper ET07]. Pfizer Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET07.pdf + +Liao, C. (2025). Building Rome in a Few Months: How Open Source Technologies Empower Small Pharma [White paper OS07]. Recursion. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS07.pdf + +Opalka, A. (2025). GitLab vs GitHub: The Battle of the Repositories [Poster PP05]. Roche. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/CSS/EU/Utrecht/POS_PP05.pdf + +Patel, J. 
(2025). Evaluation of Open-Source Technologies in Clinical Trial Reporting [White paper PP15]. AstraZeneca US Biometric. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP15.pdf + +Shi, T., Song, L., Nomula, V.K. (2025). Implementation of a risk-governed, reproducible, and collaborative R workflow for Real World Evidence Projects [White paper PP27]. Merck & Co., Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP27.pdf + +Tahiliani, K., Jain, A. (2025). A Journey Through an Innovative Approach to Migration of Legacy System to New Multilingual Statistical Computing Environment (SCE). GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET21.pdf + +Veríssimo, A., Viyash, V. (2025). Accelerating Shiny Application Development and Validation with Rhino [White paper OS15]. Appsilon. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS15.pdf + +Zhang, P., Chen, J., Yang, F., Chang, V., Lee, J. (2025). Generate Clinical Study Report (CSR) document using {quarto} and {shiny} [White paper OS06]. CIMS Global. 
https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS06.pdf From 414c384a4e4e342616be744a497166a726f734ed Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Wed, 1 Oct 2025 10:55:33 -0400 Subject: [PATCH 04/18] added notes from first paper review --- site/posts/industry-talk-review/index.qmd | 40 ++++++++++++++++++++--- 1 file changed, 35 insertions(+), 5 deletions(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index 62dac71..d736821 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -20,9 +20,42 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua # Major Themes -## Theme 1 +## Increasing Usage of Open-Source Programming -## Theme 2 +(Andhale & Sood, 2025) +- R is powerful language for statistical modeling and hypothesis testing within the industry +- Clean data, apply stats models, and implement ML algorithms all within single platform +- Shiny allows visualizing large datasets that are common in this industry, and get actionable insights +- HTML, CSS, and JS knowledge not required --> so lowered barrier for getting more people into open-source programming +- Interactive elements for deeper data exploration +- Integrate data from several sources, across studies, leverage historical data +- More flexibility to meet diverse needs of stakeholders +- Scalability important for exponentially increasing amounts of data due to multicenter international trials + +## Rise of Agile Methologies + +(Andhale & Sood, 2025) +- becoming more common +- Real-time communication and iterative progress in workflows + +## Need for Version Control + +(Andhale & Sood, 2025) +- Important to support agile and collaborative development +- Particulary useful for complex projects + + +## GxP Compliance + +(Andhale & Sood, 2025): +- All open source packages used for regulatory submissions must be validated (proven to meet 
regulatory standards), documented, and controlled in SCE +- Must be traceability of all code changes +- Allows reproducibility of all versions of code + +## Collaborative Learning Communities + +(Andhale & Sood, 2025) +- For R programming # Key Findings @@ -30,9 +63,6 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua # References -In paper reference template - white papers: -(Furst & DeMillo, 2006) - Andhale, S.H., Sood, S. (2025). Harnessing the Power of R Shiny in a GxP compliant and validated manner for clinical trials use [White paper OS3464]. Sycamore Informatics India Private Limited. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS10.pdf Arancibia, B. (2025). What R We Counting? The Quest to Quantify R Usage in Biostatistics Organizations [White paper OS02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS02.pdf From 16256f1c839ba38865d533d7660dacea0d3f0751 Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Wed, 1 Oct 2025 10:57:03 -0400 Subject: [PATCH 05/18] removed .pynb file --- site/posts/industry-talk-review/index.ipynb | 64 --------------------- 1 file changed, 64 deletions(-) delete mode 100644 site/posts/industry-talk-review/index.ipynb diff --git a/site/posts/industry-talk-review/index.ipynb b/site/posts/industry-talk-review/index.ipynb deleted file mode 100644 index acf1126..0000000 --- a/site/posts/industry-talk-review/index.ipynb +++ /dev/null @@ -1,64 +0,0 @@ -{ - "cells": [ - { - "cell_type": "raw", - "metadata": {}, - "source": [ - "---\n", - "title: \"Industry Talk Review\"\n", - "author: \"Rita Pecuch\"\n", - "date: \"2025-03-29\"\n", - "categories: [review, git, pharmaceuticals]\n", - "# format:\n", - "# html:\n", - "# toc: true\n", - "# code-fold: true\n", - "# bibliography: references.bib\n", - "# TODO: add the csl file and uncomment\n", - "# csl: apa.csl # for APA style citation\n", - "toc: TRUE\n", - "toc-title: \"Table of 
Contents\"\n", - "toc-depth: 5\n", - "---" - ], - "id": "18a4a0be" - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Introduction\n", - "\n", - "Test reference \n", - "\n", - "# Search Strategy\n", - "\n", - "# Major Themes\n", - "\n", - "## Theme 1\n", - "\n", - "## Theme 2\n", - "\n", - "# Key Findings\n", - "\n", - "# Conclusion\n", - "\n", - "# References\n", - "\n", - "\n", - "```{bibliography}\n", - "```" - ], - "id": "858e852d" - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} \ No newline at end of file From dde4ab5c4f8f94fef64786bda2e550114ea24a7f Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Wed, 1 Oct 2025 10:57:52 -0400 Subject: [PATCH 06/18] removed old external references file --- .../posts/industry-talk-review/references.bib | 28 ------------------- 1 file changed, 28 deletions(-) delete mode 100644 site/posts/industry-talk-review/references.bib diff --git a/site/posts/industry-talk-review/references.bib b/site/posts/industry-talk-review/references.bib deleted file mode 100644 index d426cf7..0000000 --- a/site/posts/industry-talk-review/references.bib +++ /dev/null @@ -1,28 +0,0 @@ -@article{esteva2019guide, - title={A guide to deep learning in healthcare}, - author={Esteva, Andre and Chou, Kangchen and Yeung, Serena and Naik, Nikhil and Madani, Ali and Mottaghi, Ali and Liu, Yun and Topol, Eric}, - journal={Nature medicine}, - volume={25}, - number={1}, - pages={24--29}, - year={2019}, - publisher={Nature Publishing Group} -} - -@article{rajkomar2018scalable, - title={Scalable and accurate deep learning with electronic health records}, - author={Rajkomar, Alvin and et al.}, - journal={NPJ Digital Medicine}, - volume={1}, - number={1}, - pages={18}, - year={2018}, - publisher={Nature Publishing Group} -} - -@article{holzinger2017we, - title={What do we need to build explainable AI systems for the medical 
domain?}, - author={Holzinger, Andreas and et al.}, - journal={arXiv preprint arXiv:1712.09923}, - year={2017} -} From 79ac619320ebfecd3cff6ec6aaeab4993f5c9dee Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Thu, 2 Oct 2025 13:55:01 -0400 Subject: [PATCH 07/18] added notes from arancibia paper --- site/posts/industry-talk-review/index.qmd | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index d736821..5a35ab9 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -32,6 +32,9 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua - More flexibility to meet diverse needs of stakeholders - Scalability important for exponentially increasing amounts of data due to multicenter international trials +(Arancibia, 2025) +- R is primary open-source language for biostatistics (for many organizations) + ## Rise of Agile Methologies (Andhale & Sood, 2025) @@ -44,6 +47,14 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua - Important to support agile and collaborative development - Particulary useful for complex projects +(Arancibia, 2025) +- Need data sources for metrics to report to senior leadership (topics such as how much open-source is being used) +• R Code size out of total code (R and SAS) +• Total Repos with any R usage +• Repos where R code size (out of R and SAS) is greater than Target Precent +• Growth of R: Rate of change of R code in a repo and organization +- Lots of useful info can be pulled with GitHub API to automate metrics reporting. 
And several languages including R have packages that interface with the GitHub API and simplify the syntax of API call execution + ## GxP Compliance @@ -52,21 +63,26 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua - Must be traceability of all code changes - Allows reproducibility of all versions of code -## Collaborative Learning Communities +## Training and Support (Andhale & Sood, 2025) -- For R programming +- Collaborative learning communities for R programming + +(Arancibia, 2025) +- Pulling specific info via GitHub API to assess where repos need improvements # Key Findings # Conclusion # References +Posit. (2024, November). GSK's R Journey: From Pilot Projects to Enterprise Adoption | Hosted by Posit. Retrieved from https://www.youtube.com/watch?v=xDrt6txplek Andhale, S.H., Sood, S. (2025). Harnessing the Power of R Shiny in a GxP compliant and validated manner for clinical trials use [White paper OS3464]. Sycamore Informatics India Private Limited. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS10.pdf Arancibia, B. (2025). What R We Counting? The Quest to Quantify R Usage in Biostatistics Organizations [White paper OS02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS02.pdf +NEXT: Ching, E. (2025). Learning to skate, pedal, and drive – Change leadership in the world of Agile delivery [White paper PD02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PD02.pdf Hume, S. (2025). A Technical Roadmap Defining CDISC's Path to End-to-End Automation [White paper DS05]. CDISC. 
https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_DS05.pdf From 6a08ba4c6789b81d9a748e1253824e87324c8242 Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Thu, 9 Oct 2025 10:39:37 -0400 Subject: [PATCH 08/18] added notes from ching paper --- site/posts/industry-talk-review/index.qmd | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index 5a35ab9..7f51212 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -41,6 +41,11 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua - becoming more common - Real-time communication and iterative progress in workflows +(Ching, 2025) +- dynamic and iterative approach +- needs tailored approach for leading and managing change +- sprints to allow continuous improvement + ## Need for Version Control (Andhale & Sood, 2025) @@ -71,6 +76,14 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua (Arancibia, 2025) - Pulling specific info via GitHub API to assess where repos need improvements +(Ching, 2025): +- "Although GitHub has many selling points, many of its concepts are new and foreign to most of the statisticians and programmers going through this transition. Taking into account the effort and time required to upskill an entire department in a very new way of working, the transition to GitHub proved to be more daunting than we realized, as it was not just about learning how to use GitHub, it was also about learning to adapt to process changes to leverage the full capabilities of GitHub such as branching, tags, and the level of granularity in its version control" +- "creating a culture that embraces change is essential. 
This can be achieved through leadership support, clear +communication of the benefits of change, and recognition of employees' efforts during the transition" +- "it’s important to celebrate and appreciate those who are leading, implementing, and embracing change to +maintain morale and motivation. This means celebrating small wins and milestones achieved during the project" +- technical support important --> in this case helpful to have SMEs embedded in teams to help out + # Key Findings # Conclusion @@ -82,9 +95,9 @@ Andhale, S.H., Sood, S. (2025). Harnessing the Power of R Shiny in a GxP complia Arancibia, B. (2025). What R We Counting? The Quest to Quantify R Usage in Biostatistics Organizations [White paper OS02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS02.pdf -NEXT: Ching, E. (2025). Learning to skate, pedal, and drive – Change leadership in the world of Agile delivery [White paper PD02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PD02.pdf +LEFT OFF HERE: Hume, S. (2025). A Technical Roadmap Defining CDISC's Path to End-to-End Automation [White paper DS05]. CDISC. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_DS05.pdf Kim, J. (2025). 
The changes in the job of statistical programmer in the From 7dfafa293dd1662e253a920386004b57e0b61a9f Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Mon, 13 Oct 2025 08:19:15 -0400 Subject: [PATCH 09/18] added notes from hume paper --- site/posts/industry-talk-review/index.qmd | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index 7f51212..a0dcf0f 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -35,6 +35,9 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua (Arancibia, 2025) - R is primary open-source language for biostatistics (for many organizations) +(Hume, 2025) +- CDISC using open-source tools and test frameworks in the Enable and Automate pillar of 360i strategy + ## Rise of Agile Methologies (Andhale & Sood, 2025) @@ -60,6 +63,9 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua • Growth of R: Rate of change of R code in a repo and organization - Lots of useful info can be pulled with GitHub API to automate metrics reporting. And several languages including R have packages that interface with the GitHub API and simplify the syntax of API call execution +(Hume, 2025) +- "GitHub provides the means to collaborate on software and standards development activities as well as providing mechanisms for discussions, documentation, and planning activities." + ## GxP Compliance @@ -97,9 +103,9 @@ Arancibia, B. (2025). What R We Counting? The Quest to Quantify R Usage in Biost Ching, E. (2025). Learning to skate, pedal, and drive – Change leadership in the world of Agile delivery [White paper PD02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PD02.pdf -LEFT OFF HERE: Hume, S. (2025). A Technical Roadmap Defining CDISC's Path to End-to-End Automation [White paper DS05]. CDISC. 
https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_DS05.pdf +LEFT OFF HERE: Kim, J. (2025). The changes in the job of statistical programmer in the pharmaceutical industry [White paper ET07]. Pfizer Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET07.pdf From a8bfd00b65c2bedbd1ba1365262d15b8c726d864 Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Tue, 14 Oct 2025 11:38:01 -0400 Subject: [PATCH 10/18] added notes from kim paper --- site/posts/industry-talk-review/index.qmd | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index a0dcf0f..201153c 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -38,6 +38,10 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua (Hume, 2025) - CDISC using open-source tools and test frameworks in the Enable and Automate pillar of 360i strategy +(Kim, 2025): +- rise of open-source making distinction between statistical programmers and traditional CS programmers less clear, as distcintion used to be based on use of proprietary software by stat programmers +- open-source encourages collaboration + ## Rise of Agile Methologies (Andhale & Sood, 2025) @@ -74,6 +78,15 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua - Must be traceability of all code changes - Allows reproducibility of all versions of code +(Kim, 2025): +- bound to work with the CFR21 Part 11 rule +- programming must be done in controlled environment +- do not violate data privacy +- can share certain metadata information with RAs to simplify reproducibility of programming environments +- Testing and validation should not be used interchangeably. 
Testing can stay with computer science programmers (CSP) and use open source tools such as GitHub Actions, validation of data outputs should stay with stat programmers (DSPs) and follow SOPs +- Set of standard checks on code can be helpful to improve overall program quality +- CSPs more concerned with if logic produced expected results given particular test data, and DSPs more concerned with comparisons of data outputs + ## Training and Support (Andhale & Sood, 2025) @@ -105,10 +118,10 @@ Ching, E. (2025). Learning to skate, pedal, and drive – Change leadership in t Hume, S. (2025). A Technical Roadmap Defining CDISC's Path to End-to-End Automation [White paper DS05]. CDISC. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_DS05.pdf -LEFT OFF HERE: Kim, J. (2025). The changes in the job of statistical programmer in the pharmaceutical industry [White paper ET07]. Pfizer Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET07.pdf +LEFT OFF HERE: Liao, C. (2025). Building Rome in a Few Months: How Open Source Technologies Empower Small Pharma [White paper OS07]. Recursion. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS07.pdf Opalka, A. (2025). GitLab vs GitHub: The Battle of the Repositories [Poster PP05]. Roche. 
https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/CSS/EU/Utrecht/POS_PP05.pdf From 81efa51a3d01952dc89bf91e187ed65cea7a1199 Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Wed, 15 Oct 2025 15:10:35 -0400 Subject: [PATCH 11/18] added notes from liao paper --- site/posts/industry-talk-review/index.qmd | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index 201153c..9203db5 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -42,6 +42,11 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua - rise of open-source making distinction between statistical programmers and traditional CS programmers less clear, as distcintion used to be based on use of proprietary software by stat programmers - open-source encourages collaboration +(Liao, 2025): +- use or propietary software is expensive and restricts freedom of operation +- some visualization tools have steep learning curves for both developers and viewers +- transparency of code is an attractive feature --> so let's also have transparency in all code changes + ## Rise of Agile Methologies (Andhale & Sood, 2025) @@ -70,6 +75,9 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua (Hume, 2025) - "GitHub provides the means to collaborate on software and standards development activities as well as providing mechanisms for discussions, documentation, and planning activities." +(Liao, 2025): +- Git is integrated with several platforms that can be used as SCEs, such as Posit Workbench/Connect + ## GxP Compliance @@ -103,6 +111,9 @@ communication of the benefits of change, and recognition of employees' efforts d maintain morale and motivation. 
This means celebrating small wins and milestones achieved during the project" - technical support important --> in this case helpful to have SMEs embedded in teams to help out +(Liao, 2025): +- embrace the openness of knowledge sharing in the open-source community + # Key Findings # Conclusion @@ -121,9 +132,9 @@ Hume, S. (2025). A Technical Roadmap Defining CDISC's Path to End-to-End Automat Kim, J. (2025). The changes in the job of statistical programmer in the pharmaceutical industry [White paper ET07]. Pfizer Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET07.pdf -LEFT OFF HERE: Liao, C. (2025). Building Rome in a Few Months: How Open Source Technologies Empower Small Pharma [White paper OS07]. Recursion. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS07.pdf +LEFT OFF HERE: Opalka, A. (2025). GitLab vs GitHub: The Battle of the Repositories [Poster PP05]. Roche. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/CSS/EU/Utrecht/POS_PP05.pdf Patel, J. (2025). Evaluation of Open-Source Technologies in Clinical Trial Reporting [White paper PP15]. AstraZeneca US Biometric. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP15.pdf From 8a50cacf2b109b5764c2e71edcc96cf7e1712162 Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Thu, 16 Oct 2025 08:46:37 -0400 Subject: [PATCH 12/18] added notes from opalka poster --- site/posts/industry-talk-review/index.qmd | 24 +++++++++++++++++++++-- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index 9203db5..87abdbf 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -114,7 +114,27 @@ maintain morale and motivation. 
This means celebrating small wins and milestones (Liao, 2025): - embrace the openness of knowledge sharing in the open-source community -# Key Findings +## Tool Comparisons + +(Opalka, 2025): +- GitHub: + - Owned by Microsoft + - More users across community + - Unclear if you can suggest code edits which can then be accepted/rejected + - Centered around main branch + - Cheaper +- GitLab + - Standalone company + - Less users across community + - Can suggest code edits which can then be accepted/rejected + - Encourages use of pre-production branch before production branch + - More expensive +- "From an enterprise perspective… + Github is much cheaper than Gitlab - the enterprise plans are $21 per user/month and $99 per +user/month respectively, however Gitlab has security elements already included and with GitHub +these are about $50/month extra (Prices as of 2024). Which repository is for you ultimately +depends on whether you want to prioritise stability, safety and an all-in-one solution (GitLab) +or speed, cheaper costs and a bigger community (GitHub)." # Conclusion @@ -134,9 +154,9 @@ pharmaceutical industry [White paper ET07]. Pfizer Inc. https://phuse.s3.eu-cent Liao, C. (2025). Building Rome in a Few Months: How Open Source Technologies Empower Small Pharma [White paper OS07]. Recursion. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS07.pdf -LEFT OFF HERE: Opalka, A. (2025). GitLab vs GitHub: The Battle of the Repositories [Poster PP05]. Roche. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/CSS/EU/Utrecht/POS_PP05.pdf +LEFT OFF HERE: Patel, J. (2025). Evaluation of Open-Source Technologies in Clinical Trial Reporting [White paper PP15]. AstraZeneca US Biometric. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP15.pdf Shi, T., Song, L., Nomula, V.K. (2025). 
Implementation of a risk-governed, reproducible, and collaborative R workflow for Real World Evidence Projects [White paper PP27]. Merck & Co., Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP27.pdf From 4ce5daa38d8b834b5666a1b6837722e4f9f3cfa0 Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Thu, 16 Oct 2025 09:00:21 -0400 Subject: [PATCH 13/18] added notes from patel paper --- site/posts/industry-talk-review/index.qmd | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index 87abdbf..0af2fc4 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -47,6 +47,12 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua - some visualization tools have steep learning curves for both developers and viewers - transparency of code is an attractive feature --> so let's also have transparency in all code changes +(Patel, 2025): +- Customize analyses to meet specific needs in clinical trials +- Easy re-running and updates of results when using scripting +- R stands out for data wrangling/manipulation and specialized statistical packages +- Python stands out for automation and machine learning/AI + ## Rise of Agile Methologies (Andhale & Sood, 2025) @@ -95,6 +101,9 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua - Set of standard checks on code can be helpful to improve overall program quality - CSPs more concerned with if logic produced expected results given particular test data, and DSPs more concerned with comparisons of data outputs +(Patel, 2025): +- reproducibility and transparency are key + ## Training and Support (Andhale & Sood, 2025) @@ -156,9 +165,9 @@ Liao, C. (2025). Building Rome in a Few Months: How Open Source Technologies Emp Opalka, A. (2025). 
GitLab vs GitHub: The Battle of the Repositories [Poster PP05]. Roche. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/CSS/EU/Utrecht/POS_PP05.pdf -LEFT OFF HERE: Patel, J. (2025). Evaluation of Open-Source Technologies in Clinical Trial Reporting [White paper PP15]. AstraZeneca US Biometric. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP15.pdf +LEFT OFF HERE: Shi, T., Song, L., Nomula, V.K. (2025). Implementation of a risk-governed, reproducible, and collaborative R workflow for Real World Evidence Projects [White paper PP27]. Merck & Co., Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP27.pdf Tahiliani, K., Jain, A. (2025). A Journey Through an Innovative Approach to Migration of Legacy System to New Multilingual Statistical Computing Environment (SCE). GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET21.pdf From feb5aa39e68405ac7f65e21f162ae826ef9c4829 Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Tue, 21 Oct 2025 16:46:51 -0400 Subject: [PATCH 14/18] added notes from shi et al paper --- site/posts/industry-talk-review/index.qmd | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index 0af2fc4..e5c1038 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -84,6 +84,10 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua (Liao, 2025): - Git is integrated with several platforms that can be used as SCEs, such as Posit Workbench/Connect +(Shi, 2025): +- Use templating feature to set up default files / folder structure +- Many steps can be automated with a bash script + ## GxP Compliance @@ -104,6 +108,9 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua (Patel, 2025): - reproducibility and transparency are key +(Shi, 
2025): +- reproducibility is key + ## Training and Support (Andhale & Sood, 2025) @@ -167,11 +174,15 @@ Opalka, A. (2025). GitLab vs GitHub: The Battle of the Repositories [Poster PP05 Patel, J. (2025). Evaluation of Open-Source Technologies in Clinical Trial Reporting [White paper PP15]. AstraZeneca US Biometric. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP15.pdf -LEFT OFF HERE: Shi, T., Song, L., Nomula, V.K. (2025). Implementation of a risk-governed, reproducible, and collaborative R workflow for Real World Evidence Projects [White paper PP27]. Merck & Co., Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP27.pdf +LEFT OFF HERE: Tahiliani, K., Jain, A. (2025). A Journey Through an Innovative Approach to Migration of Legacy System to New Multilingual Statistical Computing Environment (SCE). GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET21.pdf Veríssimo, A., Viyash, V. (2025). Accelerating Shiny Application Development and Validation with Rhino [White paper OS15]. Appsilon. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS15.pdf Zhang, P., Chen, J., Yang, F., Chang, V., Lee, J. (2025). Generate Clinical Study Report (CSR) document using {quarto} and {shiny} [White paper OS06]. CIMS Global. 
https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS06.pdf + +A Risk-based Approach for Assessing R package Accuracy within a Validated Infrastructure: +https://www.pharmar.org/white-paper/ +External R Package Qualification Process in Regulated Environment: PharmaSUG-2022-SI-057.pdf From ce235ebc9b369a492b2776646a7f912a3029f587 Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Thu, 23 Oct 2025 12:49:36 -0400 Subject: [PATCH 15/18] added notes from tahiliani paper --- site/posts/industry-talk-review/index.qmd | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index e5c1038..0d0bd92 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -64,6 +64,9 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua - needs tailored approach for leading and managing change - sprints to allow continuous improvement +(Tahiliani, 2025): +- some principles can be useful for SCE migration such as iterative progress and continuous feedback + ## Need for Version Control (Andhale & Sood, 2025) @@ -111,6 +114,9 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua (Shi, 2025): - reproducibility is key +(PHUSE WG Notes): +- align on access strategy (i.e. code from production programmers should not be visible to QC programmers) + ## Training and Support (Andhale & Sood, 2025) @@ -130,6 +136,11 @@ maintain morale and motivation. This means celebrating small wins and milestones (Liao, 2025): - embrace the openness of knowledge sharing in the open-source community +(Tahiliani, 2025): +- "Providing a few months' advance notice helps team members process the information, reducing anxiety and allowing them time to understand the change and come to terms with it." 
+- Build a positive narrative and focus on the benefits +- Repetition is important + ## Tool Comparisons (Opalka, 2025): @@ -176,9 +187,9 @@ Patel, J. (2025). Evaluation of Open-Source Technologies in Clinical Trial Repor Shi, T., Song, L., Nomula, V.K. (2025). Implementation of a risk-governed, reproducible, and collaborative R workflow for Real World Evidence Projects [White paper PP27]. Merck & Co., Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP27.pdf -LEFT OFF HERE: Tahiliani, K., Jain, A. (2025). A Journey Through an Innovative Approach to Migration of Legacy System to New Multilingual Statistical Computing Environment (SCE). GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET21.pdf +LEFT OFF HERE: Veríssimo, A., Viyash, V. (2025). Accelerating Shiny Application Development and Validation with Rhino [White paper OS15]. Appsilon. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS15.pdf Zhang, P., Chen, J., Yang, F., Chang, V., Lee, J. (2025). Generate Clinical Study Report (CSR) document using {quarto} and {shiny} [White paper OS06]. CIMS Global. 
https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS06.pdf From b3564cc9056bf026037147df2d9244ece1f2ec5f Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Thu, 30 Oct 2025 15:22:14 -0400 Subject: [PATCH 16/18] added notes from zhang paper --- site/posts/industry-talk-review/index.qmd | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index 0d0bd92..0924b54 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -91,6 +91,10 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua - Use templating feature to set up default files / folder structure - Many steps can be automated with a bash script +(Zhang, 2025): +- leveraging automated deployment features to minimize human error +- using PRs also encourgaes more collaboration and learning from team members + ## GxP Compliance @@ -117,6 +121,9 @@ The aim is not only to summarize how companies are leveraging Git, but to evalua (PHUSE WG Notes): - align on access strategy (i.e. code from production programmers should not be visible to QC programmers) +(Zhang, 2025): +- complete history of code changes which cannot be modified + ## Training and Support (Andhale & Sood, 2025) @@ -189,11 +196,10 @@ Shi, T., Song, L., Nomula, V.K. (2025). Implementation of a risk-governed, repro Tahiliani, K., Jain, A. (2025). A Journey Through an Innovative Approach to Migration of Legacy System to New Multilingual Statistical Computing Environment (SCE). GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET21.pdf -LEFT OFF HERE: -Veríssimo, A., Viyash, V. (2025). Accelerating Shiny Application Development and Validation with Rhino [White paper OS15]. Appsilon. 
https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS15.pdf - Zhang, P., Chen, J., Yang, F., Chang, V., Lee, J. (2025). Generate Clinical Study Report (CSR) document using {quarto} and {shiny} [White paper OS06]. CIMS Global. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS06.pdf +LEFT OFF HERE: A Risk-based Approach for Assessing R package Accuracy within a Validated Infrastructure: https://www.pharmar.org/white-paper/ + External R Package Qualification Process in Regulated Environment: PharmaSUG-2022-SI-057.pdf From 26d953ecd0fb77759fa8f8024a1943bb9bbe4908 Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Tue, 18 Nov 2025 12:56:26 -0500 Subject: [PATCH 17/18] finished first draft --- site/posts/industry-talk-review/index.qmd | 181 +++------------------- 1 file changed, 25 insertions(+), 156 deletions(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index 0924b54..914bdf8 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -10,196 +10,65 @@ toc-depth: 5 # Introduction -The usage of Git in clinical programming within the pharmaceutical industry has been steadily increasing over the last several years. Beyond the discussions of our PHUSE working group, several talks at recent industry events have showcased how various companies are leveraging this technology. +The usage of Git in statistical programming within the pharmaceutical industry has been steadily increasing over the last several years. Beyond the discussions of our PHUSE working group, several talks at recent industry events have showcased why and how various companies are leveraging this technology. # Search Strategy -The search for materials spanned the 2025 PHUSE archive, as heavy representation from pharmaceutical companies is customary for PHUSE events. 
- -The aim is not only to summarize how companies are leveraging Git, but to evaluate applications of open-source technology and propose how Git can be useful. +The search for materials spanned the 2025 PHUSE archive, as heavy representation from pharmaceutical companies is customary for PHUSE events, as well as one relevant paper that was presented at PharmaSUG 2025 and R/Pharma 2025. # Major Themes ## Increasing Usage of Open-Source Programming -(Andhale & Sood, 2025) -- R is powerful language for statistical modeling and hypothesis testing within the industry -- Clean data, apply stats models, and implement ML algorithms all within single platform -- Shiny allows visualizing large datasets that are common in this industry, and get actionable insights -- HTML, CSS, and JS knowledge not required --> so lowered barrier for getting more people into open-source programming -- Interactive elements for deeper data exploration -- Integrate data from several sources, across studies, leverage historical data -- More flexibility to meet diverse needs of stakeholders -- Scalability important for exponentially increasing amounts of data due to multicenter international trials - -(Arancibia, 2025) -- R is primary open-source language for biostatistics (for many organizations) - -(Hume, 2025) -- CDISC using open-source tools and test frameworks in the Enable and Automate pillar of 360i strategy - -(Kim, 2025): -- rise of open-source making distinction between statistical programmers and traditional CS programmers less clear, as distcintion used to be based on use of proprietary software by stat programmers -- open-source encourages collaboration - -(Liao, 2025): -- use or propietary software is expensive and restricts freedom of operation -- some visualization tools have steep learning curves for both developers and viewers -- transparency of code is an attractive feature --> so let's also have transparency in all code changes - -(Patel, 2025): -- Customize analyses to meet 
specific needs in clinical trials -- Easy re-running and updates of results when using scripting -- R stands out for data wrangling/manipulation and specialized statistical packages -- Python stands out for automation and machine learning/AI - -## Rise of Agile Methologies +Open-source languages, particularly R, have emerged as powerful tools for reporting, statistical modeling, and enabling actionable insights within the pharmaceutical industry, capable of facilitating entire workflows from data cleaning to advanced analytics in a single environment [1,2]. In addition, the integration of frameworks such as R Shiny allows users to interactively visualize large-scale datasets without requiring knowledge of HTML, CSS, or JavaScript, thus reducing barriers to entry for open-source programming [1]. This accessibility, combined with R’s capacity to integrate data from multiple studies and leverage historical information, provides the flexibility and scalability needed to support diverse stakeholder needs and accommodate growing amounts of data, particularly from international, multicenter trials [1]. -(Andhale & Sood, 2025) -- becoming more common -- Real-time communication and iterative progress in workflows +Large organizations such as GSK, Pfizer, and CDISC, as well as smaller biotech companies, have incorporated open-source tools into their strategies, highlighting the broader industry momentum toward transparency, collaboration, and freedom from the high costs, restrictions, and often steep learning curves of proprietary software [2,3,4,5,6]. The rise of open-source solutions has also begun to blur the boundaries between statistical and computer science programmers, emphasizing collaboration and the ability to customize, rerun, and update clinical analyses efficiently [2,6]. 
-(Ching, 2025) -- dynamic and iterative approach -- needs tailored approach for leading and managing change -- sprints to allow continuous improvement +## Rise of Agile Methodologies -(Tahiliani, 2025): -- some principles can be useful for SCE migration such as iterative progress and continuous feedback +Agile methodologies are becoming increasingly common in the pharmaceutical industry due to their focus on real-time communication, iterative workflows, and dynamic approaches that foster continuous improvement [1,7]. Successfully leading and managing change within agile environments requires tailored strategies, including the use of sprints to drive ongoing progress and adaptability [7]. Key agile principles, such as iterative progress and continuous feedback, can also be valuable when applied to complex industry-specific efforts such as statistical computing environment (SCE) migration, ensuring flexibility and responsiveness throughout the transition [8]. ## Need for Version Control -(Andhale & Sood, 2025) -- Important to support agile and collaborative development -- Particulary useful for complex projects - -(Arancibia, 2025) -- Need data sources for metrics to report to senior leadership (topics such as how much open-source is being used) -• R Code size out of total code (R and SAS) -• Total Repos with any R usage -• Repos where R code size (out of R and SAS) is greater than Target Precent -• Growth of R: Rate of change of R code in a repo and organization -- Lots of useful info can be pulled with GitHub API to automate metrics reporting. And several languages including R have packages that interface with the GitHub API and simplify the syntax of API call execution - -(Hume, 2025) -- "GitHub provides the means to collaborate on software and standards development activities as well as providing mechanisms for discussions, documentation, and planning activities." 
- -(Liao, 2025): -- Git is integrated with several platforms that can be used as SCEs, such as Posit Workbench/Connect - -(Shi, 2025): -- Use templating feature to set up default files / folder structure -- Many steps can be automated with a bash script - -(Zhang, 2025): -- leveraging automated deployment features to minimize human error -- using PRs also encourgaes more collaboration and learning from team members +For teams using open-source technology in the pharmaceutical industry, Git provides robust support for agile and collaborative development, especially when managing complex projects that require frequent and coordinated contributions from multidisciplinary teams [1]. Its integration with platforms like GitHub enables transparent code management, collaborative planning, documentation, and discussion, all of which are essential for efficient development activities [4,5]. The transparency of code changes and the use of features such as pull requests further enhance collaboration and foster team learning, and create opportunities to document why certain decisions were made regarding code changes. Templating and automation, such as GitHub Actions and bash scripts, streamline project setup and repetitive tasks, reducing the risk of human error [9,10]. +Additionally, Git’s ecosystem provides valuable opportunities for data-driven reporting and strategic oversight. Tools like the GitHub API allow organizations to automatically extract key metrics for continuous improvement as well as reporting to senior leadership. For example, GSK extracts metrics such as the proportion and growth of R code versus SAS, the total number of repositories using R, and usage trends over time [3]. Languages like R offer dedicated packages to interface with the GitHub API, making metric collection and analysis more accessible. 
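The R-versus-SAS proportion mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions, not the approach described in [3]: it assumes the per-language byte counts for a repository have already been retrieved (GitHub's REST API has a "list repository languages" endpoint that returns a mapping of language name to bytes of code), and the function name `r_share` and the sample numbers are hypothetical.

```python
from typing import Mapping, Optional


def r_share(language_bytes: Mapping[str, int]) -> Optional[float]:
    """Return R's share of the combined R and SAS code size.

    Returns None when the repository contains neither R nor SAS code,
    so that such repositories can be excluded from an average.
    """
    r_bytes = language_bytes.get("R", 0)
    sas_bytes = language_bytes.get("SAS", 0)
    total = r_bytes + sas_bytes
    if total == 0:
        return None
    return r_bytes / total


# Hypothetical byte counts for a single repository (assumed, not real data).
example_repo = {"R": 300, "SAS": 100, "Shell": 25}
print(r_share(example_repo))
```

Keeping the calculation separate from the API call makes it easy to test without network access; the same function could then be applied to each repository in an organization and tracked over time.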
## GxP Compliance -(Andhale & Sood, 2025): -- All open source packages used for regulatory submissions must be validated (proven to meet regulatory standards), documented, and controlled in SCE -- Must be traceability of all code changes -- Allows reproducibility of all versions of code +For teams using open-source technology in the pharmaceutical industry, the use of Git can support compliance with key regulatory and auditing requirements by providing a complete, immutable history of code changes [1,10]. Compliance with regulations such as 21 CFR Part 11 requires that programming take place in controlled environments, preserving data privacy while allowing specific metadata sharing with regulatory authorities to facilitate reproducibility [2,6,9]. Git is integrated with several SCEs (e.g., Posit Workbench/Connect) [5]. -(Kim, 2025): -- bound to work with the CFR21 Part 11 rule -- programming must be done in controlled environment -- do not violate data privacy -- can share certain metadata information with RAs to simplify reproducibility of programming environments -- Testing and validation should not be used interchangeably. Testing can stay with computer science programmers (CSP) and use open source tools such as GitHub Actions, validation of data outputs should stay with stat programmers (DSPs) and follow SOPs -- Set of standard checks on code can be helpful to improve overall program quality -- CSPs more concerned with if logic produced expected results given particular test data, and DSPs more concerned with comparisons of data outputs - -(Patel, 2025): -- reproducibility and transparency are key - -(Shi, 2025): -- reproducibility is key - -(PHUSE WG Notes): -- align on access strategy (i.e. 
code from production programmers should not be visible to QC programmers) - -(Zhang, 2025): -- complete history of code changes which cannot be modified +Although Git is traditionally used for tracking code changes, a clear, easy-to-understand history of which code versions produced which data outputs is also extremely beneficial. Data outputs, even in development and testing environments, are traditionally stored in specific environments. The unique identifiers that Git assigns to each version of the code, known as commit hashes, can serve as helpful labels for different versions of data outputs. For example, a strategy employed by Graticule is to label different versions of data outputs stored in their data lake with the commit hash of the version of the code that produced those outputs [11]. 
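The idea of labelling outputs with commit hashes can be sketched as follows. This is a minimal illustration, not the implementation described in [11]: the function names, the `outputs/<hash>/<dataset>.csv` layout, and the 12-character truncation are all assumptions made for the example. The only Git command used is `git rev-parse HEAD`, which prints the hash of the currently checked-out commit.

```python
import subprocess
from pathlib import Path


def current_commit_hash(repo_dir: str = ".") -> str:
    """Return the full hash of the commit checked out in repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if repo_dir is not a Git repository
    )
    return result.stdout.strip()


def labelled_output_path(base_dir: str, dataset: str, commit_hash: str) -> Path:
    """Build an output path whose directory name records the producing commit.

    The layout and the 12-character prefix are hypothetical choices.
    """
    return Path(base_dir) / commit_hash[:12] / f"{dataset}.csv"


# Hard-coded hash for illustration; in a real run it would come from
# current_commit_hash() inside the project repository.
demo_hash = "0123456789abcdef0123456789abcdef01234567"
print(labelled_output_path("outputs", "adsl", demo_hash))
```

Because the hash is part of the path, any output file can be traced back to the exact code version with `git checkout <hash>`, provided the outputs were produced from a clean working tree.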
This means celebrating small wins and milestones achieved during the project" -- technical support important --> in this case helpful to have SMEs embedded in teams to help out - -(Liao, 2025): -- embrace the openness of knowledge sharing in the open-source community - -(Tahiliani, 2025): -- "Providing a few months' advance notice helps team members process the information, reducing anxiety and allowing them time to understand the change and come to terms with it." -- Build a positive narrative and focus on the benefits -- Repetition is important - -## Tool Comparisons - -(Opalka, 2025): -- GitHub: - - Owned by Microsoft - - More users across community - - Unclear if you can suggest code edits which can then be accepted/rejected - - Centered around main branch - - Cheaper -- GitLab - - Standalone company - - Less users across community - - Can suggest code edits which can then be accepted/rejected - - Encourages use of pre-production branch before production branch - - More expensive -- "From an enterprise perspective… - Github is much cheaper than Gitlab - the enterprise plans are $21 per user/month and $99 per -user/month respectively, however Gitlab has security elements already included and with GitHub -these are about $50/month extra (Prices as of 2024). Which repository is for you ultimately -depends on whether you want to prioritise stability, safety and an all-in-one solution (GitLab) -or speed, cheaper costs and a bigger community (GitHub)." +Despite the many advantages of using Git to support emerging trends of open-source technology use and collaborative programming in the pharmaceutical industry, this technology can have a steep learning curve of concepts that are new to many statistical programmers [7]. The open-source community is meant to be collaborate by nature and offers many resources and learning communities to support individuals that are learning Git [1,5]. 
Specific recommendations for fostering a culture that embraces change include clear communication of the benefits and specific goals of the new approach, recognition of employees' efforts along the learning journey by celebrating small wins, embedding subject matter experts into adopting teams, providing ample notice in advance of the change, and providing ample opportunities for practice and repetition [7,8]. Metrics can also be gathered along the way by using the GitHub API to assess trends that could be improved across repositories [3]. The metrics worth tracking vary by organization and its specific goals, but examples could include the number of merge conflicts and commit message analysis. # Conclusion -# References -Posit. (2024, November). GSK's R Journey: From Pilot Projects to Enterprise Adoption | Hosted by Posit. Retrieved from https://www.youtube.com/watch?v=xDrt6txplek +Git can provide robust support for change tracking as open-source programming and agile methodologies become more prevalent in the pharmaceutical industry. How best to use this technology to support GxP practices continues to be discussed, and fostering a culture that welcomes change for the better is essential for a smooth implementation. -Andhale, S.H., Sood, S. (2025). Harnessing the Power of R Shiny in a GxP compliant and validated manner for clinical trials use [White paper OS3464]. Sycamore Informatics India Private Limited. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS10.pdf - -Arancibia, B. (2025). What R We Counting? The Quest to Quantify R Usage in Biostatistics Organizations [White paper OS02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS02.pdf +# References -Ching, E. (2025). Learning to skate, pedal, and drive – Change leadership in the world of Agile delivery [White paper PD02]. GSK. 
https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PD02.pdf +1. Andhale, S.H., Sood, S. (2025). Harnessing the Power of R Shiny in a GxP compliant and validated manner for clinical trials use [White paper OS3464]. Sycamore Informatics India Private Limited. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS10.pdf -Hume, S. (2025). A Technical Roadmap Defining CDISC's Path to End-to-End Automation [White paper DS05]. CDISC. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_DS05.pdf +2. Patel, J. (2025). Evaluation of Open-Source Technologies in Clinical Trial Reporting [White paper PP15]. AstraZeneca US Biometric. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP15.pdf -Kim, J. (2025). The changes in the job of statistical programmer in the -pharmaceutical industry [White paper ET07]. Pfizer Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET07.pdf +3. Arancibia, B. (2025). What R We Counting? The Quest to Quantify R Usage in Biostatistics Organizations [White paper OS02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS02.pdf -Liao, C. (2025). Building Rome in a Few Months: How Open Source Technologies Empower Small Pharma [White paper OS07]. Recursion. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS07.pdf +4. Hume, S. (2025). A Technical Roadmap Defining CDISC's Path to End-to-End Automation [White paper DS05]. CDISC. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_DS05.pdf -Opalka, A. (2025). GitLab vs GitHub: The Battle of the Repositories [Poster PP05]. Roche. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/CSS/EU/Utrecht/POS_PP05.pdf +5. Liao, C. (2025). Building Rome in a Few Months: How Open Source Technologies Empower Small Pharma [White paper OS07]. Recursion. 
https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS07.pdf -Patel, J. (2025). Evaluation of Open-Source Technologies in Clinical Trial Reporting [White paper PP15]. AstraZeneca US Biometric. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP15.pdf +6. Kim, J. (2025). The changes in the job of statistical programmer in the +pharmaceutical industry [White paper ET07]. Pfizer Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET07.pdf -Shi, T., Song, L., Nomula, V.K. (2025). Implementation of a risk-governed, reproducible, and collaborative R workflow for Real World Evidence Projects [White paper PP27]. Merck & Co., Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP27.pdf +7. Ching, E. (2025). Learning to skate, pedal, and drive – Change leadership in the world of Agile delivery [White paper PD02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PD02.pdf -Tahiliani, K., Jain, A. (2025). A Journey Through an Innovative Approach to Migration of Legacy System to New Multilingual Statistical Computing Environment (SCE). GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET21.pdf +8. Tahiliani, K., Jain, A. (2025). A Journey Through an Innovative Approach to Migration of Legacy System to New Multilingual Statistical Computing Environment (SCE). GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET21.pdf -Zhang, P., Chen, J., Yang, F., Chang, V., Lee, J. (2025). Generate Clinical Study Report (CSR) document using {quarto} and {shiny} [White paper OS06]. CIMS Global. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS06.pdf +9. Shi, T., Song, L., Nomula, V.K. (2025). Implementation of a risk-governed, reproducible, and collaborative R workflow for Real World Evidence Projects [White paper PP27]. 
Merck & Co., Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP27.pdf -LEFT OFF HERE: -A Risk-based Approach for Assessing R package Accuracy within a Validated Infrastructure: -https://www.pharmar.org/white-paper/ +10. Zhang, P., Chen, J., Yang, F., Chang, V., Lee, J. (2025). Generate Clinical Study Report (CSR) document using {quarto} and {shiny} [White paper OS06]. CIMS Global. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS06.pdf -External R Package Qualification Process in Regulated Environment: PharmaSUG-2022-SI-057.pdf +11. Dusendang, J., Bath, S., Orozco, S., Asper, A., Koren, Y. (2025). Integrating Collaborative Programming with Automated Traceability and Reproducibility in Pharma Studies and Real-World Data Projects by Adapting DevOps Best-Practices [Paper OS-111]. https://pharmasug.org/proceedings/2025/OS/PharmaSUG-2025-OS-111.pdf From d666550b47371a7398350733ba1e5f11c0a9178e Mon Sep 17 00:00:00 2001 From: Rita Pecuch Date: Tue, 18 Nov 2025 13:01:31 -0500 Subject: [PATCH 18/18] updated date --- site/posts/industry-talk-review/index.qmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/site/posts/industry-talk-review/index.qmd b/site/posts/industry-talk-review/index.qmd index 914bdf8..ddba1eb 100644 --- a/site/posts/industry-talk-review/index.qmd +++ b/site/posts/industry-talk-review/index.qmd @@ -1,7 +1,7 @@ --- title: "Industry Talk Review" author: "Rita Pecuch" -date: "2025-03-29" +date: "2025-11-18" categories: [review, git, pharmaceuticals] toc: TRUE toc-title: "Table of Contents"