Replace BLAS functions w/ pure F90 code in integrators #142
int/rosenbrock.f90
int/rosenbrock_adj.f90
int/rosenbrock_autoreduce.f90
int/rosenbrock_tlm.f90
- Replace BLAS functions (WAXPY, WCOPY, WSCAL, etc) with pure F90 code.
  I used AI to help determine the replacement code
int/rosenbrock_h211b_qssa.f90
- Removed commented out BLAS code
CHANGELOG.md
- Updated accordingly
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
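For readers unfamiliar with the routines named in this commit, the kind of substitution involved can be sketched as follows. This is a minimal illustration, not the code from the PR: the loop bodies assume that WCOPY, WAXPY and WSCAL behave like their standard BLAS namesakes (copy, y <- y + alpha*x, x <- alpha*x), and the program name, array names and values are hypothetical.

```fortran
program blas_to_array_syntax
  ! Sketch only: compares explicit BLAS-style loops with F90 array syntax.
  ! The loops are assumptions about what WCOPY/WAXPY/WSCAL compute,
  ! based on the standard BLAS copy/axpy/scal operations.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 5
  real(dp) :: x(n), y_loop(n), y_arr(n), alpha
  integer  :: i

  alpha = 0.1_dp
  x = [(real(i, dp) / 3.0_dp, i = 1, n)]

  ! WCOPY-like operation: y <- x
  do i = 1, n
     y_loop(i) = x(i)
  end do
  y_arr(1:n) = x(1:n)

  ! WAXPY-like operation: y <- y + alpha*x
  do i = 1, n
     y_loop(i) = y_loop(i) + alpha * x(i)
  end do
  y_arr(1:n) = y_arr(1:n) + alpha * x(1:n)

  ! WSCAL-like operation: y <- alpha*y
  do i = 1, n
     y_loop(i) = alpha * y_loop(i)
  end do
  y_arr(1:n) = alpha * y_arr(1:n)

  ! Report whether the two formulations agree bit for bit
  if (all(y_loop == y_arr)) then
     print *, 'loop and array-syntax results are identical'
  else
     print *, 'results differ; max abs diff = ', maxval(abs(y_loop - y_arr))
  end if
end program blas_to_array_syntax
```

Both forms perform the same element-wise arithmetic in the same order, so one would normally expect identical results, but compiler options (for example fused multiply-add contraction) could in principle change that, which is why the sketch checks rather than assumes.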
|
Very good idea, thanks for taking the effort!!! A few notes:
by
|
int/rosenbrock_autoreduce.f90
- Fixed typo in error message ("T + 10*H" -> "T + 0.1*H")
- Replaced calls to BLAS functions with fresh replacement code
generated by AI
- Break lines in a couple of instances for readability
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
Force-pushed from 9aacc5a to b02d0e4
|
Thanks @RolfSander for the feedback. I can make those changes. I can also look at WLAMCH. As a sanity check, I did run the C-I tests on the dev branch vs. this branch and the results were the same (except for the C_rk output which I think may be related to #107). I can certainly do more rigorous tests with the KPP-Standalone before I make the PR ready for review. |
|
Thanks, @yantosca! If you have a script to perform those tests, maybe you can store that I have further ideas for code cleanup in the future, and I would like to |
|
The script to run the C-I tests is |
|
Thanks, that's good to know! |
|
Thanks @yantosca! I like the CI tests being used to validate the changes. Do these check against a reference set of results, or just build and run the mechanism? It would be interesting to see how much the results change numerically with the removal of BLAS. |
|
The CI tests just run the mechanism. But you can run the tests on 2 branches (and send the results to a log file), then compare the logs. I think most tests also create a *.dat file that could be compared. |
Force-pushed from b02d0e4 to 617eeb0
int/sdirk.f90
int/sdirk4.f90
int/sdirk_adj.f90
- Replaced calls to BLAS functions with pure F90 code (as determined by AI)
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
Force-pushed from 617eeb0 to 3de658c
int/radau5.f90
int/runge_kutta.f90
int/runge_kutta_adj.f90
int/runge_kutta_tlm.f90
int/sdirk_tlm.f90
- Replace BLAS routines with pure F90 code
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
|
I actually found that I needed to keep the ONE in some expressions in order to avoid introducing numerical diffs:

Old code:

    KPP_REAL :: HGammaInv
    HGammaInv = ONE/(H*rkGamma)
    CALL WSCAL(N,HGammaInv,RHS,1)

New code:

    ! This code replicates the output of the previous
    ! call to WSCAL (@yantosca, 16 Oct 2025)
    RHS(1:N) = RHS(1:N) * (ONE / (H*rkGamma))
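The reason the `ONE` has to stay is that multiplying by a precomputed reciprocal and dividing directly are not the same floating-point operation, so the two can round differently in the last bit. The following is a small self-contained sketch of that effect. It is not taken from the integrators: the values of `h` and `gam` are hypothetical, chosen only so that `h*gam` equals 10.

```fortran
program reciprocal_vs_division
  ! Sketch only: compares x * (ONE / a) with x / a element by element.
  ! h and gam are hypothetical stand-ins for H and rkGamma.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 10
  real(dp), parameter :: ONE = 1.0_dp
  real(dp) :: rhs(n), by_recip(n), by_div(n), h, gam
  integer  :: i

  h   = 2.5_dp
  gam = 4.0_dp
  rhs = [(real(i, dp), i = 1, n)]

  ! Form used in the old WSCAL-based code and kept in the new code
  by_recip = rhs * (ONE / (h * gam))

  ! Algebraically equivalent form that may round differently
  by_div = rhs / (h * gam)

  print *, 'elements that differ bitwise: ', count(by_recip /= by_div)
  print *, 'largest abs difference:       ', maxval(abs(by_recip - by_div))
end program reciprocal_vs_division
```

In IEEE double precision I would expect at least some elements to differ (3 * 0.1 versus 3 / 10 is the classic case), which is consistent with the numerical diffs reported above when the reciprocal form was dropped.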
int/runge_kutta_adj.f90
int/runge_kutta_tlm.f90
- Removed commented-out calls to BLAS routines
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
|
If keeping the For numerical reasons, we cannot expect that |
|
Also I think we may need to keep WLAMCH because that sets the Roundoff variable to the machine epsilon. There may be a way to get that in pure F90 though. I can look. Will start testing the integrators more rigorously now. The C-I tests did point out a couple of bugs in my implementation that I have since fixed. |
|
AFAIK, function In f90, the function |
|
@RolfSander: I added some test code to print out the results of epsilon:

    Roundoff = EPSILON( 0.0_dp )
    print*, 'epsilon, Roundoff: ', roundoff
    Roundoff = WLAMCH('E')
    print*, 'epsilon, WLAMCH: ', roundoff
    stop

and I get

    epsilon, Roundoff: 2.2204460492503131E-016
    epsilon, WLAMCH: 2.2204460492503131E-016

so we can indeed take out the WLAMCH calls. The F90 docs say that EPSILON returns the smallest number X so that 1+X>1. See: https://gcc.gnu.org/onlinedocs/gfortran/EPSILON.html |
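As a complement to the comparison above, the intrinsic can also be checked against the Fortran numeric model without calling WLAMCH at all. This is a minimal sketch under the assumption that `dp` denotes double precision; the program and variable names are hypothetical. Note that only the kind of the argument to `EPSILON` matters, not its value, so `EPSILON(0.0_dp)` and `EPSILON(1.0_dp)` return the same number.

```fortran
program epsilon_check
  ! Sketch only: checks EPSILON against the Fortran numeric model,
  ! in which epsilon is radix**(1 - digits) for the given real kind.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: eps, model_eps

  eps = epsilon(1.0_dp)

  ! Model definition evaluated from the RADIX and DIGITS intrinsics
  model_eps = real(radix(1.0_dp), dp) ** (1 - digits(1.0_dp))

  print *, 'epsilon(1.0_dp)      = ', eps
  print *, 'radix**(1 - digits)  = ', model_eps
  print *, 'matches model formula: ', eps == model_eps
  print *, 'matches spacing(1.0) : ', eps == spacing(1.0_dp)
end program epsilon_check
```

If both comparisons print `T`, the intrinsic gives the machine epsilon directly, which supports replacing `WLAMCH('E')` with it.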
|
Great :-) |
int/dvode.f90
int/lsode.f90
int/radau5.f90
int/rosenbrock*f90
int/runge_kutta*.f90
int/sdirk*.f90
int/seulex.f90
- Replaced calls to WLAMCH with F90 intrinsic EPSILON, which performs
  the exact same computation
- Removed WLAMCH from USE statements
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
util/blas.f90
- Removed the following BLAS functions, which have been replaced by
  pure F90 code in the various F90 integrator modules:
  (1) WCOPY
  (2) WAXPY
  (3) WSCAL
  (4) WLAMCH
  (5) WLAMCH_ADD
  (6) SET2ZERO
  (7) WADD
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
Force-pushed from 8e8faf5 to 3b756f4
.ci-pipelines/ci-common-defs.sh
- Added F90_rkadj, F90_sd4, F90_sdtlm to the list of tests to be done
examples/rkadj.kpp
- KPP file to integrate the small_strato mechanism with the
  runge_kutta_adj integrator
examples/sd4.kpp
- KPP file to integrate the small_strato mechanism with the
  sdirk4 integrator
examples/sdtlm.kpp
- KPP file to integrate the small_strato mechanism with the
  sdirk_tlm integrator
ci-tests/F90_rkadj/F90_rkadj.kpp
- Symbolic link to the rkadj.kpp example file
ci-tests/F90_sd4/F90_sd4.kpp
- Symbolic link to the sd4.kpp example file
ci-tests/F90_sdtlm/F90_sdtlm.kpp
- Symbolic link to the sdtlm.kpp example file
docs/source/tech_info/06_info_for_kpp_developers.rst
- Added F90_rkadj, F90_sd4, F90_sdtlm to list table of C-I tests
CHANGELOG.md
- Updated accordingly
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
util/blas.f90
- Replaced WAXPY and WSCAL with explicit loops (generated by AI)
  in the WGEFA and WGESL routines. These are needed for some integrators.
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
int/runge_kutta_tlm.f90
- Add a missing & continuation character in USE statement
int/sdirk.f90
- Replace WLAMCH call by EPSILON
util/blas.f90
- Explicitly declare loop index i
Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
|
I was able to perform several sensitivity tests with the KPP-Standalone box model, using the sample input (a dump of concentrations and rates at a particular GEOS-Chem box). I was able to confirm that the AI-generated replacement code for BLAS routines results in 100% bitwise identical output w/r/t the prior code.
I am running a couple of GEOS-Chem 1-month benchmark simulations, with and without BLAS in the integrators. The runs use the rosenbrock_autoreduce integrator (with |
|
@RolfSander @jimmielin @msl3v, I was finally able to run "out-of-the-box" GEOS-Chem 1-month benchmarks with and without the BLAS routines. We get identical results:

###############################################################################
### OH Metrics
### Ref = KPP+BLAS
### Dev = KPP-BLAS
###############################################################################
------------------------------------------------------------
Global mass-weighted OH concentration [10^5 molec cm^-3]
------------------------------------------------------------
Ref : 13.20455034956
Dev : 13.20455034956
Abs diff : 0.00000000000
% diff : 0.000000
------------------------------------------------------------
CH3CCl3 (aka MCF) lifetime w/r/t tropospheric OH [years]
------------------------------------------------------------
Ref : 4.731282
Dev : 4.731282
Abs diff : 0.000000
% diff : 0.000000
------------------------------------------------------------
CH4 lifetime w/r/t tropospheric OH [years]
------------------------------------------------------------
Ref : 7.989512
Dev : 7.989512
Abs diff : 0.000000
% diff : 0.000000
#########################################################################################################
### Global mass (Gg) at end of simulation (Trop + Strat) ###
### ###
### Ref = KPP+BLAS ###
### Dev = KPP-BLAS ###
### ###
### Dev and Ref are identical ###
#########################################################################################################
Ref Dev Dev - Ref % diff diffs
A3O2 : 0.079549 0.079549 0.000000 0.000000
ACET : 6832.116244 6832.116244 0.000000 0.000000
ACO3 : 0.003702 0.003702 0.000000 0.000000
ACR : 8.442249 8.442249 0.000000 0.000000
ACRO2 : 0.006237 0.006237 0.000000 0.000000
ACTA : 530.954337 530.954337 0.000000 0.000000
AERI : 6.480293 6.480293 0.000000 0.000000
ALD2 : 292.404144 292.404144 0.000000 0.000000
ALK4 : 267.709856 267.709856 0.000000 0.000000
ALK4N1 : 0.005326 0.005326 0.000000 0.000000
ALK4N2 : 10.414189 10.414189 0.000000 0.000000
ALK4O2 : 0.179222 0.179222 0.000000 0.000000
ALK4P : 2.775849 2.775849 0.000000 0.000000
ALK6 : 115.347750 115.347750 0.000000 0.000000
AONITA : 83.675084 83.675084 0.000000 0.000000
APAN : 1.982932 1.982932 0.000000 0.000000
APINN : 18.842481 18.842481 0.000000 0.000000
APINO2 : 0.563659 0.563659 0.000000 0.000000
APINP : 58.095914 58.095914 0.000000 0.000000
AROMCHO : 0.027412 0.027412 0.000000 0.000000
AROMCO3 : 0.000419 0.000419 0.000000 0.000000
AROMP4 : 0.143384 0.143384 0.000000 0.000000
AROMP5 : 0.102109 0.102109 0.000000 0.000000
AROMPN : 0.470140 0.470140 0.000000 0.000000
AROMRO2 : 0.356114 0.356114 0.000000 0.000000
ASOA1 : 8.233134 8.233134 0.000000 0.000000
ASOA2 : 2.766221 2.766221 0.000000 0.000000
ASOA3 : 7.080648 7.080648 0.000000 0.000000
ASOAN : 55.823063 55.823063 0.000000 0.000000
ASOG1 : 5.802423 5.802423 0.000000 0.000000
ASOG2 : 6.964670 6.964670 0.000000 0.000000
ASOG3 : 115.283817 115.283817 0.000000 0.000000
ATO2 : 0.528140 0.528140 0.000000 0.000000
ATOOH : 151.561513 151.561513 0.000000 0.000000
B3O2 : 0.316317 0.316317 0.000000 0.000000
BALD : 2.475327 2.475327 0.000000 0.000000
BCPI : 108.812684 108.812684 0.000000 0.000000
BCPO : 21.352639 21.352639 0.000000 0.000000
BENZ : 186.386614 186.386614 0.000000 0.000000
BENZO : 0.009947 0.009947 0.000000 0.000000
BENZO2 : 0.278536 0.278536 0.000000 0.000000
BENZP : 87.503931 87.503931 0.000000 0.000000
BPINN : 5.220994 5.220994 0.000000 0.000000
BPINO : 41.732055 41.732055 0.000000 0.000000
BPINO2 : 0.040273 0.040273 0.000000 0.000000
BPINON : 10.295940 10.295940 0.000000 0.000000
BPINOO2 : 0.191478 0.191478 0.000000 0.000000
BPINOOH : 13.278845 13.278845 0.000000 0.000000
BPINP : 14.410023 14.410023 0.000000 0.000000
BRO2 : 0.070057 0.070057 0.000000 0.000000
BUTDI : 12439.927467 12439.927467 0.000000 0.000000
BUTN : 0.059672 0.059672 0.000000 0.000000
BUTO2 : 0.003297 0.003297 0.000000 0.000000
BZCO3 : 0.002052 0.002052 0.000000 0.000000
BZCO3H : 0.756357 0.756357 0.000000 0.000000
BZPAN : 2.222325 2.222325 0.000000 0.000000
Br : 1.234656 1.234656 0.000000 0.000000
Br2 : 11.098290 11.098290 0.000000 0.000000
BrCl : 11.542632 11.542632 0.000000 0.000000
BrNO2 : 0.329711 0.329711 0.000000 0.000000
BrNO3 : 28.093289 28.093289 0.000000 0.000000
BrO : 14.251124 14.251124 0.000000 0.000000
BrSALA : 0.407664 0.407664 0.000000 0.000000
BrSALC : 4.621334 4.621334 0.000000 0.000000
C2H2 : 189.847262 189.847262 0.000000 0.000000
C2H4 : 173.082948 173.082948 0.000000 0.000000
C2H6 : 1367.785561 1367.785561 0.000000 0.000000
C3H8 : 262.194960 262.194960 0.000000 0.000000
C4H6 : 0.616078 0.616078 0.000000 0.000000
C4HVP1 : 0.000814 0.000814 0.000000 0.000000
C4HVP2 : 0.002032 0.002032 0.000000 0.000000
C96N : 16.989848 16.989848 0.000000 0.000000
C96O2 : 0.136932 0.136932 0.000000 0.000000
C96O2H : 27.139280 27.139280 0.000000 0.000000
CCl4 : 1956.283676 1956.283676 0.000000 0.000000
CFC11 : 5093.224796 5093.224796 0.000000 0.000000
CFC113 : 2188.349943 2188.349943 0.000000 0.000000
CFC114 : 466.367924 466.367924 0.000000 0.000000
CFC115 : 230.825448 230.825448 0.000000 0.000000
CFC12 : 10263.474154 10263.474154 0.000000 0.000000
CH2Br2 : 18.816244 18.816244 0.000000 0.000000
CH2Cl2 : 448.599490 448.599490 0.000000 0.000000
CH2I2 : 0.071044 0.071044 0.000000 0.000000
CH2IBr : 0.065988 0.065988 0.000000 0.000000
CH2ICl : 0.320779 0.320779 0.000000 0.000000
CH2O : 1165.090687 1165.090687 0.000000 0.000000
CH2OO : 0.000005 0.000005 0.000000 0.000000
CH3Br : 100.454693 100.454693 0.000000 0.000000
CH3CCl3 : 29.937498 29.937498 0.000000 0.000000
CH3CHOO : 0.000014 0.000014 0.000000 0.000000
CH3Cl : 4367.268734 4367.268734 0.000000 0.000000
CH3I : 6.029633 6.029633 0.000000 0.000000
CH4 : 5131307.945757 5131307.945757 0.000000 0.000000
CHBr3 : 24.611531 24.611531 0.000000 0.000000
CHCl3 : 161.628706 161.628706 0.000000 0.000000
CLOCK : 6.32626499e+19 6.32626499e+19 0.000000 0.000000
CO : 316257.739855 316257.739855 0.000000 0.000000
CO2 : 2.15301617e+09 2.15301617e+09 0.000000 0.000000
CSL : 0.521638 0.521638 0.000000 0.000000
Cl : 0.259676 0.259676 0.000000 0.000000
Cl2 : 4.964031 4.964031 0.000000 0.000000
Cl2O2 : 40.043337 40.043337 0.000000 0.000000
ClNO2 : 2.143306 2.143306 0.000000 0.000000
ClNO3 : 643.644063 643.644063 0.000000 0.000000
ClO : 48.547473 48.547473 0.000000 0.000000
ClOO : 0.000432 0.000432 0.000000 0.000000
DMS : 201.115959 201.115959 0.000000 0.000000
DST1 : 5659.000883 5659.000883 0.000000 0.000000
DST2 : 3882.369471 3882.369471 0.000000 0.000000
DST3 : 5226.095855 5226.095855 0.000000 0.000000
DST4 : 2868.363695 2868.363695 0.000000 0.000000
EBZ : 1.93005638e-13 1.93005638e-13 0.000000 0.000000
EOH : 144.654443 144.654443 0.000000 0.000000
ETHLN : 0.828012 0.828012 0.000000 0.000000
ETHN : 3.428465 3.428465 0.000000 0.000000
ETHP : 50.816548 50.816548 0.000000 0.000000
ETNO3 : 27.413353 27.413353 0.000000 0.000000
ETO : 1.43524378e-08 1.43524378e-08 0.000000 0.000000
ETO2 : 0.269303 0.269303 0.000000 0.000000
ETOO : 0.326030 0.326030 0.000000 0.000000
ETP : 61.544620 61.544620 0.000000 0.000000
FURA : 10.262396 10.262396 0.000000 0.000000
GCO3 : 0.100703 0.100703 0.000000 0.000000
GLYC : 97.130694 97.130694 0.000000 0.000000
GLYX : 12.948782 12.948782 0.000000 0.000000
H : 0.000301 0.000301 0.000000 0.000000
H1211 : 86.984943 86.984943 0.000000 0.000000
H1301 : 83.148894 83.148894 0.000000 0.000000
H2 : 178186.612556 178186.612556 0.000000 0.000000
H2402 : 16.528553 16.528553 0.000000 0.000000
H2O : 1.42865551e+10 1.42865551e+10 0.000000 0.000000
H2O2 : 2958.340520 2958.340520 0.000000 0.000000
HAC : 164.754501 164.754501 0.000000 0.000000
HACTA : 29.099577 29.099577 0.000000 0.000000
HBr : 7.716902 7.716902 0.000000 0.000000
HC5A : 4.517634 4.517634 0.000000 0.000000
HCFC123 : 7.36812418e-20 7.36812418e-20 0.000000 nan
HCFC141b : 505.237569 505.237569 0.000000 0.000000
HCFC142b : 393.782061 393.782061 0.000000 0.000000
HCFC22 : 3711.111455 3711.111455 0.000000 0.000000
HCOOH : 509.872241 509.872241 0.000000 0.000000
HCl : 950.821080 950.821080 0.000000 0.000000
HI : 0.267856 0.267856 0.000000 0.000000
HMHP : 48.412440 48.412440 0.000000 0.000000
HMML : 17.036996 17.036996 0.000000 0.000000
HMS : 104.735350 104.735350 0.000000 0.000000
HNO2 : 4.928120 4.928120 0.000000 0.000000
HNO3 : 6111.851859 6111.851859 0.000000 0.000000
HNO4 : 201.395634 201.395634 0.000000 0.000000
HO2 : 32.056700 32.056700 0.000000 0.000000
HOBr : 11.460876 11.460876 0.000000 0.000000
HOCl : 40.722675 40.722675 0.000000 0.000000
HOI : 10.594519 10.594519 0.000000 0.000000
HONIT : 8.277498 8.277498 0.000000 0.000000
HPALD1 : 2.287048 2.287048 0.000000 0.000000
HPALD1OO : 0.003919 0.003919 0.000000 0.000000
HPALD2 : 7.018294 7.018294 0.000000 0.000000
HPALD2OO : 0.012247 0.012247 0.000000 0.000000
HPALD3 : 2.075359 2.075359 0.000000 0.000000
HPALD4 : 5.380793 5.380793 0.000000 0.000000
HPETHNL : 3.732882 3.732882 0.000000 0.000000
I : 1.127550 1.127550 0.000000 0.000000
I2 : 0.089895 0.089895 0.000000 0.000000
I2O2 : 0.044031 0.044031 0.000000 0.000000
I2O3 : 0.103267 0.103267 0.000000 0.000000
I2O4 : 0.008912 0.008912 0.000000 0.000000
IBr : 0.092775 0.092775 0.000000 0.000000
ICHE : 14.194810 14.194810 0.000000 0.000000
ICHOO : 0.000464 0.000464 0.000000 0.000000
ICN : 8.153456 8.153456 0.000000 0.000000
ICNOO : 0.041412 0.041412 0.000000 0.000000
ICPDH : 4.243333 4.243333 0.000000 0.000000
ICl : 0.888580 0.888580 0.000000 0.000000
IDC : 5.425018 5.425018 0.000000 0.000000
IDCHP : 2.776822 2.776822 0.000000 0.000000
IDHDP : 9.708152 9.708152 0.000000 0.000000
IDHNBOO : 0.897676 0.897676 0.000000 0.000000
IDHNDOO1 : 0.001436 0.001436 0.000000 0.000000
IDHNDOO2 : 0.000952 0.000952 0.000000 0.000000
IDHPE : 40.352386 40.352386 0.000000 0.000000
IDN : 0.975507 0.975507 0.000000 0.000000
IDNOO : 0.000580 0.000580 0.000000 0.000000
IEPOXA : 97.709599 97.709599 0.000000 0.000000
IEPOXAOO : 0.002425 0.002425 0.000000 0.000000
IEPOXB : 56.287349 56.287349 0.000000 0.000000
IEPOXBOO : 0.000961 0.000961 0.000000 0.000000
IEPOXD : 5.178930 5.178930 0.000000 0.000000
IHN1 : 1.066544 1.066544 0.000000 0.000000
IHN2 : 2.278643 2.278643 0.000000 0.000000
IHN3 : 2.401724 2.401724 0.000000 0.000000
IHN4 : 0.315732 0.315732 0.000000 0.000000
IHOO1 : 0.634145 0.634145 0.000000 0.000000
IHOO4 : 0.141391 0.141391 0.000000 0.000000
IHPNBOO : 0.000910 0.000910 0.000000 0.000000
IHPNDOO : 0.004024 0.004024 0.000000 0.000000
IHPOO1 : 0.005807 0.005807 0.000000 0.000000
IHPOO2 : 0.001404 0.001404 0.000000 0.000000
IHPOO3 : 0.008838 0.008838 0.000000 0.000000
INA : 1.24158360e-09 1.24158360e-09 0.000000 0.000000
INDIOL : 570.320765 570.320765 0.000000 0.000000
INO : 0.004203 0.004203 0.000000 0.000000
INO2B : 0.619913 0.619913 0.000000 0.000000
INO2D : 0.529019 0.529019 0.000000 0.000000
INPB : 2.308355 2.308355 0.000000 0.000000
INPD : 2.951093 2.951093 0.000000 0.000000
IO : 2.411476 2.411476 0.000000 0.000000
IONITA : 0.382525 0.382525 0.000000 0.000000
IONO : 0.401631 0.401631 0.000000 0.000000
IONO2 : 4.588322 4.588322 0.000000 0.000000
IPRNO3 : 40.220815 40.220815 0.000000 0.000000
ISALA : 0.911212 0.911212 0.000000 0.000000
ISALC : 0.309634 0.309634 0.000000 0.000000
ISOP : 172.893429 172.893429 0.000000 0.000000
ISOPNOO1 : 0.002314 0.002314 0.000000 0.000000
ISOPNOO2 : 0.002259 0.002259 0.000000 0.000000
ITCN : 3.553658 3.553658 0.000000 0.000000
ITHN : 8.332576 8.332576 0.000000 0.000000
KO2 : 0.233214 0.233214 0.000000 0.000000
LBRO2H : 0.191747 0.191747 0.000000 0.000000
LBRO2N : 0.224611 0.224611 0.000000 0.000000
LCH4 : 23.956180 23.956180 0.000000 0.000000
LCO : 92.941382 92.941382 0.000000 0.000000
LIMAL : 2.846134 2.846134 0.000000 0.000000
LIMKB : 0.895541 0.895541 0.000000 0.000000
LIMKET : 0.364979 0.364979 0.000000 0.000000
LIMKO2 : 0.006269 0.006269 0.000000 0.000000
LIMN : 0.278719 0.278719 0.000000 0.000000
LIMNB : 3.067774 3.067774 0.000000 0.000000
LIMO : 1.271169 1.271169 0.000000 0.000000
LIMO2 : 0.023484 0.023484 0.000000 0.000000
LIMO2H : 0.803024 0.803024 0.000000 0.000000
LIMO3 : 0.046852 0.046852 0.000000 0.000000
LIMO3H : 7.412819 7.412819 0.000000 0.000000
LIMPAN : 22.753441 22.753441 0.000000 0.000000
LISOPNO3 : 0.601547 0.601547 0.000000 0.000000
LISOPOH : 4.888948 4.888948 0.000000 0.000000
LNRO2H : 0.000000 0.000000 0.000000 nan
LNRO2N : 0.000000 0.000000 0.000000 nan
LOx : 4021.874495 4021.874495 0.000000 0.000000
LTRO2H : 0.137570 0.137570 0.000000 0.000000
LTRO2N : 0.343161 0.343161 0.000000 0.000000
LVOC : 0.058163 0.058163 0.000000 0.000000
LVOCOA : 16.811974 16.811974 0.000000 0.000000
LXRO2H : 0.102357 0.102357 0.000000 0.000000
LXRO2N : 0.521967 0.521967 0.000000 0.000000
MACR : 80.957138 80.957138 0.000000 0.000000
MACR1OO : 0.045987 0.045987 0.000000 0.000000
MACR1OOH : 5.833672 5.833672 0.000000 0.000000
MACRNO2 : 0.000385 0.000385 0.000000 0.000000
MAP : 666.106087 666.106087 0.000000 0.000000
MCO3 : 1.336663 1.336663 0.000000 0.000000
MCRDH : 10.486185 10.486185 0.000000 0.000000
MCRENOL : 1.542985 1.542985 0.000000 0.000000
MCRHN : 0.449600 0.449600 0.000000 0.000000
MCRHNB : 0.839177 0.839177 0.000000 0.000000
MCRHP : 9.412579 9.412579 0.000000 0.000000
MCROHOO : 0.003694 0.003694 0.000000 0.000000
MCT : 1.188473 1.188473 0.000000 0.000000
MEK : 281.572488 281.572488 0.000000 0.000000
MEKCO3 : 0.004624 0.004624 0.000000 0.000000
MEKPN : 2.024265 2.024265 0.000000 0.000000
MENO3 : 70.426428 70.426428 0.000000 0.000000
MGLY : 45.566321 45.566321 0.000000 0.000000
MO2 : 26.343659 26.343659 0.000000 0.000000
MOH : 2286.549104 2286.549104 0.000000 0.000000
MONITA : 0.138375 0.138375 0.000000 0.000000
MONITS : 6.488769 6.488769 0.000000 0.000000
MONITU : 0.672746 0.672746 0.000000 0.000000
MP : 2252.094035 2252.094035 0.000000 0.000000
MPAN : 10.606988 10.606988 0.000000 0.000000
MPN : 264.036216 264.036216 0.000000 0.000000
MSA : 43.640892 43.640892 0.000000 0.000000
MTPA : 31.037356 31.037356 0.000000 0.000000
MTPO : 10.258244 10.258244 0.000000 0.000000
MVK : 161.763749 161.763749 0.000000 0.000000
MVKDH : 52.935560 52.935560 0.000000 0.000000
MVKHC : 14.493869 14.493869 0.000000 0.000000
MVKHCB : 13.825096 13.825096 0.000000 0.000000
MVKHP : 18.257085 18.257085 0.000000 0.000000
MVKN : 3.590694 3.590694 0.000000 0.000000
MVKOHOO : 0.532870 0.532870 0.000000 0.000000
MVKPC : 5.482576 5.482576 0.000000 0.000000
MYRCO : 1.338212 1.338212 0.000000 0.000000
N : 0.000102 0.000102 0.000000 0.000000
N2 : 3.85977177e+12 3.85977177e+12 0.000000 0.000000
N2O : 2478280.950254 2478280.950254 0.000000 0.000000
N2O5 : 546.724253 546.724253 0.000000 0.000000
NAP : 1.80719474e-20 1.80719474e-20 0.000000 nan
NH3 : 214.424623 214.424623 0.000000 0.000000
NH4 : 422.708795 422.708795 0.000000 0.000000
NIT : 296.803817 296.803817 0.000000 0.000000
NITs : 35.504793 35.504793 0.000000 0.000000
NO : 605.469478 605.469478 0.000000 0.000000
NO2 : 1909.876652 1909.876652 0.000000 0.000000
NO3 : 25.222311 25.222311 0.000000 0.000000
NPHEN : 1.260206 1.260206 0.000000 0.000000
NPRNO3 : 8.232387 8.232387 0.000000 0.000000
NRO2 : 2.24411910e-20 2.24411910e-20 0.000000 nan
O : 149.949959 149.949959 0.000000 0.000000
O1D : 0.000026 0.000026 0.000000 0.000000
O2 : 1.18273569e+12 1.18273569e+12 0.000000 0.000000
O3 : 3232947.633482 3232947.633482 0.000000 0.000000
OCPI : 764.362075 764.362075 0.000000 0.000000
OCPO : 90.441643 90.441643 0.000000 0.000000
OCS : 5007.173417 5007.173417 0.000000 0.000000
OClO : 6.875820 6.875820 0.000000 0.000000
OH : 2.335335 2.335335 0.000000 0.000000
OIO : 0.507425 0.507425 0.000000 0.000000
OLND : 0.890275 0.890275 0.000000 0.000000
OLNN : 0.141467 0.141467 0.000000 0.000000
OTHRO2 : 0.378799 0.378799 0.000000 0.000000
PAN : 1753.945404 1753.945404 0.000000 0.000000
PCO : 55.072065 55.072065 0.000000 0.000000
PH2O2 : 33.924308 33.924308 0.000000 0.000000
PH2SO4 : 0.648058 0.648058 0.000000 0.000000
PHAN : 76.815880 76.815880 0.000000 0.000000
PHEN : 2.896459 2.896459 0.000000 0.000000
PIN : 0.022149 0.022149 0.000000 0.000000
PINAL : 50.487242 50.487242 0.000000 0.000000
PINO3 : 0.099687 0.099687 0.000000 0.000000
PINO3H : 15.605511 15.605511 0.000000 0.000000
PINONIC : 1.725440 1.725440 0.000000 0.000000
PINPAN : 61.955297 61.955297 0.000000 0.000000
PIO2 : 0.002832 0.002832 0.000000 0.000000
PIP : 0.101256 0.101256 0.000000 0.000000
PO2 : 0.255666 0.255666 0.000000 0.000000
POx : 4011.219610 4011.219610 0.000000 0.000000
PP : 32.174072 32.174072 0.000000 0.000000
PPN : 122.492837 122.492837 0.000000 0.000000
PRN1 : 0.557431 0.557431 0.000000 0.000000
PROPNN : 10.327546 10.327546 0.000000 0.000000
PRPE : 55.269587 55.269587 0.000000 0.000000
PRPN : 3.460014 3.460014 0.000000 0.000000
PSO4 : 3.692178 3.692178 0.000000 0.000000
PSO4AQ : 0.034423 0.034423 0.000000 0.000000
PYAC : 3.357677 3.357677 0.000000 0.000000
R4N1 : 0.282483 0.282483 0.000000 0.000000
R4N2 : 15.188487 15.188487 0.000000 0.000000
R4O2 : 0.294161 0.294161 0.000000 0.000000
R4P : 19.067385 19.067385 0.000000 0.000000
R7N1 : 0.072931 0.072931 0.000000 0.000000
R7N2 : 59.511281 59.511281 0.000000 0.000000
R7O2 : 0.329096 0.329096 0.000000 0.000000
R7P : 7.399648 7.399648 0.000000 0.000000
RA3P : 12.137416 12.137416 0.000000 0.000000
RB3P : 31.055257 31.055257 0.000000 0.000000
RCHO : 68.523252 68.523252 0.000000 0.000000
RCO3 : 0.190588 0.190588 0.000000 0.000000
RCOOH : 54.224876 54.224876 0.000000 0.000000
RIPA : 89.591574 89.591574 0.000000 0.000000
RIPB : 18.359725 18.359725 0.000000 0.000000
RIPC : 2.788663 2.788663 0.000000 0.000000
RIPD : 1.167552 1.167552 0.000000 0.000000
RNO3 : 0.013925 0.013925 0.000000 0.000000
ROH : 4.608702 4.608702 0.000000 0.000000
RP : 56.470392 56.470392 0.000000 0.000000
SALA : 317.170278 317.170278 0.000000 0.000000
SALAAL : 20.038165 20.038165 0.000000 0.000000
SALACL : 63.749163 63.749163 0.000000 0.000000
SALC : 2749.463849 2749.463849 0.000000 0.000000
SALCAL : 944.090675 944.090675 0.000000 0.000000
SALCCL : 1443.859257 1443.859257 0.000000 0.000000
SO2 : 320.313958 320.313958 0.000000 0.000000
SO4 : 1619.514721 1619.514721 0.000000 0.000000
SO4s : 0.484277 0.484277 0.000000 0.000000
SOAGX : 62.886205 62.886205 0.000000 0.000000
SOAIE : 193.045661 193.045661 0.000000 0.000000
SOAP : 172.044436 172.044436 0.000000 0.000000
SOAS : 1132.513086 1132.513086 0.000000 0.000000
STYR : 0.310726 0.310726 0.000000 0.000000
TLFUO2 : 0.000515 0.000515 0.000000 0.000000
TLFUONE : 0.039798 0.039798 0.000000 0.000000
TMB : 1.490197 1.490197 0.000000 0.000000
TOLU : 45.046026 45.046026 0.000000 0.000000
TRO2 : 0.060802 0.060802 0.000000 0.000000
TSOA0 : 53.291710 53.291710 0.000000 0.000000
TSOA1 : 21.265265 21.265265 0.000000 0.000000
TSOA2 : 91.879338 91.879338 0.000000 0.000000
TSOA3 : 44.100110 44.100110 0.000000 0.000000
TSOG0 : 5.900383 5.900383 0.000000 0.000000
TSOG1 : 10.227592 10.227592 0.000000 0.000000
TSOG2 : 216.753546 216.753546 0.000000 0.000000
TSOG3 : 518.842048 518.842048 0.000000 0.000000
XRO2 : 0.048364 0.048364 0.000000 0.000000
XYLE : 16.269421 16.269421 0.000000 0.000000
ZRO2 : 0.000938 0.000938 0.000000 0.000000
pFe : 1.024884 1.024884 0.000000 0.000000

The run w/o BLAS took slightly longer but that may be just normal load on the system. I did request whole nodes for the runs but variance in the disk I/O speeds could have affected the timings.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% GEOS-Chem Classic Benchmark Timing Information
%%%
%%% Ref = KPP+BLAS
%%% Dev = KPP-BLAS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Timer Ref [s] Dev [s] % Diff
-------------------------------------------------------------------------------
GEOS-Chem 12400.656 12715.938 2.542
HEMCO 1934.723 2006.875 3.729
All chemistry 5292.875 5450.250 2.973
=> Gas-phase chem 3379.211 3511.250 3.907
=> Photolysis 477.059 483.000 1.245
=> Aerosol chem 1231.848 1243.562 0.951
=> Linearized chem 16.391 18.688 14.014 *
Transport 989.945 989.250 -0.070
Convection 981.352 990.750 0.958
Boundary layer mixing 1131.090 1152.625 1.904
Dry deposition 42.602 47.312 11.056 *
Wet deposition 409.375 431.062 5.298
Diagnostics 1184.406 1216.750 2.731
Unit conversions 483.379 486.062 0.555

Also, we may get better speedup when using integrators like SDIRK or Runge-Kutta, which had a lot more calls to the BLAS routines than the Rosenbrock integrators did. |
|
This looks good! I think it's ready for merging now. |
|
Thanks @RolfSander, Could you approve the PR so that I can start merging? Thanks. |
RolfSander left a comment:
Looks good
This PR seeks to replace calls to functions in util/blas.f90 with pure F90 code. The BLAS functions pre-dated modern Fortran, which now has core array operations, thus rendering BLAS obsolete.

I have been using AI tools to generate the replacement code for the BLAS functions. Otherwise this would take much longer.
Tagging @RolfSander @jimmielin @msl3v