Description
MPAS-Model currently has two ways to build executables: CMake and make.
It was hypothesized that CMake-built executables would run slower than make-built executables, because the default CMake options are not as optimized as those in the Makefile.
rrfs-workflow has used the make method so far. The problem is that different spack-stack versions may provide different combinations of C++/C/Fortran compilers, and each new combination may require a new section in the MPAS-Model Makefile, as in PR #170. This may not be sustainable.
So we would like to transition rrfs-workflow to building MPAS-Model with CMake as early as possible. This would be better aligned with modern software engineering practices, more consistent with other UFS applications, and much easier to maintain.
The following compares the CMake and make builds using the current default settings (i.e. the make build is optimized, while the CMake build still needs fine-tuned optimization options). Both ran 12 h cold-start forecasts on gaeac6 using 10 nodes (ppn=160).
Conclusion: the CMake build completed in 3097.1 s and the make build in 3024.7 s, so CMake was about 72 s (roughly 2.4%) slower. I'd consider this within a reasonable margin of run-to-run variability.
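As a quick check of the arithmetic behind this conclusion, the sketch below recomputes the gap from the two "total time" rows in the timer tables. The variable names are illustrative only and not part of MPAS-Model or rrfs-workflow.

```python
# Total wall-clock times in seconds, copied from the "total time" rows
# of the CMake and make timer tables in this issue.
cmake_total = 3097.11304
make_total = 3024.70166

# Absolute gap and the gap relative to the make-built run.
diff = cmake_total - make_total
rel = diff / make_total * 100.0

print(f"CMake is {diff:.1f} s slower ({rel:.2f}% of the make run)")
```

A single pair of runs cannot separate build effects from machine noise (I/O and node placement, for example), so repeating each configuration a few times would give a firmer basis for the comparison.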
We will add fine-tuned optimization options to the CMake build and rerun the comparison.
CMake
timer_name total calls min max avg pct_tot pct_par par_eff
1 total time 3097.11304 1 3097.02368 3097.11304 3097.05566 100.00 0.00 1.00
2 initialize 53.16598 1 53.01339 53.16598 53.03205 1.72 1.72 1.00
3 read_ICs 7.81630 1 7.81389 7.81630 7.81475 0.25 14.70 1.00
2 diagnostic_fields 79.28183 5762 0.00218 2.54850 0.01267 2.56 2.56 0.92
2 stream_output 276.86783 2881 0.00000 25.98127 0.09607 8.94 8.94 1.00
2 stream_input 16.86766 1 16.85745 16.86766 16.85988 0.54 0.54 1.00
2 time integration 2601.04175 2880 0.86035 2.93290 0.90244 83.98 83.98 1.00
3 physics driver 457.91266 2880 0.07227 2.21594 0.11411 14.79 17.60 0.72
4 cal_mynncld 0.02754 48 0.00012 0.00185 0.00033 0.00 0.01 0.57
4 rrtmg_swrad 18.24232 48 0.00001 1.36461 0.19843 0.59 3.98 0.52
4 rrtmg_lwrad 31.55703 48 0.47152 0.70754 0.53042 1.02 6.89 0.81
4 sf_mynnsfclay 3.16672 2880 0.00050 0.01856 0.00069 0.10 0.69 0.63
4 sf_ruc 9.87083 2880 0.00005 0.00533 0.00118 0.32 2.16 0.34
4 bl_mynnedmf 252.37164 2880 0.03262 0.10403 0.04327 8.15 55.11 0.49
4 bl_ugwp_gwdo 64.89875 2880 0.01186 0.03679 0.01697 2.10 14.17 0.75
3 atm_rk_integration_setup 23.18732 2880 0.00083 0.00983 0.00387 0.75 0.89 0.48
3 atm_compute_moist_coefficients 16.44854 2880 0.00126 0.00838 0.00278 0.53 0.63 0.49
3 physics_get_tend 99.27243 2880 0.00166 1.31308 0.01293 3.21 3.82 0.38
3 atm_compute_vert_imp_coefs 16.58818 8640 0.00070 0.00510 0.00119 0.54 0.64 0.62
3 atm_compute_dyn_tend 308.98950 25920 0.00414 0.03390 0.00948 9.98 11.88 0.80
3 small_step_prep 31.60984 25920 0.00033 0.00357 0.00068 1.02 1.22 0.56
3 atm_bdy_adjust_dynamics_relaxzone_tend 39.64612 25920 0.00002 0.00438 0.00010 1.28 1.52 0.07
3 atm_advance_acoustic_step 142.87941 34560 0.00103 0.01378 0.00275 4.61 5.49 0.67
3 atm_divergence_damping_3d 33.90297 34560 0.00032 0.00333 0.00064 1.09 1.30 0.65
3 atm_recover_large_step_variables 179.86221 25920 0.00079 0.01022 0.00373 5.81 6.92 0.54
3 atm_compute_solve_diagnostics 157.25038 25920 0.00084 0.01267 0.00359 5.08 6.05 0.59
3 atm_rk_dynamics_substep_finish 34.66234 8640 0.00030 0.00642 0.00226 1.12 1.33 0.56
3 atm_advance_scalars 145.20296 5760 0.01480 0.04295 0.02192 4.69 5.58 0.87
3 atm_advance_scalars_mono 485.40430 2880 0.13404 0.42939 0.16317 15.67 18.66 0.97
3 microphysics 299.64453 2880 0.02061 0.26858 0.06058 9.67 11.52 0.58
4 mp_tempo 235.16112 2880 0.00946 0.22619 0.03726 7.59 78.48 0.46
make
timer_name total calls min max avg pct_tot pct_par par_eff
1 total time 3024.70166 1 3024.59546 3024.70166 3024.60938 100.00 0.00 1.00
2 initialize 47.04760 1 46.98198 47.04760 47.00020 1.56 1.56 1.00
3 read_ICs 3.34079 1 3.33413 3.34079 3.33882 0.11 7.10 1.00
2 diagnostic_fields 72.75267 5762 0.00218 2.19514 0.01200 2.41 2.41 0.95
2 stream_output 266.47327 2881 0.00000 21.64403 0.09248 8.81 8.81 1.00
2 stream_input 16.09141 1 16.08182 16.09141 16.08426 0.53 0.53 1.00
2 time integration 2574.39526 2880 0.85349 2.91890 0.89322 85.11 85.11 1.00
3 physics driver 452.84753 2880 0.06559 2.20191 0.10717 14.97 17.59 0.68
4 cal_mynncld 0.03018 48 0.00012 0.00191 0.00031 0.00 0.01 0.50
4 rrtmg_swrad 18.08443 48 0.00001 1.36585 0.19733 0.60 3.99 0.52
4 rrtmg_lwrad 35.15728 48 0.46342 0.77106 0.52213 1.16 7.76 0.71
4 sf_mynnsfclay 3.00286 2880 0.00048 0.00439 0.00066 0.10 0.66 0.63
4 sf_ruc 9.67783 2880 0.00004 0.00521 0.00113 0.32 2.14 0.34
4 bl_mynnedmf 239.33063 2880 0.02623 0.09847 0.03679 7.91 52.85 0.44
4 bl_ugwp_gwdo 66.13377 2880 0.01185 0.44685 0.01687 2.19 14.60 0.73
3 atm_rk_integration_setup 23.49383 2880 0.00118 0.35943 0.00416 0.78 0.91 0.51
3 atm_compute_moist_coefficients 16.65304 2880 0.00127 0.36100 0.00280 0.55 0.65 0.48
3 physics_get_tend 98.10201 2880 0.00168 1.34139 0.01309 3.24 3.81 0.38
3 atm_compute_vert_imp_coefs 16.41025 8640 0.00070 0.31947 0.00119 0.54 0.64 0.62
3 atm_compute_dyn_tend 321.95660 25920 0.00417 0.42781 0.00948 10.64 12.51 0.76
3 small_step_prep 30.59839 25920 0.00033 0.31577 0.00069 1.01 1.19 0.59
3 atm_bdy_adjust_dynamics_relaxzone_tend 38.88117 25920 0.00002 0.03143 0.00010 1.29 1.51 0.06
3 atm_advance_acoustic_step 135.33881 34560 0.00104 0.04513 0.00277 4.47 5.26 0.71
3 atm_divergence_damping_3d 38.36580 34560 0.00037 0.00373 0.00070 1.27 1.49 0.63
3 atm_recover_large_step_variables 177.47621 25920 0.00080 0.01080 0.00369 5.87 6.89 0.54
3 atm_compute_solve_diagnostics 160.50436 25920 0.00084 0.01115 0.00358 5.31 6.23 0.58
3 atm_rk_dynamics_substep_finish 34.15577 8640 0.00030 0.00642 0.00227 1.13 1.33 0.57
3 atm_advance_scalars 146.68997 5760 0.01433 0.04096 0.02192 4.85 5.70 0.86
3 atm_advance_scalars_mono 485.76904 2880 0.12722 0.18854 0.16303 16.06 18.87 0.97
3 microphysics 291.05881 2880 0.02013 0.12466 0.05969 9.62 11.31 0.59
4 mp_tempo 229.48993 2880 0.00936 0.10149 0.03712 7.59 78.85 0.47
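To see which timers account for the gap, the two tables can be compared row by row. The sketch below is one possible way to do that, assuming each row has the shape shown above: a nesting level, a timer name that may contain spaces, then eight numeric columns. The helper names `parse_timers` and `compare` are hypothetical, and only a few rows from the tables are embedded to keep the example short.

```python
CMAKE_ROWS = """
timer_name total calls min max avg pct_tot pct_par par_eff
1 total time 3097.11304 1 3097.02368 3097.11304 3097.05566 100.00 0.00 1.00
3 read_ICs 7.81630 1 7.81389 7.81630 7.81475 0.25 14.70 1.00
4 bl_mynnedmf 252.37164 2880 0.03262 0.10403 0.04327 8.15 55.11 0.49
3 atm_compute_dyn_tend 308.98950 25920 0.00414 0.03390 0.00948 9.98 11.88 0.80
"""

MAKE_ROWS = """
timer_name total calls min max avg pct_tot pct_par par_eff
1 total time 3024.70166 1 3024.59546 3024.70166 3024.60938 100.00 0.00 1.00
3 read_ICs 3.34079 1 3.33413 3.34079 3.33882 0.11 7.10 1.00
4 bl_mynnedmf 239.33063 2880 0.02623 0.09847 0.03679 7.91 52.85 0.44
3 atm_compute_dyn_tend 321.95660 25920 0.00417 0.42781 0.00948 10.64 12.51 0.76
"""


def parse_timers(text):
    """Map timer name -> total seconds for rows shaped like the tables above."""
    totals = {}
    for line in text.strip().splitlines():
        parts = line.split()
        # A data row is: level, name (one or more words), then 8 numeric columns.
        # The header row has only 9 tokens and no leading digit, so it is skipped.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        name = " ".join(parts[1:-8])
        totals[name] = float(parts[-8])  # first of the 8 numeric columns: total
    return totals


def compare(cmake_text, make_text):
    """Return (name, cmake - make) pairs, largest absolute difference first."""
    cmake = parse_timers(cmake_text)
    make = parse_timers(make_text)
    deltas = {name: cmake[name] - make[name] for name in cmake if name in make}
    return sorted(deltas.items(), key=lambda kv: abs(kv[1]), reverse=True)


for name, delta in compare(CMAKE_ROWS, MAKE_ROWS):
    print(f"{name:25s} {delta:+9.2f} s")
```

A positive value means the CMake build spent more time in that timer. Pasting the full tables into the two strings extends the comparison to every timer.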
The run directories:
/gpfs/f6/arfs-gsl/world-shared/gge/rrfs2/regressionTests/3km133tquv/stmp/20240506/rrfs_fcst_00_v2.1.2/det.CMake/fcst_00
and
/gpfs/f6/arfs-gsl/world-shared/gge/rrfs2/regressionTests/3km133tquv/stmp/20240506/rrfs_fcst_00_v2.1.2/det.make/fcst_00
This issue is copied at NOAA-EMC/rrfs-workflow#1094.