
Conversation

@wilfonba
Contributor

@wilfonba wilfonba commented Jan 13, 2026

User description

This PR adds support for Tuo with a modules entry and template. The following test cases still fail:

Failures in debug builds on Tuo

 Failed test tests/41AD0140: 1D -> Chemistry -> MultiComponent Diffusion after 1 attempt(s).
 Test tests/41AD0140: 1D -> Chemistry -> MultiComponent Diffusion: Variable n°64 (1-indexed) in D/cons.1.00.000050.dat is not within tolerance:
  - Candidate:   0.93960977572303
  - Golden:      0.93960977469851
  - Error:       abs: 1.02E-09, rel: 1.09E-09
  - Tolerance:   abs: 1.00E-09, rel: 1.00E-09

Diagnostics - Maximum absolute error among FAILING variables:
 - File: D/cons.3.00.000050.dat
 - Variable n°101
 - Candidate: 4442186.893565808
 - Golden: 4442191.638619191
 - Absolute Error: 4.75E+00
 - Relative Error: 1.07E-06

Diagnostics - Maximum relative error among FAILING variables:
 - File: D/cons.2.00.000050.dat
 - Variable n°101
 - Candidate: 7.44588801e-06
 - Golden: 5.3e-13
 - Relative Error: 1.40E+07
 - Absolute Error: 7.45E-06

Failures in non-debug builds on Tuo

 Failed test tests/AA9C637F: 3D -> 2 Fluid(s) -> capillary=T -> model_eqns=3 after 1 attempt(s).
 Test tests/AA9C637F: 3D -> 2 Fluid(s) -> capillary=T -> model_eqns=3: Variable n°4329 (1-indexed) in D/cons.10.00.000050.dat is not within tolerance:
  - Candidate:   0.79082801034946
  - Golden:      0.79047010302053
  - Error:       abs: 3.58E-04, rel: 4.53E-04
  - Tolerance:   abs: 1.00E-12, rel: 1.00E-12

Diagnostics - Maximum absolute error among FAILING variables:
 - File: D/cons.7.00.000050.dat
 - Variable n°6395
 - Candidate: 0.56686995042304
 - Golden: 0.5
 - Absolute Error: 6.69E-02
 - Relative Error: 1.34E-01

Diagnostics - Maximum relative error among FAILING variables:
 - File: D/cons.10.00.000050.dat
 - Variable n°6395
 - Candidate: 0.15135357145364
 - Golden: 0.17472070063246
 - Relative Error: 1.34E-01
 - Absolute Error: 2.34E-02

 Failed test tests/F0E6771E: 3D -> 2 Fluid(s) -> riemann_solver=2 -> model_eqns=3 after 1 attempt(s).
 Test tests/F0E6771E: 3D -> 2 Fluid(s) -> riemann_solver=2 -> model_eqns=3: Variable n°6053 (1-indexed) in D/cons.10.00.000050.dat is not within tolerance:
  - Candidate:   0.26623515460345
  - Golden:      0.21658043711463
  - Error:       abs: 4.97E-02, rel: 2.29E-01
  - Tolerance:   abs: 1.00E-12, rel: 1.00E-12

Diagnostics - Maximum absolute error among FAILING variables:
 - File: D/cons.10.00.000050.dat
 - Variable n°6053
 - Candidate: 0.26623515460345
 - Golden: 0.21658043711463
 - Absolute Error: 4.97E-02
 - Relative Error: 2.29E-01

Diagnostics - Maximum relative error among FAILING variables:
 - File: D/cons.10.00.000050.dat
 - Variable n°6053
 - Candidate: 0.26623515460345
 - Golden: 0.21658043711463
 - Relative Error: 2.29E-01
 - Absolute Error: 4.97E-02

I do not have a timeline for resolving these issues, but the vast majority of cases work, and the setup should be good enough for small tests and other runs that don't use these features.


PR Type

Bug fix, Enhancement


Description

  • Fix division by zero errors in levelset and boundary calculations

  • Correct energy index reference in pressure computation

  • Add Tuolumne (Tuo) HPC system support with modules and template

  • Refactor derived variables computation to pass field arrays as parameters

  • Update hipfort dependency version to rocm-6.3.1


Diagram Walkthrough

flowchart LR
  A["Numerical Stability Fixes"] --> B["Division by Zero Protection"]
  A --> C["Energy Index Correction"]
  D["Tuolumne HPC Support"] --> E["Module Configuration"]
  D --> F["Job Template"]
  G["Code Refactoring"] --> H["Derived Variables Parameters"]
  G --> I["Dependency Updates"]

File Walkthrough

Relevant files

Bug fix (3 files)
  • m_compute_levelset.fpp: Prevent division by zero in levelset normal calculation (+1/-2)
  • m_model.fpp: Add epsilon protection to edge tangent calculation (+1/-1)
  • m_data_output.fpp: Fix energy index reference in pressure computation (+1/-1)

Enhancement (6 files)
  • m_variables_conversion.fpp: Simplify GPU routine decorator parameters (+1/-2)
  • m_derived_variables.fpp: Add field array parameters to derived variables subroutine (+4/-4)
  • m_start_up.fpp: Remove derived variables call and debug statements (+0/-6)
  • m_time_steppers.fpp: Add derived variables module and call in time stepping loop (+3/-0)
  • modules.sh: Add Tuolumne system to module selection menu (+3/-2)
  • tuo.mako: Create Tuolumne job submission template (+61/-0)

Dependencies (1 file)
  • CMakeLists.txt: Update hipfort rocm version to 6.3.1 (+1/-1)

Configuration changes (1 file)
  • modules: Add Tuolumne module configuration and dependencies (+6/-0)


CodeAnt-AI Description

Add Tuolumne (tuo) support and improve runtime robustness and probe outputs

What Changed

  • Added support for the Tuolumne (tuo) system: new job template, modules entries, and selection option in the bootstrap script so users can target Tuolumne when preparing and running jobs.
  • Prevented rare divide-by-zero and normalization failures by clamping denominators to a small epsilon, reducing numerical instabilities and avoiding NaNs in geometry and model calculations (see the sketch after this list).
  • Ensured derived-variable probes run with correct data: the derived-variable routine now receives the time-stage fields and is invoked during time stepping when probes are enabled, producing consistent probe output files.
  • Used the correct energy field when computing pressure for output, and updated GPU routine annotations and the ROCm/hipfort tag for improved build compatibility.
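
Below is a minimal, editorial Fortran sketch of the epsilon-clamping pattern mentioned above (illustrative only; the sgm_eps value is assumed here and is not taken from MFC):

program clamp_demo
    implicit none
    integer, parameter :: wp = selected_real_kind(15, 307)
    real(wp), parameter :: sgm_eps = 1.e-16_wp   ! assumed value, for illustration
    real(wp) :: num, den
    num = 1.0_wp
    den = 0.0_wp
    ! A plain num/den here would trap as a divide-by-zero FPE in debug builds.
    print *, num/max(den, sgm_eps)   ! denominator clamped to at least sgm_eps
    ! If den can be negative, its sign must be handled separately; see the
    ! review comments further down this thread.
end program clamp_demo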

Impact

✅ Can run workloads on Tuolumne (tuo)
✅ Fewer divide-by-zero crashes during geometry and model calculations
✅ More accurate and consistent probe/output data

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved numerical stability by preventing division by zero in levelset and boundary calculations.
  • New Features

    • Added support for OLCF Tuolumne computing system with GPU acceleration configuration.
  • Chores

    • Updated HIPFORT dependency to version rocm-6.3.1.
    • Refactored internal computational workflow for better modularity.

✏️ Tip: You can customize this high-level summary in your review settings.

Copilot AI review requested due to automatic review settings January 13, 2026 16:01
@codeant-ai
Contributor

codeant-ai bot commented Jan 13, 2026

CodeAnt AI is reviewing your PR.


Thanks for using CodeAnt! 🎉

We're free for open-source projects. If you're enjoying it, help us grow by sharing.


@coderabbitai
Contributor

coderabbitai bot commented Jan 13, 2026

📝 Walkthrough

Walkthrough

The PR introduces numeric stability improvements with epsilon-guarded divisions across levelset and boundary calculations, refactors the derived variables computation workflow to accept external variable containers and relocate computation from startup to time steppers module, and extends the toolchain to support the OLCF Tuolumne system with GPU configurations and updated hipfort dependency.

Changes

Cohort / File(s) Summary
Numeric Stability Guards
src/common/m_compute_levelset.fpp, src/common/m_model.fpp
Added epsilon-guarded division operations: normalize by max(norm, sgm_eps) in cylinder levelset calculation and guard edge tangent calculation against zero divisors to prevent NaN/Inf results in near-singular cases.
GPU Directive Simplification
src/common/m_variables_conversion.fpp
Simplified GPU_ROUTINE directive in s_compute_speed_of_sound from detailed annotation with function_name and cray_inline options to streamlined GPU_ROUTINE(parallelism='[seq]') form.
Derived Variables Workflow Refactoring
src/simulation/m_derived_variables.fpp, src/simulation/m_start_up.fpp, src/simulation/m_time_steppers.fpp
Relocated s_compute_derived_variables call from startup phase to time steppers module; updated subroutine signature to accept external containers (q_cons_vf, q_prim_ts1, q_prim_ts2); removed m_time_steppers dependency from derived variables module and added m_derived_variables dependency to time steppers module.
Probe Output Index Correction
src/simulation/m_data_output.fpp
Updated probe file pressure computation to use energy index (E_idx) instead of absolute array index when accessing conservative variables for 1D branches.
Toolchain System Support
toolchain/bootstrap/modules.sh, toolchain/modules, toolchain/templates/tuo.mako
Added OLCF Tuolumne (tuo) system configuration with GPU directives (gfx942), updated module selections, and introduced Mako template for batch job script generation with Flux directives and MPICH GPU support configuration.
Dependency Version Update
toolchain/dependencies/CMakeLists.txt
Updated hipfort ExternalProject_Add GIT_TAG from rocm-6.0.2 to rocm-6.3.1 in Cray-specific build branch.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

Review effort 4/5, size:XL, numeric-stability, refactoring, toolchain, gpu-support

Suggested reviewers

  • sbryngelson

Poem

A rabbit hops through epsilon's guard,
Dividing by max—no more zeros hard! 🐰
Derived variables shift stage by stage,
Tuolumne joins the toolchain's page,
Six point three rocm—a speedy new age! 🚀

🚥 Pre-merge checks: 2 passed ✅, 1 inconclusive ❌

❌ Failed checks (1 inconclusive)
  • Title check ❓ Inconclusive: The title 'Tuo debug' is vague and generic, using non-descriptive terms that do not clearly convey the main changes in the changeset. Resolution: replace it with a more descriptive title that summarizes the primary changes, such as 'Add Tuolumne HPC system support and fix numerical stability issues' or 'Support Tuo system with division-by-zero fixes and derived variables refactoring'.

✅ Passed checks (2 passed)
  • Docstring Coverage ✅ Passed: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
  • Description check ✅ Passed: PR description includes all major template sections with clear structure, but the testing section lacks documented test execution evidence.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@codeant-ai codeant-ai bot added the size:M This PR changes 30-99 lines, ignoring generated files label Jan 13, 2026
@qodo-code-review
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

The new edgetan = boundary_edge(1)/max(sgm_eps, boundary_edge(2)) changes the behavior when boundary_edge(2) is negative: max will pick sgm_eps instead of the (negative) denominator, potentially flipping sign/magnitude and yielding incorrect normals for edges with negative y-direction. Consider using max(sgm_eps, abs(boundary_edge(2))) (and preserving sign separately) or guarding with an if on abs(boundary_edge(2)).

boundary_edge(1) = boundary_v(i, 2, 1) - boundary_v(i, 1, 1)
boundary_edge(2) = boundary_v(i, 2, 2) - boundary_v(i, 1, 2)
edgetan = boundary_edge(1)/max(sgm_eps, boundary_edge(2))

if (abs(boundary_edge(2)) < threshold_vector_zero) then
    if (edgetan > 0._wp) then
        ynormal = -1
Integration Risk

s_compute_derived_variables now takes q_cons_vf, q_prim_ts1, and q_prim_ts2 as arguments and is only invoked under probe_wrt from m_time_steppers. Please verify there are no other call paths expecting derived variables (e.g., COM files, coherent body info, or other outputs) when probes are disabled, and that all call sites were updated consistently to avoid stale host/device data or missing derived outputs.

subroutine s_compute_derived_variables(t_step, q_cons_vf, q_prim_ts1, q_prim_ts2)

    integer, intent(in) :: t_step
    type(scalar_field), dimension(:), intent(inout) :: q_cons_vf
    type(vector_field), dimension(:), intent(inout) :: q_prim_ts1, q_prim_ts2
Numerical Change

Switching the pressure call input from q_cons_vf(1)%sf(...) to q_cons_vf(E_idx)%sf(...) alters which conservative variable is treated as energy. This is likely correct for multi-equation setups, but it can change results and could explain some tolerance failures; please confirm E_idx is always set correctly for all models and that index 1 was not intentionally used for a specific formulation.

else
    call s_compute_pressure( &
        q_cons_vf(E_idx)%sf(j - 2, k, l), &
        q_cons_vf(alf_idx)%sf(j - 2, k, l), &
        dyn_p, pi_inf, gamma, rho, qv, rhoYks(:), pres, T)
end if

boundary_edge(1) = boundary_v(i, 2, 1) - boundary_v(i, 1, 1)
boundary_edge(2) = boundary_v(i, 2, 2) - boundary_v(i, 1, 2)
edgetan = boundary_edge(1)/boundary_edge(2)
edgetan = boundary_edge(1)/max(sgm_eps, boundary_edge(2))
Contributor

Suggestion: Preserve the sign of boundary_edge(2) when preventing division by zero to avoid an incorrect sign flip in the edgetan calculation. [possible issue, importance: 9]

Suggested change
edgetan = boundary_edge(1)/max(sgm_eps, boundary_edge(2))
edgetan = boundary_edge(1)/(sign(1.0_wp, boundary_edge(2))*max(abs(boundary_edge(2)), sgm_eps))

boundary_edge(1) = boundary_v(i, 2, 1) - boundary_v(i, 1, 1)
boundary_edge(2) = boundary_v(i, 2, 2) - boundary_v(i, 1, 2)
edgetan = boundary_edge(1)/boundary_edge(2)
edgetan = boundary_edge(1)/max(sgm_eps, boundary_edge(2))
Contributor

Suggestion: Sign-preservation bug: using max(sgm_eps, abs(...)) fixes magnitude but loses the original sign of boundary_edge(2) when sgm_eps is chosen; compute the denominator with the magnitude but preserve the original sign so the computed tangent keeps the correct orientation. [logic error]

Severity Level: Minor ⚠️

Suggested change
edgetan = boundary_edge(1)/max(sgm_eps, boundary_edge(2))
edgetan = boundary_edge(1)/sign(max(sgm_eps, abs(boundary_edge(2))), boundary_edge(2))
Why it matters? ⭐

This is the correct approach: clamp the magnitude to avoid tiny denominators but restore the original sign so the tangent keeps the correct orientation. The sign(...) wrapper yields a denominator with the clamped magnitude and the original sign, so downstream checks that rely on edgetan's sign remain valid. This directly addresses the visible logic bug in the PR diff.

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/common/m_model.fpp
**Line:** 681:681
**Comment:**
	*Logic Error: Sign-preservation bug: using `max(sgm_eps, abs(...))` fixes magnitude but loses the original sign of `boundary_edge(2)` when `sgm_eps` is chosen; compute the denominator with the magnitude but preserve the original sign so the computed tangent keeps the correct orientation.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

subroutine s_compute_speed_of_sound(pres, rho, gamma, pi_inf, H, adv, vel_sum, c_c, c, qv)
$:GPU_ROUTINE(function_name='s_compute_speed_of_sound', &
& parallelism='[seq]', cray_inline=True)
$:GPU_ROUTINE(parallelism='[seq]')
Contributor

Suggestion: The GPU annotation lacks an explicit function_name argument; other routines in this file provide function_name so the GPU toolchain may rely on it for symbol mapping—without it the offload/annotation processor may not correctly associate the pragma with this subroutine and the annotation may be ignored or misapplied. Add the function_name to match the pattern used elsewhere. [possible bug]

Severity Level: Critical 🚨

Suggested change
$:GPU_ROUTINE(parallelism='[seq]')
$:GPU_ROUTINE(function_name='s_compute_speed_of_sound', parallelism='[seq]', cray_inline=True)
Why it matters? ⭐

Other routines in this file (e.g. s_compute_pressure, s_compute_fast_magnetosonic_speed,
s_convert_species_to_mixture_variables_acc) include function_name and cray_inline=True in their
$:GPU_ROUTINE annotations. Keeping annotations consistent is actionable and low risk: adding
function_name='s_compute_speed_of_sound' (and optionally cray_inline=True) matches the established
pattern and helps the offload/annotation toolchain map symbols correctly. This does not fix a
correctness bug in the Fortran itself, but it is a sensible, minimal change to ensure the pragma
is applied as intended.

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/common/m_variables_conversion.fpp
**Line:** 1380:1380
**Comment:**
	*Possible Bug: The GPU annotation lacks an explicit `function_name` argument; other routines in this file provide `function_name` so the GPU toolchain may rely on it for symbol mapping—without it the offload/annotation processor may not correctly associate the pragma with this subroutine and the annotation may be ignored or misapplied. Add the function_name to match the pattern used elsewhere.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

q_cons_vf(1)%sf(j - 2, k, l), &
q_cons_vf(E_idx)%sf(j - 2, k, l), &
q_cons_vf(alf_idx)%sf(j - 2, k, l), &
dyn_p, pi_inf, gamma, rho, qv, rhoYks(:), pres, T)
Contributor

Suggestion: Passing rhoYks(:) creates an explicit array section expression; in some compilers this can create a temporary and change argument association semantics. Use the array name rhoYks (no (:)) to pass the array directly and avoid inadvertent temporaries or subtle differences in argument passing. [possible bug]

Severity Level: Critical 🚨

Suggested change
dyn_p, pi_inf, gamma, rho, qv, rhoYks(:), pres, T)
dyn_p, pi_inf, gamma, rho, qv, rhoYks, pres, T)
Why it matters? ⭐

Passing an explicit section (rhoYks(:)) can cause the compiler to create temporaries or change argument association semantics depending on the procedure interface. The final file already contains inconsistent uses (some calls pass rhoYks, others rhoYks(:)), so standardizing to pass the array name avoids accidental temporaries and is a sensible, low-risk change.

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/simulation/m_data_output.fpp
**Line:** 1238:1238
**Comment:**
	*Possible Bug: Passing `rhoYks(:)` creates an explicit array section expression; in some compilers this can create a temporary and change argument association semantics. Use the array name `rhoYks` (no `(:)`) to pass the array directly and avoid inadvertent temporaries or subtle differences in argument passing.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.


if (probe_wrt) then
call s_time_step_cycling(t_step)
call s_compute_derived_variables(t_step, q_cons_ts(1)%vf, q_prim_ts1, q_prim_ts2)
Contributor

Suggestion: Missing GPU -> host synchronization before calling a host-side routine: s_compute_derived_variables reads arrays that may currently reside/been updated on the device (GPU) but were not updated back to the host, causing stale or invalid data to be used on the host and incorrect results or crashes. Insert a GPU update to ensure the host view of q_cons_ts(1)%vf, q_prim_ts1, and q_prim_ts2 is current before the call. [race condition]

Severity Level: Minor ⚠️

Suggested change
call s_compute_derived_variables(t_step, q_cons_ts(1)%vf, q_prim_ts1, q_prim_ts2)
$:GPU_UPDATE(host='[q_cons_ts(1)%vf,q_prim_ts1,q_prim_ts2]')
Why it matters? ⭐

The call to s_compute_derived_variables occurs after GPU kernels update q_cons_ts(1)%vf / related probe arrays.
Without a GPU->host synchronization the host routine can read stale/incomplete data. Adding a host GPU_UPDATE (as your improved_code shows) fixes a real potential data-race / stale-data bug.

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/simulation/m_time_steppers.fpp
**Line:** 538:538
**Comment:**
	*Race Condition: Missing GPU -> host synchronization before calling a host-side routine: `s_compute_derived_variables` reads arrays that may currently reside/been updated on the device (GPU) but were not updated back to the host, causing stale or invalid data to be used on the host and incorrect results or crashes. Insert a GPU update to ensure the host view of `q_cons_ts(1)%vf`, `q_prim_ts1`, and `q_prim_ts2` is current before the call.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

# flux: --error="${name}.err"
# flux: --time=${walltime}
# flux: --exclusive
# flux:--setattr=thp=always
Contributor

Suggestion: Logic/behaviour issue: the Flux directive comment is missing a space after the colon (# flux:--setattr=...), which will likely prevent Flux from recognizing the directive; add the missing space so the scheduler parses the directive. [logic error]

Severity Level: Minor ⚠️

Suggested change
# flux:--setattr=thp=always
# flux: --setattr=thp=always
Why it matters? ⭐

All other flux directives in the file use a space after "flux:". Many schedulers parse directives only when formatted consistently; adding the space makes the directive match the pattern used elsewhere and avoids it being ignored.

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** toolchain/templates/tuo.mako
**Line:** 13:13
**Comment:**
	*Logic Error: Logic/behaviour issue: the Flux directive comment is missing a space after the colon (`# flux:--setattr=...`), which will likely prevent Flux from recognizing the directive; add the missing space so the scheduler parses the directive.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.


${helpers.template_prologue()}

ok ":) Loading modules:\n"
Contributor

Suggestion: Runtime error: ok is not a standard shell builtin/command and will cause "command not found" when the generated script runs; replace it with a portable print command (use printf or echo) so the message is actually emitted. [possible bug]

Severity Level: Critical 🚨

Suggested change
ok ":) Loading modules:\n"
printf "%s\n" ":) Loading modules:"
Why it matters? ⭐

The template emits a shell script. "ok" is not a standard shell builtin/command and will cause "command not found" when the script runs. Replacing it with a portable print (printf or echo) fixes a real runtime error.

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** toolchain/templates/tuo.mako
**Line:** 25:25
**Comment:**
	*Possible Bug: Runtime error: `ok` is not a standard shell builtin/command and will cause "command not found" when the generated script runs; replace it with a portable print command (use `printf` or `echo`) so the message is actually emitted.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

@codeant-ai
Contributor

codeant-ai bot commented Jan 13, 2026

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • Sign/Denominator bug
    The newly added safe division uses max(sgm_eps, boundary_edge(2)) as the denominator. When boundary_edge(2) is negative, max picks the positive sgm_eps, flipping the sign (and distorting the magnitude) of edgetan, which yields wrong normal directions for edges with a negative y-component (see the sketch after this list).

  • GPU/Host data movement
    The new call to s_compute_derived_variables passes device-resident arrays (q_cons_ts(1)%vf, q_prim_ts1, q_prim_ts2). Verify that the called routine either runs on the device or that explicit host/device synchronization is performed before and after the call. Missing GPU updates could lead to stale data used on the host or lost updates on the device.

  • Interface mismatch risk
    The new subroutine signature for s_compute_derived_variables accepts q_cons_vf and two q_prim_ts* arguments and the code now calls s_write_probe_files(t_step, q_cons_vf, accel_mag). Verify that the types/dimensions expected by s_write_probe_files match what is being passed (e.g. previously the call used q_cons_ts(1)%vf). If s_write_probe_files expects a specific slice or a different shape, this can produce incorrect I/O data or subtle runtime misbehavior.

  • Normalization edge-case
    The new normalization uses a clamped denominator "max(norm2(xyz_local), sgm_eps)". This can produce a non-unit or unexpected normal when the true norm is below sgm_eps (the normal will have magnitude norm/sgm_eps instead of being zero or a safe fallback). Inspect numerical behavior for near-zero xyz_local and ensure the fallback is the intended behaviour (e.g., set normal to zero or use a stable unit fallback).

  • Possible Bug
    The PR replaces uses of the literal index 1 with E_idx when passing the "energy" field to s_compute_pressure. Ensure E_idx is always initialized and points to the intended energy variable in every build/configuration (sys_size variations, model_eqns, qbmm/bubbles variants, etc.). An out-of-range or wrong index here will feed incorrect energy values into the EOS and can silently alter pressure/primitive results.

@codeant-ai
Contributor

codeant-ai bot commented Jan 13, 2026

CodeAnt AI finished reviewing your PR.

Contributor

Copilot AI left a comment

Pull request overview

This PR adds support for the OLCF Tuolumne (Tuo) system and includes several bug fixes related to numerical stability. The main additions are a new batch template for the Flux scheduler, module configuration for the Tuo system using ROCm 6.3.1, and corrections to prevent division-by-zero errors.

Changes:

  • Added Tuo system support with Flux batch template and module configuration for ROCm 6.3.1
  • Fixed circular module dependency between m_time_steppers and m_derived_variables by refactoring function signatures
  • Added division-by-zero protection in geometry calculations using sgm_eps constant
  • Corrected energy index usage in pressure computation from q_cons_vf(1) to q_cons_vf(E_idx)

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Summary per file
  • toolchain/templates/tuo.mako: New Flux batch template for Tuo system with GPU and MPI configuration
  • toolchain/modules: Added module definitions for Tuo system (ROCm 6.3.1, Cray compilers, AMD GPU support)
  • toolchain/dependencies/CMakeLists.txt: Updated hipfort version from rocm-6.0.2 to rocm-6.3.1 for consistency
  • toolchain/bootstrap/modules.sh: Added Tuo to system selection menu with improved alignment
  • src/simulation/m_time_steppers.fpp: Added m_derived_variables module use and call to s_compute_derived_variables with required parameters
  • src/simulation/m_start_up.fpp: Removed premature call to s_compute_derived_variables and cleaned up whitespace
  • src/simulation/m_derived_variables.fpp: Removed circular dependency on m_time_steppers and updated function signature to accept parameters
  • src/simulation/m_data_output.fpp: Fixed pressure computation to use E_idx instead of hardcoded index 1
  • src/common/m_variables_conversion.fpp: Simplified GPU_ROUTINE annotation for s_compute_speed_of_sound
  • src/common/m_model.fpp: Added division-by-zero protection using max(sgm_eps, boundary_edge(2))
  • src/common/m_compute_levelset.fpp: Added division-by-zero protection using max(norm2(xyz_local), sgm_eps)

# flux: --error="${name}.err"
# flux: --time=${walltime}
# flux: --exclusive
# flux:--setattr=thp=always
Copilot AI Jan 13, 2026

Missing space after colon. The flux comment directive should be "# flux: --setattr=thp=always" to match the formatting of other flux directives in this template (lines 6-12, 14, 16, 19).

Suggested change
# flux:--setattr=thp=always
# flux: --setattr=thp=always

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/common/m_model.fpp (1)

677-695: max(sgm_eps, boundary_edge(2)) changes slope sign for negative boundary_edge(2) (geometry bug).

This “fix” can silently flip the tangent sign and distort normals. You already branch on abs(boundary_edge(2)) < threshold_vector_zero, so avoid dividing before that check (or clamp by magnitude while preserving sign).

Proposed fix (compute safely only when needed)
         do i = 1, boundary_edge_count
             boundary_edge(1) = boundary_v(i, 2, 1) - boundary_v(i, 1, 1)
             boundary_edge(2) = boundary_v(i, 2, 2) - boundary_v(i, 1, 2)
-            edgetan = boundary_edge(1)/max(sgm_eps, boundary_edge(2))
-
             if (abs(boundary_edge(2)) < threshold_vector_zero) then
-                if (edgetan > 0._wp) then
+                if (boundary_edge(1) > 0._wp) then
                     ynormal = -1
                     xnormal = 0._wp
                 else
                     ynormal = 1
                     xnormal = 0._wp
                 end if
             else
+                edgetan = boundary_edge(1)/boundary_edge(2)
                 initial = boundary_edge(2)
                 ynormal = -edgetan*initial
                 xnormal = initial
             end if
src/common/m_compute_levelset.fpp (1)

542-558: Inconsistent normalization approach: use explicit near-zero check instead of clamping.

This normalization differs from s_circle_levelset and s_sphere_levelset, which explicitly check for near-zero distances and assign a deterministic fallback (0 or a fixed direction). The current clamping via max(norm2(xyz_local), sgm_eps) produces a non-unit normal (magnitude ~1e-16) when the distance is very small, which may cause downstream issues if unit normals are assumed.

Regarding device visibility: sgm_eps is a module parameter (defined in m_constants.fpp), so it is accessible on-device without requiring explicit copyin. However, the normalization semantics should match the pattern used elsewhere: explicitly handle the near-zero case with a clear fallback.

Proposed fix (explicit near-zero handling)
                     else
                         levelset%sf(i, j, k, ib_patch_id) = dist_surface
                         xyz_local = xyz_local*dist_surface_vec
-                        xyz_local = xyz_local/max(norm2(xyz_local), sgm_eps)
+                        if (f_approx_equal(norm2(xyz_local), 0._wp)) then
+                            xyz_local = 0._wp
+                        else
+                            xyz_local = xyz_local/norm2(xyz_local)
+                        end if
                         levelset_norm%sf(i, j, k, ib_patch_id, :) = matmul(rotation, xyz_local)
                     end if
🤖 Fix all issues with AI agents
In @src/common/m_variables_conversion.fpp:
- Around line 1379-1381: The GPU helper subroutine s_compute_speed_of_sound is
missing the explicit function_name in its $:GPU_ROUTINE decoration; update the
macro invocation for subroutine s_compute_speed_of_sound to include
function_name='s_compute_speed_of_sound' (i.e., restore the explicit naming in
the $:GPU_ROUTINE(...) parameters) so the GPU-callable helper matches the
documented decoration standard and improves clarity.

In @toolchain/bootstrap/modules.sh:
- Line 42: The "Tuolumne" module entry currently logs as "LLNL" but should be
labeled as OLCF/ORNL; update the log line that contains the string "Tuolumne
(tuo)" to use the OLCF/ORNL label instead of LLNL and relocate that line so it
sits with the other ORNL/OLCF entries (or merge it into the existing ORNL line)
in modules.sh to reflect the correct organization.

In @toolchain/templates/tuo.mako:
- Line 28: The module load command is using the wrong computer identifier ("-c
t") which doesn't match the module slug "tuo"; update the command in the
template (the line invoking "./mfc.sh load -c t -m ${'g' if gpu else 'c'}") to
use "-c tuo" so it passes the correct computer slug when invoking mfc.sh.
- Line 13: The Flux directive line contains "# flux:--setattr=thp=always" with a
missing space after the colon which breaks parsing; update that directive to
include a space after the colon so it reads "# flux: --setattr=thp=always"
(preserving the flag and value) to match the other flux directives in the
template.
🧹 Nitpick comments (1)
src/simulation/m_derived_variables.fpp (1)

119-177: Tighten s_compute_derived_variables interface (intent(in) / explicit shapes) to prevent accidental mutation + clarify contract.

Right now q_cons_vf, q_prim_ts1, q_prim_ts2 are intent(inout) but appear read-only here (at least in this file). Also, consider using dimension(sys_size) (or the appropriate explicit bounds) for q_cons_vf to match s_write_probe_files expectations and catch mismatches early. Based on learnings, keeping interfaces strict helps avoid GPU/host aliasing surprises.

Proposed signature tightening + doc touch-up
-    subroutine s_compute_derived_variables(t_step, q_cons_vf, q_prim_ts1, q_prim_ts2)
+    subroutine s_compute_derived_variables(t_step, q_cons_vf, q_prim_ts1, q_prim_ts2)

         integer, intent(in) :: t_step
-        type(scalar_field), dimension(:), intent(inout) :: q_cons_vf
-        type(vector_field), dimension(:), intent(inout) :: q_prim_ts1, q_prim_ts2
+        type(scalar_field), dimension(sys_size), intent(in) :: q_cons_vf
+        type(vector_field), dimension(:), intent(in) :: q_prim_ts1, q_prim_ts2

(And add !! @param q_cons_vf ..., !! @param q_prim_ts1 ..., !! @param q_prim_ts2 ....)

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 31c938f and aca57a9.

📒 Files selected for processing (11)
  • src/common/m_compute_levelset.fpp
  • src/common/m_model.fpp
  • src/common/m_variables_conversion.fpp
  • src/simulation/m_data_output.fpp
  • src/simulation/m_derived_variables.fpp
  • src/simulation/m_start_up.fpp
  • src/simulation/m_time_steppers.fpp
  • toolchain/bootstrap/modules.sh
  • toolchain/dependencies/CMakeLists.txt
  • toolchain/modules
  • toolchain/templates/tuo.mako
💤 Files with no reviewable changes (1)
  • src/simulation/m_start_up.fpp
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{fpp,f90}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{fpp,f90}: Use 2-space indentation; continuation lines align beneath &
Use lower-case keywords and intrinsics (do, end subroutine, etc.)
Name modules with m_ pattern (e.g., m_transport)
Name public subroutines with s_ pattern (e.g., s_compute_flux)
Name public functions with f_ pattern
Keep subroutine size ≤ 500 lines, helper subroutines ≤ 150 lines, functions ≤ 100 lines, files ≤ 1000 lines
Limit routine arguments to ≤ 6; use derived-type params struct if more are needed
Forbid goto statements (except in legacy code), COMMON blocks, and save globals
Every argument must have explicit intent; use dimension/allocatable/pointer as appropriate
Call s_mpi_abort() for errors, never use stop or error stop

**/*.{fpp,f90}: Indent 2 spaces; continuation lines align under &
Use lower-case keywords and intrinsics (do, end subroutine, etc.)
Name modules with m_<feature> prefix (e.g., m_transport)
Name public subroutines as s_<verb>_<noun> (e.g., s_compute_flux) and functions as f_<verb>_<noun>
Keep private helpers in the module; avoid nested procedures
Enforce size limits: subroutine ≤ 500 lines, helper ≤ 150, function ≤ 100, module/file ≤ 1000
Limit subroutines to ≤ 6 arguments; otherwise pass a derived-type 'params' struct
Avoid goto statements (except unavoidable legacy); avoid global state (COMMON, save)
Every variable must have intent(in|out|inout) specification and appropriate dimension / allocatable / pointer
Use s_mpi_abort(<msg>) for error termination instead of stop
Use !> style documentation for header comments; follow Doxygen Fortran format with !! @param and !! @return tags
Use implicit none statement in all modules
Use private declaration followed by explicit public exports in modules
Use derived types with pointers for encapsulation (e.g., pointer, dimension(:,:,:) => null())
Use pure and elemental attributes for side-effect-free functions; combine them for array ...

Files:

  • src/common/m_model.fpp
  • src/common/m_variables_conversion.fpp
  • src/common/m_compute_levelset.fpp
  • src/simulation/m_time_steppers.fpp
  • src/simulation/m_data_output.fpp
  • src/simulation/m_derived_variables.fpp
src/**/*.fpp

📄 CodeRabbit inference engine (.cursor/rules/mfc-agent-rules.mdc)

src/**/*.fpp: Use .fpp file extension for Fypp preprocessed files; CMake transpiles them to .f90
Start module files with Fypp include for macros: #:include 'macros.fpp'
Use the fypp ASSERT macro for validating conditions: @:ASSERT(predicate, message)
Use fypp macro @:ALLOCATE(var1, var2) for device-aware allocation instead of standard Fortran allocate
Use fypp macro @:DEALLOCATE(var1, var2) for device-aware deallocation instead of standard Fortran deallocate

Files:

  • src/common/m_model.fpp
  • src/common/m_variables_conversion.fpp
  • src/common/m_compute_levelset.fpp
  • src/simulation/m_time_steppers.fpp
  • src/simulation/m_data_output.fpp
  • src/simulation/m_derived_variables.fpp
src/simulation/**/*.{fpp,f90}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

src/simulation/**/*.{fpp,f90}: Wrap tight GPU loops with !$acc parallel loop gang vector default(present) reduction(...); add collapse(n) when safe; declare loop-local variables with private(...)
Allocate large GPU arrays with managed memory or move them into persistent !$acc enter data regions at start-up
Avoid stop/error stop inside GPU device code
Ensure GPU code compiles with Cray ftn, NVIDIA nvfortran, GNU gfortran, and Intel ifx/ifort compilers

src/simulation/**/*.{fpp,f90}: Mark GPU-callable helpers with $:GPU_ROUTINE(function_name='...', parallelism='[seq]') immediately after declaration
Do not use OpenACC or OpenMP directives directly; use Fypp macros from src/common/include/parallel_macros.fpp instead
Wrap tight loops with $:GPU_PARALLEL_FOR(private='[...]', copy='[...]') macro; add collapse=n for safe nested loop merging
Declare loop-local variables with private='[...]' in GPU parallel loop macros
Allocate large arrays with managed or move them into a persistent $:GPU_ENTER_DATA(...) region at start-up
Do not place stop or error stop inside device code

Files:

  • src/simulation/m_time_steppers.fpp
  • src/simulation/m_data_output.fpp
  • src/simulation/m_derived_variables.fpp
🧠 Learnings (11)
📓 Common learnings
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Wrap tight GPU loops with !$acc parallel loop gang vector default(present) reduction(...); add collapse(n) when safe; declare loop-local variables with private(...)
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Mark GPU-callable helpers with `$:GPU_ROUTINE(function_name='...', parallelism='[seq]')` immediately after declaration

Applied to files:

  • src/common/m_variables_conversion.fpp
📚 Learning: 2025-11-24T21:50:16.713Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Wrap tight GPU loops with !$acc parallel loop gang vector default(present) reduction(...); add collapse(n) when safe; declare loop-local variables with private(...)

Applied to files:

  • src/common/m_variables_conversion.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Wrap tight loops with `$:GPU_PARALLEL_FOR(private='[...]', copy='[...]')` macro; add `collapse=n` for safe nested loop merging

Applied to files:

  • src/common/m_variables_conversion.fpp
📚 Learning: 2025-11-24T21:50:16.713Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Ensure GPU code compiles with Cray ftn, NVIDIA nvfortran, GNU gfortran, and Intel ifx/ifort compilers

Applied to files:

  • src/common/m_variables_conversion.fpp
  • toolchain/dependencies/CMakeLists.txt
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Declare loop-local variables with `private='[...]'` in GPU parallel loop macros

Applied to files:

  • src/common/m_variables_conversion.fpp
  • src/simulation/m_derived_variables.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Do not use OpenACC or OpenMP directives directly; use Fypp macros from `src/common/include/parallel_macros.fpp` instead

Applied to files:

  • src/common/m_variables_conversion.fpp
  • src/simulation/m_derived_variables.fpp
📚 Learning: 2025-11-24T21:50:16.713Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T21:50:16.713Z
Learning: Applies to src/simulation/**/*.{fpp,f90} : Avoid stop/error stop inside GPU device code

Applied to files:

  • src/common/m_variables_conversion.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Compile with Cray `ftn` or NVIDIA `nvfortran` for GPU offloading; also build CPU-only with GNU `gfortran` and Intel `ifx`/`ifort` for portability

Applied to files:

  • toolchain/modules
  • toolchain/dependencies/CMakeLists.txt
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Applies to **/*.{fpp,f90} : Keep private helpers in the module; avoid nested procedures

Applied to files:

  • src/simulation/m_derived_variables.fpp
📚 Learning: 2025-11-24T21:50:46.909Z
Learnt from: CR
Repo: MFlowCode/MFC PR: 0
File: .cursor/rules/mfc-agent-rules.mdc:0-0
Timestamp: 2025-11-24T21:50:46.909Z
Learning: Draft a step-by-step plan before making changes; build after each step using `./mfc.sh build -t pre_process simulation -j $(nproc)`

Applied to files:

  • toolchain/templates/tuo.mako
🧬 Code graph analysis (1)
toolchain/bootstrap/modules.sh (1)
toolchain/util.sh (2)
  • log (12-12)
  • log_n (13-13)
🪛 Shellcheck (0.11.0)
toolchain/bootstrap/modules.sh

[warning] 49-49: The surrounding quotes actually unquote this. Remove or escape them.

(SC2027)


[warning] 49-49: The surrounding quotes actually unquote this. Remove or escape them.

(SC2027)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: Oak Ridge | Frontier (CCE) (gpu-acc)
  • GitHub Check: Code Cleanliness Check
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Coverage Test on CodeCov
  • GitHub Check: Oak Ridge | Frontier (gpu-acc)
  • GitHub Check: Github (ubuntu, mpi, no-debug, true)
  • GitHub Check: Oak Ridge | Frontier (cpu)
  • GitHub Check: Github (ubuntu, mpi, debug, false)
  • GitHub Check: Github (ubuntu, mpi, debug, true)
  • GitHub Check: Oak Ridge | Frontier (gpu-omp)
  • GitHub Check: Github (macos, mpi, no-debug, false)
  • GitHub Check: Github (macos, mpi, debug, false)
  • GitHub Check: Github (ubuntu, no-mpi, single, no-debug, false)
  • GitHub Check: Github (ubuntu, mpi, no-debug, false)
  • GitHub Check: Agent
  • GitHub Check: Build & Publish
🔇 Additional comments (7)
toolchain/modules (1)

48-53: LGTM! Tuolumne module configuration is consistent with Frontier.

The module setup for OLCF Tuolumne looks correct:

  • ROCm 6.3.1 and CPE 25.03 align with Frontier's configuration
  • gfx942 is the correct target for MI300A GPUs
  • HSA_XNACK=0 is appropriate for performance

Note: There's no tuo-cpu entry defined. If CPU-only builds are intended to be supported on Tuolumne, you may need to add a tuo-cpu line (similar to how Frontier has f-gpu but also inherits from f-all). If GPU-only is intentional, this is fine as-is.

toolchain/bootstrap/modules.sh (1)

49-49: Shellcheck warning is a false positive for this color code pattern.

The SC2027 warnings flag the adjacent quote concatenation ($G""a$W), but this is an intentional pattern used throughout this file for embedding color codes. The pattern works correctly in bash to produce colored output.

toolchain/templates/tuo.mako (1)

1-61: Template structure and Flux integration look reasonable.

The overall template structure follows the pattern of other MFC templates with proper prologue/epilogue handling, conditional GPU support, and Flux scheduler integration appropriate for Tuolumne.

A few observations:

  • HSA_XNACK=0 is set both in toolchain/modules (line 52) and here (line 39). The duplication is harmless but could be consolidated.
  • The --coral2-hugepages=512GB directive on line 14 is a large allocation—verify this is the intended configuration for Tuolumne workloads.
toolchain/dependencies/CMakeLists.txt (1)

131-141: LGTM! HIPFORT version aligned with ROCm module.

The update to rocm-6.3.1 correctly aligns the HIPFORT dependency with the ROCm version. This change is appropriately scoped to the Cray compiler path only (line 130).

src/simulation/m_data_output.fpp (1)

1228-1239: Good fix using E_idx—but elasticity branch likely needs the same correction.

Switching the non-elastic call to q_cons_vf(E_idx)%sf(...) matches the energy argument of s_compute_pressure. However, the elasticity path still passes q_cons_vf(1)%sf(...) as energy, which looks inconsistent and likely wrong unless elasticity remaps indices. Please verify and align both branches if energy is always at E_idx.

Likely follow-up fix (if indices are consistent under elasticity)
-                        call s_compute_pressure( &
-                            q_cons_vf(1)%sf(j - 2, k, l), &
+                        call s_compute_pressure( &
+                            q_cons_vf(E_idx)%sf(j - 2, k, l), &
                             q_cons_vf(alf_idx)%sf(j - 2, k, l), &
                             dyn_p, pi_inf, gamma, rho, qv, rhoYks(:), pres, T, &
                             q_cons_vf(stress_idx%beg)%sf(j - 2, k, l), &
                             q_cons_vf(mom_idx%beg)%sf(j - 2, k, l), G_local)
src/simulation/m_time_steppers.fpp (2)

49-50: LGTM!

The module import follows the existing pattern in this file and is required for the new s_compute_derived_variables call.


536-539: Interface signature is compatible; call placement is correct.

The call to s_compute_derived_variables(t_step, q_cons_ts(1)%vf, q_prim_ts1, q_prim_ts2) matches the subroutine signature in m_derived_variables.fpp. All four arguments align with their corresponding parameters (integer for t_step, scalar_field array for q_cons_vf, and vector_field arrays for q_prim_ts1 and q_prim_ts2). The placement within the probe_wrt guard and after s_time_step_cycling is appropriate.

Comment on lines 1379 to 1381
subroutine s_compute_speed_of_sound(pres, rho, gamma, pi_inf, H, adv, vel_sum, c_c, c, qv)
$:GPU_ROUTINE(function_name='s_compute_speed_of_sound', &
& parallelism='[seq]', cray_inline=True)
$:GPU_ROUTINE(parallelism='[seq]')

Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Search for GPU_ROUTINE macro definition
fd -e fpp -e f90 | xargs grep -l "define.*GPU_ROUTINE" || true
rg "GPU_ROUTINE" --type f90 --type fpp -B 2 -A 5 | head -100

Repository: MFlowCode/MFC

Length of output: 84


🏁 Script executed:

# Look for macro definitions in includes or preprocessor files
fd "macros" --type f
fd -e fpp | xargs grep -l "GPU_ROUTINE"

Repository: MFlowCode/MFC

Length of output: 951


🏁 Script executed:

# Check for fypp macro definitions
fd -e fpp -path "*/macros*" 
rg "\#:def.*GPU_ROUTINE" --type fpp || rg "GPU_ROUTINE" --type fpp | head -50

Repository: MFlowCode/MFC

Length of output: 281


🏁 Script executed:

# Read the main macros file to find GPU_ROUTINE definition
cat src/common/include/macros.fpp | head -200

Repository: MFlowCode/MFC

Length of output: 6174


🏁 Script executed:

# Search for GPU_ROUTINE in all .fpp files
grep -r "GPU_ROUTINE" src/common/include/ src/simulation/ src/common/ --include="*.fpp" | head -50

Repository: MFlowCode/MFC

Length of output: 4576


🏁 Script executed:

# Look for the specific macro definition
grep -n "GPU_ROUTINE" src/common/include/macros.fpp src/common/include/parallel_macros.fpp src/common/include/acc_macros.fpp 2>/dev/null | head -30

Repository: MFlowCode/MFC

Length of output: 225


🏁 Script executed:

# Read the GPU_ROUTINE macro definition
sed -n '51,150p' src/common/include/parallel_macros.fpp

Repository: MFlowCode/MFC

Length of output: 4689


Include function_name to match documented GPU helper decoration standards.

The $:GPU_ROUTINE macro does support omitting function_name (it defaults to None), so this change is technically valid. However, the project guidance explicitly recommends marking GPU-callable helpers with the function name for consistency and clarity. Restore the explicit naming:

Suggested change
-        $:GPU_ROUTINE(parallelism='[seq]')
+        $:GPU_ROUTINE(function_name='s_compute_speed_of_sound', parallelism='[seq]')
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
subroutine s_compute_speed_of_sound(pres, rho, gamma, pi_inf, H, adv, vel_sum, c_c, c, qv)
$:GPU_ROUTINE(function_name='s_compute_speed_of_sound', &
& parallelism='[seq]', cray_inline=True)
$:GPU_ROUTINE(parallelism='[seq]')
subroutine s_compute_speed_of_sound(pres, rho, gamma, pi_inf, H, adv, vel_sum, c_c, c, qv)
$:GPU_ROUTINE(function_name='s_compute_speed_of_sound', parallelism='[seq]')
🤖 Prompt for AI Agents
In @src/common/m_variables_conversion.fpp around lines 1379 - 1381, The GPU
helper subroutine s_compute_speed_of_sound is missing the explicit function_name
in its $:GPU_ROUTINE decoration; update the macro invocation for subroutine
s_compute_speed_of_sound to include function_name='s_compute_speed_of_sound'
(i.e., restore the explicit naming in the $:GPU_ROUTINE(...) parameters) so the
GPU-callable helper matches the documented decoration standard and improves
clarity.

# flux: --error="${name}.err"
# flux: --time=${walltime}
# flux: --exclusive
# flux:--setattr=thp=always
Contributor

⚠️ Potential issue | 🟡 Minor

Missing space in Flux directive will cause parsing failure.

The directive # flux:--setattr is missing a space after the colon, inconsistent with all other flux directives in this file.

Proposed fix
-# flux:--setattr=thp=always
+# flux: --setattr=thp=always
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# flux:--setattr=thp=always
# flux: --setattr=thp=always
🤖 Prompt for AI Agents
In @toolchain/templates/tuo.mako at line 13, The Flux directive line contains "#
flux:--setattr=thp=always" with a missing space after the colon which breaks
parsing; update that directive to include a space after the colon so it reads "#
flux: --setattr=thp=always" (preserving the flag and value) to match the other
flux directives in the template.

ok ":) Loading modules:\n"
cd "${MFC_ROOT_DIR}"
% if engine == 'batch':
. ./mfc.sh load -c t -m ${'g' if gpu else 'c'}
Contributor

⚠️ Potential issue | 🔴 Critical

Critical: Module load command uses wrong computer identifier.

The command uses -c t but the module slug defined in toolchain/modules is tuo, not t. This will cause the module loading to fail with "Computer not recognized" error.

Proposed fix
-. ./mfc.sh load -c t -m ${'g' if gpu else 'c'}
+. ./mfc.sh load -c tuo -m ${'g' if gpu else 'c'}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
. ./mfc.sh load -c t -m ${'g' if gpu else 'c'}
. ./mfc.sh load -c tuo -m ${'g' if gpu else 'c'}
🤖 Prompt for AI Agents
In @toolchain/templates/tuo.mako at line 28, The module load command is using
the wrong computer identifier ("-c t") which doesn't match the module slug
"tuo"; update the command in the template (the line invoking "./mfc.sh load -c t
-m ${'g' if gpu else 'c'}") to use "-c tuo" so it passes the correct computer
slug when invoking mfc.sh.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 11 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="toolchain/templates/tuo.mako">

<violation number="1" location="toolchain/templates/tuo.mako:13">
P2: Missing space after `flux:` - inconsistent with all other flux directives in this file. This typo may cause the flux scheduler to not recognize this directive.</violation>
</file>

<file name="src/common/m_model.fpp">

<violation number="1" location="src/common/m_model.fpp:681">
P1: Division protection breaks for negative values. When `boundary_edge(2)` is negative, `max(sgm_eps, boundary_edge(2))` returns `sgm_eps`, causing incorrect sign and magnitude of `edgetan`. This will compute wrong normal directions. Consider using `sign(max(sgm_eps, abs(boundary_edge(2))), boundary_edge(2))` to preserve the sign while preventing division by zero.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

boundary_edge(1) = boundary_v(i, 2, 1) - boundary_v(i, 1, 1)
boundary_edge(2) = boundary_v(i, 2, 2) - boundary_v(i, 1, 2)
edgetan = boundary_edge(1)/boundary_edge(2)
edgetan = boundary_edge(1)/max(sgm_eps, boundary_edge(2))
Contributor

P1: Division protection breaks for negative values. When boundary_edge(2) is negative, max(sgm_eps, boundary_edge(2)) returns sgm_eps, causing incorrect sign and magnitude of edgetan. This will compute wrong normal directions. Consider using sign(max(sgm_eps, abs(boundary_edge(2))), boundary_edge(2)) to preserve the sign while preventing division by zero.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/common/m_model.fpp, line 681:

<comment>Division protection breaks for negative values. When `boundary_edge(2)` is negative, `max(sgm_eps, boundary_edge(2))` returns `sgm_eps`, causing incorrect sign and magnitude of `edgetan`. This will compute wrong normal directions. Consider using `sign(max(sgm_eps, abs(boundary_edge(2))), boundary_edge(2))` to preserve the sign while preventing division by zero.</comment>

<file context>
@@ -678,7 +678,7 @@ contains
             boundary_edge(1) = boundary_v(i, 2, 1) - boundary_v(i, 1, 1)
             boundary_edge(2) = boundary_v(i, 2, 2) - boundary_v(i, 1, 2)
-            edgetan = boundary_edge(1)/boundary_edge(2)
+            edgetan = boundary_edge(1)/max(sgm_eps, boundary_edge(2))
 
             if (abs(boundary_edge(2)) < threshold_vector_zero) then
</file context>
Suggested change
edgetan = boundary_edge(1)/max(sgm_eps, boundary_edge(2))
edgetan = boundary_edge(1)/sign(max(sgm_eps, abs(boundary_edge(2))), boundary_edge(2))

# flux: --error="${name}.err"
# flux: --time=${walltime}
# flux: --exclusive
# flux:--setattr=thp=always
Contributor

P2: Missing space after flux: - inconsistent with all other flux directives in this file. This typo may cause the flux scheduler to not recognize this directive.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At toolchain/templates/tuo.mako, line 13:

<comment>Missing space after `flux:` - inconsistent with all other flux directives in this file. This typo may cause the flux scheduler to not recognize this directive.</comment>

<file context>
@@ -0,0 +1,61 @@
+# flux: --error="${name}.err"
+# flux: --time=${walltime}
+# flux: --exclusive
+# flux:--setattr=thp=always
+# flux: --coral2-hugepages=512GB
+% if account:
</file context>
Suggested change
# flux:--setattr=thp=always
# flux: --setattr=thp=always

@wilfonba
Contributor Author

Well, some of these AI suggestions seem valid, though the resulting code is somewhat gross. In particular, I'm referring to the preservation of the sign when preventing division by zero. I had to add these barriers because run-time floating point exceptions were occurring with debug builds on Tuo.

@codecov

codecov bot commented Jan 13, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 44.01%. Comparing base (31c938f) to head (aca57a9).

Files with missing lines Patch % Lines
src/common/m_compute_levelset.fpp 0.00% 1 Missing ⚠️
src/simulation/m_time_steppers.fpp 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1103      +/-   ##
==========================================
+ Coverage   44.00%   44.01%   +0.01%     
==========================================
  Files          71       71              
  Lines       20293    20283      -10     
  Branches     1982     1979       -3     
==========================================
- Hits         8929     8927       -2     
+ Misses      10227    10219       -8     
  Partials     1137     1137              

☔ View full report in Codecov by Sentry.


Labels

Review effort 3/5 size:M This PR changes 30-99 lines, ignoring generated files
