Skip to content

Conversation

@krystophny
Copy link
Member

@krystophny krystophny commented Dec 2, 2025

User description

This PR introduces a new 1D B-spline module with a textbook Cox–de Boor basis and a matrix-free CGLS LSQ solver, plus tests and plotting utilities.

Highlights:

100% tests passed, 0 tests failed out of 42

Label Time Summary:
biot_savart = 17.48 secproc (1 test)
build-helper = 0.02 sec
proc (1 test)
field = 1.25 secproc (10 tests)
magfie = 47.82 sec
proc (4 tests)
poincare = 0.02 secproc (1 test)
polylag = 0.01 sec
proc (1 test)
tilted_coil = 30.34 sec*proc (3 tests)

Total Test time (real) = 72.91 sec passes all Fortran and Python tests.

This provides a clean, textbook 1D B-spline basis and a proven matrix-free LSQ core as a foundation for future 2D/3D and batch extensions.


PR Type

Enhancement


Description

  • Implement 1D B-spline basis with Cox–de Boor recursion

  • Add matrix-free CGLS solver for least-squares fitting

  • Include partition of unity and LSQ accuracy tests

  • Provide Python visualization utility for fit results


Diagram Walkthrough

flowchart LR
  A["B-spline Basis<br/>Cox-de Boor"] --> B["Spline Evaluation<br/>apply_A"]
  A --> C["Adjoint Action<br/>apply_AT"]
  B --> D["Matrix-free CGLS<br/>LSQ Solver"]
  C --> D
  D --> E["Control Points<br/>coeff"]
  E --> F["Fit Visualization<br/>Python Plot"]
Loading

File Walkthrough

Relevant files
Enhancement
neo_bspline.f90
1D B-spline basis and matrix-free CGLS implementation       

src/interpolate/neo_bspline.f90

  • Implement bspline_1d type with degree, control points, and
    open-uniform knot vector
  • Add bspline_1d_init_uniform for knot vector initialization on [x_min,
    x_max]
  • Implement bspline_1d_eval using Cox–de Boor recursion for basis
    evaluation
  • Add matrix-free CGLS algorithm bspline_1d_lsq_cgls with operator-based
    A and A^T
  • Include helper routines: find_span, basis_funs, apply_A, apply_AT
+327/-0 
Tests
test_neo_bspline_1d.f90
B-spline partition of unity and LSQ accuracy tests             

test/source/test_neo_bspline_1d.f90

  • Add partition of unity test verifying sum_j N_j(x) == 1 to ~1e-10
    accuracy
  • Add LSQ fitting test for f(x) = cos(2x) + 0.5*sin(3x) with scattered
    data
  • Verify L2 error < 1e-5 on 201-point evaluation grid
  • Write plot data to file for visualization
+117/-0 
Documentation
plot_neo_bspline_1d.py
Python visualization utility for B-spline LSQ fits             

test/scripts/plot_neo_bspline_1d.py

  • Create Python script to visualize LSQ fit results
  • Plot analytic reference vs neo_bspline fit from test data file
  • Support configurable input data and output PNG paths
+45/-0   
Configuration changes
CMakeLists.txt
Register neo_bspline module in build system                           

src/interpolate/CMakeLists.txt

  • Add neo_bspline.f90 to interpolate library build
+1/-0     
CMakeLists.txt
Add neo_bspline test to CMake and CTest                                   

test/CMakeLists.txt

  • Add test_neo_bspline_1d.x executable target
  • Link with common libraries
  • Register test in CTest harness
+3/-0     

@qodo-code-review
Copy link
Contributor

qodo-code-review bot commented Dec 2, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🟡
🎫 #1
🟢 Reduce reliance on Fortran features beyond F95 unless clearly beneficial.
🔴 Make MPI entirely optional in low-level routines (e.g., via preprocessor) or remove MPI
from low-level code like arnoldi; delegate parallelization to caller.
🟡
🎫 #2
🔴 Integrate VMEC interfaces into libneo and maintain as a central implementation.
Depend on NetCDF (and HDF5) as required deps for VMEC features.
Include SIMPLE updates: correct lambda interpolation on half-mesh and support
non-stellarator symmetric equilibria.
Add automated unit tests for VMEC magnetic field components and derivatives, and
comparisons against STELLOPT/libstell routines.
Provide API and user documentation for VMEC routines.
Address listed TODOs (stop writing empty NetCDF, add intents, pfUnit automation, coverage,
CI, etc.).
🟡
🎫 #3
🔴 Supply latest versions of all magnetic field routine variants, including VMEC flux
coordinates, Boozer coordinates, legacy variants, and additions from related projects.
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Unhandled edge cases: Several routines use error stop for validation and do not handle edge cases like zero
denominators in basis evaluation or provide recoverable errors, leading to abrupt
termination.

Referred Code
if (degree < 1) error stop "bspline_1d_init_uniform: degree must be >= 1"
if (n_ctrl < degree + 1) then
    error stop "bspline_1d_init_uniform: need at least degree+1 control points"
end if
if (x_max <= x_min) error stop "bspline_1d_init_uniform: x_max <= x_min"

p = degree
n = n_ctrl
m = n + p + 1

spl%degree = p
spl%n_ctrl = n
spl%x_min = x_min
spl%x_max = x_max

if (allocated(spl%knots)) deallocate(spl%knots)
allocate(spl%knots(m))

! Open-uniform knot vector:
! First p+1 knots at x_min, last p+1 knots at x_max, interior equally spaced.
h = (x_max - x_min) / real(n - p, dp)


 ... (clipped 200 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No audit logs: The new numerical routines perform computations without emitting any audit logs for
critical actions, but it is unclear whether such logging is required in this
non-user-facing scientific module.

Referred Code
subroutine bspline_1d_init_uniform(spl, degree, n_ctrl, x_min, x_max)
    !! Initialise open-uniform 1D B-spline on [x_min, x_max].
    type(bspline_1d), intent(out) :: spl
    integer, intent(in) :: degree, n_ctrl
    real(dp), intent(in) :: x_min, x_max

    integer :: p, n, m, i
    real(dp) :: h, left, right

    if (degree < 1) error stop "bspline_1d_init_uniform: degree must be >= 1"
    if (n_ctrl < degree + 1) then
        error stop "bspline_1d_init_uniform: need at least degree+1 control points"
    end if
    if (x_max <= x_min) error stop "bspline_1d_init_uniform: x_max <= x_min"

    p = degree
    n = n_ctrl
    m = n + p + 1

    spl%degree = p
    spl%n_ctrl = n


 ... (clipped 133 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
Error stop strings: The code uses error stop with internal messages that could surface implementation details
in user-facing contexts, but impact depends on how these errors are propagated in the host
application.

Referred Code
if (degree < 1) error stop "bspline_1d_init_uniform: degree must be >= 1"
if (n_ctrl < degree + 1) then
    error stop "bspline_1d_init_uniform: need at least degree+1 control points"
end if
if (x_max <= x_min) error stop "bspline_1d_init_uniform: x_max <= x_min"

p = degree
n = n_ctrl
m = n + p + 1

spl%degree = p
spl%n_ctrl = n
spl%x_min = x_min
spl%x_max = x_max

if (allocated(spl%knots)) deallocate(spl%knots)
allocate(spl%knots(m))

! Open-uniform knot vector:
! First p+1 knots at x_min, last p+1 knots at x_max, interior equally spaced.
h = (x_max - x_min) / real(n - p, dp)


 ... (clipped 69 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Minimal validation: The numerical routines validate dimensions and ranges but do not sanitize external inputs
or enforce strict bounds on data arrays beyond sizes; suitability depends on whether
inputs are trusted internal test data only.

Referred Code
n_data = size(x_data)
if (n_data /= size(f_data)) then
    error stop "bspline_1d_lsq_cgls: x_data and f_data size mismatch"
end if

n_ctrl = spl%n_ctrl
if (size(coeff) /= n_ctrl) then
    error stop "bspline_1d_lsq_cgls: coeff size mismatch"
end if

if (present(max_iter)) then
    kmax = max_iter
else
    kmax = 200
end if
if (present(tol)) then
    atol = tol
else
    atol = 1.0d-10
end if


 ... (clipped 42 lines)

Learn more about managing compliance generic rules or creating your own custom rules

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link
Contributor

qodo-code-review bot commented Dec 2, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix potential infinite binary search
Suggestion Impact:The commit modifies find_span but keeps the original potentially non-terminating binary search pattern. However, the suggestion’s intent—preventing a stuck loop—was effectively addressed by clamping x to [x_min, x_max] and adding an early return when xx >= knots(m - p). This reduces risk of infinite looping at the upper boundary, partially mitigating the issue though not by adopting the exact while (low < high) algorithm.

code diff:

+    subroutine find_span(spl, x, span)
+        type(bspline_1d), intent(in) :: spl
+        real(dp), intent(in) :: x
+        integer, intent(out) :: span
+        integer :: low, high, mid, p, n, m
+        real(dp) :: xx
+
+        p = spl%degree
+        n = spl%n_ctrl
+        m = n + p + 1
+        xx = min(max(x, spl%x_min), spl%x_max)
+        if (xx >= spl%knots(m - p)) then
+            span = n
+            return
+        end if
+        low = p + 1
+        high = n + 1
+        do
+            mid = (low + high)/2
+            if (xx < spl%knots(mid)) then
+                high = mid
+            else if (xx >= spl%knots(mid + 1)) then
+                low = mid
+            else
+                span = mid
+                exit
+            end if
+        end do
+    end subroutine find_span

Fix a potential infinite loop in the find_span binary search by replacing the
current implementation with a more robust algorithm that ensures loop
termination.

src/interpolate/neo_bspline.f90 [209-219]

-do
+do while (low < high)
     mid = (low + high)/2
     if (xx < spl%knots(mid)) then
         high = mid
-    else if (xx >= spl%knots(mid+1)) then
-        low = mid
     else
-        span = mid
-        exit
+        low = mid + 1
     end if
 end do
+span = low - 1

[Suggestion processed]

Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a critical bug in the binary search implementation that can lead to an infinite loop, and provides a robust, standard algorithm to fix it.

High
Prevent division by zero error

Prevent a potential division-by-zero error in basis_funs by checking if the
denominator is zero before performing the division. This can occur with
coincident knots.

src/interpolate/neo_bspline.f90 [243-253]

 do j = 1, p
     left(j) = x - spl%knots(span+1-j)
     right(j) = spl%knots(span+j) - x
     saved = 0.0_dp
     do r = 0, j-1
-        temp = N(r)/(right(r+1) + left(j-r))
+        temp = right(r+1) + left(j-r)
+        if (temp /= 0.0_dp) then
+            temp = N(r)/temp
+        else
+            temp = 0.0_dp
+        end if
         N(r) = saved + right(r+1)*temp
         saved = left(j-r)*temp
     end do
     N(j) = saved
 end do
  • Apply / Chat
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies a potential division-by-zero error due to coincident knots, which is a critical bug that would cause the program to crash. The proposed fix correctly handles this case.

High
High-level
Re-evaluate adding a new spline implementation

Instead of adding a new B-spline module, consider integrating the new
functionality into the existing spline code. This would avoid code duplication
and reduce future maintenance.

Examples:

src/interpolate/neo_bspline.f90 [1-327]
module neo_bspline
    !! Simple 1D B-spline basis and matrix-free LSQ via CGLS.
    !!
    !! Uses textbook Cox–de Boor recursion (Piegl & Tiller) for basis
    !! evaluation and a standard CGLS algorithm for least-squares fitting.
    use, intrinsic :: iso_fortran_env, only : dp => real64
    implicit none
    private

    type :: bspline_1d

 ... (clipped 317 lines)

Solution Walkthrough:

Before:

// src/interpolate/CMakeLists.txt
add_library(interpolate STATIC
    ../spl_three_to_five.f90
    neo_bspline.f90 // New file added
    ...
)

// src/interpolate/neo_bspline.f90 (new file)
module neo_bspline
    type :: bspline_1d
    ...
    subroutine bspline_1d_init_uniform(...)
    subroutine bspline_1d_eval(...)
    subroutine bspline_1d_lsq_cgls(...)
    ...
end module

After:

// src/interpolate/CMakeLists.txt
add_library(interpolate STATIC
    ../spl_three_to_five.f90 // Existing file is modified/extended
    ...
    // neo_bspline.f90 is not added
)

// src/interpolate/spl_three_to_five.f90 (hypothetical, modified)
module spl_three_to_five
    ... // Existing spline code
    
    // New B-spline functionality is integrated here
    type :: bspline_1d 
    ...
    subroutine bspline_1d_lsq_cgls(...)
    ...
end module
Suggestion importance[1-10]: 8

__

Why: The suggestion raises a crucial architectural concern about introducing a new spline implementation (neo_bspline.f90) when existing ones appear to be present, which could lead to code duplication and increased maintenance overhead.

Medium
General
Improve test performance by reducing calls

Improve the performance of the test_bspline_1d_partition_unity test by calling
basis_funs once per sample point and summing the results, instead of repeatedly
calling bspline_1d_eval in a loop.

test/source/test_neo_bspline_1d.f90 [36-46]

+integer :: span
+real(dp), allocatable :: N(:)
+
+allocate(N(0:spl%degree))
+
 do i = 1, N_SAMPLE
     x = x_eval(i)
-    sum_basis = 0.0d0
-    do j = 1, N_CTRL
-        c = 0.0d0
-        c(j) = 1.0d0
-        call bspline_1d_eval(spl, c, x, temp)
-        sum_basis = sum_basis + temp
-    end do
+    call find_span(spl, x, span)
+    call basis_funs(spl, span, x, N)
+    sum_basis = sum(N(0:spl%degree))
     max_err = max(max_err, abs(sum_basis - 1.0d0))
 end do
 
+deallocate(N)
+

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 6

__

Why: The suggestion proposes a valid and significant performance optimization for a test routine by avoiding redundant calculations, which improves test execution speed without changing the logic.

Low
  • Update

@krystophny
Copy link
Member Author

Update: Unified High-Performance Implementation

This commit consolidates all B-spline functionality (1D/2D/3D) into a single optimized module with grid-based API:

Key Changes

  • Unified module: Merged neo_bspline_3d.f90 into neo_bspline.f90
  • Grid-optimized storage: Precompute basis functions per grid line (O(n1+n2+n3) storage vs O(n1*n2*n3))
  • OpenMP parallelization: parallel do collapse + simd for construction, SIMD-only for evaluation
  • New API: bspline_Nd_lsq_cgls(spl, x1, x2, ..., f_grid, coeff) for regular grids

Benchmark Results (up to 100,000 points)

3D Performance (cubic B-splines, degree 3):

Grid Points interpolate create neo_bspline create Speedup
1,000 0.87 ms 4.07 ms 0.2x
2,744 2.76 ms 33.6 ms 0.08x
9,261 9.31 ms 271 ms 0.03x
29,791 N/A (skipped) 956 ms -
97,336 N/A (skipped) 3.17 s -

Evaluation Performance (3D):

Grid Points interpolate eval neo_bspline eval Speedup
9,261 1.81 ms 1.32 ms 1.4x
29,791 N/A 5.26 ms -
97,336 N/A 14.4 ms -

Note: The LSQ construction in neo_bspline is slower than direct interpolation because it solves an iterative least-squares problem (CGLS), which is useful when data is noisy or when you need fewer control points than data points. The evaluation is faster due to the optimized tensor-product evaluation.

All 44 tests pass.

Cache spans and basis function values before CG iterations,
then reuse them through cached apply_A3D and apply_A3D_T
operators. This eliminates redundant find_span and basis_funs
calls in each iteration.

Benchmark results for 9261 3D data points:
- Before: 1.05s
- After:  0.26s (4x speedup)
Consolidate all B-spline functionality (1D/2D/3D) into a single module
with optimized grid-based API exploiting separable tensor product structure:

- Remove neo_bspline_3d.f90 and merge into unified neo_bspline.f90
- Precompute basis functions per grid line (O(n1+n2+n3) storage vs O(n*n*n))
- OpenMP parallel do + SIMD for construction operators
- SIMD-only evaluation (no thread overhead for single-point queries)
- New grid-based LSQ CGLS API: bspline_Nd_lsq_cgls(spl, x1, x2, ..., f_grid, coeff)
- Update all tests and benchmarks for new API
- 44/44 tests pass

This enables efficient least-squares fitting on regular grids by caching
basis functions once per coordinate axis instead of per grid point.
…rpolation

Reorganize the B-spline code into clean, single-responsibility modules:
- neo_bspline_base: core types, initialization, evaluation, basis functions
- neo_bspline_interp: direct interpolation via LAPACK collocation solve
- neo_bspline_lsq: least-squares fitting via matrix-free CGLS

The direct interpolation variant solves A*c = f where A_ij = B_j(x_i),
using separable tensor-product structure in 2D/3D to reduce dense solve
to dimension-by-dimension LU factorizations. OpenMP parallelizes the
back-substitution loops in 3D.

Benchmarks updated to compare all three methods: interpolate module,
neo_bspline LSQ, and neo_bspline direct. Large grid sizes skip direct
interpolation (O(n^3) scaling) to avoid timeouts.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants