-
Notifications
You must be signed in to change notification settings - Fork 0
neo_bspline: 1D B-spline + matrix-free CGLS LSQ #151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
PR Compliance Guide 🔍Below is a summary of compliance checks for this PR:
Compliance status legend🟢 - Fully Compliant🟡 - Partial Compliant 🔴 - Not Compliant ⚪ - Requires Further Human Verification 🏷️ - Compliance label |
||||||||||||||||||||||||||||||||||||||||||
PR Code Suggestions ✨Explore these optional code suggestions:
|
|||||||||||||||||
Update: Unified High-Performance ImplementationThis commit consolidates all B-spline functionality (1D/2D/3D) into a single optimized module with grid-based API: Key Changes
Benchmark Results (up to 100,000 points)3D Performance (cubic B-splines, degree 3):
Evaluation Performance (3D):
Note: The LSQ construction in All 44 tests pass. |
Cache spans and basis function values before CG iterations, then reuse them through cached apply_A3D and apply_A3D_T operators. This eliminates redundant find_span and basis_funs calls in each iteration. Benchmark results for 9261 3D data points: - Before: 1.05s - After: 0.26s (4x speedup)
Consolidate all B-spline functionality (1D/2D/3D) into a single module with optimized grid-based API exploiting separable tensor product structure: - Remove neo_bspline_3d.f90 and merge into unified neo_bspline.f90 - Precompute basis functions per grid line (O(n1+n2+n3) storage vs O(n*n*n)) - OpenMP parallel do + SIMD for construction operators - SIMD-only evaluation (no thread overhead for single-point queries) - New grid-based LSQ CGLS API: bspline_Nd_lsq_cgls(spl, x1, x2, ..., f_grid, coeff) - Update all tests and benchmarks for new API - 44/44 tests pass This enables efficient least-squares fitting on regular grids by caching basis functions once per coordinate axis instead of per grid point.
…rpolation Reorganize the B-spline code into clean, single-responsibility modules: - neo_bspline_base: core types, initialization, evaluation, basis functions - neo_bspline_interp: direct interpolation via LAPACK collocation solve - neo_bspline_lsq: least-squares fitting via matrix-free CGLS The direct interpolation variant solves A*c = f where A_ij = B_j(x_i), using separable tensor-product structure in 2D/3D to reduce dense solve to dimension-by-dimension LU factorizations. OpenMP parallelizes the back-substitution loops in 3D. Benchmarks updated to compare all three methods: interpolate module, neo_bspline LSQ, and neo_bspline direct. Large grid sizes skip direct interpolation (O(n^3) scaling) to avoid timeouts.
6b4b71e to
d5355ef
Compare
User description
This PR introduces a new 1D B-spline module with a textbook Cox–de Boor basis and a matrix-free CGLS LSQ solver, plus tests and plotting utilities.
Highlights:
[1/9] cd /home/ert/code/libneo/extra/MyMPILib && /home/ert/code/libneo/extra/MyMPILib/Scripts/do_versioning.sh
Versioning MyMPILib...
[2/9] Building Fortran preprocessed extra/MyMPILib/CMakeFiles/MyMPILib.dir/Specific/mpiprovider_module.f90-pp.f90
[3/9] Generating Fortran dyndep file extra/MyMPILib/CMakeFiles/MyMPILib.dir/Fortran.dd
[4/15] Building Fortran object extra/MyMPILib/CMakeFiles/MyMPILib.dir/Specific/mpiprovider_module.f90.o
[5/15] Building Fortran object extra/MyMPILib/CMakeFiles/MyMPILib.dir/Generic/genericWorkunit_module.f90.o
[6/15] Building Fortran object extra/MyMPILib/CMakeFiles/MyMPILib.dir/Generic/initWorkunit_module.f90.o
[7/15] Building Fortran object extra/MyMPILib/CMakeFiles/MyMPILib.dir/Internal/wuMergeWorkunit_module.f90.o
[8/15] Building Fortran object extra/MyMPILib/CMakeFiles/MyMPILib.dir/Internal/wuMergeChunk_module.f90.o
[9/15] Building Fortran object extra/MyMPILib/CMakeFiles/MyMPILib.dir/Internal/wuDataRequester_module.f90.o
[10/15] Building Fortran object extra/MyMPILib/CMakeFiles/MyMPILib.dir/Generic/scheduler_module.f90.o
[11/15] Linking Fortran static library extra/MyMPILib/libMyMPILib.a
[12/15] Building Fortran object test/CMakeFiles/test_mympilib.x.dir/source/derived_scheduler_module.f90.o
[13/15] Linking Fortran executable test/test_arnoldi.x
[14/15] Building Fortran object test/CMakeFiles/test_mympilib.x.dir/source/test_mympilib.f90.o
[15/15] Linking Fortran executable test/test_mympilib.x
cd build && ctest
Test project /home/ert/code/libneo/build
Start 1: test_binsrc
1/42 Test Make MPI optional and minimize dependencies on Fortran version and external libs #1: test_binsrc ....................... Passed 0.01 sec
Start 2: test_boozer_class
2/42 Test Integrate, document and test VMEC interfaces #2: test_boozer_class ................. Passed 0.02 sec
Start 3: test_efit_class
3/42 Test Supply latest version of all magnetic field routine variants #3: test_efit_class ................... Passed 0.02 sec
Start 4: test_geqdsk_tools
4/42 Test Ninjarun: handle Fortran module dependencies #4: test_geqdsk_tools ................. Passed 0.03 sec
Start 5: test_simpson
5/42 Test bdivfree routine: double precision should be changed back to double complex #5: test_simpson ...................... Passed 0.01 sec
Start 6: test_hdf5_tools
6/42 Test Remove aug from repository #6: test_hdf5_tools ................... Passed 2.02 sec
Start 7: test_util
7/42 Test Vmec #7: test_util ......................... Passed 0.01 sec
Start 8: test_neo_bspline_1d
8/42 Test Python VMEC magfie #8: test_neo_bspline_1d ............... Passed 0.02 sec
Start 9: test_interpolate
9/42 Test Adapting for Marconi Intel configuration #9: test_interpolate .................. Passed 1.86 sec
Start 10: test_batch_interpolate
10/42 Test Merge changes from magfie #10: test_batch_interpolate ............ Passed 1.86 sec
Start 11: test_collision_freqs
11/42 Test Move issues from magfie #11: test_collision_freqs .............. Passed 0.01 sec
Start 12: test_transport
12/42 Test Update build and CI/CD based on magfie #12: test_transport .................... Passed 0.01 sec
Start 13: test_vmec_modules
13/42 Test First eqdsk tests #13: test_vmec_modules ................. Passed 0.01 sec
Start 14: test_coordinate_systems
14/42 Test extract vacfield (and coil_tools) from MEPHIT #14: test_coordinate_systems ........... Passed 0.01 sec
Start 15: test_analytical_circular
15/42 Test Consistency check mixes geometric and logical coordinates #15: test_analytical_circular .......... Passed 0.06 sec
Start 16: test_ascot5_compare
16/42 Test Add check for existence of convexwall #16: test_ascot5_compare ............... Passed 17.08 sec
Start 17: test_arnoldi_setup
17/42 Test Evaluate equilibrium field only if specified #17: test_arnoldi_setup ................ Passed 0.35 sec
Start 18: test_arnoldi
18/42 Test Implement Biot-Savart variant #18: test_arnoldi ...................... Passed 0.06 sec
Start 19: test_mympilib
19/42 Test Replace pfile format by mgrid format #19: test_mympilib ..................... Passed 0.06 sec
Start 20: test_system_utility
20/42 Test Check maybe undefined behavior #20: test_system_utility ............... Passed 0.01 sec
Start 21: setup_test_jorek_field
21/42 Test Created converter for the conversion between poloidal and toroidal flux lable s_pol and s_tor #21: setup_test_jorek_field ............ Passed 0.01 sec
Start 30: test_jorek_field
22/42 Test Test even order in spline three to five #30: test_jorek_field .................. Passed 0.05 sec
Start 31: test_field
23/42 Test Integration test API #31: test_field ........................ Passed 0.05 sec
Start 22: cleanup_test_jorek_field
24/42 Test SPLIME #22: cleanup_test_jorek_field .......... Passed 0.01 sec
Start 23: test_biotsavart
25/42 Test Revised EQDSK class #23: test_biotsavart ................... Passed 0.39 sec
Start 24: test_example_field
26/42 Test Support CHEASE EQDSK files #24: test_example_field ................ Passed 0.01 sec
Start 25: test_biotsavart_field
27/42 Test Support FREEGS EQDSK files #25: test_biotsavart_field ............. Passed 0.02 sec
Start 26: test_mesh
28/42 Test Updated comments/naming #26: test_mesh ......................... Passed 0.01 sec
Start 27: test_field_mesh
29/42 Test branch psipol2phitor: Created converter between poloidal and toroidal #27: test_field_mesh ................... Passed 0.01 sec
Start 28: test_polylag_field
30/42 Test Add check for q profile for EQDSK #28: test_polylag_field ................ Passed 0.09 sec
Start 29: test_spline_field
31/42 Test Add routines to convert to harmonics in flux coordinates #29: test_spline_field ................. Passed 0.49 sec
Start 32: test_stretch_coords
32/42 Test Add Boozer Fourier transformations #32: test_stretch_coords ............... Passed 0.12 sec
Start 33: test_coil_tools_biot_savart
33/42 Test Add checks between Fortran and Python EQDSK routines #33: test_coil_tools_biot_savart ....... Passed 17.48 sec
Start 34: tilted_coil_generate_geometry
34/42 Test Add splines for covariant B components #34: tilted_coil_generate_geometry ..... Passed 0.14 sec
Start 35: tilted_coil_fourier_modes
35/42 Test Design MARS interface via OMFIT #35: tilted_coil_fourier_modes ......... Passed 26.39 sec
Start 36: tilted_coil_field_validation
36/42 Test Extend EQDSK interface to output B0 #36: tilted_coil_field_validation ...... Passed 3.81 sec
Start 37: test_polylag_5
37/42 Test Move all tests that depend on data in /proj/plasma to CODE #37: test_polylag_5 .................... Passed 0.01 sec
Start 38: test_poincare
38/42 Test Small workflow changes for development environment #38: test_poincare ..................... Passed 0.02 sec
Start 39: test_odeint_allroutines_context
39/42 Test Activate highest optimization for coil_tools #39: test_odeint_allroutines_context ... Passed 0.01 sec
Start 40: test_odeint_thread_safety
40/42 Test Splime2 #40: test_odeint_thread_safety ......... Passed 0.01 sec
Start 41: build_test_golden_record_odeint
41/42 Test Fix index errors in 3D splines #41: build_test_golden_record_odeint ... Passed 0.02 sec
Start 42: test_golden_record_odeint
42/42 Test Add derivatives to splines #42: test_golden_record_odeint ......... Passed 0.19 sec
100% tests passed, 0 tests failed out of 42
Label Time Summary:
biot_savart = 17.48 secproc (1 test)
build-helper = 0.02 secproc (1 test)
field = 1.25 secproc (10 tests)
magfie = 47.82 secproc (4 tests)
poincare = 0.02 secproc (1 test)
polylag = 0.01 secproc (1 test)
tilted_coil = 30.34 sec*proc (3 tests)
Total Test time (real) = 72.91 sec passes all Fortran and Python tests.
This provides a clean, textbook 1D B-spline basis and a proven matrix-free LSQ core as a foundation for future 2D/3D and batch extensions.
PR Type
Enhancement
Description
Implement 1D B-spline basis with Cox–de Boor recursion
Add matrix-free CGLS solver for least-squares fitting
Include partition of unity and LSQ accuracy tests
Provide Python visualization utility for fit results
Diagram Walkthrough
File Walkthrough
neo_bspline.f90
1D B-spline basis and matrix-free CGLS implementationsrc/interpolate/neo_bspline.f90
bspline_1dtype with degree, control points, andopen-uniform knot vector
bspline_1d_init_uniformfor knot vector initialization on [x_min,x_max]
bspline_1d_evalusing Cox–de Boor recursion for basisevaluation
bspline_1d_lsq_cglswith operator-basedA and A^T
find_span,basis_funs,apply_A,apply_ATtest_neo_bspline_1d.f90
B-spline partition of unity and LSQ accuracy teststest/source/test_neo_bspline_1d.f90
accuracy
data
plot_neo_bspline_1d.py
Python visualization utility for B-spline LSQ fitstest/scripts/plot_neo_bspline_1d.py
CMakeLists.txt
Register neo_bspline module in build systemsrc/interpolate/CMakeLists.txt
neo_bspline.f90to interpolate library buildCMakeLists.txt
Add neo_bspline test to CMake and CTesttest/CMakeLists.txt
test_neo_bspline_1d.xexecutable target