@Jclavo Jclavo commented Dec 3, 2025

No description provided.

@Jclavo Jclavo self-assigned this Dec 3, 2025
rbonifacio and others added 22 commits December 8, 2025 20:35
…a command line.

+ Several refactorings to the JSVFA and Graph-related classes.
Replace brittle file-based test discovery with robust reflection mechanism:
- Scan classpath resources (files and JARs) for MicroTestCase implementations
- Dynamically load classes by name, bypassing SBT classpath limitations
- Generalize solution across all Securibench test suites
- Fix 'Found 0 files for package' issue in SBT test execution

This resolves the test discovery problem where getJavaFilesFromPackage()
returned empty lists during SBT execution due to incomplete classpath.
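
The discovery idea described above (scan both directories and JARs on the classpath for classes in a package) can be illustrated outside the JVM. The following is only a Python sketch of the scanning step, not the Scala code from this commit; the function name `class_names_in_package` is hypothetical, and the real implementation additionally loads each class by name and keeps only MicroTestCase implementations.

```python
import zipfile
from pathlib import Path
from typing import Iterable, List


def class_names_in_package(classpath: Iterable[str], package: str) -> List[str]:
    """Return fully qualified class names directly under `package`.

    Each classpath entry may be a directory of .class files or a JAR.
    Nested and anonymous classes (names containing '$') are skipped.
    """
    prefix = package.replace(".", "/") + "/"
    found = set()
    for entry in classpath:
        path = Path(entry)
        if path.is_dir():
            # Directory entry: look for .class files in the package folder.
            package_dir = path / prefix
            if package_dir.is_dir():
                for class_file in package_dir.glob("*.class"):
                    found.add(package + "." + class_file.stem)
        elif path.is_file() and zipfile.is_zipfile(path):
            # JAR entry: JARs are zip archives, so list their members.
            with zipfile.ZipFile(path) as jar:
                for name in jar.namelist():
                    if not name.startswith(prefix):
                        continue
                    rest = name[len(prefix):]
                    if rest.endswith(".class") and "/" not in rest:
                        found.add(package + "." + rest[: -len(".class")])
    return sorted(name for name in found if "$" not in name)
```

Because the scan looks at what is actually present in each classpath entry, it does not depend on source files being visible to the build tool at test time.
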
Extract rule actions from JSVFA into separate package for better separation of concerns:
- Create RuleActions.scala with standalone rule action implementations
- Define SVFAContext trait to abstract SVFA operations needed by rule actions
- Implement ContextAwareRuleAction interface for clean context injection
- Add CopyFromBaseObjectToLocal action for base object to local taint flow
- Update RuleFactory to instantiate new standalone rule actions

This improves code maintainability by decoupling DSL rule logic from the main JSVFA class
and provides a cleaner architecture for extending taint propagation rules.
Consolidate Statement and StatementNode into single GraphNode case class:
- Remove redundant node hierarchy layers for better performance
- Add SourceLocation case class for unique conflict identification
- Implement findUniqueConflictingPaths to deduplicate conflicts by source location
- Add mergeNodesWithSameSourceLocation and getConflictSignature helpers
- Remove duplicate addEdge method and unused LambdaLabel trait

This simplification reduces memory overhead and improves conflict reporting
accuracy by avoiding duplicate paths from the same source location.
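
A minimal Python sketch of the deduplication idea follows. It is not the Scala implementation: the field names are hypothetical, and it assumes a conflict signature is the pair (location of the first node, location of the last node) of a path, which the commit message does not spell out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SourceLocation:
    # Hypothetical fields; the real case class may carry different ones.
    class_name: str
    method: str
    line: int


@dataclass(frozen=True)
class GraphNode:
    label: str  # e.g. "source", "sink" or "simple"
    statement: str
    location: SourceLocation


ConflictPath = List[GraphNode]
Signature = Tuple[SourceLocation, SourceLocation]


def conflict_signature(path: ConflictPath) -> Signature:
    """Assumed signature: where the path starts and where it ends."""
    return (path[0].location, path[-1].location)


def find_unique_conflicting_paths(paths: List[ConflictPath]) -> List[ConflictPath]:
    """Keep the first path seen for each signature, preserving input order."""
    unique: Dict[Signature, ConflictPath] = {}
    for path in paths:
        if path:
            unique.setdefault(conflict_signature(path), path)
    return list(unique.values())
```

Two paths that start and end at the same source locations but pass through different intermediate nodes collapse into one reported conflict.
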
Fix duplicate rule names that prevented DSL parsing:
- Rename cookieMethods → cookieGetName, cookieGetValue, cookieGetComment
- Rename sessionMethods → setAttributeOfSession, getAttributeOfSession
- This resolves 'FAILURE: end of input expected' and 'Loaded 0 method rules'

Add String.concat() taint propagation support:
- Add stringConcat rule using CopyFromMethodCallToLocal() action
- Create Basic22Concat test to verify taint flow through s.concat('abc')
- Confirm String.concat() uses direct char array manipulation, not StringBuilder

Both String concatenation methods now supported: '+' operator and .concat() method.
Problem: Inter1 and most interprocedural tests failing (1/14 → 9/14 passing)
Root cause: Spark cs-demand:true not loading method bodies for private methods

Spark configuration improvements:
- Disable on-demand analysis (cs-demand:false) for complete call graph
- Add simulate-natives:true and simple-edges-bidirectional:false
- Ensure all reachable methods have proper call graph edges

JSVFA interprocedural analysis enhancements:
- Add fallback to force retrieve active body when missing
- Remove hasActiveBody check that was skipping valid methods
- Improve error handling for phantom vs. missing body methods
- Refactor method organization and add comprehensive documentation

Results:
- Inter1 test: 0/1 → 1/1 conflicts detected ✅
- Overall Inter suite: 1/14 → 9/14 tests passing (pass rate 7% → 64%)
- Fixed: Inter1, Inter2, Inter3, Inter8, Inter10, Inter11, Inter13, Inter14
- Interprocedural taint flow now works for private intra-class method calls
Update MethodBasedSVFATest to work with new GraphNode structure:
- Adapt to simplified node hierarchy (Statement/StatementNode → GraphNode)
- Fix compilation errors from SVFAContext interface changes
- Ensure test compatibility with refactored JSVFA architecture

This resolves compilation issues that arose from the DSL rule action
refactoring and graph node hierarchy simplification.
This commit introduces major improvements to the Securibench testing infrastructure,
making it more user-friendly, flexible, and comprehensive. In particular, it
decouples the metrics generation from test execution.
- Enhanced run-securibench-tests.sh to accept callgraph parameter
  - Supports spark (default), cha, and spark_library algorithms
  - Usage: ./scripts/run-securibench-tests.sh [suite] [callgraph] [clean|--help]
  - Passes -Dsecuribench.callgraph=<algorithm> to SBT for configuration

- Enhanced compute-securibench-metrics.sh to accept callgraph parameter
  - Auto-executes missing tests with specified call graph algorithm
  - Includes call graph algorithm in output filenames for differentiation
  - Usage: ./scripts/compute-securibench-metrics.sh [suite] [callgraph] [clean|--help]

- Updated documentation in README.md and USAGE_SCRIPTS.md
  - Added call graph algorithm descriptions and usage examples
  - Documented performance trade-offs between algorithms
  - Updated output file naming conventions

- Improved error handling with validation for call graph algorithms
- Maintains backward compatibility (defaults to SPARK call graph)

This enables easy comparison of SVFA analysis results across different
call graph algorithms for research and performance evaluation.
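
As a rough illustration of the validation and property passing described above, the sketch below builds an SBT command line in Python. Only the `-Dsecuribench.callgraph=<algorithm>` property and the three algorithm names come from this commit; the function name and the `testOnly` pattern are hypothetical placeholders, and the actual script is written in bash.

```python
from typing import List

# Algorithms accepted at this point in the history (RTA and VTA come later).
SUPPORTED_CALLGRAPHS = ("spark", "cha", "spark_library")


def build_sbt_command(suite: str, callgraph: str = "spark") -> List[str]:
    """Validate the call graph name and build the SBT argument list."""
    normalized = callgraph.lower()
    if normalized not in SUPPORTED_CALLGRAPHS:
        allowed = ", ".join(SUPPORTED_CALLGRAPHS)
        raise ValueError(f"Unknown call graph '{callgraph}'. Expected one of: {allowed}")
    return [
        "sbt",
        f"-Dsecuribench.callgraph={normalized}",
        # Hypothetical test selector; the real script chooses its own pattern.
        f"testOnly *{suite}*",
    ]
```

Defaulting `callgraph` to `spark` mirrors the backward-compatible behaviour noted above: callers that pass only a suite name get the same command as before.
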
- Extended CallGraphAlgorithm enum to include RTA and VTA variants
  - Added RTA (Rapid Type Analysis) via SPARK with rta:true option
  - Added VTA (Variable Type Analysis) via SPARK with vta:true option
  - Added descriptions and performance characteristics for each algorithm

- Updated Soot configuration classes to support RTA/VTA
  - Enhanced ConfigurableJavaSootConfiguration with RTA/VTA phase options
  - Updated legacy JavaSootConfiguration with RTA/VTA case objects
  - Proper SPARK option configuration for each algorithm variant

- Enhanced test scripts with RTA/VTA support
  - Updated run-securibench-tests.sh to accept rta/vta parameters
  - Updated compute-securibench-metrics.sh with matching support
  - Improved error handling and validation for all 5 algorithms

- Comprehensive documentation updates
  - Updated README.md and USAGE_SCRIPTS.md with RTA/VTA examples
  - Created CALL_GRAPH_ALGORITHMS.md with detailed algorithm comparison
  - Added performance characteristics and usage guidelines
  - Documented precision vs. performance trade-offs

- Algorithm characteristics:
  - CHA: Fastest, least precise (class hierarchy only)
  - RTA: Fast, moderate precision (instantiated types)
  - VTA: Balanced speed/precision (field-based analysis)
  - SPARK: High precision, slower (full points-to analysis)
  - SPARK_LIBRARY: Most comprehensive, slowest (with libraries)

This provides researchers and developers with flexible call graph options
for different analysis scenarios, from quick prototyping (CHA/RTA) to
high-precision research analysis (SPARK/SPARK_LIBRARY).
- Created run_securibench_tests.py as Python alternative to run-securibench-tests.sh
  - Enhanced error handling with proper exception management
  - Colored terminal output with ANSI codes for better UX
  - Verbose mode with detailed progress information
  - Cross-platform compatibility (Windows/macOS/Linux)
  - Structured code with type hints and clear organization
  - Same command-line interface as bash version for compatibility

- Created compute_securibench_metrics.py as Python alternative to compute-securibench-metrics.sh
  - Automatic test execution for missing results
  - Native JSON processing for test result parsing
  - Built-in CSV generation with proper formatting
  - Rich console output with formatted metrics tables
  - Better error handling and timeout management
  - Structured metrics computation with TestResult and SuiteMetrics classes

- Key advantages of Python versions:
  - Maintainability: Clear class hierarchies, structured functions
  - Error handling: Proper exceptions vs bash error codes
  - Cross-platform: Works identically on all platforms
  - Features: Colored output, verbose mode, better argument parsing
  - Testing: Easy to unit test individual components
  - IDE support: Full autocomplete, debugging, refactoring

- Minimal dependencies approach:
  - Only Python 3.6+ standard library required
  - No external packages needed for core functionality
  - Optional enhancements can be added later (tqdm, etc.)

- Comprehensive documentation:
  - Added PYTHON_SCRIPTS.md with detailed comparison and usage
  - Updated README.md to showcase both bash and Python options
  - Updated USAGE_SCRIPTS.md with version comparison table
  - Included migration strategy and future enhancement plans

Both bash and Python versions coexist, allowing users to choose based on
their preferences and requirements. Python versions recommended for new
users and cross-platform deployments.
- Fixed line 337 in run_securibench_tests.py where {args.callgraph} was not
  being interpolated due to missing 'f' prefix
- Now correctly displays call graph algorithm name in output message
- Example: 'using rta call graph' instead of 'using {args.callgraph} call graph'
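
The bug class fixed here is easy to reproduce in isolation. The snippet below is a standalone illustration, with `args` built by hand rather than by the script's argument parser.

```python
from types import SimpleNamespace

# Stand-in for the parsed command-line arguments.
args = SimpleNamespace(callgraph="rta")

# Without the 'f' prefix the braces are ordinary characters.
broken = "using {args.callgraph} call graph"

# With the 'f' prefix the expression inside the braces is evaluated.
fixed = f"using {args.callgraph} call graph"

print(broken)  # using {args.callgraph} call graph
print(fixed)   # using rta call graph
```
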
- Added count_test_results() function to parse JSON result files
- Enhanced execute_suite() to report both passed and failed test counts
- Updated output format from '14 tests executed in 16s using spark call graph'
  to '14 tests executed in 16s using spark call graph (9 passed, 5 failed)'
- Added JSON import for parsing test result files
- Improved visibility into test execution outcomes for better debugging

This provides immediate feedback on test success rates without needing
to run the separate metrics computation step.
…cripts

- Fixed count_test_results() in run_securibench_tests.py to calculate pass/fail
  based on expectedVulnerabilities == foundVulnerabilities comparison
- Fixed TestResult.from_json() in compute_securibench_metrics.py with same logic
- The JSON files don't contain a 'passed' field, so we need to derive it

Before fix:
- Inter suite: 14 tests executed (0 passed, 14 failed) ❌ WRONG
- Metrics showed all tests as failed regardless of actual results

After fix:
- Inter suite: 14 tests executed (9 passed, 5 failed) ✅ CORRECT
- Metrics now accurately reflect SVFA analysis results

This provides accurate immediate feedback on test success rates and ensures
metrics computation reflects the true analysis quality.
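
A minimal sketch of the derived pass/fail logic, assuming one JSON result file per test in a directory. The two field names come from the commit message; the directory layout and the handling of missing fields (counted as failed) are assumptions, so this is not necessarily what `count_test_results()` does in the script.

```python
import json
from pathlib import Path
from typing import Tuple


def count_test_results(results_dir: Path) -> Tuple[int, int]:
    """Return (passed, failed) for all JSON result files in `results_dir`.

    A test passes when the number of vulnerabilities found equals the
    number expected; the result files carry no explicit 'passed' field.
    """
    passed = 0
    failed = 0
    for result_file in sorted(results_dir.glob("*.json")):
        with result_file.open(encoding="utf-8") as handle:
            data = json.load(handle)
        expected = data.get("expectedVulnerabilities")
        found = data.get("foundVulnerabilities")
        # Assumption: a file missing either field counts as a failure.
        if expected is not None and expected == found:
            passed += 1
        else:
            failed += 1
    return passed, failed
```

Reading a non-existent `passed` key with a default of false is what produces a "0 passed, 14 failed" report like the one shown above, which is why the value has to be computed from the two counts.
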
…e analysis

This enhancement allows users to execute tests and compute metrics across all 5
call graph algorithms (CHA, RTA, VTA, SPARK, SPARK_LIBRARY) in a single command.

Key Features:
✨ Execute all test suites with all call graph algorithms sequentially
✨ Generate two CSV reports: detailed (per-test) and aggregate (per-suite)
✨ Progress indicators showing current call graph being processed
✨ Execution order: CHA → RTA → VTA → SPARK → SPARK_LIBRARY (fastest to slowest)
✨ Stop execution on first failure for reliability
✨ Can be combined with the --clean option for a fresh analysis

Usage:
  python3 scripts/run_securibench_tests.py --all-call-graphs
  python3 scripts/compute_securibench_metrics.py --all-call-graphs

Output Files:
- securibench-all-callgraphs-detailed-YYYYMMDD-HHMMSS.csv
- securibench-all-callgraphs-aggregate-YYYYMMDD-HHMMSS.csv

Performance:
- ~15-25 minutes total execution time (5x longer than single call graph)
- Comprehensive comparison of all algorithms' precision and performance
- Ideal for research and algorithm evaluation

Both scripts support the new option with consistent behavior, auto-execution
of missing tests, and comprehensive error handling.
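
The control flow described above (fixed execution order, progress indicator, stop on first failure, timestamped report names) might look roughly like the following. The function names and the `run_one` callback are hypothetical; only the algorithm order and the output filename patterns are taken from this commit message.

```python
from datetime import datetime
from typing import Callable, List, Optional, Tuple

# Execution order from fastest to slowest, as listed above.
CALL_GRAPHS = ["cha", "rta", "vta", "spark", "spark_library"]


def run_all_call_graphs(run_one: Callable[[str], bool]) -> List[str]:
    """Run `run_one` for each call graph and stop at the first failure.

    Returns the call graphs that completed successfully.
    """
    completed: List[str] = []
    for index, callgraph in enumerate(CALL_GRAPHS, start=1):
        print(f"[{index}/{len(CALL_GRAPHS)}] Processing call graph: {callgraph}")
        if not run_one(callgraph):
            print(f"Stopping: execution failed for {callgraph}")
            break
        completed.append(callgraph)
    return completed


def report_filenames(now: Optional[datetime] = None) -> Tuple[str, str]:
    """Build the detailed and aggregate CSV names with a shared timestamp."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return (
        f"securibench-all-callgraphs-detailed-{stamp}.csv",
        f"securibench-all-callgraphs-aggregate-{stamp}.csv",
    )
```

Running the cheap algorithms first means a configuration problem tends to surface within the first minutes rather than after the slowest run.
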
This comprehensive enhancement modernizes the SVFA testing infrastructure with
unified configuration and performance tracking capabilities.

🔧 UNIFIED CONFIGURATION SYSTEM:
✨ Introduced SVFAConfig case class replacing disparate trait-based configuration
✨ Centralized all SVFA settings: interprocedural, field sensitivity, taint propagation
✨ Added CallGraphAlgorithm sealed trait supporting CHA, RTA, VTA, SPARK, SPARK_LIBRARY
✨ Implemented ConfigurableAnalysis trait for runtime configuration changes
✨ Maintained full backward compatibility with existing trait-based code

🔧 CALL GRAPH ALGORITHM SUPPORT:
✨ Extended JavaSootConfiguration with RTA and VTA algorithm support
✨ Added proper Soot configuration for all 5 call graph algorithms
✨ Implemented ConfigurableJavaSootConfiguration for dynamic call graph selection
✨ Added comprehensive call graph configuration documentation

🔧 EXECUTION TIME METRICS:
✨ Added TotalExecutionTimeMs and AvgExecutionTimeMs columns to all aggregate CSV reports
✨ Enhanced TestResult class to properly extract and store execution time from JSON data
✨ Updated both run_securibench_tests.py and compute_securibench_metrics.py scripts
✨ Enables performance comparison between call graph algorithms (e.g., RTA ~2.7x slower than CHA)
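
A small sketch of how the two timing columns could be computed for one suite and call graph. Only the column names `TotalExecutionTimeMs` and `AvgExecutionTimeMs` come from this commit; the function name, the other columns, and the rounding to one decimal place are assumptions.

```python
from typing import Dict, List, Union

Number = Union[int, float]


def aggregate_timing(suite: str, callgraph: str, times_ms: List[Number]) -> Dict[str, object]:
    """Build one aggregate row from per-test execution times in milliseconds."""
    total = sum(times_ms)
    # Guard against an empty suite so the average never divides by zero.
    average = total / len(times_ms) if times_ms else 0.0
    return {
        "Suite": suite,            # hypothetical column name
        "CallGraph": callgraph,    # hypothetical column name
        "Tests": len(times_ms),    # hypothetical column name
        "TotalExecutionTimeMs": total,
        "AvgExecutionTimeMs": round(average, 1),
    }
```

Dividing the totals of two such rows for the same suite gives a slowdown factor between two call graph algorithms.
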

🔧 CALL GRAPH RESULT ISOLATION:
✨ Fixed --all-call-graphs feature by isolating results per call graph algorithm
✨ Modified TestResultStorage to save results in call-graph-specific directories
✨ Results now saved to: target/test-results/{callgraph}/securibench/micro/{suite}/
✨ Prevents result overwriting when running multiple call graphs sequentially
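
The directory layout above can be expressed as a small path helper. This is a Python sketch of the layout only; the function name is hypothetical and the actual change lives in the Scala TestResultStorage.

```python
from pathlib import Path


def results_dir(callgraph: str, suite: str, root: Path = Path("target/test-results")) -> Path:
    """Return the call-graph-specific directory for one suite's results."""
    return root / callgraph / "securibench" / "micro" / suite


# Results for the same suite no longer share a directory across call graphs.
print(results_dir("cha", "inter").as_posix())
print(results_dir("spark", "inter").as_posix())
```

With the call graph name as a path component, a sequential run over several algorithms writes each algorithm's results to its own directory instead of overwriting the previous run's.
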

🔧 ENHANCED TEST INFRASTRUCTURE:
✨ Updated JSVFATest, LineBasedSVFATest, MethodBasedSVFATest for new configuration
✨ Created SecuribenchConfig for command-line and environment variable configuration
✨ Added ConfigurableSecuribenchTest for flexible test configuration
✨ Implemented comprehensive test suites for configuration validation

🔧 COMPREHENSIVE DOCUMENTATION & CLEANUP:
✨ Created CALL_GRAPH_CONFIGURATION.md with algorithm comparison and usage guide
✨ Added CONFIGURATION_MODERNIZATION.md documenting the design and migration path
✨ Updated .gitignore to exclude sootOutput/, generated CSV files, Python cache, IDE artifacts

📊 RESEARCH IMPACT:
- Enables systematic comparison of call graph algorithms (accuracy AND performance)
- Supports reproducible research with consistent configuration
- Provides foundation for advanced SVFA research and experimentation
- Facilitates easy algorithm selection via command line or environment variables
- Comprehensive CSV reports support algorithm evaluation and research

This modernization maintains full backward compatibility while providing a robust
foundation for future SVFA research and development with comprehensive performance analysis.
@Jclavo Jclavo marked this pull request as ready for review December 24, 2025 22:03
3 participants