Berisch/Science

Ratio Optimization Tool

A Python-based tool for selecting the optimal ratio between indicators based on configurable criteria for powder/material characterization.

Overview

This tool analyzes multiple samples with different mixing ratios and evaluates them against optimal criteria for various physical and chemical parameters. It provides:

  • Discrete Optimization: Select the best ratio from measured data points (1:1, 1:2, 1:3, etc.)
  • Continuous Optimization: Find the true optimal ratio through interpolation (e.g., 1:4.268, 1:1.67)
  • Automated scoring based on proximity to optimal values
  • Detailed reports with parameter-by-parameter analysis
  • Visual comparisons through multiple chart types
  • Flexible configuration without hardcoded parameters
  • International data support: Handles both US (0.5) and European (0,5) decimal formats

Results

Discrete Optimization (table1.tsv)

Selects best from existing ratios:

  • Best Ratio: 1:1 with a score of 0.8661

Continuous Optimization (table2.tsv)

Finds optimal ratio through interpolation:

  • Optimal Ratio: 1:4.268 with a score of 0.4280
  • Improvement: +1.2% over best discrete ratio (1:4 at 0.4228)
  • Method: Automatic interpolation with bounded optimization
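The improvement figure above is consistent with a plain relative difference between the two scores. A minimal sketch, assuming that definition (the function name `improvement_pct` is hypothetical, not taken from the scripts):

```python
def improvement_pct(continuous_score, discrete_score):
    """Relative gain of the continuous optimum over the best discrete score, in percent."""
    return 100.0 * (continuous_score - discrete_score) / discrete_score

# Scores reported above for table2.tsv
print(f"{improvement_pct(0.4280, 0.4228):+.1f}%")  # prints "+1.2%"
```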

The continuous optimizer can return any ratio within the configured search range (e.g., 1:0.5, 1:1.67, 1:3.2), not just the discrete measured points.

Project Structure

Science/
├── table1.tsv                               # Input data file (US format)
├── table2.tsv                               # Input data file (European format)
├── config.yaml                              # Optimization criteria (configurable)
├── optimize_ratios.py                       # Discrete optimization script
├── continuous_optimizer.py                  # Continuous optimization script (NEW!)
├── visualize_results.py                     # Discrete visualization module
├── visualize_continuous.py                  # Continuous visualization module (NEW!)
├── requirements.txt                         # Python dependencies
├── README.md                                # This file
├── .venv/                                   # Virtual environment (created by uv)
└── results/                                 # Output directory
    ├── optimization_report.txt              # Discrete optimization report
    ├── optimization_results.csv             # Discrete results (CSV)
    ├── optimization_results.json            # Discrete results (JSON)
    ├── continuous_optimization_report.txt   # Continuous optimization report (NEW!)
    ├── continuous_optimization_results.json # Continuous results (JSON) (NEW!)
    └── plots/                               # Visualization directory
        ├── radar_chart.png                  # Discrete: radar comparison
        ├── score_comparison.png             # Discrete: score bars
        ├── heatmap.png                      # Discrete: score heatmap
        ├── parameter_distributions.png      # Discrete: parameter values
        ├── parameter_curves.png             # Continuous: fitted curves (NEW!)
        ├── score_landscape.png              # Continuous: score vs ratio (NEW!)
        ├── optimal_ratio_summary.png        # Continuous: dashboard (NEW!)
        └── validation_results.png           # Continuous: accuracy (NEW!)

Setup

Prerequisites

  • Python 3.8 or higher
  • uv (fast Python package installer)

Installation

  1. The virtual environment has already been created. To activate it:

Windows:

.venv\Scripts\activate

Linux/Mac:

source .venv/bin/activate
  2. If you need to install dependencies again:

uv pip install -r requirements.txt

Usage

Method 1: Discrete Optimization (Select from measured ratios)

1. Configure Optimal Criteria

Edit config.yaml to adjust optimal values for each parameter:

criteria:
  bulk_density:
    type: exact           # Score based on exact target
    target: 0.5
    weight: 1.0

  carrs_index:
    type: threshold       # Score based on maximum threshold
    max: 25
    weight: 1.0

  moisture_content:
    type: range           # Score based on acceptable range
    min: 0.5
    max: 3.5
    weight: 1.0

Parameter Types:

  • exact: Parameters with a specific target value (e.g., bulk density = 0.5)
  • threshold: Parameters with a maximum limit (e.g., Carr's index < 25)
  • range: Parameters with an acceptable range (e.g., moisture content: 0.5-3.5%)

Weights: Adjust the importance of each parameter (default: 1.0)
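How the weights enter the overall score is not spelled out here. A minimal sketch, assuming the overall score is a weighted mean of per-parameter scores that each lie in [0, 1] (the function name `overall_score` and the example scores are hypothetical):

```python
def overall_score(scores, weights):
    """Weighted mean of per-parameter scores (assumed aggregation, not confirmed by the scripts)."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(scores[name] * weights[name] for name in scores) / total_weight

# Made-up per-parameter scores for one ratio
scores = {"bulk_density": 0.9, "moisture_content": 0.5}

equal = overall_score(scores, {"bulk_density": 1.0, "moisture_content": 1.0})
skewed = overall_score(scores, {"bulk_density": 2.0, "moisture_content": 0.5})
print(equal, skewed)
```

Under this assumption, raising the weight of a well-scoring parameter pulls the overall score toward that parameter's score.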

2. Run Discrete Optimization

python optimize_ratios.py
# or
.venv\Scripts\python.exe optimize_ratios.py

This will:

  • Read data from the configured TSV file (default: table2.tsv)
  • Calculate scores for each measured ratio
  • Select the best ratio from available data points
  • Generate detailed reports in results/

3. Generate Discrete Visualizations

python visualize_results.py

Creates four types of plots:

  1. Radar Chart: Multi-dimensional parameter comparison
  2. Score Comparison: Bar charts with scores and violations
  3. Heatmap: Color-coded matrix of all parameter scores
  4. Parameter Distributions: Values vs. optimal ranges

Method 2: Continuous Optimization (Find true optimal ratio)

1. Configure Interpolation Settings

In config.yaml, adjust interpolation settings:

interpolation:
  method: auto              # auto, cubic, pchip, linear, rbf, spline
  search_range: [0.5, 6.0]  # Min and max ratio to search
  extrapolation_penalty: 0.1 # Penalty outside observed range

continuous_optimization:
  method: bounded           # bounded or global
  tolerance: 1e-5
  validate_models: true

2. Run Continuous Optimization

python continuous_optimizer.py
# or
.venv\Scripts\python.exe continuous_optimizer.py

This will:

  • Fit interpolation curves to each parameter
  • Search for the optimal ratio continuously (not just discrete points)
  • Validate interpolation accuracy with cross-validation
  • Find ratios like 1:4.268, 1:1.67, etc. (not limited to 1:1, 1:2, 1:3...)
  • Generate continuous optimization reports

Output:

Optimal Ratio: 1:4.268
Optimal Score: 0.4280
Improvement: +1.2% over best discrete ratio

3. Generate Continuous Visualizations

python visualize_continuous.py

Creates four advanced plots:

  1. Parameter Curves: Shows fitted interpolation curves vs. actual data
  2. Score Landscape: Complete score profile across all ratios
  3. Optimal Ratio Summary: Comprehensive dashboard with multiple views
  4. Validation Results: Interpolation accuracy metrics

Quick Start Examples

Run everything:

# Discrete optimization
.venv\Scripts\python.exe optimize_ratios.py
.venv\Scripts\python.exe visualize_results.py

# Continuous optimization
.venv\Scripts\python.exe continuous_optimizer.py
.venv\Scripts\python.exe visualize_continuous.py

Using uv:

uv run continuous_optimizer.py
uv run visualize_continuous.py

Input Data Format

The TSV files should be tab-separated with:

  • First column: Parameter names
  • Subsequent columns: Sample data with ratios in parentheses (e.g., "No1(1:1)")
  • Values: Can be either:
    • Mean ± standard deviation (e.g., "0.51 ± 0.01" or "0,51 ± 0,01")
    • Single values (e.g., "2.5" or "2,5")
  • Decimal format: Automatically handles both US (period) and European (comma) formats

Example (US format):

Parameter                          No1(1:1)      No2(1:2)
Bulk density (ρ_b), g/ml          0.51 ± 0.01   0.48 ± 0.02
Hausner ratio                      1.03 ± 0.01   1.15 ± 0.02

Example (European format):

Parameter                          No1(1:1)      No2(1:2)
Bulk density (ρ_b), g/ml          0,51 ± 0,01   0,48 ± 0,02
Hausner ratio                      1,03 ± 0,01   1,15 ± 0,02

Both formats are supported automatically - no configuration needed!
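The parsing rules above can be sketched as follows. This is an illustration only, not the code used in the scripts: the function names `parse_value` and `parse_ratio` are hypothetical, and the comma handling assumes that cells never contain thousands separators.

```python
import re

def parse_value(cell):
    """Parse '0.51 ± 0.01', '0,51 ± 0,01', '2.5' or '2,5' into (mean, std or None)."""
    parts = [p.strip().replace(",", ".") for p in cell.split("±")]
    mean = float(parts[0])
    std = float(parts[1]) if len(parts) > 1 else None
    return mean, std

def parse_ratio(header):
    """Return the ratio in a header such as 'No2(1:2)' as a single number (second / first)."""
    match = re.search(r"\((\d+(?:[.,]\d+)?):(\d+(?:[.,]\d+)?)\)", header)
    if match is None:
        raise ValueError(f"no ratio found in {header!r}")
    first, second = (float(g.replace(",", ".")) for g in match.groups())
    return second / first

print(parse_value("0,51 ± 0,01"))  # (0.51, 0.01)
print(parse_value("2.5"))          # (2.5, None)
print(parse_ratio("No2(1:2)"))     # 2.0
```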

Scoring System

Exact Target Scoring

For parameters with exact targets, scores follow a Gaussian-shaped decay: they fall off exponentially with the squared distance from the target:

score = exp(-sensitivity × (value - target)²)
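The formula translates directly into code. In this sketch the default `sensitivity` is the `exact_sensitivity` value from the example configuration; whether the scripts normalize the deviation before scoring is not stated, so raw units are assumed.

```python
import math

def exact_score(value, target, sensitivity=10.0):
    """score = exp(-sensitivity * (value - target)^2), equal to 1.0 exactly at the target."""
    return math.exp(-sensitivity * (value - target) ** 2)

print(exact_score(0.50, 0.5))  # 1.0
print(exact_score(0.80, 0.5))  # lower, because the value is far from the target
```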

Threshold Scoring

For "less than" criteria:

  • Full score (1.0) if value ≤ threshold
  • Linear penalty if value > threshold
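The exact form of the linear penalty is not given. One plausible reading, shown here purely as an assumption, subtracts `penalty` times the relative excess over the threshold and floors the result at 0 (the function name `threshold_score` is hypothetical):

```python
def threshold_score(value, max_value, penalty=0.5):
    """Full score at or below the threshold; assumed linear penalty on the relative excess above it."""
    if value <= max_value:
        return 1.0
    relative_excess = (value - max_value) / max_value
    return max(0.0, 1.0 - penalty * relative_excess)

print(threshold_score(20, 25))  # 1.0, within the limit
print(threshold_score(30, 25))  # about 0.9: 20% over the limit, penalty factor 0.5
```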

Range Scoring

For acceptable ranges:

  • Full score (1.0) if within range
  • Exponential penalty based on distance outside range
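A minimal sketch of range scoring, under the assumption that the distance outside the range is expressed relative to the range width and fed into a plain exponential decay. The scripts may define the distance differently; the function name `range_score` is hypothetical.

```python
import math

def range_score(value, min_value, max_value, sensitivity=5.0):
    """1.0 inside [min_value, max_value]; assumed exp(-sensitivity * relative distance) outside."""
    if min_value <= value <= max_value:
        return 1.0
    distance = (min_value - value) if value < min_value else (value - max_value)
    relative_distance = distance / (max_value - min_value)
    return math.exp(-sensitivity * relative_distance)

print(range_score(2.0, 0.5, 3.5))  # 1.0, inside the moisture content range
print(range_score(4.1, 0.5, 3.5))  # lower, 0.6 above the upper bound
```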

Continuous Optimization Methods

The continuous optimizer uses interpolation to predict parameter values at any ratio:

Interpolation Methods

Auto Selection (default): Automatically chooses the best method per parameter:

  • PCHIP (Piecewise Cubic Hermite Interpolating Polynomial): For monotonic parameters
  • Cubic Spline: For smooth, non-monotonic parameters
  • Linear: For datasets with < 4 points
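The selection rules listed above could be expressed roughly as follows. This is a sketch of the stated rules only; the function name `choose_method` is hypothetical, and the real selection may apply further checks (for example on variability).

```python
def choose_method(values):
    """Pick an interpolation method name for one parameter, given values ordered by ratio."""
    if len(values) < 4:
        return "linear"
    diffs = [b - a for a, b in zip(values, values[1:])]
    monotonic = all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)
    return "pchip" if monotonic else "cubic"

print(choose_method([0.51, 0.48, 0.45, 0.40, 0.38]))  # pchip (steadily decreasing)
print(choose_method([1.0, 3.0, 2.0, 4.0]))            # cubic (changes direction)
print(choose_method([1.0, 2.0, 3.0]))                 # linear (fewer than 4 points)
```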

Manual Methods (configurable in config.yaml):

  • cubic: Cubic spline interpolation (smooth curves)
  • pchip: Monotonicity-preserving interpolation
  • linear: Linear interpolation (simple, stable)
  • rbf: Radial basis functions (for noisy data)
  • spline: Smoothing spline (adjustable smoothing)

Optimization Strategy

  1. Fit Models: Interpolation curves fitted to each parameter
  2. Predict Values: Calculate parameter values at any ratio
  3. Optimize: Use scipy.optimize to find ratio with maximum score
  4. Validate: Cross-validation ensures interpolation accuracy
  5. Extrapolation Handling: Penalties applied for ratios outside observed range
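Steps 1 to 3 can be illustrated for a single parameter with SciPy. The sketch below is not the optimizer's actual code: the data are made up, only one exact-target parameter is scored, and the target value is arbitrary. It fits a PCHIP curve and maximizes the score with `scipy.optimize.minimize_scalar` in bounded mode, restricted to the observed range so that no extrapolation occurs.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator
from scipy.optimize import minimize_scalar

# Made-up measurements: ratio 1:r on the x-axis, one parameter on the y-axis
ratios = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
bulk_density = np.array([0.40, 0.45, 0.50, 0.55, 0.60])

# Step 1: fit an interpolation model
model = PchipInterpolator(ratios, bulk_density)

# Step 2: predict the parameter at any ratio and score it against an exact target
def score(ratio, target=0.52, sensitivity=10.0):
    return float(np.exp(-sensitivity * (model(ratio) - target) ** 2))

# Step 3: maximize the score by minimizing its negative within the observed range
result = minimize_scalar(
    lambda r: -score(r),
    bounds=(float(ratios.min()), float(ratios.max())),
    method="bounded",
    options={"xatol": 1e-5},
)
optimal_ratio = result.x
optimal_score = -result.fun
print(f"Optimal ratio: 1:{optimal_ratio:.3f}, score {optimal_score:.4f}")
```

A bounded search finds one local optimum; a landscape with several maxima would need a scan over the range or a global method.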

Validation

Leave-One-Out Cross-Validation:

  • Removes one data point
  • Fits model on remaining points
  • Predicts the removed point
  • Calculates prediction error
  • Repeats for all points

Metrics:

  • Mean Absolute Error (MAE)
  • Relative Error (% of mean value)
  • Root Mean Square Error (RMSE)
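The procedure and metrics above can be sketched in plain Python. For brevity this example uses piecewise-linear interpolation instead of the fitted model, extends the end segments when an endpoint is left out, and takes the relative error as MAE divided by the mean of the observed values; all three choices are assumptions, as are the function names.

```python
import math

def interp_linear(xs, ys, x):
    """Piecewise-linear prediction; end segments are extended for x outside [xs[0], xs[-1]]."""
    i = 1
    while i < len(xs) - 1 and xs[i] < x:
        i += 1
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def loo_cv(xs, ys):
    """Leave-one-out cross-validation over sorted ratios xs (lists with at least 3 points)."""
    errors = []
    for k in range(len(xs)):
        train_x = xs[:k] + xs[k + 1:]   # remove one data point
        train_y = ys[:k] + ys[k + 1:]
        predicted = interp_linear(train_x, train_y, xs[k])  # predict the removed point
        errors.append(predicted - ys[k])
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    relative = 100.0 * mae / abs(sum(ys) / len(ys))
    return {"mae": mae, "rmse": rmse, "relative_error_pct": relative}

# Made-up data with a bump at ratio 1:3
metrics = loo_cv([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 2.0, 4.0, 4.0, 5.0])
print(metrics)
```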

Output Files

Discrete Optimization Outputs

optimization_report.txt: Human-readable text report with:

  • Best ratio identification
  • Complete rankings
  • Detailed parameter scores
  • Status indicators (+ excellent, ~ acceptable, - poor)

optimization_results.csv: Spreadsheet-compatible format with:

  • All parameter values
  • Individual parameter scores
  • Overall scores for each ratio

optimization_results.json: Machine-readable format containing:

  • Complete scoring details
  • Configuration used
  • Statistical information

Continuous Optimization Outputs

continuous_optimization_report.txt: Comprehensive report including:

  • Optimal ratio (e.g., 1:4.268)
  • Predicted parameter values at optimal ratio
  • Comparison with best discrete ratio
  • Improvement percentage
  • Score landscape analysis (local maxima)
  • Interpolation validation metrics
  • Extrapolation warnings (if applicable)

continuous_optimization_results.json: Machine-readable format with:

  • Optimal ratio and score
  • Predicted parameters with extrapolation flags
  • Validation metrics per parameter
  • Observed ratio range
  • Improvement over discrete optimization

Visualization Outputs

Discrete Plots (results/plots/):

  • radar_chart.png: Multi-parameter comparison
  • score_comparison.png: Score bars and violations
  • heatmap.png: Parameter score matrix
  • parameter_distributions.png: Values vs. targets

Continuous Plots (results/plots/):

  • parameter_curves.png: Fitted curves vs. actual data points
  • score_landscape.png: Total score vs. ratio + gradient
  • optimal_ratio_summary.png: Comprehensive dashboard
  • validation_results.png: Interpolation accuracy metrics

Customization

Adjust Scoring Sensitivity

In config.yaml:

scoring:
  exact_sensitivity: 10.0      # Higher = stricter for exact targets
  range_sensitivity: 5.0        # Higher = stricter for ranges
  threshold_penalty: 0.5        # Penalty factor for threshold violations

Modify Parameter Weights

Give more importance to specific parameters:

criteria:
  bulk_density:
    weight: 2.0    # Double importance
  moisture_content:
    weight: 0.5    # Half importance

Configure Continuous Optimization

Control interpolation and optimization behavior:

interpolation:
  method: auto              # auto, cubic, pchip, linear, rbf, spline
  search_range: [0.5, 6.0]  # Ratio search bounds
  extrapolation_penalty: 0.1 # Penalty for extrapolation
  polynomial_degree: 3       # For polynomial fitting
  smoothing_factor: 0.1      # For spline smoothing

continuous_optimization:
  method: bounded            # bounded (fast) or global (thorough)
  tolerance: 1e-5            # Convergence tolerance
  max_iterations: 1000       # Maximum optimization iterations
  validate_models: true      # Run cross-validation
  bootstrap_samples: 100     # For uncertainty estimation

Interpolation Methods:

  • auto: Automatically selects best method per parameter
  • cubic: Smooth cubic splines
  • pchip: Preserves monotonicity
  • linear: Simple linear interpolation
  • rbf: Radial basis functions
  • spline: Smoothing spline

Optimization Methods:

  • bounded: Fast local optimization within the search range (recommended)
  • global: Slower, but searches the whole range for the global optimum

Add New Parameters

  1. Add parameter to your TSV file
  2. Add criteria to config.yaml:
  your_new_parameter:
    type: exact     # or threshold, or range
    target: 10.0
    weight: 1.0

Switch Data Files

Change the data file in the optimizer script:

optimize_ratios.py (line 19):

def __init__(self, data_file='table2.tsv', config_file='config.yaml'):

continuous_optimizer.py (line 19):

def __init__(self, data_file='table2.tsv', config_file='config.yaml'):

Alternatively, pass data_file (and config_file) as arguments when creating the optimizer instance.

Optimal Criteria Reference

Current optimal values (as configured):

Parameter                          Optimal Value/Range
Bulk density                       0.5 g/ml
Bulk density (compacted)           0.52 g/ml
Hausner ratio                      1.01
Carr's index                       < 25
Angle of repose                    < 25°
Compressibility index              1.15
Moisture content                   0.5-3.5%
Vibration compaction coefficient   1.5-4.5
Coefficient of uniformity          2-5
Natural slope angle                30-40°
Angle of collapse                  30-40°

Troubleshooting

ModuleNotFoundError

Ensure you're using the virtual environment:

.venv\Scripts\python.exe optimize_ratios.py

No results found

  • For discrete visualization: Run optimize_ratios.py before visualize_results.py
  • For continuous visualization: Run continuous_optimizer.py before visualize_continuous.py

Unicode errors

The scripts have been updated to use ASCII-only characters for Windows compatibility.

Decimal format issues

Both US (0.5) and European (0,5) decimal formats are automatically detected and handled. No configuration needed.

Continuous optimization warnings

When Continuous Optimization May Fail

Continuous optimization through interpolation may produce unreliable results in the following scenarios:

1. Erratic or Non-smooth Data

  • When parameters show highly variable behavior across ratios
  • Multiple direction changes (non-monotonic with many inflection points)
  • High coefficient of variation (>50% relative to mean)
  • Solution: The system will automatically detect this and suggest using discrete optimization
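The two variability indicators mentioned above can be computed as in the sketch below. It assumes the sample standard deviation for the coefficient of variation and counts a direction change whenever consecutive differences switch sign; the checks in the scripts may differ in detail, and the function names are hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the absolute mean."""
    return 100.0 * statistics.stdev(values) / abs(statistics.mean(values))

def direction_changes(values):
    """Number of times the sequence switches between rising and falling (flat steps ignored)."""
    diffs = [b - a for a, b in zip(values, values[1:]) if b != a]
    return sum(1 for d0, d1 in zip(diffs, diffs[1:]) if (d0 > 0) != (d1 > 0))

# Made-up parameter values across five ratios
erratic = [1.0, 3.0, 2.0, 4.0, 3.0]
print(direction_changes(erratic))                # 3
print(coefficient_of_variation(erratic) > 50.0)  # False for this series
```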

2. Insufficient Data Points

  • Fewer than 5 measured ratios make interpolation unreliable
  • Sparse data in critical regions
  • Solution: Measure more ratios or use discrete optimization

3. Outliers or Measurement Errors

  • Single outlier measurements can distort interpolation curves
  • High measurement uncertainty (large standard deviations)
  • Solution: Review and clean data, or increase measurement replicates

4. Non-physical Extrapolation

  • Continuous optimization may find optima far outside measured range
  • Extrapolated values may violate physical constraints
  • Solution: Set appropriate search_range bounds in config.yaml
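How the `extrapolation_penalty` setting is applied is not documented here. As an illustration only, the sketch below assumes the score is scaled down linearly with the distance outside the observed ratio range; the function name `penalized_score` and the formula are hypothetical.

```python
def penalized_score(score, ratio, observed_min, observed_max, penalty=0.1):
    """Scale the score down for ratios outside [observed_min, observed_max] (assumed formula)."""
    if ratio < observed_min:
        distance = observed_min - ratio
    elif ratio > observed_max:
        distance = ratio - observed_max
    else:
        return score  # inside the observed range: no penalty
    return score * max(0.0, 1.0 - penalty * distance)

print(penalized_score(0.8, 4.268, 1.0, 5.0))  # 0.8, inside the observed range
print(penalized_score(0.8, 5.5, 1.0, 5.0))    # reduced, 0.5 beyond the last measured ratio
```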

Automatic Safeguards

The improved continuous optimizer includes several safeguards:

  1. Data Quality Assessment

    • Automatically evaluates if your dataset is suitable for interpolation
    • Warns about parameters with high variability or erratic behavior
    • Set check_data_quality: true in config.yaml (enabled by default)
  2. Fallback to Discrete

    • If continuous optimization produces worse results than discrete, automatically falls back
    • Prevents accepting suboptimal interpolated results
    • Set reject_if_worse: true in config.yaml (enabled by default)
  3. Confidence Scoring

    • Each interpolated prediction includes a confidence score
    • Low confidence warnings for uncertain predictions
    • Confidence based on: distance to data points, validation errors, extrapolation
  4. Improved Interpolation Selection

    • Automatically selects conservative methods for erratic data
    • Uses linear interpolation for high-variability parameters
    • Preserves monotonicity when detected

Common Warning Messages

"Data quality too poor for reliable continuous optimization"

  • Overall data quality score below 0.3
  • Multiple parameters show erratic behavior
  • Action: Use discrete optimization results instead

"Continuous optimization score worse than discrete"

  • Interpolation produced suboptimal results
  • System automatically falls back to best discrete ratio
  • Action: Accept the discrete result or investigate data quality

"High variability detected"

  • Parameter shows coefficient of variation > 50%
  • Multiple direction changes in parameter values
  • Action: Review if parameter truly varies smoothly with ratio

"Optimal ratio at search boundary"

  • The optimal ratio is at the edge of your search range
  • May indicate true optimum is beyond current bounds
  • Action: Consider widening search_range in config.yaml

"Optimal ratio outside observed range"

  • The optimal ratio requires extrapolation beyond measured data
  • Predictions may be less reliable
  • Action: Measure additional ratios or accept discrete optimum

High interpolation errors (>50% RMSE)

  • Validation shows poor interpolation accuracy
  • Parameter behavior too complex for selected method
  • Action: System automatically uses conservative linear interpolation

ImportError: cannot import 'RatioOptimizer'

Make sure optimize_ratios.py is in the same directory as continuous_optimizer.py

Dependencies

  • pandas - Data manipulation
  • numpy - Numerical computations
  • scipy - Statistical functions and interpolation
  • matplotlib - Basic plotting
  • seaborn - Advanced visualizations
  • pyyaml - Configuration file parsing
  • tabulate - Table formatting

All dependencies are already installed via uv - no additional packages needed!

Features Summary

Discrete Optimization

✅ Select best ratio from measured data points
✅ Score all parameters against configurable criteria
✅ Detailed reports with parameter-by-parameter analysis
✅ 4 visualization types (radar, bars, heatmap, distributions)
✅ CSV, JSON, and text output formats

Continuous Optimization (NEW!)

✅ Find optimal ratio anywhere (e.g., 1:4.268, not just 1:4 or 1:5)
✅ Multiple interpolation methods (auto, cubic, PCHIP, RBF, etc.)
✅ Automatic method selection per parameter
✅ Cross-validation of interpolation accuracy
✅ Extrapolation detection and warnings
✅ Local maxima detection in score landscape
✅ 4 advanced visualization types (curves, landscape, summary, validation)
✅ Improvement metrics vs. discrete optimization

General Features

✅ No hardcoded parameters - fully configurable
✅ International decimal format support (US and European)
✅ Handles missing standard deviations
✅ Flexible scoring system (exact targets, thresholds, ranges)
✅ Customizable parameter weights
✅ Windows compatible (no Unicode issues)
✅ Fast execution with uv environment

When to Use Each Method

Use Discrete Optimization when:

  • You only have a few measured ratios
  • You want a quick, straightforward answer
  • You need to select from existing experimental conditions
  • Your parameters don't vary smoothly across ratios

Use Continuous Optimization when:

  • You have 4+ measured ratios
  • You want to find the true optimal (not limited to measured points)
  • Your parameters vary smoothly across ratios
  • You need higher precision in the optimal ratio
  • You want to validate interpolation accuracy
  • You're planning additional experiments at the optimal ratio

Best Practice: Run both methods and compare results!

Key Insights from Example Analysis

From table2.tsv continuous optimization:

  1. Optimal ratio found: 1:4.268 (vs discrete 1:4)
  2. Improvement: Only +1.2% over discrete, suggesting measured ratios were well-chosen
  3. Two local maxima: Found at 1:1.08 and 1:4.27 (global optimum)
  4. Interpolation accuracy: Varies by parameter (excellent for compressibility index, fair for angle of collapse)
  5. No extrapolation: Optimal is within observed range (1-5), increasing confidence
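Local maxima such as the two reported above can be located by scanning a sampled score landscape. A minimal sketch with made-up scores (the function name `local_maxima` is hypothetical, and maxima sitting exactly on the first or last sample are not reported):

```python
def local_maxima(ratios, scores):
    """Return (ratio, score) pairs where the score is higher than at both neighbouring samples."""
    return [
        (ratios[i], scores[i])
        for i in range(1, len(scores) - 1)
        if scores[i] > scores[i - 1] and scores[i] >= scores[i + 1]
    ]

# Made-up landscape sampled at five ratios
peaks = local_maxima([1.0, 2.0, 3.0, 4.0, 5.0], [0.2, 0.4, 0.3, 0.5, 0.1])
best_ratio, best_score = max(peaks, key=lambda peak: peak[1])
print(peaks)       # [(2.0, 0.4), (4.0, 0.5)]
print(best_ratio)  # 4.0
```

In practice the landscape would be sampled on a much finer grid than the measured ratios.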

Citation

If you use this tool in your research, please cite:

Ratio Optimization Tool (2025)
Python-based continuous and discrete optimization for material characterization
https://github.com/yourusername/ratio-optimization

License

This tool is provided as-is for scientific and educational purposes.

Support

For issues or questions:

  • Check results/optimization_report.txt for detailed discrete analysis
  • Check results/continuous_optimization_report.txt for continuous analysis
  • Review console output for error messages
  • Examine validation plots for interpolation quality
  • Verify your TSV file format matches the examples
