A Python pipeline for downloading and processing HamSCI HF propagation spot data from the Madrigal database's daily HDF5 files. It loads one or more days, applies filters (time, region, frequency, distance, dataset/source), converts the data to Polars, and can generate 2D histograms (time x distance). Optional Parquet caching speeds up repeated runs.
- Downloads daily HDF5 files from Madrigal (via madrigalWeb / globalDownload.py)
- Loads daily HDF5 files named like rsdYYYY-MM-DD.01.hdf5
- Filters by:
  - Date/time range
  - Geographic bounds (lat/lon midpoint)
  - Frequency range (single or multiple ranges)
  - Distance range (km)
  - Dataset/source (RBN, WSPR, PSK)
- Produces:
  - Filtered Polars dataframe (Parquet/CSV/HDF5)
  - 2D histogram (time vs distance) with metadata (Parquet/NetCDF/HDF5)
- Caches:
  - Dataframes and histograms as Parquet for faster iterative runs
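Conceptually, each day is processed the same way: read the daily HDF5 file, build a Polars dataframe, apply the filters, then bin the spots into a time x distance grid. The rough standalone sketch below illustrates that flow only; the HDF5 layout ("Data/Table Layout") and the column names dist_km, tfreq, and ut1_unix are assumptions, not the project's actual schema (the real implementation lives in scripts/madrigal_loader.py).

```python
# Illustrative per-day flow (not the project's actual loader).
import h5py
import numpy as np
import polars as pl

def load_day(path: str) -> pl.DataFrame:
    """Read one daily Madrigal HDF5 file into a Polars dataframe."""
    with h5py.File(path, "r") as f:
        table = f["Data"]["Table Layout"][:]  # numpy structured array (assumed layout)
    return pl.DataFrame({name: table[name] for name in table.dtype.names})

def filter_spots(df: pl.DataFrame) -> pl.DataFrame:
    """Apply example distance (km) and frequency (MHz) filters; column names are assumptions."""
    return df.filter(
        pl.col("dist_km").is_between(0, 3000)
        & pl.col("tfreq").is_between(7.0, 7.3)
    )

def time_distance_histogram(df: pl.DataFrame) -> np.ndarray:
    """Bin spots into a time x distance grid (10-minute x 50-km bins)."""
    t = df["ut1_unix"].to_numpy()
    d = df["dist_km"].to_numpy()
    t_edges = np.arange(t.min(), t.max() + 600, 600)
    d_edges = np.arange(0, 3050, 50)
    hist, _, _ = np.histogram2d(t, d, bins=[t_edges, d_edges])
    return hist

if __name__ == "__main__":
    spots = filter_spots(load_day("data/madrigal/rsd2019-12-01.01.hdf5"))
    print(time_distance_histogram(spots).shape)
```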
- run_loader.py: CLI entrypoint. Reads a JSON config and runs the pipeline day-by-day.
- scripts/madrigal_loader.py: MadrigalHamSpotLoader implementation.
- scripts/json_loader.py: Config loading utilities.
- scripts/regions.py: Named region bounding boxes.
- scripts/utils_freq.py: Named frequency ranges and labels.
- config/: Example config(s).
- download_madrigal_daily_hdf5.sh: Helper script to download daily Madrigal HDF5 files via globalDownload.py.
- Python 3.10+ recommended
- Dependencies are listed in requirements.txt
pip install -r requirements.txt
This project expects daily Madrigal HDF5 files named like:
rsdYYYY-MM-DD.01.hdf5
We download these using the Madrigal remote Python API package madrigalWeb (included in requirements.txt), which installs command-line tools such as globalDownload.py.
download_madrigal_daily_hdf5.sh loops day-by-day and calls globalDownload.py (installed by madrigalWeb) to fetch daily HDF5 files.
In the script, you typically only need to set:
- startDate / endDate
- outputDir
- user_fullname
- user_email
- user_affiliation
chmod +x download_madrigal_daily_hdf5.sh
./download_madrigal_daily_hdf5.sh
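Because the loader expects one file per day with the rsdYYYY-MM-DD.01.hdf5 naming, it can help to confirm the download is complete before running the pipeline. A standalone sketch (only the naming convention comes from this project; the helper itself is hypothetical):

```python
# Report which expected daily files are missing from the data directory.
from datetime import date, timedelta
from pathlib import Path

def missing_days(data_dir: str, start: date, end: date) -> list[str]:
    expected = []
    day = start
    while day <= end:
        expected.append(f"rsd{day:%Y-%m-%d}.01.hdf5")  # naming convention from this README
        day += timedelta(days=1)
    present = {p.name for p in Path(data_dir).glob("rsd*.hdf5")}
    return [name for name in expected if name not in present]

if __name__ == "__main__":
    gaps = missing_days("data/madrigal", date(2019, 12, 1), date(2019, 12, 3))
    print("missing files:", gaps or "none")
```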
- Put your HDF5 files in a directory, for example:

  data/madrigal/rsd2019-12-01.01.hdf5
  data/madrigal/rsd2019-12-02.01.hdf5
- Create and edit JSON config files in the config folder, for example:

  config/example.json
- Run:

  python3 run_loader.py -p config/example.json
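To process several configurations in one go, the CLI can simply be called once per config file; a small sketch that relies only on the invocation shown above:

```python
# Run the pipeline once per JSON config in config/ (sketch; uses only the
# documented `python3 run_loader.py -p <config>` form).
import subprocess
from pathlib import Path

for cfg in sorted(Path("config").glob("*.json")):
    print(f"=== {cfg} ===")
    subprocess.run(["python3", "run_loader.py", "-p", str(cfg)], check=True)
```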
Example config/example.json:
{
"data_dir": "data/madrigal",
"cache_dir": "cache",
"use_cache": true,
"chunk_size": 100000,
"sDate": "2019-12-01T00:00:00",
"eDate": "2019-12-03T23:59:59",
"filters": {
"region_name": "CONUS",
"freq": [7, 14],
"distance_range": { "min_dist": 0, "max_dist": 3000 },
"datasets": ["RBN", "WSPR"]
},
"output": {
"output_dir": "output",
"dataframe": { "generate": true, "formats": ["csv"] },
"histogram": { "generate": true, "formats": ["csv"] }
}
}
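Config parsing is handled by scripts/json_loader.py. Purely as an illustration of what the pipeline consumes, a minimal standalone reader for the fields used above might look like the sketch below (field names are taken from the example config; the validation is illustrative, not the project's actual logic):

```python
# Minimal standalone config reader (sketch; the real parsing lives in scripts/json_loader.py).
import json
from datetime import datetime
from pathlib import Path

def read_config(path: str) -> dict:
    cfg = json.loads(Path(path).read_text())
    # Basic sanity checks on the fields shown in the example config.
    for key in ("data_dir", "sDate", "eDate", "filters", "output"):
        if key not in cfg:
            raise KeyError(f"missing required config key: {key}")
    # Dates are ISO 8601 strings.
    cfg["sDate"] = datetime.fromisoformat(cfg["sDate"])
    cfg["eDate"] = datetime.fromisoformat(cfg["eDate"])
    return cfg

if __name__ == "__main__":
    print(read_config("config/example.json"))
```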
- filters.freq can be:
  - a single key string (example: "7MHz")
  - a list of key strings (example: ["7MHz","14MHz"])
- filters.datasets is optional:
  - If omitted or null, all datasets are included.
  - Valid values: ["RBN","WSPR","PSK"]
- Region and frequency keys must exist in scripts/regions.py and scripts/utils_freq.py.
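To add a new region or frequency band, extend those two modules. Their exact contents are defined in the repo; a plausible shape, assuming plain dictionary lookups keyed by the names used in the config, might be:

```python
# Illustrative only; check scripts/regions.py and scripts/utils_freq.py for
# the real definitions and key names.

# regions.py: named bounding boxes used to filter by lat/lon midpoint.
REGIONS = {
    "CONUS": {"lat_lim": (24.0, 50.0), "lon_lim": (-125.0, -66.0)},
}

# utils_freq.py: named frequency ranges (MHz) and display labels.
FREQ_RANGES = {
    "7MHz":  {"range": (7.0, 7.3),    "label": "40 m"},
    "14MHz": {"range": (14.0, 14.35), "label": "20 m"},
}
```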
The pipeline runs day-by-day across the requested date range.
Outputs are written under your configured output.output_dir (default: output/), typically into:
- output/dataframes/
- output/histograms/
Histogram Parquet outputs include metadata stored in the Parquet schema under heatmap_meta.
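That metadata can be read back with pyarrow; a sketch, assuming heatmap_meta is a key in the Parquet schema metadata as described above (the output path is illustrative):

```python
# Read a histogram Parquet output and its embedded metadata.
import pyarrow.parquet as pq

table = pq.read_table("output/histograms/example.parquet")
meta = table.schema.metadata or {}          # dict of bytes -> bytes
raw = meta.get(b"heatmap_meta")
print(raw.decode() if raw else "no heatmap_meta found")
```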
When use_cache is true:
- Dataframes are cached in: cache/dataframes/
- Histograms are cached in: cache/heatmaps/
Cache filenames incorporate:
- date range
- region bounds
- frequency range(s)
- distance range
- dataset selection
If you change filters and rerun, the pipeline automatically creates a new cache entry.
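To force a full recompute, set use_cache to false or remove the cache directories, for example:

```python
# Remove the cache directories listed above to force a fresh run.
import shutil
from pathlib import Path

for sub in ("dataframes", "heatmaps"):
    d = Path("cache") / sub
    if d.exists():
        shutil.rmtree(d)
        print(f"removed {d}")
```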
PRs are welcome. Please keep large data and generated outputs out of git (see .gitignore).