Scripts to fetch specific binaries to be synced to TSD's s3-api dir
This repository contains Snakemake workflows to download and unpack bioinformatics binaries including:
- PLINK1 - Whole genome association analysis toolset (version 20250819)
- PLINK2 - Next generation of PLINK (version 20251019)
- regenie - Fast whole genome regression modelling (version 4.1)

Requirements:
- Python 3.6+
- Snakemake
- curl (for downloading)
- unzip (for unpacking archives)
Install Snakemake using pip in the current Python environment:
pip install snakemake

Or using conda/mamba:
mamba install -c conda-forge -c bioconda snakemake

Or using the provided conda environment file:
conda env create -f environment.yaml
conda activate tsd_software

The easiest way to use this workflow is via the provided Makefile:
# Download and unpack all binaries
make
# Download specific binaries
make plink1
make plink2
make regenie
# Show what would be downloaded without downloading
make dry-run
# Clean all downloaded files
make clean
# Show available commands
make help

Alternatively, run the workflow directly with Snakemake to download and unpack all binaries:
snakemake --cores 1

Download specific binaries:
snakemake bin/plink1 --cores 1
snakemake bin/plink2 --cores 1
snakemake bin/regenie --cores 1

Perform a dry run to see what would be downloaded:
snakemake --cores 1 --dry-run

Clean downloaded files:
rm -rf bin/ downloads/

After downloading, verify the binaries are correctly installed:
./verify_binaries.sh

This checks that all binaries exist and are executable, and displays their version information.
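
The contents of verify_binaries.sh are not reproduced here. As a rough illustration of the existence and executability checks described above, a minimal sketch might look like the following. The function names `check_binary` and `check_all` are hypothetical and are not taken from the repository's script.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not the repository's verify_binaries.sh.

# Report whether a single path exists and is executable.
# Returns 0 on success and 1 otherwise.
check_binary() {
    local path="$1"
    if [ ! -e "$path" ]; then
        echo "MISSING: $path"
        return 1
    fi
    if [ ! -x "$path" ]; then
        echo "NOT EXECUTABLE: $path"
        return 1
    fi
    echo "OK: $path"
    return 0
}

# Check every path given as an argument.
# Returns 1 if any single check fails, 0 if all pass.
check_all() {
    local status=0
    local path
    for path in "$@"; do
        check_binary "$path" || status=1
    done
    return "$status"
}
```

With these functions loaded, `check_all bin/plink1 bin/plink2 bin/regenie` would print one line per binary and return a non-zero status if any of them is missing or not executable.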
Binary versions and download URLs are configured in config.yaml. To update a binary version:
- Edit config.yaml
- Update the version and url fields for the desired binary
- Run make clean to remove old binaries
- Run make to download the new version
Example configuration entry:
binaries:
  plink1:
    version: "20231211"
    url: "https://s3.amazonaws.com/plink1-assets/plink_linux_x86_64_20231211.zip"
    executable: "plink"
    description: "PLINK 1.9 - Whole genome association analysis toolset"

Binaries are downloaded to the bin/ directory in the repository root:
- bin/plink1 - PLINK 1.9 executable
- bin/plink2 - PLINK 2.0 executable
- bin/regenie - regenie executable
Downloaded archives are stored in the downloads/ directory (can be safely deleted after unpacking).
The Snakemake workflow consists of:
- download_binary - Downloads binary archives from configured URLs
- unpack_binary - Unpacks archives and renames executables as needed
- all - Default rule that downloads and unpacks all binaries
The workflow automatically:
- Creates necessary directories (bin/, downloads/)
- Downloads archives only if not already present
- Unpacks binaries only if not already present
- Makes binaries executable
- Handles renaming of executables to standard names
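
In the workflow itself, Snakemake decides what to run based on whether a rule's output files already exist and are up to date. The shell sketch below only illustrates the same skip-if-present idea outside Snakemake. It is not the repository's Snakefile, and the function name `fetch_if_missing` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating "download only if not already present".
# Usage: fetch_if_missing URL DEST
fetch_if_missing() {
    local url="$1" dest="$2"
    if [ -e "$dest" ]; then
        echo "skip: $dest already present"
        return 0
    fi
    # Create the target directory (for example downloads/) if needed.
    mkdir -p "$(dirname "$dest")"
    # Download to a temporary name first so that an interrupted transfer
    # never leaves a partial file at the final path.
    # -f: fail on HTTP errors, -sS: quiet but show errors, -L: follow redirects.
    if curl -fsSL -o "$dest.part" "$url"; then
        mv "$dest.part" "$dest"
    else
        rm -f "$dest.part"
        return 1
    fi
}
```

Writing to a `.part` file and renaming it on success is a design choice of this sketch: a later run then never mistakes a truncated archive for a complete one.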
If downloads fail, check:
- Internet connectivity
- The URLs in config.yaml are still valid
- You have write permissions in the repository directory
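
Two of the checks above can be scripted. The following is a minimal sketch with hypothetical helper names (`check_writable`, `check_url`), which are not part of the repository. The URL check sends a HEAD request with curl. Some servers reject HEAD requests, so a failure there is a hint rather than proof that the URL is broken.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the troubleshooting checklist.

# Succeeds if the directory exists and the current user can write to it.
check_writable() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Succeeds if a HEAD request to the URL returns a successful status.
# -I: HEAD request, -f: fail on HTTP errors, -L: follow redirects,
# -sS: quiet but show errors, -o /dev/null: discard the response headers.
check_url() {
    curl -fsSIL -o /dev/null "$1"
}
```

For example, `check_writable . && echo "repository directory is writable"` tests the current directory, and `check_url` can be given the url value of any entry in config.yaml.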
If a binary exists but is not executable, run:
chmod +x bin/plink1 bin/plink2 bin/regenie

To force re-download of binaries:
make clean
make

See LICENSE file for details.