🎵 BioFoundation: Foundation Models for Bioacoustics

A Comparative Review of Foundation Models for Bioacoustics 🤗

A comprehensive evaluation framework for foundation models in bioacoustic analysis

🔍 Overview

This repository contains the official implementation and evaluation framework for our paper "Foundation Models for Bioacoustics: A Comparative Review". We systematically compare state-of-the-art foundation models across multiple bioacoustic benchmarks to assess their effectiveness for animal sound classification and analysis.

🎯 Key Features

- Comprehensive Evaluation: Systematic comparison of 12+ foundation models
- Multiple Benchmarks: Evaluation on BEANS and BirdSet datasets
- Flexible Framework: Easy-to-use scripts for reproducing experiments
- Standardized Protocols: Linear probing, attentive probing, and fine-tuning evaluations
- Rich Documentation: Detailed configuration and setup instructions

📊 Supported Models

Our framework evaluates the following foundation models:

Baseline General Audio Models:

AudioMAE
BEATs
EAT

Bioacoustic Foundation Models:

AVES
BEATs NLM
BioLingual
Bird AVES
BirdMAE
ConvNeXt_BS
Perch
ProtoCLR
SurfPerch
ViT INS

🗂️ Datasets

BEANS: Benchmark of Animal Sounds
- Watkins Marine Mammal Dataset (31 classes)
- Bat Calls (10 classes)
- CBI Bird Dataset (264 classes)
- Dog Barks (10 classes)
- HumBugDB Mosquito Dataset (14 classes)
BirdSet: Comprehensive bird sound benchmark
- 8 datasets: PER, POW, NES, UHH, HSN, NBP, SSW, SNE

🚀 Quick Start

Installation

Using Devcontainer (Recommended)

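We provide a preconfigured development container for easy setup. The .devcontainer directory is tracked as a Git submodule, so initialize the submodules before opening the repository in the container: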

git submodule update --init --recursive

Manual Installation

Install dependencies using Poetry:

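# Install the dependencies into a Poetry-managed virtual environment
poetry install
# Activate the environment (on Poetry 2.x this command requires the poetry-plugin-shell plugin)
poetry shell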

🧪 Running Experiments

BirdSet Experiments

Use the run_birdset.sh script to evaluate models on BirdSet datasets:

# Run all models on all BirdSet datasets
./projects/biofoundation/scripts/run_birdset.sh

# Run specific models
./projects/biofoundation/scripts/run_birdset.sh --models perch,aves,audiomae

# Run on specific datasets
./projects/biofoundation/scripts/run_birdset.sh --datasets PER,POW,NES

# Custom configuration
./projects/biofoundation/scripts/run_birdset.sh --models perch --datasets PER --seeds 1,2,3 --gpu 0

BEANS Experiments

Use our run_beans.sh script for BEANS benchmark evaluation:

# Run all models on all BEANS datasets
./projects/biofoundation/scripts/run_beans.sh

# Run specific models
./projects/biofoundation/scripts/run_beans.sh --models perch,aves

# Run on specific datasets
./projects/biofoundation/scripts/run_beans.sh --datasets beans_watkins,beans_cbi

# Custom configuration
./projects/biofoundation/scripts/run_beans.sh --models perch --datasets beans_watkins --seeds 1,2,3 --gpu 0

Manual Experiment Execution

For finer control, run a single experiment directly, replacing {model_name} with the name of the model's experiment configuration:

# BirdSet linear probing
./projects/biofoundation/train.sh experiment=birdset/linearprobing/{model_name}

# BEANS linear probing
./projects/biofoundation/train.sh experiment=beans/linearprobing/{model_name}
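
To queue several manual runs, the command pattern above can be generated programmatically. The following is a minimal sketch and not part of the repository: the helper `build_commands` is hypothetical, and it assumes that each experiment configuration is named after the model identifiers passed to `--models` (for example `perch` or `aves`). It only formats command strings and does not launch anything.

```python
TRAIN_SCRIPT = "./projects/biofoundation/train.sh"
BENCHMARKS = ("birdset", "beans")


def build_commands(benchmark, model_names):
    """Return one linear-probing train.sh command line per model name.

    Hypothetical helper: assumes the experiment config for each model
    is stored under <benchmark>/linearprobing/<model_name>.
    """
    if benchmark not in BENCHMARKS:
        raise ValueError(f"unknown benchmark: {benchmark!r}")
    return [
        f"{TRAIN_SCRIPT} experiment={benchmark}/linearprobing/{name}"
        for name in model_names
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run.
    for command in build_commands("birdset", ["perch", "aves"]):
        print(command)
```

Printing the commands instead of executing them keeps the sketch free of side effects; the output can be reviewed and then piped to a shell if the assumed config names match the repository.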

📊 Results and Analysis

Generating Results Tables

The LaTeX tables for our results analysis can be generated automatically:

# Download results data from WandB report
# https://wandb.ai/deepbirddetect/BioFoundation/reports/Latex-Table-Data--VmlldzoxMjEyODQ0Ng

# Generate LaTeX tables
python projects/biofoundation/results/latex/new_table.py

The script expects beans.csv and birdset.csv in the same directory; both files can be downloaded from the WandB report linked above.
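
Before generating the tables, it can help to confirm that both CSV files are in place. The check below is a minimal sketch under one assumption: that "the same directory" means the directory containing new_table.py. The helper `missing_inputs` is hypothetical and not part of the repository.

```python
from pathlib import Path

# File names taken from the requirement stated above.
REQUIRED_FILES = ("beans.csv", "birdset.csv")


def missing_inputs(directory):
    """Return the required CSV names that are not present in `directory`."""
    directory = Path(directory)
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    # Assumption: the CSVs belong next to new_table.py.
    missing = missing_inputs("projects/biofoundation/results/latex")
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All input files found.")
```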

Hyperparameter Optimization with WandB Sweeps

We use Weights & Biases Sweeps for systematic hyperparameter optimization:

# Start a sweep
wandb sweep sweeps/base_grid.yaml

# Run sweep agents
wandb agent <sweep_id>

# Multi-GPU sweep execution
projects/biofoundation/sweeps/sweep.sh <gpu_id> <sweep_id>

Available sweep configurations:

sweeps/base_grid.yaml: Grid search for basic parameters
sweeps/classifier.yaml: Bayesian optimization for classifier architectures

📝 Configuration

BEANS Dataset Configuration

To run experiments on specific BEANS datasets, modify the experiment configuration:

datamodule:
  dataset:
    dataset_name: beans_watkins # Choose dataset
    hf_path: DBD-research-group/beans_watkins # HuggingFace path
    hf_name: default
    n_classes: 31 # Number of classes

Available BEANS Datasets:

| Dataset | Classes | Description |
|---|---|---|
| `beans_watkins` | 31 | Marine mammal vocalizations |
| `beans_bats` | 10 | Bat echolocation calls |
| `beans_cbi` | 264 | Cornell Bird Identification |
| `beans_dogs` | 10 | Dog bark classifications |
| `beans_humbugdb` | 14 | Mosquito wing-beat sounds |
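
When switching datasets, `dataset_name` and `n_classes` must change together. The sketch below builds the configuration fragment shown above from the class counts in the table. It is illustrative only: the helper `beans_dataset_config` is hypothetical, and the `hf_path` pattern `DBD-research-group/<dataset_name>` is confirmed above only for `beans_watkins`; for the other datasets it is an assumption.

```python
# Class counts copied from the table of available BEANS datasets.
BEANS_CLASSES = {
    "beans_watkins": 31,
    "beans_bats": 10,
    "beans_cbi": 264,
    "beans_dogs": 10,
    "beans_humbugdb": 14,
}


def beans_dataset_config(dataset_name):
    """Build the datamodule fragment for one BEANS dataset as a nested dict."""
    if dataset_name not in BEANS_CLASSES:
        raise KeyError(f"unknown BEANS dataset: {dataset_name!r}")
    return {
        "datamodule": {
            "dataset": {
                "dataset_name": dataset_name,
                # Assumption: every dataset follows the beans_watkins path pattern.
                "hf_path": f"DBD-research-group/{dataset_name}",
                "hf_name": "default",
                "n_classes": BEANS_CLASSES[dataset_name],
            }
        }
    }


if __name__ == "__main__":
    print(beans_dataset_config("beans_cbi"))
```

Keeping the class counts in a single mapping avoids a mismatch between the dataset and the classifier's output size when the configuration is edited by hand.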
