This model is provided for non-commercial research use only.
Official PyTorch implementation of "ViolinDiff: Enhancing Expressive Violin Synthesis with Pitch Bend Conditioning", accepted at ICASSP 2025.

Keywords: Violin Synthesis, Neural Audio Synthesis, Pitch Bend Modeling, Expressive Performance, Diffusion Models
This repository provides the official PyTorch codebase for ViolinDiff, a diffusion-based model for generating expressive violin performances via explicit pitch bend modeling.
ViolinDiff is divided into two main modules (sketched in pseudocode after this list):

- **Bend Module**: predicts the pitch bend roll from MIDI.
- **Synth (Synthesis) Module**: converts pitch and bend information, along with other performance controls, into the final violin audio signal.
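The two-stage flow can be summarized with a short conceptual sketch. This is illustrative pseudocode only: the function and method names below are hypothetical placeholders, not the repo's actual API, and the final vocoder step is an assumption about how the mel spectrogram becomes a waveform.

```python
# Conceptual two-stage pipeline (hypothetical names, not the actual ViolinDiff API).
def render_violin(midi, bend_model, synth_model, vocoder):
    # Stage 1: the Bend module predicts a pitch bend roll from the MIDI score.
    bend_roll = bend_model.sample(midi)
    # Stage 2: the Synth module generates a mel spectrogram conditioned on
    # the MIDI notes plus the predicted bend roll (and other controls).
    mel = synth_model.sample(midi, bend_roll)
    # Assumption: a neural vocoder converts the mel spectrogram to audio.
    return vocoder(mel)
```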
If you prefer not to install everything locally, you can run ViolinDiff directly in Google Colab: open the notebook link, enable the GPU (Runtime → Change runtime type → Hardware accelerator: GPU), and execute the provided cells in order.
- Clone this repository.

  ```bash
  git clone https://github.com/daewoung/ViolinDiff.git
  cd ViolinDiff
  ```

- Create a new Conda environment.

  ```bash
  conda create -n VD python=3.10
  conda activate VD
  ```

- Install PyTorch.

  ```bash
  conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia
  ```

- Install the remaining dependencies.

  ```bash
  pip install -r requirements.txt
  ```
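After installation, you can quickly confirm that PyTorch was installed with CUDA support. A minimal check (not part of the repo's scripts):

```python
# Sanity check: confirm the PyTorch version and that a GPU is visible.
import torch

print(torch.__version__)          # expected: 2.0.1
print(torch.cuda.is_available())  # should print True on a CUDA-capable machine
```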
Pretrained checkpoints (`bend.pt`, `synth.pt`) are available on Hugging Face: [dawokim/ViolinDiff](https://huggingface.co/dawokim/ViolinDiff).

Make sure you have Git LFS installed, then clone the model repository:

```bash
git lfs install
git clone https://huggingface.co/dawokim/ViolinDiff
```

Alternatively, download the checkpoints directly:

```bash
wget https://huggingface.co/dawokim/ViolinDiff/resolve/main/bend.pt
wget https://huggingface.co/dawokim/ViolinDiff/resolve/main/synth.pt
```
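Before running inference, it can be worth confirming that the checkpoints downloaded fully rather than as Git LFS pointer stubs. A minimal sketch (pointer stubs are only a few hundred bytes, while real diffusion checkpoints are typically tens to hundreds of megabytes):

```python
# Sketch: verify that both checkpoints exist and report their sizes.
from pathlib import Path

for name in ("bend.pt", "synth.pt"):
    path = Path(name)
    if not path.exists():
        print(f"{name} is missing")
    else:
        print(f"{name}: {path.stat().st_size / 1e6:.1f} MB")
```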
We provide a script, `inference.py`, to generate violin audio (`.wav`) from a given MIDI file. By default, it expects the following arguments:
```bash
python3 inference.py \
    --synth_pth synth.pt \
    --bend_pth bend.pt \
    --midi_pth example.mid \
    --save_pth example_out.wav \
    --performer 13 \
    --device cuda
```

- `--synth_pth`: path to the Synth checkpoint (default: `synth.pt`)
- `--bend_pth`: path to the Bend checkpoint (default: `bend.pt`)
- `--bend_cfg`: CFG (classifier-free guidance) scale for the Bend model (default: `3.0`)
- `--synth_cfg`: CFG scale for the Synth model (default: `1.25`)
- `--midi_pth`: path to the input MIDI file (default: `thais.mid`)
- `--save_pth`: path to save the output WAV file (default: `thais.wav`)
- `--performer`: performer ID as an integer (default: `0`; up to 21 performers are currently supported)
- `--device`: device to run on, `cuda` or `cpu` (default: `cuda`)
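To render several MIDI files in one go, you can wrap `inference.py` in a small loop. A minimal sketch, assuming the argument names documented above and hypothetical `midis/` and `outputs/` folders:

```python
# Sketch: batch inference over a folder of MIDI files via inference.py.
import subprocess
from pathlib import Path

midi_dir = Path("midis")    # hypothetical input folder
out_dir = Path("outputs")   # hypothetical output folder
out_dir.mkdir(exist_ok=True)

for midi in sorted(midi_dir.glob("*.mid")):
    subprocess.run(
        [
            "python3", "inference.py",
            "--synth_pth", "synth.pt",
            "--bend_pth", "bend.pt",
            "--midi_pth", str(midi),
            "--save_pth", str(out_dir / f"{midi.stem}.wav"),
            "--performer", "13",
            "--device", "cuda",
        ],
        check=True,
    )
```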
- MIDI Files: We recommend downloading violin MIDI files from MUSC_violin, which provides various violin pieces in MIDI format.
- Audio Files: You will need to obtain corresponding audio recordings separately, as they are not provided in the above repo.
- Directory Structure: Organize your data so that each composer (or dataset split) resides in its own folder. For example:

  ```
  /data/train/
  ├── Kayser/
  │   ├── piece1.mid
  │   ├── piece1.wav
  │   ├── piece2.mid
  │   ├── piece2.wav
  │   └── ...
  ```
Ensure that each .mid file has a matching .wav file of the same piece.
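Before training, it can help to check that every MIDI file actually has a paired audio file. A minimal sketch, assuming the `/data/train/` layout shown above:

```python
# Sketch: report .mid files under /data/train that lack a matching .wav.
from pathlib import Path

root = Path("/data/train")
missing = [m for m in sorted(root.rglob("*.mid"))
           if not m.with_suffix(".wav").exists()]
for m in missing:
    print(f"missing audio for: {m}")
print(f"{len(missing)} unmatched MIDI file(s)")
```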
All training hyperparameters, file paths, and other settings are defined in the `config/` folder. Each `.yaml` file corresponds to a different module or training configuration (e.g., `synth.yaml`, `bend.yaml`).
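To see which settings a given run will use, you can load a config file directly. A minimal sketch, assuming PyYAML is installed; the keys printed depend on the actual contents of each file:

```python
# Sketch: load and print the top-level settings of a training config.
import yaml

with open("config/synth.yaml") as f:
    cfg = yaml.safe_load(f)

for key, value in cfg.items():
    print(f"{key} = {value}")
```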
```bash
python3 bend_train.py
python3 synth_train.py
```

- `bend_train.py`: trains the Bend module (predicts pitch bend envelopes).
- `synth_train.py`: trains the Synthesis module (generates the mel spectrogram, conditioned on pitch/bend).
If you use ViolinDiff in your research, please cite:
```bibtex
@article{kim2024violindiff,
  title={ViolinDiff: Enhancing Expressive Violin Synthesis with Pitch Bend Conditioning},
  author={Kim, Daewoong and Dong, Hao-Wen and Jeong, Dasaem},
  journal={arXiv preprint arXiv:2409.12477},
  year={2024}
}
```