Application for sound source localization using machine learning

kacperx0m/SSL-ML

Sound Source Localization (SSL) using Machine Learning (ML)

This application was developed as part of an engineering thesis titled "Aplikacja do estymacji położenia mówcy w oparciu o dźwiękowe sygnały binauralne" ("Application for Estimating Speaker Location Based on Binaural Audio Signals") by Kacper Góralczyk at Bialystok University of Technology.

This is an SSL application with a GUI for easier operation and visualization. It supports a single sound source in the horizontal plane, using a Neumann KU 100 dummy head. Localization is based on Lord Rayleigh's duplex theory, with the interaural level difference (ILD) and the interaural time difference (ITD) as cues.
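To make the two cues concrete, here is a minimal sketch of how ITD and ILD could be estimated from a pair of binaural channels. It is not taken from this repository: the function names, the cross-correlation approach, and the sign convention are assumptions chosen for illustration, and the band-limiting step (ITD is typically evaluated at low frequencies, ILD at high frequencies) is left out to keep the example short.

```python
import math


def estimate_itd(left, right, sample_rate, max_lag):
    """Hypothetical helper: ITD in seconds from a brute-force cross-correlation.

    A positive result means the right channel lags the left one,
    i.e. the sound reached the left ear first (assumed convention).
    """
    n = min(len(left), len(right))
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


def estimate_ild(left, right, eps=1e-12):
    """Hypothetical helper: ILD in dB as the left/right energy ratio."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10((energy_left + eps) / (energy_right + eps))


# Synthetic example: an impulse that arrives 3 samples later and
# at half the amplitude in the right channel.
left = [0.0] * 32
right = [0.0] * 32
left[10] = 1.0
right[13] = 0.5

itd = estimate_itd(left, right, sample_rate=48000, max_lag=8)
ild = estimate_ild(left, right)
```

In a real pipeline the two values would be computed per frame, after filtering each channel into the relevant frequency band, and then passed to the scaler and model.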

Running the app requires a trained model and a standard scaler; example ones are provided in this repository. The model was trained on a synthesized database built from spatial data in the SADIE II HRTF database (University of York), in SOFA format. The database can be downloaded from https://www.york.ac.uk/sadie-project/database.html

To create a personal training dataset for a model, prepare a mono sound database of the desired sounds, change the paths to valid ones, and use the provided code. The provided code can automatically generate and save a normalized training dictionary of unified length, with ITD and ILD cues calculated in the proper frequency ranges, train and save the model, and fit a personalized data scaler. Together these steps produce the files the main app needs.
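As an illustration of the "normalized" and "unified length" steps, the sketch below pads or truncates a mono signal to a fixed number of samples and scales it to a peak amplitude of 1. The function names, the zero-padding strategy, and the choice of peak normalization are assumptions made for this example; the repository's own code may normalize differently.

```python
def unify_length(signal, target_len):
    """Hypothetical helper: truncate or zero-pad to exactly target_len samples."""
    clipped = list(signal[:target_len])
    return clipped + [0.0] * (target_len - len(clipped))


def peak_normalize(signal, eps=1e-12):
    """Hypothetical helper: scale so the largest absolute sample equals 1.

    Silent signals are returned unchanged to avoid dividing by zero.
    """
    peak = max((abs(x) for x in signal), default=0.0)
    if peak < eps:
        return list(signal)
    return [x / peak for x in signal]


# Example: bring two recordings of different lengths to a common shape.
short = unify_length([0.5, -2.0, 1.0], 5)
long_ = unify_length([0.5, -2.0, 1.0, 0.25, 0.1, 0.3], 5)
normalized = peak_normalize(short)
```

Applying the same target length to every file is what lets the resulting cues be stacked into one training dictionary.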
