Environmental Sound Recognition using CNNs

Project Overview

This project explores the effectiveness of different input features in Convolutional Neural Network (CNN) architectures for Environmental Sound Recognition. We compare three different CNN models:

MelNet: Uses log-mel feature input
RawNet: Processes raw waveform inputs
MFCC Model: Based on Mel-frequency cepstral coefficients

The models are benchmarked on the ESC-50 dataset, a widely used dataset for environmental sound classification.

Key Features

Implementation of three distinct CNN architectures
Comparison of log-mel features, raw waveform input, and MFCCs
Use of a five-layer stacked CNN network with decreasing filter sizes
End-to-end stacked CNN model for raw waveform processing
Evaluation using the ESC-50 dataset

Results

Our experiments yielded interesting results, with the MFCC model outperforming the others:

Model	Test Accuracy
MFCC	72% ± 3
MelNet	69% ± 3
RawNet	62% ± 3

Methodology

Data preprocessing for each model type
Training with dynamic learning rate and batch size optimization
Evaluation using majority voting rule
Use of "Sparse Categorical Crossentropy" as the loss function

Conclusions

The study demonstrates the significance of selecting appropriate input features for enhancing model performance in environmental sound recognition tasks. The MFCC model showed the best performance, suggesting that MFCC features provide a more robust representation of sound characteristics for this task.

Future Work

Explore hybrid models combining strengths of different input features
Investigate impact of data augmentation techniques
Apply transfer learning to improve model performance

Technologies Used

Python
TensorFlow
Librosa
NumPy
Matplotlib

Contributors

Mattia Varagnolo
Giacomo Schiavo

License

[Include license information]

Acknowledgements

This project builds upon the work of Shaobo Li et al. and other researchers in the field of environmental sound classification.

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
.gitignore		.gitignore
CNN_final.ipynb		CNN_final.ipynb
README.md		README.md
hda_FINAL_report.pdf		hda_FINAL_report.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Environmental Sound Recognition using CNNs

Project Overview

Key Features

Results

Methodology

Conclusions

Future Work

Technologies Used

Contributors

License

Acknowledgements

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

MaatVO/Environmental-Sound-Classification

Folders and files

Latest commit

History

Repository files navigation

Environmental Sound Recognition using CNNs

Project Overview

Key Features

Results

Methodology

Conclusions

Future Work

Technologies Used

Contributors

License

Acknowledgements

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages