Download datasets


FSD50K: A dataset containing 51,197 audio clips from Freesound, totalling over 100 hours of audio manually labeled using 200 classes drawn from the AudioSet Ontology. To our knowledge, this is the largest fully open dataset of human-labeled sound events ever released. See our paper for full details.
FSDKaggle2018: A dataset containing 11k audio clips and 18 hours of training data unevenly distributed across 41 classes of the AudioSet Ontology. It was collected for DCASE Challenge 2018 Task 2, which was run as the Kaggle competition Freesound General-Purpose Audio Tagging Challenge. It is described in our DCASE 2018 paper.
FSDnoisy18k: A dataset collected to foster the investigation of label noise in sound event classification. It contains 42.5 hours of audio across 20 sound classes, including a small amount of manually labeled data and a larger quantity of real-world noisy data. It is described on its companion site and in our ICASSP 2019 paper.