FSD50K

Welcome to the companion site for the FSD50K dataset. Here you will find basic information and links to the download page and the paper describing the dataset. You can also explore the audio content of FSD50K.

FSD50K is an open dataset of human-labeled sound events. Here is a summary of its main characteristics:

The dataset contains 51,197 audio clips from Freesound totalling over 100 hours of audio
The audio content is manually labeled using 200 classes drawn from the AudioSet Ontology
FSD50K is provided in two sets: development and evaluation
The dataset includes additional Freesound metadata
All the content is licensed under Creative Commons licenses
FSD50K can be downloaded from Zenodo along with detailed information

If you use this dataset in your work, please cite our paper, which also describes the dataset in more detail:
FSD50K: An Open Dataset of Human-Labeled Sound Events.
E. Fonseca, X. Favory, J. Pons, F. Font & X. Serra
in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 829-852, 2022, doi: 10.1109/TASLP.2021.3133208.


You can explore FSD50K by audio category. Once inside a category, you can inspect the audio samples and report labelling errors, which will be amended in future versions of the dataset.
