FSD is a large-scale, general-purpose audio dataset
Thousands of audio samples from Freesound organised following the AudioSet Ontology
FSD: a dataset of everyday sounds
The AudioSet Ontology is a hierarchical collection of over 600 sound classes, which we have populated with 297,144 audio samples from Freesound. This process generated 685,403 candidate annotations, each expressing the potential presence of a sound source in an audio clip. FSD includes a variety of everyday sounds, from human and animal sounds to music and sounds made by things, all under Creative Commons licenses.
Tape hiss
Gunshot, gunfire
Propeller, airscrew
Truck
Frying (food)
Microphone
Croak
Aircraft engine
Shatter
Duck
New release out!
Explore FSD50K
Crowdsourcing annotations
By creating this dataset, we seek to promote research that will enable machines to hear and interpret sound as humans do. But to make FSD reliable enough for research, we need to verify the generated annotations. So we are now crowdsourcing annotations to build the first FSD release, which will include waveforms, audio features, ground truth and additional metadata. Our first goal is to gather at least 100 verified samples per category (whenever available). Want to contribute?
Explore FSD
The table shows some current basic stats of FSD. Hover your mouse over the headers for more info.