From 1fef2872c3318d14a33b079a4e36292682baddbc Mon Sep 17 00:00:00 2001
From: David Doukhan
Date: Thu, 6 Jun 2024 12:40:42 +0200
Subject: [PATCH 1/2] adding inaGVAD

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 5222d58..7b2527d 100644
--- a/README.md
+++ b/README.md
@@ -41,6 +41,7 @@ There are two main types of audio datasets: speech datasets and audio event/musi
 * [Flickr Audio Caption](https://groups.csail.mit.edu/sls/downloads/flickraudio/) - 40,000 spoken captions of 8,000 natural images, 4.2 GB in size.
 * [GEMEP corpus](https://www.unige.ch/cisa/gemep) - 10 actors portraying 10 states; 12 emotions: amusement, anxiety, cold anger (irritation), despair, hot anger (rage), fear (panic), interest, joy (elation), pleasure(sensory), pride, relief, and sadness. Plus, 5 additional emotions: admiration, contempt, disgust, surprise, and tenderness.
 * [IEMOCAP](https://sail.usc.edu/iemocap/iemocap_release.htm) - 12 hours of audiovisual data by 10 actors; 5 emotions: happiness, anger, sadness, frustration and neutral.
+* [inaGVAD](https://github.com/ina-foss/InaGVAD) - a challenging French TV and Radio dataset annotated for voice activity detection (VAD) and Speaker Gender Segmentation (SGS) with detailed annotation scheme detailing non-speech event type, speaker traits and speech quality.
 * [ISOLET Data Set](https://data.world/uci/isolet) - This 38.7 GB dataset helps predict which letter-name was spoken — a simple classification task.
 * [JL corpus](https://www.kaggle.com/tli725/jl-corpus) - 2400 recording of 240 sentences by 4 actors (2 males and 2 females); 5 primary emotions: angry, sad, neutral, happy, excited. 5 secondary emotions: anxious, apologetic, pensive, worried, enthusiastic.
 * [Keio-ESD](http://research.nii.ac.jp/src/en/Keio-ESD.html) - A set of human speech with vocal emotion spoken by a Japanese male speaker; 47 emotions including angry, joyful, disgusting, downgrading, funny, worried, gentle, relief, indignation, shameful, etc.

From 3bacfaf168813d3ba3327ce9f283e758fc2694fb Mon Sep 17 00:00:00 2001
From: David Doukhan
Date: Thu, 6 Jun 2024 12:42:07 +0200
Subject: [PATCH 2/2] better description

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 7b2527d..875e576 100644
--- a/README.md
+++ b/README.md
@@ -41,7 +41,7 @@ There are two main types of audio datasets: speech datasets and audio event/musi
 * [Flickr Audio Caption](https://groups.csail.mit.edu/sls/downloads/flickraudio/) - 40,000 spoken captions of 8,000 natural images, 4.2 GB in size.
 * [GEMEP corpus](https://www.unige.ch/cisa/gemep) - 10 actors portraying 10 states; 12 emotions: amusement, anxiety, cold anger (irritation), despair, hot anger (rage), fear (panic), interest, joy (elation), pleasure(sensory), pride, relief, and sadness. Plus, 5 additional emotions: admiration, contempt, disgust, surprise, and tenderness.
 * [IEMOCAP](https://sail.usc.edu/iemocap/iemocap_release.htm) - 12 hours of audiovisual data by 10 actors; 5 emotions: happiness, anger, sadness, frustration and neutral.
-* [inaGVAD](https://github.com/ina-foss/InaGVAD) - a challenging French TV and Radio dataset annotated for voice activity detection (VAD) and Speaker Gender Segmentation (SGS) with detailed annotation scheme detailing non-speech event type, speaker traits and speech quality.
+* [inaGVAD](https://github.com/ina-foss/InaGVAD) - a challenging French TV and Radio dataset annotated for voice activity detection (VAD) and Speaker Gender Segmentation (SGS) with evaluation scripts and detailed annotation scheme detailing non-speech event type, speaker traits and speech quality.
 * [ISOLET Data Set](https://data.world/uci/isolet) - This 38.7 GB dataset helps predict which letter-name was spoken — a simple classification task.
 * [JL corpus](https://www.kaggle.com/tli725/jl-corpus) - 2400 recording of 240 sentences by 4 actors (2 males and 2 females); 5 primary emotions: angry, sad, neutral, happy, excited. 5 secondary emotions: anxious, apologetic, pensive, worried, enthusiastic.
 * [Keio-ESD](http://research.nii.ac.jp/src/en/Keio-ESD.html) - A set of human speech with vocal emotion spoken by a Japanese male speaker; 47 emotions including angry, joyful, disgusting, downgrading, funny, worried, gentle, relief, indignation, shameful, etc.