From dd197b2c98d6418dfdae79731ce0bf06d00bbaed Mon Sep 17 00:00:00 2001
From: "Linke, Julian" <linke@tugraz.at>
Date: Thu, 27 Jul 2023 20:12:56 +0200
Subject: [PATCH] Update file README.md

---
 README.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index f2e3a79..2bb194b 100644
--- a/README.md
+++ b/README.md
@@ -30,13 +30,13 @@ The repository includes data folders which you need to prepare. The repository a
 - Various speaker (spkID1, spkID2, ...) folders
 - Various .wav or .flac files (fs=16kHz)
 
-As you can see ```BEAGR``` includes the subfolders ```data_BEA_CS``` (BEA Spontaneous Speech), ```data_BEA_RS``` (BEA Read Speech), ```data_GR_CS``` (GRASS Conversational Speech) and ```data_GR_RS``` (GRASS Read Speech). **Please make sure that folders are named like this: ```data_{corpus}_{speakingstyle}```**. Make also sure that all spkIDs are unique identifiers to prevent ambiguities (e.g., same speakers in RS and CS component?). The audio files should have a sampling rate of 16kHZ and can be .wav or .flac files. Given this structure and after installing/preparing all dependencies you should be able to run the experiment. To run a specific stage of the script for a specific dataset, provide the directory where all you data is stored (here ```BEAGR```) and an integer as an argument to the `./run.sh` command. For instance, to run stage ```3``` for the example dataset, you would use the following command:
+As you can see, ```BEAGR``` includes the subfolders ```data_BEA_CS``` (BEA Spontaneous Speech), ```data_BEA_RS``` (BEA Read Speech), ```data_GR_CS``` (GRASS Conversational Speech) and ```data_GR_RS``` (GRASS Read Speech). **Please make sure that folders are named like this: ```data_{corpus}_{speakingstyle}```**. The audio files should have a sampling rate of 16kHz and can be .wav or .flac files. Given this structure and after installing/preparing all dependencies (see below), you should be able to run the experiment. To run a specific stage of the script for a specific dataset, provide the directory where all your data is stored (here ```BEAGR```) and an integer as an argument to the `./run.sh` command. For instance, to run stage ```3``` for the example dataset, you would use the following command:
 
 ```
 ./run.sh BEAGR 3
 ```
 
-The command automatically generates the experiment folder ```exp_BEAGR```.
+The command automatically generates the experiment folder ```exp_BEAGR```. Stage ```0``` runs everything in sequence.
 
 ## Reproduction
 The following steps are necessary to reproduce the experiment. At first you need to create a conda envrionment and install the necessary packages. Second you have to clone the fairseq repository and modify the file ```path.sh``` to export necessary environment variables.
@@ -48,6 +48,8 @@ conda create -n speechcodebookanalysis python=3.8
 conda activate speechcodebookanalysis
 pip install fairseq
 pip install matplotlib
+pip install scikit-learn
+pip install faiss-cpu
 ```
 
 ### Fairseq Repository
-- 
GitLab