Update file README.md

dd197b2c · Linke, Julian · 645734e6 · dd197b2c
Commit dd197b2c authored 1 year ago by Linke, Julian
--- a/README.md
+++ b/README.md
@@ -30,13 +30,13 @@ The repository includes data folders which you need to prepare. The repository a
  - Various speaker (spkID1, spkID2, ...) folders
    - Various .wav or .flac files (fs=16kHz)

-As you can see ```BEAGR``` includes the subfolders ```data_BEA_CS``` (BEA Spontaneous Speech), ```data_BEA_RS``` (BEA Read Speech), ```data_GR_CS``` (GRASS Conversational Speech) and ```data_GR_RS``` (GRASS Read Speech). **Please make sure that folders are named like this: ```data_{corpus}_{speakingstyle}```**. Make also sure that all spkIDs are unique identifiers to prevent ambiguities (e.g., same speakers in RS and CS component?). The audio files should have a sampling rate of 16kHZ and can be .wav or .flac files. Given this structure and after installing/preparing all dependencies you should be able to run the experiment. To run a specific stage of the script for a specific dataset, provide the directory where all you data is stored (here ```BEAGR```) and an integer as an argument to the `./run.sh` command. For instance, to run stage ```3``` for the example dataset, you would use the following command:
+As you can see, ```BEAGR``` includes the subfolders ```data_BEA_CS``` (BEA Spontaneous Speech), ```data_BEA_RS``` (BEA Read Speech), ```data_GR_CS``` (GRASS Conversational Speech) and ```data_GR_RS``` (GRASS Read Speech). **Please make sure that folders are named like this: ```data_{corpus}_{speakingstyle}```**. The audio files should have a sampling rate of 16kHz and can be .wav or .flac files. Given this structure, and after installing/preparing all dependencies (see below), you should be able to run the experiment. To run a specific stage of the script for a specific dataset, provide the directory where all your data is stored (here ```BEAGR```) and an integer as arguments to the `./run.sh` command. For instance, to run stage ```3``` for the example dataset, you would use the following command:

 ```
 ./run.sh BEAGR 3
 ```

-The command automatically generates the experiment folder ```exp_BEAGR```.
+The command automatically generates the experiment folder ```exp_BEAGR```. Stage ```0``` runs all stages in sequence.

 ## Reproduction
 The following steps are necessary to reproduce the experiment. First, you need to create a conda environment and install the necessary packages. Second, you have to clone the fairseq repository and modify the file ```path.sh``` to export the necessary environment variables.
@@ -48,6 +48,8 @@ conda create -n speechcodebookanalysis python=3.8
 conda activate speechcodebookanalysis
 pip install fairseq
 pip install matplotlib
+pip install scikit-learn
+pip install faiss-cpu
 ```

 ### Fairseq Repository