From dd197b2c98d6418dfdae79731ce0bf06d00bbaed Mon Sep 17 00:00:00 2001
From: "Linke, Julian" <linke@tugraz.at>
Date: Thu, 27 Jul 2023 20:12:56 +0200
Subject: [PATCH] Update file README.md

---
 README.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index f2e3a79..2bb194b 100644
--- a/README.md
+++ b/README.md
@@ -30,13 +30,13 @@ The repository includes data folders which you need to prepare. The repository a
   - Various speaker (spkID1, spkID2, ...) folders
     - Various .wav or .flac files (fs=16kHz)
 
-As you can see ```BEAGR``` includes the subfolders ```data_BEA_CS``` (BEA Spontaneous Speech), ```data_BEA_RS``` (BEA Read Speech), ```data_GR_CS``` (GRASS Conversational Speech) and ```data_GR_RS``` (GRASS Read Speech). **Please make sure that folders are named like this: ```data_{corpus}_{speakingstyle}```**. Make also sure that all spkIDs are unique identifiers to prevent ambiguities (e.g., same speakers in RS and CS component?). The audio files should have a sampling rate of 16kHZ and can be .wav or .flac files. Given this structure and after installing/preparing all dependencies you should be able to run the experiment. To run a specific stage of the script for a specific dataset, provide the directory where all you data is stored (here ```BEAGR```) and an integer as an argument to the `./run.sh` command. For instance, to run stage ```3``` for the example dataset, you would use the following command:
+As you can see, ```BEAGR``` includes the subfolders ```data_BEA_CS``` (BEA Spontaneous Speech), ```data_BEA_RS``` (BEA Read Speech), ```data_GR_CS``` (GRASS Conversational Speech) and ```data_GR_RS``` (GRASS Read Speech). **Please make sure that folders are named like this: ```data_{corpus}_{speakingstyle}```**. The audio files should have a sampling rate of 16kHz and can be .wav or .flac files. Given this structure, and after installing/preparing all dependencies (see below), you should be able to run the experiment. To run a specific stage of the script for a specific dataset, pass the directory where all your data is stored (here ```BEAGR```) and the stage number as arguments to the `./run.sh` command. For instance, to run stage ```3``` for the example dataset, you would use the following command:
 
 ```
 ./run.sh BEAGR 3
 ```
 
-The command automatically generates the experiment folder ```exp_BEAGR```.
+The command automatically generates the experiment folder ```exp_BEAGR```. Stage ```0``` runs all stages in sequence.
 
 ## Reproduction
 The following steps are necessary to reproduce the experiment. At first you need to create a conda envrionment and install the necessary packages. Second you have to  clone the fairseq repository and modify the file ```path.sh``` to export necessary environment variables. 
@@ -48,6 +48,8 @@ conda create -n speechcodebookanalysis python=3.8
 conda activate speechcodebookanalysis
 pip install fairseq
 pip install matplotlib
+pip install scikit-learn
+pip install faiss-cpu
 ```
 
 ### Fairseq Repository
-- 
GitLab