diff --git a/README.md b/README.md
index b964e8be0c05360f9ecc2443deed164e8154d6e0..a088e7110bab5ac2bf5cb1451ddd7299e79c09d0 100644
--- a/README.md
+++ b/README.md
@@ -15,12 +15,34 @@ Stay tuned for updates and we appreciate your interest in our work. Please conti
 - matplotlib
 
 ## Repository structure
+The repository includes data folders which you need to prepare. It also includes example files from the BEA corpus (Hungarian) and the GRASS corpus (Austrian German), which make it possible to run an example from scratch. The speech data should be stored in the folder ```BEAGR``` and should look like this:
+
+- BEAGR/data_BEACS
+  - Various speaker (spkID1, spkID2, ...) folders
+    - Various .wav files
+- BEAGR/data_BEARS
+  - Various speaker (spkID1, spkID2, ...) folders
+    - Various .wav files
+- BEAGR/data_GRCS
+  - Various speaker (spkID1, spkID2, ...) folders
+    - Various .wav files
+- BEAGR/data_GRRS
+  - Various speaker (spkID1, spkID2, ...) folders
+    - Various .wav files
+
+As you can see, ```BEAGR``` includes the subfolders ```data_BEACS``` (BEA Spontaneous Speech), ```data_BEARS``` (BEA Read Speech), ```data_GRCS``` (GRASS Conversational Speech) and ```data_GRRS``` (GRASS Read Speech). Make sure that all spkIDs are unique identifiers to prevent ambiguities (e.g., the same speaker appearing in both the RS and CS components). Given this structure, and after installing/preparing all dependencies, you should be able to run all stages with the command
+
+```
+./run.sh BEAGR stage
+```
+
+where ```stage``` is an integer. The command automatically generates the experiment folder ```exp_BEAGR```.
 
 ## Reproduction
+The following steps are necessary to reproduce the experiment. First, you need to create a conda environment and install the necessary packages. Second, you have to clone the fairseq repository and modify the file ```path.sh``` to export the necessary environment variables.
 
 ### Conda environment
 You need to install the following packages:
-
 ```
 conda create -n speechcodebookanalysis python=3.8
 conda activate speechcodebookanalysis
@@ -29,7 +51,7 @@ pip install matplotlib
 ```
 
 ### Fairseq Repository
-You need to clone the fairseq repository in another directory (e.g., ```../fairseq```).
+You need to clone the fairseq repository to another directory (e.g., ```../fairseq```). The file ```path.sh``` needs to be modified in order to export the necessary environment variables.
 
 ```
 git clone https://github.com/facebookresearch/fairseq.git