creapy − a Python-based tool for the automatic detection of creak in conversational speech
creapy [ˈkɻiːpʰaɪ] is a tool to detect creak in speech signals.
Prerequisites
Git
The most convenient way to work with creapy is to clone
this repository using Git. Find a handy tutorial here.
Python
creapy is written in the programming language Python 3. You can download the most recent version (as of 02/10/2023) for your operating system using the following links.
If you are a Windows user, make sure to tick Add Python 3.10 to PATH during installation so that the python keyword, which links to the python executable, is available globally in your terminal.
To check whether you successfully installed Python, open up a new terminal: press the Windows key, type cmd and hit Enter. In the terminal window that opens, type python --version, which should print the version of Python 3.10 installed on your system. macOS users can do the same thing in their terminal by typing python3 --version instead.
JupyterLab
We recommend using JupyterLab, which provides an interactive Python environment and is also needed to open the demo notebook at examples/creapy_demo.ipynb.
To install, type
pip install jupyterlab
in your terminal or
pip3 install jupyterlab
if you are using macOS. You can open jupyter lab in your terminal by typing
jupyter lab
Installation
creapy will be available on PyPI once published. Until then, you can clone the repository with Git.
Before cloning, navigate to your desired directory in the terminal, e.g. C:/Users/myusername/<path_where_repo_will_be_cloned_into>. With the following command you can clone the repository either via SSH
git clone git@gitlab.tugraz.at:speech/creapy.git
or via HTTPS
git clone https://gitlab.tugraz.at/speech/creapy.git
After cloning, a new folder should be present: C:/Users/myusername/<path_to_creapy_repository>.
To finally install creapy, navigate into this new folder in your terminal (the command cd <folder> changes the directory to the folder given by <folder>). Then, execute
pip install -e .
Disclaimer
Please note that creapy modifies your hand-labelled .TextGrid files in an automated fashion. While the tool should generally only add additional tiers to your TextGrid, these new tiers can accumulate quickly, especially when experimenting with the tool and processing whole folders. Make sure to copy your files beforehand so that you have a backup and your original files remain untouched.
Basic Usage
Classifying an audio file
After you have imported creapy with
import creapy
you can classify creak in your own audio file by calling the function process_file with the paths to the audio file and the respective TextGrid file:
X_test, y_pred, sr = creapy.process_file(
audio_path='<path_to_your_audio_file>',
textgrid_path='<path_to_your_textgrid_file>')
creapy will add a new tier to the file at textgrid_path containing the detected creak intervals. Note: we recommend working on a copy of your original TextGrid file.
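Since creapy writes to the TextGrid file in place, you can also create that backup copy directly from Python before calling process_file. This is only a minimal sketch using the standard library; the path is a placeholder that you need to replace.
import shutil

textgrid_path = '<path_to_your_textgrid_file>'  # placeholder path
# keep an untouched copy next to the original before creapy modifies it
shutil.copy2(textgrid_path, textgrid_path + '.bak')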
Choosing different models
Depending on the speaker, you may choose another pre-trained model via the parameter gender_model. By default, creapy uses the model trained on both male and female speakers (=all). This is an example of the classification using a model trained on female speakers only:
X_test, y_pred, sr = creapy.process_file(
audio_path='<path_to_your_audio_file>',
textgrid_path='<path_to_your_textgrid_file>',
gender_model='female')
process_file returns a tuple whose first element (X_test) contains the calculated feature values for each block, whose second element (y_pred) is the calculated creak probability, and whose third element (sr) is the sampling rate in samples per second.
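As a quick sanity check of these return values, you can print the number of analysed blocks and apply the creak threshold yourself. This is a minimal sketch; it assumes that y_pred behaves like a one-dimensional array of per-block creak probabilities, and the paths are placeholders.
import numpy as np
import creapy

X_test, y_pred, sr = creapy.process_file(
    audio_path='<path_to_your_audio_file>',
    textgrid_path='<path_to_your_textgrid_file>')

y_pred = np.asarray(y_pred)
print(f'sampling rate: {sr} samples per second')
print(f'number of analysed blocks: {len(y_pred)}')
# fraction of blocks whose creak probability exceeds the default threshold of 0.75
print(f'creaky blocks: {np.mean(y_pred >= 0.75):.1%}')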
The TextGrid file that is saved to textgrid_path could look like this in Praat:
Modified TextGrid in Praat with the new creak tier
Classifying a folder containing multiple audio files
You can perform the classification on a whole folder using the function process_folder
.
creapy.process_folder(
audio_directory='<path_to_your_audio_directory>',
textgrid_directory='<path_to_your_textgrid_directory>')
Here the folder structure is expected to be as follows:
audio_directory
├── speaker_1.wav
├── speaker_2.wav
├── ...
└── speaker_n.wav
textgrid_directory
├── speaker_1.TextGrid
├── speaker_2.TextGrid
├── ...
└── speaker_n.TextGrid
i.e. the maximum folder depth is 1 and the file names of the TextGrid files are expected to match those of the audio files. Note: the audio directory and the TextGrid directory can be the same, but the file names and suffixes must match exactly.
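Since both directories may be identical, a call on a single folder containing both the .wav and the .TextGrid files could look like the following sketch; the path is a placeholder.
import creapy

# the same folder holds speaker_1.wav, speaker_1.TextGrid, ...
creapy.process_folder(
    audio_directory='<path_to_your_recordings>',
    textgrid_directory='<path_to_your_recordings>')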
Set your configuration/Change parameters
All of the functions above use the default configuration. If you want to change various parameters, you can do so in the configuration file. This file is saved to C:/Users/myusername/.creapy/config.yaml after you have called the function set_config once.
creapy.set_config()
A detailed description of the changeable parameters is given in the following table:
Parameter | Explanation | Default value |
---|---|---|
audio_directory | Path to the directory containing audio files, e.g. C:/Users/myusername/Documents/audio. | null |
textgrid_directory | Path to the directory containing the respective TextGrid files, e.g. C:/Users/myusername/Documents/textgrid. This is optional; the tool also works without TextGrid files. | null |
csv_directory | Path to a folder where the csv files containing the classification results are stored, e.g. C:/Users/myusername/Documents/results. | null |
audio_start | Start time in seconds from which the audio file is analysed. | 0 |
audio_end | End time in seconds up to which the audio file is analysed (-1 processes the file until the end). | -1 |
audio_suffix | Suffix of the audio file(s); compressed audio formats like .mp3 are not supported. | .wav |
textgrid_suffix | Suffix of the TextGrid file(s). | .TextGrid |
gender_model | The gender model chosen for creak classification. Can be all, male or female. We recommend using the all model for male speakers and the female model for female speakers. | all |
tier_name | The name of the new tier with creapy's annotations in the TextGrid file. | creapy |
block_size | Classification block size in seconds. Smaller blocks are computationally more expensive but achieve a better time resolution. | 0.04 |
hop_size | Classification hop size in seconds. Should be in the range $\frac{1}{4}\,\texttt{block\_size}\leq\texttt{hop\_size}\leq\frac{1}{2}\,\texttt{block\_size}$. | 0.01 |
creak_threshold | Probability threshold above which the classifier labels a block as creak. Can be a decimal value between 0 and 1. If you get too many false positives, try increasing this value; if a lot of creaky voice gets missed, try decreasing it. | 0.75 |
zcr_threshold | Threshold for the zero-crossing-rate pre-elimination feature. Blocks with $\text{zcr}\geq\text{zcr\_threshold}$ are eliminated. For female speakers we achieved better results with a higher value around 0.10-0.18; for male speakers a value around 0.10-0.15 yields good results. Note: this is highly speaker dependent. | 0.10 |
ste_threshold | Threshold for the short-term-energy pre-elimination feature. This value does not normally need to be changed; it mostly eliminates blocks of silence or noise. | $1\cdot10^{-5}$ |
You can change these parameters in the config.yaml file itself or by using the function set_config, e.g.
creapy.set_config(block_size=0.05, creak_threshold=0.7)
If you want to reset the configuration file, you can do so using the function reset_config, e.g.
creapy.reset_config()
This will set all parameters to their default values as seen in the table above.
The parameters zcr_threshold, ste_threshold and creak_threshold influence the classification result the most.
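As an illustration, a possible tuning step for a female speaker whose creaky segments are often missed could look like the following sketch; the concrete values are only starting points within the ranges given in the table above, not recommended settings.
import creapy

creapy.set_config(
    gender_model='female',
    zcr_threshold=0.14,    # female speakers: values around 0.10-0.18 often work better
    creak_threshold=0.65)  # lowering the threshold reduces missed creaky segments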
Plotting
While not a dependency of this tool, plotly is needed in order to call the plotting function provided by creapy:
pip install plotly
The function call
creapy.plot(X_test, y_pred, sr)
on the example above results in this plot.
The plot is interactive: you can inspect specific time ranges and select which features to display. In this way, you can obtain a good estimate of suitable parameter values.
How to cite
Please cite the following paper if you use creapy.
@inproceedings{creapy2023,
title={Creapy: A Python-based tool for the detection of creak in conversational speech},
author={Paierl, Michael and R{\"o}ck, Thomas and Wepner, Saskia and Kelterer, Anneliese and Schuppler, Barbara},
booktitle={20th International Congress on Phonetic Sciences},
year={2023}
}