
creapy – a Python-based tool for the automatic detection of creak in conversational speech

creapy [ˈkɻiːpʰaɪ] is a tool to detect creak in speech signals.

Prerequisites

Git

The most convenient way to work with creapy is to clone this repository using Git. Find a handy tutorial here.

Python

creapy is written in the programming language Python 3. You can download the most recent version (3.10 as of 02/10/2023) for your operating system from the official Python website, python.org.

If you are a Windows user, make sure to tick the checkbox Add Python 3.10 to PATH during installation, so that the python command is available globally in your terminal.

To check whether Python was installed successfully, open a new terminal: press the Windows key, type cmd and hit Enter.

A new terminal window opens where you can type python --version, which should print the version of Python installed on your system.

macOS users can do the same in their terminal by typing python3 --version instead.

JupyterLab

We recommend using JupyterLab, which provides an interactive Python environment and is also needed to open the demo notebook at examples/creapy_demo.ipynb. To install it, type

pip install jupyterlab

in your terminal or

pip3 install jupyterlab

if you are on macOS. You can then start JupyterLab from your terminal by typing

jupyter lab

Installation

creapy will be available on PyPI once published.

Until then, you can install it by cloning the repository with Git.

Before cloning, navigate in your terminal to the directory where the repository should be placed, e.g. C:/Users/myusername/<path_where_repo_will_be_cloned_into>. You can then clone the repository either via SSH

git clone git@gitlab.tugraz.at:speech/creapy.git

or via HTTPS

git clone https://gitlab.tugraz.at/speech/creapy.git

After cloning, a new folder should be present: C:/Users/myusername/<path_to_creapy_repository>.

To install creapy, navigate into the new folder in your terminal (the command cd <folder> changes the current directory to <folder>) and execute

pip install -e .

Disclaimer

Please note that creapy modifies your hand-labelled .TextGrid files in an automated fashion. While the tool should only add additional tiers to your TextGrid, these tiers can accumulate quickly, especially when experimenting with the tool and processing whole folders. Make sure to copy your files beforehand so that you always have a backup of the originals.
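Since creapy edits TextGrid files in place, it can pay off to script the backup step. Below is a minimal sketch (not part of creapy; the function name backup_textgrids is made up for illustration) that copies all .TextGrid files from a folder into a backup folder:

```python
import shutil
from pathlib import Path


def backup_textgrids(textgrid_dir, backup_dir):
    """Copy every .TextGrid file in textgrid_dir to backup_dir.

    Illustrative helper, not part of creapy: run it once before
    processing so the original annotations survive any automated edits.
    """
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for tg in sorted(Path(textgrid_dir).glob("*.TextGrid")):
        shutil.copy2(tg, backup_dir / tg.name)  # copy2 keeps timestamps
        copied.append(tg.name)
    return copied
```

Calling backup_textgrids("textgrid_directory", "textgrid_backup") before any creapy run gives you a folder to restore from if the added tiers get out of hand.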

Basic Usage

Classifying an audio file

After importing creapy with

import creapy

you can classify creak in your own audio file by calling the function process_file with the paths to the audio file and its corresponding TextGrid file:

X_test, y_pred, sr = creapy.process_file(
    audio_path='<path_to_your_audio_file>', 
    textgrid_path='<path_to_your_textgrid_file>')

creapy will add a new tier to the file at textgrid_path containing the detected creak intervals. Note: We recommend working on a copy of your original TextGrid file.

Choosing different models

Depending on the speaker, you can choose a different pre-trained model via the parameter gender_model. By default, creapy uses the model trained on both male and female speakers (all). The following example runs the classification with the model trained on female speakers only:

X_test, y_pred, sr = creapy.process_file(
    audio_path='<path_to_your_audio_file>', 
    textgrid_path='<path_to_your_textgrid_file>',
    gender_model='female')

process_file returns a tuple: the first element (X_test) contains the calculated feature values for each block, the second element (y_pred) the calculated creak probability per block, and the third element (sr) the sampling rate in samples per second.
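If you want to inspect the detected regions programmatically rather than in Praat, you could threshold y_pred yourself. The sketch below assumes y_pred behaves like a per-block sequence of creak probabilities with consecutive blocks hop_size seconds apart; creak_intervals is a hypothetical helper, not a creapy function, and it ignores block overlap:

```python
def creak_intervals(y_pred, hop_size=0.01, threshold=0.75):
    """Convert per-block creak probabilities into (start, end) times.

    Hypothetical post-processing sketch, not part of creapy: it assumes
    block i starts at i * hop_size seconds and treats every run of
    blocks at or above the threshold as one creak interval.
    """
    intervals = []
    start = None  # index of the first block in the current creaky run
    for i, p in enumerate(y_pred):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            intervals.append((start * hop_size, i * hop_size))
            start = None
    if start is not None:  # close a run that lasts until the end
        intervals.append((start * hop_size, len(y_pred) * hop_size))
    return intervals
```

The resulting list of (start, end) pairs can then be compared against the intervals creapy wrote into the TextGrid tier.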

The TextGrid file that is saved to textgrid_path could look like this in Praat:

Modified TextGrid in Praat with new creak tier

Classifying a folder containing multiple audio files

You can perform the classification on a whole folder using the function process_folder.

creapy.process_folder(
    audio_directory='<path_to_your_audio_directory>',
    textgrid_directory='<path_to_your_textgrid_directory>')

Here the folder structure is expected to be as follows:

audio_directory
├── speaker_1.wav
├── speaker_2.wav
├── ...
└── speaker_n.wav

textgrid_directory
├── speaker_1.TextGrid
├── speaker_2.TextGrid
├── ...
└── speaker_n.TextGrid

i.e. the folders must be flat (no subdirectories), and each TextGrid file must share its base name with the corresponding audio file. Note: The audio directory and TextGrid directory can be the same, but the names and suffixes must match exactly.
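Because process_folder pairs files purely by name, a quick sanity check before running it can save time. The helper below is an illustrative sketch (unmatched_audio_files is not a creapy function) that lists audio files which have no same-named TextGrid:

```python
from pathlib import Path


def unmatched_audio_files(audio_dir, textgrid_dir,
                          audio_suffix=".wav", textgrid_suffix=".TextGrid"):
    """Return the stems of audio files that lack a same-named TextGrid.

    Illustrative helper, not part of creapy: checks the flat folder
    layout that process_folder expects before you run it.
    """
    textgrid_stems = {p.stem for p in Path(textgrid_dir).glob(f"*{textgrid_suffix}")}
    return sorted(p.stem for p in Path(audio_dir).glob(f"*{audio_suffix}")
                  if p.stem not in textgrid_stems)
```

An empty return value means every audio file has a matching TextGrid; otherwise the returned stems tell you which files to fix or rename first.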

Set your configuration/Change parameters

All of the functions above use the default configuration. If you want to change various parameters, you can do so in the configuration file, which is saved to C:/Users/myusername/.creapy/config.yaml after you have called the function set_config once:

creapy.set_config()

A detailed description of the configurable parameters is given in the following table:

| Parameter | Explanation | Default |
|---|---|---|
| audio_directory | Path to the directory containing audio files, e.g. C:/Users/myusername/Documents/audio. | null |
| textgrid_directory | Path to the directory containing the respective TextGrid files, e.g. C:/Users/myusername/Documents/textgrid. Optional; creapy also works without TextGrid files. | null |
| csv_directory | Path to a folder where the csv files containing the classification results are stored, e.g. C:/Users/myusername/Documents/results. | null |
| audio_start | Start time in seconds from which the audio file is analysed. | 0 |
| audio_end | End time in seconds until which the audio file is analysed (-1 processes the file until the end). | -1 |
| audio_suffix | Suffix of the audio file(s); compressed formats like .mp3 are not supported. | .wav |
| textgrid_suffix | Suffix of the TextGrid file(s). | .TextGrid |
| gender_model | The model used for creak classification: all, male or female. We recommend the all model for male speakers and the female model for female speakers. | all |
| tier_name | Name of the new tier with creapy's annotations in the TextGrid file. | creapy |
| block_size | Classification block size in seconds. Smaller blocks are computationally more expensive but achieve a better time resolution. | 0.04 |
| hop_size | Classification hop size in seconds. Should satisfy block_size/4 ≤ hop_size ≤ block_size/2. | 0.01 |
| creak_threshold | Probability threshold above which a block is classified as creak (a decimal value between 0 and 1). If you get too many false positives, try increasing this value; if a lot of creaky voice gets missed, try decreasing it. | 0.75 |
| zcr_threshold | Threshold for the zero-crossing-rate pre-elimination: blocks with zcr ≥ zcr_threshold are eliminated. For female speakers we achieved better results with higher values around 0.10-0.18; for male speakers values around 0.10-0.15 work well. Note: this is highly speaker-dependent. | 0.10 |
| ste_threshold | Threshold for the short-term-energy pre-elimination. It mostly eliminates blocks of silence or noise and rarely needs to be changed. | 1e-5 |
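To make the relationship between block_size and hop_size concrete, here is the arithmetic for the default values; the exact block handling inside creapy may differ slightly at the file edges, so treat the block count as an approximation:

```python
# Illustrative arithmetic for the default analysis grid (not creapy code).
block_size = 0.04  # seconds per classification block
hop_size = 0.01    # seconds between consecutive block starts

# recommended range: block_size/4 <= hop_size <= block_size/2
assert block_size / 4 <= hop_size <= block_size / 2

# approximate number of blocks in a 10-second recording
duration = 10.0
n_blocks = round((duration - block_size) / hop_size) + 1
print(n_blocks)  # 997
```

With the defaults, consecutive blocks overlap by 30 ms, which is what buys the finer time resolution mentioned above.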

You can change these parameters in the config.yaml file itself or by using the function set_config, e.g.

creapy.set_config(block_size=0.05, creak_threshold=0.7)
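For orientation, a config.yaml with the parameters from the table might look roughly like the following. The actual layout (key nesting, section names) depends on your creapy version, so treat this as a sketch rather than the canonical format:

```yaml
audio_directory: null
textgrid_directory: null
csv_directory: null
audio_start: 0
audio_end: -1
audio_suffix: .wav
textgrid_suffix: .TextGrid
gender_model: all
tier_name: creapy
block_size: 0.04
hop_size: 0.01
creak_threshold: 0.75
zcr_threshold: 0.10
ste_threshold: 1.0e-05
```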

If you want to reset the configuration file, you can do so using the function reset_config, e.g.

creapy.reset_config()

This will set all parameters to their default values as seen in the table above.

The parameters zcr_threshold, ste_threshold and creak_threshold have the largest influence on the result.

Plotting

plotly is not installed as a dependency of creapy, but it is required for the plotting function provided by creapy:

pip install plotly

The function call

creapy.plot(X_test, y_pred, sr)

applied to the example above produces an interactive plot.

In the plot you can inspect specific time ranges and select which features to display; this helps to find good estimates for the parameter values.

How to cite

Please cite the following paper if you use creapy.

@inproceedings{creapy2023,
  title={Creapy: A Python-based tool for the detection of creak in conversational speech},
  author={Paierl, Michael and R{\"o}ck, Thomas and Wepner, Saskia and Kelterer, Anneliese and Schuppler, Barbara},
  booktitle={20th International Congress on Phonetic Sciences},
  year={2023}
}