Commit 134269fb authored by Thomas Röck's avatar Thomas Röck

initial commit

Showing
with 1565 additions and 0 deletions
*.wav
.ipynb_checkpoints
__pycache__
creapy.egg-info
!audio/example.wav
.vscode
.python-version
.DS_Store
README.md 0 → 100644
# creapy − a Python-based tool for the automatic detection of creak in conversational speech
creapy [ˈkriːpaɪ] is a tool to detect creak in speech.
## Prerequisites
### git
The most convenient way to work with `creapy` right now is to `clone` this repository using `git`. Find a handy git tutorial [here](https://rogerdudler.github.io/git-guide/).
### python
`creapy` is written in the programming language `Python 3`. You can download the most recent version (as of 02/10/2023) for your operating system using the following links
- [Windows Installer (64-bit)](https://www.python.org/ftp/python/3.10.10/python-3.10.10-amd64.exe)
- [macOS 64-bit universal2 installer](https://www.python.org/ftp/python/3.10.10/python-3.10.10-macos11.pkg)
If you are on Windows, make sure to tick the checkbox **Add Python 3.10 to PATH** during installation. This adds Python to the PATH so that the `python` command, which links to the `python` executable, is available globally in your terminal.
![](https://i.imgur.com/qfydlwD.png)
To check that Python was installed successfully, open a new terminal: press the Windows key, type `cmd` and hit Enter.
In the new terminal window, type `python --version`. This should print the version of Python 3.10 installed on your system.
![](https://i.imgur.com/A3fHfcS.png)
macOS-users can do the same thing in their terminal by typing `python3 --version` instead.
### jupyter lab
We recommend using jupyter lab which provides an interactive python environment and is also needed to open the demo-notebook at `examples/creapy_demo.ipynb`.
To install, type
```bash
pip install jupyterlab
```
in your terminal or
```bash
pip3 install jupyterlab
```
if you are using macOS. You can open jupyter lab in your terminal by typing
```bash
jupyter lab
```
## Installation
`creapy` will be available on PyPI once published.
<!--
```
pip install creapy
```
-->
Until then, you may clone the repository from git.
<!-- For now as creapy is not yet published the git repository can be cloned from Git. -->
Before cloning you should navigate to your desired directory in the terminal like `C:/Users/myusername/<path_where_repo_will_be_cloned_into>`. With the following command you can clone the repository either with ssh
```bash
git clone git@git.spsc.tugraz.at:troeck/ti-project-creak.git
```
or HTTPS
```bash
git clone https://git.spsc.tugraz.at/troeck/ti-project-creak.git
```
After cloning, a new folder should be present: `C:/Users/myusername/<path_to_creapy_repository>`.
To finally install creapy you need to navigate into the new folder using your terminal (The command `cd <folder>` **c**hanges the **d**irectory to a folder given with `<folder>`). To install `creapy` execute
```bash
pip install -e .
```
<!-- To check if the installation was successful you can try to run a python script or a jupyter notebook with:
```python
import creapy
``` -->
## Disclaimer
Please note that `creapy` modifies your hand-labelled `TextGrid` files automatically. While the tool should generally only add tiers to your `TextGrid`, these new tiers can accumulate quickly, especially after experimenting with the tool and processing whole folders. Make sure to **copy** your files beforehand so that you keep an untouched backup of your originals.
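One way to create such a backup is sketched below. The helper `backup_textgrids` and the `_backup` folder name are hypothetical and not part of `creapy`; the sketch only uses the Python standard library and assumes all `TextGrid` files sit directly in one folder.

```python
from pathlib import Path
import shutil


def backup_textgrids(textgrid_directory, suffix=".TextGrid"):
    """Copy every TextGrid file into a sibling '<name>_backup' folder.

    Hypothetical helper for illustration, not part of creapy.
    """
    src = Path(textgrid_directory).resolve()
    dst = src.with_name(src.name + "_backup")
    dst.mkdir(exist_ok=True)
    copied = []
    for file_ in sorted(src.glob(f"*{suffix}")):
        # copy2 also preserves the timestamps of the original file
        shutil.copy2(file_, dst / file_.name)
        copied.append(dst / file_.name)
    return copied
```

Calling `backup_textgrids('<path_to_your_textgrid_directory>')` before processing leaves the originals untouched in the backup folder.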
## Basic Usage
### Classifying an audio file
After you imported `creapy` with
```python
import creapy
```
you can classify creak in your own audio file by calling the function `process_file` with the paths to the audio file and the corresponding TextGrid file
```python
X_test, y_pred, sr = creapy.process_file(
    audio_path='<path_to_your_audio_file>',
    textgrid_path='<path_to_your_textgrid_file>')
```
`creapy` will add a new tier to the file at `textgrid_path` containing the detected creak intervals. **Note**: We recommend working on a copy of your original TextGrid file.
### Choosing different models
Depending on the speaker, you may choose another pre-trained model via the parameter `gender_model`. By default, `creapy` uses the model trained on both male and female speakers (=`all`). The following example classifies with a model trained on female speakers only
```python
X_test, y_pred, sr = creapy.process_file(
    audio_path='<path_to_your_audio_file>',
    textgrid_path='<path_to_your_textgrid_file>',
    gender_model='female')
```
`process_file` returns a `tuple`: the first element (`X_test`) contains the calculated feature values for each block, the second element (`y_pred`) is the calculated creak probability for each block, and the third element (`sr`) is the sampling rate in samples per second.
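To illustrate how per-block probabilities such as `y_pred` relate to creak intervals, here is a minimal sketch. It is not `creapy`'s own interval detection: the function name `creak_intervals` is hypothetical, and it assumes that block `i` starts at `i * hop_size` seconds and ignores any minimum-length or gap handling.

```python
import numpy as np


def creak_intervals(y_pred, hop_size=0.01, creak_threshold=0.75):
    """Group consecutive blocks above the threshold into (start, end) times.

    Illustrative only: block i is assumed to start at i * hop_size seconds.
    """
    above = np.asarray(y_pred) >= creak_threshold
    # pad with False so that every run has a rising and a falling edge
    padded = np.concatenate(([False], above, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    starts, ends = edges[::2], edges[1::2]
    return [(float(s * hop_size), float(e * hop_size))
            for s, e in zip(starts, ends)]
```

Raising `creak_threshold` in this sketch shortens or removes intervals, which mirrors the effect the parameter of the same name is described to have in the configuration table.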
<!-- The calculated creak probability is shown in the following plot:
|![](examples/creapy_creak_probability_example.png 'creak probability')|
| - |
|*Creak probability over time (blue) and hand labelled intervals (red)*| -->
<!-- The function `get_time_vector` returns an array containing the timesteps for each block in seconds. -->
The `TextGrid` File written at `textgrid_path` could look like this in Praat:
|![](examples/creapy_creak_example_praat.PNG 'creak probability')|
| :-: |
|*Modified TextGrid in Praat with new creak tier*|
### Classifying a folder containing multiple audio files
You can perform the classification on a whole folder using the function `process_folder`.
```python
creapy.process_folder(
    audio_directory='<path_to_your_audio_directory>',
    textgrid_directory='<path_to_your_textgrid_directory>')
```
Here the folder structure is expected to be as follows:
```
audio_directory
├── speaker_1.wav
├── speaker_2.wav
├── ...
└── speaker_n.wav
textgrid_directory
├── speaker_1.TextGrid
├── speaker_2.TextGrid
├── ...
└── speaker_n.TextGrid
```
i.e. the maximum folder level is 1 and each `TextGrid` file is expected to have the same file name (without suffix) as its audio file. **Note**: The audio directory and the TextGrid directory can be the same, but the file names and suffixes must match exactly.
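Before processing a folder, it can help to check that every audio file has a matching `TextGrid`. The following sketch does this by comparing file names without suffix; `find_missing_textgrids` is a hypothetical helper, not a `creapy` function, and the default suffixes are taken from the configuration table.

```python
from pathlib import Path


def find_missing_textgrids(audio_directory, textgrid_directory,
                           audio_suffix=".wav", textgrid_suffix=".TextGrid"):
    """Return audio files that have no TextGrid with the same file name.

    Hypothetical pre-flight check, not part of creapy.
    """
    textgrid_dir = Path(textgrid_directory)
    missing = []
    for audio_file in sorted(Path(audio_directory).glob(f"*{audio_suffix}")):
        # speaker_1.wav is expected to pair with speaker_1.TextGrid
        expected = textgrid_dir / (audio_file.stem + textgrid_suffix)
        if not expected.is_file():
            missing.append(audio_file)
    return missing
```

An empty list means that every audio file in the folder has a counterpart.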
## Set your configuration/Change parameters
All of the functions above use the default configuration. If you want to change parameters, you can do so in the configuration file. This file is saved to `C:/Users/myusername/.creapy/config.yaml` after you have **called the function** `set_config` **once**.
```python
creapy.set_config()
```
A detailed description of the changeable parameters is given in the following table:
| Parameter | Explanation | Default value |
| -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ | --- |
| `audio_directory` | Path to the directory containing audio files, e.g. `C:/Users/myusername/Documents/audio`. | `null` |
| `textgrid_directory` | Path to the directory containing the respective `TextGrid` files, e.g. `C:/Users/myusername/Documents/textgrid`. This is optional; the classification also works without `TextGrid` files. | `null` |
| `csv_directory` | Path to a folder where the csv files containing the classification results should be stored, e.g. `C:/Users/myusername/Documents/results`. | `null` |
| `audio_start` | Start time in seconds from which the audio file should be analysed. | $0$ |
| `audio_end` | End time in seconds until which the audio file should be analysed (if -1, the file gets processed until the end). | $-1$ |
| `audio_suffix` | Suffix of the audio file(s) (compressed audio formats like `.mp3` are not supported). | `.wav` |
| `textgrid_suffix` | Suffix of the `TextGrid` file(s). | `.TextGrid` |
| `gender_model` | The gender model chosen for creak classification. Can be `all`, `male` or `female`. We recommend using the `all` model for male speakers and the `female` model for female speakers. | `all` |
| `tier_name` | The name of the new tier with creapy's annotations in the `TextGrid` file. | `creapy` |
| `block_size` | Classification block size in seconds. Smaller blocks are computationally more expensive but achieve a better time resolution. | $0.04$ |
| `hop_size` | Classification hop size in seconds. Should be in the range of $\frac{1}{4}\texttt{block\_size}\leq\texttt{hop\_size}\leq\frac{1}{2}\texttt{block\_size}$. | $0.01$ |
| `creak_threshold` | Probability threshold of the classifier above which creak is classified. Can be a decimal value between 0 and 1. If you get too many false positives, try increasing this value. If a lot of creaky voice gets missed, try decreasing it. | $0.75$ |
| `zcr_threshold` | Threshold for the zero-crossing-rate pre-elimination feature. Blocks with $\text{zcr}\geq\text{zcr\_threshold}$ get eliminated. For female speakers we achieved better results with a higher value of around 0.08-0.12. For male speakers a value <0.08 will yield good results. **Note:** This is highly speaker dependent. | $0.08$ |
| `ste_threshold` | Threshold for the short-term-energy pre-elimination feature. This value does not normally need to be changed. It mostly eliminates blocks of silence or noise. | $1\cdot10^{-5}$ |
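The recommended range for `hop_size` can be verified with a few lines of Python. This is only a sketch: `check_hop_size` is a hypothetical helper, not part of `creapy`, and it uses a small floating-point tolerance at the boundaries.

```python
import math


def check_hop_size(block_size=0.04, hop_size=0.01):
    """Return True if block_size/4 <= hop_size <= block_size/2.

    The boundaries are compared with a float tolerance.
    """
    lower, upper = block_size / 4, block_size / 2
    above_lower = hop_size > lower or math.isclose(hop_size, lower)
    below_upper = hop_size < upper or math.isclose(hop_size, upper)
    return above_lower and below_upper
```

With the default values, `hop_size` sits exactly on the lower bound of the range, so the check passes.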
You can change these parameters in the `config.yaml` file itself or by using the function `set_config`, e.g.
```python
creapy.set_config(block_size=0.05, creak_threshold=0.7)
```
If you want to reset the configuration file, you can do so using the function `reset_config`, e.g.
```python
creapy.reset_config()
```
This will set all parameters to their default values as seen in the table [above](#set-your-configurationchange-parameters).
The parameters `zcr_threshold`, `ste_threshold` and `creak_threshold` have the largest influence on the result.
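To make the effect of `zcr_threshold` and `ste_threshold` more tangible, the pre-elimination described in the table can be sketched as follows. This is a simplified illustration and not `creapy`'s implementation: the function names are chosen for this example, the zero-crossing rate is normalised by its maximum over all blocks before thresholding, and the guard against an all-zero maximum is an addition of this sketch.

```python
import numpy as np


def zero_crossing_rate(block):
    signs = np.sign(block)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return 0.5 * np.sum(np.abs(np.diff(signs))) / block.shape[0]


def short_term_energy(block):
    return np.sum(block ** 2) / block.shape[0]


def keep_mask(blocks, zcr_threshold=0.08, ste_threshold=1e-5):
    """Boolean mask of the blocks that survive the pre-elimination.

    blocks: array of shape (num_blocks, block_length).
    """
    zcr = np.array([zero_crossing_rate(b) for b in blocks])
    ste = np.array([short_term_energy(b) for b in blocks])
    if zcr.max() > 0:
        zcr = zcr / zcr.max()  # normalise by the maximum over all blocks
    # eliminated: zcr >= zcr_threshold or ste <= ste_threshold
    return (zcr < zcr_threshold) & (ste > ste_threshold)
```

In this sketch a silent block fails the energy test, a noise-like block with many sign changes fails the zero-crossing test, and a low-frequency voiced-like block passes both.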
<!-- With the `plot` function those parameters get visualized.
```python!
creapy.plot(X_test, y_pred, sr)
``` -->
### Plotting
While not a dependency of this tool, `plotly` is needed to invoke the plotting function provided by `creapy`:
```bash
pip install plotly
```
The function call
```python
creapy.plot(X_test, y_pred, sr)
```
on the example results in this plot
![](https://i.imgur.com/1nZUR8L.png)
The plot is interactive: you can zoom into specific time ranges and select the features to display. This helps to find good values for the parameters.
<!-- ## Useful links (TEMP)
If you have a problem with setting up a SSH connection to GitHub the following tutorial should help:
https://docs.github.com/en/authentication/connecting-to-github-with-ssh
We recommend working in a Jupyter environment. If you do not have one yet, the following link might help you:
https://jupyter.org/install
Python (do we need this?)
https://wiki.python.org/moin/BeginnersGuide/Download -->
<!-- On Windows, replace the Backslashes `\` or `\\` in your path with single slashes `/` so that you get something like `C:/Users/myusername/Documents/creak` -->
<!-- ### Classify audio file
You can use the pretrained models to detect creak in your speech signals. For more information on the training process, see [Pretrained Model](#pretrained-model).
1. change paths as described [above](#set-your-configuration).
2. run script `xyz.py` and wait; note: takes approx. 3 times as long as the audio file on a computer with processor xyz, ...; so for longer sound files, you need to be patient -->
File added
from .feature_extraction import *
from .utils import *
from .model import *
__version__ = '0.0.2'
# USER:
# audio_directory: null
# textgrid_directory: null
# csv_directory: null
# audio_start: 0
# audio_end: -1
# audio_suffix: .wav
# textgrid_suffix: .TextGrid
# gender_model: all
# tier_name: creapy
# block_size: 0.040 # seconds
# hop_size: 0.010 # in seconds
# creak_threshold: 0.75
# zcr_threshold: 0.08
# ste_threshold: 0.00001
DEFAULT:
  sample_dir: audio/samples
  audio_suffix: .wav # file name extension of audio file(s)
  audio_re: (?P<speakers>[a-zA-Z\d]+)_(?P<single_speaker>\d{3}[a-zA-Z])_.+\.wav # [optional] define regex to match your files
  re_speaker_group: single_speaker # ??
FEATURE_EXTRACTION: # choose which features should be extracted
  cpp: false
  hnr: true
  jitter: true
  h1h2: true
  shimmer: true
  f0mean: true
  zcr: true
  ste: true
  VALUES: # set parameters/thresholds for the features
    HNR: 1
    JITTER: 1
    H1H2: 1
    SHIMMER: 1
    F0MEAN: # fundamental frequency
      fmin: 60 # pitch range
      fmax: 500 # pitch range
    ZCR:
      window: hamming
    STE:
      window: hamming
CLASSIFICATION: # set parameters for the (random forest) classification
  impute_strategy: median # mean, most_frequent, null
  random_state: 42 # int, null
  test_size: 0.33 # float
MODEL:
  PREPROCESSING:
    impute_strategy: median # mean, most_frequent, null
    impute_at_fit: true
    test_size: 0.33 # int, float
    block_size: 0.040 # in seconds
    hop_size: 0.010 # in seconds
    random_state: 42
    UNVOICED_EXCLUSION:
      zcr: true
      ste: true
      VALUES:
        ZCR:
          threshold: 0.08
          replace_value: 0.0
          operator: ">="
          normalize: true
        STE:
          threshold: 0.00001
          replace_value: 0.0
          operator: "<="
          normalize: false
  POSTPROCESSING:
    MAVG:
      mavg: false
      VALUES:
        length: 3
        mode: same
    INTERVALS:
      creak_threshold: 0.75
      min_creak_length: 0.03
      max_gap: 0.015
  CLASSIFIER:
    clf: rfc # rfc, mlp
    VALUES:
      RFC:
        kwargs:
          n_estimators: 99
          random_state: 42 # int, null
      MLP:
        kwargs:
          solver: lbfgs
          hidden_layer_sizes: [5, 5, 5, 2]
          random_state: 42
          max_iter: 500
    predict_proba: true # true, false
    target_name: "c"
  FEATURES:
    for_classification: ['hnr', 'jitter', 'h1h2', 'shimmer', 'f0mean']
  model_location: model/training_models/model
  save_pickle: false
  target_label: class
PRAAT:
  creak_tier_name: creapy
  interval_text: c
  creak_re: (?P<speaker>[a-zA-Z\d]+)-creak
  creak_fstr: '{}-creak'
from .feature_extraction import calculate_features, calculate_features_for_folder, get_feature_list, WINDOW_MAPPING, blockwise_feature_calculation
from __future__ import annotations
from pathlib import Path
from typing import Optional
import numpy as np
# import opensmile
import pandas as pd
import parselmouth as pm
import soundfile as sf
from scipy import interpolate
from scipy.signal.windows import hamming, hann, kaiser
from rich.progress import track
from ..utils import get_config
def _cpp(data: np.ndarray, sound: pm.Sound, sr: int,
         config: Optional[dict] = None) -> float:
    """Calculates the cepstral peak prominence using praat

    Args:
        data (np.ndarray): The sound data
        sound (pm.Sound): The sound data as parselmouth Sound object
        sr (int): The sampling rate
        config (dict, optional): The default configuration file. Defaults to None.

    Returns:
        float: The cepstral peak prominence
    """
    if config is None:
        config = get_config()
    spectrum = sound.to_spectrum()
    power_cepstrum = pm.praat.call(spectrum, 'To PowerCepstrum')
    *args, = map(config["FEATURE_EXTRACTION"]["VALUES"]["CPP"].get, [
        "fmin", "fmax", "interpolation", "qmin", "qmax", "trend_type", "fit_method"
    ])
    cpp = pm.praat.call(power_cepstrum, "Get peak prominence...", *args)
    return cpp


def _h1_h2(data: np.ndarray, sound: pm.Sound, sr: int,
           config: Optional[dict] = None) -> float:
    # if config is None:
    #     config = get_config()
    try:
        pitch = sound.to_pitch(sound.duration)
        spectrum = sound.to_spectrum()
        h1 = pitch.selected_array[0][0]
        h2 = h1 * 2
        x = np.arange(0, spectrum.nf) * spectrum.df
        y = np.sqrt(spectrum.values[0]**2 + spectrum.values[1]**2)
        f = interpolate.interp1d(x, y, 'quadratic')
        h1_amp = f(h1)
        h2_amp = f(h2)
    except Exception:
        # the feature is undefined for this block, e.g. if no pitch was found
        [h1_amp, h2_amp] = [np.nan, np.nan]
    return h1_amp - h2_amp


def _hnr(data: np.ndarray, sound: pm.Sound, sr: int,
         config: Optional[dict] = None) -> float:
    try:
        harmonicity = sound.to_harmonicity()
    except pm.PraatError:
        hnr = np.nan
    else:
        # taken from
        # https://parselmouth.readthedocs.io/en/stable/examples/batch_processing.html?highlight=harmonicity#Batch-processing-of-files
        hnr = harmonicity.values[harmonicity.values != -200].mean()
    return hnr
def _jitter(data: np.ndarray, sound: pm.Sound, sr: int,
            config: Optional[dict] = None) -> float:
    try:
        pointProcess = pm.praat.call(
            sound, "To PointProcess (periodic, cc)", 75, 500)
        local_jitter = pm.praat.call(
            pointProcess, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)
    except Exception:
        local_jitter = np.nan
    return local_jitter


def _shimmer(data: np.ndarray, sound: pm.Sound, sr: int,
             config: Optional[dict] = None) -> float:
    try:
        pointProcess = pm.praat.call(
            sound, "To PointProcess (periodic, cc)", 75, 500)
        local_shimmer = pm.praat.call(
            [sound, pointProcess], "Get shimmer (local)", 0, 0, 0.0001, 0.02, 1.3, 1.6)
    except Exception:
        local_shimmer = np.nan
    return local_shimmer


def _f0mean(data: np.ndarray, sound: pm.Sound, sr: int,
            config: Optional[dict] = None) -> float:
    try:
        pitch = sound.to_pitch(sound.duration).selected_array[0][0]
    except Exception:
        pitch = np.nan
    return pitch


def _zcr(data: np.ndarray, sound: pm.Sound, sr: int,
         config: Optional[dict] = None):
    """
    calculates the Zero-Crossing-Rate (ZCR)
    @param data: one block of signal data with blocklength N, array of shape (N,)
    @return Zero-Crossing-rate
    """
    # if config is None:
    #     config = get_config()
    # w = WINDOW_MAPPING[config["FEATURE_EXTRACTION"]
    #                    ["VALUES"]["ZCR"]["window"]](data.shape[0])
    # sgn = lambda data: 1 if data >= 0 else -1
    sign_arr = np.sign(data)
    sign_arr[sign_arr == 0] = 1  # count exact zeros as positive samples
    N = data.shape[0]
    return 0.5 * np.sum(np.abs(np.diff(sign_arr))) / N


def _ste(data: np.ndarray, sound: pm.Sound, sr: int,
         config: Optional[dict] = None):
    """
    calculates the short-term-energy (STE)
    @param data: one block of signal data with blocklength N, array of shape (N,)
    @return short-term-energy
    """
    # assert len(w) == x.shape[0], "Dimension Mismatch: Windowlength != blocklength"
    # assert len(x.shape) == 2, "Signal must be already buffered (2D)"
    # assert len(w.shape) == 1, "Window must be 1D"
    # if config is None:
    #     config = get_config()
    # if w is None:
    #     w = WINDOW_MAPPING[config["FEATURE_EXTRACTION"]
    #                        ["VALUES"]["STE"]["window"]](data.shape[0])
    N = data.shape[0]
    return np.sum(data ** 2) / N
    # return np.sum(np.abs(data)) / N
def get_feature_list(config: Optional[dict] = None) -> list[str]:
    if config is None:
        config = get_config()
    return [key for key, value in config["FEATURE_EXTRACTION"].items() if value is True]


def calculate_features(data: np.ndarray, sr: int,
                       return_header: bool = False,
                       config: Optional[dict] = None,
                       features: Optional[list[str]] = None) -> tuple[np.ndarray, list] | np.ndarray:
    if config is None:
        config = get_config()
    if features is None:
        features = get_feature_list(config)
    sound = pm.Sound(values=data, sampling_frequency=sr)
    result = np.array([FEATURE_MAPPING[feature](data, sound, sr, config=config)
                       for feature in features])
    return (result, features) if return_header is True else result


def blockwise_feature_calculation(data: np.ndarray, sr,
                                  feature, config: Optional[dict] = None):
    if config is None:
        config = get_config()
    sounds = [pm.Sound(values=block, sampling_frequency=sr) for block in data]
    function = FEATURE_MAPPING[feature]
    # for block, sound in zip(track(data, description=f"Calculating {feature}"), sounds):
    #     res.append(function(block, sound, sr))
    res = [function(block, sound, sr) for block, sound in zip(data, sounds)]
    return np.array(res)


def calculate_features_for_folder(path: str,
                                  file_suffix: str = ".wav",
                                  features=None) -> tuple[pd.DataFrame, pd.Series]:
    files = list(Path(path).glob(f"**/*{file_suffix}"))
    config = get_config()
    if features is None:
        features = get_feature_list(config)
    feature_matrix = np.array(
        [calculate_features(*sf.read(file_), config=config,
                            features=features) for file_ in files]
    )
    # TODO put regex in config file
    # the target label is the last underscore-separated part of the file name
    target_vector = np.array(
        [str(x).split('.')[0].split('_')[-1] for x in files])
    return pd.DataFrame(feature_matrix, columns=features), pd.Series(target_vector)


FEATURE_MAPPING = {
    "cpp": _cpp,
    "hnr": _hnr,
    "h1h2": _h1_h2,
    "jitter": _jitter,
    "shimmer": _shimmer,
    "f0mean": _f0mean,
    "zcr": _zcr,
    "ste": _ste,
}
WINDOW_MAPPING = {
    "hann": hann,
    "kaiser": kaiser,
    "hamming": hamming,
    "rect": lambda N: np.ones((N)) / N
}
from .model import Model, load_model
from .preprocessing import impute, split_data, buffer
from .postprocessing import moving_average
from .classify import process_file, process_folder
from __future__ import annotations
import time
import warnings
from functools import partial
from pathlib import Path
# from threading import Thread
from typing import Optional
import os
import numpy as np
import pandas as pd
from rich.progress import track
from scipy.signal.windows import hann
from ..feature_extraction import WINDOW_MAPPING
from ..feature_extraction.feature_extraction import (
blockwise_feature_calculation, calculate_features, get_feature_list)
from ..utils.config import get_config
from ..utils.helpers import (ThreadWithReturnValue, get_creak_intervals,
get_root, get_time_vector, intervals_to_textgrid,
intervals_to_csv)
from ..utils.read_wav import read_wav
from .model import load_model
from .postprocessing import thresholding
from .preprocessing import buffer
def process_file(audio_path,
                 textgrid_path: Optional[str] = None,
                 csv_folder_path: Optional[str] = None,
                 gender_model: Optional[str] = None):
    _config = get_config()
    verbosity = _config['USER']['verbose']
    start, end = _config['USER']['audio_start'], _config['USER']['audio_end']
    data, sr = read_wav(audio_path, start=start, end=end)
    w = hann(int(_config['USER']["block_size"] * sr))
    creak_data_buff = buffer(data, sr, window=w)

    # features used to exclude unvoiced blocks before the classification
    PREPROCESSING_FEATURES = [key for key, val in _config['MODEL']
                              ['PREPROCESSING']['UNVOICED_EXCLUSION'].items() if val is True]
    t = time.time()
    threads = [ThreadWithReturnValue(target=blockwise_feature_calculation, args=[
        creak_data_buff.T.copy(), sr, feature, _config]) for feature in PREPROCESSING_FEATURES]
    for thread in threads:
        thread.start()
    elimination_chunks = np.array([thread.join() for thread in threads]).T
    preprocessing_values = _config["MODEL"]['PREPROCESSING']['UNVOICED_EXCLUSION']["VALUES"]
    preprocessing_values['ZCR']['threshold'] = _config['USER']['zcr_threshold']
    preprocessing_values['STE']['threshold'] = _config['USER']['ste_threshold']
    thresholds = np.array(
        [thresholding(series=chunk, **preprocessing_values[feature.upper()])
         for chunk, feature in zip(elimination_chunks.T, PREPROCESSING_FEATURES)]
    )
    # a block is kept only if none of the exclusion criteria applies
    included_indices = thresholds.sum(axis=0) == 0

    features_for_classification = _config["MODEL"]["FEATURES"]["for_classification"]
    t = time.time()
    threads = [ThreadWithReturnValue(target=blockwise_feature_calculation, args=[
        creak_data_buff.T[included_indices].copy(), sr, feature, _config.copy()]) for feature in features_for_classification]
    for thread in threads:
        thread.start()
    _X_test = np.array([thread.join() for thread in threads]).T
    # print("time ellapsed:", time.time() - t)
    _X_test = pd.DataFrame(_X_test, columns=features_for_classification,
                           index=np.argwhere(included_indices).ravel())
    X_test = pd.concat((pd.DataFrame(elimination_chunks,
                                     columns=PREPROCESSING_FEATURES), _X_test), axis=1)
    y_pred = np.zeros((creak_data_buff.shape[1]))
    if not any(included_indices):
        warnings.warn("Did not make classification. Consider setting new values "
                      "for Zero-Crossing-Rate (zcr) or Short-Term-Energy (ste).")
        # return the same three values as in the regular case
        return X_test, y_pred, sr

    # Load model
    if gender_model is not None:
        gender_model = gender_model.lower().strip()
        if gender_model not in ("male", "female", "all"):
            raise ValueError(
                f'Gender model must be "male", "female", "all" or None, but is "{gender_model}"')
    else:
        gender_model = _config['USER']['gender_model']
    model_path = get_root() / _config["MODEL"]["model_location"]
    model_path = (model_path.parent /
                  (f"{model_path.stem}_{gender_model.upper()}")).with_suffix(".csv")
    model = load_model(model_path)
    y_pred[included_indices] = model.predict(_X_test)

    if textgrid_path is not None:
        tier_name = _config['USER']['tier_name']
        filename_extension = _config['USER']['filename_extension']
        if filename_extension:
            _textgrid_path = Path(textgrid_path)
            new_filename = _textgrid_path.stem + filename_extension + _textgrid_path.suffix
            result_path = str(_textgrid_path.parent / new_filename)
        else:
            result_path = textgrid_path
        intervals = get_creak_intervals(
            y_pred, get_time_vector(y_pred, sr, start), tgt_intervals=True)
        intervals_to_textgrid(
            intervals=intervals,
            textgrid_path=textgrid_path,
            # result_path=textgrid_path if textgrid_dst is None else textgrid_dst,
            result_path=result_path,
            tier_name=tier_name,
            verbose=verbosity
        )
    if csv_folder_path is not None:
        intervals = get_creak_intervals(
            y_pred, get_time_vector(y_pred, sr, start), tgt_intervals=True)
        intervals_to_csv(
            intervals=intervals,
            csv_dst=csv_folder_path
        )
    return X_test, y_pred, sr
def process_folder(audio_directory: Optional[str] = None,
                   textgrid_directory: Optional[str] = None,
                   csv_directory: Optional[str] = None):
    _config = get_config()
    if audio_directory is None:
        audio_directory = _config['USER']['audio_directory']
    if not os.path.isdir(audio_directory):
        raise ValueError(
            f"Invalid path, given audio-directory \"{audio_directory}\" is not a directory.")
    if textgrid_directory is None:
        textgrid_directory = _config['USER']['textgrid_directory']
    if textgrid_directory:
        if not os.path.isdir(textgrid_directory):
            raise ValueError(
                f"Invalid path, given textgrid-directory \"{textgrid_directory}\" is not a directory.")
    if csv_directory is None:
        csv_directory = _config['USER']['csv_directory']
    if csv_directory:
        if not os.path.isdir(csv_directory):
            raise ValueError(
                f"Invalid path, given csv_directory \"{csv_directory}\" is not a directory.")
    audio_suffix = _config['USER']['audio_suffix']
    wav_files = list(Path(audio_directory).glob(f'**/*{audio_suffix}'))
    if textgrid_directory:
        wav_tg_map = dict()
        textgrid_path = Path(textgrid_directory)
        textgrid_suffix = _config['USER']['textgrid_suffix']
        for wav_file in wav_files:
            wav_tg_map[wav_file] = (
                textgrid_path / wav_file.stem).with_suffix(textgrid_suffix)
    # for wav_file in track(wav_files, "Processing folder..."):
    for wav_file in wav_files:
        process_file(
            wav_file,
            textgrid_path=wav_tg_map[wav_file] if textgrid_directory else None,
            csv_folder_path=csv_directory if csv_directory else None
        )
from __future__ import annotations
import os
import pickle
from pathlib import Path
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.impute import SimpleImputer
from ..utils import get_config, get_root
from .postprocessing import moving_average
from .preprocessing import impute
class NotFittedError(Exception):
    pass


class Model:
    """The Model for creaky voice classification.
    """

    def __init__(self):
        self._config = get_config()["MODEL"]
        self._X_train: pd.DataFrame
        self._y_train: pd.Series
        self._imputer: SimpleImputer
        self._features = self._config["FEATURES"]["for_classification"]
        self._fitted = False
        _clf = self._config["CLASSIFIER"]["clf"]
        self._clf = clfs[_clf](
            **self._config["CLASSIFIER"]["VALUES"][_clf.upper()]["kwargs"])

    def fit(self, X_train: pd.DataFrame, y_train: pd.DataFrame):
        """Function to fit the model with training data.

        Args:
            X_train (pd.DataFrame): Features of training data.
            y_train (pd.DataFrame): Targets of training data (creak, no-creak).
        """
        if isinstance(y_train, pd.DataFrame):
            y_train = y_train.to_numpy()
        if self._config["PREPROCESSING"]["impute_at_fit"] is True:
            self._X_train, self._imputer = impute(
                X_train=X_train.loc[:, self._features], return_imputer=True)
        else:
            self._X_train = X_train
        self._y_train = pd.Series(y_train, name=self._config["target_label"])
        self._clf.fit(
            self._X_train.loc[:, self._features], self._y_train.ravel())
        self._fitted = True

    def predict(self, X_test: pd.DataFrame, predict_proba: bool = None) -> np.ndarray:
        """Predicts the given features.

        Args:
            X_test (pd.DataFrame): Features to be predicted.
            predict_proba (bool, optional): If `True` the likelihood to be creak will be returned, else the predicted target.
                Defaults to None.

        Returns:
            np.ndarray: Predicted targets, or probability of creak.
        """
        self._config = get_config()["MODEL"]
        if predict_proba is not None:
            assert isinstance(predict_proba, bool)
        else:
            predict_proba = self._config["CLASSIFIER"]["predict_proba"]
        if hasattr(self, "_imputer"):
            X_test = pd.DataFrame(self._imputer.transform(
                X_test.loc[:, self._features]), columns=self._X_train.columns, index=X_test.index)
        if predict_proba is True:
            # column of predict_proba that belongs to the creak class
            _target_index = np.argwhere(
                self._clf.classes_ == self._config["CLASSIFIER"]["target_name"]).item()
            y_pred = self._clf.predict_proba(X_test[self._features])[
                :, _target_index].flatten()
            if self._config["POSTPROCESSING"]["MAVG"]["mavg"] is True:
                length, mode = map(
                    self._config["POSTPROCESSING"]["MAVG"]["VALUES"].get, ("length", "mode"))
                y_pred = moving_average(y_pred, length, mode)
        else:
            y_pred = self._clf.predict(X_test[self._features])
        return y_pred

    def save(self, filepath: str = None):
        """Saves a fitted model to the given location as csv file

        Args:
            filepath (str, optional): Destination path of saved model. Defaults to None.

        Raises:
            NotFittedError: If the model is not yet fitted it can not be saved.
        """
        if self._fitted is False:
            raise NotFittedError(
                "Can't save model because it is not fitted yet")
        _config = get_config()
        if filepath is None:
            # filepath without suffix!
            filepath = get_root() / _config["MODEL"]["model_location"]
        else:
            filepath = Path(filepath)
        # merge X_train and y_train
        _X_combined = pd.concat((self._X_train, self._y_train), axis=1)
        _X_combined.to_csv((filepath.parent /
                            (filepath.name)).with_suffix(".csv"), index=False)
        if _config["MODEL"]["save_pickle"] is True:
            with open(filepath.parent / (filepath.name + '.pickle'), "wb") as f:
                pickle.dump(self, f)


def load_model(filepath: str = None) -> Model:
    """Loads an already fitted model from a csv file.

    Args:
        filepath (str, optional): Location of the model csv file. Defaults to None.

    Returns:
        Model: Fitted Model for creak classification.
    """
    if filepath is None:
        filepath = get_root() / get_config()["MODEL"]["model_location"]
        filepath = (filepath.parent / (filepath.name)).with_suffix(".csv")
    else:
        filepath = Path(filepath)
    if filepath.suffix == ".csv":
        # the csv file holds the training data, so the model is refitted on load
        _config = get_config()
        _X_combined = pd.read_csv(filepath)
        model = Model()
        _target_column = _config["MODEL"]["target_label"]
        _feature_columns = _config["MODEL"]["FEATURES"]["for_classification"]
        _X_train, _y_train = _X_combined[_feature_columns], _X_combined[_target_column]
        model.fit(_X_train, _y_train)
        return model
    if filepath.suffix == ".pickle":
        with open(filepath, "rb") as f:
            return pickle.load(f)
    # fail loudly instead of silently returning None
    raise ValueError(
        f"Unsupported model file suffix \"{filepath.suffix}\", expected \".csv\" or \".pickle\".")


clfs = {
    "rfc": RandomForestClassifier,
    "mlp": MLPClassifier
}
from __future__ import annotations
import numpy as np
import operator as operator_
def thresholding(
        series: np.ndarray,
        threshold: float,
        y: np.ndarray = None,
        replace_value: float = 0.0,
        operator: str = ">=",
        normalize: bool = False):
    _operators = {
        ">": operator_.gt,
        ">=": operator_.ge,
        "<": operator_.lt,
        "<=": operator_.le
    }
    # assert y.shape == series.shape
    if normalize is True:
        series /= max(series)
    if y is not None:
        y[_operators[operator](series, threshold)] = replace_value
        return y
    else:
        return _operators[operator](series, threshold)


def moving_average(series: np.ndarray, N: int = 10, mode: str = "same"):
    if len(series) < N:
        return series
    return np.convolve(series, np.ones((N)) / N, mode=mode)
from __future__ import annotations
from typing import Optional
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from ..utils import get_config
def impute(X_train: pd.DataFrame,
           X_test: Optional[pd.DataFrame] = None,
           return_imputer: bool = False):
    config_ = get_config()["MODEL"]["PREPROCESSING"]
    impute_strategy = config_["impute_strategy"]
    _imputer = SimpleImputer(strategy=impute_strategy)
    if X_test is None:
        res = pd.DataFrame(_imputer.fit_transform(
            X_train), columns=X_train.columns)
    else:
        X_train = pd.DataFrame(_imputer.fit_transform(
            X_train), columns=X_train.columns)
        X_test = pd.DataFrame(_imputer.transform(X_test),
                              columns=X_train.columns)
        res = X_train, X_test
    return (res, _imputer) if return_imputer is True else res


def split_data(X: pd.DataFrame, y: pd.Series):
    config_ = get_config()["MODEL"]["PREPROCESSING"]
    impute_strategy = config_["impute_strategy"]
    test_size = config_["test_size"]
    if impute_strategy is not None:
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=test_size, random_state=config_["random_state"])
        X_train, X_test = impute(X_train, X_test)
    else:
        _tmp = X.copy()
        _tmp["label"] = y
        _tmp.dropna(inplace=True)
        _X_drop = _tmp.drop(["label"], axis=1)
        _y_drop = _tmp["label"]
        X_train, X_test, y_train, y_test = train_test_split(
            _X_drop, _y_drop, test_size=test_size, random_state=config_["random_state"])
        del _tmp, _X_drop, _y_drop
    return X_train, X_test, y_train, y_test
def buffer(x, sr, opt: str = "nodelay", window=None):
    """
    Buffer a signal vector into a matrix of data frames.

    x: signal (works only for 1-D signals)
    sr: sampling rate in Hz
    opt: "nodelay" starts the first block at the first sample; any other
        value prepends OL zeros to the signal
    window: predefined window of length N (defaults to a rectangular window)

    The block size N and the hop size R (both in samples) are derived from
    the user config values block_size and hop_size (in seconds). The overlap
    is OL = N - R.
    """
    config_ = get_config()["USER"]
    N = int(config_["block_size"] * sr)
    R = int(config_["hop_size"] * sr)
    if window is None:
        window = np.ones(N)
    assert len(window) == N, "windowlength does not match blocksize N"
    n = len(x)
    OL = int(N - R)
    if opt == 'nodelay':
        assert n >= OL, "in 'nodelay' mode, len(x) must be OL or longer"
        n_seg = int(np.ceil((n - N) / R + 1))
        # print('num_seg:', n_seg)
    else:
        n_seg = int(np.ceil(n / R))
        x = np.concatenate([np.zeros(OL), x.squeeze()])
    res = np.zeros((N, n_seg))
    for i in range(n_seg - 1):
        res[:, i] = x[i * R: i * R + N] * window
    i = n_seg - 1
    # last block (may be shorter than N; zero-padded, window not applied)
    nLast = x[i * R:].size
    res[range(nLast), n_seg - 1] = x[i * R:]
    return res
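To make the block and hop arithmetic of `buffer` concrete, here is a minimal sketch of the "nodelay" framing with explicit sample counts. `frame_signal` is a hypothetical stand-in, not part of creapy: it takes the block size and hop size in samples instead of reading them from the config, and it omits the window. The 16 kHz sampling rate is an assumption.

```python
import numpy as np


def frame_signal(x: np.ndarray, block: int, hop: int) -> np.ndarray:
    """Split a 1-D signal into columns of `block` samples, advancing by `hop`."""
    n_seg = int(np.ceil((len(x) - block) / hop + 1))
    frames = np.zeros((block, n_seg))
    for i in range(n_seg):
        chunk = x[i * hop: i * hop + block]
        frames[:len(chunk), i] = chunk  # a shorter last chunk stays zero-padded
    return frames


# With block_size=0.04 s and hop_size=0.01 s at an assumed rate of 16 kHz,
# a block holds 640 samples and consecutive blocks start 160 samples apart.
sr = 16_000
block, hop = int(0.04 * sr), int(0.01 * sr)
frames = frame_signal(np.arange(1600, dtype=float), block, hop)
print(frames.shape)

# Small case that is easy to check by hand: 11 samples, block of 4, hop of 2.
small = frame_signal(np.arange(11, dtype=float), block=4, hop=2)
print(small.T)
```

Each column of the result is one analysis block, so the number of columns is the number of classification frames for the signal.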
from __future__ import annotations
import creapy
if __name__ == "__main__":
    print("Extracting features..")
    _config = creapy.get_config()
    X, y = creapy.calculate_features_for_folder("/home/creaker/tip/audio/samples_new",
                                                features=_config["MODEL"]["FEATURES"]["for_classification"])
    print("Fitting model..")
    model = creapy.Model()
    model.fit(X, y)
    model.save()
audio_directory: null
textgrid_directory: null
csv_directory: null
audio_start: 0
audio_end: -1
audio_suffix: .wav
filename_extension:
textgrid_suffix: .TextGrid
gender_model: all
tier_name: creapy
verbose: true
block_size: 0.040 # seconds
hop_size: 0.010 # in seconds
creak_threshold: 0.75
zcr_threshold: 0.08
ste_threshold: 0.00001
from .text_grid_to_intervals import read_textgrid, generate_sample_wavs
from .config import get_config, get_user_config, set_config, reset_config, CONFIG_DIR
from .read_wav import read_wav
from .helpers import *
from .evaluation import evaluation_metrics, evaluate
from .plot import plot
from __future__ import annotations
import yaml
import ruamel.yaml
from os.path import isfile
from pathlib import Path
import sys
_RELATIVE_PATH_TO_CONFIG = "../config.yaml"
_RELATIVE_PATH_TO_USER_CONFIG = "../user_config.yaml"
_CONFIG_DIR = Path(__file__).parent / _RELATIVE_PATH_TO_CONFIG
_USER_CONFIG_DIR = Path(__file__).parent / _RELATIVE_PATH_TO_USER_CONFIG
USER_CONFIG_DIR = Path("~/.creapy/config.yaml").expanduser()
CONFIG_DIR = str(USER_CONFIG_DIR)
yaml_str = """\
audio_directory: null # Path to the directory containing audio files
textgrid_directory: null # Path to the directory containing the respective textgrid files
csv_directory: null # Path to a folder where the csv-files containing the classification results should be stored
audio_start: 0 # starttime in seconds where the audiofile should be analysed
audio_end: -1 # endtime in seconds until the audiofile should be analysed
audio_suffix: .wav # suffix of the audio-file(s)
filename_extension: null # string to append to the original textgrid filename. Creates a new file with the corresponding name.
textgrid_suffix: .TextGrid # suffix of the textgrid-file(s)
gender_model: all # The gender model chosen for creak-classification. Can be all, male or female
tier_name: creapy # name of the tier for creapy's annotations in Praat
verbose: true # Verbosity of the tool
block_size: 0.04 # classification blocksize in seconds
hop_size: 0.01 # classification hopsize in seconds
creak_threshold: 0.75 # probability threshold of the classifier at which creak is classified. Can be a decimal value between 0 and 1
zcr_threshold: 0.08 # Threshold for the zero-crossing-rate pre-elimination feature
ste_threshold: 1.0e-05 # Threshold for the short-term-energy pre-elimination feature
"""
def get_config() -> dict:
    """
    returns the configuration file as a dictionary

    Returns:
        dict: the configuration
    """
    with open(_CONFIG_DIR) as config_file:
        config: dict = ruamel.yaml.safe_load(config_file.read())
    with open(_USER_CONFIG_DIR) as user_config_file:
        _user_config: dict = ruamel.yaml.safe_load(user_config_file.read())
    config['USER'] = _user_config
    if not isfile(USER_CONFIG_DIR):
        return config
    with open(USER_CONFIG_DIR) as user_config_file:
        user_config: dict = ruamel.yaml.safe_load(user_config_file.read())
    for key in user_config.keys():
        if key not in config['USER']:
            raise ValueError(f'Invalid key found in user config: {key}')
    config['USER'].update(user_config)
    return config


def get_user_config() -> dict:
    return get_config()['USER']


def set_config(**kwargs) -> None:
    _default_config: dict = get_config()['USER']
    for key in kwargs.keys():
        if key not in _default_config:
            raise ValueError(
                f"key \"{key}\" can't be set in config, possible keys: {list(_default_config.keys())}")
    _default_config.update(kwargs)
    ruamel_yaml = ruamel.yaml.YAML()
    code = ruamel_yaml.load(yaml_str)
    code.update(_default_config)
    USER_CONFIG_DIR.parent.mkdir(parents=True, exist_ok=True)
    with open(USER_CONFIG_DIR, "w") as user_config_file:
        ruamel_yaml.dump(code, user_config_file)


def reset_config() -> None:
    with open(_USER_CONFIG_DIR) as cfg_file:
        _default_config: dict = ruamel.yaml.safe_load(cfg_file.read())
    set_config(**_default_config)
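The override logic in `get_config` and `set_config` comes down to validating the user's keys against the defaults and then updating them. A minimal sketch of that idea, assuming plain dictionaries instead of YAML files; `merge_user_config` is a hypothetical helper and not part of creapy:

```python
def merge_user_config(defaults: dict, overrides: dict) -> dict:
    """Return the defaults updated with the overrides, rejecting unknown keys."""
    for key in overrides:
        if key not in defaults:
            raise ValueError(f"Invalid key found in user config: {key}")
    merged = dict(defaults)  # copy so that the defaults stay untouched
    merged.update(overrides)
    return merged


defaults = {"creak_threshold": 0.75, "zcr_threshold": 0.08, "tier_name": "creapy"}
merged = merge_user_config(defaults, {"creak_threshold": 0.6})
print(merged)

# A misspelled key is rejected instead of being silently ignored.
try:
    merge_user_config(defaults, {"creak_treshold": 0.6})
except ValueError as err:
    error_message = str(err)
    print(error_message)
```

Rejecting unknown keys up front means a typo in the user config surfaces as an error rather than as a setting that never takes effect.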
from __future__ import annotations
import numpy as np
from pandas import Series
import tgt
from datetime import datetime
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, confusion_matrix
if __name__ == "__main__":  # Todo
    from creapy import get_config
else:
    try:
        from ..utils import get_config
    except ImportError as ie:
        print(ie)
# %%
def evaluation_metrics(y_true, y_pred):
    assert all((isinstance(y_pred, (np.ndarray, Series)),
                isinstance(y_true, (np.ndarray, Series))))
    assert y_pred.shape == y_true.shape
    _accuracy = accuracy_score(y_true, y_pred)
    recall = recall_score(y_true, y_pred, average=None)
    f1 = f1_score(y_true, y_pred, average=None)
    precision = precision_score(y_true, y_pred, average=None)
    print(
        f"""accuracy: {_accuracy}
recall: {recall}
f1: {f1},
precision: {precision}""")
    return recall, f1, precision, confusion_matrix(y_true, y_pred)


class CreakInterval(tgt.core.Interval):
    def __init__(self, start_time, end_time, delta_start, delta_end, text=''):
        super().__init__(start_time, end_time, text)
        self.delta_start = delta_start
        self.delta_end = delta_end

    def __repr__(self):
        return u'Interval({0}, {1}, "{2}")'.format(self.start_time, self.end_time, self.text)
def evaluate(textgrid, tier_name_true, tier_name_creapy, boundary_tier_name=None,
             tier_name_evaluation=None):
    """Only possible as a function inside creapy module!"""
    TEXTGRID_PATH = textgrid
    OWN_TIER_NAME = tier_name_true
    OUTFILE = None
    CREAPY_TIER_NAME = tier_name_creapy
    # INTERVAL_LABEL = "c"
    EVALUATION_TIER_NAME = tier_name_evaluation if tier_name_evaluation is not None else "creapy-evaluation"
    for encoding in ("utf-8", "utf-16"):
        try:
            tg = tgt.io.read_textgrid(TEXTGRID_PATH, encoding=encoding)
        except UnicodeDecodeError as e:
            print(f"Error occurred reading textfile:\n\n{e}")
        else:
            break
    own_tier: tgt.core.Tier = tg.get_tier_by_name(OWN_TIER_NAME)
    creapy_tier: tgt.core.Tier = tg.get_tier_by_name(CREAPY_TIER_NAME)
    # Ugly solution for now
    t_min = max(own_tier.intervals[0].start_time,
                creapy_tier.intervals[0].start_time)
    t_max = min(own_tier.intervals[-1].end_time,
                creapy_tier.intervals[-1].end_time)
    own_tier = tgt.core.IntervalTier(objects=own_tier.get_annotations_between_timepoints(
        start=t_min, end=t_max, left_overlap=True, right_overlap=True))
    creapy_tier = tgt.core.IntervalTier(objects=creapy_tier.get_annotations_between_timepoints(
        start=t_min, end=t_max, left_overlap=True, right_overlap=True))
    if boundary_tier_name is not None:
        boundary_tier: tgt.core.IntervalTier = tg.get_tier_by_name(
            boundary_tier_name)
        # creapy_tier = tgt.core.IntervalTier(objects=[
        # interval for interval in creapy_tier.intervals if not any(map(lambda x: x.text == "###" or x.text == "", boundary_tier.get_annotations_between_timepoints(
        # start=interval.start_time, end=interval.end_time, left_overlap=True, right_overlap=True
        # )))
        # ])
        # Keep only intervals that overlap at least one boundary interval
        # which is not a pause ("###" or "SIL").
        in_boundary = []
        for interval in creapy_tier.intervals:
            ols = boundary_tier.get_annotations_between_timepoints(
                start=interval.start_time, end=interval.end_time, left_overlap=True,
                right_overlap=True
            )
            if (not all(map(lambda x: x.text in ("###", "SIL"), ols))) and bool(ols):
                in_boundary.append(interval)
        creapy_tier = tgt.core.IntervalTier(objects=in_boundary)
        in_boundary = []
        for interval in own_tier.intervals:
            ols = boundary_tier.get_annotations_between_timepoints(
                start=interval.start_time, end=interval.end_time, left_overlap=True,
                right_overlap=True
            )
            if (not all(map(lambda x: x.text in ("###", "SIL"), ols))) and bool(ols):
                in_boundary.append(interval)
        own_tier = tgt.core.IntervalTier(objects=in_boundary)
    # print([interval for interval in creapy_tier.intervals if not any(map(lambda x: x.text == "###" or x.text == "", boundary_tier.get_annotations_between_timepoints(
    # start=interval.start_time, end=interval.end_time, left_overlap=True, right_overlap=True])
    # return np.nan, np.nan, np.nan
    overlaps = tgt.util.get_overlapping_intervals(
        own_tier,
        creapy_tier,
    )
    own_tier_copy = own_tier.get_copy_with_same_intervals_merged()
    creapy_tier_copy = creapy_tier.get_copy_with_same_intervals_merged()
    # %%
    tps = [ol for ol in overlaps if ol.text in ("c+c",)]
    # %%
    # print([ol.text for ol in overlaps if ol.text != "c+c"])
    out_intervals = []
    tp_intervals = []
    fp_intervals = []
    TRUE_POSITIVE_LABELS = ("c+c", "c?+c")
    # One-element tuple: without the trailing comma this is a plain string
    # and `in` would perform a substring test.
    FALSE_POSITIVE_LABELS = ("no-c+c",)
    # %%
    for ol in overlaps:
        if ol.text in TRUE_POSITIVE_LABELS:
            annot_creapy = creapy_tier.get_annotations_between_timepoints(
                ol.start_time, ol.end_time, left_overlap=True, right_overlap=True)[0]
            annot_own = own_tier.get_annotations_between_timepoints(
                ol.start_time, ol.end_time, left_overlap=True, right_overlap=True)[0]
            delta_start = annot_creapy.start_time - annot_own.start_time
            delta_end = annot_creapy.end_time - annot_own.end_time
            delta_length = delta_start - delta_end
            tp_intervals.append(CreakInterval(
                ol.start_time, ol.end_time, delta_start, delta_end, text="TP"))
        elif ol.text in FALSE_POSITIVE_LABELS:
            fp_intervals.append(ol)
    # %%
    # Delete overlaps
    for ol_interval in overlaps:
        own_tier_copy.delete_annotations_between_timepoints(
            ol_interval.start_time, ol_interval.end_time, left_overlap=True, right_overlap=True
        )
        creapy_tier_copy.delete_annotations_between_timepoints(
            ol_interval.start_time, ol_interval.end_time, left_overlap=True, right_overlap=True
        )
    # %%
    own_tier_creak_only = tgt.core.Tier(
        objects=own_tier_copy.get_annotations_with_text(pattern="c"))
    creapy_tier_creak_only = tgt.core.Tier(
        objects=creapy_tier_copy.get_annotations_with_text(pattern="c"))
    # %%
    eval_tier = tgt.core.IntervalTier(name=EVALUATION_TIER_NAME)
    for tp in tp_intervals:
        t0_start = tp.start_time - abs(tp.delta_start)
        t0_end = tp.end_time + abs(tp.delta_end)
        try:
            eval_tier.add_annotation(tp)
        except ValueError as e:
            eval_tier.delete_annotations_between_timepoints(
                tp.start_time, tp.end_time, left_overlap=True, right_overlap=True)
            try:
                eval_tier.add_annotation(tp)
            except ValueError as e:
                # print(e)
                pass
        try:
            eval_tier.add_annotation(tgt.core.Interval(
                start_time=t0_start, end_time=tp.start_time, text="+" if tp.delta_start > 0 else "-"))
        except ValueError as e:
            # print(e)
            pass
        try:
            eval_tier.add_annotation(tgt.core.Interval(
                start_time=tp.end_time, end_time=t0_end, text="+" if tp.delta_end < 0 else "-"))
        except ValueError as e:
            # print(e)
            pass
    # %%
    for fns in own_tier_creak_only.annotations:
        try:
            eval_tier.add_interval(tgt.core.Interval(
                fns.start_time, fns.end_time, text="FN"))
        except ValueError as e:
            print(e)
    for fps in creapy_tier_creak_only.annotations:
        try:
            eval_tier.add_interval(tgt.core.Interval(
                fps.start_time, fps.end_time, text="FP"))
        except ValueError as e:
            print(e)
    for fps in fp_intervals:
        try:
            eval_tier.add_interval(tgt.core.Interval(
                fps.start_time, fps.end_time, text="FP"))
        except ValueError as e:
            eval_tier.delete_annotations_between_timepoints(
                fps.start_time, fps.end_time, left_overlap=True, right_overlap=True)
            try:
                eval_tier.add_interval(tgt.core.Interval(
                    fps.start_time, fps.end_time, text="FP"))
            except ValueError as e:
                print(e)
    # %%
    TP = len(tp_intervals)
    FP = len(fp_intervals) + len(creapy_tier_creak_only)
    FN = len(own_tier_creak_only)
    F1 = TP / (TP + (FP + FN) / 2) if (TP + (FP + FN) / 2) != 0 else np.nan
    Precision = TP / (TP + FP) if (TP + FP) != 0 else np.nan
    # Guard on the recall denominator (TP + FN), not on (TP + FP)
    Recall = TP / (TP + FN) if (TP + FN) != 0 else np.nan
    tg.add_tier(eval_tier)
    if tier_name_evaluation is not None:
        tgt.io.write_to_file(tg, TEXTGRID_PATH, encoding="utf-16")
    return F1, Precision, Recall, TP, FP, FN
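The interval-level scores returned by `evaluate` follow the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN) and F1 = TP / (TP + (FP + FN) / 2). A small sketch with made-up counts; `interval_scores` is a hypothetical helper, not part of creapy, and it returns NaN whenever a denominator is zero:

```python
import math


def interval_scores(tp: int, fp: int, fn: int) -> tuple:
    """Return (precision, recall, f1) for interval counts, NaN if undefined."""
    precision = tp / (tp + fp) if (tp + fp) != 0 else math.nan
    recall = tp / (tp + fn) if (tp + fn) != 0 else math.nan
    f1_denominator = tp + (fp + fn) / 2
    f1 = tp / f1_denominator if f1_denominator != 0 else math.nan
    return precision, recall, f1


# Assumed counts for illustration: 8 matched creak intervals, 2 spurious
# detections and 4 missed annotations.
precision, recall, f1 = interval_scores(tp=8, fp=2, fn=4)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")

# With no intervals at all, every score is undefined.
empty = interval_scores(0, 0, 0)
```

Each score is guarded by its own denominator, so recall remains defined when there are no false positives and precision remains defined when there are no false negatives.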
def main():
    TEXTGRID_PATH = "/home/creaker/tip/stateofgrass/GRASS/004M024F/004M024F_HM2_HM1_CS_001_creak.TextGrid"
    OWN_TIER_NAME = "024F-creak"
    OUTFILE = "/home/creaker/tip/results/evaluation.txt"
    CREAPY_TIER_NAME = "024F-creapy-F"
    # INTERVAL_LABEL = "c"
    EVALUATION_TIER_NAME = "creapy_evaluation"
    # %%
    for encoding in ("utf-8", "utf-16"):
        try:
            tg = tgt.io.read_textgrid(TEXTGRID_PATH, encoding=encoding)
        except UnicodeDecodeError as e:
            print(f"Error occurred reading textfile:\n\n{e}")
        else:
            break
    own_tier: tgt.core.Tier = tg.get_tier_by_name(OWN_TIER_NAME)
    creapy_tier: tgt.core.Tier = tg.get_tier_by_name(CREAPY_TIER_NAME)
    # Ugly solution for now
    t_min = max(own_tier.intervals[0].start_time,
                creapy_tier.intervals[0].start_time)
    t_max = min(own_tier.intervals[-1].end_time,
                creapy_tier.intervals[-1].end_time)
    own_tier = tgt.core.IntervalTier(objects=own_tier.get_annotations_between_timepoints(
        start=t_min, end=t_max, left_overlap=True, right_overlap=True))
    creapy_tier = tgt.core.IntervalTier(objects=creapy_tier.get_annotations_between_timepoints(
        start=t_min, end=t_max, left_overlap=True, right_overlap=True))
    overlaps = tgt.util.get_overlapping_intervals(
        own_tier,
        creapy_tier,
    )
    eval_tier = tgt.core.IntervalTier(name="creapy-evaluation")
    own_tier_copy = own_tier.get_copy_with_same_intervals_merged()
    creapy_tier_copy = creapy_tier.get_copy_with_same_intervals_merged()
    # %%
    tps = [ol for ol in overlaps if ol.text in ("c+c",)]
    # %%
    print([ol.text for ol in overlaps if ol.text != "c+c"])
    out_intervals = []
    tp_intervals = []
    fp_intervals = []
    TRUE_POSITIVE_LABELS = ("c+c", "c?+c")
    # One-element tuple: without the trailing comma this is a plain string
    # and `in` would perform a substring test.
    FALSE_POSITIVE_LABELS = ("no-c+c",)
    # %%
    for ol in overlaps:
        if ol.text in TRUE_POSITIVE_LABELS:
            annot_creapy = creapy_tier.get_annotations_between_timepoints(
                ol.start_time, ol.end_time, left_overlap=True, right_overlap=True)[0]
            annot_own = own_tier.get_annotations_between_timepoints(
                ol.start_time, ol.end_time, left_overlap=True, right_overlap=True)[0]
            delta_start = annot_creapy.start_time - annot_own.start_time
            delta_end = annot_creapy.end_time - annot_own.end_time
            delta_length = delta_start - delta_end
            tp_intervals.append(CreakInterval(
                ol.start_time, ol.end_time, delta_start, delta_end, text="TP"))
        elif ol.text in FALSE_POSITIVE_LABELS:
            fp_intervals.append(ol)
    # %%
    # Delete overlaps
    for ol_interval in overlaps:
        own_tier_copy.delete_annotations_between_timepoints(
            ol_interval.start_time, ol_interval.end_time, left_overlap=True, right_overlap=True
        )
        creapy_tier_copy.delete_annotations_between_timepoints(
            ol_interval.start_time, ol_interval.end_time, left_overlap=True, right_overlap=True
        )
    # %%
    own_tier_creak_only = tgt.core.Tier(
        objects=own_tier_copy.get_annotations_with_text(pattern="c"))
    creapy_tier_creak_only = tgt.core.Tier(
        objects=creapy_tier_copy.get_annotations_with_text(pattern="c"))
    # %%
    eval_tier = tgt.core.IntervalTier(name="creapy-evaluation")
    for tp in tp_intervals:
        t0_start = tp.start_time - abs(tp.delta_start)
        t0_end = tp.end_time + abs(tp.delta_end)
        try:
            eval_tier.add_annotation(tp)
        except ValueError as e:
            eval_tier.delete_annotations_between_timepoints(
                tp.start_time, tp.end_time, left_overlap=True, right_overlap=True)
            try:
                eval_tier.add_annotation(tp)
            except ValueError as e:
                print(e)
        try:
            eval_tier.add_annotation(tgt.core.Interval(
                start_time=t0_start, end_time=tp.start_time, text="+" if tp.delta_start > 0 else "-"))
        except ValueError as e:
            print(e)
            pass
        try:
            eval_tier.add_annotation(tgt.core.Interval(
                start_time=tp.end_time, end_time=t0_end, text="+" if tp.delta_end < 0 else "-"))
        except ValueError as e:
            print(e)
            pass
    # %%
    for fns in own_tier_creak_only.annotations:
        try:
            eval_tier.add_interval(tgt.core.Interval(
                fns.start_time, fns.end_time, text="FN"))
        except ValueError as e:
            print(e)
    for fps in creapy_tier_creak_only.annotations:
        try:
            eval_tier.add_interval(tgt.core.Interval(
                fps.start_time, fps.end_time, text="FP"))
        except ValueError as e:
            print(e)
    for fps in fp_intervals:
        try:
            eval_tier.add_interval(tgt.core.Interval(
                fps.start_time, fps.end_time, text="FP"))
        except ValueError as e:
            eval_tier.delete_annotations_between_timepoints(
                fps.start_time, fps.end_time, left_overlap=True, right_overlap=True)
            try:
                eval_tier.add_interval(tgt.core.Interval(
                    fps.start_time, fps.end_time, text="FP"))
            except ValueError as e:
                print(e)
    # %%
    TP = len(tp_intervals)
    FP = len(fp_intervals) + len(creapy_tier_creak_only)
    FN = len(own_tier_creak_only)
    # Guard each denominator so that empty tiers yield NaN instead of a
    # ZeroDivisionError.
    F1 = TP / (TP + (FP + FN) / 2) if (TP + (FP + FN) / 2) != 0 else np.nan
    Precision = TP / (TP + FP) if (TP + FP) != 0 else np.nan
    Recall = TP / (TP + FN) if (TP + FN) != 0 else np.nan
    # %%
    tg.add_tier(eval_tier)
    # tgt.io.write_to_file(tg, TEXTGRID_PATH, encoding="utf-16")
    # %%
    own_tier_creak_only
    # %%
    print(
        f"""True Positives (TP):\t{TP}
False Positives (FP):\t{FP}
False Negatives (FN):\t{FN}
F1-Score: {F1:.3f}
Precision: {Precision: .3f}
Recall: {Recall: .3f}
"""
    )
    # %%
    config = get_config()
    _ste = config["MODEL"]["POSTPROCESSING"]["NON_MODAL_EXCLUSION"]["VALUES"]["STE"]["threshold"]
    _zcr = config["MODEL"]["POSTPROCESSING"]["NON_MODAL_EXCLUSION"]["VALUES"]["ZCR"]["threshold"]
    _block_size = config["MODEL"]["PREPROCESSING"]["block_size"]
    _hop_size = config["MODEL"]["PREPROCESSING"]["hop_size"]
    _intervals = config["MODEL"]["POSTPROCESSING"]["INTERVALS"]
    # %%
    content = f"""{datetime.now().strftime(" [%d/%m/%Y, %H:%M:%S] ").center(120, "-")}
PRAAT STUFF:
Textgrid File: {TEXTGRID_PATH}
Own Tier: {OWN_TIER_NAME}
Creapy Tier: {CREAPY_TIER_NAME}
THRESHOLDS:
STE: {_ste}
ZCR: {_zcr}
creak_threshold: {_intervals["creak_threshold"]}
GAPS:
min_creak_length: {_intervals["min_creak_length"]}
max_gap: {_intervals["max_gap"]}
EVALUATION METRICS:
F1: {F1:.3f}
Precision: {Precision: .3f}
Recall: {Recall: .3f}
True Positives: {TP}
False Positives: {FP}
False Negatives: {FN}
"""
    with open(OUTFILE, "a") as f:
        f.write(content)
        f.write("-" * 120 + '\n')


if __name__ == "__main__":
    main()