Speaker Identification
Module Description
Speaker Identification, also known as speaker diarization, detects and identifies the speakers in audio and video files.
Module ID: speaker_identification
Module Parameters
Name | Type | Default | Description |
---|---|---|---|
model | string | celebrities | The name or the ID of the model to use (See Model Resource). |
min_similarity | number | 0.75 | Similarity threshold: the minimum cosine similarity required between the recognized identity and the closest sample in the training data (range from 0.0 to 1.0). |
cluster_unknowns | boolean | false | Whether to give individual results for unknown identities (Unknown #1, Unknown #2, ...). |
index_unknowns | boolean | false | Whether to index unknown identities with a unique ID or not (The index is accessible on the left menu bar). This parameter overrides cluster_unknowns since clustering of unknown identities is required for indexing. |
segment_merge_threshold | number | 1.0 | Segments will be merged if the gap between voice segments is lower than this threshold in seconds (range from 0.0 to 10.0). |
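To make min_similarity and segment_merge_threshold more concrete, here is a minimal Python sketch of the two ideas the table describes: a cosine-similarity check against a threshold, and merging of voice segments separated by a short gap. It is an illustration only, not the module's actual implementation. The function names, the use of plain lists as embeddings, and the choice of `>=` for the similarity comparison are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def passes_threshold(embedding, closest_sample, min_similarity=0.75):
    """Hypothetical check: accept an identity only if it is similar enough.

    Whether the service compares with >= or > is an assumption here.
    """
    return cosine_similarity(embedding, closest_sample) >= min_similarity


def merge_segments(segments, segment_merge_threshold=1.0):
    """Merge (start, end) segments, in seconds, of a single speaker.

    Two segments are merged when the gap between them is lower than
    the threshold, mirroring the parameter description above.
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] < segment_merge_threshold:
            # Gap is below the threshold: extend the previous segment.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

With the default threshold of 1.0, `merge_segments([(0.0, 2.0), (2.5, 4.0), (6.0, 7.0)])` joins the first two segments (gap of 0.5 seconds) and keeps the third separate (gap of 2.0 seconds).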
Pre-trained models
Name | Description |
---|---|
celebrities | Several personalities, including the world's most famous people and a vast majority of German politicians and athletes |
Example
Send the following JSON as the request body via POST to the /jobs/ endpoint:
{
  "sources": [
    "{url-to-your-audio-or-video-file}"
  ],
  "modules": {
    "speaker_identification": {
      "model": "celebrities"
    }
  }
}
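The same request can be assembled programmatically. The following Python sketch builds the POST request with the standard library only. The base URL `https://api.example.com`, the source URL, and the helper name `build_job_request` are placeholders introduced for illustration. Authentication is not covered in this section, so any header the service requires for it is omitted here.

```python
import json
import urllib.request


def build_job_request(base_url, source_url, model="celebrities", **module_params):
    """Build (but do not send) a POST request for the /jobs/ endpoint.

    Extra keyword arguments are passed through as module parameters,
    e.g. min_similarity=0.8 or cluster_unknowns=True.
    """
    body = {
        "sources": [source_url],
        "modules": {
            "speaker_identification": {"model": model, **module_params}
        },
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/jobs/",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URLs: replace them with your API host and media file.
request = build_job_request(
    "https://api.example.com",
    "https://example.com/interview.mp4",
    min_similarity=0.8,
    cluster_unknowns=True,
)

# Sending is left commented out because the host above is not real:
# with urllib.request.urlopen(request) as response:
#     job = json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect the payload before submitting a job.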
When requesting the job via GET on the /jobs/{JOB_ID}/ endpoint, the response looks like this:
coming soon