Speaker Identification

Module Description

Speaker Identification (also known as Speaker Diarization) detects and identifies the speakers in audio and video files.

Module ID: speaker_identification

Name	Type	Default	Description
model	string	celebrities	The name or the ID of the model to use (See Model Resource).
min_similarity	number	0.75	Similarity threshold: the minimum cosine similarity between the recognized identity and its closest sample in the training data for the match to be accepted (range from 0.0 to 1.0).
cluster_unknowns	boolean	false	Whether to report each unknown identity as a separate result (Unknown #1, Unknown #2, ...).
index_unknowns	boolean	false	Whether to index unknown identities with a unique ID (the index is accessible from the left menu bar). This parameter overrides `cluster_unknowns`, since indexing requires clustering of unknown identities.
segment_merge_threshold	number	1.0	Voice segments are merged if the gap between them is shorter than this threshold, in seconds (range from 0.0 to 10.0).
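
The two numeric parameters above can be illustrated with a small sketch. This is not the service's implementation, which is not documented here; it only shows the general idea of a cosine-similarity threshold and of merging voice segments separated by short gaps. The function names, the use of `>=` for the threshold comparison, and the assumption that all segments belong to the same speaker are illustrative choices.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_match(similarity, min_similarity=0.75):
    """Accept an identity only if it is similar enough to the closest sample.

    Assumption: the boundary value itself counts as a match.
    """
    return similarity >= min_similarity


def merge_segments(segments, segment_merge_threshold=1.0):
    """Merge (start, end) segments, in seconds, whose gap is below the threshold.

    Assumption: all segments passed in belong to the same speaker.
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] < segment_merge_threshold:
            # Gap is shorter than the threshold: extend the previous segment.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

With the default threshold of 1.0, segments separated by a 0.5 second gap are joined, while a 2 second gap keeps them apart. Setting the threshold to 0.0 leaves non-overlapping segments unmerged in this sketch.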

Name	Description
celebrities	Several personalities, including the world's most famous people and a vast majority of German politicians and athletes

Send the following JSON as request body via POST to the /jobs/ endpoint:

{
  "sources": [
    "{url-to-your-audio-or-video-file}"
  ],
  "modules": {
    "speaker_identification": {
      "model": "celebrities"
    }
  }
}
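
The request above could be built and sent from Python roughly as follows. This is a minimal sketch using only the standard library. The helper names `build_job_payload` and `submit_job`, the `base_url` argument, and the optional `headers` (for example for authentication) are hypothetical and not part of the documented API. Placing the other module parameters next to `model` and treating the response as JSON are also assumptions.

```python
import json
import urllib.request


def build_job_payload(source_url, model="celebrities", min_similarity=0.75,
                      cluster_unknowns=False, index_unknowns=False,
                      segment_merge_threshold=1.0):
    """Build the JSON body for a speaker_identification job.

    Range checks mirror the documented parameter ranges.
    """
    if not 0.0 <= min_similarity <= 1.0:
        raise ValueError("min_similarity must be between 0.0 and 1.0")
    if not 0.0 <= segment_merge_threshold <= 10.0:
        raise ValueError("segment_merge_threshold must be between 0.0 and 10.0")
    return {
        "sources": [source_url],
        "modules": {
            "speaker_identification": {
                "model": model,
                # Assumption: these parameters sit alongside "model".
                "min_similarity": min_similarity,
                "cluster_unknowns": cluster_unknowns,
                "index_unknowns": index_unknowns,
                "segment_merge_threshold": segment_merge_threshold,
            }
        },
    }


def submit_job(base_url, payload, headers=None):
    """POST the payload to the /jobs/ endpoint and return the parsed reply.

    Assumptions: base_url is the API root and the response body is JSON.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/jobs/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **(headers or {})},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `submit_job(base_url, build_job_payload("{url-to-your-audio-or-video-file}"))` would send a body with the same shape as the JSON shown above, with the remaining parameters set to their documented defaults.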

When requesting the job via GET on the /jobs/{JOB_ID}/ endpoint, the response looks like this:

coming soon