DetailedResult Resource
Definition
A DetailedResult object represents the detailed results of a job. It provides a more detailed view than the summary in the Job Resource.
The DetailedResult resource is useful if you want to know where in a video a celebrity was recognized and what the confidence score of the prediction is.
Each DetailedResult object can be interpreted as a segment with a start point and an end point (frame_start and frame_end).
A good example is the Timeline in the DeepVA Frontend, which fetches the /detailed-results endpoint to visualize the segments.
ENDPOINTS
GET /v1/jobs/{JOB_ID}/detailed-results/
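As a minimal sketch, the endpoint could be requested with Python's standard library as shown below. The base URL `https://api.example.com` and the `Authorization: Key ...` header are placeholders assumed for illustration; they are not taken from this page, so check the authentication section of the API documentation for the real values.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the API host of your deployment.
BASE_URL = "https://api.example.com"


def detailed_results_url(job_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /v1/jobs/{JOB_ID}/detailed-results/."""
    return f"{base_url.rstrip('/')}/v1/jobs/{job_id}/detailed-results/"


def fetch_detailed_results(job_id: str, api_key: str, base_url: str = BASE_URL):
    """Request the endpoint and decode the response body as JSON.

    ASSUMPTION: the "Authorization: Key ..." header scheme is a guess
    and may differ from what the API actually expects.
    """
    request = urllib.request.Request(
        detailed_results_url(job_id, base_url),
        headers={"Authorization": f"Key {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping the URL construction in its own function makes it easy to check the path without performing a network request.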
Attributes
Top level attributes
Name | Type | Description |
---|---|---|
media_type | string | The type of media ("video", "image") |
frame_start | number | Start position of the segment in the video as a frame number (media_type = "video"), or the zero-based index of the image in the batch for an image job (media_type = "image") |
frame_end | number | End position of the segment in the video as a frame number (media_type = "video"), or the zero-based index of the image in the batch for an image job (media_type = "image") |
time_start | number | Start position in the video for the segment in seconds |
time_end | number | End position in the video for the segment in seconds |
tc_start | string | Starting SMPTE timecode in the video for the segment |
tc_end | string | Ending SMPTE timecode in the video for the segment |
source | string | Media file source |
module | string | Visual Mining module that has been applied |
meta | dict | Based on the module field, meta can contain different data. See "Child attributes of meta" |
detections | list of Detection objects | Detection objects in this segment containing bounding-box information |
thumbnail | Thumbnail | Thumbnail of the result |
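The frame and time fields describe the same segment in two units. The following sketch shows one way to derive a segment's duration and a rough frame rate from them. It assumes a constant frame rate and that the frame and time fields refer to the same positions; the helper names are hypothetical.

```python
def segment_duration(segment: dict) -> float:
    """Length of a video segment in seconds (time_end - time_start)."""
    return segment["time_end"] - segment["time_start"]


def estimated_frame_rate(segment: dict):
    """Estimate frames per second from the frame and time spans.

    ASSUMPTION: constant frame rate. Returns None for image jobs and
    for segments with no measurable duration.
    """
    if segment.get("media_type") != "video":
        return None
    seconds = segment_duration(segment)
    if seconds <= 0:
        return None
    return (segment["frame_end"] - segment["frame_start"]) / seconds
```

For a segment with frame_start 12, frame_end 56, time_start 0.48, and time_end 2.24, the estimate is approximately 25 frames per second.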
Child attributes of meta
Based on the module field, meta can contain different data.
Face Recognition
If the module field is "face_recognition":
Name | Type | Description |
---|---|---|
person | string | The name of the recognized identity |
class_id | string | Unique ID of the predicted class corresponding to the class inside the dataset. Only available for custom trained models. |
closest_person | string | Most similar identity if the face could not be recognized. Only available if the value of the person field is "unknown". |
similarity | number | Similarity metric describing how similar the recognized identity is to the closest sample in the training data (0.0 - 1.0). |
distance | number | Distance metric describing how similar (in terms of Euclidean distance) the recognized identity is to the closest sample in the training data (0.0 - 2.0). |
indexed_identity | Indexed Identity | If the result includes an Indexed Identity, the object is given here. If not, it is null. |
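The interplay between person and closest_person can be sketched as a small helper that produces a display label. The function name and the label format are illustrative choices, not part of the API.

```python
def display_name(meta: dict) -> str:
    """Return a label for a face recognition result.

    If person is "unknown", fall back to closest_person when present,
    marking it clearly as a nearest match rather than a recognition.
    """
    person = meta.get("person") or "unknown"
    if person != "unknown":
        return person
    closest = meta.get("closest_person")
    if closest:
        return f"unknown (closest: {closest})"
    return "unknown"
```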
Object Recognition
If the module field is "object_recognition":
For images (media_type = image):
Name | Type | Description |
---|---|---|
label | string | The name of the recognized concept |
confidence | number | How confident the system is about the appearance of the concept in the image or video (0.0 - 1.0). |
For videos (media_type = video):
Name | Type | Description |
---|---|---|
label | string | The name of the recognized concept |
mean_confidence | number | How confident the system is about the appearance of the concept in the image or video (0.0 - 1.0). |
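Because the confidence key differs between image and video results, client code may want to read it through one helper. The sketch below is one possible approach; the function names and the default threshold of 0.5 are arbitrary illustrative choices.

```python
def object_confidence(result: dict):
    """Read the confidence of an object recognition result.

    Video results carry meta["mean_confidence"]; image results carry
    meta["confidence"]. Returns None if the expected key is missing.
    """
    meta = result.get("meta") or {}
    key = "mean_confidence" if result.get("media_type") == "video" else "confidence"
    return meta.get(key)


def filter_by_confidence(results, threshold: float = 0.5):
    """Keep results whose confidence is at least the given threshold."""
    kept = []
    for result in results:
        confidence = object_confidence(result)
        if confidence is not None and confidence >= threshold:
            kept.append(result)
    return kept
```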
Lower Third Recognition
If the module field is "lower_third_recognition":
Name | Type | Description |
---|---|---|
label | string | The name of the detected lower third |
from_dictionary | boolean | Whether the name was matched by the generic name dictionary |
Landmark Recognition
If the module field is "landmark_recognition":
Name | Type | Description |
---|---|---|
landmark | dict | Information about the recognized landmark |
confidence | number | How confident the system is about the appearance of the landmark in the image or video (0.0 - 1.0). |
The landmark field has the following child attributes:
Name | Type | Description |
---|---|---|
name | string | The official name of the landmark |
class_id | string | Unique ID of the predicted class corresponding to the class inside the dataset. Only available for custom trained models. |
type | list of strings | The category of the landmark ("null" if a custom trained model was used). Example: ["observation tower", "landmark", "tourist attraction", "lattice tower"] |
latitude | string | Latitude of the recognized landmark ("null" if a custom trained model was used) |
longitude | string | Longitude of the recognized landmark ("null" if a custom trained model was used) |
location | list of strings | Information about the location of the landmark as list of strings. Example: ["Paris, 7e arrondissement", "Paris", "Île-de-France", "France"] |
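Since latitude and longitude are delivered as strings and can be null for custom trained models, a client will usually convert them before use. The helper below is a hedged sketch; it treats both a JSON null and the literal string "null" as missing because the table does not make clear which form is returned.

```python
def landmark_coordinates(landmark: dict):
    """Return (latitude, longitude) as floats, or None if unavailable.

    ASSUMPTION: a missing coordinate may appear either as JSON null
    (None in Python) or as the string "null"; both are handled.
    """
    latitude = landmark.get("latitude")
    longitude = landmark.get("longitude")
    if latitude in (None, "null") or longitude in (None, "null"):
        return None
    return float(latitude), float(longitude)
```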
Gender Neutrality Estimation
If the module field is "gender_neutrality_estimation":
Name | Type | Description |
---|---|---|
label | string | One of "Gender Neutral", "Male Dominant", or "Female Dominant" |
JSON Example
The following JSON snippet shows the detailed results of a completed job that processed a video with the Face Recognition module.
{
  "media_type": "video",
  "frame_start": 12,
  "frame_end": 56,
  "source": "storage://WQM2S9L9O0tDYacfNQN7",
  "detections": [
    {
      "frame_index": 12,
      "type": "face_bbox",
      "data": {
        "x": 0.449,
        "y": 0.285,
        "w": 0.081,
        "h": 0.202
      }
    },
    {
      "frame_index": 24,
      "type": "face_bbox",
      "data": {
        "x": 0.444,
        "y": 0.289,
        "w": 0.084,
        "h": 0.2
      }
    },
    {
      "frame_index": 36,
      "type": "face_bbox",
      "data": {
        "x": 0.445,
        "y": 0.283,
        "w": 0.082,
        "h": 0.196
      }
    },
    {
      "frame_index": 48,
      "type": "face_bbox",
      "data": {
        "x": 0.449,
        "y": 0.281,
        "w": 0.082,
        "h": 0.2
      }
    },
    {
      "frame_index": 12,
      "type": "face_bbox",
      "data": {
        "x": 0.32,
        "y": 0.202,
        "w": 0.092,
        "h": 0.243
      }
    },
    {
      "frame_index": 24,
      "type": "face_bbox",
      "data": {
        "x": 0.315,
        "y": 0.211,
        "w": 0.09,
        "h": 0.237
      }
    },
    {
      "frame_index": 36,
      "type": "face_bbox",
      "data": {
        "x": 0.317,
        "y": 0.198,
        "w": 0.092,
        "h": 0.244
      }
    },
    {
      "frame_index": 48,
      "type": "face_bbox",
      "data": {
        "x": 0.323,
        "y": 0.202,
        "w": 0.09,
        "h": 0.239
      }
    },
    {
      "frame_index": 12,
      "type": "face_bbox",
      "data": {
        "x": 0.194,
        "y": 0.091,
        "w": 0.103,
        "h": 0.281
      }
    },
    {
      "frame_index": 24,
      "type": "face_bbox",
      "data": {
        "x": 0.183,
        "y": 0.093,
        "w": 0.106,
        "h": 0.281
      }
    },
    {
      "frame_index": 36,
      "type": "face_bbox",
      "data": {
        "x": 0.186,
        "y": 0.089,
        "w": 0.104,
        "h": 0.276
      }
    },
    {
      "frame_index": 48,
      "type": "face_bbox",
      "data": {
        "x": 0.192,
        "y": 0.087,
        "w": 0.103,
        "h": 0.272
      }
    }
  ],
  "module": "face_recognition",
  "meta": {
    "person": "unknown",
    "class_id": null,
    "closest_person": "Kevin Stöger",
    "mean_distance": 0.7416666666666667,
    "mean_similarity": 0.25333333333333335,
    "indexed_identity": null
  },
  "time_start": 0.48,
  "time_end": 2.24,
  "tc_start": "00:00:12:00",
  "tc_end": "00:00:56:00"
}
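To work with the detections list from a response like this, a client might group the bounding boxes by frame and scale them to pixel coordinates. The sketch below assumes that x, y, w, and h are fractions of the frame width and height and that x and y mark the top-left corner of the box; the values in the example suggest this, but this page does not state it. The helper names are hypothetical.

```python
from collections import defaultdict


def group_by_frame(detections):
    """Map each frame_index to the list of bounding boxes found in it."""
    grouped = defaultdict(list)
    for detection in detections:
        grouped[detection["frame_index"]].append(detection["data"])
    return dict(grouped)


def to_pixels(box: dict, frame_width: int, frame_height: int):
    """Scale a bounding box to integer pixel values (x, y, w, h).

    ASSUMPTION: coordinates are relative to the frame size (0.0 - 1.0)
    and (x, y) is the top-left corner of the box.
    """
    return (
        round(box["x"] * frame_width),
        round(box["y"] * frame_height),
        round(box["w"] * frame_width),
        round(box["h"] * frame_height),
    )
```

Applied to the detections in the example, grouping yields four frames (12, 24, 36, 48) with three boxes each.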