DetailedResult Resource

Definition

This is an object representing detailed results of a job. It is a more detailed view than the summary in the Job Resource.

The DetailedResult resource is useful if you want to know the position in the video where a celebrity was recognized and the confidence score of the prediction.

Each DetailedResult object can be interpreted as a segment with a starting point and an end point (frame_start and frame_end).

The best example is the Timeline in the DeepVA Frontend, which fetches the /detailed-results endpoint to visualize the segments:

[Figure: A timeline chart showing the detailed results of a job]

ENDPOINTS

GET /v1/jobs/{JOB_ID}/detailed-results/
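
As a minimal sketch, the endpoint could be called from Python with the standard library as shown below. The base URL, the `Authorization` header format, and the helper names `detailed_results_url` and `fetch_detailed_results` are assumptions made for illustration; they are not taken from this page.

```python
import json
import urllib.request

# Assumed placeholder; replace with the host of your DeepVA installation.
BASE_URL = "https://api.example.com"


def detailed_results_url(job_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL of the detailed-results endpoint for one job."""
    return f"{base_url.rstrip('/')}/v1/jobs/{job_id}/detailed-results/"


def fetch_detailed_results(job_id: str, api_key: str, base_url: str = BASE_URL):
    """Send the GET request and return the decoded JSON body.

    The header format used here is an assumption, not documented on this page.
    """
    request = urllib.request.Request(
        detailed_results_url(job_id, base_url),
        headers={"Authorization": f"Key {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping the URL construction separate from the network call makes it possible to check the path without contacting a server.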

Attributes

Top level attributes

Name	Type	Description
media_type	string	The type of media ("video", "image")
frame_start	number	Start position of the segment as a frame number (media_type = "video"), or the zero-based index of the image in a batch if it was an image job (media_type = "image")
frame_end	number	End position of the segment as a frame number (media_type = "video"), or the zero-based index of the image in a batch if it was an image job (media_type = "image")
time_start	number	Start position in the video for the segment in seconds
time_end	number	End position in the video for the segment in seconds
tc_start	string	Starting SMPTE timecode in the video for the segment
tc_end	string	Ending SMPTE timecode in the video for the segment
source	string	Media file source
module	string	Visual Mining module that has been applied
meta	dict	Based on the module field, meta can contain different data. See "Child attributes of meta"
detections	list of Detection objects	Detection objects in this segment containing bounding-box information
thumbnail	Thumbnail	Thumbnail of the result
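
The top-level attributes can be read differently depending on media_type. The following sketch, using a hypothetical helper named `describe_segment`, summarizes one DetailedResult object as a short string. It assumes the object has already been decoded into a Python dict.

```python
def describe_segment(result: dict) -> str:
    """Return a one-line, human-readable summary of a DetailedResult dict."""
    module = result["module"]
    if result["media_type"] == "image":
        # For image jobs, frame_start is the zero-based index in the batch.
        return f"{module}: image #{result['frame_start']} of the batch"
    # For video jobs, the segment length follows from the time fields.
    duration = result["time_end"] - result["time_start"]
    return (
        f"{module}: frames {result['frame_start']}-{result['frame_end']} "
        f"({duration:.2f} s)"
    )
```

For a video segment with frame_start 12, frame_end 56, time_start 0.48 and time_end 2.24, this yields `face_recognition: frames 12-56 (1.76 s)`.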

Child attributes of meta

Based on the module field, meta can contain different data.

Face Recognition

Applies if the module field is "face_recognition".

Name	Type	Description
person	string	The name of the recognized identity
class_id	string	Unique ID of the predicted class corresponding to the class inside the dataset. Only available for custom trained models.
closest_person	string	Most similar identity if the face could not be recognized. Only available if the value of the `person` field is "unknown".
similarity	number	Similarity metric describing how similar the recognized identity is to the closest sample in the training data (0.0 - 1.0).
distance	number	Distance metric describing how similar (in terms of euclidean distance) the recognized identity is to the closest sample in the training data (0.0 - 2.0).
indexed_identity	Indexed Identity	If the result includes an Indexed Identity, the object is given here. If not, it is `null`.
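
Because closest_person is only present when person is "unknown", client code usually needs a small fallback. A minimal sketch, with `display_name` being a hypothetical helper name:

```python
def display_name(meta: dict) -> str:
    """Pick a label for a face_recognition result from its meta dict."""
    person = meta.get("person", "unknown")
    if person != "unknown":
        return person
    # closest_person is only available for unrecognized faces.
    closest = meta.get("closest_person")
    if closest:
        return f"unknown (closest: {closest})"
    return "unknown"
```

With the meta object from the JSON example on this page, the helper returns `unknown (closest: Kevin Stöger)`.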

Object Recognition

Applies if the module field is "object_recognition".

For images (media_type = image):

Name	Type	Description
label	string	The name of the recognized concept
confidence	number	How confident the system is about the appearance of the concept in the image or video (0.0 - 1.0).

For videos (media_type = video):

Name	Type	Description
label	string	The name of the recognized concept
mean_confidence	number	How confident the system is about the appearance of the concept in the image or video (0.0 - 1.0).
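
Since the confidence value is stored under a different key for images and videos, a client may want to read it through one helper. The sketch below assumes decoded dicts; the function names and the default threshold of 0.5 are hypothetical choices for illustration.

```python
def object_confidence(result: dict) -> float:
    """Return the confidence of an object_recognition result.

    Videos carry "mean_confidence", images carry "confidence".
    """
    key = "mean_confidence" if result["media_type"] == "video" else "confidence"
    return result["meta"][key]


def confident_labels(results: list, threshold: float = 0.5) -> list:
    """Collect the labels of all results at or above the threshold."""
    return [
        r["meta"]["label"]
        for r in results
        if r.get("module") == "object_recognition"
        and object_confidence(r) >= threshold
    ]
```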

Lower Third Recognition

Applies if the module field is "lower_third_recognition".

Name	Type	Description
label	string	The name of the detected lower third
from_dictionary	boolean	Whether the name was matched by our generic name dictionary

Landmark Recognition

Applies if the module field is "landmark_recognition".

Name	Type	Description
landmark	dict	Information about the recognized landmark
confidence	number	How confident the system is about the appearance of the landmark in the image or video (0.0 - 1.0).

These child attributes of the field landmark are available:

Name	Type	Description
name	string	The official name of the landmark
class_id	string	Unique ID of the predicted class corresponding to the class inside the dataset. Only available for custom trained models.
type	list of strings	The category of the landmark ("null" if a custom trained model was used). Example: `["observation tower", "landmark", "tourist attraction", "lattice tower"]`
latitude	string	Latitude of the recognized landmark ("null" if a custom trained model was used)
longitude	string	Longitude of the recognized landmark ("null" if a custom trained model was used)
location	list of strings	Information about the location of the landmark as list of strings. Example: `["Paris, 7e arrondissement", "Paris", "Île-de-France", "France"]`
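
Latitude and longitude are delivered as strings and may be missing for custom trained models, so they need to be converted before use on a map. A minimal sketch, in which `landmark_coordinates` is a hypothetical helper; it treats both a JSON null and the literal string "null" as missing, since this page does not state which form is used.

```python
from typing import Optional, Tuple


def landmark_coordinates(landmark: dict) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) as floats, or None if unavailable."""
    lat = landmark.get("latitude")
    lon = landmark.get("longitude")
    # Assumption: a missing value may appear as None or as the string "null".
    if lat in (None, "null") or lon in (None, "null"):
        return None
    return float(lat), float(lon)
```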

Gender Neutrality Estimation

Applies if the module field is "gender_neutrality_estimation".

Name	Type	Description
label	string	Contains either the value "Gender Neutral", "Male Dominant" or "Female Dominant"

JSON Example

The following JSON snippet shows the detailed results of a completed job that processed a video with the Face Recognition module.

{
            "media_type": "video",
            "frame_start": 12,
            "frame_end": 56,
            "source": "storage://WQM2S9L9O0tDYacfNQN7",
            "detections": [
                {
                    "frame_index": 12,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.449,
                        "y": 0.285,
                        "w": 0.081,
                        "h": 0.202
                    }
                },
                {
                    "frame_index": 24,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.444,
                        "y": 0.289,
                        "w": 0.084,
                        "h": 0.2
                    }
                },
                {
                    "frame_index": 36,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.445,
                        "y": 0.283,
                        "w": 0.082,
                        "h": 0.196
                    }
                },
                {
                    "frame_index": 48,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.449,
                        "y": 0.281,
                        "w": 0.082,
                        "h": 0.2
                    }
                },
                {
                    "frame_index": 12,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.32,
                        "y": 0.202,
                        "w": 0.092,
                        "h": 0.243
                    }
                },
                {
                    "frame_index": 24,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.315,
                        "y": 0.211,
                        "w": 0.09,
                        "h": 0.237
                    }
                },
                {
                    "frame_index": 36,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.317,
                        "y": 0.198,
                        "w": 0.092,
                        "h": 0.244
                    }
                },
                {
                    "frame_index": 48,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.323,
                        "y": 0.202,
                        "w": 0.09,
                        "h": 0.239
                    }
                },
                {
                    "frame_index": 12,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.194,
                        "y": 0.091,
                        "w": 0.103,
                        "h": 0.281
                    }
                },
                {
                    "frame_index": 24,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.183,
                        "y": 0.093,
                        "w": 0.106,
                        "h": 0.281
                    }
                },
                {
                    "frame_index": 36,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.186,
                        "y": 0.089,
                        "w": 0.104,
                        "h": 0.276
                    }
                },
                {
                    "frame_index": 48,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.192,
                        "y": 0.087,
                        "w": 0.103,
                        "h": 0.272
                    }
                }
            ],
            "module": "face_recognition",
            "meta": {
                "person": "unknown",
                "class_id": null,
                "closest_person": "Kevin Stöger",
                "mean_distance": 0.7416666666666667,
                "mean_similarity": 0.25333333333333335,
                "indexed_identity": null
            },
            "time_start": 0.48,
            "time_end": 2.24,
            "tc_start": "00:00:12:00",
            "tc_end": "00:00:56:00"
}
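
The detections list in the example is flat, with several entries sharing the same frame_index. To draw bounding boxes frame by frame, it can help to group the entries first. The following is a sketch under the assumption that the JSON has been decoded into a dict; `detections_by_frame` is a hypothetical helper name.

```python
from collections import defaultdict


def detections_by_frame(result: dict) -> dict:
    """Group the bounding-box data of a result by frame_index."""
    frames = defaultdict(list)
    for detection in result.get("detections", []):
        frames[detection["frame_index"]].append(detection["data"])
    # Return a plain dict so missing frames raise KeyError as usual.
    return dict(frames)
```

Applied to the example above, the result has the keys 12, 24, 36 and 48, each holding three bounding boxes.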