
DetailedResult Resource

Definition

This is an object representing detailed results of a job. It is a more detailed view than the summary in the Job Resource.

The DetailedResult resource is useful if you want to know the position in the video where a celebrity was recognized and the confidence score of the prediction.

Each DetailedResult object can be interpreted as a segment with a starting point and an end point (frame_start and frame_end).

A good example is the Timeline in the DeepVA Frontend, which fetches the /detailed-results endpoint to visualize the segments:

[Figure: A timeline chart showing the detailed results of a job]

ENDPOINTS

GET /v1/jobs/{JOB_ID}/detailed-results/
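
As a rough illustration, the endpoint could be called with Python's standard library as sketched below. The base URL is a placeholder, and the function names and the way the Authorization header value is passed in are assumptions made for this example; the page does not specify the host, the authentication scheme, or the shape of the response envelope, so the decoded JSON is returned unchanged.

```python
import json
import urllib.request
from urllib.parse import quote

# Placeholder base URL (an assumption): replace with the address of your API host.
BASE_URL = "https://api.example.com"


def detailed_results_url(job_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /v1/jobs/{JOB_ID}/detailed-results/."""
    return f"{base_url.rstrip('/')}/v1/jobs/{quote(job_id, safe='')}/detailed-results/"


def fetch_detailed_results(job_id: str, authorization: str, base_url: str = BASE_URL):
    """Send the GET request and return the decoded JSON body.

    `authorization` is the full Authorization header value; its format
    depends on your account setup and is not defined on this page.
    """
    request = urllib.request.Request(
        detailed_results_url(job_id, base_url),
        headers={"Authorization": authorization, "Accept": "application/json"},
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping the URL construction in its own function makes it easy to check without sending a request.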

Attributes

Top level attributes

| Name | Type | Description |
| --- | --- | --- |
| media_type | string | The type of media ("video" or "image") |
| frame_start | number | Start position of the segment as a frame number (media_type = "video"), or the zero-based index of the image in a batch for an image job (media_type = "image") |
| frame_end | number | End position of the segment as a frame number (media_type = "video"), or the zero-based index of the image in a batch for an image job (media_type = "image") |
| time_start | string | Start position of the segment in the video, in seconds |
| time_end | string | End position of the segment in the video, in seconds |
| tc_start | string | Starting SMPTE timecode of the segment in the video |
| tc_end | string | Ending SMPTE timecode of the segment in the video |
| source | string | Media file source |
| module | string | Visual Mining module that has been applied |
| meta | dict | Module-specific data; its fields depend on the module field. See "Child attributes of meta" |
| detections | list of Detection objects | Detection objects in this segment containing bounding-box information |
| thumbnail | Thumbnail | Thumbnail of the result |
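
Because frame_start and frame_end mean different things for videos and images, client code usually needs to branch on media_type. The following is a minimal sketch with hypothetical helper names; it converts time_start and time_end with float() so that it works whether the values arrive as strings or as numbers.

```python
from typing import Optional


def segment_duration_seconds(result: dict) -> Optional[float]:
    """Return the segment length in seconds, or None for image results."""
    if result.get("media_type") != "video":
        # For images, frame_start/frame_end hold a batch index, not a time position.
        return None
    # float() accepts both "0.48" and 0.48.
    return float(result["time_end"]) - float(result["time_start"])


def image_index(result: dict) -> Optional[int]:
    """Return the zero-based batch index for image results, or None for videos."""
    if result.get("media_type") != "image":
        return None
    return int(result["frame_start"])
```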

Child attributes of meta

Based on the module field, meta can contain different data.

Face Recognition

Applies if the module field is "face_recognition".

| Name | Type | Description |
| --- | --- | --- |
| person | string | The name of the recognized identity |
| class_id | string | Unique ID of the predicted class, corresponding to the class inside the dataset. Only available for custom trained models. |
| closest_person | string | Most similar identity if the face could not be recognized. Only available if the value of the person field is "unknown". |
| similarity | number | Similarity metric describing how similar the recognized identity is to the closest sample in the training data (0.0 - 1.0). |
| distance | number | Distance metric describing how similar (in terms of Euclidean distance) the recognized identity is to the closest sample in the training data (0.0 - 2.0). |
| indexed_identity | Indexed Identity | If the result includes an Indexed Identity, the object is given here. If not, it is null. |
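
The interplay between person and closest_person can be handled as in the sketch below. The function name and the output wording are illustrative choices for this example, not part of the API.

```python
def describe_face(meta: dict) -> str:
    """Turn face-recognition meta into a short display string."""
    person = meta.get("person") or "unknown"
    if person != "unknown":
        return person
    # closest_person is only available when person is "unknown".
    closest = meta.get("closest_person")
    if closest:
        return f"unknown (closest match: {closest})"
    return "unknown"
```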

Object Recognition

Applies if the module field is "object_recognition".

For images (media_type = "image"):

| Name | Type | Description |
| --- | --- | --- |
| label | string | The name of the recognized concept |
| confidence | number | How confident the system is about the appearance of the concept in the image or video (0.0 - 1.0). |

For videos (media_type = "video"):

| Name | Type | Description |
| --- | --- | --- |
| label | string | The name of the recognized concept |
| mean_confidence | number | How confident the system is about the appearance of the concept in the image or video (0.0 - 1.0). |
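
Since the confidence field is named differently for images and videos, a small accessor keeps the rest of the client code uniform. This is a sketch with hypothetical function names; the 0.5 default threshold is an arbitrary value chosen for illustration.

```python
from typing import List, Optional


def object_confidence(result: dict) -> Optional[float]:
    """Read confidence (images) or mean_confidence (videos) from meta."""
    meta = result.get("meta") or {}
    key = "mean_confidence" if result.get("media_type") == "video" else "confidence"
    value = meta.get(key)
    return None if value is None else float(value)


def confident_labels(results: List[dict], threshold: float = 0.5) -> List[str]:
    """Return labels of object-recognition results at or above the threshold."""
    labels = []
    for result in results:
        if result.get("module") != "object_recognition":
            continue
        score = object_confidence(result)
        if score is not None and score >= threshold:
            labels.append(result["meta"]["label"])
    return labels
```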

Lower Third Recognition

Applies if the module field is "lower_third_recognition".

| Name | Type | Description |
| --- | --- | --- |
| label | string | The name of the detected lower third |
| from_dictionary | boolean | Whether the name was matched by our generic name dictionary |

Landmark Recognition

Applies if the module field is "landmark_recognition".

| Name | Type | Description |
| --- | --- | --- |
| landmark | dict | Information about the recognized landmark |
| confidence | number | How confident the system is about the appearance of the landmark in the image or video (0.0 - 1.0). |

The landmark field has the following child attributes:

| Name | Type | Description |
| --- | --- | --- |
| name | string | The official name of the landmark |
| class_id | string | Unique ID of the predicted class, corresponding to the class inside the dataset. Only available for custom trained models. |
| type | list of strings | The category of the landmark ("null" if a custom trained model was used). Example: ["observation tower", "landmark", "tourist attraction", "lattice tower"] |
| latitude | string | Latitude of the recognized landmark ("null" if a custom trained model was used) |
| longitude | string | Longitude of the recognized landmark ("null" if a custom trained model was used) |
| location | list of strings | Information about the location of the landmark as a list of strings. Example: ["Paris, 7e arrondissement", "Paris", "Île-de-France", "France"] |
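
Latitude and longitude are delivered as strings and may be null for custom trained models, so they need a guarded conversion before use. The sketch below uses a hypothetical helper name and accepts both a JSON null and the literal string "null", because the description does not say which form is sent.

```python
from typing import Optional, Tuple


def landmark_coordinates(meta: dict) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) as floats, or None if unavailable."""
    landmark = meta.get("landmark") or {}
    lat = landmark.get("latitude")
    lon = landmark.get("longitude")
    # Treat JSON null and the string "null" the same way (an assumption).
    if lat in (None, "null") or lon in (None, "null"):
        return None
    return float(lat), float(lon)
```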

Gender Neutrality Estimation

Applies if the module field is "gender_neutrality_estimation".

| Name | Type | Description |
| --- | --- | --- |
| label | string | One of "Gender Neutral", "Male Dominant", or "Female Dominant" |
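
Because the layout of meta depends on the module field, a client that renders segments (for example on a timeline) typically dispatches on module to find the text to display. The following is a minimal sketch of such a dispatcher; the function name is hypothetical, and the choice of which field serves as the display label is an assumption made for this example.

```python
from typing import Optional

# Modules whose meta carries the display text in a "label" field.
LABEL_MODULES = (
    "object_recognition",
    "lower_third_recognition",
    "gender_neutrality_estimation",
)


def segment_label(result: dict) -> Optional[str]:
    """Pick a human-readable label from meta, depending on the module field."""
    module = result.get("module")
    meta = result.get("meta") or {}
    if module == "face_recognition":
        return meta.get("person")
    if module == "landmark_recognition":
        return (meta.get("landmark") or {}).get("name")
    if module in LABEL_MODULES:
        return meta.get("label")
    return None  # module not covered by this sketch
```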

JSON Example

The following JSON snippet shows the detailed results of a completed job that processed a video with the Face Recognition module.

{
            "media_type": "video",
            "frame_start": 12,
            "frame_end": 56,
            "source": "storage://WQM2S9L9O0tDYacfNQN7",
            "detections": [
                {
                    "frame_index": 12,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.449,
                        "y": 0.285,
                        "w": 0.081,
                        "h": 0.202
                    }
                },
                {
                    "frame_index": 24,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.444,
                        "y": 0.289,
                        "w": 0.084,
                        "h": 0.2
                    }
                },
                {
                    "frame_index": 36,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.445,
                        "y": 0.283,
                        "w": 0.082,
                        "h": 0.196
                    }
                },
                {
                    "frame_index": 48,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.449,
                        "y": 0.281,
                        "w": 0.082,
                        "h": 0.2
                    }
                },
                {
                    "frame_index": 12,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.32,
                        "y": 0.202,
                        "w": 0.092,
                        "h": 0.243
                    }
                },
                {
                    "frame_index": 24,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.315,
                        "y": 0.211,
                        "w": 0.09,
                        "h": 0.237
                    }
                },
                {
                    "frame_index": 36,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.317,
                        "y": 0.198,
                        "w": 0.092,
                        "h": 0.244
                    }
                },
                {
                    "frame_index": 48,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.323,
                        "y": 0.202,
                        "w": 0.09,
                        "h": 0.239
                    }
                },
                {
                    "frame_index": 12,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.194,
                        "y": 0.091,
                        "w": 0.103,
                        "h": 0.281
                    }
                },
                {
                    "frame_index": 24,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.183,
                        "y": 0.093,
                        "w": 0.106,
                        "h": 0.281
                    }
                },
                {
                    "frame_index": 36,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.186,
                        "y": 0.089,
                        "w": 0.104,
                        "h": 0.276
                    }
                },
                {
                    "frame_index": 48,
                    "type": "face_bbox",
                    "data": {
                        "x": 0.192,
                        "y": 0.087,
                        "w": 0.103,
                        "h": 0.272
                    }
                }
            ],
            "module": "face_recognition",
            "meta": {
                "person": "unknown",
                "class_id": null,
                "closest_person": "Kevin Stöger",
                "mean_distance": 0.7416666666666667,
                "mean_similarity": 0.25333333333333335,
                "indexed_identity": null
            },
            "time_start": 0.48,
            "time_end": 2.24,
            "tc_start": "00:00:12:00",
            "tc_end": "00:00:56:00"
        }
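
In the example above, the detections list contains several bounding boxes per frame_index. A sketch for grouping them by frame and converting a box to pixel coordinates follows. The helper names are hypothetical, and the conversion assumes that x, y, w and h are fractions of the frame width and height with the origin at the top-left corner, which this page does not state explicitly.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def detections_by_frame(result: dict) -> Dict[int, List[dict]]:
    """Group the bounding-box data of a segment by frame_index."""
    grouped = defaultdict(list)
    for detection in result.get("detections", []):
        grouped[detection["frame_index"]].append(detection["data"])
    return dict(grouped)


def to_pixels(box: dict, width: int, height: int) -> Tuple[int, int, int, int]:
    """Convert a relative box to (x, y, w, h) in pixels.

    Assumes the values are fractions of the frame size (see lead-in).
    """
    return (
        round(box["x"] * width),
        round(box["y"] * height),
        round(box["w"] * width),
        round(box["h"] * height),
    )
```

Applied to the example segment, detections_by_frame would return four keys (12, 24, 36 and 48) with three boxes each.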