Object & Scene Recognition

Module Description

Object and Scene Recognition

Object and scene recognition detects and labels objects and scenes, ranging from general categories to more specific ones. With this module you can quickly summarize the content of pictures or videos. It covers more than 1,500 object classes, which makes it suitable for categorizing and archiving visual data conveniently and reliably.

Module ID: object_scene_recognition

Module Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| model | string | general-c | The name or ID of the model to use (see Model Resource). |
| min_confidence | number | 70 | Only return predictions with a confidence of at least this threshold (range 0 to 100). |
| language | number | 0 | Language of the predicted labels (0 = English, 1 = German). |
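
The parameters above can be assembled into a module settings object before a job is submitted. The following is a minimal sketch, not part of the API itself: the helper name `build_module_config` is hypothetical, and the range checks simply mirror the ranges stated in the table.

```python
def build_module_config(model="general-c", min_confidence=70, language=0):
    """Return the "modules" entry for object_scene_recognition.

    Hypothetical helper: the defaults and allowed ranges are copied from
    the parameter table, the function itself is not part of the API.
    """
    if not 0 <= min_confidence <= 100:
        raise ValueError("min_confidence must be between 0 and 100")
    if language not in (0, 1):
        raise ValueError("language must be 0 (English) or 1 (German)")
    return {
        "object_scene_recognition": {
            "model": model,
            "min_confidence": min_confidence,
            "language": language,
        }
    }


# Example: stricter threshold, German labels.
modules = build_module_config(min_confidence=85, language=1)
```

Validating on the client side is a design choice of this sketch; the service may apply its own validation as well.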

Pre-trained models

| Name | Description |
| --- | --- |
| general-c | Various objects and scenes, from general to more specific ones |

general-a and general-b are subsets of general-c. general-b is recommended.

Example

Send the following JSON as request body via POST to the /jobs/ endpoint:

{
  "sources": [
    "{url-to-your-image}"
  ],
  "modules": {
    "object_scene_recognition": {
      "model": "general-c"
    }
  }
}
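
As a rough illustration, the request above could be sent from Python with the standard library alone. Several details are assumptions and not confirmed by this page: the base URL is inferred from the links in the sample responses (with `https` assumed), the `Authorization` header format is a placeholder because authentication is not described here, and the function names are hypothetical.

```python
import json
import urllib.request

# Assumption: base URL inferred from the links in the sample responses.
API_BASE = "https://api.deepva.com/api/v1"


def build_job_request(source_urls, api_key, model="general-c"):
    """Prepare (but do not send) the POST request for the /jobs/ endpoint."""
    body = {
        "sources": list(source_urls),
        "modules": {"object_scene_recognition": {"model": model}},
    }
    return urllib.request.Request(
        url=f"{API_BASE}/jobs/",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: placeholder auth scheme, check your account docs.
            "Authorization": f"Key {api_key}",
        },
        method="POST",
    )


def submit_job(request):
    """Send the prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending keeps the payload easy to inspect before any network call is made, for example `submit_job(build_job_request(["{url-to-your-image}"], "YOUR_API_KEY"))`.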

When requesting the job via GET on the /jobs/{JOB_ID}/ endpoint, the response looks like this:

{
    "id": "878e6e61-6fa5-4cac-8d1e-dd4066d902df",
    "tag": "",
    "state": "completed",
    "errors": [],
    "progress": 1,
    "duration": 49.409,
    "time_created": "2021-05-20 09:43:14.525000",
    "time_started": "2021-05-20 09:43:14.615000",
    "time_completed": "2021-05-20 09:44:04.024000",
    "sources": [
        "storage://WQM2S9L9O0tDYacfNQN7"
    ],
    "modules": {
        "object_scene_recognition": {
            "model": "general-c",
            "state": "completed",
            "progress": 1
        }
    },
    "media_type": "video",
    "result": {
        "detailed_link": "http://api.deepva.com/api/v1/jobs/878e6e61-6fa5-4cac-8d1e-dd4066d902df/detailed-results",
        "summary": [
            {
                "source": "storage://WQM2S9L9O0tDYacfNQN7",
                "media_type": "video",
                "info": {
                    "fps": 25.0,
                    "resolution": [
                        960,
                        540
                    ],
                    "total_frames": 3636,
                    "duration": 145.44
                },
                "items": [
                    {
                        "type": "object",
                        "label": "Person",
                        "module": "object_scene_recognition"
                    },
                    {
                        "type": "object",
                        "label": "Cow",
                        "module": "object_scene_recognition"
                    }
                ]
            }
        ]
    }
}
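
To pull just the predicted labels out of a job response like the one above, a small helper can walk `result.summary[].items[]`. This is a sketch based only on the fields visible in the sample; the function name `summary_labels` is hypothetical, and any state other than "completed" is treated as "no results yet".

```python
def summary_labels(job):
    """Return the labels this module reported in a job response.

    Only jobs whose state is "completed" are read; anything else yields
    an empty list because the summary may not be available yet.
    """
    if job.get("state") != "completed":
        return []
    labels = []
    result = job.get("result") or {}
    for entry in result.get("summary", []):
        for item in entry.get("items", []):
            # Other modules may contribute items to the same summary.
            if item.get("module") == "object_scene_recognition":
                labels.append(item["label"])
    return labels
```

Applied to the sample response, this returns `["Person", "Cow"]`.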

To get detailed information about a predicted label (for example its time code or confidence), request the /jobs/{JOB_ID}/detailed-results/ endpoint. The response looks like this:

{
    "total": 60,
    "offset": 0,
    "limit": 10,
    "next": "http://api.deepva.com/api/v1/jobs/878e6e61-6fa5-4cac-8d1e-dd4066d902df/detailed-results/?limit=10&offset=10",
    "prev": "http://api.deepva.com/api/v1/jobs/878e6e61-6fa5-4cac-8d1e-dd4066d902df/detailed-results/?limit=10&offset=0",
    "data": [
        {
            "id": "12b38867-7d4d-4c30-8222-2e55e3ca4e68",
            "media_type": "video",
            "frame_start": 1032,
            "frame_end": 1283,
            "source": "storage://WQM2S9L9O0tDYacfNQN7",
            "module": "object_scene_recognition",
            "meta": {
                "label": "Person",
                "mean_confidence": 1.0,
                "parents": []
            },
            "thumbnail": null,
            "detections": [],
            "time_start": 41.28,
            "time_end": 51.32,
            "tc_start": "00:00:41:07",
            "tc_end": "00:00:51:08"
        },
        {
            "id": "51b3d4d5-e178-4b08-bf9f-290b94821b21",
            "media_type": "video",
            "frame_start": 1284,
            "frame_end": 1379,
            "source": "storage://WQM2S9L9O0tDYacfNQN7",
            "module": "object_scene_recognition",
            "meta": {
                "label": "Cow",
                "mean_confidence": 0.9986,
                "parents": [
                    {
                        "label": "Cattle",
                        "parents": [
                            {
                                "label": "Mammal",
                                "parents": [
                                    {
                                        "label": "Animal",
                                        "parents": []
                                    }
                                ]
                            }
                        ]
                    }
                ]
            },
            "thumbnail": null,
            "detections": [],
            "time_start": 51.36,
            "time_end": 55.16,
            "tc_start": "00:00:51:08",
            "tc_end": "00:00:55:03"
        }
    ]
}
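
The nested `parents` field and the paging fields can be awkward to consume directly, so here is a minimal sketch of how a client might flatten them. It rests on assumptions drawn from the sample only: each level has at most one parent (the sketch follows the first one), `mean_confidence` appears to be on a 0 to 1 scale while `min_confidence` uses 0 to 100, and paging uses the `limit` and `offset` query parameters seen in the `next` link. All function names are hypothetical.

```python
def label_path(meta):
    """Return the hierarchy from the most general label to the predicted one.

    Assumption: only the first parent at each level is followed, as the
    sample shows a single parent per level.
    """
    path = [meta["label"]]
    parents = meta.get("parents", [])
    while parents:
        parent = parents[0]
        path.append(parent["label"])
        parents = parent.get("parents", [])
    return list(reversed(path))


def to_rows(page):
    """Flatten one detailed-results page into simple dictionaries."""
    rows = []
    for item in page.get("data", []):
        meta = item["meta"]
        rows.append({
            "label": meta["label"],
            "path": " > ".join(label_path(meta)),
            # Convert the apparent 0-1 scale to the 0-100 scale of min_confidence.
            "confidence_pct": round(meta["mean_confidence"] * 100, 2),
            "time_start": item["time_start"],
            "time_end": item["time_end"],
        })
    return rows


def page_offsets(total, limit):
    """Offsets needed to read all results, derived from total and limit."""
    return list(range(0, total, limit))
```

For the "Cow" entry above, `label_path` yields `["Animal", "Mammal", "Cattle", "Cow"]`. Deriving offsets from `total` and `limit` avoids relying on `next` being empty on the last page, which the sample does not confirm.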