Frequently Asked Questions

Web Application

General Questions

Which image and video file formats are supported?

This link lists all sources that can be used as a basis for a mining job as well as all currently supported file formats. These apply not only to Face Recognition, but also to all other recognition modules.

Is it possible to change the settings so that people will be recognized from the side?

There is no explicit setting for this. Recognizing faces from the side is generally difficult, since faces in profile often look very similar. If an AI model is trained on images that show faces from the side, such faces are more likely to be recognized in the analysis. In general, however, Face Recognition is focused on faces shown from the front.

Is it possible to change the settings so that people in the foreground or in the background are preferentially recognized?

There is no explicit setting for this. It may be possible to achieve this effect by adjusting the parameter "Face Detection Scale".

What are possible reasons for DeepVA to not recognize people / landmarks?

There are several possible reasons why DeepVA does not recognize certain people or landmarks:

  • The image or video is too blurred
  • The depicted person looks too different from the images in the trained AI model (e.g. due to large sunglasses, a mask, or fundamentally changed attributes such as hair or beard)
  • The person is not included in the AI model used, i.e. the dataset does not contain the corresponding person
  • The faces are too small (see "Face Detection Scale" under Face Recognition)
  • The "Max. Distance" value is set too low (see "Max. Distance" under Face Recognition)
  • For Landmark Recognition, the building or structure is not shown as a whole (it is too close or too far away)

How can the settings of different modules be changed?

The basic settings of the various analysis modules can be changed when selecting the modules via "Set Parameters".

Visual Mining Modules and their parameters

Face Recognition

Model

This parameter specifies which AI model is used for the analysis via Face Recognition. Currently, only one model can be selected. In the future, it will be possible to use several models in one analysis.

Max. Distance

This value indicates how similar a depicted face must be to a face from the AI model in order to be identified as a specific person. The value 1 is a good default. Values below 1 require a higher similarity to identify a person; values above 1 increase the probability of identifying people but also increase the probability of errors. For example, to be very sure that identifications are correct, choose a value below 1. In this case, people may be labeled "unknown" even though they would have been identified with a value of 1.
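
The effect of this threshold can be pictured with a minimal Python sketch. It assumes that each known person already has a distance score to the depicted face (smaller means more similar). The function name `identify`, the example scores, and the use of `<=` at the boundary are illustrative assumptions, not DeepVA's actual implementation.

```python
from typing import Dict


def identify(distances: Dict[str, float], max_distance: float = 1.0) -> str:
    """Return the closest known person, or "unknown" if nobody is close enough.

    `distances` maps a person's name to a hypothetical distance score
    between the depicted face and that person's faces in the AI model.
    """
    if not distances:
        return "unknown"
    # Pick the person with the smallest distance (the most similar face).
    name = min(distances, key=distances.get)
    # Assumption: a match counts if the distance does not exceed the threshold.
    return name if distances[name] <= max_distance else "unknown"


candidates = {"Person A": 0.95, "Person B": 1.40}
print(identify(candidates, max_distance=1.0))
print(identify(candidates, max_distance=0.8))
```

With the default of 1, "Person A" is identified; with the stricter value 0.8, the same face falls back to "unknown".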

Cluster Unknown Identities

If this parameter is activated, unrecognized people are displayed in the analysis result in separate, individual clusters ("unknown 1", "unknown 2", etc.). If this parameter is deactivated, all unrecognized people are combined in one group ("unknown").
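
As a rough illustration of the two labeling modes, the following Python sketch assumes that every unrecognized face has already been assigned some cluster ID. The function `label_unknowns` and the numbering by order of first appearance are hypothetical and only mirror the labels described above.

```python
from typing import Dict, List


def label_unknowns(cluster_ids: List[int], cluster_unknown: bool) -> List[str]:
    """Turn hypothetical cluster IDs of unrecognized faces into display labels."""
    if not cluster_unknown:
        # Parameter deactivated: everyone ends up in one shared group.
        return ["unknown"] * len(cluster_ids)
    numbering: Dict[int, int] = {}
    labels = []
    for cluster_id in cluster_ids:
        # Assumption: clusters are numbered in order of first appearance.
        if cluster_id not in numbering:
            numbering[cluster_id] = len(numbering) + 1
        labels.append(f"unknown {numbering[cluster_id]}")
    return labels


print(label_unknowns([7, 7, 3], cluster_unknown=True))
print(label_unknowns([7, 7, 3], cluster_unknown=False))
```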

Face Detection Scale

This value controls how small a face can be and still be recognized. The larger the selected value, the smaller the faces that can be recognized.

  • 600 = default value
  • 600-1000 = small faces are also recognized
  • Below 600 = only relatively large faces are recognized

For more information on Face Recognition: https://docs.deepva.com/mining-modules/face-recognition

Face Dataset Creation

Dataset ID

Enter the ID of the dataset that should be extended with new classes here. These new classes are created from on-screen text inserts, for example in interviews: the name and the corresponding face (image) are extracted. In the future, there will be a dropdown menu here. The dataset ID can be found under "Datasets" in the overview or copied from the menu of a dataset.

Single Name Detection

Normally, a combination of first and last names is extracted from text inserts on screen and assigned to the corresponding face. If this option is activated, single first names or single surnames are extracted as well.

Min. Face Size

This value sets the minimum size of a face to be recognized. The height and width of a face must not fall below this value (in pixels). The smaller the value (for example, 60), the smaller the faces that are captured. Very small faces can reduce the quality of the dataset.

Sharpness Threshold

This value represents the minimum sharpness a face must have in order to be extracted. A face can appear large in the frame and still be out of focus. The higher this value, the sharper a face has to be to be captured (for example, 40 is an acceptable value, while 100 means a face must be very sharp).
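
How "Min. Face Size" and "Sharpness Threshold" interact can be sketched in a few lines of Python. The class `FaceCandidate`, the function `is_extracted`, and the sharpness scores are hypothetical; the source does not describe how sharpness is measured, so the score is treated here as a given number on the same scale as the threshold.

```python
from dataclasses import dataclass


@dataclass
class FaceCandidate:
    width: int        # face width in pixels
    height: int       # face height in pixels
    sharpness: float  # hypothetical sharpness score


def is_extracted(face: FaceCandidate,
                 min_face_size: int = 60,
                 sharpness_threshold: float = 40.0) -> bool:
    """Return True if the face passes both the size and the sharpness check."""
    # Neither width nor height may fall below the minimum size.
    large_enough = face.width >= min_face_size and face.height >= min_face_size
    sharp_enough = face.sharpness >= sharpness_threshold
    return large_enough and sharp_enough


print(is_extracted(FaceCandidate(width=200, height=220, sharpness=25.0)))
print(is_extracted(FaceCandidate(width=80, height=90, sharpness=55.0)))
```

The first face is large but out of focus and is rejected; the second is smaller but sharp enough and is kept.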

Diversity Analysis

Face Area Threshold

This parameter specifies the minimum ratio of a displayed face to the entire image. If a face is smaller than the threshold, it is not included in the analysis. A value of 0.5 means that a face must fill at least 50% of the entire image to be included in the analysis.
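
The ratio can be worked through with a short Python sketch. It assumes that the ratio is computed from the pixel area of the face's bounding box and the pixel area of the image; the function names are illustrative, and the exact computation used by DeepVA is not stated in this FAQ.

```python
def face_area_ratio(face_w: int, face_h: int, image_w: int, image_h: int) -> float:
    """Share of the image area covered by the face bounding box (assumed definition)."""
    if image_w <= 0 or image_h <= 0:
        raise ValueError("image dimensions must be positive")
    return (face_w * face_h) / (image_w * image_h)


def is_included(face_w: int, face_h: int, image_w: int, image_h: int,
                face_area_threshold: float) -> bool:
    """Return True if the face is large enough to be included in the analysis."""
    return face_area_ratio(face_w, face_h, image_w, image_h) >= face_area_threshold


# A 480x540 face in a 1920x1080 image covers one eighth of the image.
print(face_area_ratio(480, 540, 1920, 1080))
print(is_included(480, 540, 1920, 1080, face_area_threshold=0.1))
print(is_included(480, 540, 1920, 1080, face_area_threshold=0.5))
```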

For more information on Diversity Analysis: https://docs.deepva.com/mining-modules/gender-neutrality-estimation

Object & Scene Recognition

Model

This parameter specifies which AI model is used for the analysis via Object & Scene Recognition. Currently, only one model at a time can be selected. In the future, it will be possible to use several models in one analysis.

Min. Confidence

This parameter specifies the minimum confidence the system must have in a label before a displayed object is reported as that object. The value can be between 0 and 1. A high value increases the accuracy of the analysis, but fewer objects are reported. With a low value, more objects are detected, but some of them may be labeled incorrectly.
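
The trade-off can be illustrated with a minimal Python sketch that filters a list of detections by confidence. The function `filter_by_confidence` and the example detections are made up for illustration and do not reflect the real output format of the module.

```python
from typing import List, Tuple

Detection = Tuple[str, float]  # (label, confidence between 0 and 1)


def filter_by_confidence(detections: List[Detection],
                         min_confidence: float) -> List[Detection]:
    """Keep only detections whose confidence reaches the minimum."""
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0 and 1")
    return [(label, conf) for label, conf in detections if conf >= min_confidence]


detections = [("car", 0.92), ("bicycle", 0.55), ("boat", 0.18)]
print(filter_by_confidence(detections, min_confidence=0.8))
print(filter_by_confidence(detections, min_confidence=0.1))
```

The high value keeps only the confident "car" label, while the low value also lets through the doubtful "boat" label.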

Language

Here you can change the language in which the recognized objects are labeled. German and English are both based on the same database.

For more information on Object & Scene Recognition: https://docs.deepva.com/mining-modules/object-scene-recognition

Lower Third Recognition

What is the difference between Lower Third Recognition and Face Dataset Creation?

Currently, there is no difference between the two modules. Historically, the Lower Third Recognition was used for automatic training data creation. In the future, Face Dataset Creation will be offered for the automated generation of datasets.

Why are single names not recognized in Lower Third Recognition?

In order to recognize single names in Lower Third Recognition in addition to first and last names, the parameter "Single Name Detection" in the module selection must be activated first.

Single Name Detection

Normally, a combination of first and last names is extracted from text inserts on screen and assigned to the corresponding face. If this option is activated, single first names or single surnames are extracted as well.

Min. Face Size

This value sets the minimum size of a face to be recognized. The height and width of a face must not fall below this value (in pixels). The smaller the value (for example, 60), the smaller the faces that are captured. Very small faces can reduce the quality of the dataset.

Sharpness Threshold

This value represents the minimum sharpness a face must have in order to be extracted. A face can appear large in the frame and still be out of focus. The higher this value, the sharper a face has to be to be captured (for example, 40 is an acceptable value, while 100 means a face must be very sharp).

For more information on Lower Third Recognition: https://docs.deepva.com/mining-modules/lower-third-recognition/

Landmark Recognition

Model

This parameter specifies which AI model is used for the analysis via Landmark Recognition. Currently, only one model can be selected. In the future, it will be possible to use several models in one analysis.

Min. Similarity

This value can be between 0 and 1 and indicates how closely a displayed landmark must match the training data to be recognized. A value of 1 means that the landmark must match the training data 100% (in this case, it would have to be the exact same image). If the value is very low, more landmarks are recognized, but the error rate can increase in turn.

For more information on Landmark Recognition: https://docs.deepva.com/mining-modules/landmark-recognition

QR-Code Detection

Speed Mode

This parameter specifies how fast image or video material is searched for QR codes. The faster the analysis, the more frames are dropped by the QR Code Detection (frame drop). Usually, however, medium or even fast processing is sufficient to detect every QR code displayed in a video.

For more information on QR-Code Detection: https://docs.deepva.com/mining-modules/qrcode-detection

Datasets

What is a dataset?

A dataset is a collection of image data used for the creation or training of an AI model. These image data provide the necessary input for an AI model to reliably recognize people (Face Recognition) or landmarks (Landmark Recognition) in image or video analysis. A dataset contains classes, and each class contains images of the respective person or landmark. For people, a class would be Angela Merkel, for example; for landmarks, the Elbphilharmonie would be a class.

What are the requirements for a dataset?

A dataset for Face Recognition must contain at least three images per class. A dataset for Landmark Recognition must contain at least one image per class. It is recommended to use images that contain only one face and are of good quality. Images of buildings should show the building in its entirety.
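
Before starting a training, the minimum image counts can be checked locally with a small Python sketch like the one below. It assumes that a dataset is available as a plain mapping from class name to a list of image file names; the function `classes_below_minimum` and the keys "face" and "landmark" are hypothetical and not part of DeepVA.

```python
from typing import Dict, List

# Minimum number of images per class, as stated in the requirements above.
MIN_IMAGES_PER_CLASS = {"face": 3, "landmark": 1}


def classes_below_minimum(dataset: Dict[str, List[str]], kind: str) -> List[str]:
    """Return the names of all classes that have too few images."""
    minimum = MIN_IMAGES_PER_CLASS[kind]
    return sorted(name for name, images in dataset.items() if len(images) < minimum)


faces = {
    "Angela Merkel": ["a1.jpg", "a2.jpg", "a3.jpg"],
    "Person B": ["b1.jpg"],
}
print(classes_below_minimum(faces, kind="face"))
print(classes_below_minimum({"Elbphilharmonie": ["e1.jpg"]}, kind="landmark"))
```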

Where can the Dataset ID be found?

The dataset ID can be found under "Datasets" in the overview or alternatively copied in the menu of a dataset.

I created a dataset, why can’t I use it for my analysis?

Before a dataset can be used in the analysis, a custom AI model must first be trained on it. This AI model can then be selected in the parameters when selecting the mining modules. A custom AI model can be created under "AI Models" via "Start Training", or alternatively from the "Datasets" menu.

What does the activation / deactivation of a class (an object in a class) mean?

A class or individual objects of a class can be deactivated for each dataset. This can be useful if individual persons or landmarks in images or videos should no longer be recognized. After deactivating a class or an object in a class, it is important to update the associated AI model. Deactivating a class or an object in a class does not automatically lead to exclusion in the analysis! Deactivation is a feature for training an AI model, not for the analysis.

For more information on Datasets: https://docs.deepva.com/core-resources/dataset

Dataset Evaluation

What does it mean if a dataset contains "warnings" after evaluation?

The result of a dataset evaluation can be a warning, shown in yellow. This means that there are outliers in certain classes. These images differ significantly from the other images in their classes and can, but do not necessarily, influence the training of an AI model based on that dataset. In any case, it is worthwhile to take a closer look at these images and possibly exclude them from the training.

What does it mean if a dataset contains "errors" after evaluation?

The result of a dataset evaluation can be an error, shown in red. Errors are images that are not suitable for training an AI model for certain reasons (quality, blur, etc.). They should be deactivated or removed before training, otherwise the performance of the AI model will be negatively affected.

Can I train an AI model despite errors and warnings?

Despite errors and warnings, an AI model can be trained from a dataset. The results of the evaluation do not prevent training, but it is still recommended to exclude errors from the training and to take a closer look at warnings.

When is a dataset suitable for training?

A dataset is evaluated as "good" (displayed in green) if it contains neither errors nor warnings. The dataset is then suitable for training an AI model.

Training

What status can the training of a dataset have?

The training of an AI model can have one of the following statuses:

  • Pending: The training of a new AI model has been initiated, but is not yet processed.
  • Processing: The training of a new AI model is currently being processed.
  • Completed: The training of a new AI model is completed. The AI model has been created and can now be used in the analysis.
  • Failed: The training of a new AI model could not be completed. A new AI model was not created. This can have several reasons:
      - A dataset must contain images, at least three for Face Recognition and at least one for Landmark Recognition.
      - The specified dataset does not exist (only relevant for API).
      - There were technical problems with DeepVA servers.

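
For anyone tracking these statuses in their own scripts, the four states can be modeled as a small Python enumeration. This is only a sketch: the class `TrainingStatus`, its string values, and the helper functions are assumptions for illustration and are not necessarily the values the DeepVA API returns.

```python
from enum import Enum


class TrainingStatus(Enum):
    # The string values are illustrative assumptions.
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


def is_finished(status: TrainingStatus) -> bool:
    """True once the training will not change its status anymore."""
    return status in (TrainingStatus.COMPLETED, TrainingStatus.FAILED)


def model_is_usable(status: TrainingStatus) -> bool:
    """Only a completed training produces an AI model that can be used."""
    return status is TrainingStatus.COMPLETED


for status in TrainingStatus:
    print(status.name, is_finished(status), model_is_usable(status))
```
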
Can I enhance an existing AI model from DeepVA with my own training data?

An individually created AI model can be enhanced by adding classes and images to the corresponding dataset. Simply start a training and select the option "Update Existing Model". Here, a version number or a description of the changes can be added under "Change Log". A pre-built AI model provided by DeepVA cannot be enriched with your own training data at this time. In the future, however, the parameter settings shown when selecting the mining modules will offer the option to use several AI models for analysis, so that you can also include your own model.

For more information on AI models: https://docs.deepva.com/core-resources/model

Storage

What is the storage?

The storage is the file system of DeepVA. On the one hand, all image data from Face Dataset Creation is stored here; on the other hand, your own images and videos can be stored here for analysis. The storage therefore works as a database to store image and video files and manage them internally in DeepVA.

What are reasons to use the storage?

Using the storage can be practical if no public URLs should be used for analysis. The storage is also used if the image or video material to be analyzed is located on your own hard disk. In this case, it is first uploaded to the storage and then analyzed. This means that all files come from one system and the speed of the analysis increases. Image data from Face Dataset Creation is also stored in the storage.

Why can't I move files in the storage?

Currently, it is not possible to move files in the storage. This feature is planned. If an image or video file is to be stored in another folder, it must be deleted and uploaded to the respective folder again.

Can I upload multiple files to the storage?

Under "Select Files", several files can be selected and uploaded to the storage at once by holding the Shift key or the Ctrl key while selecting.

What do I need to pay attention to when using the storage?

Image or training data obtained using Face Dataset Creation is stored in the storage. If this data is deleted in the storage, it will also be deleted in the respective dataset. This leads to problems in the creation and use of your own AI models.

For more information on the Storage: https://docs.deepva.com/#storage