Changelog
The changelog is a record of changes to the DeepVA software.
05. Nov 2024
- Releasing celebrities v33 model for Face Recognition mining module
09. Oct 2024
- Hotfix: Fixing an issue where transcript exports were failing due to a wrong offsetting of paragraphs
02. Oct 2024
- Adding support for formatting VTT and SRT files when exporting transcripts (`max_line_width` and `max_line_count`) to align with industry standards
- Adding a `tag` field to the export resource
- Adding a `transcript_tag` field to the `on_created` event of the transcript version webhook
- Restricting the language field of a transcript variant to a dropdown of ISO languages
- Adding support for `*.ogg` audio files (`audio/ogg` and `application/ogg`)
- Adding the functionality to adjust the playback speed in the video player on the Transcript Editor page
- Fixing an issue where saving a transcript would jump to the first page in the Transcript Editor
- Adding a message for the HTTP 403 on the external Transcript Editor when permissions were restricted
- Fixing an issue with the pagination functionality of the dictionaries page
- Adding a global 404 page to the frontend
- Minor UI improvements
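The `max_line_width` and `max_line_count` export options from this release are not specified further in this changelog. As a rough illustration of what such limits usually mean for subtitle files, the sketch below wraps text to a maximum line width and groups the wrapped lines into cues of at most `max_line_count` lines. The helper name, the default values and the wrapping strategy are assumptions for illustration, not DeepVA's export logic.

```python
import textwrap


def split_into_cues(text, max_line_width=42, max_line_count=2):
    """Wrap text to max_line_width and group the lines into cues.

    Hypothetical helper: it only illustrates the general idea of
    line-width and line-count limits, not DeepVA's implementation.
    """
    lines = textwrap.wrap(text, width=max_line_width)
    # Each cue holds at most max_line_count consecutive lines.
    return [
        lines[i:i + max_line_count]
        for i in range(0, len(lines), max_line_count)
    ]


TEXT = (
    "Subtitles are easier to read when every line is short "
    "and each cue shows only a few lines at a time."
)
cues = split_into_cues(TEXT, max_line_width=30, max_line_count=2)
for cue in cues:
    print("\n".join(cue))
    print()
```

Smaller values produce more, shorter cues; which values match a given delivery specification depends on the target platform.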
27. Sep 2024
- Changing default threshold for Dictionary Specification to a value of `0.90` due to better quality of matching
- Fixing an issue where the order of paragraphs in the Transcript Editor was not correct after saving
18. Sep 2024
- Adding the option to restrict the JWT token for authentication of the stand-alone version of the Transcript Editor to a specific transcript via a custom permissions claim
- Changing the `exp` claim from optional to mandatory in the JWT token for authentication of the stand-alone version of the Transcript Editor
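To make the two token changes from this release concrete, here is a minimal sketch of building a JWT that carries an `exp` claim and a claim restricting access to one transcript. Only the `exp` claim name comes from this changelog. The HS256 algorithm, the secret, and the `permissions` claim name and shape are assumptions; the Transcript Editor documentation defines the real ones.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_token(secret: bytes, transcript_id: str, lifetime_s: int = 3600) -> str:
    header = {"alg": "HS256", "typ": "JWT"}  # assumed algorithm
    payload = {
        # `exp` is mandatory as of this release.
        "exp": int(time.time()) + lifetime_s,
        # Hypothetical claim name and shape, for illustration only.
        "permissions": {"transcript_id": transcript_id},
    }
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret, signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


token = make_token(b"example-secret", "transcript-123")
```

A short lifetime limits how long a leaked token stays usable, which is presumably why the expiry is no longer optional.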
16. Sep 2024
- Fixing an issue where empty segments from Voice Activity Detection were causing the whole Speech Recognition job to fail
- Improving the robustness of downloads of external URLs as job sources by implementing retry logic and optimizing timeout handling
02. Sep 2024
- Improving fuzzy dictionaries for word correction in Speech Recognition
- Small bugfixes
19. Aug 2024
- Optimization of the loading of transcript segments in the Transcript Editor
10. Aug 2024
- Displaying more job-related information on the job overview and the job detail page
- Adding filter for tag field on the job overview page
- Displaying the tag field of transcripts on the transcript overview page
- Adding filter functionality to the transcripts page
- Adding a spinner while loading the transcript paragraphs for a better user experience
- Improving the color-coding for transcript word-level confidence in the Transcript Editor
22. July 2024
- Displaying a descriptive network error instead of the generic unknown error when a service times out
- Adding an action to the source list context menu to start a job from the source on the right info card when a job is selected
- Small bugfixes
12. July 2024
- Adding the fields `export_id` and `job_id` to the Artifacts resource for exports and job-based generated artifacts
11. July 2024
- Adding three new webhook events for exports (`deepva.export.on_started`, `deepva.export.on_completed` and `deepva.export.artifact.on_created`)
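A receiver for the three export events could route incoming notifications by event name, roughly as sketched below. The event names are taken from this entry. The payload layout (an `event` key next to a `data` object) and the `id` field are assumptions for illustration; the Webhooks documentation defines the actual schema.

```python
import json

received = []  # collects (event, data) pairs for demonstration

# Map each export event name to a handler.
HANDLERS = {
    "deepva.export.on_started":
        lambda data: received.append(("started", data)),
    "deepva.export.on_completed":
        lambda data: received.append(("completed", data)),
    "deepva.export.artifact.on_created":
        lambda data: received.append(("artifact_created", data)),
}


def dispatch(raw_body: str) -> bool:
    """Route a webhook body to its handler.

    Returns False for events this receiver does not handle.
    The "event"/"data" keys are a hypothetical payload shape.
    """
    message = json.loads(raw_body)
    handler = HANDLERS.get(message.get("event"))
    if handler is None:
        return False
    handler(message.get("data", {}))
    return True


ok = dispatch('{"event": "deepva.export.on_completed", "data": {"id": "x1"}}')
unknown = dispatch('{"event": "deepva.job.something_else", "data": {}}')
```

Returning `False` for unknown events keeps the receiver tolerant of event types added in later releases.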
08. July 2024
- Fixing a bug where the `time_created` field of the `TranscriptVariantVersion` was not updated correctly
04. July 2024
- Adding a page to manage Access Keys to the Preferences
28. June 2024
- Introducing Access Keys enabling integrating of various frontend components into user workflows
- Adding a stand-alone version of the Transcript Editor accessible through JWT token authentication to enable integration into custom solutions and workflows
20. June 2024
- Adding a tag field to the Transcript resource allowing to tag a transcript with a custom ID or label for reference
17. June 2024
- Adding DOCX (.docx) as a new export format for the Transcript Editor
- Fixing a bug where the creation of a transcript from a job with multiple mining modules was failing
- Introducing an API-wide webhook system to subscribe to certain events. See Webhooks for more information.
14. June 2024
- Releasing celebrities v31 model for Face Recognition mining module (EURO 2024 update)
10. June 2024
- Adding export functionality for the transcript editor to export the transcript in SRT and WebVTT subtitle format
- Releasing celebrities v2 model for Speaker Identification mining module (350+ new identities added)
- Introducing a new argument for the Speech Recognition module to disable the paragraph formatting which combines multiple segments into a paragraph of sentences (`format_paragraph`). This is useful for subtitles where short segments are expected.
- Fixing a bug where the transcript view errors out when the linked job has been removed
- Fixing a bug where an empty source was accepted for the job creation
- Fixing a bug where the transcript editor did not jump to the next page while playing the video
- UI improvements for the transcript editor
- Improving the status visualization of exports
- Minor bugfixes
17. May 2024
- Releasing celebrities v30 model for Face Recognition mining module
03. May 2024
- Release of the Transcript Editor (Beta 1) that allows users to edit and manage transcripts generated by the Speech Recognition mining module
- Adding cancellation feature for ongoing uploads in the storage uploader dialog to ensure smoother user experience
- Fixing various minor UI bugs across dictionaries, storage page, dataset class clusters and indexed identities
- Fixing an issue related to dataset type loading based on selected model type in the training wizard
- Fixing an issue where re-running jobs did not pre-select the correct model
- Minor UI bugfixes and usability improvements
26. Apr 2024
- Releasing celebrities v29 model for Face Recognition mining module (100+ new identities added)
19. Mar 2024
- Adding indexing capabilities to the Speaker Identification module, which allow fingerprinting and indexing a speaker's voice
- Introducing API endpoints for the upcoming Transcript Editor
- Minor bugfixes
08. Mar 2024
- Releasing celebrities v28 model for Face Recognition mining module (new identities added)
28. Feb 2024
- Fixing a bug to prevent possible class name folder collision during dataset exports
12. Dec 2023
- Releasing the Training and Evaluation for speaker datasets
- Changing the default sorting of the job results in the UI with a video source to be by `frame_start`
- Minor bugfixes
28. Nov 2023
- New Visual Mining Module: Personal Data Anonymization which blurs faces and license plates in images and videos
16. Nov 2023
- Releasing celebrities v27 model for Face Recognition mining module (new identities added)
14. Nov 2023
- Adding a feature to the Object and Scene Recognition module which enables captioning (`enable_captioning` module parameter)
- Fixing a bug where IAIS Face Dataset Export and IAIS Audio Dataset Export did not export more than 100 classes
- Adding a summary as CSV file for the IAIS Face Dataset
- Minor bugfixes
27. Oct 2023
- Adding MapDictionary as a new type of Dictionary
- Adding the capability to allow users to specify a start and end range for partial video processing
- Internal database upgrades
- Migrate to explicit UUID for Knowledge Graph Node IDs as a preparation for the upcoming Knowledge Graph release
- Improving the inference time for batch of images in Face Recognition mining module
- Fixing a bug in custom IP address whitelisting on API-Keys
- Minor bugfixes
24. Oct 2023
- Releasing celebrities v26 model for Face Recognition mining module (new identities added)
10. Oct 2023
- Releasing a new model (`zero-shot`) for Object and Scene Recognition including a large set of pre-trained labels with powerful zero-shot generalization capabilities which is customizable with your own dictionary of labels.
- Adding face dataset export for IAIS format
- Display the detected main language on Subtitle Detection result page
- Correcting some translation errors for the German UI
- Adding punctuation information to word-level timestamps for Speech Recognition
28. Sep 2023
- Fixing an issue where unicode decoding errors were not handled properly for custom dictionaries
- Introducing a new field for dataset samples, `used_in_training`, which indicates whether the sample was used for training. The field is optional and will be part of the response object if the `include_used_in_training` query parameter is given when requesting the `/samples`, `/audios` or `/images` endpoints
- Allowing partners to specify and permit specific CORS origins for API requests
05. Sep 2023
- Releasing celebrities v25 model for Face Recognition mining module (new identities added and providing class IDs for our pre-trained model too)
- Fixing an issue where empty results were causing the Speaker Dataset Creation job to fail
- Minor UI bugfixes for Speech Recognition and Subtitle Detection result page
22. Aug 2023
- Add Translation section to Speech Recognition results page
- Small usability improvements for the Job result section
08. Aug 2023
- Adding the support of multiple models (chain of models) for Face Recognition
- New Visual Mining Module: Subtitle Detection which detects the appearance of burned-in subtitles, its position, language and the actual text content of the subtitle
- Fixing an issue with Speech Recognition where the punctuation ended up in the next token instead of the current one
- Improving some minor performance issues in the DeepVA Worker deployments by limiting the resources
- Minor UI bugfixes
25. July 2023
- Improving the scrolling and user interaction for Speech Recognition transcripts while playing the video or audio on the job result page
- Avoid showing redundant parent labels for Object and Scene Recognition in the UI
- Adding E-Mail verification and password reset functionality
- Fixing an issue where Face Recognition jobs were failing for specific edge cases
11. July 2023
- Enable the support of live streams for Face Recognition
- Automatic translation for Speech Recognition transcripts to several languages
- Small UI bug fixes
27. June 2023
- Adding the management of text-based dictionaries to the UI
- Improve the performance of thumbnail loading in the UI
- Overhauling of the webhook event names by introducing namespaces. See Webhook Resource for more information.
- Removing the argument `word_level_timestamps` for Speech Recognition and enable it by default always
- New status page to monitor the health of our services at status.deepva.com
- Adding an audio player to the job result page for audio-based sources
- Fixing an issue where the session was not updated when changing the password
- Joining text segments of Speech Recognition to a paragraph in order to improve Named Entity Recognition (NER) results
- Handling and reporting an error when the source video has no audio track
- Introducing a new mode for Speech Recognition in order to choose quality over speed of the processing (`mode`)
- Fixing an issue where filtering on dataset evaluation feedback was not working in the UI
- Fixing an issue where white spaces and umlauts in the file name did not allow the video player to play the source from the storage
- Minor bug fixes
13. June 2023
- Releasing celebrities v24 model for Face Recognition mining module
26. May 2023
- Improving performance of Speech Recognition mining module (improved timestamps and transcript quality)
- Adding word-level timestamps for Speech Recognition mining module
- Introducing Named Entity Recognition (NER) for Speech Recognition mining module
- Introducing text-based dictionaries to customize the result of mining modules such as Speech Recognition and Lower Third Recognition which for example enables customized named entity recognition
- Introducing editing mode of transcripts for Speech Recognition mining module
- Adding the spoken language to a speech segment for Speech Recognition mining module
- Releasing celebrities v23 model for Face Recognition mining module
- Dropping video transcoding for files supported by the browser (direct play)
- Introducing Job Batches to group Jobs
- Introducing Diversity Reporting across jobs in a batch hierarchy
- Increasing an internal network request timeout that was exceeded occasionally for result submission of large jobs
- Fixing an issue where listing training sample sources from a prediction was not working ("Go to training source")
- Fixing an issue where updating the value of custom fields of a dataset class was not handled correctly from inside the class view
- Fixing an issue where the video player did not show up for failed jobs
- Adding reverse chronological sorting on the storage page
- Improving some default thresholds for Advanced Diversity Analysis mining module
- Internal improvements and minor bug fixes
23. Feb 2023
- Releasing celebrities v22 model for Face Recognition mining module
- Hotfix: Fixing broken video player for public URLs and YouTube videos
22. Feb 2023
- Hotfix: Fixing an issue where jobs with a large number of detailed-results were failing due to a limitation of the payload size
- Minor fixes
17. Feb 2023
- Adding a feature to the Face Recognition mining module which returns the top k most similar identities (`enable_top_k` module parameter).
- Fixing an issue regarding the max. file name length on the storage (increased from 100 to 255)
- Improving the performance of processing a batch of images in a job
- Providing OpenAPI/Swagger specifications for all endpoints at https://api.deepva.com/swagger
- Fixing an issue regarding the request limit on webserver level
- Renaming the value for the field `media_type` from `wav` and `mp3` to `audio` (following values are supported: `image`, `video`, `audio`, `videostream`, `pdf`, `xml`)
- Introducing a `ttl` field to the Job object which allows setting a time-to-live in seconds until the job will be deleted
- Improving the performance of the file upload to the storage
- Improving the Speech Recognition mining module by adding Voice Activity Detection (`enable_vad` module parameter)
- Fixing an issue where custom fields were not saved for datasets of type `audio`
- Fixing the calculation of the job progress for audio based modules such as Speech Recognition
- Fixing an issue where the filtering for "Go to training source" from a Face Recognition result did not work properly
- Fixing the broken folder dropdown in the Visual Mining Job wizard
- Adding the support for Instagram Reels and Tiktok Video URLs as job sources
- Adding the support for m4a files
- UI improvements
- Minor fixes and improvements
13. Dec 2022
- New Visual Mining Module: Speech Recognition which enables speech-to-text functionality
- Fixing an issue where `detailed_link` in the Job and Detailed Results object was broken (wrong HTTP scheme)
- Improving the handling of static files
- Add expiration date of users
- Prevent stopping of jobs in the state `waiting`
- UI improvements
- Minor fixes
30. Nov 2022
- Introducing stop operation for jobs (Jobs can be stopped at any progress without losing their results)
- Adding support for audio files on the storage
- Adding Dataset Management for audio datasets
- Introducing abstraction for training samples by adding a general `/samples` endpoint + endpoints for `/images` and `/audios`
- Adding the ability for annotation of audio segments via the UI
- Improving the fairness of job queuing by introducing a "fair queue"
30. Sep 2022
- New Visual Mining Module: Speaker Dataset Creation which enables to automate the retrieval of audio-based datasets (similar to Face Dataset Creation)
29. Mar 2022
- New Visual Mining Module: Advanced Diversity Analysis which gives more detailed result than the previous Diversity Analysis. The previous module will be still available for backward compatibility reasons.
- Introducing summarized results for jobs which is only used by Advanced Diversity Analysis so far. The ResultSummary of the job object will become deprecated and is going to be removed in the v2 of the API.
- Fixing an issue when showing a large list of custom fields on the preferences page
- Fixing an issue which broke the ability to update the value of a custom field on the class level
- Fixing an issue where the timeline chart was not updated properly on the job result page
07. Mar 2022
- UI improvements
- Adding a preferences page for general account settings
- Adding German language to the UI (accessible via the preferences page)
- Adding custom fields for dataset classes to the preferences page
- Faster processing of YouTube links
- Fixing an issue where some endpoints were redirected if no trailing slash was given
- Adding the ability to use more than one model for face recognition in on-prem environment
- Refactor UI for the storage section
- Adding search and filter to storage picker in the Visual Mining Wizard
- Introducing v2 of the API (BETA!, not recommended for production yet)
- Minor fixes
26. July 2021
- Show fallback image if image url is not available anymore
- Fix an issue when showing landmark recognition result
- Show spinner while results are loading
- Minor fixes
22. Jun 2021
- Existing fields were updated in the API response:
  - The existing fields `frame_start` and `frame_end` of the Detailed Result object will have the zero-based index of the source when a batch of images is passed to a job (before both fields had a `null` value for image jobs)
- New fields will be added to the API response by Friday, 25. Jun 2021:
  - A new string field called `type` will be added to the Class object representing the inherited type of the dataset (e.g. "face" or "landmark")
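Since `frame_start` now carries the zero-based index of the source for image batches, results can be mapped back to the submitted images. The sketch below assumes the batch is available as an ordered list and that each detailed result is a dictionary. The `label` key and the file names are made up for illustration; only `frame_start` and `frame_end` come from this changelog.

```python
def group_by_source(sources, detailed_results):
    """Group detailed results by the image they belong to.

    Assumes `frame_start` is the zero-based index into `sources`,
    as described for image batch jobs.
    """
    grouped = {source: [] for source in sources}
    for result in detailed_results:
        grouped[sources[result["frame_start"]]].append(result)
    return grouped


# Hypothetical batch and results.
sources = ["a.jpg", "b.jpg", "c.jpg"]
detailed_results = [
    {"frame_start": 0, "frame_end": 0, "label": "face"},
    {"frame_start": 2, "frame_end": 2, "label": "landmark"},
    {"frame_start": 2, "frame_end": 2, "label": "face"},
]

grouped = group_by_source(sources, detailed_results)
```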
16. Jun 2021
- Fixing an issue where Dataset Evaluation was failing for large datasets
04. Jun 2021
- Management of index collections
- UI improvements
- Sorting for number of images per class added
- Small performance improvements on storage level
- Face Recognition: +1k identities added to our pre-trained model
- Object & Scene Recognition: Improved model 'general-c' added
- Landmark Recognition: Improved model 'general-b' added
- Improved performance for Dataset Evaluation
- Minor bugfixes
12. Feb 2021
- UI improvements
- Small bugfixes on face recognition result visualization
01. Feb 2021
- Thumbnails added to Face Recognition result
- Indexing of unknown identities for Face Recognition added (See module parameters)
- Showing Diversity Analysis chart on module result and dataset level
20. Nov 2020
- The Help Center is now available with some tutorial videos
- Performance improvements for operations on datasets
- New Visual Mining Module: Aspect Ratio Detection
- Mining Module `Gender Neutrality Estimation` renamed into `Diversity Analysis` since it has a new ability to detect the age as well (Module ID has changed to `diversity_analysis`).
- Bugfixes and improvements for Dataset Evaluation
29. Oct 2020
- Hotfix: Broken video player for videos on DeepVA Storage
23. Oct 2020
- UI Job page re-designed
- Dataset Evaluation available to check the quality of your datasets (see FAQ)
- Support of MOV (QuickTime) video format added
- Class page loading time improved
- Minor bugfixes
28. Aug 2020
- UI Dataset, Class and Image page re-designed
- Page loading time improvements
- Several bugfixes
14. Aug 2020
- Landmark Recognition: Improvements for `general` model
- New Visual Mining Module: QR Code Detection offering the possibility to find and decode QR Codes but also EAN13 codes (European Article Number) and their corresponding product names in your videos and images.
11. Aug 2020
- List of Detections added to DetailedResult resource (enable the user to get bounding-boxes of a face when applying Face Recognition)
- Support of MXF video format added
- Support of M3U8 stream URLs added
- Landmark Recognition: Europe + North America pre-trained model released (general)
10. Jul 2020
- Custom training of Landmark Recognition models
29. May 2020
- Model versioning
- UI Model page re-designed
- Multi-file upload feature added
15. Apr 2020
- New Visual Mining Module: Landmark Recognition offering the possibility to identify all important sights, architectural structures and natural monuments across the world
28. Apr 2020
- Video player for job result added
01. Apr 2020
- Custom training of Face Recognition models
- Dataset management via UI
- New Visual Mining Module: Gender Neutrality Estimation offering the possibility to determine the percentage of gender occurrence in images or videos. Ensure your desired ratio between male and female in any content.