Jean-Philippe Moreux edited this page Apr 8, 2020 · 16 revisions

This graph presents the main characteristics of the dataset.

Ads dataset statistics

Object detection

YOLO v3 has been applied to the ads images (see the "Face and object detection" section on the Image Retrieval page). Seven "transport" classes are used: bicycle, car, motorbike, aeroplane, train, truck, boat. YOLO v3 generated 17.5k annotations (1,400 of them on the transport classes).
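As an illustration of how the transport subset can be isolated from the full set of detections, here is a minimal Python sketch. The detection record layout (`label`, `confidence`), the `filter_transport` helper and the sample values are assumptions made for this example; they do not describe the actual GallicaPix data format.

```python
# The seven transport classes, spelled as in the text above.
TRANSPORT_CLASSES = {"bicycle", "car", "motorbike", "aeroplane", "train", "truck", "boat"}

def filter_transport(detections, min_confidence=0.0):
    """Keep detections whose label is a transport class and whose score passes the threshold.

    `detections` is assumed to be a list of dicts with "label" and "confidence" keys
    (a hypothetical layout chosen for this sketch).
    """
    return [
        d for d in detections
        if d["label"] in TRANSPORT_CLASSES and d["confidence"] >= min_confidence
    ]

# Hypothetical detections, for illustration only (not taken from the dataset).
sample = [
    {"label": "car", "confidence": 0.91},
    {"label": "person", "confidence": 0.88},
    {"label": "boat", "confidence": 0.42},
]

transport = filter_transport(sample, min_confidence=0.5)
```

Raising `min_confidence` trades recall for precision, which matters on heritage material where low-confidence detections are frequent.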

Human annotations

YOLO may have serious difficulty detecting objects in heritage newspaper ads. Consequently, a human annotation campaign was carried out on the whole dataset in order to fix false positives and false negatives. 3.5k annotations were produced using the editing features of the GallicaPix web app.
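The notions of false positive and false negative used here can be made concrete with a short Python sketch that compares the model's labels with the human-corrected labels for one image. It assumes, for simplicity, that annotations are reduced to per-image sets of class labels (bounding boxes are ignored); the `compare_labels` helper and the sample labels are hypothetical.

```python
def compare_labels(predicted, corrected):
    """Return (false_positives, false_negatives) as sets of labels for one image."""
    predicted, corrected = set(predicted), set(corrected)
    false_positives = predicted - corrected  # proposed by the model, rejected by annotators
    false_negatives = corrected - predicted  # missed by the model, added by annotators
    return false_positives, false_negatives

# Hypothetical labels for a single ad image (illustration only).
fp, fn = compare_labels(["car", "boat"], ["car", "bicycle"])
```

A box-level comparison would additionally need an overlap criterion such as intersection over union, which this sketch deliberately leaves out.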

Analysis
