DeepWILD: Wildlife Identification, Localisation and estimation on camera trap videos using Deep learning
Journal article, Ecological Informatics, 2023

Abstract

Videos and images from camera traps are increasingly used by ecologists to estimate species populations on a territory. This is laborious work, since experts must analyse massive data sets manually, and filtering out the many videos that contain no animals or show human presence also takes considerable time. Fortunately, deep learning object detection algorithms can help ecologists identify multiple relevant species in their data and estimate their populations. In this study, we go further by using an object detection model to detect, classify and count species in camera trap videos. To this end, we developed a 3-step process: (i) first, after splitting videos into images, we annotate the images by associating a bounding box with each label using the MegaDetector algorithm; (ii) then, we extend MegaDetector, based on a Faster R-CNN architecture with an Inception-ResNet-v2 backbone, to not only detect the 13 relevant classes but also classify them; (iii) finally, we design a method to count individuals based on the maximum number of bounding boxes detected. This final counting stage is evaluated in two different contexts: first using detection results only (i.e. comparing our predictions against the true number of individuals, regardless of their class), then using both detection and classification results (i.e. comparing our predictions against the true number in the correct class). On the test data set, our model obtains: (i) 73.92% mAP for classification, (ii) 96.88% mAP for detection at an Intersection-over-Union (IoU) threshold of 0.5 (the overlap ratio between the ground-truth and detected bounding boxes), and (iii) 89.24% mAP for detection at IoU = 0.75.
Highly represented classes, such as humans, reach the highest mAP values, around 81%, whereas classes less represented in the training data set, such as dogs, have the lowest, around 66%. With the proposed counting method, the predicted count is exact or within ±1 of the true count for 87% of the test data set using detection results alone, and for 48% using both detection and classification results. Our model is also able to detect empty videos. To the best of our knowledge, this is the first study in France to use an object detection model on a French national park to locate, identify and estimate the populations of species from camera trap videos.
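The counting rule in step (iii), taking the maximum number of confident bounding boxes found in any single frame of a video, can be sketched as follows. This is a minimal illustration, not the authors' code: the `frame_detections` structure, the function names, and the 0.5 score threshold are assumptions.

```python
from collections import Counter

def count_individuals(frame_detections, score_threshold=0.5):
    """Estimate the number of individuals in a video as the maximum
    number of confident bounding boxes found in any single frame.

    frame_detections: list of frames; each frame is a list of dicts
    with "label" (predicted class) and "score" (detection confidence).
    """
    return max(
        (sum(1 for det in frame if det["score"] >= score_threshold)
         for frame in frame_detections),
        default=0,  # an empty video yields a count of 0
    )

def count_per_class(frame_detections, score_threshold=0.5):
    """Per-class variant: for each predicted class, take the maximum
    per-frame count of boxes of that class across the whole video."""
    counts = Counter()
    for frame in frame_detections:
        frame_counts = Counter(
            det["label"] for det in frame if det["score"] >= score_threshold
        )
        for label, n in frame_counts.items():
            counts[label] = max(counts[label], n)
    return dict(counts)

# Example: two deer appear together in the second frame only.
frames = [
    [{"label": "deer", "score": 0.9}],
    [{"label": "deer", "score": 0.8}, {"label": "deer", "score": 0.7}],
]
print(count_individuals(frames))  # 2
print(count_per_class(frames))    # {'deer': 2}
```

The per-class variant corresponds to the stricter evaluation context (right number in the right class), where a misclassified box counts against the prediction.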
Main file: article_deepWild-light.pdf (2.73 MB), produced by the author(s)

Dates and versions

hal-03797530, version 1 (04-10-2022)
hal-03797530, version 2 (04-04-2023)

Cite

Fanny Simões, Charles Bouveyron, Frédéric Precioso. DeepWILD: Wildlife Identification, Localisation and estimation on camera trap videos using Deep learning. Ecological Informatics, 2023, 75, ⟨10.1016/j.ecoinf.2023.102095⟩. ⟨hal-03797530v2⟩