Automated species identification in camera trap images for wildlife conservation

Abstract

Wildlife conservation involves protecting, preserving, and managing wildlife species and their habitats. The rapid pace of human development, climate change, and other unsustainable practices has heightened the need for it. Despite significant progress in species identification using deep-learning models, challenges remain in detecting small animals in low-contrast camera trap images because of limited feature extraction capabilities. This thesis presents a novel end-to-end framework that integrates a shifted-window-based local self-attention mechanism with enhanced feature fusion in an object detection head and incorporates a multimodal large language model (MLLM) to address these limitations. The proposed architecture uses a Swin-BiFPN backbone integrated into a Faster R-CNN detection network, coupled with a visual semantic extraction module driven by the LLaVA v1.5 (13B) MLLM. The detection framework, capable of extracting crucial features from challenging trap images, demonstrates consistently high results and robust generalization. The visual semantic extraction module adds zero-shot detection capability and provides insights and emergent cues about the animal’s behavior, further supporting conservation efforts. The MLLM evaluation was conducted across five MLLMs using both traditional NLP metrics (precision, recall, F1, and SBERT similarity) and subjective scoring by LLM-based judges (GPT-4.1 and GROK 3.0), and it demonstrates the model’s strong performance in visual description generation. Overall, the proposed framework improves detection accuracy for small animals and low-contrast trap images while also providing zero-shot detection through the MLLM.
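As an illustration of the traditional NLP metrics named in the abstract, the following is a minimal sketch of how precision, recall, and F1 could be computed between a generated image description and a reference description. It assumes a simple whitespace-tokenized bag-of-words overlap; the abstract does not state how the thesis computes these metrics, so the function name `token_overlap_scores` and the example sentences are hypothetical.

```python
from collections import Counter


def token_overlap_scores(candidate: str, reference: str) -> dict:
    """Bag-of-words precision, recall, and F1 between two descriptions.

    Assumption: lowercase whitespace tokenization; the thesis may use a
    different tokenizer or matching scheme.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection: a shared token counts only as often as it
    # appears in both texts.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical descriptions, not taken from the thesis.
scores = token_overlap_scores(
    "a small deer standing in tall grass",
    "a deer standing in grass at night",
)
print(scores)
```

Token overlap rewards exact word matches only, which is presumably why the evaluation also reports an embedding-based measure such as SBERT similarity alongside it.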

Description

Cataloged from PDF version of thesis.
Includes bibliographical references (pages 51-53).
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2025.

Type

Thesis