Camera traps are increasingly used for wildlife monitoring in ecology and conservation, but they generate very large volumes of images, which makes manual annotation inefficient and yields many false positives (e.g., empty frames triggered by wind or lighting changes). Single deep learning models, such as MegaDetector (YOLO-based), MLWIC2, DeepFaune, SpeciesNet, and AddaxAI, perform well in specific scenarios but struggle to generalise across environments. Prior research shows that ensemble methods such as voting and averaging improve robustness (Norouzzadeh et al., 2018), and that meta-learning via stacking (e.g., with XGBoost) can further refine results.
This Master's project introduces a hierarchical ensemble framework that integrates seven open-source, locally deployable models (e.g., PyTorch-Wildlife, MLWIC2, DeepFaune, SpeciesNet, AddaxAI) to reduce false positives, improve species-level accuracy, and broaden the range of species that can be identified. The methodology comprises three phases:
Phase 1: Model Evaluation assesses the models on ten heterogeneous datasets. Preprocessing involves normalisation, missing-value handling, and the creation of subsets for the binary (animal vs. non-animal) and species-level tasks. Smoothing is applied to rare species so that evaluation remains balanced.
Phase 2: Ensemble Learning combines model outputs with weighted methods to improve accuracy. Binary detection employs accuracy-based weighted averaging with an F1-optimised decision threshold found by grid search. Species classification applies dynamic weighted voting with species-specific weights and is run only on images classified as containing an animal. Performance is compared using evaluation metrics and confusion matrices.
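The binary step of Phase 2 can be illustrated with a minimal sketch. It assumes that each model outputs one "animal" confidence per image, that model weights are validation accuracies normalised to sum to 1, and that the threshold grid runs from 0.05 to 0.95; the function names, model names, and toy numbers are hypothetical and are not taken from the project code.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy_weights(val_accuracy: Dict[str, float]) -> Dict[str, float]:
    """Normalise per-model validation accuracies so the weights sum to 1."""
    total = sum(val_accuracy.values())
    return {name: acc / total for name, acc in val_accuracy.items()}


def weighted_average(scores: Dict[str, Sequence[float]],
                     weights: Dict[str, float]) -> List[float]:
    """Combine per-model 'animal' confidences into one score per image."""
    n_images = len(next(iter(scores.values())))
    return [sum(weights[m] * scores[m][i] for m in scores)
            for i in range(n_images)]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive (animal) class; 0.0 when it is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true: Sequence[int],
                   ensemble_scores: Sequence[float],
                   grid: Sequence[float]) -> Tuple[float, float]:
    """Grid search: return the first threshold that maximises F1, and that F1."""
    best_t, best_f1 = grid[0], -1.0
    for t in grid:
        preds = [1 if s >= t else 0 for s in ensemble_scores]
        f1 = f1_score(y_true, preds)
        if f1 > best_f1:  # strict '>' keeps the lowest threshold on ties
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Toy data, made up for illustration only (1 = animal, 0 = empty frame).
val_accuracy = {"model_a": 0.9, "model_b": 0.6}
scores = {
    "model_a": [0.9, 0.2, 0.7, 0.4],
    "model_b": [0.8, 0.6, 0.3, 0.1],
}
y_true = [1, 0, 1, 0]

weights = accuracy_weights(val_accuracy)
ensemble = weighted_average(scores, weights)
grid = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
threshold, f1 = best_threshold(y_true, ensemble, grid)
print(f"threshold={threshold:.2f}, F1={f1:.2f}")
```

In practice the threshold would be selected on a validation split and then applied unchanged to held-out images, so that the reported F1 is not optimistically biased.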
Phase 3: Meta-Model Stacking trains higher-level models on meta-features such as model confidences, agreement rates, entropy, and confidence-weight interactions. Rare classes are filtered out to aid generalisation, and the data are split 70/15/15 (train/validation/test) with stratification. The meta-models (XGBoost, Random Forest, Logistic Regression) are tuned with GridSearchCV and benchmarked against the Phase 2 baselines; they show fewer misclassifications and improved stability through confidence-weighted features and inter-model consistency measures.
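A minimal sketch of how such meta-features might be derived for a single image is given below. It assumes each base model contributes one (label, confidence) pair and that a per-model weight is available from Phase 2; the feature names, the use of Shannon entropy over the vote distribution, and the exact form of the confidence-weight interaction are illustrative assumptions rather than the project's definitions.

```python
import math
from collections import Counter
from typing import Dict, Tuple

Prediction = Tuple[str, float]  # (predicted label, confidence in [0, 1])


def meta_features(preds: Dict[str, Prediction],
                  weights: Dict[str, float]) -> Dict[str, float]:
    """Summarise one image's base-model predictions as meta-features."""
    labels = [label for label, _ in preds.values()]
    confs = [conf for _, conf in preds.values()]
    n = len(labels)
    votes = Counter(labels)
    _, top_count = votes.most_common(1)[0]

    # Shannon entropy (bits) of the vote distribution, written as
    # sum(p * log2(1/p)); it is 0 when every model predicts the same label.
    entropy = sum((c / n) * math.log2(n / c) for c in votes.values())

    # Confidence-weight interaction (assumed form): each model's
    # confidence scaled by its ensemble weight, then summed.
    weighted_conf = sum(weights[m] * conf for m, (_, conf) in preds.items())

    return {
        "mean_conf": sum(confs) / n,
        "max_conf": max(confs),
        "agreement_rate": top_count / n,  # share of models backing the top label
        "vote_entropy": entropy,
        "weighted_conf": weighted_conf,
    }


# Hypothetical predictions from four base models for one image.
example = meta_features(
    {"m1": ("deer", 0.8), "m2": ("deer", 0.6),
     "m3": ("fox", 0.4), "m4": ("fox", 0.2)},
    {"m1": 0.4, "m2": 0.3, "m3": 0.2, "m4": 0.1},
)
print(example)
```

One such row per image, paired with the true label, would form the training table for a meta-model; high entropy and low agreement flag images on which the base models disagree.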
Implementation is organised into five tasks. Task 1 downloads 10,000 images per dataset from Google Cloud (prioritising rare species), cleans the labels (converting JSON to CSV with unified fields), and builds a taxonomic mapping table. Task 2 organises the code in Jupyter notebooks, runs each model's predictions, merges the results, and validates them against the mapping table. Task 3 evaluates the models with confusion matrices. Task 4 implements weighted averaging for the binary and species tasks, covering preprocessing, weight computation, ensemble construction, optimisation, and evaluation. Task 5 stacks meta-models on the ensemble outputs, adds features such as confidence-weight interactions and consistency scores, trains and tunes the classifiers, and compares them with the ensembles.
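The species-level part of Task 4 can be sketched as follows. The sketch assumes that each model's vote is its confidence multiplied by a weight specific to that model and the species it predicts, that unseen model-species pairs fall back to a default weight, and that images rejected by the binary stage are skipped; the function name, the weight table, and the fallback value are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple


def weighted_species_vote(preds: Dict[str, Tuple[str, float]],
                          species_weights: Dict[str, Dict[str, float]],
                          is_animal: bool,
                          default_weight: float = 1.0) -> Optional[str]:
    """Return the species with the largest weighted vote, or None.

    preds maps model name -> (predicted species, confidence).
    species_weights maps model name -> {species: weight}.
    """
    # Species voting is only run on images the binary stage marked as animal.
    if not is_animal or not preds:
        return None

    tally: Dict[str, float] = defaultdict(float)
    for model, (species, conf) in preds.items():
        # Assumed vote form: species-specific model weight times confidence.
        w = species_weights.get(model, {}).get(species, default_weight)
        tally[species] += w * conf
    return max(tally, key=tally.get)


# Hypothetical weights, e.g. per-species validation scores for each model.
species_weights = {
    "m1": {"deer": 0.3, "elk": 0.5},
    "m2": {"deer": 0.6, "elk": 0.9},
}
preds = {"m1": ("deer", 0.9), "m2": ("elk", 0.6)}

winner = weighted_species_vote(preds, species_weights, is_animal=True)
print(winner)
```

In this toy case the less confident model wins the vote because its weight for the predicted species is higher, which is the behaviour species-specific weighting is meant to provide.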
This thesis contributes a novel framework that achieves up to a 25% reduction in false positives and identifies a broader range of species than the baselines, validated on datasets from multiple geographic regions. It reduces manual annotation work for ecologists, and its open-source code supports reproducibility and extension in biodiversity monitoring. The project also explores meta-learning as a means of improving wildlife classification performance, although further improvement is still needed.