Designing a System for Tracking and Spatio-Temporal Action Classification of Primates
by Ulf Matthias Nuske
Date of Examination: 2025-11-12
Date of issue: 2025-12-10
Advisor: Prof. Dr. Florentin Wörgötter
Referee: Prof. Dr. Florentin Wörgötter
Referee: Prof. Dr. Alexander Ecker
Files in this item
Name: Dissertation_Nuske-Matthias.pdf
Size: 72.8 MB
Format: PDF
Abstract
Primate researchers in both laboratory settings and natural environments are fundamentally interested in primate behavior. Approaches range from controlled laboratory setups that enable implant-based neuronal recordings to field studies observing natural individual and group behavior. Recent advances in recording devices and computer vision have made it possible to collect behavioral data automatically at large scales, from highly calibrated static camera systems in the lab to camera traps and handheld recordings in the wild, triggered automatically or operated manually. The rapid increase in data volume renders manual analysis infeasible while providing abundant training material for computer vision methods, allowing researchers to use data-driven approaches to efficiently detect and analyze behaviors in long video recordings or image sequences. Given this complex video data, we present CNN-based methods for multi-animal, multi-object tracking and spatio-temporal action detection of individual primates. The overarching aim is to reduce manual annotation effort and support large-scale behavioral analysis by providing robust tools for tracking, identifying, and classifying bodily actions, thereby enabling data-driven research in primatology and neuroscience.

Towards automated primate behavior analysis, this thesis presents two main contributions:

1. Tracking: a multi-object tracker based on bounding boxes, adapted from the single-class pedestrian tracking architecture FairMOT, that tracks both primates and additional objects of interest. Bounding boxes can be annotated efficiently and generalize well across environments and primate species. Extending the tracker from single- to multi-class detection lets researchers include other relevant objects, such as feeders or enrichment devices, depending on the experimental setup, supporting richer behavioral analyses.

2. Action detection: multiple approaches for automatic action classification, building on the bounding box–based tracking: (a) joint action detection within the tracking framework, offering an integrated end-to-end solution that is easy for behavioral researchers to use; (b) a second-stage ResNet-based classifier operating on cropped primate detections, enabling flexible retraining for new action sets without retraining the tracker; and (c) a lightweight, conceptually simple MLP-based model that predicts actions from time series of bounding-box traces across multiple camera perspectives, capturing movement dynamics over time.

We trained and evaluated our tracking and action classification methods on a dataset collected in the Exploration Room at the German Primate Center. The dataset focuses on rhesus macaques (Macaca mulatta) engaging in a free-foraging experiment designed to elicit a wide range of full-body behaviors while interacting with multiple objects of interest. The adaptability of our tracking framework is further demonstrated by successfully applying a closely related variant, with modifications to the association strategy, to recordings from the wild. The architectures developed in this work are available at https://github.com/mnuske/PrEFairMOT.
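As an illustration of the second-stage pipeline described in contribution 2(b), the sketch below shows how a tracked bounding box might be cut out of a video frame before being resized and passed to a ResNet-style classifier. The function name, the (x1, y1, x2, y2) box format, and the frame dimensions are illustrative assumptions, not the implementation from the thesis.

```python
import numpy as np

def crop_detection(frame: np.ndarray, box: tuple) -> np.ndarray:
    """Cut one tracked detection (x1, y1, x2, y2) out of a frame,
    clipped to the image bounds, ready to be resized and fed to a
    second-stage classifier such as a ResNet."""
    h, w = frame.shape[:2]
    x1, y1, x2, y2 = box
    x1, x2 = max(0, int(x1)), min(w, int(x2))
    y1, y2 = max(0, int(y1)), min(h, int(y2))
    return frame[y1:y2, x1:x2]

# A dummy 480x640 RGB frame and one tracked box.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
crop = crop_detection(frame, (100, 50, 300, 250))
print(crop.shape)  # (200, 200, 3)
```

Decoupling the crop-and-classify stage from the tracker is what allows retraining for a new action set without touching the tracking network.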
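The MLP-based model of contribution 2(c) can be sketched as a single forward pass over a flattened window of bounding-box coordinates gathered from several camera views. All dimensions, layer sizes, and weights below are illustrative assumptions (random, untrained parameters), not the trained model from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

T, C = 16, 4        # frames per window, camera views (illustrative)
D = T * C * 4       # flattened (x1, y1, x2, y2) trace length
H, A = 64, 5        # hidden width, number of action classes (illustrative)

# Random weights stand in for trained parameters.
W1, b1 = rng.normal(size=(D, H)), np.zeros(H)
W2, b2 = rng.normal(size=(H, A)), np.zeros(A)

def predict_action(traces: np.ndarray) -> int:
    """Classify a window of bounding-box traces of shape (T, C, 4)
    with a two-layer MLP; returns the index of the predicted action."""
    x = traces.reshape(-1)              # flatten the trace window
    h = np.maximum(0.0, x @ W1 + b1)    # ReLU hidden layer
    logits = h @ W2 + b2
    return int(np.argmax(logits))

traces = rng.uniform(0.0, 1.0, size=(T, C, 4))
action = predict_action(traces)
print(0 <= action < A)  # True
```

Because the model sees only box coordinates over time rather than pixels, it stays lightweight while still capturing movement dynamics across perspectives.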
Keywords: Machine Learning; Data Science; Primate Behavior; Deep Learning; Primate Tracking; Computer Vision
