Klamer Schutte | TU Delft Repository

Video BagNet

Short temporal receptive fields increase robustness in long-term action recognition

Conference paper (2023) - O. Strafforello (author), X. Liu (author), Klamer Schutte (author), J.C. van Gemert (author)

Previous work on long-term video action recognition relies on deep 3D-convolutional models that have a large temporal receptive field (RF). We argue that these models are not always the best choice for temporal modeling in videos. A large temporal receptive field allows the model ...

Are current long-term video understanding datasets long-term?

Conference paper (2023) - O. Strafforello (author), Klamer Schutte (author), J.C. van Gemert (author)

Many real-world applications, from sport analysis to surveillance, benefit from automatic long-term action recognition. In the current deep learning paradigm for automatic action recognition, it is imperative that models are trained and tested on datasets and tasks that evaluate ...

Long-term behaviour recognition in videos with actor-focused region attention

Conference paper (2021) - Luca Ballan (author), O. Strafforello (author), Klamer Schutte (author)

Long-term activities involve humans performing complex, minutes-long actions. Unlike in traditional action recognition, complex activities are normally composed of a set of sub-actions that can appear in different orders, durations, and quantities. These aspects introduce ...