Deep Adaptive Temporal Pooling for Activity Recognition

Song, Sibo; Cheung, Ngai-Man; Chandrasekhar, Vijay; Mandal, Bappaditya

Deep neural networks have recently achieved competitive accuracy for human activity recognition. However, there is room for improvement, especially in modeling long-term temporal importance and determining the activity relevance of different temporal segments in a video. To address this problem, we propose a learnable and differentiable module: Deep Adaptive Temporal Pooling (DATP). DATP applies a self-attention mechanism to adaptively pool the classification scores of different video segments. Specifically, using frame-level features, DATP regresses the importance of different temporal segments and generates weights for them. Remarkably, DATP is trained using only the video-level activity class label; no additional supervision is needed. We conduct extensive experiments to investigate various input features and different weight models. Experimental results show that DATP can learn to assign large weights to key video segments. More importantly, DATP can improve training of the frame-level feature extractor, because relevant temporal segments are assigned large weights during back-propagation. Overall, we achieve state-of-the-art performance on the UCF101, HMDB51 and Kinetics datasets.
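
The pooling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the weight model, so the linear importance regressor (`w`, `b`), the softmax normalization, and the function name `adaptive_temporal_pool` are all assumptions made for illustration only.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_temporal_pool(segment_features, segment_scores, w, b=0.0):
    """Pool per-segment class scores with weights regressed from features.

    segment_features: list of T feature vectors, one per temporal segment
    segment_scores:   list of T class-score vectors, one per temporal segment
    w, b:             hypothetical parameters of a linear importance regressor
    """
    # Regress one scalar importance value per segment (assumed linear model).
    importance = [
        sum(wi * xi for wi, xi in zip(w, feat)) + b
        for feat in segment_features
    ]
    # Normalize importances into weights that sum to 1 (assumed softmax).
    weights = softmax(importance)
    # The weighted sum of segment scores gives the video-level score.
    num_classes = len(segment_scores[0])
    pooled = [
        sum(weights[t] * segment_scores[t][c] for t in range(len(segment_scores)))
        for c in range(num_classes)
    ]
    return pooled, weights


# Toy example: 3 segments, 2-dimensional features, 2 classes.
features = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
scores = [[0.5, 0.5], [0.9, 0.1], [0.4, 0.6]]
pooled, weights = adaptive_temporal_pool(features, scores, w=[1.0, 1.0])
```

In this toy setup the second segment has the largest regressed importance, so its scores dominate the pooled result. Because the pooled score is a differentiable function of the weights, a loss computed from the video-level label alone can, in principle, update both the weight model and the per-segment classifier.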

Conference Name: MM '18: ACM Multimedia Conference
Conference Location: Seoul, Republic of Korea
Start Date: Oct 22, 2018
End Date: Oct 26, 2018
Acceptance Date: Jul 1, 2018
Online Publication Date: Oct 15, 2018
Publication Date: Oct 21, 2018
Publicly Available Date: May 26, 2023
Publisher: Association for Computing Machinery (ACM)
ISBN: 978-1-4503-5665-7
Keywords: Human activity recognition, adaptive temporal pooling