actionrecognition

Action recognition is the task of identifying actions performed by humans in video data. Given a short clip or a segment of a longer video, the goal is to assign one or more action labels that describe the activities shown. It is a core problem in computer vision and multimedia, combining spatial information from individual frames with temporal dynamics across time.

Approaches have evolved from engineered features to deep learning. Early methods used handcrafted descriptors such as

Prominent datasets include UCF101, HMDB51, and the large-scale Kinetics datasets, with evaluation typically reported as top-1

Applications include video surveillance, sports analytics, human-computer interaction, and content-based video retrieval. Challenges include intra-class variation,

Related topics include action detection, skeleton-based action recognition, and broader video understanding.

spatio-temporal

generalization,

self-supervised