CVPR 2011 Tutorial on Human Activity Recognition

- Frontiers of Human Activity Analysis -

J. K. Aggarwal, Michael S. Ryoo, and Kris Kitani

Date: Monday, June 20th

Human activity recognition is an important area of computer vision research and applications. The goal of activity recognition is the automated analysis (or interpretation) of ongoing events and their context from video data. Its applications include surveillance systems, patient monitoring systems, and a variety of systems that involve interactions between persons and electronic devices, such as human-computer interfaces. Most of these applications require recognition of high-level activities, often composed of multiple simple (or atomic) actions of persons. This tutorial provides a detailed overview of state-of-the-art research papers and techniques for human activity recognition. We discuss both the methodologies developed for simple individual-level activities and those for high-level multi-person (and object) interactions. An approach-based taxonomy is used to compare the advantages and limitations of each approach.

We briefly review the early history of human activity recognition and discuss methodologies designed for recognizing the activities of individual persons. Approaches utilizing space-time volumes and/or sequential models are also covered. Next, hierarchical recognition methodologies for high-level activities are presented and compared. We categorize human activities into human actions, human-human interactions, human-object interactions, and group activities, and discuss the approaches designed to recognize each. Hierarchical state-based approaches and syntactic approaches that interpret videos in terms of stochastic strings are covered. Finally, we discuss description-based approaches, which analyze videos using explicit knowledge of the temporal, spatial, and logical structures of activities. Recent video datasets designed to encourage human activity recognition research are discussed as well. We hope this tutorial will provide impetus for future research and applications.

This tutorial is partly based on the following survey paper:

J. K. Aggarwal and M. S. Ryoo, Human Activity Analysis: A Review, ACM Computing Surveys (CSUR), 43(3), April 2011.


Time Topic Slides

  - Introduction to human activity recognition
  - Applications
  - Technical challenges in human activity recognition



  - Activity classification vs. detection
  - Approach-based taxonomy of activity recognition methods (organized by representation and recognition methodology)
  - Single-layered approaches vs. hierarchical approaches


Single-layered approaches for action recognition

  - Sequential recognition approaches using state models (e.g. HMMs)
  - Activity recognition by matching space-time volumes
  - Local spatio-temporal features
  - Action recognition datasets
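As a rough illustration of the sequential, state-model style of recognition listed above, the sketch below scores a quantized observation sequence against one discrete HMM per action using the forward algorithm and picks the most likely action. It is a minimal sketch under stated assumptions, not a method taken from the tutorial: the two action models ("wave" and "still"), their probabilities, and the helper names forward_likelihood and classify are all hypothetical, and real systems would learn the parameters from training video features.

```python
# Toy HMM-based action classification (illustrative only).
# Observations are assumed to be per-frame features already quantized
# into discrete symbols 0 and 1; all numbers below are made up.

def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, via the forward algorithm."""
    n = len(start)
    # Initialization: probability of starting in each state and emitting obs[0].
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    # Induction: propagate through the transition matrix, then emit.
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
            for s in range(n)
        ]
    # Termination: sum over all final states.
    return sum(alpha)

def classify(obs, models):
    """Return the name of the model with the highest likelihood for obs."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))

# Hypothetical models as (start, transition, emission) triples.
EMIT = [[0.9, 0.1], [0.1, 0.9]]  # state i mostly emits symbol i
MODELS = {
    # "wave": the hidden state tends to alternate between frames.
    "wave": ([1.0, 0.0], [[0.1, 0.9], [0.9, 0.1]], EMIT),
    # "still": the hidden state tends to persist between frames.
    "still": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], EMIT),
}

if __name__ == "__main__":
    print(classify([0, 1, 0, 1], MODELS))
    print(classify([0, 0, 0, 0], MODELS))
```

For long sequences the raw products underflow, so a practical version would rescale alpha at each step or work in log space.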

3:00pm Coffee break
Hierarchical approaches for activity-level analysis

Hierarchical statistical approaches and syntactic approaches

  - Hierarchical approaches that concatenate multiple layers of statistical state models (e.g. HMMs)
  - Syntactic approaches that represent an activity as a set of strings
  - Stochastic context-free grammar for activity recognition


Description-based human activity recognition

  - Activity representation using Allen's temporal predicates
  - Human-object interactions and group activities
  - Spatio-temporal relationship match
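To make the idea of Allen's temporal predicates concrete, the sketch below computes which of the thirteen Allen interval relations holds between two time intervals. It is only an illustrative sketch: the Interval type, the allen_relation function, and the example sub-event names are hypothetical and are not drawn from any specific system covered in the tutorial.

```python
# Allen's interval relations between two sub-event time intervals
# (illustrative sketch; names and example values are hypothetical).
from typing import NamedTuple

class Interval(NamedTuple):
    start: float
    end: float

def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation that holds for a with respect to b."""
    if a.start >= a.end or b.start >= b.end:
        raise ValueError("intervals must have positive duration")
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started_by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished_by"
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"
    if a.start < b.start < a.end < b.end:
        return "overlaps"
    # Only the inverse relations remain: a starts after b starts
    # and ends after b ends.
    if b.end < a.start:
        return "after"
    if b.end == a.start:
        return "met_by"
    return "overlapped_by"

if __name__ == "__main__":
    # Hypothetical sub-events of a "push" interaction, in frame numbers.
    stretch_arm = Interval(10, 25)
    touch = Interval(20, 30)
    move_away = Interval(30, 45)
    print(allen_relation(stretch_arm, touch))
    print(allen_relation(touch, move_away))
```

A description-based recognizer could then declare an activity detected when its sub-events are found and the required relations among their intervals are satisfied; in practice some tolerance on the equality tests is usually needed, since detected interval boundaries are noisy.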


Applications and challenges

5:30pm End
Michael S. Ryoo