CVPR 2014 Tutorial on 

Emerging Topics in Human Activity Recognition


Tutorial Date

June 23rd, 2014 (Location: Room C211)


Michael Ryoo


Ivan Laptev


Greg Mori


Sangmin Oh



In the past 5 years, the field of human activity recognition has grown dramatically, reflecting its importance in many high-impact societal applications including smart surveillance, web-video search and retrieval, quality-of-life devices for elderly people, and human-computer interfaces. Following the initial success of bag-of-words methods for action classification, the field is gradually moving towards a more structured interpretation of complex human activities that involve multiple people and objects, as well as the interactions among them, in various realistic scenarios. Important new research topics and problems are appearing as a consequence, including (i) modeling the temporal structure of activities, (ii) learning relations between actions and objects/scenes/social roles, (iii) group activity recognition, and (iv) first-person activity recognition. The objective of this tutorial is to introduce and survey recent progress in these emerging topics, as well as to discuss, motivate and encourage future research in diverse subfields of action recognition.

SCHEDULE (tentative)

The organizers of the tutorial will give a sequence of lectures on active and emerging topics in activity recognition. Starting with the general motivation, a historical overview and basic bag-of-words techniques, we will then present advances in several subproblems of action recognition. In particular, we will cover (i) modeling the spatio-temporal structure of actions (I. Laptev), (ii) group activity recognition (G. Mori), (iii) activity recognition from the first-person view (M. Ryoo), and (iv) real-world applications of activity recognition (S. Oh).



Relevant publications of organizers


8:30 am

1. Introduction

Speaker: Michael Ryoo, ...

  • Introduction to human activity recognition
  • Applications and challenges
  • History of activity recognition
  • Dimensions in human activity recognition: types of videos, levels of human activities, and structure complexity



8:45 am

1.1 Action recognition with bag-of-features

Speaker: Ivan Laptev

  • Spatio-temporal features
  • Bag-of-words action recognition
  • Recent results and benchmarks


9:00 am

2. Beyond bag-of-features

Speaker: Ivan Laptev

  • Spatio-temporal structure of simple actions
  • Temporal structure of composite activities
  • Weakly-supervised learning



9:30 am

3. Group activity recognition

Speaker: Greg Mori

  • Human-human interactions
  • Human-object interactions
  • Social role analysis in video
  • Person context for activity recognition



10:15 am

Coffee break

10:30 am

4. First-person activity recognition

Speaker: Michael Ryoo

  • 3rd-person vs. 1st-person videos
  • Ego-action recognition and objects in first-person videos
  • First-person interaction recognition
  • Features for first-person activity recognition
  • ‘Ego’centric videos?



11:15 am

5. Real-world applications of activity recognition

Speaker: Sangmin Oh

  • Large-scale unconstrained video analysis and retrieval
  • Sports video analysis
  • Action recognition for interactive systems (games, etc.)
  • Traffic analysis and surveillance



12:00 pm

6. Discussions and directions

  • Human activity prediction (i.e., early recognition)
  • Action vocabulary
  • Summary and closing





