Projects

Movie/Script
We parse the video into a hierarchy of frames, shots and scenes, and align it to the screenplay via the subtitles.
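As a rough illustration of the subtitle-based alignment step, the sketch below matches screenplay dialogue against timed subtitle text word by word and carries the subtitle timestamps over to the screenplay lines. It is a minimal sketch and not necessarily the project's own method: the function names, the `(speaker, text)` and `(start, end, text)` tuple layouts, and the use of Python's `difflib.SequenceMatcher` for the matching are all assumptions made for this example.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase a line and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def align_script_to_subtitles(script_lines, subtitles):
    """Hypothetical helper: transfer subtitle times onto screenplay lines.

    script_lines: list of (speaker, text)
    subtitles:    list of (start, end, text), times in seconds
    Returns one (speaker, text, start, end) tuple per screenplay line;
    start and end are None when no word of the line matched a subtitle.
    """
    # Flatten both sources into word sequences, remembering which
    # line each word came from.
    script_words, script_owner = [], []
    for i, (_, text) in enumerate(script_lines):
        for w in words(text):
            script_words.append(w)
            script_owner.append(i)
    sub_words, sub_owner = [], []
    for j, (_, _, text) in enumerate(subtitles):
        for w in words(text):
            sub_words.append(w)
            sub_owner.append(j)

    # Find in-order runs of identical words shared by the two sequences.
    matcher = SequenceMatcher(a=script_words, b=sub_words, autojunk=False)
    spans = [None] * len(script_lines)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            i = script_owner[block.a + k]
            start, end, _ = subtitles[sub_owner[block.b + k]]
            if spans[i] is None:
                spans[i] = (start, end)
            else:
                spans[i] = (min(spans[i][0], start), max(spans[i][1], end))

    result = []
    for i, (speaker, text) in enumerate(script_lines):
        start, end = spans[i] if spans[i] is not None else (None, None)
        result.append((speaker, text, start, end))
    return result
```

Screenplay lines that never appear in the subtitles (cut dialogue, for instance) come back with `None` times, so later stages can skip or interpolate them.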
Action Database
Once the screenplay is aligned, we can extract actions from it to obtain a realistic, large-scale, highly varied dataset of actions "in the wild".
Learning with Ambiguous Labels
We learn person-specific face recognition classifiers from the ambiguous labels present in the screenplay: each face is associated with a set of candidate labels, only one of which is correct.
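To make the ambiguous-label setting concrete, here is a small sketch in which every face has a feature vector and a set of candidate names, and a simple iterative nearest-centroid scheme picks one name per face. This is a stand-in baseline chosen for illustration, not the project's learning algorithm; the function name `disambiguate`, the list-of-floats features, and the toy data in the usage note are all assumptions.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def disambiguate(features, candidates, n_iters=10):
    """Hypothetical baseline: choose one name per face from its candidate set.

    features:   list of equal-length lists of floats, one per face
    candidates: list of non-empty sets of names, one per face
    """
    dim = len(features[0])
    names = sorted(set().union(*candidates))
    # Start with uniform weight over each face's candidate names.
    weights = [{n: 1.0 / len(c) for n in c} for c in candidates]
    centroids = {}
    for _ in range(n_iters):
        # Update each name's centroid as the weighted mean of its faces.
        for n in names:
            total = sum(w.get(n, 0.0) for w in weights)
            if total > 0:
                centroids[n] = [
                    sum(w.get(n, 0.0) * f[d] for w, f in zip(weights, features)) / total
                    for d in range(dim)
                ]
        # Reassign each face to the nearest centroid among its own candidates.
        new_weights = []
        for f, c in zip(features, candidates):
            best = min(sorted(c), key=lambda n: sq_dist(f, centroids[n]))
            new_weights.append({best: 1.0})
        if new_weights == weights:
            break  # assignments stopped changing
        weights = new_weights
    return [next(iter(w)) for w in weights]
```

With four toy faces `[[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]` and candidate sets `[{"A"}, {"A", "B"}, {"A", "B"}, {"B"}]`, the two unambiguous faces anchor the centroids and the two ambiguous ones resolve to `"A"` and `"B"` respectively.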