Predicting the exciting portions of a video is a widely relevant problem, with applications such as video summarization, searching for similar videos, and recommending videos to users. Researchers have proposed the use of physiological indices such as pupillary dilation as a measure of emotional arousal. The key difficulty in using pupil diameter to measure emotional arousal is accounting for the pupil's response to brightness changes. We propose a linear model of the pupillary light reflex that predicts the pupil diameter of a viewer based only on incident light intensity. The residual between the measured pupil diameter and the model prediction is attributed to the emotional arousal corresponding to that scene. We evaluate how effectively this method factors out the pupillary light reflex for the particular application of video summarization. The residual is converted into an exciting-ness score for each frame of a video. We show results on a variety of videos and compare them against ground truth as reported by three independent coders.
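The residual idea described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work. It assumes the simplest possible linear model, a single slope and intercept fitted by ordinary least squares with no temporal dynamics, and it assumes a min-max rescaling of the residual into a per-frame score. The function names (`fit_linear`, `residuals`, `excitement_scores`) and the sample values are hypothetical and purely synthetic.

```python
def fit_linear(intensity, diameter):
    """Least-squares fit of diameter ~ slope * intensity + intercept.

    Assumption: one slope and one intercept, no temporal dynamics.
    """
    n = len(intensity)
    if n != len(diameter) or n < 2:
        raise ValueError("need two or more paired samples")
    mean_x = sum(intensity) / n
    mean_y = sum(diameter) / n
    sxx = sum((x - mean_x) ** 2 for x in intensity)
    if sxx == 0:
        raise ValueError("intensity must vary to fit a slope")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(intensity, diameter))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(intensity, diameter, slope, intercept):
    """Measured diameter minus the diameter predicted from light alone."""
    return [y - (slope * x + intercept)
            for x, y in zip(intensity, diameter)]


def excitement_scores(res):
    """Rescale residuals to [0, 1] (assumed scoring rule, for illustration).

    A higher score means more dilation than brightness alone predicts.
    """
    lo, hi = min(res), max(res)
    if hi == lo:
        return [0.0] * len(res)
    return [(r - lo) / (hi - lo) for r in res]


# Synthetic data: the pupil shrinks as brightness rises, except at
# frame 2, where an extra 0.5 mm of dilation is injected by hand.
intensity = [0.2, 0.4, 0.6, 0.8, 1.0]
diameter = [5.4, 4.8, 4.7, 3.6, 3.0]

slope, intercept = fit_linear(intensity, diameter)
scores = excitement_scores(
    residuals(intensity, diameter, slope, intercept))
most_exciting_frame = scores.index(max(scores))
```

In this toy example the fitted slope is negative, as expected for a light reflex, and the frame with the injected dilation receives the highest score. A real pipeline would additionally need per-frame light intensity estimates and per-viewer calibration, which this sketch does not attempt.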
© 2016 Copyright is held by the owner/author(s). Publication rights licensed to ACM.