By Mayu Otani, CyberAgent, Inc., Japan, otani_mayu@cyberagent.co.jp | Yale Song, Microsoft Research, USA, yalesong@microsoft.com | Yang Wang, University of Manitoba, Canada, ywang@cs.umanitoba.ca
With the broad growth of video capturing devices and video applications on the web, efficiently delivering the video content that users want has become increasingly demanding. Video summarization helps users grasp video content quickly by creating a compact summary of a video. Much effort has been devoted to automatic video summarization, and various problem settings and approaches have been proposed. Our goal is to provide an overview of this field. This survey covers early studies as well as recent approaches that take advantage of deep learning techniques. We describe video summarization approaches and their underlying concepts. We also discuss benchmarks and evaluations: we review how prior work addressed evaluation and detail the pros and cons of the evaluation protocols. Finally, we discuss open challenges in this field.
The widespread use of the internet and affordable video capturing devices has dramatically changed the landscape of video creation and consumption. In particular, user-created videos are more prevalent than ever with the evolution of video streaming services and social networks. The rapid growth of video creation necessitates advanced technologies that enable efficient consumption of desired video content. Application scenarios include enhancing the user experience for viewers on video streaming services and enabling quick video browsing, both for video creators who need to go through a massive amount of video rushes and for security teams who need to monitor surveillance videos.
This monograph provides an overview of automatic video summarization, covering early studies as well as recent approaches that take advantage of deep learning techniques. It describes video summarization approaches and their underlying concepts, and it discusses benchmarks and evaluations. It reviews how prior work in this field addressed evaluation and details the pros and cons of the evaluation protocols. The monograph concludes with current and open challenges in this field.
This monograph is a useful reference for students and professionals who are active in, or wish to enter, the field of video summarization.