Foundations and Trends® in Information Retrieval > Vol 5 > Issue 4–5

Spoken Content Retrieval: A Survey of Techniques and Technologies

By Martha Larson, Faculty of Electrical Engineering, Mathematics and Computer Science, Multimedia Information Retrieval Lab, Delft University of Technology, The Netherlands, m.a.larson@tudelft.nl | Gareth J. F. Jones, Centre for Next Generation Localisation, School of Computing, Dublin City University, Ireland, gjones@computing.dcu.ie

 
Suggested Citation
Martha Larson and Gareth J. F. Jones (2012), "Spoken Content Retrieval: A Survey of Techniques and Technologies", Foundations and Trends® in Information Retrieval: Vol. 5: No. 4–5, pp 235-422. http://dx.doi.org/10.1561/1500000020

Publication Date: 23 Jul 2012
© 2012 M. Larson and G. J. F. Jones
 
Subjects
Natural language processing for IR, Speech and spoken language processing
 

In this article:
1 Introduction 
2 Overview of Spoken Content Indexing and Retrieval 
3 Automatic Speech Recognition (ASR) 
4 Exploiting Automatic Speech Recognition Output 
5 Spoken Content Retrieval beyond ASR Transcripts 
6 Accessing Information in Spoken Content 
7 Conclusion and Outlook 
References 

Abstract

Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition (ASR), and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development that addresses the core challenges of SCR.

DOI:10.1561/1500000020
Paperback: ISBN 978-1-60198-528-6, 196 pp., $99.00
E-book (.pdf): ISBN 978-1-60198-529-3, 196 pp., $220.00

Spoken Content Retrieval

Speech media, i.e., digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires a combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. Spoken Content Retrieval: A Survey of Techniques and Technologies provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It can be read sequentially from beginning to end, but it is also written modularly, so that parts of the survey can be read selectively. The survey includes an extensive bibliography of over 300 references, chosen to provide a comprehensive set of entry points into the literature for further exploration of the issues covered. The text is an invaluable reference for researchers with backgrounds in speech technology or information retrieval who are seeking deeper insight into how these fields are integrated to support research and development addressing the core challenges of SCR.

 
INR-020