Foundations and Trends® in Information Retrieval, Vol. 2, Issue 3

Statistical Language Models for Information Retrieval: A Critical Review

By ChengXiang Zhai, University of Illinois at Urbana-Champaign, USA, czhai@cs.uiuc.edu

 
Suggested Citation
ChengXiang Zhai (2008), "Statistical Language Models for Information Retrieval: A Critical Review", Foundations and Trends® in Information Retrieval: Vol. 2: No. 3, pp 137-213. http://dx.doi.org/10.1561/1500000008

Publication Date: 30 Nov 2008
© 2008 C. Zhai
 
Subjects
Formal models and language models for IR
 

In this article:
1 Introduction 
2 The Basic Language Modeling Approach 
3 Understanding Query Likelihood Scoring 
4 Improving the Basic Language Modeling Approach 
5 Query Models and Feedback in Language Models 
6 Language Models for Special Retrieval Tasks 
7 Unifying Different Language Models 
8 Summary and Outlook 
Acknowledgments 
References 

Abstract

Statistical language models have recently been successfully applied to many information retrieval problems. A great deal of recent work has shown that statistical language models not only lead to superior empirical performance, but also facilitate parameter tuning and open up possibilities for modeling nontraditional retrieval problems. In general, statistical language models provide a principled way of modeling various kinds of retrieval problems. The purpose of this survey is to systematically and critically review the existing work in applying statistical language models to information retrieval, summarize their contributions, and point out outstanding challenges.

DOI:10.1561/1500000008
ISBN: 978-1-60198-186-8 (paperback), 84 pp., $65.00
ISBN: 978-1-60198-187-5 (e-book, PDF), 84 pp., $100.00

Statistical Language Models for Information Retrieval

Statistical Language Models for Information Retrieval systematically and critically reviews the existing work in applying statistical language models to information retrieval, summarizes their contributions, and points out outstanding challenges. Statistical language models have recently been successfully applied to many information retrieval problems. A great deal of recent work has shown that statistical language models not only lead to superior empirical performance, but also facilitate parameter tuning and open up possibilities for modeling non-traditional retrieval problems. In general, statistical language models provide a principled way of modeling various kinds of retrieval problems. Statistical Language Models for Information Retrieval reviews the development of this language modeling approach. It surveys a wide range of retrieval models based on language modeling and attempts to make connections between this new family of models and traditional retrieval models. It summarizes the progress made so far in these models and points out the challenges that remain to be solved to further increase their impact. Statistical Language Models for Information Retrieval is written for readers who already have some basic knowledge of information retrieval. Some knowledge of probability and statistics, such as the maximum likelihood estimator, is helpful but not a prerequisite for understanding the high-level discussion.

 
INR-008