Statistical Language Models for Information Retrieval: A Critical Review

ChengXiang Zhai

doi:10.1561/1500000008

Foundations and Trends® in Information Retrieval, Vol. 2, Issue 3

By ChengXiang Zhai, University of Illinois at Urbana-Champaign, USA, czhai@cs.uiuc.edu

Suggested Citation

ChengXiang Zhai (2008), "Statistical Language Models for Information Retrieval: A Critical Review", Foundations and Trends® in Information Retrieval: Vol. 2: No. 3, pp. 137-213. http://dx.doi.org/10.1561/1500000008

Publication Date: 30 Nov 2008

Subjects

Formal models and language models for IR

Abstract

Statistical language models have recently been successfully applied to many information retrieval problems. A great deal of recent work has shown that statistical language models not only lead to superior empirical performance, but also facilitate parameter tuning and open up possibilities for modeling nontraditional retrieval problems. In general, statistical language models provide a principled way of modeling various kinds of retrieval problems. The purpose of this survey is to systematically and critically review the existing work in applying statistical language models to information retrieval, summarize their contributions, and point out outstanding challenges.

Book details

ISBN: 978-1-60198-187-5

84 pp. $100.00

Table of contents:

1: Introduction

2: The Basic Language Modeling Approach

3: Understanding Query Likelihood Scoring

4: Improving the Basic Language Modeling Approach

5: Query Models and Feedback in Language Models

6: Language Models for Special Retrieval Tasks

7: Unifying Different Language Models

8: Summary and Outlook

Acknowledgements

References

Statistical Language Models for Information Retrieval

Statistical Language Models for Information Retrieval systematically and critically reviews the existing work in applying statistical language models to information retrieval, summarizes their contributions, and points out outstanding challenges. Statistical language models have recently been successfully applied to many information retrieval problems. A great deal of recent work has shown that statistical language models not only lead to superior empirical performance, but also facilitate parameter tuning and open up possibilities for modeling nontraditional retrieval problems. In general, statistical language models provide a principled way of modeling various kinds of retrieval problems. Statistical Language Models for Information Retrieval reviews the development of this language modeling approach. It surveys a wide range of retrieval models based on language modeling and attempts to make connections between this new family of models and traditional retrieval models. It summarizes the progress made so far in these models and points out the challenges that remain to be solved to further increase their impact. Statistical Language Models for Information Retrieval is written for readers who already have some basic knowledge of information retrieval. Some knowledge of probability and statistics, such as the maximum likelihood estimator, is helpful but not a prerequisite for understanding the high-level discussion.

