Adversarial Web Search

Carlos Castillo; Brian D. Davison

doi:10.1561/1500000021

Foundations and Trends® in Information Retrieval, Vol. 4, Issue 5

By Carlos Castillo, Yahoo! Research, Catalunya-Spain, chato@yahoo-inc.com | Brian D. Davison, Lehigh University, USA, davison@cse.lehigh.edu

Suggested Citation

Carlos Castillo and Brian D. Davison (2011), "Adversarial Web Search", Foundations and Trends® in Information Retrieval: Vol. 4: No. 5, pp. 377-486. http://dx.doi.org/10.1561/1500000021

Publication Date: 22 Jan 2011

Subjects

Web search

Abstract

Web search engines have become indispensable tools for finding content. As the popularity of the Web has increased, the efforts to exploit the Web for commercial, social, or political advantage have grown, making it harder for search engines to discriminate between truthful signals of content quality and deceptive attempts to game search engines' rankings. This problem is further complicated by the open nature of the Web, which allows anyone to write and publish anything, and by the fact that search engines must analyze ever-growing numbers of Web pages. Moreover, increasing expectations of users, who over time rely on Web search for information needs related to more aspects of their lives, further deepen the need for search engines to develop effective counter-measures against deception.

In this monograph, we consider the effects of the adversarial relationship between search systems and those who wish to manipulate them, a field known as "Adversarial Information Retrieval". We show that search engine spammers create false content and misleading links to lure unsuspecting visitors to pages filled with advertisements or malware. We also examine work over the past decade or so that aims to discover such spamming activities, so that spam pages can be removed or their effect on the quality of the results reduced.

Research in Adversarial Information Retrieval has been evolving over time, and currently continues both in traditional areas (e.g., link spam) and newer areas, such as click fraud and spam in social media, demonstrating that this conflict is far from over.

Book details

Paperback: ISBN 978-1-60198-414-2, 124 pp., $85.00

E-book (.pdf): ISBN 978-1-60198-415-9, 124 pp., $100.00

Table of contents:

1: Introduction

2: Overview of search engine spam detection

3: Dealing with content spam and plagiarized content

4: Curbing nepotistic linking

5: Propagating trust and distrust

6: Detecting spam in usage data

7: Fighting spam in user-generated content

8: Discussion

Acknowledgements

References
