By Hannah Bast, University of Freiburg, Germany, bast@cs.uni-freiburg.de | Björn Buchhold, University of Freiburg, Germany, buchhold@cs.uni-freiburg.de | Elmar Haussmann, University of Freiburg, Germany, haussmann@cs.uni-freiburg.de
This article provides a comprehensive overview of the broad area of semantic search on text and knowledge bases. In a nutshell, semantic search is “search with meaning”. This “meaning” can refer to various parts of the search process: understanding the query (instead of just finding matches of its components in the data), understanding the data (instead of just searching it for such matches), or representing knowledge in a way suitable for meaningful retrieval. Semantic search is studied in a variety of different communities with a variety of different views of the problem. In this survey, we classify this work according to two dimensions: the type of data (text, knowledge bases, combinations of these) and the kind of search (keyword, structured, natural language). We consider all nine combinations. The focus is on fundamental techniques, concrete systems, and benchmarks. The survey also considers advanced issues: ranking, indexing, ontology matching and merging, and inference. It also provides a succinct overview of fundamental natural language processing techniques: POS-tagging, named-entity recognition and disambiguation, sentence parsing, and distributional semantics. The survey is as self-contained as possible, and should thus also serve as a good tutorial for newcomers to this fascinating and highly topical field.
This monograph provides a comprehensive overview of the broad area of semantic search on text and knowledge bases. In a nutshell, semantic search is “search with meaning”. This “meaning” can refer to various parts of the search process: understanding the query (instead of just finding matches of its components in the data), understanding the data (instead of just searching it for such matches), or representing knowledge in a way suitable for meaningful retrieval.
Semantic search is studied in a variety of different communities with a variety of different views of the problem. Semantic Search on Text and Knowledge Bases classifies this work according to two dimensions: the type of data (text, knowledge bases, combinations of these) and the kind of search (keyword, structured, natural language). All nine combinations are considered. The focus is on fundamental techniques, concrete systems, and benchmarks. The monograph also considers advanced issues: ranking, indexing, ontology matching and merging, and inference. It also provides a succinct overview of fundamental natural language processing techniques: POS-tagging, named-entity recognition and disambiguation, sentence parsing, and distributional semantics.
Semantic Search on Text and Knowledge Bases is as self-contained as possible, and should thus also serve as a good tutorial for newcomers to this fascinating and highly topical field.