By Guy Van den Broeck, University of California, Los Angeles, USA, guyvdb@cs.ucla.edu | Dan Suciu, University of Washington, USA, suciu@cs.washington.edu
Probabilistic data is motivated by the need to model uncertainty in large databases. Over the last twenty years or so, both the Database community and the AI community have studied various aspects of probabilistic relational data. This survey presents the main approaches developed in the literature, reconciling concepts developed in parallel by the two research communities. The survey starts with an extensive discussion of the main probabilistic data models and their relationships, followed by a brief overview of model counting and its relationship to probabilistic data. After that, the survey discusses lifted probabilistic inference, a suite of techniques developed in parallel by the Database and AI communities for probabilistic query evaluation. Then, it gives a short summary of query compilation, presenting some theoretical results that highlight the limitations of various query evaluation techniques on probabilistic data. The survey ends with a very brief discussion of some popular probabilistic data sets, systems, and applications that build on this technology.
Query Processing on Probabilistic Data: A Survey presents the main approaches developed in the literature, reconciling concepts developed in parallel by the two research communities. It starts with an extensive discussion of the main probabilistic data models and their relationships, followed by a brief overview of model counting and its relationship to probabilistic data. The monograph proceeds to discuss lifted probabilistic inference, a suite of techniques developed in parallel by the Database and AI communities for probabilistic query evaluation. It then provides a summary of query compilation, presenting some theoretical results that highlight the limitations of various query evaluation techniques on probabilistic data. It ends with a brief discussion of some popular probabilistic data sets, systems, and applications that build on this technology.