By Jayant R. Haritsa, Indian Institute of Science, India, haritsa@iisc.ac.in
The primary function of a database system is to efficiently compute correct answers to user queries. Therefore, robust query processing (RQP), where strong numerical guarantees are provided on query performance, has been a long-standing core objective in the design of industrial-strength database engines. Unfortunately, RQP has proved to be a largely intractable and elusive challenge, despite sustained efforts spanning several decades. This problematic situation has arisen from a variety of knotty technical hurdles, including complex query representations, limited metadata coverage, coarse statistical models, and hypersensitive operator behaviors. Its impact is felt acutely since the performance degradation faced by database queries can be huge, reaching orders of magnitude as compared to an oracular ideal.
Notwithstanding this daunting history, the good news is that in recent times, there have been a host of exciting technical advances that collectively promise to materially address the robustness objective. The new approaches have been constructed at different levels in the database architecture, and tackle robustness in cost models, database operators, query execution plans and query processing strategies. Although most of this literature is based on statistical and geometric formulations, a significant corpus of machine learning-based techniques is also now available.
In this monograph, we present an overview of these novel research paradigms, and highlight their strengths and limitations. Further, we enumerate a suite of open technical problems that remain to be solved to make RQP a contemporary reality.
In this monograph, representative techniques along these various dimensions are covered. After the introduction, background on RQP is given. The chapters thereafter cover Robust Operators, Plans, and Execution, followed by surveys of Structural Bounds, Cost Models, and Machine Learning Techniques. The monograph concludes with a chapter on Holistic Robustness and Future Directions.
The target audience for this monograph includes researchers, developers and students with an interest in the internals of database engines.