By Bor-Yuh Evan Chang, University of Colorado, USA, evan.chang@colorado.edu | Cezara Drăgoi, INRIA, France and CNRS, PSL University, France, cezarad@di.ens.fr | Roman Manevich, Ben-Gurion University of the Negev, Israel, romanm@cs.bgu.ac.il | Noam Rinetzky, Tel Aviv University, Israel, maon@cs.tau.ac.il | Xavier Rival, INRIA, France and CNRS, PSL University, France, rival@di.ens.fr
The computation of semantic information about the behavior of pointer-manipulating programs has been a long standing issue, attacked with diverse and numerous techniques and tools for over 50 years. As usual in automatic verification of infinite-state programs, properties of interest are not computable. Thus, static analyses can only be conservative, leading different analyses to make different tradeoffs between the intricacies of the properties they detect, the precision of their inference procedure and analysis, and the scalability of the analysis.
In this context,shape analyses focus on inferring highly complex properties of heap-manipulating programs. These programs utilize data structures which are implemented using an unbounded number of dynamically- (heap-) allocated memory cells interconnected via mutable pointer-links. Because shape analyses have to reason about data structures whose size is not bounded by a fixed, known value, they cannot track explicitly the particular properties of every concrete memory cell which the program uses, as done, e.g., by analysis of variable-manipulating non-recursive programs. Instead, shape analysessummarize memory regions by letting one piece of abstract information, calledsummary predicate, describe several concrete cells. The need to cope with data structures of unbounded sizes is a challenge shape analyses share with static analyzers of array-manipulating programs. However, while the size of an array may change in different executions, its layout (i.e., its dimensions and the way its contents are spread over the memory) is fixed. In contrast, the layout of a pointer-linked data structure, colloquially referred to as itsshape, may evolve dynamically during the program execution and a memory cell can be part of different data structures at different points in time. As a result, shape analyses need to let the denotation of summary predicates in terms of the constituents and layouts of the memory regions which they represent evolve during the analysis as well.
In this survey, we consider that shape analyses are characterized and defined by the presence of summary predicates describing a set of concrete memory cells that varies during the course of the analysis. We use this characterization as a means for distinguishing shape analyses as a particular class of pointer analyses. We show that many “standard” pointer analyses do not fit the aforementioned description, while many analyses relying on very different mathematical foundations, e.g., shape graphs, three-valued logic, and separation logic, do.
The ambition of this survey is to provide a comprehensive introduction to the field of shape analysis, and to present the foundation of this topic, in a single document that is accessible to readers who are not familiar with it. To do so, we characterize the essence of shape analysis compared to more classical pointer analyses. We supply the intuition underlying the abstractions commonly used in shape analysis and the algorithms that allow to statically compute intricate semantic properties. Then, we cover the main families of shape analysis abstraction and algorithms, highlight the similarities between them, and also characterize the main differences between the most common approaches. Last, we review a few other static analysis works (such as array abstractions, dictionary abstractions and interprocedural analyses) that were influenced by the ideas of shape analysis, so as to demonstrate the impact of the field.
Shape analyses focusing on inferring highly complex properties of heap-manipulating programs are techniques used in the automatic verification of infinite-state programs.
This survey provides a comprehensive introduction to the field of shape analysis, and presents the foundation of the topic in an accessible manner to readers who are not familiar with it. To do so, the authors characterize the essence of shape analysis compared to more classical pointer analyses as well as giving the intuition underlying the abstractions commonly used in shape analysis and the algorithms that allow to statically compute intricate semantic properties.
The authors cover the main families of shape analysis abstraction and algorithms, highlight the similarities between them, and characterize the main differences between the most common approaches. Finally, the authors demonstrate the impact of the field by describing a few other static analysis works – array abstractions, dictionary abstractions and interprocedural analyses – that were influenced by the ideas of shape analysis.
Researchers and students new to the concept of shape analysis will find this monograph a one-stop shop of information that will quickly get them up to speed on all aspects of the topic. With a comprehensive set of references, this accessible survey will enable the reader to adopt the techniques quickly in their own further research.