Oleg Kiselyov (2018), "Reconciling Abstraction with High Performance:
A MetaOCaml approach", Foundations and Trends® in Programming Languages: Vol. 5: No. 1, pp 1-101. http://dx.doi.org/10.1561/2500000038

© 2018 O. Kiselyov

Partial Evaluation, Program Transformations and Optimizations, Compilation and Interpretation Techniques, Domain Specific Languages

Download article
**In this article:**

1. Introduction

2. First Steps

3. Filtering

4. Linear Algebra DSL:
Complex Vector Arithmetic and Data Layout

5. Linear Algebra DSL:
Matrix-Vector Operations and Modular
Optimizations

6. From an Interpreter to a Compiler: DSL for
Image Manipulation

7. Further Challenges

8. Conclusions

Acknowledgements

Index of the Accompanying Code

References

A common application of generative programming is building highperformance computational kernels highly tuned to the problem at hand. A typical linear algebra kernel is specialized to the numerical domain (rational, float, double, etc.), loop unrolling factors, array layout and a priori knowledge (e.g., the matrix being positive definite). It is tedious and error prone to specialize by hand, writing numerous variations of the same algorithm. The widely used generators such as ATLAS and SPIRAL reliably produce highly tuned specialized code but are difficult to extend. In ATLAS, which generates code using printf, even balancing parentheses is a challenge. According to the ATLAS creator, debugging is nightmare. A typed staged programming language such as MetaOCaml lets us state a general, obviously correct algorithm and add layers of specializations in a modular way. By ensuring that the generated code always compiles and letting us quickly test it, MetaOCaml makes writing generators less daunting and more productive. The readers will see it for themselves in this hands-on tutorial. Assuming no prior knowledge of MetaOCaml and only a basic familiarity with functional programming, we will eventually implement a simple domain-specific language (DSL) for linear algebra, with layers of optimizations for sparsity and memory layout of matrices and vectors, and their algebraic properties. We will generate optimal BLAS kernels. We shall get the taste of the “Abstraction without guilt”.

1. Introduction

2. First Steps

3. Filtering

4. Linear Algebra DSL:
Complex Vector Arithmetic and Data Layout

5. Linear Algebra DSL:
Matrix-Vector Operations and Modular
Optimizations

6. From an Interpreter to a Compiler: DSL for
Image Manipulation

7. Further Challenges

8. Conclusions

Acknowledgements

Index of the Accompanying Code

References

*Reconciling Abstraction with High Performance: A MetaOCaml Approach* teaches the reader how to write typed code generators, how to make them modular, and how to gradually introduce domain-specific optimizations with MetaOCaml. Assuming no prior knowledge of MetaOCaml and only a basic familiarity with functional programming, it explains and illustrates how to implement a simple domain-specific language (DSL) for linear algebra, with layers of optimizations for sparsity and memory layout of matrices and vectors, and their algebraic properties.

*Reconciling Abstraction with High Performance: A MetaOCaml Approach* is based on the written record of a live tutorial delivered on several occasions (first at CUFP – Commercial Users of Functional Programming 2013). It inherits the hands-on style of those tutorials, built around live coding, in interaction with the MetaOCaml and its type checker and the audience. It develops code piece-by-piece by submitting small fragments to the MetaOCaml interpreter, fixing type problems, generating sample code and testing it, noting the points of improvement, and adjusting the generator as needed. The monograph includes many exercises and homework projects to work on alone or in groups.

Replication Data | 2500000038_supp.zip (ZIP).

This file contains the source code of the tutorial (see Index of the Accompanying Code in the monograph).