Restructuring large-scale HPC Codes with Coccinelle

LRZ has experience developing techniques for HPC-oriented large-scale restructuring of code by means of the Coccinelle software.

Coccinelle works by means of a semantic patch language (SmPL), in which one specifies transformations of C/C++ code.

The idea behind this approach is that the transformation specification is kept short, while the code it affects may be large.

It can be viewed as an aid for modifying large-scale codes: it enables performance experiments and pervasive code patching of great complexity.

The following example shows its usage.

This program declares two arrays and performs two array accesses. To try the example, you can put this listing in example.cpp.

int main ()
{
       int a [1][1][1];
       int b [1][1][1];
       int i=0,j=0,k=0;
       a[i][j][k]++;
       b[i][j][k]++;
}

Suppose we wish to replace all triple-bracket references with C++23 multi-index bracket accesses. The following SmPL program for Coccinelle contains two rules. The first rule is named tomultiindex. It replaces expressions of the form a[x][y][z] (x, y, z being arbitrary expressions) with a[x, y, z], that is, the new C++23 notation. The second rule is anonymous. It replaces expressions of the form b[...] (with ... matching any expression) with b[0]; since b[i][j][k] is parsed as ((b[i])[j])[k], only the first subscript is affected. This listing may go in example.cocci; .cocci is the usual semantic patch file extension.

# spatch --c++=23
@tomultiindex@
symbol a;
expression x,y,z;
@@
- a[x][y][z]
+ a[x, y, z]

@@
symbol b;
@@
- b [...]
+ b [0]

The result is shown here. The two array accesses are the modified lines.

int main ()
{
       int a [1][1][1];
       int b [1][1][1];
       int i=0,j=0,k=0;
       a[i, j, k ]++;
       b[0][ j][k ]++;
}

With both files in place, you can obtain this output by calling

spatch --test example

The spatch executable comes with any Coccinelle installation.

You can install and use Coccinelle on your personal laptop; Coccinelle is available on most Linux distributions.
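For the rewritten access a[i, j, k] to compile as a genuine three-index access, the object a would typically be of a class type that defines the C++23 multidimensional operator[]; built-in arrays do not offer it. The following is a minimal sketch, assuming a hypothetical Grid3 container and a hypothetical increment_corner helper, neither of which is part of the example above. The feature-test macro __cpp_multidimensional_subscript guards the C++23-only parts, so the sketch also builds with older standards.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 3D container, for illustration only (not from the example above).
template <typename T, std::size_t NX, std::size_t NY, std::size_t NZ>
class Grid3 {
    std::array<T, NX * NY * NZ> data_{};  // zero-initialized, row-major storage
public:
    // Named accessor, usable with any C++ standard.
    T& at(std::size_t x, std::size_t y, std::size_t z) {
        return data_[(x * NY + y) * NZ + z];
    }
#if defined(__cpp_multidimensional_subscript)
    // C++23 multidimensional subscript operator: enables a[x, y, z].
    T& operator[](std::size_t x, std::size_t y, std::size_t z) {
        return at(x, y, z);
    }
#endif
};

// Increments one element in the style of the patched code and returns it.
inline int increment_corner() {
    Grid3<int, 1, 1, 1> a;
    std::size_t i = 0, j = 0, k = 0;
#if defined(__cpp_multidimensional_subscript)
    a[i, j, k]++;     // the form produced by the semantic patch
#else
    a.at(i, j, k)++;  // fallback for compilers without C++23 support
#endif
    return a.at(i, j, k);
}
```

Keeping a named accessor next to the operator is only one possible design; it lets the same type be used from code that has not yet been patched.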

The approach can be used for more complex changes, involving function arguments, data structures, or pragma-based APIs like OpenMP or OpenACC.

The idea is that updating a huge codebase in thousands of locations may be extremely time-consuming, error-prone, or outright impossible without this tool. For instance, Coccinelle is routinely used for the maintenance and update of Linux kernel drivers, which comprise millions of lines of code.

To see our work on other use cases, as well as documentation, the Coccinelle website is a starting point; Michele Martone's website has a list of tutorials and publications related to this approach.


Our starting use case for gaining experience with it was an Array-of-Structures to Structure-of-Arrays refactoring that enables vectorization, about which you can read in Michele Martone, Julia Lawall: Refactoring for Performance with Semantic Patching: Case Study with Recipes, https://inria.hal.science/hal-03266521 (https://link.springer.com/chapter/10.1007/978-3-030-90539-2_15).
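To illustrate what such a layout change means in code, here is a minimal sketch. It is not taken from the cited paper: the types ParticleAoS and ParticlesSoA and the helper functions are hypothetical names chosen for illustration. The point is that every access of the form p[i].x has to become p.x[i], which is the kind of pervasive, mechanical change a semantic patch may help automate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-Structures: the fields of one particle are adjacent in memory.
// ParticleAoS is a hypothetical type used for illustration only.
struct ParticleAoS {
    double x, y, z;
};

// Structure-of-Arrays: each field is stored contiguously across particles.
struct ParticlesSoA {
    std::vector<double> x, y, z;
    explicit ParticlesSoA(std::size_t n) : x(n), y(n), z(n) {}
};

// AoS loop: consecutive x values are sizeof(ParticleAoS) bytes apart.
inline void shift_x_aos(std::vector<ParticleAoS>& p, double dx) {
    for (std::size_t i = 0; i < p.size(); ++i)
        p[i].x += dx;
}

// SoA loop: consecutive x values are adjacent, giving unit-stride access.
inline void shift_x_soa(ParticlesSoA& p, double dx) {
    for (std::size_t i = 0; i < p.x.size(); ++i)
        p.x[i] += dx;
}

// Copies AoS data into the SoA layout.
inline ParticlesSoA to_soa(const std::vector<ParticleAoS>& aos) {
    ParticlesSoA soa(aos.size());
    for (std::size_t i = 0; i < aos.size(); ++i) {
        soa.x[i] = aos[i].x;
        soa.y[i] = aos[i].y;
        soa.z[i] = aos[i].z;
    }
    return soa;
}
```

Both loops compute the same result; the SoA variant merely exposes unit-stride accesses, which compilers tend to vectorize more readily.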


At any rate, if you are interested in support with this in your specific project (e.g. consultation or training), you are welcome to inquire via https://servicedesk.lrz.de/, specifying HPC and Coccinelle in the subject. Please do not ask for this to be installed on the machines; this is not planned at the moment.