PHAST Library (Parallel Heterogeneous-Architecture STL-like Template Library) is a modern C++ programming library based on the classic STL "containers, iterators, algorithms" approach.
It defines three main containers: the canonical mono-dimensional vector, a bi-dimensional matrix, and a three-dimensional cube (it should be parallelepiped, but that's an ugly name!). All the containers are dynamically sized, just like STL containers.
Many types of iterators are defined accordingly, so that containers can be iterated in multiple ways: algorithms can work not only on ranges of scalar elements, but also on rows, matrices, or three-dimensional blocks. In general, each container can be seen as a collection of sections of a lesser or equal dimensionality.
This level of abstraction gives the user a powerful tool to natively express complex problems that can't be expressed by means of other similar libraries.
All the algorithms are parallel under the hood. The whole library can target multi-core systems (standard C++ thread implementation) or NVIDIA GPUs (CUDA implementation) via a single #define statement.
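To give a feel for what a single-#define switch means in practice, here is a minimal sketch in plain standard C++. It is not PHAST code: the macro SKETCH_TARGET_THREADS and the function sketch_scale are hypothetical names invented for this illustration, and a sequential loop stands in for the second target, since a CUDA backend would not fit in a few lines.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical switch: comment this line out to select the fallback backend.
#define SKETCH_TARGET_THREADS

// Multiplies every element by `factor`. The call site never changes;
// only the macro above decides how the work is executed.
inline void sketch_scale(std::vector<float>& data, float factor) {
#ifdef SKETCH_TARGET_THREADS
    // Multi-core backend: split the range into contiguous chunks, one per thread.
    unsigned hw = std::thread::hardware_concurrency();
    std::size_t n_threads = (hw == 0) ? 2 : hw;
    std::size_t chunk = (data.size() + n_threads - 1) / n_threads;
    assert(chunk * n_threads >= data.size());  // the chunks cover the whole range
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n_threads; ++t) {
        std::size_t first = std::min(t * chunk, data.size());
        std::size_t last = std::min(first + chunk, data.size());
        if (first == last) continue;  // nothing left for this thread
        workers.emplace_back([&data, factor, first, last] {
            for (std::size_t i = first; i < last; ++i) data[i] *= factor;
        });
    }
    for (auto& w : workers) w.join();
#else
    // Fallback backend: same result, computed sequentially.
    for (float& x : data) x *= factor;
#endif
}
```

Code that calls sketch_scale(values, 2.0f) stays identical whichever branch is compiled, which is the property the paragraph above describes.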
PHAST Library is in fact a multi-platform parallel library, faithful to the "code once" philosophy.
Different platforms require different parallelization techniques. They have already been implemented in the inner layers, so users don't have to worry about them. These techniques have been distilled into a set of parallelization parameters.
PHAST Library uses heuristics to infer the values of these parameters, trying to identify the configuration that would lead to the best performance for the task at hand. It also allows users to set them manually. This way, custom optimization is still possible, and in fact it can be achieved with minimal effort.
Now, let's explore PHAST Library's main features!
In PHAST Library there are three containers: vector, matrix, and cube. They are mono-, bi-, and three-dimensional, respectively. A coordinate system is attached to them, with the three axes labelled i, j, and k.
Various kinds of iterators can be obtained from each container via begin/end methods. These iterators are completely described by the axes they span and the dimensionality of the sections they point to.
The axes spanned by an iterator are clearly specified in its name and in the begin/end methods used to obtain it. For instance, an iterator_i of a matrix will span axis i and can be obtained by calling begin_i or end_i.
The dimensionality of the sections pointed to by an iterator can be calculated by subtracting the number of axes spanned by the iterator from the dimensionality of the container. For instance, an iterator_i of a matrix will point to sections of dimensionality 1, i.e. vectors.
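The subtraction rule above is simple enough to check at compile time. The sketch below is plain C++ written for this page, not part of the library: the function section_dimensionality and the k...Dims constants are hypothetical names, and the three checks restate examples given in the text.

```cpp
// Dimensionality of the sections an iterator points to:
// container dimensionality minus the number of axes the iterator spans.
constexpr int section_dimensionality(int container_dims, int axes_spanned) {
    return container_dims - axes_spanned;
}

// Container dimensionalities as stated in the text.
constexpr int kVectorDims = 1;
constexpr int kMatrixDims = 2;
constexpr int kCubeDims = 3;

// iterator_i of a matrix spans one axis and points to vectors.
static_assert(section_dimensionality(kMatrixDims, 1) == 1, "matrix iterator_i");
// iterator_ij of a matrix spans two axes and points to scalars.
static_assert(section_dimensionality(kMatrixDims, 2) == 0, "matrix iterator_ij");
// iterator_i of a cube spans one axis and points to matrices.
static_assert(section_dimensionality(kCubeDims, 1) == 2, "cube iterator_i");
```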
Some applications require accessing data in a blocked fashion, with blocks of variable size. To achieve this, a grid object can be constructed on containers and iterated through grid_iterators, special iterators that point to sections of the same dimensionality as the container.
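As a rough picture of what blocked access involves, the sketch below enumerates the blocks of a bi-dimensional container in plain C++. It does not use the library's grid object: SketchBlock and make_grid are hypothetical names, and letting the blocks on the last row and column shrink to fit is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one block: a rectangular sub-matrix.
struct SketchBlock {
    std::size_t row0, col0;  // top-left corner inside the parent matrix
    std::size_t rows, cols;  // extent of the block
};

// Enumerates the blocks of a rows x cols matrix tiled with
// block_rows x block_cols tiles, in row-major order.
inline std::vector<SketchBlock> make_grid(std::size_t rows, std::size_t cols,
                                          std::size_t block_rows,
                                          std::size_t block_cols) {
    assert(block_rows > 0 && block_cols > 0);
    std::vector<SketchBlock> blocks;
    for (std::size_t r = 0; r < rows; r += block_rows) {
        for (std::size_t c = 0; c < cols; c += block_cols) {
            // Edge blocks are clipped to the matrix boundary.
            blocks.push_back({r, c, std::min(block_rows, rows - r),
                              std::min(block_cols, cols - c)});
        }
    }
    return blocks;
}
```

Each SketchBlock is still two-dimensional, which mirrors the statement that grid_iterators point to sections of the same dimensionality as the container.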
The following table shows the available iterators for each container, the nature of the sections pointed by each of them, and the axes they span.
For instance, a cube defines a range of iterator_i [begin_i(), end_i()) that spans axis i and points to matrices that lie on axes j and k.
Or, a range [begin_ij(), end_ij()) in a matrix object is a range of iterator_ij pointing to scalar elements.
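The two examples above can be mimicked in plain C++ to show what "pointing to a section" means. Everything in this sketch is hypothetical and independent of the library: SketchMatrix, SketchRow, row_at, for_each_row, and for_each_element are made-up names, and treating axis i as the row index of a row-major layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a bi-dimensional container (row-major storage).
struct SketchMatrix {
    std::size_t rows, cols;
    std::vector<int> data;
    SketchMatrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0) {}
};

// Hypothetical stand-in for a one-dimensional section: a view on one row.
struct SketchRow {
    int* first;
    std::size_t size;
    int& operator[](std::size_t j) const {
        assert(j < size);
        return first[j];
    }
};

// Section found at position i along axis i.
inline SketchRow row_at(SketchMatrix& m, std::size_t i) {
    assert(i < m.rows);
    return SketchRow{m.data.data() + i * m.cols, m.cols};
}

// Spans one axis: `op` receives vectors, as with a range of iterator_i.
template <typename Op>
void for_each_row(SketchMatrix& m, Op op) {
    for (std::size_t i = 0; i < m.rows; ++i) op(row_at(m, i));
}

// Spans both axes: `op` receives scalars, as with a range of iterator_ij.
template <typename Op>
void for_each_element(SketchMatrix& m, Op op) {
    for (int& x : m.data) op(x);
}
```

The same storage is visited in both cases; only the granularity of what the operation receives changes.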
PHAST Library provides many STL-like algorithms that permit applying the most common computations on ranges of iterators. Here is a full list:
Some algorithms, like for_each, count_if, and find_if, apply a unary, binary, or ternary operation to each of the sections pointed by the iterators in the range. The particular operation performed on each section is embodied by a PHAST-functor passed to the algorithm as a parameter.
PHAST-functors are modern C++ structs with an operator() method defined in their bodies. They must derive from pre-defined base-functors that determine the nature of the functor, i.e., the number and types of parameters it accepts in its operator() method. A full list of these base-functors follows:
operator() parameters can be references or const-references to scalar, vector, matrix, or cube containers in the phast::functor:: namespace. They are not low-level structures, but in-functor containers that can be iterated via in-functor iterators and manipulated via in-functor algorithms, in the same way as containers are iterated via iterators and manipulated via algorithms. This way, code can be kept concise and highly expressive even in functors!
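The general shape described above (a struct with operator() that derives from a base-functor and is handed to an algorithm) can be sketched in plain C++ as follows. None of the names come from the library: sketch_scalar_functor, add_offset, and sketch_for_each are hypothetical, and the algorithm here runs sequentially for simplicity.

```cpp
#include <cassert>
#include <vector>

// Hypothetical base-functor: it fixes the nature of the functor,
// here "one scalar parameter of type T".
template <typename T>
struct sketch_scalar_functor {
    using value_type = T;
};

// User functor: a struct deriving from the base, with operator() in its body.
template <typename T>
struct add_offset : sketch_scalar_functor<T> {
    T offset{};
    void operator()(T& element) const { element += offset; }
};

// Hypothetical algorithm: applies the functor to every element in the range.
// Sequential stand-in; a parallel library would distribute these calls.
template <typename Iterator, typename Functor>
void sketch_for_each(Iterator first, Iterator last, Functor f) {
    for (; first != last; ++first) f(*first);
}
```

State such as offset travels with the functor object, so the algorithm itself needs no extra arguments.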
All that said, the best way to understand how to write PHAST code is to see what correct code looks like! Try different containers and iterators and watch the code change accordingly. We bet you will notice how brief and concise it remains...
As we said, an important feature of PHAST Library is the possibility to manipulate in-functor containers via in-functor algorithms.
These are methods of the inherited base-functor, and can be accessed via this-> inside a functor's operator(). Here is a full list of them:
Some in-functor algorithms admit a unary operation. This operation can be embodied by an inner-functor, which is declared in the same way as the main-functor.
So, PHAST Library admits a multi-layered functor structure that leads to a hierarchical, nested parallelism.
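A layered functor structure of this kind can be pictured with the plain C++ sketch below. It is an illustration only: sketch_vector_functor, square_inner, square_row, and sketch_for_each_row are hypothetical names, std::vector stands in for in-functor containers, and both layers run sequentially here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical base-functor for functors that receive a vector section.
// Its methods play the role of in-functor algorithms.
template <typename T>
struct sketch_vector_functor {
    // Applies the inner functor to every element of the section.
    template <typename Inner>
    void for_each(std::vector<T>& section, Inner inner) const {
        for (T& x : section) inner(x);
    }
};

// Inner functor: works on one scalar.
template <typename T>
struct square_inner {
    void operator()(T& x) const { x *= x; }
};

// Main functor: works on one row and delegates to the inner functor.
template <typename T>
struct square_row : sketch_vector_functor<T> {
    void operator()(std::vector<T>& row) const {
        // this-> is needed because for_each lives in a base that depends on T.
        this->for_each(row, square_inner<T>{});
    }
};

// Outer layer: applies the main functor to each row (sequential stand-in).
template <typename T, typename Functor>
void sketch_for_each_row(std::vector<std::vector<T>>& rows, Functor f) {
    for (auto& row : rows) f(row);
}
```

The outer call hands rows to square_row, which in turn hands scalars to square_inner: two layers of functors, each a candidate for its own level of parallelism.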
So, application code is not affected by the underlying platform!
It can be migrated seamlessly from one platform to another with a single macro definition: no setup code, no contexts, no explicit device manipulation, no boilerplate related to the underlying device.
This "code once" philosophy has been a major concern while developing PHAST Library and will surely be honored in future developments.
The way algorithms execute on a given platform can be tuned by setting some parallelization parameters. PHAST Library tries its best to automatically select the optimal values for them, considering the particular architecture chosen for that execution. However, the selection process is based on heuristics, and sometimes a sub-optimal set is selected for the given task.
For this reason, PHAST Library lets users explicitly set the values of some parameters.
These parameters are architecture-specific and non-overlapping: this way, users can specify all of them once and only the relevant ones will affect the program execution.
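One way to picture architecture-specific, non-overlapping parameters is the plain C++ sketch below. It is not the library's interface: SketchParams, SketchTarget, effective_setting, the two parameter names, and the default values are all hypothetical, and fixed constants stand in for the heuristics.

```cpp
#include <cassert>

// Hypothetical parameter set: each field matters to exactly one architecture.
// A value of 0 means "not set by the user".
struct SketchParams {
    unsigned cpu_threads = 0;     // read only by the multi-core target
    unsigned gpu_block_size = 0;  // read only by the GPU target
};

enum class SketchTarget { MultiCore, Gpu };

// Arbitrary placeholders for values a heuristic would pick.
constexpr unsigned kDefaultCpuThreads = 4;
constexpr unsigned kDefaultGpuBlockSize = 256;

// Returns the setting that drives execution on `target`.
// Parameters belonging to the other architecture are simply ignored.
inline unsigned effective_setting(const SketchParams& p, SketchTarget target) {
    if (target == SketchTarget::MultiCore) {
        return p.cpu_threads != 0 ? p.cpu_threads : kDefaultCpuThreads;
    }
    return p.gpu_block_size != 0 ? p.gpu_block_size : kDefaultGpuBlockSize;
}
```

Because the two fields never interact, a user can fill in both once and switch targets without touching the parameter code.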
The following research papers present the idea and programming approach behind PHAST Library. If you use PHAST Library in a scientific publication, please cite the Main Paper listed below.
B. Peccerillo and S. Bartolini, "PHAST - A Portable High-Level Modern C++ Programming Library for GPUs and Multi-Cores", IEEE Transactions on Parallel and Distributed Systems, vol. 30, no. 1, Jan. 2019, pp. 174-189
B. Peccerillo and S. Bartolini, "Task-DAG Support in Single-Source PHAST Library: Enabling Flexible Assignment of Tasks to CPUs and GPUs in Heterogeneous Architectures", Proceedings of the 10th International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM), Washington DC, 2019, pp. 91-100
B. Peccerillo and S. Bartolini, "Single-source Library for Enabling Seamless Assignment of Data-parallel Task-DAGs to CPUs and GPUs in Heterogeneous Architectures", Proceedings of the 10th and 8th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM), Valencia, 2019, pp. 3:1-3:4
B. Peccerillo and S. Bartolini, "PHAST Library – Enabling Single-Source and High Performance Code for GPUs and Multi-cores", 2017 International Conference on High Performance Computing & Simulation (HPCS), Genova, 2017, pp. 715-718