**Overview**

**PHAST Library** (Parallel Heterogeneous-Architecture STL-like Template Library) is a modern C++ programming library based on the classic
STL **"containers, iterators, algorithms"** approach.

It defines three main containers: the canonical one-dimensional **vector**, a two-dimensional **matrix**, and a three-dimensional
**cube** (strictly speaking it is a *parallelepiped*, but that's an ugly name!). All the containers are dynamically sized, like STL containers.

Many types of iterators are defined accordingly, so that containers can be iterated in multiple ways: algorithms can work not only on ranges of scalar elements, but also on rows, matrices, or three-dimensional blocks. In general, each container can be seen as a collection of sections of a lesser or equal dimensionality.

This level of abstraction gives the user a powerful tool to natively express complex problems that cannot be expressed by means of other similar libraries.

All the algorithms are **parallel** under the hood. The whole library can target multi-core systems (standard C++
thread implementation) or NVIDIA GPUs (CUDA implementation) via a single **#define** directive.

PHAST Library is in fact a multi-platform parallel library, faithful to the *code once* philosophy.

Different platforms require different *parallelization* techniques. These have already been implemented in the inner layers,
so users don't have to worry about them. The techniques have been condensed into a set of
**parallelization parameters**.

PHAST Library uses heuristics to infer the values of these parameters, trying to identify the configuration that leads to the best
performance for the task at hand. It also allows users to set them *manually*. This way, custom optimization
is still possible, and it can be achieved with minimal effort.

Now, let's explore the main features of PHAST Library!

In PHAST Library there are three containers: **vector**, **matrix**, and **cube**. They are one-, two-, and three-dimensional,
respectively. A coordinate system is attached to them, with the three axes labelled **i**, **j**, and **k**.

Various kinds of iterators can be obtained from each container via begin/end methods. These iterators are completely described by:

- the axes they span while *moving*;
- the dimensionality of the sections they point to.

The axes spanned by an iterator are stated in its name and in the names of the begin/end methods used to obtain it. For instance, an
**iterator_i** of a matrix spans axis **i** and can be obtained by calling **begin_i** or **end_i**.

The dimensionality of the sections an iterator points to is obtained by subtracting the number of axes spanned by the iterator from the
dimensionality of the container. For instance, an **iterator_i** of a matrix points to sections of
dimensionality **1**, i.e. **vectors**.
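The subtraction rule can be written down as a small compile-time check. This is only a sketch in plain C++: `section_dim` is a hypothetical helper introduced here for illustration and is not part of PHAST. The three checks restate cases given in this overview.

```cpp
#include <cassert>

// Hypothetical helper, not part of PHAST.
// Dimensionalities: 0 = scalar, 1 = vector, 2 = matrix, 3 = cube.
constexpr int section_dim(int container_dim, int axes_spanned)
{
    // An iterator spans at least one axis and at most all axes of its container.
    assert(axes_spanned >= 1 && axes_spanned <= container_dim);
    return container_dim - axes_spanned;
}

// iterator_i of a matrix spans one axis and points to vectors.
static_assert(section_dim(2, 1) == 1, "matrix, iterator_i -> vectors");
// iterator_ij of a matrix spans two axes and points to scalar elements.
static_assert(section_dim(2, 2) == 0, "matrix, iterator_ij -> scalars");
// iterator_i of a cube spans one axis and points to matrices.
static_assert(section_dim(3, 1) == 2, "cube, iterator_i -> matrices");
```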

Some applications require accessing data in a blocked fashion, with blocks of variable size. To achieve this,
a **grid** object can be constructed on a container and iterated through **grid_iterators**, special iterators
that point to sections of the same dimensionality as the container.
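To make the idea of blocked access concrete, here is a minimal sketch in plain C++ that visits a row-major matrix block by block and reduces each block to its sum. It does not use PHAST: `block_sums` and its parameters are hypothetical names, and clipping the blocks at the matrix border is an assumption of this sketch, not a statement about how PHAST grids behave.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of PHAST.
// Visits a rows x cols row-major matrix in blocks of block_rows x block_cols
// and returns the sum of each block, in row-major block order.
// Each block is itself a two-dimensional section of the matrix.
std::vector<int> block_sums(const std::vector<int>& m,
                            std::size_t rows, std::size_t cols,
                            std::size_t block_rows, std::size_t block_cols)
{
    assert(m.size() == rows * cols);
    assert(block_rows > 0 && block_cols > 0);

    std::vector<int> sums;
    for (std::size_t bi = 0; bi < rows; bi += block_rows) {
        for (std::size_t bj = 0; bj < cols; bj += block_cols) {
            // Assumption of this sketch: border blocks are clipped to the matrix.
            const std::size_t i_end = std::min(bi + block_rows, rows);
            const std::size_t j_end = std::min(bj + block_cols, cols);

            int sum = 0;
            for (std::size_t i = bi; i < i_end; ++i) {
                for (std::size_t j = bj; j < j_end; ++j) {
                    sum += m[i * cols + j];
                }
            }
            sums.push_back(sum);
        }
    }
    return sums;
}
```

For a 4x4 matrix holding 1 to 16 and 2x2 blocks, the function returns four sums, one per block.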

The following table shows the available iterators for each container, the nature of the sections pointed by each of them, and the axes they span.

Container | iterator_i | iterator_ij | iterator_ijk | grid_iterator
---|---|---|---|---

For instance, a **cube** defines a range of **iterator_i** [begin_i(), end_i()) that spans axis **i** and points to matrices that
*lie* on axes **j** and **k**.

Similarly, the range [begin_ij(), end_ij()) of a **matrix** object is a range of **iterator_ij** pointing to scalar elements.

PHAST Library provides many STL-like algorithms that permit applying the most common computations on ranges of iterators. Here is a full list:

Some algorithms, like **for_each**, **count_if**, and **find_if**, apply a unary, binary, or ternary operation to each of the sections pointed
by the iterators in the range. The particular operation performed on each section is embodied by a **PHAST-functor** passed to the algorithm
as a parameter.
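The pattern of an algorithm applying a functor to each section of a range can be sketched in plain C++. Everything below is a stand-in written for illustration: `row_view`, `row_functor_base`, `scale_row`, and `for_each_row` are hypothetical names, not PHAST types or algorithms, and the loop is sequential rather than parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ stand-ins for illustration; none of these names belong to PHAST.

// A one-dimensional section (one row) of a row-major matrix.
struct row_view {
    float* data;
    std::size_t size;
};

// Plays the role of a base type that fixes the operator() signature:
// functors derived from it take a single one-dimensional section.
struct row_functor_base {};

// A functor: a struct whose operator() holds the per-section operation.
struct scale_row : row_functor_base {
    float factor;
    explicit scale_row(float f) : factor(f) {}

    void operator()(row_view row) const
    {
        for (std::size_t j = 0; j < row.size; ++j) {
            row.data[j] *= factor;
        }
    }
};

// Sequential stand-in for an algorithm applied to the range of rows of a matrix.
template <typename Functor>
void for_each_row(std::vector<float>& m, std::size_t rows, std::size_t cols, Functor f)
{
    assert(m.size() == rows * cols);
    for (std::size_t i = 0; i < rows; ++i) {
        f(row_view{m.data() + i * cols, cols});
    }
}
```

Calling `for_each_row(m, 2, 3, scale_row(2.0f))` on a 2x3 matrix doubles every element, one row per functor invocation.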

**PHAST-functors** are modern C++ structs that define an **operator()** method. They must derive from pre-defined **base-functors**
that determine the *nature* of the functor, i.e., the number and types of parameters it accepts in its **operator()** method. A full list of these
base-functors follows:

All that said, the best way to understand how to write PHAST code is to see what *correct code* looks like! Try different containers and
iterators and watch the code change accordingly. We bet you will notice how brief and concise it remains...

As we said, an important feature of PHAST Library is the possibility to manipulate **in-functor containers** via **in-functor algorithms**.

These are methods of the inherited base-functor, and can be accessed via **this->** inside a functor's **operator()**. Here is a full list of them:

Some in-functor algorithms accept a unary operation. This operation can be embodied by an **inner-functor**, which is declared in the same way as
the main functor.

PHAST Library therefore supports a **multi-layered functor structure** that leads to hierarchical, nested parallelism.
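The layering can be sketched in plain C++ as an outer functor that, for each row, calls an algorithm inherited from its base and hands it an inner functor. All names here (`row_base`, `square`, `square_row`, `for_each_row`) are hypothetical and written for this illustration only; they are not PHAST's base-functors or in-functor algorithms, and both levels run sequentially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ stand-ins for illustration; none of these names belong to PHAST.

// Inner functor: the operation applied to each scalar element.
struct square {
    void operator()(int& x) const { x *= x; }
};

// Stand-in for a base type that offers an algorithm as a member,
// so that derived functors can reach it through this->.
struct row_base {
    template <typename Inner>
    void for_each(int* first, int* last, Inner inner) const
    {
        for (; first != last; ++first) {
            inner(*first);
        }
    }
};

// Outer functor: receives one row and delegates to the inherited algorithm.
struct square_row : row_base {
    void operator()(int* row, std::size_t n) const
    {
        this->for_each(row, row + n, square());
    }
};

// Outer level: applies the outer functor to every row of a row-major matrix.
template <typename Outer>
void for_each_row(std::vector<int>& m, std::size_t rows, std::size_t cols, Outer outer)
{
    assert(m.size() == rows * cols);
    for (std::size_t i = 0; i < rows; ++i) {
        outer(m.data() + i * cols, cols);
    }
}
```

In a parallel setting, each level of this nesting is a place where work could be distributed, which is what makes the structure hierarchical.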

So, application code is not affected by the underlying platform!

It can be migrated seamlessly from one platform to another with a single macro definition: no setup code, no contexts, no explicit device manipulation, no boilerplate related to the underlying device.
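The general mechanism of choosing a back end with one macro can be sketched as follows. This is not PHAST code: `SKETCH_USE_THREADS` and `apply_all` are hypothetical names, the sketch only switches between a sequential loop and standard C++ threads, and the name of PHAST's actual macro is not given in this overview.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical switch for this sketch only; it is not a PHAST macro.
// Define it (for example with -DSKETCH_USE_THREADS) to select the threaded path.

// Applies op to every element of data. The call site is the same for both
// back ends; only the body selected at compile time differs.
template <typename T, typename Op>
void apply_all(std::vector<T>& data, Op op)
{
#if defined(SKETCH_USE_THREADS)
    if (data.empty()) {
        return;
    }
    const std::size_t n_threads = std::max(1u, std::thread::hardware_concurrency());
    const std::size_t chunk = (data.size() + n_threads - 1) / n_threads;
    assert(chunk > 0);

    // One worker per contiguous chunk of elements.
    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < data.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, data.size());
        workers.emplace_back([&data, op, begin, end]() mutable {
            for (std::size_t i = begin; i < end; ++i) {
                op(data[i]);
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
#else
    for (T& x : data) {
        op(x);
    }
#endif
}
```

A call such as `apply_all(v, [](int& x) { x *= 2; })` stays unchanged whichever path is compiled in.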

This **code once** philosophy has been a major concern while developing PHAST Library and will surely be honored in future developments.

The way algorithms execute on a given platform can be tuned by setting some **parallelization parameters**. PHAST Library tries its best to automatically select the
**optimal** values for them, considering the particular architecture chosen for that execution. However, the selection process is based on heuristics, so a sub-optimal set is
sometimes selected for the given task.

For this reason, PHAST Library lets users explicitly set the values of some parameters.

These parameters are architecture-specific and non-overlapping: this way, users can specify all of them once and only the relevant ones will affect the program execution.

The following research papers study the idea and programming approach behind PHAST Library. If you want to cite us, please refer to the main one.

B. Peccerillo and S. Bartolini, "PHAST - A Portable High-Level Modern C++ Programming Library for GPUs and Multi-Cores", IEEE Transactions on Parallel and Distributed Systems, vol. 30, no. 1, Jan. 2019, pp. 174-189

B. Peccerillo and S. Bartolini, "Task-DAG Support in Single-Source PHAST Library: Enabling Flexible Assignment of Tasks to CPUs and GPUs in Heterogeneous Architectures", Proceedings of the 10th International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM), Washington DC, 2019, pp. 91-100

B. Peccerillo and S. Bartolini, "Single-source Library for Enabling Seamless Assignment of Data-parallel Task-DAGs to CPUs and GPUs in Heterogeneous Architectures", Proceedings of the 10th and 8th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms (PARMA-DITAM), Valencia, 2019, pp. 3:1-3:4

B. Peccerillo, S. Bartolini, and Ç. K. Koç, "Parallel Bitsliced AES through PHAST: a Single-Source High-Performance Library for Multi-Cores and GPUs", Journal of Cryptographic Engineering, 2017

B. Peccerillo and S. Bartolini, "PHAST Library – Enabling Single-Source and High Performance Code for GPUs and Multi-cores", 2017 International Conference on High Performance Computing & Simulation (HPCS), Genova, 2017, pp. 715-718

Now that you have read the main features of PHAST Library, try it yourself!

Register on our website to gain access to the download section.