The best way to learn a new technology is to look at some examples:

Basic setup

In this example, we initialize a basic environment by simply creating a new communicator (CPU-based or GPU-based). An environment created this way is not very useful on its own, though: we still need to create some kernels and buffers to play with.

Simple kernel computation

In this example, we execute a simple kernel that squares every element of the given buffer. As you can see, you can treat buffers like std::vector, using operator[] to set and get values.
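To make the idea concrete without depending on the Laetus API, here is a minimal sketch of what such a kernel and its element access could look like. The kernel text is plain OpenCL C; the names `kSquareKernelSource`, `square_all` and `square_example` are hypothetical, and a std::vector stands in for the Laetus buffer, so this is a host-side reference, not the library's own code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// OpenCL C source for a squaring kernel: one work item per element.
// (Assumed kernel text, kept here as a string for illustration only.)
const char* kSquareKernelSource = R"CLC(
__kernel void square(__global float* data) {
    size_t i = get_global_id(0);
    data[i] = data[i] * data[i];
}
)CLC";

// Serial reference: each loop iteration does what one work item would do.
void square_all(std::vector<float>& buffer) {
    for (std::size_t i = 0; i < buffer.size(); ++i) {
        buffer[i] = buffer[i] * buffer[i];
    }
}

// Sets values through operator[], squares them, and reads one back.
std::vector<float> square_example() {
    std::vector<float> buffer(4);
    for (std::size_t i = 0; i < buffer.size(); ++i) {
        buffer[i] = static_cast<float>(i + 1);  // 1, 2, 3, 4
    }
    square_all(buffer);
    assert(buffer[1] == 4.0f);  // 2 squared, read back via operator[]
    return buffer;
}
```

On a real device the loop disappears: the runtime launches one work item per element and each one executes the kernel body once.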

STL Algorithm functions

In Laetus, we have converted most STL algorithm functions to operate on OpenCL buffers in parallel. In the example, we fill two buffers of one million elements each with different content, and then we swap their contents. The final std::cout calls are almost instantaneous.
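For comparison, this is roughly what the same fill-and-swap looks like with the serial STL on plain std::vector. It is only a sketch of the serial counterpart: the function name `fill_and_swap` is hypothetical and none of this is the Laetus API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Fill two containers with different values, then exchange their
// contents element by element, using only the standard library.
std::pair<std::vector<int>, std::vector<int>>
fill_and_swap(std::size_t n, int a_value, int b_value) {
    std::vector<int> a(n), b(n);
    std::fill(a.begin(), a.end(), a_value);
    std::fill(b.begin(), b.end(), b_value);
    assert(a.size() == b.size());  // swap_ranges needs equal lengths
    std::swap_ranges(a.begin(), a.end(), b.begin());
    return {std::move(a), std::move(b)};
}
```

Calling `fill_and_swap(1000000, 1, 2)` returns a pair whose first vector holds only 2s and whose second holds only 1s.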

Iterators support

Thanks to function overloading, two versions of each algorithm function are provided: the first works on the entire buffer, the second takes one or more iterators, just like its serial counterpart.
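The overloading pattern itself is easy to sketch in standard C++. The function name `fill_values` is hypothetical and std::vector stands in for a buffer, so take this as an illustration of the two-overload design rather than the actual Laetus signatures.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Iterator version: works on the half-open range [first, last).
void fill_values(std::vector<int>::iterator first,
                 std::vector<int>::iterator last,
                 int value) {
    assert(first <= last);  // the range must not be reversed
    std::fill(first, last, value);
}

// Whole-container version: forwards to the iterator overload.
void fill_values(std::vector<int>& buffer, int value) {
    fill_values(buffer.begin(), buffer.end(), value);
}
```

With a five-element vector `v`, `fill_values(v, 7)` sets every element, while `fill_values(v.begin() + 1, v.begin() + 3, 9)` touches only the second and third. The compiler picks the overload from the argument types, so both calls share one name.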

Seamless structs passing

You can even pass Laetus a buffer of complex data structures; the only prerequisite is that the struct cannot contain a pointer to a function. This is a limitation of OpenCL, not of Laetus. Note that you need to define your struct on both the host side and the device side. In the example, we start by creating a new std::vector of Vec structs and initializing a new Laetus buffer with it. We'll store the result of the computation in another Vec struct, so we need to create an appropriate Laetus buffer with only one free slot to hold it. The OpenCL kernel is pretty simple: for each work item, we use atomic functions to add every coordinate to the corresponding coordinate of the result vector. So we end up with 12, 15, 18 in the result, where 12 is v1.x + v2.x + v3.x. The same applies to y and z.

This is the kernel: ... and this is the host code.
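As a stand-in, here is a minimal sketch of the struct-passing idea that does not rely on the Laetus API. The struct is defined twice, once for the host and once inside the OpenCL C kernel string. Everything here is an assumption: the names `kSumKernelSource`, `sum_vecs` and `sum_vecs_reference`, the integer fields, and the input values (1,2,3), (4,5,6), (7,8,9), which were chosen only because they add up to 12, 15, 18.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side definition of the struct.
struct Vec { int x; int y; int z; };

// Device-side definition plus kernel (OpenCL C, assumed text).
// atomic_add is needed there because work items run concurrently.
const char* kSumKernelSource = R"CLC(
typedef struct { int x; int y; int z; } Vec;

__kernel void sum_vecs(__global const Vec* input, __global Vec* result) {
    size_t i = get_global_id(0);
    atomic_add(&result->x, input[i].x);
    atomic_add(&result->y, input[i].y);
    atomic_add(&result->z, input[i].z);
}
)CLC";

// Serial reference with the same signature shape as the kernel.
// One loop iteration corresponds to one work item; no atomics are
// needed here because the iterations run one after another.
void sum_vecs_reference(const std::vector<Vec>& input, Vec* result) {
    assert(result != nullptr);  // the single result slot must exist
    for (std::size_t i = 0; i < input.size(); ++i) {
        result->x += input[i].x;
        result->y += input[i].y;
        result->z += input[i].z;
    }
}
```

The result slot has to start at zero, since every work item only adds to it.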

Big Data computations

Laetus uses a smart memory management system that automatically divides huge data structures into small, efficient memory chunks. This technology allows you to operate on very large data structures. Take a look at this fill on 700 million elements:
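The general idea of working through a large structure chunk by chunk can be sketched in plain C++. This is not how Laetus implements it internally (the source does not say); `fill_in_chunks` and its `chunk_size` parameter are hypothetical and only show the pattern.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Fill a large container one fixed-size chunk at a time and return
// how many chunks were processed. The last chunk may be shorter.
std::size_t fill_in_chunks(std::vector<int>& data, int value,
                           std::size_t chunk_size) {
    assert(chunk_size > 0);  // a zero-sized chunk would never advance
    std::size_t chunks = 0;
    for (std::size_t begin = 0; begin < data.size(); begin += chunk_size) {
        const std::size_t end = std::min(begin + chunk_size, data.size());
        std::fill(data.begin() + static_cast<std::ptrdiff_t>(begin),
                  data.begin() + static_cast<std::ptrdiff_t>(end),
                  value);
        ++chunks;
    }
    return chunks;
}
```

In a device setting each chunk would be the unit that gets transferred and processed, so only one chunk has to fit in device memory at a time.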