OpenACC (My Notes)

Parallelize Your Code

  • Benefits:
    • No need to write OpenCL, CUDA, etc. by hand
    • The same source runs serially on CPUs and can be deployed to GPUs

Directives

The directive-based approach has been shown to pay off: it improves the performance of code execution with minimal changes to the source.

Directives give hints to the compiler so it knows what to parallelize.

#pragma omp parallel for reduction(+:pi) and its OpenACC counterpart, #pragma acc kernels, are usually added right before code that is highly parallelizable. They take serial code and make it run on GPUs.
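As an illustration of the OpenMP directive mentioned above, here is a minimal sketch (the function name estimate_pi and the value of n are my own choices, not from the notes) that estimates pi with a parallel reduction. A compiler without OpenMP support simply ignores the pragma and runs the loop serially, producing the same result:

```c
#include <stddef.h>

/* Midpoint-rule estimate of pi = integral of 4/(1+x^2) over [0,1].
   The reduction(+:pi) clause tells the compiler that the += on pi
   is a safe parallel reduction; without -fopenmp the pragma is
   ignored and the loop runs serially with the same result. */
double estimate_pi(long n) {
    double pi = 0.0;
    double dx = 1.0 / (double)n;

    #pragma omp parallel for reduction(+:pi)
    for (long i = 0; i < n; i++) {
        double x = (i + 0.5) * dx;   /* midpoint of the i-th slice */
        pi += 4.0 / (1.0 + x * x) * dx;
    }
    return pi;
}
```

Build with -fopenmp (GCC/Clang) to actually parallelize the loop.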

#pragma acc kernels

Put a directive in front of a loop and the compiler will generate the right GPU code.
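A minimal sketch of that idea (vec_add is my own example name, not from the notes): the loop looks entirely serial, and the single directive is all that marks it for offload. A compiler without OpenACC support ignores the pragma:

```c
/* Serial-looking vector addition. Built with an OpenACC compiler
   (e.g. pgcc -acc), the kernels directive asks the compiler to
   generate GPU code for the loop; a plain C compiler ignores the
   pragma and runs the loop serially on the CPU. */
void vec_add(int n, const float *a, const float *b, float *c) {
    #pragma acc kernels
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```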

Syntax

A pragma is a directive that passes instructions to the compiler.

#pragma acc kernels [clause ...] is followed by a structured block, such as a for loop in C.
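A sketch of what those optional clauses can look like, using the standard OpenACC data clauses copyin and copyout (the function scale and its parameters are my own illustration, not from the notes):

```c
/* kernels with explicit data clauses, one instance of "[clause ...]":
   copyin(x[0:n]) transfers x to the device before the loop, and
   copyout(y[0:n]) copies y back afterwards. A non-OpenACC compiler
   ignores the whole pragma and the loop runs serially. */
void scale(int n, float a, const float *x, float *y) {
    #pragma acc kernels copyin(x[0:n]) copyout(y[0:n])
    for (int i = 0; i < n; i++)
        y[i] = a * x[i];
}
```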

C Pointers

Use keywords like restrict.

float *restrict y promises the compiler that nothing else points to the memory behind y. When that is true, use restrict so the compiler can rule out aliasing.
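Putting the pieces together, here is a saxpy sketch in the spirit of the saxpy.c mentioned below: restrict rules out aliasing between x and y, so the compiler is free to parallelize the loop without overlap checks.

```c
/* saxpy: y = a*x + y. The restrict qualifiers promise that x and y
   never alias, so iterations are independent and the kernels
   directive can be applied safely. A non-OpenACC compiler ignores
   the pragma and runs the loop serially. */
void saxpy(int n, float a, const float *restrict x, float *restrict y) {
    #pragma acc kernels
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```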

To activate the directives, compile with the PGI compiler: pgcc -acc -Minfo=accel saxpy.c

Parallel programming using the Multiprocessing module in Python

Here, I take a look at the multiprocessing module.
