The directive-based approach has shown that there is much to gain in code execution performance.
Give hints to the compiler so it knows what to do.
For example, in OpenMP:
#pragma omp parallel for reduction(+:pi)
and its rough counterpart in OpenACC: #pragma acc kernels
A directive like this is usually added right before code that is highly
parallelizable. It takes serial code and makes it run on GPUs.
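The reduction(+:pi) clause in the OpenMP directive above suggests a pi computation, so here is a minimal sketch of one. It assumes the usual midpoint-rule integration of 4/(1+x^2) over [0,1]; the names compute_pi and n_steps are made up for illustration and do not come from any particular code base. Note that this OpenMP version runs on CPU threads, and a driver main is left out.

```c
/* Approximate pi by integrating 4/(1+x^2) over [0,1] with the midpoint rule.
   compute_pi and n_steps are illustrative names (assumptions). */
double compute_pi(long n_steps)
{
    const double step = 1.0 / (double)n_steps;
    double pi = 0.0;

    /* Each thread accumulates a private partial sum; reduction(+:pi)
       adds the partial sums together when the loop ends. */
    #pragma omp parallel for reduction(+:pi)
    for (long i = 0; i < n_steps; i++) {
        double x = ((double)i + 0.5) * step;
        pi += 4.0 / (1.0 + x * x);
    }

    return pi * step;
}
```

If the code is built without OpenMP support, the compiler ignores the pragma and the loop simply runs serially, which is part of the appeal of the directive approach.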
#pragma acc kernels
Put a directive in front of the loops, and the compiler will generate the corresponding GPU code.
A pragma is a directive: an instruction to the compiler.
#pragma acc kernels [clause ...]
is followed by a structured block, such as a for loop in C.
Use keywords like restrict:
float *restrict y
When nothing else is going to point to the data that y points to, use restrict. It tells the compiler that y is not aliased.
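Putting the kernels directive and restrict together, a saxpy routine might look like the sketch below. The notes do not show the contents of saxpy.c, so the function signature and the copyin/copy data clauses are assumptions, and a driver main is left out.

```c
/* Single-precision y = a*x + y.
   restrict promises the compiler that x and y do not overlap,
   so the loop iterations can be treated as independent. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    /* The data clauses are an assumption: x is only read on the device,
       y is copied in and copied back out. */
    #pragma acc kernels copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

A compiler without OpenACC support ignores the pragma and runs the loop serially on the CPU, so the same source can be checked on any machine.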
To activate the directives:
pgcc -acc -Minfo=accel saxpy.c