Compaq KAP C does parallel decomposition of programs so they run on symmetric multiprocessor (SMP) systems. This chapter describes how to compile and run a program for parallel execution using the kcc driver and kapc.
3.1 Overview
KAP C transforms C source programs so that, when compiled and linked, they execute as multithreaded processes. These threads can run simultaneously, that is, in parallel, on symmetric multiprocessor systems. The result is a program whose start-to-finish time is less than that of a C program that does not execute as a multithreaded process. More specifically, at run time the instructions from for loops in a transformed C program execute in parallel mode. Parallelization is the process that transforms for loops into instructions in an executable file that execute as multithreaded processes.
KAP C considers all for loops in a program as candidates for parallelization. Each loop is or is not parallelized according to:
This chapter describes the three basic methods of controlling parallel processing (automatic, directed, and combination). It explains, for each method, how to:
Compaq KAP C provides three methods for programmers to control parallel processing. Their summaries follow:
KAP C will not perform automatic parallel decomposition or serial optimization on files that contain OpenMP directives.
> kcc -ckapargs='-concurrent' my_prog.c
> kcc -ckapargs='-noconc' my_prog.c -omp -pthread -call_shared
> kcc -ckapargs='-concurrentize' openmp.c no_openmp.c
When using any of these three methods, you must be aware of the values
of certain environment variables, because they affect the run-time
behavior of your program. See Section 3.10, Running a Parallelized Program.
3.3 Summary of Parallel Processing Controls
Compaq KAP C provides the following parallel command-line switches, directives, and assertions for use with automatic parallel processing. Refer to the appropriate sections for explanations and code examples as follows:
-concurrent
-minconcurrent
-scheduling
#pragma _KAP minconcurrent(<integer>)
#pragma _KAP concurrent
#pragma _KAP concurrent call
#pragma _KAP serial
Parallel pragmas
Worksharing pragmas
Workqueuing pragmas
Combined parallel and worksharing/workqueuing pragmas
Synchronization pragmas
Privatization of global variables pragmas
Two types of command lines, kcc and kapc, invoke KAP C software:
As a programmer, you should always remember that you implement a parallel processing method (automatic, directed, or combination) by making choices from the previous command-line options, directives, and assertions. Your choices affect the following actions:
For example, suppose you choose combination detection and parallelization for source programs openmp.c and no_openmp.c. These programs contain some or none of the parallel processing directives, parallel processing assertions, and OpenMP directives. Consider the following command:
> kcc -ckapargs='-concurrent -minconcurrent=1000' openmp.c no_openmp.c
This command tells KAP to:
KAP C parallel processing options, such as -concurrent, are enclosed in single quotation marks and are values of the -ckapargs option. The kcc driver responds to the options enclosed in these single quotation marks by passing them as arguments to the kapc preprocessor (which actually transforms the source program file).
The default values of the parallel processing options also control KAP C loop detections, loop transformations, calling of the compiler and linker, and run-time scheduling. They are:
-minconcurrent=1000
-scheduling=e
-chunk=1
Read the explanations of each of the three methods of parallelization in light of how your choices of options, directives, and assertions affect KAP C detection of loops, changes to loops, compiler and linker behavior, and run-time behavior of executable file a.out.
3.5 Automatic Parallelization Using the kcc Driver
To compile and run your program with parallel processing, use the -concurrentize switch, abbreviated -conc, as follows:
> kcc -ckapargs='-conc' myprog.c
For information on running a parallel program, see Section 3.10.
3.6 Preprocessing a Program for Parallel Execution Using kapc
To execute KAP as a standalone preprocessor, use the following commands:
cc -P -D__KAP -U_INLINE_INTRINSICS myprog.c
kapc -conc -cmp=myprog_mp.c myprog.i
cc -migrate myprog_mp.c -tune host -call_shared -omp -lkio -fast -pthread
An explanation of the remaining switches follows:
When you use kapc to preprocess a file, you must set the Compaq C compiler and linker switches appropriately. For this reason, Compaq recommends that you use kcc whenever possible, because kcc automatically sets the compiler and linker switches correctly.
Under the directed method, KAP C does not do any automatic parallel detection. As always, any OpenMP directives in the original source program are passed to the C compiler for processing.
Parallelization by means of inserting OpenMP directives is most useful for programs under the following circumstances:
The directed method applies only to for loops with OpenMP directives. Consider a C application with two procedures called "example_1" and "example_2" with the following contents:
int a[1000], b[1000], c[1000], d[1000], n;

void example_1( )
{
    int i;
#pragma omp parallel shared(n, a, b, c, d) private(i)
    {
#pragma omp for nowait
        for (i = 0; i < n; i++) {
            a[i] = b[i] + c[i];
            if (d[i]) {
                a[i] /= d[i];
            }
        }
    }
}

void example_2()
{
    int j;
    for (j = 0; j < n; j++) {
        a[j] = b[j] * c[j];
        d[j] = a[j] / d[j];
    }
}
Compaq KAP passes the OpenMP directives of the for loop with index "i" on to the compiler for processing. KAP C does not parallelize the for loop with index "j". So, "directed" means that any loops not surrounded by OpenMP directives are not parallelized. If instead KAP C were to attempt to transform both for loops, it would be running under the combination method, in which case the procedure "example_2" would have to reside in a C source file that does not contain any OpenMP directives.
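As a hedged illustration of this point, the sketch below shows how the loop in "example_2" might be parallelized under the directed method: the programmer adds the directive by hand. The function name example_2_directed and the zero-divisor guard are assumptions introduced for this sketch and do not appear in the example above.

```c
#define N 1000

int a[N], b[N], c[N], d[N], n;

/* Hypothetical variant of example_2: the programmer, not KAP C,
   marks the loop as parallel. Each iteration touches only
   element j, so the iterations are independent of one another. */
void example_2_directed(void)
{
    int j;

#pragma omp parallel for shared(n, a, b, c, d) private(j)
    for (j = 0; j < n; j++) {
        a[j] = b[j] * c[j];
        if (d[j]) {             /* guard added in this sketch only */
            d[j] = a[j] / d[j];
        }
    }
}
```

A compiler that is not asked to honor OpenMP directives treats the pragma as a no-op, so the function still produces the same results serially.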
An example of how to use KAP to process a program for which no automatic parallelization is desired is given below:
> kcc -ckapargs='-noconc' my_prog.c -omp -pthread -call_shared
This command produces a transformed source program, which the compiler and linker then process to create the executable file a.out. Because of the -noconc switch, KAP C does not automatically set compiler and linker switches related to parallel processing. Therefore, the user must explicitly set the -omp and -pthread compiler switches and the -call_shared linker switch.
3.7.1 Changing Source Programs
Insert OpenMP directives (beginning with #pragma omp) only on loops that are safe to parallelize. When KAP C sees a loop prefaced with OpenMP directives, it does not perform data dependence analysis on that loop and does not prevent you from using a parallel directive incorrectly. The OpenMP directives are described in the Compaq C documentation.
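The distinction between safe and unsafe loops can be sketched as follows. This is a minimal illustration, not taken from the KAP C documentation; the function names running_total and scale are hypothetical.

```c
/* Not safe to mark parallel: iteration i reads x[i - 1], which the
   previous iteration has just written (a loop-carried dependence).
   It is deliberately left serial here. */
void running_total(int *x, int len)
{
    int i;
    for (i = 1; i < len; i++) {
        x[i] = x[i] + x[i - 1];
    }
}

/* Safe to mark parallel: every iteration writes a distinct y[i]
   and reads only data that no iteration modifies. */
void scale(const int *x, int *y, int len, int factor)
{
    int i;

#pragma omp parallel for private(i)
    for (i = 0; i < len; i++) {
        y[i] = x[i] * factor;
    }
}
```

Placing the same directive on the loop in running_total would compile without complaint but could give wrong results, which is why the responsibility for checking dependences rests with the programmer.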
The OpenMP directives (pragmas) are listed below:
#pragma omp parallel
#pragma omp for
#pragma omp sections
#pragma omp single
#pragma omp taskq
#pragma omp task
#pragma omp parallel for
#pragma omp parallel sections
#pragma omp parallel taskq
#pragma omp critical
#pragma omp ordered
#pragma omp master
#pragma omp atomic
#pragma omp flush
#pragma omp barrier
#pragma omp threadprivate
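As an informal sketch of how a few of these directives combine, the code below uses parallel for together with critical and atomic to protect updates to a shared variable. The function names find_max and sum_all are hypothetical, and the sketch assumes a compiler that implements these standard OpenMP directives.

```c
/* Largest element of v[0..len-1]; the caller must pass len >= 1. */
int find_max(const int *v, int len)
{
    int best = v[0];
    int i;

#pragma omp parallel for private(i)
    for (i = 1; i < len; i++) {
        /* Only one thread at a time may compare and update best. */
#pragma omp critical
        {
            if (v[i] > best) {
                best = v[i];
            }
        }
    }
    return best;
}

/* Sum of v[0..len-1]; atomic protects each update of total. */
long sum_all(const int *v, int len)
{
    long total = 0;
    int i;

#pragma omp parallel for private(i)
    for (i = 0; i < len; i++) {
#pragma omp atomic
        total += v[i];
    }
    return total;
}
```

Because every iteration enters the critical section or performs an atomic update, these loops are largely serialized. They show correct use of the synchronization directives rather than a way to gain speed.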
There are no KAP C switches that affect the processing by the compiler
of OpenMP directives inserted by the user.
3.7.3 Directing the Compilation and Linking Process
To parallelize a program containing OpenMP directives, you normally need to give only the kcc command with the -noconc KAP switch, the -omp and -pthread C compiler switches, and the -call_shared linker switch.
An example follows:
> kcc -ckapargs='-noconc' myprog.c -omp -pthread -call_shared
Because of the -noconc switch, KAP C does not automatically set the compiler and linker switches needed for parallelization, so the command line supplies them explicitly.
3.8 Combined Automatic and Directed Parallelization Using the kcc Driver
Parallelization by the combined method is most useful for large programs in which you want to explicitly control the parallelization of some for loops by inserting OpenMP parallel directives while letting KAP C automatically parallelize the remaining loops. The combined method is a merge of the automatic and directed methods. The appropriate command line to use to process a program using the combined method is:
> kcc -ckapargs='-concurrent' openmp.c no_openmp.c
You insert OpenMP directives around those for loops that you want to explicitly parallelize. As mentioned previously, KAP C/OpenMP does not perform automatic parallel decomposition or serial optimizations on files that contain OpenMP directives.
In addition, you can insert guiding assertions, that is, non-OpenMP directives, around loops that you want to help KAP C parallelize automatically. KAP C cannot automatically parallelize loops with data dependencies between loop iterations or loops with calls to external routines. You can help KAP C automatically parallelize these loops by placing parallel processing assertions and parallel processing directives (each beginning with #pragma _KAP) in the source program. These assertions and directives are:
#pragma _KAP concurrent
#pragma _KAP concurrent call
#pragma _KAP serial
#pragma _KAP minconcurrent
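The following is only a rough sketch of how such an assertion might be applied to a loop that calls an external routine. The placement of the pragma immediately before the loop and its exact effect are assumptions here, and the names transform and apply_transform are hypothetical. Consult the section that describes each assertion for the authoritative syntax.

```c
/* Stands in for a routine defined elsewhere, whose body KAP C
   cannot analyze. It has no side effects. */
int transform(int value)
{
    return value * value + 1;
}

void apply_transform(const int *in, int *out, int len)
{
    int i;

    /* Assumed placement: directly before the loop. The assertion is
       presumed to tell KAP C that the call in the loop body is safe
       to execute concurrently. */
#pragma _KAP concurrent call
    for (i = 0; i < len; i++) {
        out[i] = transform(in[i]);
    }
}
```

Compilers that do not recognize a _KAP pragma normally ignore it, so the source remains valid C whether or not it is processed by KAP C.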
Command-line switches you can give to KAP C that affect its transformation of for loops are:
To construct a program for parallel execution via the combined method, you normally need to give only the -concurrent switch to the kcc command as follows:
> kcc -ckapargs='-concurrent' my_prog.c
The -concurrent switch tells KAP to automatically parallelize appropriate for loops within files that do not contain OpenMP directives. The -concurrent switch also sets the compiler and linker switches needed for parallelization. KAP inserts OpenMP directives around loops that it identifies as good candidates for parallelization. The actual parallelization is done by the compiler, which processes both the OpenMP directives inserted automatically by KAP and those inserted by the programmer.
Finally, you may want to create a completely non-parallelized program so you can compare its execution time with the times of programs that are parallelized in various ways (such as the automatic method and the directed method). The following command does this:
> kcc -ckapargs='-noconc' -noomp myprog.c
The -noconc switch prevents automatic parallelization of for loops. The -noomp switch prevents the C compiler from responding to any parallel directive statements in the transformed source file it receives.
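One way to make such a comparison is to place a small wall-clock timing harness around the loop of interest and build the same source both ways. The sketch below is an illustration under stated assumptions: the names wall_seconds, vector_add, and time_vector_add are hypothetical, and it relies on the POSIX gettimeofday call being available on the target system.

```c
#include <stddef.h>
#include <sys/time.h>

/* Current wall-clock time in seconds, as a double. */
double wall_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Hypothetical workload: the kind of loop you would time in
   both the serial and the parallelized build. */
void vector_add(const double *x, const double *y, double *z, int len)
{
    int i;
    for (i = 0; i < len; i++) {
        z[i] = x[i] + y[i];
    }
}

/* Runs the workload `repeats` times and returns the elapsed
   wall-clock time in seconds. */
double time_vector_add(const double *x, const double *y, double *z,
                       int len, int repeats)
{
    double start = wall_seconds();
    int r;
    for (r = 0; r < repeats; r++) {
        vector_add(x, y, z, len);
    }
    return wall_seconds() - start;
}
```

Wall-clock time is used rather than the standard clock function because clock reports processor time, which for a multithreaded process typically accumulates across threads and so can hide any reduction in start-to-finish time.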
3.9 Compiling a Program for Parallel Execution Using kapc
Normally, you use the kcc command with the -conc switch to create an optimized and parallelized executable file. Compaq recommends this command because it sets the compiler and linker switches correctly. To view these switches, include the -v switch with the kcc command.
To compile a program for parallel execution using the kapc command on Tru64 UNIX, issue the following commands:
kapc -conc -cmp=myprog_mp.c myprog.c
cc myprog_mp.c -fast -tune host -automatic -omp -pthread
The kapc command preprocesses myprog.c to produce a new source file, myprog_mp.c, which contains OpenMP directives that KAP C inserted for the loops it selected for automatic parallelization. The file myprog_mp.c is then processed by the compiler and linker to produce a parallelized executable, a.out. Further explanation of the switches used follows:
To run a program parallelized with OpenMP directives, you may want to change the following environment variables:
ps mOpcpu
-addressresolution=4 --- see Section 4.9.1
-fuse --- see Section 4.9.9
-fuselevel=1 --- see Section 4.9.10
-ipa --- see Section 4.5.1
-ipa_from_files=<file>,<file> --- see Section 4.5.7
-ipa_optimize=2 --- see Section 4.5.11