Compaq KAP C/OpenMP
for Tru64 UNIX
User Guide




Chapter 3
KAP Parallel Processing

Compaq KAP C decomposes programs for parallel execution so that they run on symmetric multiprocessor (SMP) systems. This chapter describes how to compile and run a program for parallel execution using the kcc driver and kapc.

3.1 Overview

KAP C transforms C source programs so that, when compiled and linked, they execute as multithreaded processes. These threads can run simultaneously (that is, in parallel) on symmetric multiprocessor systems. The result is a program whose start-to-finish time is typically less than that of a C program that does not execute as a multithreaded process. More specifically, at run time the instructions from for loops in a transformed C program execute in parallel mode. Parallelization is the process that transforms for loops into instructions in an executable file that execute as multithreaded processes.
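As a rough illustration of what a parallelized for loop amounts to, the sketch below marks a loop with independent iterations using a standard OpenMP directive. The function name vector_add is hypothetical, and the code is hand-written rather than actual KAP C output. A compiler invoked without OpenMP support ignores the pragma and runs the loop serially with the same result.

```c
/* Hypothetical example: add two arrays element by element.
 * Each iteration reads and writes only index i, so the iterations
 * are independent and can be divided among threads. */
void vector_add(const int *b, const int *c, int *a, int n)
{
    int i;
#pragma omp parallel for shared(a, b, c, n) private(i)
    for (i = 0; i < n; i++) {
        a[i] = b[i] + c[i];
    }
}
```

Calling vector_add(b, c, a, 1000) on three 1000-element arrays fills a with the element-wise sums, whether the loop runs on one thread or several.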

KAP C considers all for loops in a program as candidates for parallelization. Each loop is or is not parallelized according to:

This chapter describes the three basic methods of controlling parallel processing (automatic, directed, and combination). It explains, for each method, how to:

3.2 Parallel Processing Methods

Compaq KAP C provides three methods for programmers to control parallel processing. Their summaries follow:

Note

KAP C will not perform automatic parallel decomposition or serial optimization on files that contain OpenMP directives.

When using any of these three methods, you must be aware of the values of certain environment variables, because they affect the run-time behavior of your program. See Section 3.10, Running a Parallelized Program.
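Because run-time behavior depends on the environment, it can help to have a program report the relevant settings before it starts parallel work. The sketch below is one possible way to do that in C. OMP_NUM_THREADS is the standard OpenMP thread-count variable and is used here only as an assumed example; the helper names env_or_default and report_thread_setting are hypothetical. Section 3.10 covers the variables this guide has in mind.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return the value of an environment variable, or fallback if it is
 * unset or empty. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return (value != NULL && value[0] != '\0') ? value : fallback;
}

/* Print the thread-count setting so that a run can be reproduced later.
 * OMP_NUM_THREADS is assumed here as an example variable. */
void report_thread_setting(void)
{
    printf("OMP_NUM_THREADS=%s\n",
           env_or_default("OMP_NUM_THREADS", "(unset)"));
}
```

Calling report_thread_setting() at the top of main records the setting alongside the program's other output.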

3.3 Summary of Parallel Processing Controls

Compaq KAP C provides the following parallel command-line switches, directives, and assertions for use with automatic parallel processing. Refer to the appropriate sections for explanations and code examples as follows:

Two types of command lines, kcc and kapc, invoke KAP C software:

3.4 Interaction of Parallel Processing Controls

As a programmer, you should always remember that you implement a parallel processing method (automatic, directed, or combination) by making choices from the previous command-line options, directives, and assertions. Your choices affect the following actions:

For example, suppose you choose combination detection and parallelization for source programs openmp.c and no_openmp.c. These programs contain some or none of the parallel processing directives, parallel processing assertions, and OpenMP directives. Consider the following command:


> kcc -ckapargs='-concurrent -minconcurrent=1000' openmp.c no_openmp.c 

This command tells KAP to:

KAP C parallel processing options, such as -concurrent, are enclosed in single quotation marks and are values of the -ckapargs option. The kcc driver responds to the options enclosed in these single quotation marks by passing them as arguments to the kapc preprocessor (which actually transforms the source program file).

The default values of the parallel processing options also control KAP C loop detections, loop transformations, calling of the compiler and linker, and run-time scheduling. They are:


-minconcurrent=1000 
-scheduling=e 
-chunk=1 
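The idea of a minimum-work threshold can be sketched with the standard OpenMP if clause. This is an illustration only: it assumes that -minconcurrent sets an amount of work below which a loop is left serial, and the exact work measure and comparison KAP C uses are not stated here. The names WORK_THRESHOLD, worth_parallelizing, and scale are hypothetical.

```c
/* Assumed threshold, mirroring the default -minconcurrent=1000. */
#define WORK_THRESHOLD 1000

/* Hypothetical helper: estimate whether a loop has enough work to
 * justify the overhead of starting threads. */
int worth_parallelizing(int iterations, int work_per_iteration)
{
    return (long)iterations * work_per_iteration >= WORK_THRESHOLD;
}

/* Scale every element of x. The loop runs in parallel only when the
 * estimated work reaches the threshold; otherwise it runs serially. */
void scale(double *x, int n, double factor)
{
    int i;
#pragma omp parallel for if(worth_parallelizing(n, 1))
    for (i = 0; i < n; i++) {
        x[i] *= factor;
    }
}
```

With this sketch, a 10-element array is scaled serially and a 2000-element array is eligible for parallel execution; the results are identical in both cases.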

Read the explanations of each of the three methods of parallelization in light of how your choices of options, directives, and assertions affect KAP C detection of loops, changes to loops, compiler and linker behavior, and run-time behavior of executable file a.out.

3.5 Automatic Parallelization Using the kcc Driver

To compile and run your program with parallel processing, use the -concurrentize switch, abbreviated -conc, as follows:


> kcc -ckapargs='-conc' myprog.c 

For information on running a parallel program, see Section 3.10.

3.6 Preprocessing a Program for Parallel Execution Using kapc

To execute KAP as a standalone preprocessor, use the following commands:


cc -P -D__KAP -U_INLINE_INTRINSICS myprog.c

kapc -conc -cmp=myprog_mp.c myprog.i

cc -migrate myprog_mp.c -tune host -call_shared -omp -lkio -fast -pthread

An explanation of the remaining switches follows:

Note

When you use kapc to preprocess a file, you must set the Compaq C compiler and linker switches appropriately. For this reason, Compaq recommends that you use kcc whenever possible, because kcc automatically sets the compiler and linker switches correctly.

3.7 Directed Parallelization Using the kcc Driver and OpenMP Directives

Under the directed method, KAP C does not do any automatic parallel detection. As always, any OpenMP directives in the original source program are passed to the C compiler for processing.

Parallelization by means of inserting OpenMP directives is most useful for programs under the following circumstances:

The directed method applies only to for loops with OpenMP directives. Consider a C application with two procedures called "example_1" and "example_2" with the following contents:


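int a[1000], b[1000], c[1000], d[1000], n;

/* Parallelized: the for loop is enclosed in OpenMP directives. */
void example_1(void)
{
    int i;
#pragma omp parallel shared(n, a, b, c, d) private(i)
    {
#pragma omp for nowait
        for (i = 0; i < n; i++) {
            a[i] = b[i] + c[i];
            if (d[i]) {
                a[i] /= d[i];
            }
        }
    }
}

/* Not parallelized under the directed method: no OpenMP directives. */
void example_2(void)
{
    int j;

    for (j = 0; j < n; j++) {
        a[j] = b[j] * c[j];
        d[j] = a[j] / d[j];    /* assumes d[j] is nonzero */
    }
}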

Compaq KAP passes the OpenMP directives of the for loop with index "i" on to the compiler for processing. KAP C does not parallelize the for loop with index "j". So, "directed" means that any loops not surrounded with OpenMP directive statements are not parallelized. For KAP C to transform both for loops, you would have to use the combination method, and the procedure "example_2" would have to reside in a C source file that does not contain any OpenMP directives.

An example of how to use KAP to process a program for which no automatic parallelization is desired is given below:


 
> kcc -ckapargs='-noconc' my_prog.c -omp -pthread -call_shared 

The results include a transformed source program and its processing by the compiler and linker to create executable file a.out. Because of the -noconc switch, KAP C does not automatically set compiler and linker switches related to parallel processing. Therefore, you must explicitly set the -omp and -pthread compiler switches and the -call_shared linker switch.

3.7.1 Changing Source Programs

Insert OpenMP directives (beginning with #pragma omp) only around loops that are safe to parallelize. When KAP C sees a loop prefaced with OpenMP directives, it does not perform data dependence analysis on that loop and does not prevent you from using a parallel directive incorrectly. The OpenMP directives are described in the Compaq C documentation.
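To make "safe to parallelize" concrete, the sketch below contrasts a loop whose iterations are independent with one that carries a dependence from each iteration to the next. Both function names are hypothetical, and the example is an illustration rather than output from KAP C.

```c
/* Safe: each iteration writes only a[i] and reads only b[i], so the
 * iterations can run in any order or at the same time. */
void square_all(const int *b, int *a, int n)
{
    int i;
#pragma omp parallel for
    for (i = 0; i < n; i++) {
        a[i] = b[i] * b[i];
    }
}

/* Not safe to parallelize as written: iteration i reads a[i - 1],
 * which the previous iteration wrote (a loop-carried dependence).
 * The loop is deliberately left without an OpenMP directive. */
void running_sum(int *a, int n)
{
    int i;
    for (i = 1; i < n; i++) {
        a[i] += a[i - 1];
    }
}
```

Placing an OpenMP directive on the loop in running_sum would compile without complaint but could produce wrong sums, because nothing checks the dependence for you.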

The OpenMP directives (pragmas) are listed below:

3.7.2 Giving Command-Line Switches

There are no KAP C switches that affect the processing by the compiler of OpenMP directives inserted by the user.

3.7.3 Directing the Compilation and Linking Process

To parallelize a program containing OpenMP directives, you normally need to give only the kcc command with the -noconc KAP switch, the -omp and -pthread C compiler switches, and -call_shared linker switch.

An example follows:


> kcc -ckapargs='-noconc' myprog.c -omp -pthread -call_shared 

Because of the -noconc switch, KAP C does not automatically set the compiler and linker switches needed for parallelization. The command above supplies them explicitly.

3.8 Combined Automatic and Directed Parallelization Using the kcc Driver

Parallelization by the combined method is most useful for large programs in which you want to explicitly control the parallelization of some for loops by inserting OpenMP parallel directives while letting KAP C automatically parallelize the remaining loops. The combined method merges the automatic and directed methods. To process a program using the combined method, use the following command line:


> kcc -ckapargs='-concurrent' openmp.c no_openmp.c 

3.8.1 Changing Source Programs

You insert OpenMP directives around those for loops that you want to explicitly parallelize. As mentioned previously, KAP C/OpenMP does not perform automatic parallel decomposition or serial optimizations on files that contain OpenMP directives.

In addition, you can insert guiding assertions (that is, non-OpenMP directives) around loops to help KAP C parallelize them automatically. KAP C cannot automatically parallelize loops with data dependencies between loop iterations or loops with calls to external routines. You can help KAP C automatically parallelize these loops by placing parallel processing assertions and parallel processing directives (each beginning with #pragma _KAP) in the source program. These assertions and directives are:


#pragma _KAP concurrent 
#pragma _KAP concurrent call 
#pragma _KAP serial 
#pragma _KAP minconcurrent 
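The following sketch shows how one of these assertions might be applied to a loop that calls a routine. It is an illustration under assumptions: placing the pragma directly before the loop is assumed rather than confirmed here, and the routine clamp_to_byte is hypothetical. In a real program the routine would live in another source file; it is defined locally here only so the sketch is self-contained. Compilers other than KAP C ignore the unrecognized pragma.

```c
/* Hypothetical routine with no side effects on shared data, so calls
 * from different iterations do not interfere with one another. */
static int clamp_to_byte(int v)
{
    if (v < 0) {
        return 0;
    }
    if (v > 255) {
        return 255;
    }
    return v;
}

void clamp_all(int *pixels, int n)
{
    int i;
    /* Assertion that the call inside the loop is safe to execute
     * concurrently. Placement before the loop is an assumption. */
#pragma _KAP concurrent call
    for (i = 0; i < n; i++) {
        pixels[i] = clamp_to_byte(pixels[i]);
    }
}
```

The assertion is a promise made by the programmer. If the called routine modified shared state, asserting that it is safe could lead to incorrect results.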

3.8.2 Giving Command-Line Switches

Command-line switches you can give to KAP C that affect its transformation of for loops are:

3.8.3 Directing the Compilation and Linking Process

To construct a program for parallel execution via the combined method, you normally need to give only the -concurrent switch to the kcc command as follows:


> kcc -ckapargs='-concurrent' my_prog.c 

The -concurrent switch tells KAP to automatically parallelize appropriate for loops within files that do not contain OpenMP directives. The -concurrent switch also sets the compiler and linker switches needed for parallelization. KAP inserts OpenMP directives around loops that it automatically detects are good candidates for parallelization. The actual parallelization is done by the compiler which processes the OpenMP directives inserted automatically by KAP and the OpenMP directives inserted by the programmer.

Finally, you may want to create a completely non-parallelized program so you can compare its execution time with the times of programs that are parallelized in various ways (such as the automatic method and the directed method). The following command does this:


> kcc -ckapargs='-noconc' -noomp myprog.c 

The -noconc switch prevents automatic parallelization of for loops. The -noomp switch prevents the C compiler from responding to any parallel directive statements in the transformed source file it receives.
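One way to make such a comparison is to time the same region of code in each build. The sketch below measures wall-clock time with the POSIX gettimeofday call; wall-clock time is the relevant measure because CPU time is summed across threads. The names wall_seconds, time_region, and sample_work are hypothetical, and the workload is only a placeholder for your own loops.

```c
#include <stddef.h>
#include <sys/time.h>

#define N 100000
static double data[N];

/* Wall-clock time in seconds, from POSIX gettimeofday. */
static double wall_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Placeholder workload: a loop with independent iterations. */
void sample_work(void)
{
    int i;
#pragma omp parallel for
    for (i = 0; i < N; i++) {
        data[i] = i * 0.5;
    }
}

/* Run work() once and return the elapsed wall-clock seconds. */
double time_region(void (*work)(void))
{
    double start = wall_seconds();
    work();
    return wall_seconds() - start;
}
```

Build the program once with parallelization and once without, call time_region(sample_work) in each, and compare the two figures. Repeating each measurement several times helps smooth out run-to-run variation.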

3.9 Compiling a Program for Parallel Execution Using kapc

Note

Normally, you use the kcc command with the -conc switch to create an optimized and parallelized executable file. Compaq recommends this command because it sets the compiler and linker switches correctly.

To view these switches, include the -v switch with the kcc command.

To compile a program for parallel execution using the kapc command on Tru64 UNIX, issue the following commands:


kapc -conc -cmp=myprog_mp.c myprog.c 
 
cc myprog_mp.c -fast -tune host -automatic -omp -pthread 

The kapc command preprocesses myprog.c to produce a new source file, myprog_mp.c, which contains OpenMP directives inserted by KAP C for loops KAP C has selected for automatic parallelization. The file myprog_mp.c is then processed by the compiler and linker to produce a parallelized executable, a.out. Further explanation of the switches used follows:

3.10 Running a Parallelized Program

To run a program parallelized with OpenMP directives, you may want to change the following environment variables:

3.11 Parallel Programming Tips

