This chapter describes the commands necessary to run KAP on Tru64 UNIX systems.
Compaq KAP Fortran/OpenMP can be run in either of two modes:
The following provides information and restrictions:
DO 100 x=0.0, 10.0, 0.2
KAP is installed on the Tru64 UNIX system with the system command setld. See the Compaq KAP Fortran/OpenMP for Tru64 UNIX Installation Guide for details.
2.3 Compiling a Program Using the kf90 Driver
The kf90 command invokes a driver program that automatically calls KAP, the Compaq Fortran compiler, and the linker.
2.3.1 Passing Default KAP Switch Settings to kf90
Because kf90 calls KAP and the Compaq Fortran compiler, you can substitute the kf90 command for the Fortran 90 command. For example, to use kf90 to compile myprog.f90 with the default KAP switch settings, use the command:
kf90 myprog.f90
The kf90 command uses the KAP preprocessor on myprog.f90, compiles the result with the Compaq Fortran compiler, links the object code into an executable image, and produces the following files:
To see a list of the KAP switches and Compaq Fortran compiler switches passed by kf90, use the -v switch as follows:
kf90 myprog.f90 -v
An example of the output is:
oursmp> kf90 -v matmul.f timing_fortran.o
/usr/bin/kapf90 -cmp=./matmul.cmp.f matmul.f -tune=EV4 -nofree
KAP/Tru64_U_F90 4.4 k340504 20010517 20-Aug-2001 09:27:21
KAP/Tru64_U_F90 4.4 k340504 20010517 : 0 errors in file matmul.f
/usr/bin/f90 -fast -v ./matmul.cmp.f timing_fortran.o -tune host -non_shared
/usr/lib/cmplrs/fort90/decfort90 -platinum -fast -tune host -non_shared -I/usr/lib/cmplrs/hpfrtl -o /tmp/forAAAaaupma.o ./matmul.cmp.f
/usr/bin/cc -fast -v -tune host -non_shared /usr/lib/cmplrs/fort90/for_main.o /tmp/forAAAaaupma.o -O4 timing_fortran.o -qlshpf -lUfor -lfor -lFutil -lm -lots -qlc_r
/usr/lib/cmplrs/cc/ld -g0 -O4 -non_shared /usr/lib/cmplrs/cc/crt0.o /usr/lib/cmplrs/fort90/for_main.o /tmp/forAAAaaupma.o timing_fortran.o -qlshpf -qlc_r -lUfor -lfor -lFutil -lm -lots -lc
/usr/lib/cmplrs/cc/ld:
0.45u 0.39s 0:06 13% 0+28k 305+173io 14pf+0w 28stk+4752mem
The final resource usage appears in C-shell format at the end.
2.4 Passing KAP Switches to kf90
The -fkapargs switch passes one or more KAP command-line switches to the preprocessor. For example, to use kf90 to optimize and compile the file myprog.f90 using KAP switches for general optimization, use the command:
kf90 -fkapargs='-roundoff=3 -scalaropt=3 -list=myprog_annotated.lis' myprog.f90
For descriptions of all KAP command-line switches, see Chapter 4.
2.4.1 Passing Compaq Fortran Compiler Switches to kf90
Any command-line switch that is valid for the Compaq Fortran compiler or the linker is valid for the kf90 command. You can specify compiler switches and KAP switches on the same line. For example, to optimize and compile the file myprog.f90 using KAP switches for general optimization and to specify the name of the executable file with the Compaq Fortran compiler switch -o, use the following command:
kf90 -fkapargs='-optimize=5 -roundoff=3 -scalaropt=3' -o=myprog.exe myprog.f90
The kf90 command specifies the Compaq Fortran compiler switches -notransform_loops, -lpthread, and -fast by default. To override any of the individual compiler switches encompassed by -fast, specify them on the kf90 command line. For example, the following command sets the compiler switch -math_library accurate and overrides the default -math_library noaccurate set by -fast:
kf90 -math_library accurate myprog.f90
For information about the -fast compiler switch, see the Compaq Fortran User Manual for Tru64 UNIX and Linux Alpha Systems.
2.4.2 Additional Information About Using the kf90 Driver
The kf90 command sets the compiler switch -tune host by default. The -tune host switch causes the compiler to optimize to the host architecture. For example, if you want to optimize for the ev5 architecture but are compiling on an ev4 system, you should override the default setting of the -tune switch, as follows:
kf90 -tune ev5 myprog.f90
The kf90 command specifies the linker switches -lpthread and -non_shared by default. The -non_shared switch causes the image to be linked with archive libraries instead of with shared libraries. To override the -non_shared default, specify -call_shared on the command line, for example:
kf90 -call_shared myprog.f90
The kf90 driver accepts either Fortran 90 or Fortran 77 source input.
Like the f90 command, the kf90 command assumes by default that source files with an extension of .f90 are free format, and source files with an extension of .f, .for, or .FOR are fixed format. You can override these defaults by using a format-related switch with either the KAP preprocessor or the Compaq Fortran compiler. The format-related compiler switches are -free and -fixed. The corresponding KAP preprocessor switches are -freeformat and -nofreeformat.
Table 2-1 lists combinations of switches and file extensions and the resulting assumption KAP makes about the format of the source file.
KAP Switch | F90 Switch | Format Assumed for .f90 | Format Assumed for .f, .for, .FOR |
---|---|---|---|
default | default | free | fixed |
-freeformat | default | free | free |
-nofreeformat | default | fixed | fixed |
default | -free | free | free |
default | -fixed | fixed | fixed |
-freeformat | -fixed | KAP issues error message | KAP issues error message |
-nofreeformat | -free | KAP issues error message | KAP issues error message |
For more information about the -[no]freeformat switch, see Section 4.5.7. For more information about the -free and -fixed switches, see the Compaq Fortran User Manual for Tru64 UNIX and Linux Alpha Systems.
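The lookup in Table 2-1 can be sketched as a small shell function. This is an illustration only: the function name kap_source_format and its argument order are hypothetical, and the handling of combinations that the table does not list (for example, -freeformat together with -free) is an assumption that both switches agree.

```sh
#!/bin/sh
# Sketch of the Table 2-1 lookup (hypothetical helper, not part of KAP).
# usage: kap_source_format KAP_SWITCH F90_SWITCH FILE
#   KAP_SWITCH: default | -freeformat | -nofreeformat
#   F90_SWITCH: default | -free | -fixed
kap_source_format() {
    kap=$1
    f90=$2
    file=$3

    # Conflicting requests: the table says KAP issues an error message.
    case "$kap/$f90" in
        -freeformat/-fixed|-nofreeformat/-free)
            printf '%s\n' "error"
            return 1 ;;
    esac

    # An explicit switch overrides the file extension.
    # Assumption: matching pairs not listed in the table behave like either switch alone.
    case "$kap" in
        -freeformat)   printf '%s\n' "free";  return 0 ;;
        -nofreeformat) printf '%s\n' "fixed"; return 0 ;;
    esac
    case "$f90" in
        -free)  printf '%s\n' "free";  return 0 ;;
        -fixed) printf '%s\n' "fixed"; return 0 ;;
    esac

    # No format switch given: fall back to the extension defaults.
    case "$file" in
        *.f90)           printf '%s\n' "free" ;;
        *.f|*.for|*.FOR) printf '%s\n' "fixed" ;;
        *)               printf '%s\n' "unknown"; return 2 ;;
    esac
}

kap_source_format default default myprog.f90
kap_source_format -freeformat default myprog.for
```

Checking the switch conflict first mirrors the last two rows of the table, where the result does not depend on the file extension.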
2.5 Compiling a Program Containing C Preprocessor Directives Using kf90
If your Fortran program contains C preprocessor directives and you do not want to use any additional C preprocessor directives in the kf90 command line, use the Compaq Fortran compiler switch -cpp , as follows:
kf90 -cpp myprog.f90
The -cpp switch causes the C preprocessor to run on your Fortran program before compilation.
If you want to use C preprocessor directives in the kf90 command line, you must also include the C preprocessor switch -C to avoid errors resulting from C comment lines. For example, in the following kf90 command line, where -Dfoo is a C preprocessor switch, you must include -C:
kf90 -cpp -C -Dfoo myprog.f90
The kf90 driver does not set the -C switch when you use C preprocessor directives in the command line.
2.6 Optimized Programs
The kf90 command saves the optimized version of your source program in the current directory for use in debugging and profiling. The default file extension of the optimized source depends on the input file extension, as follows:
File Extension of Input | File Extension of Transformed Source |
---|---|
.f90 | .cmp.f90 |
.f, .for, .FOR | .cmp.f |
The Compaq Fortran compiler uses the file extension of the optimized source file to determine the source format. Compaq Fortran assumes sources with a file extension of .f90 are free format and sources with a file extension of .f, .for, or .FOR are fixed format. You can override the defaults by using the Compaq Fortran compiler switches -free and -fixed.
You can override the naming of the optimized program by using the -cmp switch. See the -cmp description in Section 4.9.1.
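As a rough illustration of the default naming rule in the extension table above, the following shell sketch derives the name of the optimized source file from an input file name. The helper name kf90_cmp_name is hypothetical and the function only manipulates strings; it does not call kf90 and does not model the -cmp override.

```sh
#!/bin/sh
# Hypothetical helper: print the default optimized-source name for an input file.
# usage: kf90_cmp_name FILE
kf90_cmp_name() {
    case "$1" in
        # .f90 input keeps a free-format extension.
        *.f90)           printf '%s\n' "${1%.f90}.cmp.f90" ;;
        # .f, .for, and .FOR inputs all map to .cmp.f.
        *.f|*.for|*.FOR) printf '%s\n' "${1%.*}.cmp.f" ;;
        # Anything else is outside the table; report it instead of guessing.
        *)               printf '%s\n' "unsupported extension: $1" >&2
                         return 1 ;;
    esac
}

kf90_cmp_name myprog.f90
kf90_cmp_name matmul.f
```

A script that profiles or debugs the transformed source could use such a helper to locate the file that kf90 leaves in the current directory.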
2.7 KAP Command-Line Switches Determined by Compiler Switches
Some Compaq Fortran compiler switches automatically set KAP command switches or alter the default KAP switch settings.
Explicitly specifying the compiler switch -assume=accuracy causes KAP to be called with -roundoff=0. Otherwise, the KAP command-line switch -roundoff defaults to -roundoff=3.
Explicitly specifying the compiler switch -nof77 causes KAP to be called with the -onetrip command-line switch.
Explicitly specifying the compiler switch -noi4 causes KAP to be called with the command-line switches -integer=2 and -logical=2; otherwise, the defaults are -integer=4 and -logical=4.
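The three rules in this section can be summarized in a short shell sketch. The function name implied_kap_switches is hypothetical; it only reproduces the mapping stated above and says nothing about other switches that kf90 may pass to KAP.

```sh
#!/bin/sh
# Hypothetical helper: given Compaq Fortran compiler switches, print the KAP
# switch settings that this section says result from them.
# usage: implied_kap_switches COMPILER_SWITCH...
implied_kap_switches() {
    roundoff="-roundoff=3"            # default stated in the text
    sizes="-integer=4 -logical=4"     # defaults stated in the text
    onetrip=""                        # -onetrip is set only by -nof77

    for sw in "$@"; do
        case "$sw" in
            -assume=accuracy) roundoff="-roundoff=0" ;;
            -nof77)           onetrip="-onetrip" ;;
            -noi4)            sizes="-integer=2 -logical=2" ;;
        esac
    done

    # ${onetrip:+ ...} adds the switch and its separator only when it is set.
    printf '%s\n' "$roundoff${onetrip:+ $onetrip} $sizes"
}

implied_kap_switches -nof77 -noi4
```

Compiler switches that do not appear in this section are ignored by the sketch, so they leave the printed defaults unchanged.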
2.8 Compiling a Program Using kapf90
Use the following command to execute KAP as a standalone preprocessor:
kapf90 [kap_switch_string] myprog.f90 -cmp=myprog.cmp.f90 -freeformat
The kapf90 command assumes that the source file input is fixed format by default. Use the Compaq KAP Fortran/OpenMP -freeformat switch to cause KAP to treat source files as free format, as shown in the previous code example. For more information about the -freeformat switch, see Section 4.5.7.
After preprocessing your program, give myprog.cmp.f90 to the Compaq Fortran compiler, as follows:
f90 -fast -tune host -non_shared myprog.cmp.f90
When you use kapf90 to process a file, you must set the Compaq Fortran compiler and linker switches appropriately. For this reason, Compaq recommends that you use kf90 whenever possible, because kf90 automatically sets the compiler and linker switches correctly.
If a Fortran 90 program contains C preprocessor directives, preprocess it with cpp before you process it with kapf90. For example, if your program has C include statements, process it as follows:
cpp -P myprog.f > myprog.i
kapf90 myprog.i -cmp=myprog.f90
f90 myprog.f90
Specify switches in lowercase with the syntax -switch[=value]. Do not leave spaces between the switch name and the value. Switches can appear before or after the input file as follows:
kapf90 -inm myprog.f90 -roundoff=2 -freeformat
KAP recognizes standard abbreviations for switches. Switches that take a list of names must have the names separated by commas and with no spaces, for example:
-inff=besl.f90,util.f90
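When a script assembles such a switch from a list of file names, it must avoid any spaces between the names. A minimal sketch, assuming a hypothetical helper named kap_list_switch:

```sh
#!/bin/sh
# Hypothetical helper: build a KAP switch of the form -name=a,b,c
# from a switch name and a list of values, with no spaces in the list.
# usage: kap_list_switch NAME VALUE...
kap_list_switch() {
    name=$1
    shift
    list=""
    for item in "$@"; do
        # ${list:+$list,} expands to "<list>," only when list is non-empty,
        # so no leading or trailing comma is produced.
        list="${list:+$list,}$item"
    done
    printf '%s\n' "-$name=$list"
}

kap_list_switch inff besl.f90 util.f90
```

The call shown prints the same switch as the example above.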
When you pass KAP command-line switches through kf90 with the -fkapargs switch, enclose them in single quotation marks, as follows:
kf90 -fkapargs='-optimize=5 -roundoff=3 -scalaropt=3' -w myprog.f90
Compaq Fortran compiler switches, for example, -w, do not require quotation marks.
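The reason for the quotation marks is ordinary shell word splitting. The following sketch uses a hypothetical stand-in function, show_args, in place of kf90 so that it runs anywhere; it prints each argument it receives on a separate line.

```sh
#!/bin/sh
# Hypothetical stand-in for kf90: print every argument in brackets, one per line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Quoted: the whole KAP switch string arrives as a single argument.
show_args -fkapargs='-optimize=5 -roundoff=3 -scalaropt=3' -w myprog.f90

echo "---"

# Unquoted: the shell splits at the spaces, so -roundoff=3 and -scalaropt=3
# arrive as separate arguments that are no longer part of -fkapargs.
show_args -fkapargs=-optimize=5 -roundoff=3 -scalaropt=3 -w myprog.f90
```

The first call receives three arguments and the second receives five, which is why the KAP switch string has to be quoted as a unit.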
2.11 Using File Naming Conventions
Any input file name is valid. If the file name does not have an extension, the extension .f90 is assumed. As KAP processes a Fortran 90 file, it generates three output files: the optimized program file, the optional listing file, and the executable file.
The default output file names are as follows:
<file>.cmp.f90 --- the optimized Fortran 90 program from the kf90 driver
<file>.cmp --- the optimized Fortran 90 program from kapf90
<file>.out --- the annotated KAP listing file
a.out --- the executable file
Other output file names can be specified with the -cmp and -list switches.
When KAP detects an error condition, KAP writes a message to standard error.
2.12 Guidelines for Optimizing With KAP
This section describes how you can get maximum performance in your application programs in the shortest time.
This information can be used with both multiprocessor and single-processor systems, and with both Fortran and C versions of all KAP products. Therefore, the information may contain references to command-line switches or settings that are unavailable or that are different from those in the KAP that you are using.
This section provides separate protocols for small and large programs. Small programs are defined as those that can be compiled and run quickly. Because the cost of each iteration is small, you can take risks. The information presented here further assumes that small programs have a small number of program units.
Large programs are defined as those that take more time to compile and run than it takes for you to check the results. A program can be large either because the source code is very large or because the execution time is long.
2.12.1 Optimizing Small Programs with KAP
Follow these guidelines to optimize small programs:
Follow these guidelines to optimize large programs:
After you have used the KAP protocol for either small or large programs, you can find ways to fine-tune KAP to fit your application.
This section helps you identify the KAP command-line switches, directives, or assertions that might improve KAP performance for a particular application program. It lists common goals and program situations that KAP users encounter and suggests possible improvements for each.
Remember that KAP is a tool to optimize Compaq Fortran code. Like any tool, it performs best when you are familiar with the details of how it works and are able to use its switches correctly and advantageously.
Although KAP default switch settings will achieve performance improvement, you can often achieve greater improvement if you understand and use alternate switch settings. Moreover, you can often insert directives or assertions to achieve improved performance.
See Table 2-2 for user actions and specific goals.
Goal | User Action |
---|---|
Have a more informative listing to help answer your questions. | Use -lo=otkl or other listing switches under -listoptions command-line switch. |
Recognize more reductions. | Increase -roundoff switch setting. |
Answer a KAP generated question. | Use appropriate assertion. |
Eliminate unnecessary last-value assignment. | Use !*$* assert no last value needed or -assume without the l switch; or try -save=manual. |
Spend less time optimizing deeply nested loops. | Reduce -limit and -arclimit or their directives. |
Disable inner loop unrolling. | Use -unroll=1 or -scalaropt < 2. |
Disable outer loop unrolling. | Use -roundoff < 3 or -scalaropt < 3. |
Prevent a given loop from being optimized. | Use !*$* assert do (serial), !*$* assert do prefer (serial), !*$* noconcurrent, or !*$* optimize (0). (Remember to reenable optimization after the serial loop.) |
Disable some data dependence checking. | Use !*$* assert no recurrence for one loop nest. |
Expand (inline) subroutine calls within DO loops. | Use -inline, -inline_from_files, or -inline_create and -inline_from_libraries. Or, if the goal is to execute the subroutine body concurrently, try -ipa or !*$* assert concurrent call. |
Inline more routines. | Increase -inline_depth and -inline_looplevel. (See also the !*$* inline directive.) |
Turn off directives and assertions. | Use the -nodirectives switch. |
Process a program that uses intentional array bounds violation. | Use !*$* assert bounds violations. |
Use STATIC storage. | Insert SAVE statements or use -save=all_adjust. |
After you have successfully run KAP on a working program by using either the protocol for small programs or that for large programs, you can try the following procedures to find additional opportunities for optimization within your program:
-fuse --- see Section 4.7.9
-fuselevel=1 --- see Section 4.7.10
-ipa --- see Section 4.6.1
-ipa_from_files=<file>,<file> --- see Section 4.6.7
-ipa_optimize=2 --- see Section 4.6.11
The following are some problems you may encounter when using KAP and possible fixes and workarounds: