Alpha migration tools

 

FreePort Express
User's Guide


Chapter 4: Enhancing Performance

This chapter discusses various ways to enhance the performance of translated executables.


Enhancing performance by repeating translation

In many cases, you can improve translated performance by retranslating the original SunOS executable. The first time fpx translates an executable or a shared library, it attempts to find all the entry points. But at run time, fpxr may encounter entry points that fpx did not detect. When this occurs, fpxr interprets the newly discovered code and records information about each new entry point it finds. If you set the environment variable FPXR_GENERATE_FEEDBACK when you run an executable, fpxr writes the information to a feedback file, called executable.hif by default. The performance of the translated executable improves after you retranslate with the feedback file because fpx knows where to find the previously undiscovered code.

Also, branches and subroutine calls whose targets may vary at run time need to call into fpxr to locate their targets. The run-time performance of executables using this behavior can be improved substantially by using feedback. This is especially true of programs that use X11 services and programs written in C++.

Determining if you should retranslate

Most programs benefit somewhat from retranslating with feedback. If, upon program exit, fpxr prints a message like this one, you can improve performance substantially:

Total instructions emulated: nn

Run the program again with the FPXR_GENERATE_FEEDBACK environment variable set, and follow the instructions in the section Creating and using a feedback file.

If you set the FPXR_GENERATE_FEEDBACK environment variable before running a translated executable, fpxr may print a message like the following:

nn lines written to .hif file...

If such a message does occur at run time, you can improve performance by retranslating the file.
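The check described in this section can be automated if you capture what fpxr prints. The following is a minimal sketch in Python, not part of FreePort Express. The regular expressions are modeled on the two messages quoted above; the exact spacing and punctuation that fpxr uses are assumptions, and should_retranslate is a hypothetical helper name.

```python
import re

# Patterns modeled on the two messages quoted in this section.
# The exact spacing and punctuation fpxr uses are assumptions.
EMULATED_RE = re.compile(r"Total instructions emulated:\s*\d+")
HIF_LINES_RE = re.compile(r"\d+\s+lines written to \.hif file")


def should_retranslate(fpxr_output: str) -> bool:
    """Return True if captured fpxr output contains either message."""
    return bool(
        EMULATED_RE.search(fpxr_output) or HIF_LINES_RE.search(fpxr_output)
    )
```

Pass the function the text captured from a run of the translated executable; a True result suggests that retranslating with a feedback file is worthwhile.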

Creating and using a feedback file

Figure 4-1 illustrates the process for creating a feedback file and retranslating the executable until no untranslated code is detected at run time.
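The loop that the figure describes can be sketched in Python as follows. This chapter does not show the fpx command-line syntax, so the sketch leaves both steps to caller-supplied functions: translate and run_with_feedback are hypothetical wrappers that you would write around fpx and the translated executable. The max_rounds limit is also an assumption, added only so that the loop always terminates.

```python
from typing import Callable, Optional


def retranslate_until_clean(
    translate: Callable[[Optional[str]], None],
    run_with_feedback: Callable[[], Optional[str]],
    max_rounds: int = 5,
) -> int:
    """Repeat translate/run until a run reports no new feedback.

    translate(feedback_path) is a hypothetical wrapper around fpx; it
    receives None on the first round and a feedback file path afterward.
    run_with_feedback() is a hypothetical wrapper that runs the translated
    executable with FPXR_GENERATE_FEEDBACK set and returns the feedback
    file path if fpxr wrote new entries, or None otherwise.
    Returns the number of translations performed.
    """
    feedback = None
    for round_number in range(1, max_rounds + 1):
        translate(feedback)
        feedback = run_with_feedback()
        if feedback is None:
            # No untranslated code was detected during this run.
            return round_number
    return max_rounds
```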

Figure 4-1 Using feedback files to improve performance



Correcting for nonfinite numbers and SIGFPEs

If the translated executable aborts at run time with messages about floating-point exceptions (SIGFPEs), or issues a message that says the executable was performing calculations with nonfinite numbers, retranslate the program with the -full_fp option in the fpx command line. Nonfinite numbers are infinities, denormalized numbers, and NaNs (not-a-numbers). Using the -full_fp option enables Alpha to handle the nonfinite calculations so the results are completion safe and conform to the IEEE standard for nonfinite numbers, IEEE/ANSI 754-1985.
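To make the three kinds of nonfinite numbers concrete, here is a small Python sketch that classifies a double-precision value. It is an illustration only and is not related to how fpx or fpxr detect these values; classify is a hypothetical helper name.

```python
import math
import sys


def classify(x: float) -> str:
    """Classify a double as 'nan', 'infinity', 'denormalized', or 'finite'."""
    if math.isnan(x):
        return "nan"
    if math.isinf(x):
        return "infinity"
    # Denormalized values are nonzero but smaller in magnitude than the
    # smallest normalized double (sys.float_info.min).
    if x != 0.0 and abs(x) < sys.float_info.min:
        return "denormalized"
    return "finite"
```

For example, classify(5e-324) reports a denormalized number, while classify(0.0) and classify(1.0) report ordinary finite values.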

Only use -full_fp if you receive SIGFPEs at run time. The -full_fp option exacts a performance penalty but allows the executable to complete correctly.

Note that in many cases, the use of a NaN or a denormalized number indicates a bug in the original program, usually the use of an uninitialized double-precision floating-point variable. If you can, you should check the source program for such errors and correct them.

Figure 4-2 illustrates the process for correcting for floating-point exceptions (SIGFPEs) or other problems caused by nonfinite numbers.

Figure 4-2 Correcting for nonfinite numbers and SIGFPEs


Refer to the figure and follow these steps:

1 The translated program aborts at run time, issues floating-point exceptions (SIGFPEs), or issues the following fpxr message:

The translated executable attempted a floating-point calculation with a denormalized number...

To correct for nonfinite numbers and SIGFPEs, proceed to the next step.

2 Translate the input executable with the -full_fp option. The resulting translated executable should work properly.

If you still receive the SIGFPE errors, document the behavior and email your report to fpx-bug@amt.tay1.dec.com so that -full_fp can be fixed.

3 Run the executable on the Alpha system.

To improve performance in other aspects of the executable, such as getting jump points corrected, you should run the executable with FPXR_GENERATE_FEEDBACK and use the feedback process described in the section Creating and using a feedback file. Note that you must always have -full_fp set when you retranslate the input executable.

Notes on nonfinite numbers

SPARC-based code handles some computations using nonfinite numbers transparently. For example, the code may generate or use NaNs, infinities, or denormalized numbers. This may happen inadvertently during computations with uninitialized variables.

On the SPARC architecture, such operations proceed without generating a SIGFPE either because the hardware can handle them, or because they trap to operating system software that transparently fixes up the results.

On the Alpha architecture, attempts to use nonfinite numbers cause an exception to occur, as do conditions that might produce such nonfinite numbers, such as overflow. These floating-point traps are imprecise. That is, a trap may not be delivered until several instructions after the faulting one was issued. The system can only recover from such traps if the Alpha code is structured in such a way as to be "completion safe."

By default, fpx generates code that is not completion safe, because of the performance penalty incurred otherwise. If a program uses a nonfinite number or if it encounters an exceptional condition (such as an overflow or a divide by zero), it generates a floating-point exception, and Digital UNIX reports this with a SIGFPE signal. There is no way to recover from such an exception because there is no reliable way to determine which instruction caused the exception on Alpha.

Note that because SPARC handles these computations transparently, you may not be aware that the program uses nonfinite numbers in computations at all. Problems are invisible until the program fails after translation due to SIGFPEs with imprecise fault PCs.

If an executable that ran on SPARC fails with SIGFPE when translated to Alpha, retranslate the executable with the fpx option -full_fp. With -full_fp enabled, all floating-point code is generated to be completion safe. When the translated executable uses a nonfinite number or encounters an exceptional condition, -full_fp enables the exception to be handled by the Digital UNIX operating system, and the result is the expected value as specified in the IEEE floating-point standard, IEEE/ANSI 754-1985. Performance is somewhat slower, but you do not get traps.

In summary, if the executable is translated with -full_fp, then

All computations that use nonfinite numbers yield the results specified in IEEE/ANSI 754-1985, such as:
2 + Inf = Inf
Inf - Inf = NaN
All computations that produce nonfinite numbers yield the results specified in IEEE/ANSI 754-1985, such as:
2/0 = Inf
0/0 = NaN

 

For additional information, see the ieee(1) reference page.
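The results listed above can be reproduced on any system with IEEE double-precision arithmetic. The Python sketch below is an illustration only. Python itself raises an exception for floating-point division by zero, so ieee_divide is a hypothetical helper that substitutes the IEEE default results in that case.

```python
import math

INF = math.inf


def ieee_divide(a: float, b: float) -> float:
    """Divide, returning IEEE 754 default results for division by zero.

    Hypothetical helper for illustration: Python raises ZeroDivisionError
    here, so the default results are supplied by hand.
    """
    try:
        return a / b
    except ZeroDivisionError:
        if a == 0.0 or math.isnan(a):
            return math.nan  # 0/0 = NaN
        # Nonzero / zero = infinity, with the sign taken from both operands.
        negative = math.copysign(1.0, a) * math.copysign(1.0, b) < 0
        return -INF if negative else INF


print(2 + INF)                            # inf
print(math.isnan(INF - INF))              # True
print(ieee_divide(2.0, 0.0))              # inf
print(math.isnan(ieee_divide(0.0, 0.0)))  # True
```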

SPARC floating-point control register

The fpx command does not precisely emulate the behavior of the SPARC floating-point control register. If a program expects some of its operations to trap, and it sets the SPARC floating-point control register to enable one or more traps, that behavior is ignored by the translated program and the default result is generated instead. Thus, for example, PL/I programs that use the "ON OVERFLOW" or "ON ZERODIVIDE" statements to catch floating-point problems find that the on-units are never entered.


Correcting unaligned double-precision floating-point numbers

The fpx option -F controls floating-point optimizations. If you receive an error message saying, Unaligned access on ldt or Unaligned access on stt during run time, retranslate the executable with the -F option and run the executable again.

The message indicates that the original SPARC executable used double-precision floating-point numbers that are not aligned on a natural boundary for the Alpha system. The operating system can correct unaligned accesses at run time, but at a major performance penalty.

The -F option generates a conservative sequence for loading and storing double-precision numbers that is faster than allowing the operating system to fix the instruction, but is slower than the single load or store opcode. If you do not use the -F option, then the Digital UNIX system corrects each instance of an unaligned floating-point quadword instruction at run time, which can be very slow.

For information about how floating-point numbers are represented in the Alpha architecture, refer to the Alpha Architecture Reference Manual.
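A value is on a natural boundary when its address is a multiple of its size, so an 8-byte double-precision number must sit at an address divisible by 8. The Python sketch below shows that test. It is an illustration only; is_naturally_aligned is a hypothetical helper name and the addresses in the usage note are made up.

```python
DOUBLE_SIZE = 8  # bytes in a double-precision (quadword) value


def is_naturally_aligned(address: int, size: int = DOUBLE_SIZE) -> bool:
    """Return True if address is a multiple of the operand size."""
    return address % size == 0
```

An address such as 0x1000 passes the test, while 0x1004 is aligned for a 4-byte value but not for a double, which is the kind of access that the ldt and stt messages report.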

Writing a feedback file

If you can identify hidden entry points or other useful information about the input executable, you can write a feedback file manually to give fpx the information it is unable to discover during a translation. Hidden entry points are never called directly through SPARC jump or branch instructions. For example, code that is invoked only by signal handlers in stripped executables is usually a hidden entry point.

Your feedback files must follow the naming requirements described in Creating and using a feedback file and the format shown in Format used in feedback files in this chapter.

Format used in feedback files

The feedback file consists of a series of property records in ASCII, one per line, each of which attaches a property to an offset in the executable. For example:

+419a84 jalr "+4177ac +3"

Table 4-1 describes the components of a property record.

Table 4-1 Property Record Components

Component | Description | Example
offset | A plus sign (+) followed by a hexadecimal number representing the address of an instruction in the executable. | +419a84
property name | A name from Table 4-2. | jalr
property value (optional) | The format depends on the property name. | "+4177ac +3"

The conventions for specifying property records are

Use only one property record per line.
Lines beginning with a semicolon (;) are comment records and may occur anywhere in the file. The translator ignores them.
Case is significant in property names.
For each line, use spaces to separate the values of the offset, property name, and property value.
Symbol names and values that contain spaces or special characters must be enclosed in quotation marks (").

Example 4-1 is an excerpt from a sample .hif file. Use the format shown if you are writing a feedback file.

Example 4-1 Sample feedback file

+419a84 jalr "+4177ac +3"
+420acc jr "+421018 +75"
+420acc jr "+437de4 +20"
+4177ac jalentry "subr1"
+421018 branchentry "label2"
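If you write or check feedback files with a script, the record format above is simple to parse. The following Python sketch is an illustration under the conventions listed in this section, not a tool shipped with FreePort Express. PropertyRecord and parse_record are hypothetical names, and the sketch keeps the property value as raw text because its interpretation depends on the property name.

```python
import shlex
from typing import NamedTuple, Optional


class PropertyRecord(NamedTuple):
    offset: int            # numeric value of the hexadecimal offset
    name: str              # property name (case is significant)
    value: Optional[str]   # raw property value, or None if absent


def parse_record(line: str) -> Optional[PropertyRecord]:
    """Parse one feedback-file line; return None for comments and blanks."""
    stripped = line.strip()
    if not stripped or stripped.startswith(";"):
        return None
    # shlex keeps a quoted value such as "+4177ac +3" together as one field.
    fields = shlex.split(stripped)
    if len(fields) < 2 or not fields[0].startswith("+"):
        raise ValueError(f"malformed property record: {line!r}")
    offset = int(fields[0][1:], 16)
    value = " ".join(fields[2:]) if len(fields) > 2 else None
    return PropertyRecord(offset, fields[1], value)
```

Applied to the first line of Example 4-1, parse_record returns the offset 0x419a84, the property name jalr, and the value +4177ac +3.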

Properties supported in feedback files

Table 4-2 describes the properties supported in feedback files.

Table 4-2 Property names and values in feedback files

Property name | Property value | Interpretation
branchentry | symbolic name | Defines the offset as the target of a branch instruction.
dataentry | none | Defines the offset to be data; causes fpx to interpret at that location.
denorm | fn | Indicates that there is sometimes a denormalized number in SPARC floating register fn (where n is an even number between 0 and 30) at execution time. The offset can be any location inside the basic block where the problem occurs. The translator generates code to change any denormalized value in fn to a true zero. This is usually the correct action because denormalized numbers are not produced as the result of any floating-point computations on an Alpha system, and therefore usually result from coding errors.
full_fp | length | Enables the -full_fp behavior for the basic blocks containing the instructions at the SPARC address through address+length-1. Redundant if -full_fp is used at translation. Use this to avoid translating the entire program with -full_fp if you know exactly where the program uses or produces nonfinite numbers. However, be sure that you have found all of the places where the code can trap.
jalentry | symbolic name | Defines the offset as the target of a jmp instruction.
jalr | "+offset2 +count" | Defines the offset as the start of a basic block that ends with a jalr instruction that transferred to offset2 count times during the run. There may be multiple jalr properties for the same offset.
jr | "+offset2 +count" | Defines the offset as the start of a basic block that ends with a jr instruction that transferred to offset2 count times during the run. There may be multiple jr properties for the same offset.
sets | List of registers and condition codes in the form "%0 %1..." (see note 1) | Specifies resources set by routine.
+sets | List of registers and condition codes in the form "%0 %1..." (see note 1) | Adds resources to those specified by a sets property name for the same address.
to_alpha | rn | Indicates that the program expects that at location offset, register rn (where n is a number between 1 and 31) contains the address of an Alpha instruction in the program, but when actually running the translated program, rn contains the address of a SPARC instruction. This entry causes fpx to insert code just before offset that converts the contents of rn from a SPARC code address to an Alpha code address. This can happen when a program places the address of SPARC code into the return address location on the stack, then tries to transfer control to it.
to_sparc | rn | Indicates that the program expects that at location offset, register rn (where n is a number between 1 and 31) contains the address of a SPARC instruction in the program, but when actually running the translated program, rn contains the address of an Alpha instruction. This entry causes fpx to insert code just before offset that converts the contents of rn from an Alpha code address to a SPARC code address. This can happen when the code in the original program fetches a program counter value from the stack or from a system structure, and then tries to decode the SPARC instruction stream at that address, or tries to look up the address in a hard-coded table of locations.
uses | List of registers and condition codes in the form "%0 %1..." (see note 1) | Specifies resources used by routine.
+uses | List of registers and condition codes in the form "%0 %1..." (see note 1) | Adds resources to those specified by a uses property name for the same address.
-uses | List of registers and condition codes in the form "%0 %1..." (see note 1) | Removes resources from those specified by a uses property name for the same address.

1 The SPARC registers are

%g0 - %g7 %o0 - %o7
%l0 - %l7 %i0 - %i7
%f0 - %f31 %psr
%y %fsr

An .iif or .hif file can also reference ICC_NZ, ICC_V, ICC_C, ICC, FCC, RETURN, and M.unk.
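When you write sets or uses values by hand, a quick check against the names listed above can catch typing errors. The Python sketch below is an illustration only. It assumes the list is space separated, as the form "%0 %1..." suggests, and that the additional names such as ICC and FCC may appear in the same list; sparc_resource_names and unknown_resources are hypothetical helper names.

```python
def sparc_resource_names() -> set:
    """Build the set of names listed in the note above."""
    names = set()
    for bank in "goli":  # %g0-%g7, %o0-%o7, %l0-%l7, %i0-%i7
        names.update(f"%{bank}{n}" for n in range(8))
    names.update(f"%f{n}" for n in range(32))  # %f0-%f31
    names.update({"%psr", "%y", "%fsr"})
    names.update({"ICC_NZ", "ICC_V", "ICC_C", "ICC", "FCC", "RETURN", "M.unk"})
    return names


KNOWN = sparc_resource_names()


def unknown_resources(value: str) -> list:
    """Return the names in a space-separated list that are not recognized."""
    return [name for name in value.split() if name not in KNOWN]
```

An empty result from unknown_resources means every name in the list was recognized; any names returned are worth checking for typos before you retranslate.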

If you have questions about FreePort Express, send email to fpx-info@scrugs.lkg.dec.com.

 
