This chapter provides reference material for the Float Manager API, covering its data structures and its functions and macros.
The Float Manager API is declared in the header file FloatMgr.h. For more information on the Float Manager, see the section "Floating-Point" in the Palm OS Programmer's Companion, vol. I.
Float Manager Data Structures
FlpCompDouble Struct
Purpose
Float Manager functions accept and return values of type FlpDouble. The FlpCompDouble union allows you to declare values that can be interpreted either as a double or as an FlpDouble. The union also contains fields that provide easy access to the component parts of the double-precision floating-point number.
Prototype
typedef union {
    double d;
    FlpDouble fd;
    UInt32 ul[2];
    FlpDoubleBits fdb;
} FlpCompDouble;
Fields
- d: Provides access to the value as a double.
- fd: Provides access to the value as an FlpDouble, which can be passed to or received from many Float Manager functions.
- ul: Provides access to the value as two long integers (UInt32).
- fdb: Provides access to specific fields of the number; see FlpDoubleBits.
FlpDoubleBits Struct
Purpose
This structure provides direct access to the component parts of an IEEE-754 double-precision floating-point number. Use the FlpCompDouble union to convert numbers of type double to and from FlpDoubleBits.
Prototype
typedef struct {
    UInt32 sign : 1;
    Int32 exp : 11;
    UInt32 manH : 20;
    UInt32 manL;
} FlpDoubleBits;
Fields
- sign: The sign bit. You can also use the FlpGetSign() macro to obtain the sign bit, and the FlpNegate(), FlpSetNegative(), and FlpSetPositive() macros to set the sign bit.
- exp: The bits that make up the exponent. You can also use the FlpGetExponent() macro to obtain the exponent value.
- manH: The most-significant 20 bits of the mantissa.
- manL: The least-significant 32 bits of the mantissa.
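The field widths above (1 + 11 + 20 + 32 bits) can be checked off-device with a small host-side sketch. It does not use FloatMgr.h: the function names are hypothetical, and it assumes the host stores a double as an IEEE-754 binary64 value with the same byte order as its 64-bit integers. It uses shifts and masks instead of the bit-field struct, because bit-field layout depends on the compiler.

```c
#include <stdint.h>
#include <string.h>

/* Copy the raw bits of a double into a 64-bit integer (assumes binary64). */
static uint64_t double_bits(double value) {
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Bit 63: corresponds to the 1-bit "sign" field. */
uint32_t sketch_sign(double value) {
    return (uint32_t)(double_bits(value) >> 63);
}

/* Bits 62..52: corresponds to the raw 11-bit "exp" field. */
uint32_t sketch_raw_exponent(double value) {
    return (uint32_t)((double_bits(value) >> 52) & 0x7FFu);
}

/* Raw exponent with the IEEE-754 bias of 1023 removed: -1023 to +1024. */
int32_t sketch_exponent(double value) {
    return (int32_t)sketch_raw_exponent(value) - 1023;
}

/* Bits 51..32: corresponds to the 20-bit "manH" field. */
uint32_t sketch_man_high(double value) {
    return (uint32_t)((double_bits(value) >> 32) & 0xFFFFFu);
}

/* Bits 31..0: corresponds to the 32-bit "manL" field. */
uint32_t sketch_man_low(double value) {
    return (uint32_t)(double_bits(value) & 0xFFFFFFFFu);
}
```

For example, -1.5 is stored as sign 1, raw exponent 1023 (unbiased 0), and a mantissa whose top stored bit is set, so sketch_man_high(-1.5) is 0x80000 and sketch_man_low(-1.5) is 0.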
Float Manager Functions
FlpAToF Function
Purpose
Convert a null-terminated ASCII string to a 64-bit floating-point number. The string must have the format:
[+|-][digits][.][digits][e|E[+|-][digits]]
Declared In
FloatMgr.h
Prototype
FlpDouble FlpAToF ( const Char *s )
Parameters
- → s: Pointer to the null-terminated ASCII string to be converted.
Returns
Returns the value of the string as a floating-point number.
Comments
The mantissa of the number is limited to 32 bits.
This function is close to being compatible with the ISO C library function atof. atof requires the form:
[+|-]digits[.][digits][(e|E)[+|-]digits]
In order to maintain backward compatibility with the Float Manager in Palm OS 1.0 (which could be used up to, but not including, Palm OS 4.0), this function considers all of the "digits" sections to be optional. A table of sample strings compared the ISO and Palm OS behavior; for versions earlier than Palm OS 4.0 it carried these notes:
- The old Float Manager doesn't allow a '+' sign in the exponent.
- The old Float Manager doesn't allow a capital 'E' to mark the exponent.
- The old Float Manager uses an unsigned long and wraps around.
Unlike atof, FlpAToF doesn't accept leading white-space characters, and it doesn't accept decimal point characters other than '.'.
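The difference between the two string forms quoted above can be made concrete with a pair of grammar checkers. This is only a sketch of the two patterns as written in this section: the function names are hypothetical, it runs on any host without FloatMgr.h, and it says nothing about what value FlpAToF actually produces for a given string.

```c
#include <ctype.h>
#include <stdbool.h>

/* Skip a possibly empty run of decimal digits. */
static const char *skip_digits(const char *p) {
    while (isdigit((unsigned char)*p)) p++;
    return p;
}

/* Palm OS form: [+|-][digits][.][digits][e|E[+|-][digits]]
   Every "digits" run is optional; nothing may precede or follow it. */
bool matches_flp_form(const char *s) {
    const char *p = s;
    if (*p == '+' || *p == '-') p++;
    p = skip_digits(p);
    if (*p == '.') p++;
    p = skip_digits(p);
    if (*p == 'e' || *p == 'E') {
        p++;
        if (*p == '+' || *p == '-') p++;
        p = skip_digits(p);
    }
    return *p == '\0';
}

/* ISO form as quoted: [+|-]digits[.][digits][(e|E)[+|-]digits]
   Digits are required before the point and inside the exponent. */
bool matches_iso_form(const char *s) {
    const char *p = s, *q;
    if (*p == '+' || *p == '-') p++;
    q = skip_digits(p);
    if (q == p) return false;          /* leading digits are mandatory */
    p = q;
    if (*p == '.') p++;
    p = skip_digits(p);
    if (*p == 'e' || *p == 'E') {
        p++;
        if (*p == '+' || *p == '-') p++;
        q = skip_digits(p);
        if (q == p) return false;      /* exponent digits are mandatory */
        p = q;
    }
    return *p == '\0';
}
```

Under these patterns ".5" and "1e" match the Palm OS form but not the quoted ISO form, while " 1" (leading space) and "1,5" (comma as decimal point) match neither.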
Compatibility
Implemented only if 2.0 New Feature Set is present. GCC users must use FlpBufferAToF() instead of this function.
See Also
FlpBase10Info Function
Purpose
Extract detailed information on the base 10 form of a floating-point number: the base 10 mantissa, exponent, and sign.
Declared In
FloatMgr.h
Prototype
Err FlpBase10Info ( FlpDouble a, UInt32 *mantissaP, Int16 *exponentP, Int16 *signP )
Parameters
- → a: The floating-point number.
- ← mantissaP: The base 10 mantissa.
- ← exponentP: The base 10 exponent.
- ← signP: The sign: 1 if the number is negative, 0 otherwise.
Returns
Returns 0 if no error, or flpErrOutOfRange if the supplied floating-point number is either not a number (NaN) or is infinite.
Comments
The mantissa is normalized so it contains at least 8 significant digits when printed as an integer value.
Compatibility
Implemented only if 2.0 New Feature Set is present.
See Also
FlpGetExponent(), FlpGetSign()
FlpBufferAToF Function
Purpose
Convert a null-terminated ASCII string to a floating-point number. The string must be in the format: [-]x[.]yyyyyyyy[e[-]zz]
Declared In
FloatMgr.h
Prototype
void FlpBufferAToF ( FlpDouble *result, const Char *s )
Parameters
- ← result: Pointer to the structure into which the return value is placed.
- → s: Pointer to the null-terminated ASCII string to be converted.
Returns
Returns nothing. The value of the string, as a floating-point number, is placed in the structure that result points to.
Comments
See FlpAToF() for a complete description of this function.
Compatibility
Implemented only if 2.0 New Feature Set is present. Because the Palm OS ABI was not well-specified in this area, GCC by default implemented structure return differently from the compiler used to build the ROM. As a result, GCC users must use this function instead of FlpAToF(). CodeWarrior users can use either function; they are binary compatible.
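The calling pattern for the buffer variants (result through a pointer, value read back through a union) can be sketched on a host machine. Everything named StandIn... or stand_in_... below is a hypothetical substitute so the sketch compiles without the Palm OS SDK; the stand-in parser uses the C library strtod, which accepts more input forms than FlpBufferAToF does.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical host-side stand-ins; on a device the real types
   FlpDouble and FlpCompDouble come from FloatMgr.h. */
typedef struct { uint32_t words[2]; } StandInFlpDouble;

typedef union {
    double d;               /* the value viewed as a double          */
    StandInFlpDouble fd;    /* the same 64 bits as an opaque struct  */
} StandInCompDouble;

/* Same shape as FlpBufferAToF: the result is written through a pointer
   rather than returned as a structure. */
static void stand_in_buffer_atof(StandInFlpDouble *result, const char *s) {
    double parsed = strtod(s, NULL);
    memcpy(result, &parsed, sizeof parsed);
}

/* Parse a string, then read the result back as a double via the union. */
double parse_with_buffer_call(const char *s) {
    StandInCompDouble value;
    stand_in_buffer_atof(&value.fd, s);
    return value.d;
}
```

On a device, the equivalent code would declare an FlpCompDouble, pass the address of its fd field to FlpBufferAToF(), and read the d field afterward. Because no structure is returned by value, the structure-return difference between compilers does not come into play.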
FlpBufferCorrectedAdd Function
Purpose
Adds two floating-point numbers and corrects for least-significant-bit errors when the result should be zero but is instead very close to zero.
Declared In
FloatMgr.h
Prototype
void FlpBufferCorrectedAdd ( FlpDouble *result, FlpDouble firstOperand, FlpDouble secondOperand, Int16 howAccurate )
Parameters
- ← result: Pointer to the structure into which the return value is placed.
- → firstOperand: The first of the two numbers to be added.
- → secondOperand: The second of the two numbers to be added.
- → howAccurate: The smallest difference in exponents that won't force the result to zero. The value returned from this function is forced to zero if the difference between exponents in the smaller of the two operands and the result exceeds this value. Supply a value of zero for this parameter to obtain the default level of accuracy (which is equivalent to a howAccurate value of 48).
Returns
Returns nothing. The calculated result is placed in the structure that result points to.
Comments
See FlpCorrectedAdd() for a complete description of this function.
Compatibility
Implemented only if 2.0 New Feature Set is present. Because the Palm OS ABI was not well-specified in this area, GCC by default implemented structure return differently from the compiler used to build the ROM. As a result, GCC users must use this function instead of FlpCorrectedAdd(). CodeWarrior users can use either function; they are binary compatible.
FlpBufferCorrectedSub Function
Purpose
Subtracts two floating-point numbers and corrects for least-significant-bit errors when the result should be zero but is instead very close to zero.
Declared In
FloatMgr.h
Prototype
void FlpBufferCorrectedSub ( FlpDouble *result, FlpDouble firstOperand, FlpDouble secondOperand, Int16 howAccurate )
Parameters
- ← result: Pointer to the structure into which the return value is placed.
- → firstOperand: The value from which secondOperand is to be subtracted.
- → secondOperand: The value to subtract from firstOperand.
- → howAccurate: The smallest difference in exponents that won't force the result to zero. The value returned from this function is forced to zero if the difference between exponents in the smaller of the two operands and the result exceeds this value. Supply a value of zero for this parameter to obtain the default level of accuracy (which is equivalent to a howAccurate value of 48).
Returns
Returns nothing. The calculated result is placed in the structure that result points to.
Comments
See FlpCorrectedSub() for a complete description of this function.
Compatibility
Implemented only if 2.0 New Feature Set is present. Because the Palm OS ABI was not well-specified in this area, GCC by default implemented structure return differently from the compiler used to build the ROM. As a result, GCC users must use this function instead of FlpCorrectedSub(). CodeWarrior users can use either function; they are binary compatible.
FlpCorrectedAdd Function
Purpose
Adds two floating-point numbers and corrects for least-significant-bit errors when the result should be zero but is instead very close to zero.
Declared In
FloatMgr.h
Prototype
FlpDouble FlpCorrectedAdd ( FlpDouble firstOperand, FlpDouble secondOperand, Int16 howAccurate )
Parameters
- → firstOperand: The first of the two numbers to be added.
- → secondOperand: The second of the two numbers to be added.
- → howAccurate: The smallest difference in exponents that won't force the result to zero. The value returned from FlpCorrectedAdd is forced to zero if, when the exponent of the result of the addition is subtracted from the exponent of the smaller of the two operands, the difference exceeds the value specified for howAccurate. Supply a value of zero for this parameter to obtain the default level of accuracy (which is equivalent to a howAccurate value of 48).
Returns
Returns the calculated result.
Comments
Adding or subtracting a large number and a small number produces a result similar in magnitude to the larger number. Adding or subtracting two numbers that are similar in magnitude can, depending on their signs, produce a result with a very small exponent (that is, a negative exponent that is large in magnitude). If the difference between the result's exponent and that of the operands is close to the number of significant bits expressible by the mantissa, it is quite possible that the result should in fact be zero.
There are also cases where it may be useful to retain accuracy in the low-order bits of the mantissa, for instance 99999999 + 0.00000001 - 99999999. However, unless the fractional part is an exact (negative) power of two, it is doubtful that the few mantissa bits that remain will be enough to represent the fractional value properly. In this example, the 99999999 requires 26 bits, leaving 26 bits for the .00000001; this guarantees inaccuracy after the subtraction.
The problem arises from the difficulty in representing decimal fractions such as 0.1 in binary. After about three successive additions or subtractions, errors begin to appear in the least significant bits of the mantissa. If the value represented by the most significant bits of the mantissa is then subtracted away, the least-significant-bit error is normalized and becomes the actual result, when in fact the result should be zero.
This problem is only an issue for addition and subtraction.
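The correction rule described for howAccurate can be sketched on a host machine as follows. This is an illustration of the documented rule, not the Float Manager's implementation: corrected_add_sketch and its helpers are hypothetical names, the code assumes IEEE-754 binary64 doubles, and it compares raw exponent fields, which for normalized numbers gives the same difference as comparing unbiased exponents.

```c
#include <stdint.h>
#include <string.h>

/* Copy the raw bits of a double into a 64-bit integer (assumes binary64). */
static uint64_t raw_bits(double value) {
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Raw 11-bit exponent field, bits 62..52. */
static int exponent_field(double value) {
    return (int)((raw_bits(value) >> 52) & 0x7FFu);
}

/* Bit pattern with the sign cleared; for non-NaN values a larger
   pattern means a larger magnitude. */
static uint64_t magnitude_bits(double value) {
    return raw_bits(value) & ~(UINT64_C(1) << 63);
}

/* Add two numbers; force the result to zero when the exponent of the
   smaller operand exceeds the exponent of the result by more than
   how_accurate. A how_accurate of 0 selects the documented default, 48. */
double corrected_add_sketch(double first, double second, int how_accurate) {
    double result = first + second;
    double smaller;

    if (how_accurate == 0) how_accurate = 48;
    if (result == 0.0) return result;          /* already exactly zero */

    smaller = magnitude_bits(first) < magnitude_bits(second) ? first : second;
    if (exponent_field(smaller) - exponent_field(result) > how_accurate)
        return 0.0;                            /* treat residue as zero */
    return result;
}
```

With plain double arithmetic, (0.1 + 0.2) - 0.3 leaves a residue of one unit in the last place instead of zero; the exponent of that residue is 52 below the exponent of 0.3, which exceeds the default of 48, so corrected_add_sketch(0.1 + 0.2, -0.3, 0) returns 0. A sum such as 99999999 + 0.00000001 is left alone, because the result's exponent is larger than that of the smaller operand.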
Compatibility
Implemented only if 2.0 New Feature Set is present. GCC users must use FlpBufferCorrectedAdd() instead of this function.
See Also
FlpCorrectedSub Function
Purpose
Subtracts two floating-point numbers and corrects for least-significant-bit errors when the result should be zero but is instead very close to zero.
Declared In
FloatMgr.h
Prototype
FlpDouble FlpCorrectedSub ( FlpDouble firstOperand, FlpDouble secondOperand, Int16 howAccurate )
Parameters
-
→
firstOperand
- The value from which
secondOperand
is to be subtracted. -
→
secondOperand
- The value to subtract from
firstOperand
. -
→
howAccurate
- The smallest difference in exponents that won't force the result to zero.The value returned from
FlpCorrectedSub
is forced to zero if, when the exponent of the result of the subtraction is subtracted from the exponent of the smaller of the two operands, the difference exceeds the value specified forhowAccurate
. Supply a value of zero for this parameter to obtain the default level of accuracy (which is equivalent to ahowAccurate
value of 48).
Returns
Returns the calculated result.
Comments
See the comments for FlpCorrectedAdd().
Compatibility
Implemented only if 2.0 New Feature Set is present. GCC users must use FlpBufferCorrectedSub() instead of this function.
FlpFToA Function
Purpose
Convert a floating-point number to a null-terminated ASCII string in exponential format: [-]x.yyyyyyye[-]zz
Declared In
FloatMgr.h
Prototype
Err FlpFToA ( FlpDouble a, Char *s )
Parameters
Returns
Returns 0 if no error, or flpErrOutOfRange if the supplied value is infinite or is not a number. In this case, the buffer is set to the string "INF", "-INF", or "NaN" as appropriate.
Comments
If the supplied floating-point number is zero, the returned string doesn't use the exponential format but is simply "0".
Compatibility
Implemented only if 2.0 New Feature Set is present.
See Also
FlpGetExponent Macro
Purpose
Returns the exponent of a 64-bit floating-point value. The returned value has the bias applied, so it ranges from -1023 to +1024.
Declared In
FloatMgr.h
Prototype
#define FlpGetExponent ( x )
Parameters
Returns
Returns a UInt32 containing the exponent of the specified value.
Compatibility
Implemented only if 2.0 New Feature Set is present.
See Also
FlpGetSign Macro
Purpose
Returns the sign of a 64-bit floating-point value.
Declared In
FloatMgr.h
Prototype
#define FlpGetSign ( x )
Parameters
Returns
Returns a UInt32 with a nonzero value if the specified value is negative, and with a zero value if it is positive.
Compatibility
Implemented only if 2.0 New Feature Set is present.
See Also
FlpBase10Info(), FlpGetExponent(), FlpNegate(), FlpSetNegative(), FlpSetPositive()
FlpIsZero Macro
Purpose
Returns whether the specified 64-bit floating-point value is zero.
Declared In
FloatMgr.h
Prototype
#define FlpIsZero ( x )
Parameters
Returns
Returns a UInt32 with a nonzero value if the specified value is zero, and with a zero value if the specified value is other than zero.
Compatibility
Implemented only if 2.0 New Feature Set is present.
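A zero test on an IEEE-754 double can be done directly on the bit pattern, as in the host-side sketch below. The function name is hypothetical, and the sketch assumes that both positive and negative zero count as zero; this section does not say how FlpIsZero treats negative zero.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if every bit other than the sign bit is clear, else 0.
   Assumption: +0.0 and -0.0 are both reported as zero. */
uint32_t sketch_is_zero(double value) {
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);   /* assumes binary64 doubles */
    return (bits << 1) == 0;              /* shift out the sign bit   */
}
```

Shifting the sign bit out leaves only the exponent and mantissa bits, so any nonzero value, however small, makes the comparison fail.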
FlpNegate Macro
Purpose
Changes the sign bit of a 64-bit floating-point number.
Declared In
FloatMgr.h
Prototype
#define FlpNegate ( x )
Parameters
Returns
Returns a 64-bit floating-point value which is the negative of the value specified by x.
Compatibility
Implemented only if 2.0 New Feature Set is present.
See Also
FlpGetSign(), FlpSetNegative(), FlpSetPositive()
FlpSetNegative Macro
Purpose
Ensures that a 64-bit floating-point number is negative.
Declared In
FloatMgr.h
Prototype
#define FlpSetNegative ( x )
Parameters
Returns
If the supplied 64-bit floating-point value is negative, that value is returned unchanged. If the supplied value is positive, the negative of that value is returned.
Compatibility
Implemented only if 2.0 New Feature Set is present.
See Also
FlpGetSign(), FlpNegate(), FlpSetPositive()
FlpSetPositive Macro
Purpose
Ensures that a 64-bit floating-point number is positive.
Declared In
FloatMgr.h
Prototype
#define FlpSetPositive ( x )
Parameters
Returns
If the supplied 64-bit floating-point value is positive, that value is returned unchanged. If the supplied value is negative, its absolute value is returned.
Compatibility
Implemented only if 2.0 New Feature Set is present.
See Also
FlpGetSign(), FlpNegate(), FlpSetNegative()
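The three sign operations documented above (FlpNegate, FlpSetNegative, FlpSetPositive) each come down to one operation on bit 63 of the 64-bit value. The sketch below shows that on a host machine with hypothetical function names. It follows the "Returns" descriptions in this chapter by returning a new value; whether the real macros also modify their argument is not stated here, so treat that aspect as an assumption.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_SIGN_MASK (UINT64_C(1) << 63)   /* bit 63 is the sign bit */

static uint64_t to_bits(double value) {
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);        /* assumes binary64 doubles */
    return bits;
}

static double from_bits(uint64_t bits) {
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Flip the sign bit: positive becomes negative and vice versa. */
double sketch_negate(double value) {
    return from_bits(to_bits(value) ^ SKETCH_SIGN_MASK);
}

/* Set the sign bit: the result is never positive. */
double sketch_set_negative(double value) {
    return from_bits(to_bits(value) | SKETCH_SIGN_MASK);
}

/* Clear the sign bit: the result is the absolute value. */
double sketch_set_positive(double value) {
    return from_bits(to_bits(value) & ~SKETCH_SIGN_MASK);
}
```

Because only the sign bit changes, the exponent and mantissa are untouched, and applying sketch_set_negative to an already negative value returns it unchanged.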
FlpVersion Function
Purpose
Returns the version number of the Float Manager.
Declared In
FloatMgr.h
Prototype
UInt32 FlpVersion ( void )
Parameters
None.
Returns
Returns the version number of the Float Manager. The current version is represented by the constant flpVersion, which is defined in FloatMgr.h.
Compatibility
Implemented only if 2.0 New Feature Set is present.