Decimals-64
64-bit Decimals by Contributors
This is a REBOL Internals Recipe
Authors include Carl Sassenrath, John Niclasen, Ladislav Mecir
REBOL decimals are standard 64-bit IEEE 754 binary floating point numbers.
Precision
The number of digits to be used in forming decimal numbers is specified by:
system/options/decimal-digits
It defaults to 15 digits.
The forming precision can be set higher. Here is a situation where a higher forming precision is useful (with the default 15 digits, the formed string cannot be loaded back):
load mold 1.7976931348623157e308
** Math error: math or number overflow
** Where: transcode either if load
** Near: transcode/only data unless case [
:val = [rebol] [
...
** Note: use WHY? for more about this error
For the 64-bit IEEE 754 binary floating point format, setting the forming precision to 17 digits is sufficient to achieve the maximum possible conversion accuracy: a string formed with 17 digits is known to uniquely determine the original 64-bit IEEE 754 binary floating point number.
For the above test, we obtain:
system/options/decimal-digits: 17
x: 1.7976931348623157e308
y: load mold x
same? x y ; == true
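The 15 versus 17 digit behavior is a property of the 64-bit format itself, so it can be reproduced outside REBOL. The following is a minimal sketch in Python, used here only because its float is the same IEEE 754 binary64 type; the `%.15g` and `%.17g` formats stand in for the two forming precisions, and Python yields infinity where REBOL reports an overflow error.

```python
import sys

# The largest finite binary64 value, 1.7976931348623157e308.
x = sys.float_info.max

s15 = "%.15g" % x   # formed with 15 significant digits
s17 = "%.17g" % x   # formed with 17 significant digits

# With 15 digits the string is rounded upward, past the largest finite
# number, so parsing it does not give x back.
y15 = float(s15)

# With 17 digits the string uniquely determines the original number.
y17 = float(s17)

print(s15, y15)
print(s17, y17 == x)
```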
Comparison
Comparing decimal numbers is somewhat complicated. The reason for this is that decimals are based on the IEEE floating point standard where decimals have a limited number of accurate digits. This can lead to very slight errors during computations, and those errors can also accumulate when multiple computations are performed on the results.
Checking for Zero
There are two primary ways to check for zero:
zero? number
number = 0
The second has a few variations, which are described below in the comparison section.
Comparison functions
Decimal values support these equality comparison functions:
- same? - Returns TRUE if the values are identical.
- strict-equal? - Returns TRUE if the values are equal and of the same datatype.
- strict-not-equal? - Returns TRUE if the values are not equal and/or not of the same datatype.
- equal? - Returns TRUE if the values are nearly equal.
- not-equal? - Returns TRUE if the values are not nearly equal.
These relative comparison functions are also available:
- greater? - Returns TRUE if the first value is greater than the second value.
- lesser? - Returns TRUE if the first value is less than the second value.
- greater-or-equal? - Returns TRUE if the first value is greater than or equal to the second value.
- lesser-or-equal? - Returns TRUE if the first value is less than or equal to the second value.
- minimum - Returns the lesser of the two values.
- maximum - Returns the greater of the two values.
More information on the difference between these functions is described below.
Comparison operators
The above functions are also available as infix operators:
- =? - same
- = - equal
- <> - not equal
- == - strict-equal
- > - greater
- >= - greater or equal
- < - lesser (less than)
- <= - lesser or equal
Comparing two decimals
The precision related to decimal (floating point) numbers mentioned above can make comparing decimal values problematic in many cases.
The example below shows the result you would expect:
>> 0.1 + 0.1 + 0.1
== 0.3
However, if you examine it a little closer, you will find:
>> system/options/decimal-digits: 17
>> 0.1 + 0.1 + 0.1
== 0.30000000000000004
So, strictly speaking adding 0.1 three times is not equal to 0.3 if a direct comparison is made!
To avoid this problem and make it a bit easier on users (especially those of us who do not want to be experts in floating point math errors), the equality comparison function has been changed to account for "nearness" in such cases. This is a relative-magnitude rounding comparison method.
For example:
>> 0.1 + 0.1 + 0.1 = 0.3
== true
So, that seems useful, but we know from the above that the values are not quite identical internally. The notes below give more detail.
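The forming effect described above is not specific to REBOL. As a rough illustration (a sketch in Python, whose float is also IEEE 754 binary64), the same sum looks like 0.3 at 15 digits, shows its error at 17 digits, and fails an exact comparison:

```python
total = 0.1 + 0.1 + 0.1

formed_15 = "%.15g" % total   # 15 significant digits hide the error
formed_17 = "%.17g" % total   # 17 significant digits reveal it

# An exact (bit-for-bit value) comparison sees two different numbers.
exactly_equal = (total == 0.3)

print(formed_15)      # 0.3
print(formed_17)      # 0.30000000000000004
print(exactly_equal)  # False
```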
Strictness of Equality
There are three levels of equality strictness. They are:
- sameness - the decimal numbers must be identical, that is the internal binary representations must match perfectly.
- strictly equal - as for sameness, except that "negative zero" is considered strictly equal to "positive zero".
- nearly equal - the decimal numbers must be nearly equal. This is measured by a comparison algorithm that tolerates minor variations close to (but not exactly at) the epsilon precision. For more details see the "Comparing using integers" section in the Comparing floating point numbers article.
The above strictness levels apply to the aforementioned functions in this way:
- sameness: same? (=?)
- strictly equal: strict-equal? (==) and strict-not-equal? (!==)
- nearly equal: equal? (=) and not-equal? (<>)
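The three levels can be sketched as follows. This is an illustration in Python, not REBOL's actual implementation: the function names are hypothetical, the "nearly equal" test is a generic integer (ULP distance) comparison in the spirit of the "Comparing using integers" approach, and the tolerance of 10 units in the last place is an assumed value chosen only for the example. NaN and infinity are not handled.

```python
import struct

def bits(x):
    """Return the 64-bit pattern of a float as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def ordered(x):
    """Map the bit pattern to an integer that sorts like the float.

    Negative floats (sign bit set) map to negative integers, and both
    zeros map to 0.
    """
    b = bits(x)
    return b if b < 2**63 else 2**63 - b

def same(a, b):
    """Sameness: the binary representations match perfectly."""
    return bits(a) == bits(b)

def strictly_equal(a, b):
    """Strict equality: equal values, so -0.0 equals +0.0."""
    return a == b

def nearly_equal(a, b, max_ulps=10):
    """Near equality: at most max_ulps representable steps apart.

    max_ulps=10 is an assumption for this sketch, not REBOL's value.
    """
    return abs(ordered(a) - ordered(b)) <= max_ulps

total = 0.1 + 0.1 + 0.1
print(same(0.0, -0.0), strictly_equal(0.0, -0.0))
print(same(total, 0.3), strictly_equal(total, 0.3), nearly_equal(total, 0.3))
```

In this sketch the sum of three 0.1 values and 0.3 are neighbouring bit patterns, so only the nearness test accepts them.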
Rounding
Rounding to a multiple of a given SCALE value
The ROUND function rounds a given decimal value N to a multiple of a given (or implied) SCALE value. If the SCALE value is not given, it is taken to be 1.0, so the value is rounded to an integral value.
Advantages
This method of rounding is the most flexible one allowing us to round to a multiple of essentially any SCALE value.
Disadvantages
Because some common values like 0.1 or 0.01 are not exactly representable in the 64-bit IEEE 754 binary floating point format, rounding to multiples of such values can only be approximate, no matter what method is used.
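A minimal sketch of rounding to a multiple of a scale, written in Python for illustration. The helper `round_to_scale` is hypothetical and is not REBOL's ROUND: in particular, its tie-breaking rule (halves go upward) is an arbitrary choice for the sketch. It also shows why multiples of 0.1 can only be approximate: the stored value of 0.1 is not exactly one tenth.

```python
import math
from decimal import Decimal

def round_to_scale(n, scale=1.0):
    """Round n to the nearest multiple of scale (ties go upward here)."""
    return math.floor(n / scale + 0.5) * scale

# Decimal(0.1) shows the exact value of the binary64 number nearest 0.1.
exact_tenth = Decimal(0.1)

print(round_to_scale(1234.5678))   # scale defaults to 1.0
print(round_to_scale(7.0, 2.0))    # a scale that is exactly representable
print(round_to_scale(2.26, 0.1))   # only approximately a multiple of 0.1
print(exact_tenth)
```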
Overflow and Underflow
[More needed]
IEEE 754 Standard
The bits in a 64-bit decimal are laid out like this: 1 sign bit, followed by 11 exponent bits, followed by 52 fraction bits.
A number has value v:
v = s × 2^e × m
Where
s = +1 when the sign bit is 0 (positive numbers and positive zero)
s = −1 when the sign bit is 1 (negative numbers and negative zero)
e = exponent − 1023 (in other words the exponent is stored as "biased with 1023"), if the exponent is greater than zero
e = −1022, if the exponent is zero (denormalized numbers)
m = 1.fraction in binary (that is, the significand is the binary number 1 followed by the radix point followed by the binary bits of the fraction), if the exponent is greater than zero. In this case 1 ≤ m < 2.
m = 0.fraction in binary (that is, the significand is the binary number 0 followed by the radix point followed by the binary bits of the fraction), if the exponent is zero (denormalized numbers). In this case 0 ≤ m < 1.
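The definitions of s, e and m can be checked with a short decoder. This is a sketch in Python (same binary64 format); the function names `decode` and `value` are made up for the example, and infinities and NaN are rejected because the formula does not cover them.

```python
import math
import struct

def decode(x):
    """Split a float into (s, e, m) so that v = s * 2^e * m."""
    b = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign_bit = b >> 63                 # first bit
    exponent = (b >> 52) & 0x7FF       # next 11 bits, stored with bias 1023
    fraction = b & (2**52 - 1)         # remaining 52 bits
    if exponent == 0x7FF:
        raise ValueError("infinity or NaN: the formula does not apply")
    s = -1 if sign_bit else 1
    if exponent > 0:
        e = exponent - 1023            # normalized number
        m = 1 + fraction / 2**52       # 1.fraction, so 1 <= m < 2
    else:
        e = -1022                      # denormalized number (or zero)
        m = fraction / 2**52           # 0.fraction, so 0 <= m < 1
    return s, e, m

def value(s, e, m):
    """Recombine the parts: v = s * 2^e * m."""
    return s * math.ldexp(m, e)

for x in (2.0, -0.75, 0.1, 5e-324):
    print(x, decode(x), value(*decode(x)) == x)
```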
Sign
The first bit is the sign bit.
Positive zero looks like this: #{0000 0000 0000 0000}
Negative zero looks like this: #{8000 0000 0000 0000}
Exponent
The next 11 bits are the exponent. That gives 0 ≤ exponent ≤ 2^11 − 1 = 2047. The exponent value 2047 is reserved for overflow (infinities) and NaN (Not a Number), and the exponent value 0 is reserved for zero and denormalized numbers. Therefore the above e, obtained by subtracting the bias 1023 from the exponent, can range from −1022 to 1023.
Numbers with exponents other than zero are said to be normalized. They preserve the full precision of the fraction, so they have 53 significant bits (including the 1 before the radix point in the calculation of m above).
Numbers with zero in the exponent are said to be denormalized and have from 1 to 52 significant bits.
Positive normalized numbers with fraction bits set to zero are in the range: #{0010 0000 0000 0000} - #{7FE0 0000 0000 0000}
That is 2046 different exponents for positive normalized numbers.
Positive denormalized numbers have exponent zero and e equal to −1022.
Negative normalized numbers with fraction bits set to zero are in the range: #{8010 0000 0000 0000} - #{FFE0 0000 0000 0000}
That is 2046 different exponents for negative normalized numbers.
Negative denormalized numbers have exponent zero and e equal to −1022.
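The end points of the normalized ranges above can be converted back to numbers to confirm the exponent limits. A small sketch in Python, assuming big-endian byte order as in the binary values shown; `from_hex` is a hypothetical helper.

```python
import struct
import sys

def from_hex(h):
    """Interpret 16 hex digits (spaces allowed) as a big-endian binary64."""
    return struct.unpack(">d", bytes.fromhex(h.replace(" ", "")))[0]

smallest_normal = from_hex("0010 0000 0000 0000")   # exponent 1,    e = -1022
largest_power = from_hex("7FE0 0000 0000 0000")     # exponent 2046, e = 1023

neg_smallest_normal = from_hex("8010 0000 0000 0000")
neg_largest_power = from_hex("FFE0 0000 0000 0000")

print(smallest_normal == 2.0 ** -1022, smallest_normal == sys.float_info.min)
print(largest_power == 2.0 ** 1023)
print(neg_smallest_normal, neg_largest_power)
```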
Fraction
The remaining 52 bits are the fraction used in the calculation of v (see above). That gives 2^52 = 4'503'599'627'370'496 different fractions.
(The next 2 numbers are denormalized, because the exponent is zero.)
Positive number with exponent bits set to zero and all fraction bits set: #{000F FFFF FFFF FFFF}
Negative number with exponent bits set to zero and all fraction bits set: #{800F FFFF FFFF FFFF}
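As a quick check of these two patterns, the sketch below (Python, with a hypothetical `from_hex` helper) converts them to numbers. With the exponent bits zero and all fraction bits set, the value is the largest denormalized number, one step below the smallest normalized number.

```python
import math
import struct
import sys

def from_hex(h):
    """Interpret 16 hex digits (spaces allowed) as a big-endian binary64."""
    return struct.unpack(">d", bytes.fromhex(h.replace(" ", "")))[0]

fraction_count = 2 ** 52                      # number of different fractions

largest_denormal = from_hex("000F FFFF FFFF FFFF")
neg_largest_denormal = from_hex("800F FFFF FFFF FFFF")

# The smallest positive denormalized number: 2^-52 * 2^-1022 = 2^-1074.
step = math.ldexp(1.0, -1074)

print(fraction_count)
print(largest_denormal == (2 ** 52 - 1) * step)
print(largest_denormal + step == sys.float_info.min)
```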
Powers of 2
If the fraction bits are all zero and the exponent is positive, the number is a power of 2. Because 1023 is subtracted from the exponent, the number 2.0 is: #{4000 0000 0000 0000}
The exponent is #400 = 1024 and 1024 − 1023 = 1. So we can calculate the number v = +1 × 2^1 × 1.0 = 2.0
Other powers of 2:
#{4010 0000 0000 0000} = 4.0
#{4020 0000 0000 0000} = 8.0
#{4030 0000 0000 0000} = 16.0
etc.
#{3FF0 0000 0000 0000} = 1.0
#{3FE0 0000 0000 0000} = 0.5
#{3FD0 0000 0000 0000} = 0.25
etc.
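The table above follows directly from the bias: for a power 2^k, the stored exponent is k + 1023, shifted past the 52 fraction bits. A minimal sketch in Python; `power_of_two_pattern` is a hypothetical helper covering only normalized powers of 2.

```python
import struct

def power_of_two_pattern(k):
    """Return the 16 hex digits of 2^k, for -1022 <= k <= 1023."""
    if not -1022 <= k <= 1023:
        raise ValueError("not a normalized power of 2")
    # Sign bit 0, biased exponent k + 1023, all 52 fraction bits zero.
    return "%016X" % ((k + 1023) << 52)

def to_float(pattern):
    """Convert 16 hex digits back to the number they encode."""
    return struct.unpack(">d", bytes.fromhex(pattern))[0]

for k in range(-2, 5):
    pattern = power_of_two_pattern(k)
    print(pattern, to_float(pattern))
```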
Range of values
See the range table.
