How is NaN implemented at all, for single or double types? I wish I could define NaN or undefined for an integer or real type variable, but it is not possible in a typed language like Pascal.
that is a rather good question... every floating point number is encoded into a block of some number of bits. let's look at the example of a DOUBLE, which takes up 64 bits. the block of bits is divided up into THREE groups:
- sign: 1 bit
- exponent: 11 bits
- mantissa: 52 bits (plus an assumed leading "1.")
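to make the three groups concrete, here is a minimal sketch in python (not pascal, just because it is quick to try out). the function name split_double is made up for this illustration; struct.pack and struct.unpack are from python's standard library:

```python
import struct

def split_double(x):
    """return the (sign, exponent, mantissa) bit fields of a 64-bit double."""
    # reinterpret the 8 bytes of the double as one unsigned 64-bit integer
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                   # top 1 bit
    exponent = (bits >> 52) & 0x7FF     # next 11 bits, exactly as stored
    mantissa = bits & ((1 << 52) - 1)   # low 52 bits (the assumed "1." is not stored)
    return sign, exponent, mantissa

# -6.5 is -1.625 * 2^2, so the sign bit should be 1
sign, exponent, mantissa = split_double(-6.5)
print(sign, format(exponent, "011b"), format(mantissa, "052b"))
```

the exponent printed here is the raw stored bit pattern, not yet the "number of places" it stands for.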
see:
https://en.wikipedia.org/wiki/Floating-point_arithmetic
sign is obvious: the number is either positive or negative. so + or −.
mantissa is the digits of the number that you would normally see written down, with an assumed "1." at the start (if it were "0." at the start then we could just shift the bits left to get it back to "1." and adjust the exponent accordingly, so we can always assume "1.")
exponent is a sort of multiplier, which moves the "." left or right by some number of places. the 11 bits can hold 0 to 2047, and an offset of 1023 is subtracted from the stored value so that we can move both left and right. stored values 1 to 2046 are used for ordinary numbers, which gives a range of -1022 to +1023, representing moving the "." left or right by this many (binary) places. another way to view it is that the exponent represents 2^-1022 to 2^1023, which is then multiplied by the sign+mantissa to get back the encoded number. note: 2^0 equals 1, which just means "leave the binary point where it already is".
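putting sign, mantissa and exponent back together can be sketched like this, again in python. this is only an illustration for ordinary numbers, assuming the standard 64-bit layout with an exponent offset of 1023; the names rebuild and BIAS are made up here:

```python
BIAS = 1023  # offset that is added to the true exponent before it is stored

def rebuild(sign, stored_exponent, mantissa):
    """rebuild an ordinary double from its three bit fields."""
    # stored exponents 0 and 2047 are special cases and are not handled here
    if not 1 <= stored_exponent <= 2046:
        raise ValueError("not an ordinary number")
    significand = 1 + mantissa / 2**52                 # put the assumed "1." back
    value = significand * 2.0 ** (stored_exponent - BIAS)
    return -value if sign else value

# sign 1, stored exponent 1025 (true exponent 2), mantissa bits 101000...
print(rebuild(1, 1025, 0xA000000000000))
```

a stored exponent of exactly 1023 gives 2^0, so the value is just the sign and the "1.mantissa" part on their own.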
now, of the 2048 possible patterns in the 11-bit exponent field, two are not used for ordinary numbers: all 0's and all 1's. we 'borrow' these for 'special' purposes. all 0's is used for zero and for very tiny 'denormal' numbers (the assumed leading "1." is dropped, which gives gradual underflow). we co-opt the exponent containing all 1's (11111111111 in the case of a DOUBLE) to have the special meaning of "not an ordinary number": if the mantissa bits are all 0 it means infinity (+ or − according to the sign bit), and if any mantissa bit is set it means NaN. the mantissa bits of a NaN are then free to act as flags: on most hardware the top one distinguishes 'quiet' from 'signalling' NaNs, and the rest can carry whatever payload we wish.
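a rough sketch of how the special patterns can be recognised, and how a NaN can be built by hand from raw bits, once more in python. classify and from_bits are hypothetical helper names for this example; the bit pattern 0x7FF8000000000000 is exponent all 1's with only the top mantissa bit set:

```python
import math
import struct

def classify(x):
    """say which kind of value a 64-bit double holds, judging by its bits."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    if exponent == 0x7FF:               # exponent all 1's: special values
        return "infinity" if mantissa == 0 else "nan"
    if exponent == 0:                   # exponent all 0's: zero or denormal
        return "zero" if mantissa == 0 else "denormal"
    return "normal"

def from_bits(bits):
    """reinterpret an unsigned 64-bit integer as a double."""
    (x,) = struct.unpack(">d", struct.pack(">Q", bits))
    return x

hand_made_nan = from_bits(0x7FF8000000000000)
print(classify(hand_made_nan), math.isnan(hand_made_nan))
print(classify(from_bits(0x7FF0000000000000)))
```

note that a NaN never compares equal to anything, not even itself, which is why the check goes through the bits (or math.isnan) rather than through "=".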
cheers,
rob :-)