Wednesday, 25 March 2015
Numbers are floating point
In JavaScript, all numbers are floating point: they are standard IEEE 754 64-bit double-precision numbers. Even though the IEEE standard allows for many distinct NaN bit patterns, JavaScript exposes a single NaN value, which can be referenced in code as NaN.
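As a quick aside (my example, not part of the original argument), NaN is the only value that compares unequal to itself:

var n = 0 / 0;         // dividing zero by zero yields NaN
console.log(n == NaN); // false: NaN compares unequal to everything
console.log(n == n);   // false: even to itself
console.log(isNaN(n)); // true: use isNaN to test for it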
This representation has immediate consequences: there are upper and lower limits to the values that can be stored, and numbers can only hold a certain precision. In particular, integers with magnitude above 2^53 cannot all be represented exactly.
var x = 10000000000000000; // 10^16, larger than 2^53
if (x == (x + 1)) alert("Oops"); // fires: x+1 rounds back to x
You might think that you don't need the precision, but you quickly hit problems when using decimal fractions:
var x = 0.2 * 0.3 - 0.01;
if (x != 0.05) alert("Oops"); // fires: x is not exactly 0.05
The rounding errors in the representations of the decimal fractions here mean that the value of x in this example is 0.049999999999999996, not 0.05 as you might expect.
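If you need such a comparison to succeed, the usual workaround is to compare against a small tolerance rather than test for exact equality; a minimal sketch (the tolerance here is an arbitrary choice of mine):

var x = 0.2 * 0.3 - 0.01;
var tolerance = 1e-9; // illustrative only; choose to suit your values
if (Math.abs(x - 0.05) > tolerance) alert("Oops"); // no longer fires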
Again, this rounding isn't particularly strange; it's just an inherent property of numbers represented as floating point. However, what I found strange is that sometimes the numbers aren't treated as floating point.
Numbers aren't always floating point
The first place this happens is with the bitwise and shift operators (&, |, ^, ~, <<, >>, >>>). If you use one of these then both operands are first converted to a 32-bit signed integer. This can have surprising consequences.
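The conversion rule itself is straightforward. Here is a sketch of what that conversion does for finite numbers (NaN and the infinities convert to 0); toInt32 is a hypothetical helper of my own, not a built-in:

function toInt32(x) {
  x = x < 0 ? Math.ceil(x) : Math.floor(x); // truncate towards zero
  // reduce modulo 2^32, keeping an intermediate in [0, 2^32)
  x = ((x % 0x100000000) + 0x100000000) % 0x100000000;
  // values of 2^31 and above represent negative numbers in 2's complement
  return x >= 0x80000000 ? x - 0x100000000 : x;
}

console.log(toInt32(0x100000000)); // 0
console.log(toInt32(0x80000000));  // -2147483648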
Look at the following snippet of code:
var x = 0x100000000; // 2^32
console.log(x);
console.log(x | 0);
What do you expect it to do? Surely x|0 is just x? You might be excused for thinking so, but no. Now, x is too large for a 32-bit integer, so it is first taken modulo 2^32 before being converted to a signed integer. The low 32 bits are all zero, so x|0 is just 0.
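Only the low 32 bits survive the conversion, which you can see by setting one of them (my variation on the example):

var x = 0x100000005; // 2^32 + 5
console.log(x | 0);  // 5: everything above bit 31 is discarded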
OK, what about this case:
var x = 0x80000000; // 2^31
console.log(x);
console.log(x | 0);
What do you expect now? We're under 2^32, so there's no dropping of higher-order bits, so surely x|0 is x now? Again, no. x|0 in this case is -2147483648: x is first converted to a signed 32-bit integer with 2's complement representation, which means the most-significant bit is the sign bit, so the number is negative.
I have to confess that even with the truncation to 32 bits, the use of signed integers for bitwise operations just seems odd. Doing bitwise operations on a signed number is a very unusual case, and is just asking for trouble, especially when the result is just a "number", so you can't rely on doing further operations and having them give you the result you would expect on a 32-bit integer value.
For example, you might want to mask off some bits from a value. With normal 2's complement arithmetic, x-(x&mask) is the same as x&~mask: in both cases, you're left with the bits set in x that were not set in mask. In JavaScript, however, the two expressions give different results if x has bit 31 set.
var x = 0xabcdef12; // bit 31 is set
var mask = 0xff;
console.log(x - (x & mask)); // 2882400000
console.log(x & ~mask);      // -1412567296
If you truncate back to 32 bits with x|0 then the values are indeed the same, but it's easy to forget.
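If what you actually want is the unsigned 32-bit value, one common idiom (not covered above) is to truncate with >>>0 instead, since >>> reinterprets the result as unsigned:

var x = 0xabcdef12;
console.log(x | 0);   // -1412567278: the signed 32-bit view
console.log(x >>> 0); // 2882400018: the unsigned 32-bit view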
In languages such as C and C++, x<<y is exactly the same as x*2^y. This is not necessarily true in JavaScript, because the value is converted to a signed 32-bit integer both before and after the operation. This can have surprising results.
var x = 0xaa;
console.log(x);
console.log(x << 24);
console.log(x * (1 << 24));
The shift first converts x to a signed 32-bit integer, bit-shifts the value as a signed 32-bit integer, and then converts that result back to a Number. In this case, x<<24 has the bit pattern 0xaa000000, which has the highest bit set when treated as 32-bit, so is now a negative number with value -1442840576. On the other hand, 1<<24 does not have the high bit set, so is still positive, so x*(1<<24) is a positive number, with the same value as 0xaa000000.
Of course, if the result of shifting would have more than 32 bits then the top bits are lost: 0xaa<<25 would be truncated to 0x54000000, so has the value 1409286144, rather than the 5704253440 that you get from 0xaa*(1<<25).
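You can verify this directly (my example, using the values just described):

var x = 0xaa;
console.log(x << 25);       // 1409286144 (0x54000000): the 33rd bit is lost
console.log(x * (1 << 25)); // 5704253440 (0x154000000): all bits kept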
For right-shifts, there are two operators: >> and >>>. Why two? Because the operands are converted to signed numbers, and the two operators have different semantics for negative operands.
What is 0x80000000 shifted right one bit? That depends. As an unsigned number,
right shift is just a divide-by-two operation, so the answer is 0x40000000, and
that's what you get with the
>>> operator. The
>>> operator shifts in
zeroes. On the other hand, if you think of this as a negative number (since it
has bit 31 set), then you might want the answer to stay negative. This is what the >> operator does: it shifts a 1 into the new bit 31, so negative numbers remain negative.
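The difference is easiest to see with a small negative number (my example):

var x = -8;           // 0xfffffff8 as a 32-bit pattern
console.log(x >> 1);  // -4: a 1 is shifted into bit 31
console.log(x >>> 1); // 2147483644: a 0 is shifted in, giving a large positive number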
As ever, this can have odd consequences if the initial number is larger than 32 bits.
var x = 0x280000000; // 2^33 + 2^31
console.log(x);
console.log(x >> 1);
console.log(x >>> 1);
0x280000000 is a large positive number, but it's more than 32 bits long, so it is first truncated to 32 bits and converted to a signed value. 0x280000000>>1 is thus not 0x140000000 as you might naively expect, but -1073741824: the high bits are dropped, giving 0x80000000, which is a negative number, and >> preserves the sign bit, so we have 0xc0000000, which is -1073741824 as a signed 32-bit value. >>> just does the truncation, so it essentially treats the operand as an unsigned 32-bit number: 0x280000000>>>1 is thus 0x40000000.
If right shifts are so odd, why not just use division?
Divide and conquer?
If you need to preserve all the bits, then you might think that doing a division instead of a shift is the answer: after all, right shifting is simply dividing by a power of two. The problem is that division in JavaScript is floating-point division, so 3/2 is 1.5, not 1. You're therefore looking at two floating-point operations instead of one integer operation, as you have to discard the fractional part either by removing the remainder beforehand, or by truncating it afterwards.
var x = 3;
console.log(x);                 // 3
console.log(x / 2);             // 1.5
console.log((x - (x % 2)) / 2); // 1: remove the remainder first
console.log(Math.floor(x / 2)); // 1: truncate afterwards
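If you do need a shift that preserves all the bits, a division-based helper works for non-negative integers below 2^53; shiftRight here is a sketch of my own, not part of the language:

function shiftRight(x, n) {
  // assumes 0 <= x < 2^53 and n >= 0
  return Math.floor(x / Math.pow(2, n));
}

console.log(shiftRight(0x280000000, 1).toString(16)); // "140000000"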
All of this strikes me as a really strange choice for the language designers to make: doing bitwise operations on signed values is a niche feature, whereas many people will want to do bitwise operations on unsigned values.