[Math] Machine numbers in a 32-bit (binary digits) computer

binary, computer-arithmetic, numerical-methods

I'm currently studying Numerical Analysis with the book "Numerical Analysis: Mathematics of Scientific Computing" by Kincaid.

In this book, the authors introduce a computer called "Marc-32", a 32-bit computer that represents a nonzero real number in the form
$x = \pm q \times 2^m$

with the allocation:

  • sign of the real number x: 1 bit
  • biased exponent (integer e): 8 bits
  • mantissa part (real number f): 23 bits
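
As a rough illustration of this layout, the sketch below splits a number into the three fields using Python's `struct` module. It assumes Marc-32 stores numbers the same way as IEEE 754 single precision (sign bit first, then the 8-bit biased exponent, then the 23-bit mantissa, with a bias of 127). That is an assumption, not something stated above, and `fields` is only an illustrative helper name.

```python
import struct

def fields(x):
    """Split x, stored as a 32-bit float, into (sign, biased exponent e, mantissa bits f).

    Assumes the IEEE 754 single-precision layout: 1 + 8 + 23 bits.
    """
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31            # top bit
    e = (bits >> 23) & 0xFF      # next 8 bits: biased exponent
    f = bits & 0x7FFFFF          # low 23 bits: mantissa without the implicit leading 1
    return sign, e, f

print(fields(0.5))    # (0, 126, 0)        since 0.5 = 1.0 * 2^-1 and -1 + 127 = 126
print(fields(-1.5))   # (1, 127, 4194304)  since 1.5 = (1.1)_2 * 2^0
```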

Now a problem in the book asks whether various given numbers are machine numbers in Marc-32, for example the number $2^{-1} + 2^{-26}$.

My problem is that I honestly do not know how to determine this. I've read the section about Marc-32 multiple times, but the closest I got was the authors' remark that "…23 bits means that our machine numbers have a limited precision of roughly six decimal places".

Therefore I thought that $2^{-1} + 2^{-26}$ = 0.10000000000000000000000001,
being a number with 26 decimal places, would not be contained in Marc-32, but I'm really not sure whether this is correct, or whether this is the method I'm supposed to be using.

Therefore I would like to reach out to the math community here for any pieces of advice on how to tackle this problem. It would be much appreciated.

Best Answer

A 23-bit mantissa means that the first and last nonzero binary digits can be separated by at most 23 places (we get an "extra bit" for free because the units digit is implicitly 1 and is not stored). In your case you would need to write (in binary, not decimal) 1.(24 zeros)1 with an exponent of -1, which requires a 25-bit mantissa. So the number is not a machine number; it rounds to just 1/2 in your system.
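
One way to check this numerically is to round-trip the value through a 32-bit float with Python's `struct` module. This is only a sketch: it assumes Marc-32 behaves like IEEE 754 single precision (implicit leading 1, round to nearest), and `to_single` is a hypothetical helper name.

```python
import struct

def to_single(x):
    """Round a Python float (a double) to the nearest 32-bit float and back."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

x = 2.0**-1 + 2.0**-26        # exact as a double, which has a 53-bit significand
print(to_single(x) == x)      # False: x does not survive the trip through 32 bits
print(to_single(x) == 0.5)    # True: it rounds to 1/2
```

The double-precision value is exact here, so any difference after the round trip comes from the 23-bit mantissa alone.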

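Be careful about the difference between "separation between the first and last nonzero digit" and "the position of the last nonzero digit". For example, although $2^{-1}+2^{-26}$ is not representable in your number system, $2^{-26}$ is: it equals $1.0 \times 2^{-26}$, so its mantissa is all zeros, and the exponent $-26$ fits easily in the 8-bit biased exponent field (assuming the usual bias of 127, it is stored as $e = -26 + 127 = 101$).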
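
The distinction can be made concrete with exact rational arithmetic. The sketch below measures the separation between the first and last nonzero binary digit and the exponent of the leading digit, then compares them with the 23 mantissa bits. The exponent range of -126 to 127 is an assumption borrowed from IEEE 754 single precision, subnormal numbers are ignored, and the function names are made up for illustration.

```python
from fractions import Fraction

MANTISSA_BITS = 23            # from the question
M_MIN, M_MAX = -126, 127      # assumed exponent range (IEEE-like, bias 127)

def span_and_exponent(x):
    """Return (places between first and last nonzero bit, exponent m of the leading bit)."""
    x = abs(Fraction(x))
    n, d = x.numerator, x.denominator
    if n == 0 or d & (d - 1):
        raise ValueError("x must be nonzero with a power-of-two denominator")
    m = n.bit_length() - d.bit_length()   # exponent of the leading 1 bit
    while n % 2 == 0:                     # drop trailing zero bits
        n //= 2
    return n.bit_length() - 1, m

def is_machine_number(x):
    span, m = span_and_exponent(x)
    return span <= MANTISSA_BITS and M_MIN <= m <= M_MAX

half = Fraction(1, 2)
tiny = Fraction(1, 2**26)
print(span_and_exponent(half + tiny), is_machine_number(half + tiny))  # (25, -1) False
print(span_and_exponent(tiny), is_machine_number(tiny))                # (0, -26) True
```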
