What’s the minimum number of $2$s needed to write a positive integer

combinatoricsoptimizationrecreational-mathematics

This is just for fun and inspired by Estimating pi, using only 2s.

For a positive integer $n$, let $f(n)$ denote the minimum number of $2$s needed to express $n$ using addition, subtraction, multiplication, division, and exponentiation, together with the ability to concatenate $2$s, so for example $2 \times 22^2 + \frac{222}{2}$ is a valid expression. Other variants involving different sets of allowed operations are possible, of course. This function is very far from monotonic, so to smooth it out let's also consider

$$g(n) = \text{max}_{1 \le m \le n} f(m).$$

For example,

  • $f(1) = 2$ ($1 = \frac 22$)
  • $f(11) = 3$ ($11 = \frac{22}{2}$)

Question: What can you say about $f(n)$ and $g(n)$? Can you give exact values for small values of $n$? Can you give (asymptotic or exact) upper bounds? Lower bounds?

As a simple example we can write any positive integer $n$ in the form $2^k + n'$ where $n' < 2^k$ ($2^k$ is just the leading digit in the binary expansion of $n$), which gives $f(n) \le f(k) + 1 + f(n')$. If we write $\ell(n) = \lfloor \log_2 n \rfloor$ then iterating this gives something like

$$g(n) \le \sum_{k=1}^{\ell(n)} \left( g(k) + 1 \right).$$

This gives an upper bound growing something like $\ell(n) \ell^2(n) \ell^3(n) \dots$ which I think is pessimistic. For example, in my answer to the linked question I show that

$$f(14885392687) \le 36$$

and $\ell(14885392687) = 33$ so maybe we can expect something as good as $g(n) = O(\log n)$ for an upper bound. I have no idea about a lower bound, other than to write down an upper bound on the number of possible expressions that can be made with a given number of $2$s.

Edit: A related question involving $4$s and more allowed operations: How many fours are needed to represent numbers up to $N$?

Best Answer

I've been silly; we don't need to work with iterated logarithms. We can get a logarithmic upper bound by using the binary expansion slightly more cleverly. Namely, we can always write $n = 2n' + \left( n \bmod 2 \right)$, so either $2k = 2(k)$ or $2k+1 = 2(k) + \frac 22$, which gives

$$f(2k) \le f(k) + 1$$ $$f(2k+1) \le f(k) + 3.$$

Iterating these bounds gives

$$\boxed{ f(n) \le 3 \lceil \log_2 n \rceil - 1 \approx 4.32 \log n }$$

which corresponds to writing $n$ as $d_0 + 2(d_1 + 2(d_2 + \dots)))$ where $d_i$ are the binary digits of $n$. This uses only addition, multiplication and division and lots of optimizations are possible. This gives $f(14885392687) \le 3 \cdot 33 + 2 = 101$ which is at least within a factor of $3$ of the explicit result.

As an example of a possible optimization, we can improve the bound by working in base $22$, which gives

$$f(n) \le \left( 2 + g(21) \right) \lceil \log_{22} n \rceil.$$

My computations give $g(21) \le 5$ (the maximum value of $5$ is attained for $n = 7, 15, 17, 19$, at least if I'm not mistaken), so

$$\boxed{ f(n) \le 7 \lceil \log_{22} n \rceil \approx 2.26 \log n }$$

which is almost twice as good! This gives $f(14885392687) \le 56$ which still doesn't quite match the explicit result. Using subtraction we can replace $g(21)$ above by $g(10)$ but since $g(10) = 5$ also this doesn't actually help in this case.

We can write down a logarithmic lower bound on $g$ by writing down an exponential upper bound on the number $N(k)$ of possible expressions involving $k$ twos. (At least one number between $1$ and $N(k)+1$ can't be represented using $k$ twos, so $g(N(k) + 1) \ge k+1$.) We can do a more precise count than the following but this will suffice. An expression involving $k$ twos involves at most $k-1$ operations and at most $k-1$ pairs of parentheses, so altogether is a string of at most $4k-3$ symbols each of which can take the values $2, (, ), +, -, \times, \div$, or exponentiation (note that we don't need a symbol for concatenation). This gives the crude bound $N(k) \le 7^{4k-3}$, so

$$g(7^{4k-3} + 1) \ge k + 1$$

which after a bit of massaging gives

$$\boxed{ g(n) \ge \frac{\lceil \log_7 n \rceil + 3}{4} \approx 0.128 \log n }.$$

This gives $g(14885392687) \ge 4$ which is quite bad! Can anyone do substantially better, possibly after disallowing some of the operations? A lower bound given only addition, multiplication, and exponentiation would already be quite interesting, I think.

Related Question