[Math] How to evaluate this sum $\sum\limits_{i=1}^{n} \left \lfloor \sqrt{2} \cdot i \right \rfloor$

sequences-and-seriessummation

I have the following sum that I need to evaluate:

$$\sum\limits_{i=1}^{n} \left \lfloor \sqrt{2} \cdot i \right \rfloor$$

If the floor wasn't there it would have been too easy $\sqrt2 \sum\limits_{i=1}^{n} i$, but with the floor added I have no clue how to approach it. Is this sum solveable and if so how do I evaluate it?

Best Answer

I've found a fast enough algorithm to compute the exact value for $ n = 10^{100} $:

$$ \sum_{i=1}^{10^{100}} i\cdot\sqrt 2 = $$

$$ 70710678118654752440084436210484903928483593768847403658833986899536623923105351942519376716382078638821760123411090095254685423841027253480565451739737157454059823250037671948325191776995310741236436 $$

Here's the python code:

import math
import time

#
# Helpers for numbers in the form a + b*sqrt(2), encoded as a tuple (a,b)
#

# compare 2 numbers
def root2_cmp(x,y):
    x_a,x_b = x
    y_a,y_b = y

    a = x_a - y_a
    b = x_b - y_b

    a_sign = cmp(a,0)
    b_sign = cmp(b,0)

    if a_sign == b_sign: return a_sign

    if a_sign == 0: return b_sign
    if b_sign == 0: return a_sign

    # one of a_sign and b_sign is 1 and the other is -1
    # compare a^2 and (b*sqrt(2))^2 = 2*b^2

    aa = a*a
    bb2 = b*b*2

    # if a^2 > 2b^2, return a_sign otherwise the opposite sign
    return cmp(aa,bb2)*a_sign

def root2_max(x,y):
    return x if root2_cmp(x,y) > 0 else y

def root2_min(x,y):
    return x if root2_cmp(x,y) < 0 else y

def root2_add(x,y):
    return tuple(x_i+y_i for x_i,y_i in zip(x,y))

def root2_sub(x,y):
    return tuple(x_i-y_i for x_i,y_i in zip(x,y))

# Takes 2 sequence objects in the form (lower_bound, upper_bound, offset, sequence_data)
# Returns the same data for the concatenation of the sequences.

def seq_concat(a,b):
    a_lower, a_upper, a_offset, a_seq_data = a
    b_lower, b_upper, b_offset, b_seq_data = b

    c_lower = root2_max(a_lower, root2_sub(b_lower, a_offset))
    c_upper = root2_min(a_upper, root2_sub(b_upper, a_offset))

    # sanity check - should never be concatenating sequences that can't go together
    assert root2_cmp(c_upper, c_lower) > 0

    c_offset = root2_add(a_offset, b_offset)

    c_sequence = seq_data_concat(a_seq_data, b_seq_data)

    return c_lower, c_upper, c_offset, c_sequence

# Takes 2 sequence data objects in the form (length, slope, y-intercept)
# Returns the same data for the concatenation of the sequences.
def seq_data_concat(a,b):
    a_len, a_slope, a_intercept = a
    b_len, b_slope, b_intercept = b

    c_len = a_len + b_len
    c_slope = a_slope + b_slope
    c_intercept = a_slope*b_len + a_intercept + b_intercept

    return (c_len, c_slope, c_intercept)

# Compute the sum for i from 1 to n of floor(i*sqrt(2))
def root2_floor_sum(n):
    seqs = [
        #  lower  upper  offset  seq_data (length, slope, y-intercept)
        ( (0,0), (2,-1), (-1,1), (1,1,1) ),
        ( (2,-1), (1,0), (-2,1), (1,2,2) )
    ]

    prev_seq = None
    cur_seq_data = (0,0,0)
    cur_offset = (0,0)

    # while current sequence length is below n
    while cur_seq_data[0] < n:
        # remove too-big sequences
        max_len = n - cur_seq_data[0]
        seqs = filter(lambda (lb,ub,off,seq_data): seq_data[0] <= max_len, seqs)

        matching_seqs = filter(lambda (lb,ub,off,seq_data): root2_cmp(cur_offset, lb) >= 0 and root2_cmp(cur_offset, ub) < 0, seqs)
        next_seq = max(matching_seqs, key=lambda (lb,ub,off,seq_data): seq_data[0])

        if prev_seq != None:
            seq_to_add = seq_concat(prev_seq, next_seq)
            seqs.append(seq_to_add)

        next_lb, next_ub, next_off, next_seq_data = next_seq

        cur_seq_data = seq_data_concat(cur_seq_data, next_seq_data)
        cur_offset = root2_add(cur_offset, next_off)

        prev_seq = next_seq

    return cur_seq_data[2]

def verify_up_to(n):
    # verify the algorithm works for small n
    expected = 0
    root2 = math.sqrt(2)
    for i in range(1,n+1):
        expected += int(math.floor(root2*i))
        assert root2_floor_sum(i) == expected

verify_up_to(1000)

start_time = time.time()
result = root2_floor_sum(10**100)
end_time = time.time()

print result
print 'computed in',int((end_time - start_time)*1000),'ms'

Here's the output I get for the above program:

70710678118654752440084436210484903928483593768847403658833986899536623923105351942519376716382078638821760123411090095254685423841027253480565451739737157454059823250037671948325191776995310741236436
computed in 889 ms

Explanation of the algorithm

First, consider the sequence of differences of successive terms in this sum $ a_n = \lfloor n\sqrt 2 \rfloor - \lfloor(n-1) \sqrt 2 \rfloor $.

This sequence is all $ 1 $s and $ 2 $s: $ 1,1,2,1,2,1,1,2,1, ... $

Given this, we can compute the sum in question with $ \sum_{i=1}^n i \cdot a_{n-i+1} $. But this is obviously too slow with $ n = 10^{100} $. So rather than computing the sequence directly, we want to compute some info about it that still lets us compute the sum.

So we will only keep track of the following for the sequence $ a $:

  • The length of the sequence $ len(a) $

  • The function $ f_a(k) = \sum_{i=1}^{len(a)} (i+k) \cdot a_{len(a)-i+1} $. Then $ f_a(0) $ is the sum described above. You can think of this as being the above sum, but with $ a $ having $ k $ zeros appended to it. Because $ f_a $ is a finite sum of linear functions, it is linear. Therefore it is uniquely defined by its slope, $ slope(a) $ and $y$-intercept, $ intercept(a) $

Now suppose we have the above info for sequences $ a $ and $ b $. We can compute the same info for the concatenation of these sequences, $ c $.

  • $ len(c) = len(a) + len(b) $

  • $ f_c(k) = f_a(k + len(b)) + f_b(k) $, giving us

    • $ slope(c) = slope(a) + slope(b) $
    • $ intercept(c) = slope(a) \cdot len(b) + intercept(a) + intercept(b) $

So this means that we can build the needed info for the whole sequence by combining info of smaller sequences.

Next, we need the fact that this sequence repeats itself a lot (I mean look at it).

Imagine a line drawn from $ (n,\sqrt n) $ to $ (m,\sqrt m) $ with $ m > n $. Between each time this passes through a vertical line, it passes through either $ 1 $ or $ 2 $ horizontal lines. These $ 1 $s and $2$s are exactly the sequence of differences of terms we defined above at indices $ n $ through $m$. Because of the irrationality of $ \sqrt 2 $, we can move this line around slightly and still get the same sequences. If we've computed the sequence up to some $ k $ and find that the fractional part of the $ k $th term in the original sum, $ frac(m*sqrt(2)) $ is close to the fractional part of the $ n $th term, we can translate the line above to $ (k,\sqrt k) $ to $ (k+m-n, \sqrt {k+m-n}) $ and this won't change the sequence associated with it.

So the algorithm works by building bigger and bigger lines described above by appending them together, remembering the info described above about the sequences associated with the lines (length, slope, intercept).

Related Question