Distributing values into a histogram

combinatoricsdiscrete mathematicssequences-and-series

Let $b_1(n), b_2(n), \ldots, b_k(n)$ be the height of the $k$ bars of a fixed-width histogram with $n$ data points. Assume $k$ is constant, and $n\geq k$ is an integer which is increasing as we collect new data. Let $d_n$ be the value received at time $n$. I would like to calculate the height of $b_i(n)$ for each $i = 1, 2, \ldots, k$, with the data uniformly scaled to fit the width.

For example in case $k = 3$ and $n=6$, then each bar $b_i(6)$ will consist of an average of two data values: $$b_1(6) = \frac{1}{2}(d_1 + d_2), \ \ \ b_2(6) = \frac{1}{2}(d_3 + d_4),\ \ \text{and }\ b_3(6) = \frac{1}{2}(d_5 + d_6).$$

Continuing the example with $k=3$, if $n=7$, I think we should set:
$$b_1(7) = \frac{3}{7}d_1+\frac{3}{7}d_2 + \frac{1}{7}d_3 $$
$$b_2(7) = \frac{2}{7}d_3+\frac{3}{7}d_4 + \frac{2}{7}d_5 $$
$$b_3(7) = \frac{1}{7}d_5+\frac{3}{7}d_6 + \frac{3}{7}d_7 $$

The problem is to find a general formula to calculate $b_i(n+1)$ as a function of the constant $k$ and the values $n, d_1, \ldots d_{n+1}$.

Best Answer

We can follow a guide from an old joke, saying that in order to quickly count sheep in a field we have to count a number of their legs and divide it by four.

Assume that we have the values $c_1,\dots, c_{nk}$ such that $$c_1=c_2=\dots=c_k=d_1,$$ $$c_{k+1}=c_{k+2}=\dots=c_{2k}=d_2$$ $$\dots,$$ $$c_{(n-1)k+1}= c_{(n-1)k+2}=\dots=c_{nk}=d_n.$$

Then for each $i=1,2,\dots, k$ we have $$nb_i(n)=c_{(i-1)n+1}+c_{(i-1)n+2}+\dots+c_{in}.\label{1}\tag{1}$$

Since the sequence $\{c_t:1\le t\le nk\}$ consists of $n$ blocks of equals values of length $k$ each, we can calculate $b_i(n)$, in terms of $d_j$ as follows. The expression \eqref{1} starts from the last $k-r$ values $d_{j+1}$ of the $j+1$-th block for some $0<r<k$ and $0\le j$, then it has a contiguous sequence (possibly, empty) of complete blocks from $j+2$-th to $j’$-th for some $j’$, and finally it has $0<r’<k$ values $d_{j’+1}$ of the beginning of $j’+1$-th block. From the above we have $(i-1)n=jk+r$ and $in=j'k+r'$, so $$nb_i(n)=(k-r)d_{j+1}+k\sum_{\ell=j+2}^{j'} d_\ell+r'd_{j'+1}.$$

Related Question