Solved – the best way to discretize a 1D continuous random variable

continuous data, cumulative distribution function, density function, discrete data

Say I have a 1-dimensional continuous random variable $X$, with PDF $f(X)$, CDF $F(X)$ and inverse CDF $F^{-1}$. What is the best way to discretize $X$?

To keep things clear, let $Y$ denote the discretized version of $X$, and let $g(Y)$, $G(Y)$ and $G^{-1}$ refer to the PMF, CDF and inverse CDF of $Y$.

Ideally, I would like for $E[X] = E[Y]$ and $Var(X) = Var(Y)$. However, I understand if these properties cannot hold for finite $N$ due to discretization error.

Some thoughts:

  • Should I just discretize $X$ into $N$ equally spaced points? Or should I place more points in regions of high probability, by first discretizing a Uniform $[0,1]$ into equidistant points $u_1, u_2, \dots, u_N$ and then generating discrete values $Y_1 = F^{-1}(u_1), Y_2 = F^{-1}(u_2), \dots, Y_N = F^{-1}(u_N)$?

  • Say I have discretized values $Y_1, Y_2, \dots, Y_N$. What is the best way to create $g(Y)$, $G(Y)$ and $G^{-1}$ from these values? My thoughts here are that each $Y_k$ corresponds to an interval of $X$ with width $w_k$, and that the $Y_k$ should be the "midpoints" of these intervals. That is, $g(Y_k) = P(Y_k - w_k/2 < X < Y_k + w_k/2) = F(Y_k + w_k/2) - F(Y_k - w_k/2)$.
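A minimal sketch of both ideas, using a standard normal as a stand-in for $X$ (the function names `quantile_discretize` and `interval_pmf` are my own, not from any library). The quantile scheme places $Y_k = F^{-1}\bigl((k - 0.5)/N\bigr)$, so each point carries mass exactly $1/N$; the second helper builds a PMF for arbitrary points from midpoint cell boundaries, as in the second bullet:

```python
import math
from statistics import NormalDist

def quantile_discretize(dist, n):
    """Place points at the quantiles u_k = (k + 0.5)/n of `dist`.
    Each point then represents an equal-probability cell of mass 1/n."""
    us = [(k + 0.5) / n for k in range(n)]
    ys = [dist.inv_cdf(u) for u in us]
    pmf = [1.0 / n] * n  # cell mass is (k+1)/n - k/n = 1/n by construction
    return ys, pmf

def interval_pmf(dist, ys):
    """PMF for arbitrary sorted points ys: cell k runs between the
    midpoints of adjacent points, so g(Y_k) = F(b_{k+1}) - F(b_k)."""
    b = ([-math.inf]
         + [(ys[k] + ys[k + 1]) / 2 for k in range(len(ys) - 1)]
         + [math.inf])
    return [dist.cdf(b[k + 1]) - dist.cdf(b[k]) for k in range(len(ys))]
```

Note that with quantile spacing the two constructions disagree: the quantile cells have equal mass, while the midpoint cells do not, which is part of what makes the question non-trivial.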

Best Answer

Hint: quantization might be a better keyword to search information.

Designing an "optimal" quantizer requires some criterion. Trying to preserve the first moments of the discretized variable sounds interesting, but I don't think it's very common.

More frequently (especially if we assume a probabilistic model, as you do) one tries to minimize some distortion: we want the discrete variable to be close to the real one, in some sense. If we stipulate minimum mean squared error (not always the best error measure, but the most tractable), the problem is well known: if we know the distribution of the source, we can build a non-uniform quantizer with minimum distortion for a given number of levels. This is essentially the Lloyd–Max quantizer.
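The Lloyd–Max iteration alternates two optimality conditions: cell boundaries sit midway between adjacent representation points, and each representation point is the conditional mean of its cell. A sketch for a standard normal source, using the closed form $\int_a^b x\,\phi(x)\,dx = \phi(a) - \phi(b)$ (the function name `lloyd_max_normal` is mine):

```python
import math

def phi(x):
    """Standard normal PDF; evaluates to 0.0 at +-inf."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal CDF; evaluates to 0/1 at -+inf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))

def lloyd_max_normal(n_levels, iters=100):
    """Lloyd-Max representation points for a standard normal source."""
    # initial guess: equally spaced points on [-2, 2]
    y = [-2 + 4 * (k + 0.5) / n_levels for k in range(n_levels)]
    for _ in range(iters):
        # condition 1: boundaries are midpoints of adjacent points
        b = ([-math.inf]
             + [(y[k] + y[k + 1]) / 2 for k in range(n_levels - 1)]
             + [math.inf])
        # condition 2: each point is the conditional mean of its cell,
        # E[X | a < X < c] = (phi(a) - phi(c)) / (Phi(c) - Phi(a))
        y = [(phi(b[k]) - phi(b[k + 1])) / (Phi(b[k + 1]) - Phi(b[k]))
             for k in range(n_levels)]
    return y
```

For two levels this converges to $\pm\sqrt{2/\pi} \approx \pm 0.798$, the conditional means of the two half-lines.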

Because a non-uniform quantizer (in 1D) is equivalent to a non-linear transformation followed by a uniform quantizer, such transformations ("companding"; in probabilistic terms, functions that turn the variable into a quasi-uniform one) are closely related to non-uniform quantization (sometimes the concepts are used interchangeably). A pair of venerable examples are the μ-law and A-law specifications for telephony.
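For illustration, the μ-law compressor and its inverse (ITU-T G.711 telephony uses $\mu = 255$); uniformly quantizing the compressed value yields a non-uniform quantizer with fine resolution near zero, where small amplitudes concentrate:

```python
import math

def mu_law_compress(x, mu=255.0):
    """mu-law compressor: maps [-1, 1] -> [-1, 1], expanding
    resolution near zero."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=255.0):
    """Inverse transformation (the expander)."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)
```

Compress, quantize uniformly, then expand: the composite is exactly a non-uniform quantizer of the original signal.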
