This might seem a silly question, but I have Googled in vain for hours to find an answer, so here goes:
I have two variables measuring the same physical parameter. Let's call these variables A and B. A is a discrete variable that can go from 0 to 5, and B is a continuous variable that normally ranges from 0 to 1000. However, A is a very noisy variable.
Based on the data that I have, I want to estimate the range of values of B that corresponds to each value of A.
Example output:
╔═══╦══════════╗
║ A ║ B        ║
╠═══╬══════════╣
║ 0 ║ 0-50     ║
║ 1 ║ 50-200   ║
║ 2 ║ 200-500  ║
║ 3 ║ 500-750  ║
║ 4 ║ 750-800  ║
║ 5 ║ 800-1000 ║
╚═══╩══════════╝
How do I estimate these ranges? Any help would be greatly appreciated.
Best Answer
Your problem can be interpreted as a classification problem: you want a classifier that takes the form of intervals on $B$, one interval per value of $A$. To construct the intervals, you need the limits $\beta_k$ between adjacent values $k$ and $k+1$ of $A$, for $k=0,\dots,v-2$, where $v=6$ is the number of distinct values of $A$.
Each limit is the point where the two neighbouring conditional densities are equal, $f(b=\beta_k \mid a=k)=f(b=\beta_k \mid a=k+1)$, where $f(\cdot \mid \cdot)$ stands for the conditional probability density function of $B$ given $A$. These density functions can be approximated, e.g., by normal distributions fitted to the values of $B$ observed for each value of $A$.
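To make the construction concrete, here is a minimal sketch in Python (an illustration only, not the answer's own Matlab code). It assumes each conditional density is a normal distribution fitted with the standard library's `statistics.NormalDist`, and it finds each limit by equating the two log-densities, which gives a quadratic in $b$. The function names `boundary` and `estimate_boundaries`, the rule for picking between the two roots, and the sample data are all assumptions made for this example.

```python
import math
from statistics import NormalDist


def boundary(lower: NormalDist, upper: NormalDist) -> float:
    """Return a point where the two fitted normal densities are equal."""
    m1, s1 = lower.mean, lower.stdev
    m2, s2 = upper.mean, upper.stdev
    if math.isclose(s1, s2):
        # Equal spreads: the densities cross halfway between the means.
        return (m1 + m2) / 2
    # Equating the log-densities gives a*x^2 + b*x + c = 0.
    a = 1 / (2 * s2**2) - 1 / (2 * s1**2)
    b = m1 / s1**2 - m2 / s2**2
    c = m2**2 / (2 * s2**2) - m1**2 / (2 * s1**2) + math.log(s2 / s1)
    root = math.sqrt(max(b * b - 4 * a * c, 0.0))
    candidates = [(-b + root) / (2 * a), (-b - root) / (2 * a)]
    # Assumption: of the two crossings, keep the one nearest the
    # midpoint of the two means.
    mid = (m1 + m2) / 2
    return min(candidates, key=lambda x: abs(x - mid))


def estimate_boundaries(samples: dict[int, list[float]]) -> list[float]:
    """Fit one normal per value of A and return the limits between neighbours."""
    levels = sorted(samples)
    fits = [NormalDist.from_samples(samples[k]) for k in levels]
    return [boundary(lo, hi) for lo, hi in zip(fits, fits[1:])]


# Made-up readings of B grouped by the value of A (illustrative only).
example = {
    0: [5, 20, 31, 44],
    1: [60, 110, 150, 190],
    2: [220, 330, 410, 480],
}

if __name__ == "__main__":
    print(estimate_boundaries(example))
```

Each returned limit separates two adjacent values of A, so the interval for a given value runs from the limit below it to the limit above it. Equating the raw densities ignores how often each value of A occurs; if the values are far from equally frequent, weighting each density by its relative frequency before equating them may give better limits.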
Simple code in Matlab: