Additive non-sequential pdf sampling from a multidimensional distribution

probability-distributions, probability-theory, sampling, statistics

I've developed a method for sampling from a multidimensional distribution and want to ask the Mathematics Stack Exchange community two questions: (a) is it correct, and (b) if so, what is the best way to notate it?

For a 1-D probability distribution, drawing a sample is a straightforward inversion of the CDF $F(x)$:
$$X=F^{-1}(U), \qquad U \sim U[0,1]$$

where $F(x)=\int_{-\infty}^x f(u)\,du$ and $f(x)\,dx$ is the probability of $X$ being found between $x$ and $x+dx$.

This equation is deployed as follows:

  1. pick a random number between 0 and 1
  2. starting at negative infinity, take steps $dx$ in the $+x$ direction, adding increments of probability density $f(x)dx$ as you go. Once this sum reaches the random number drawn in step 1, the value of $x$ at which this occurs is your sample.
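The two steps above can be sketched in Python (a minimal illustration, assuming a standard normal target; the grid bounds and step size are illustrative choices, not part of the method):

```python
import math
import random

def normal_pdf(x):
    # standard normal density (assumed target, purely for illustration)
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def sample_by_sequential_cdf(pdf, lo=-6.0, hi=6.0, dx=2e-3, rng=random):
    """Step 1: draw u ~ U[0,1]. Step 2: walk left to right, accumulating
    pdf(x)*dx; the x at which the running sum first reaches u is the sample."""
    u = rng.random()
    total = 0.0
    x = lo
    while x < hi:
        total += pdf(x) * dx
        if total >= u:
            return x
        x += dx
    return hi  # u fell in the truncated far-right tail of the grid

random.seed(1)
xs = [sample_by_sequential_cdf(normal_pdf) for _ in range(400)]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
```

With a fine enough grid the empirical mean and variance come out close to 0 and 1, as expected for the standard normal.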

My question is about the validity of a similar method that can be used in multiple dimensions, where the concept of inverting $F(x)$ to produce a sample is not as straightforward.

The idea is to approximate a sample by selecting random points $x_j$ and accumulating the corresponding pdf increments $f(x_j)\,\Delta x$ until a target value $u_* \sim U[0,1]$ is exceeded. This is just like steps 1 and 2 above, except that instead of integrating sequentially, you add up pdf increments at randomly chosen points.

The key observations are that (a) the starting point of the integration doesn't matter, and (b) the integration/summation does not need to be performed sequentially.

I had previously posted (link below) about performing a sequential integral spiraling outward from the origin (one could also spiral inward, scan left to right and downward, etc.):

previous post of mine on spiral path sampling (since deleted, along with my Cross Validated account, due to moderation disputes)

The point that finally causes the sum to exceed the "step 1" $U[0,1]$ draw is taken as the sample. My attempt to notate this is as follows (improved notation is part of what I'm asking for):

$$X \approx x_n, \quad n = \min\left\{k : u_* < \sum_{j=1}^{k} f(x_j)\,\Delta x\right\}, \quad x_j \sim U[a,b],\ u_* \sim U[0,1]$$

where the hyper-rectangular region bounded by $a$ and $b$ contains the bulk of the pdf $f(x)$. Is there a name for this "non-sequential integral" alternative/method?

It seems to provide a relatively convenient (though computationally intensive) way to sample from a multidimensional distribution.
Please let me know if there is an easier or more correct way to express this idea (this is my first use of $\ni$ for "such that", for instance).
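Here is a minimal Python sketch of the non-sequential method for a 2-D target (a product of standard normals, the box $[a,b]$, and the volume increment are all assumed purely for illustration):

```python
import math
import random

def pdf2d(x, y):
    # product of two standard normals (illustrative target density)
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)

def draw_nonsequential(pdf, a=(-5.0, -5.0), b=(5.0, 5.0), dvol=0.1, rng=random):
    """Draw u ~ U[0,1], then accumulate pdf increments at uniformly chosen
    points in the box until the running sum exceeds u; the point that
    causes the crossing is returned as the sample."""
    u = rng.random()
    total = 0.0
    while True:
        x = rng.uniform(a[0], b[0])
        y = rng.uniform(a[1], b[1])
        total += pdf(x, y) * dvol
        if total > u:
            return x, y

random.seed(7)
pts = [draw_nonsequential(pdf2d) for _ in range(1000)]
mx = sum(p[0] for p in pts) / len(pts)
my = sum(p[1] for p in pts) / len(pts)
vx = sum((p[0] - mx) ** 2 for p in pts) / len(pts)
```

Intuitively, the increment that crosses the threshold is size-biased toward larger values of $f$, which is why the crossing point is (approximately) distributed according to $f$ when the increments are small; empirically the sample moments come out close to those of the target.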

1-d sampling spreadsheet example

JMP fit to sampled Normal 2 Mixture

Thanks in advance and comments are welcome.

Bonus question: what is the distribution of $x_{n-1}$?

VBA code example of 3-D sampling

Function drawabp(datarange As Range, _
                 astart As Double, aend As Double, _
                 bstart As Double, bend As Double, _
                 pstart As Double, pend As Double, _
                 dx As Double, mle As Double) As String

    Dim ptest As Double
    Dim atest As Double
    Dim btest As Double
    Dim zsum As Double
    Dim Ftarget As Double

    zsum = 0
    Ftarget = Rnd()    ' the U[0,1] target from step 1

    ' add pdf increments at uniformly chosen points until the target is exceeded
    While zsum < Ftarget
        ptest = (pend - pstart) * Rnd() + pstart
        atest = (aend - astart) * Rnd() + astart
        btest = (bend - bstart) * Rnd() + bstart
        zsum = zsum + Exp(lnlklhd(datarange, ptest, atest, btest) - mle) * dx
    Wend

    ' the point that caused the crossing is the sample
    drawabp = ptest & "," & atest & "," & btest
End Function

Example of 2-D sampling

Function drawabp(datarange As Range, _
                 astart As Double, aend As Double, _
                 bstart As Double, bend As Double, _
                 dx As Double, mle As Double) As String

    Dim atest As Double
    Dim btest As Double
    Dim zsum As Double
    Dim Ftarget As Double

    zsum = 0
    Ftarget = Rnd()    ' the U[0,1] target from step 1

    ' add pdf increments at uniformly chosen points until the target is exceeded
    While zsum < Ftarget
        atest = (aend - astart) * Rnd() + astart
        btest = (bend - bstart) * Rnd() + bstart
        zsum = zsum + Exp(lnlklhd(datarange, atest, btest) - mle) * dx
    Wend

    ' the point that caused the crossing is the sample
    drawabp = atest & "," & btest
End Function

Some notes on the method:

  1. It seems as though the pdf $f(x)$ doesn't necessarily need to be properly normalized, but the increments $f(x)\,\Delta x$ do need to be "small enough" (which I haven't quantified).
  2. The intuitive reason is that, since the addition of pdf voxels has no required starting point or required path, the only thing that matters is that the number of additions needed to sum to unity is "large enough" (again, not quantified).
  3. I would be interested in co-authoring a paper on this with anyone who can help me flesh out the ideas, solidify the notation, etc.
  4. This probably goes without saying, but you can set a limit on the number of pdf samples to add up, and if you exceed that limit you can simply abandon the sum and begin again. For example, in my spreadsheet example $n=35$: if after adding 35 pdf samples you still haven't reached the $U[0,1]$ target value, instead of adding a 36th you can simply discard the sum and begin again.
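The restart rule in note 4 can be sketched as a small wrapper (a 1-D Python illustration; the target pdf, box, step size, and cap are all assumed values):

```python
import math
import random

def pdf1d(x):
    # standard normal density (illustrative target)
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def draw_with_cap(pdf, a=-5.0, b=5.0, dx=0.02, max_terms=1000, rng=random):
    """Non-sequential accumulation with a cap: if max_terms increments
    fail to reach the U[0,1] target, discard the sum and start over."""
    while True:
        u = rng.random()
        total = 0.0
        for _ in range(max_terms):
            x = rng.uniform(a, b)
            total += pdf(x) * dx
            if total > u:
                return x
        # cap reached: abandon this attempt and redraw u

random.seed(3)
xs = [draw_with_cap(pdf1d) for _ in range(800)]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
```

Note that a restart redraws $u$ as well, so conditional on success the crossing point is still selected in proportion to $f$.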

Best Answer

Here is something you might want to do if you are provided with the function $F:\mathbb{R}^n\rightarrow [0,1]$ such that $F(\mathbf{x})=\mathbb{P}(X_1\leq x_1, \dots , X_n\leq x_n)$.

Observe that you can evaluate the marginal $F(x_1)=\mathbb{P}(X_1\leq x_1)=\lim_{x_2\to \infty}\lim_{x_3\to\infty}\dots\lim_{x_n\to\infty} F(\mathbf{x})$ and the conditional CDF $F(x_2, \dots , x_n \mid x_1) = \mathbb{P}(X_2\leq x_2,\dots, X_n\leq x_n\mid X_1\leq x_1)=F(\mathbf{x})/F(x_1)$. You can apply this strategy recursively to get all of $F(x_1)$, $F(x_2\mid x_1)$, ..., $F(x_n\mid x_1, \dots , x_{n-1})$, and observe that in the end $F(\mathbf{x})$ is the product of these. You can now generate $p_1\in [0,1]$, solve $F(x_1)=p_1$ as you did in 1-D, plug this value into $F(x_2\mid x_1)$, generate $p_2$ and solve for $x_2$, and so on.

Does that make sense?

I will make an EDIT later on how you can probably use Newton's method to solve the 1-D case faster; you can then apply it in the $n$-D case.

EDIT: To generate a sample using the CDF $F$ and pdf $f$, generate $p\in[0,1]$ uniformly, then use Newton's method to find the root of $F(x)-p$ (be careful about when it converges and about the speed of convergence). The derivative of this function with respect to $x$ is $f(x)$, so starting from an initial guess $x_0$ the update steps are $$ x_{n+1}=x_n-\frac{F(x_n)-p}{f(x_n)}, $$ and you can monitor convergence using $F(x_n)-p$. You should also know that if $f$ is not continuous, for example when the random variable is discrete, there may be problems; in that case you should take the smallest $x$ such that $F(x)\geq p$.
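A minimal Python sketch of this Newton inversion, assuming a standard normal target (the CDF is built from `math.erf`; the tolerance and iteration cap are illustrative):

```python
import math

def norm_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    # its derivative, the standard normal density
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def invert_cdf_newton(p, x0=0.0, tol=1e-10, max_iter=100):
    """Solve F(x) = p by Newton's method: x <- x - (F(x) - p) / f(x)."""
    x = x0
    for _ in range(max_iter):
        err = norm_cdf(x) - p
        if abs(err) < tol:
            return x
        x = x - err / norm_pdf(x)
    return x

q = invert_cdf_newton(0.975)  # about 1.95996 for the standard normal
```

For moderate $p$ this converges in a handful of iterations; in the far tails the first Newton step from $x_0=0$ can be large, so a better starting guess helps.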

I would suggest that you take a close look at this; your method may still be better than this one in certain cases.
