Obtain $p(x)$ given samples from $p(y\mid x)$ and $p(y)$

Assume both $p(y\mid x)$ and $p(y)$ are too complicated to admit closed forms, and that we can only draw samples from them. Is there any way to estimate $p(x)$, or to draw samples from it?

Tags: bayesian, conditional-probability, marginal-distributions, sampling
Related Solutions
As indicated in the earlier comments, once you have a sample from the joint distribution of $(X_1,X_2,X_3)$, $$(x_1^1,x_2^1,x_3^1),\ldots,(x_1^t,x_2^t,x_3^t)$$ the marginal sample $$(x_1^1,x_2^1),\ldots,(x_1^t,x_2^t)$$ is indeed a sample from the joint distribution of $(X_1,X_2)$, and you can simply ignore the simulated $x_3^j$'s. They can however be useful in Monte Carlo evaluations through a technique called Rao-Blackwellisation, since the average $$\frac{1}{t}\sum_{i=1}^t h(x_1^i,x_2^i)$$ is improved upon by the average $$\frac{1}{t}\sum_{i=1}^t \mathbb{E}[h(x_1^i,x_2^i)\mid x_3^i],$$ as the conditional expectation shares the same expectation as the original function but reduces the variance. See for instance this discussion on Cross Validated.
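As a toy illustration of this variance reduction (the hierarchical model below is an assumption chosen for demonstration, not part of the original answer): take $X_3\sim\mathcal N(0,1)$ and $X_1,X_2\mid X_3\sim\mathcal N(X_3,1)$ conditionally independent, with $h(x_1,x_2)=x_1x_2$, so that $\mathbb{E}[h(X_1,X_2)\mid X_3]=X_3^2$:

```r
set.seed(1)
t  <- 1e4
x3 <- rnorm(t)
x1 <- rnorm(t, mean = x3)
x2 <- rnorm(t, mean = x3)

crude <- mean(x1 * x2) # plain Monte Carlo average of h(x1, x2)
raobl <- mean(x3^2)    # average of E[h | x3]: same target, lower variance
c(crude, raobl)        # both estimate E[X1 X2] = E[X3^2] = 1
```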
To quote from Wikipedia:
The probability density function of the Rayleigh distribution is$$f(x;\sigma) = \frac{x}{\sigma^2} e^{-x^2/(2\sigma^2)}, \quad x \geq 0,$$where $\sigma$ is the scale parameter of the distribution. The cumulative distribution function is$$F(x;\sigma) = 1 - e^{-x^2/(2\sigma^2)}$$for $x \in (0,\infty).$
Therefore the density of $X_{(N)}$, the largest order statistic of a sample of $N$ iid Rayleigh variates, is $$f_{(N)}(x)=N\frac{x}{\sigma^2} e^{-x^2/(2\sigma^2)}\left[1 - e^{-x^2/(2\sigma^2)}\right]^{N-1},\quad x\ge 0.\tag{1}$$ Simulating from $f_{(N)}$ can proceed by
1. acceptance-rejection for (1), provided a tight upper bound can be found (see option 3.);
2. direct simulation of a sample of $N$ Rayleigh variates and extraction of the largest value;
3. expansion of the term $[1 - e^{-x^2/(2\sigma^2)}]^{N-1}$ in (1) into $$\sum_{i=0}^{N-1} (-1)^i{N-1 \choose i}e^{-ix^2/(2\sigma^2)}$$ and exploitation of the resulting density as a signed mixture, whose positive components can serve as proposals in an acceptance-rejection algorithm;
4. taking advantage of the representation of a Rayleigh random variate as$^*$ $$X_i=\sigma\sqrt{-2\log U_i},\qquad U_i\sim\mathcal U(0,1),$$ since it implies that the largest value $$X_{(N)}=\sigma\sqrt{-2\log U_{(1)}},\qquad U_{(1)}\sim\mathcal Be(1,N),$$ corresponds to the smallest Uniform variate, $U_{(1)}$, which is distributed from the $\mathcal Be(1,N)$ Beta distribution;
5. inverting the cdf of $X_{(N)}$, $$F_{(N)}(x)=\left[1 - e^{-x^2/(2\sigma^2)}\right]^{N},\quad x\ge 0,$$ into $$X_{(N)}=F_{(N)}^{-1}(U)=\sigma\sqrt{-2\log(1-U^{1/N})},\qquad U\sim\mathcal U(0,1).$$
Options 4 and 5 are equivalent and obviously the most efficient, since their computing time does not grow with $N$ and no rejection step is involved.
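As a sanity check before timing them, here is a short sketch (assuming $\sigma=1$ and reusing $N=339$ from the benchmark below; the variable names are mine) verifying that the Beta representation (option 4) and the inverse cdf (option 5) agree in distribution with a brute-force maximum of $N$ Rayleigh draws (option 2):

```r
set.seed(42)
N <- 339; sig <- 1; t <- 1e4

# option 2: maximum of N Rayleigh variates, via X = sigma * sqrt(-2 log U)
direct <- replicate(t, max(sig * sqrt(-2 * log(runif(N)))))
# option 4: Beta representation of the smallest Uniform variate
beta4  <- sig * sqrt(-2 * log(rbeta(t, 1, N)))
# option 5: inverse cdf of the largest order statistic
invcdf <- sig * sqrt(-2 * log(1 - runif(t)^(1 / N)))

# the three empirical quantile functions should (approximately) agree
qs <- c(.1, .25, .5, .75, .9)
rbind(direct = quantile(direct, qs),
      beta   = quantile(beta4, qs),
      invcdf = quantile(invcdf, qs))
```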
A quick benchmark, run with the `rbenchmark` package (whose output format matches the tables below),

```r
library(rbenchmark)
benchmark(
  "beta"={
    N=339
    F=0;for(t in 1:1e3)F=F+rbeta(1,1,N)
  },
  "inverse"={
    N=339;ooN=1/N
    F=0;for(t in 1:1e3)F=F+sqrt(-log(1-runif(1)**ooN))
    F=F*sqrt(2)
  },
  "inverse+"={
    N=339
    F=0;for(t in 1:1e3)F=F+sqrt(-log(1-exp(-rexp(1,N))))
    F=F*sqrt(2)
  },
  replications=1e3)
```
gives the Beta simulation as fastest (in R):
```
      test replications elapsed relative user.self sys.self
1     beta         1000   4.208     1.00     4.203    0.003
2  inverse         1000   5.091     1.21     5.092    0.000
3 inverse+         1000   5.304     1.26     5.306    0.000
```
However, a vectorised version,

```r
benchmark(
  "beta"={
    N=339
    mean(rbeta(1e3,1,N))
  },
  "inverse"={
    N=339;ooN=1/N
    mean(sqrt(-log(1-runif(1e3)**ooN)))*sqrt(2)
  },
  "inverse+"={
    N=339
    mean(sqrt(-log(1-exp(-rexp(1e3,N)))))*sqrt(2)
  },
  replications=1e3)
```
leads to the exact opposite ranking:
```
      test replications elapsed relative user.self sys.self
1     beta         1000   0.142    1.797     0.143        0
2  inverse         1000   0.081    1.025     0.081        0
3 inverse+         1000   0.079    1.000     0.079        0
```
$^*$ This representation follows from the interpretation of the standard Rayleigh distribution as the length of a bivariate standard Normal vector, $X=\sqrt{X_1^2+X_2^2}$, which, using the Box-Muller algorithm, can equally be written as $$X=\sqrt{-2\cos^2(2\pi V)\log(U)-2\sin^2(2\pi V)\log(U)}=\sqrt{-2\log(U)}.$$
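This identity is easy to check numerically; a minimal sketch (with $\sigma=1$; the two-sample Kolmogorov-Smirnov comparison is my choice of check):

```r
set.seed(7)
t <- 1e5
# length of a bivariate standard Normal vector
len <- sqrt(rnorm(t)^2 + rnorm(t)^2)
# the sqrt(-2 log U) representation
ray <- sqrt(-2 * log(runif(t)))
# the KS statistic should be close to 0 if the two distributions match
ks.test(len, ray)$statistic
```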
Best Answer
This problem is equivalent to solving a Fredholm integral equation of the first kind. That is, solving for $p(x)$ such that:
$$ p(y) = \int_{\text{supp}(X)} p(y\mid x)\, p(x)\, \text{d}x,\quad \forall y\in \text{supp}(Y) $$
In general, this is an ill-posed inverse problem, and thus it is challenging for both analytical and numerical approaches. As @Glen_b mentions, a characterization of the support of $X$ is needed, as some approaches start off with its discretization.
For instance, notice that:
$$ p(x) = \int p(x\mid y)\,p(y)\,\text{d}y = \int \frac{p(y\mid x)\, p(x)}{\int p(y\mid \tilde{x})\, p(\tilde{x})\,\text{d}\tilde{x}}\,p(y)\,\text{d}y $$
This motivates a simple fixed-point iteration to solve for $p(x)$ in the case of $X$ being discrete with support $\{c_1,\cdots, c_d\}$, $d\geq 2$, $p(y\mid x)$ being available for direct evaluation, and $y_1,\ldots,y_m$ being a sample from $p(y)$:
$$ p_{n}(x) = \frac{1}{m}\sum_{j=1}^m \frac{p(y_j\mid x)\, p_{n-1}(x)}{\sum_{i=1}^d p(y_j\mid c_i)\, p_{n-1}(c_i)},\quad \forall x\in\{c_1,\cdots, c_d\} $$
The motivation is that, for a sufficiently large $n$, $p_n(x)\approx p(x)$ on its support.
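A minimal R sketch of this iteration, under an assumed toy model (a three-point support for $X$ and Gaussian $p(y\mid x)$; these choices are mine for illustration, not part of the answer):

```r
set.seed(3)
cs <- c(-1, 0, 2)    # support {c_1, ..., c_d} of X
w  <- c(.2, .5, .3)  # true (unknown) probabilities p(c_i)
m  <- 1e4
# sample y_1, ..., y_m from p(y), here with p(y | x) = N(x, 1)
y  <- rnorm(m, mean = sample(cs, m, replace = TRUE, prob = w))

lik <- outer(y, cs, dnorm)              # m x d matrix of p(y_j | c_i)
p   <- rep(1 / length(cs), length(cs))  # uniform initialisation p_0
for (n in 1:200) {
  post <- t(t(lik) * p)        # numerators p(y_j | c_i) p_{n-1}(c_i)
  post <- post / rowSums(post) # normalise over the support of X
  p    <- colMeans(post)       # average over the y_j's
}
rbind(estimate = p, truth = w) # the iterates approach the true weights
```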
There are more refined approaches as well.