Assuming that the theoretical distributions of $x_1$ and $x_2$ are not known, a naive algorithm for determining independence would be as follows:
Define $x_{1,2}$ to be the set of all co-occurences of values from $x_1$ and $x_2$. For example, if $x_1 = { 1, 2, 2 }$ and $x_2 = { 3, 6, 5}$, the set of co-occurences would be $\{(1,3), (1, 6), (1, 5) , (2, 3), (2,6), (2,5), (2, 3), (2,6), (2,5))\}$.
- Estimate the probability density functions (PDF's) of $x_1$, $x_2$ and $x_{1,2}$, denoted as $P_{x_1}$, $P_{x_2}$ and $P_{x_{1,2}}$.
- Compute the mean-square error $y=sqrt(sum(P_{x_{1,2}}(y_1,y_2) - P_{x_1}(y_1) * P_{x_2}(y_2))^2)$, where $(y_1,y_2)$ takes the values of each pair in $x_{1,2}$.
- if $y$ is close to zero, it means that $x_1$ and $x_2$ are independent.
A simple way to estimate a PDF from a sample is to compute the sample's histogram and then to normalize it so that the integral of the PDF sums to 1. Practically, that means that you have to divide the bin counts of the histogram by the factor $h * sum(n)$ where $h$ is the bin width and $n$ is the histogram vector.
Note that step 3 of this algorithm requires the user to specify a threshold for deciding whether the signals are independent.
The integral2
function of the pracma package is a possibility:
library(pracma)
f <- function(x ,y) x + y
integral2(f, xmin = 0, xmax = 3, ymin = function(x) x, ymax = function(x) x+2)
If you are not allowed to use a package, you can nest the integrate
function:
inner <- function(x) integrate(function(y) x+y, lower = x, upper = x + 2)$value
integrate(Vectorize(inner), lower = 0, upper = 3)
Both methods give 24
, an approximate value of the double integral.
Finally, you can use the SimplicialCubature package after noticing that the region of integration can be split into two triangles (= simplicies). Moreover, the integrand is a polynomial function, and then SimplicialCubature offers the possibility to get the exact value of the integral.
library(SimplicialCubature)
S1 <- cbind(c(0,0), c(3,3), c(0,2)) # first triangle
S2 <- cbind(c(0,2), c(3,3), c(3,5)) # second triangle
S <- array(c(S1, S2), dim = c(2, 3, 2))
P <- definePoly(coef = c(1,1), k = cbind(c(1,0), c(0,1)))
printPoly(P) # x + y
integrateSimplexPolynomial(P, S)
# 24
Best Answer
Answered in comments:
As you mentioned, the joint pdf of two random variables is $f_{X,Y}(x,y)=f_X(x)f_Y(y)$ if they are independent. So if $X,Y\overset{iid}{\sim}\text{Geometric}(p)$ then $f_{X,Y}(x,y) = (1-p)^{x-1}p(1-p)^{y-1}p = (1-p)^{x+y-2}p^2$