A kernel density estimate is a mixture distribution; for every observation, there's a kernel. If the kernel is a scaled density, this leads to a simple algorithm for sampling from the kernel density estimate:
repeat nsim times:
sample (with replacement) a random observation from the data
sample from the kernel, and add the previously sampled random observation
If (for example) you used a Gaussian kernel, your density estimate is a mixture of 100 normals, each centred at one of your sample points and all having standard deviation $h$ equal to the estimated bandwidth. To draw a sample you can just sample with replacement one of your sample points (say $x_i$) and then sample from a $N(\mu = x_i, \sigma = h)$. In R:
# Original distribution is exp(rate = 5)
N = 1000
x <- rexp(N, rate = 5)
hist(x, prob = TRUE)
lines(density(x))
# Store the bandwith of the estimated KDE
bw <- density(x)$bw
# Draw from the sample and then from the kernel
means <- sample(x, N, replace = TRUE)
hist(rnorm(N, mean = means, sd = bw), prob = TRUE)
Strictly speaking, given that the mixture's components are equally weighted, you could avoid the sampling with replacement part and simply draw a sample a size $M$ from each components of the mixture:
M = 10
hist(rnorm(N * M, mean = x, sd = bw))
If for some reason you can't draw from your kernel (ex. your kernel is not a density), you can try with importance sampling or MCMC. For example, using importance sampling:
# Draw from proposal distribution which is normal(mu, sd = 1)
sam <- rnorm(N, mean(x), 1)
# Weight the sample using ratio of target and proposal densities
w <- sapply(sam, function(input) sum(dnorm(input, mean = x, sd = bw)) /
dnorm(input, mean(x), 1))
# Resample according to the weights to obtain an un-weighted sample
finalSample <- sample(sam, N, replace = TRUE, prob = w)
hist(finalSample, prob = TRUE)
P.S. With my thanks to Glen_b who contributed to the answer.
Assuming that the only thing that you have is an empirical distribution, the simplest way to go is to draw values from $\hat F$ and reject if $X_1 \leq c_1$ and $X_2 \leq c_2$. Less naive implementation would be to subset $(X_1, X_2)$ values so to drop the values below threshold and draw from $\hat F_\text{trunc}$, the same way as you would do with any other discrete distribution. Simple example in R of such approach can be find below.
set.seed(123)
c1 <- -1
c2 <- 0.5
X <- data.frame(X1 = rnorm(100), X2 = rnorm(100)) # creating fake data
X_trunc <- subset(X, X1 > c1 & X2 > c2) # subset
# draw 1000 samples
X_trunc[sample(nrow(X_trunc), 1000, replace = TRUE), ]
The above example assumes that you have the full data, however if the only thing that you have is the empirical distribution tables with probabilities for $(x_1,x_2)$, than the procedure is the same but you draw the $(x_1,x_2)$ pairs with $\hat F(x_1,x_2)$ probabilities as in the example below.
library(dplyr)
# lets calculate the probabilities for x1,x2 pairs
FX <- group_by(X, X1, X2) %>%
summarise(n = n()) %>%
ungroup() %>%
mutate(prob = n/sum(n))
# next, we subset and sample as above but from F(X1,X2)
FX_trunc <- subset(FX, X1 > c1 & X2 > c2)
# notice that here we sample with parameter prob set to F(x1,x2)
FX_trunc[sample(nrow(FX_trunc), 1000, replace = TRUE, prob = FX_trunc$prob), ]
Drawing from bivariate distribution does not differ in here from drawing from univariate distribution, the values to be drawn are pairs, or more precisely indexes for those pairs.
Best Answer
If you want to sample (Age,Income) pairs, use any old sampling method directly on your joint distribution. Using numpy: