We will assume that the $X_j$ are independent. This assumption is not automatically built into the definition of random sampling, but it is necessary if we are to give a complete answer.
If $F(x)$ is the cumulative distribution function for the population, and $F_n(x_1, x_2,\dots, x_n)$ is the joint sample (cumulative) distribution function, then, more or less as you wrote, we have
$$F_n(x_1,x_2,\dots,x_n)=P(X_1 \le x_1)P(X_2\le x_2)\cdots P(X_n \le x_n).$$
Please note that $F_n$ is a function of the $n$ real variables $x_1, x_2, \dots,x_n$. (No caps!) The reasoning that you used to get to this stage was correct.
We can therefore write
$$F_n(x_1,x_2,\dots,x_n)=F(x_1)F(x_2)\cdots F(x_n).$$
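The factorisation can be checked numerically in a small discrete case. The following sketch (my own illustrative choice, not part of the original question) uses three independent rolls of a fair die, computes the joint cdf by direct enumeration over the product pmf, and compares it with $F(x_1)F(x_2)F(x_3)$:

```python
from itertools import product

faces = range(1, 7)
pmf = {k: 1 / 6 for k in faces}          # P(X = k) for one fair die

def F(x):
    """Population cdf F(x) = P(X <= x)."""
    return sum(p for k, p in pmf.items() if k <= x)

def F3(x1, x2, x3):
    """Joint sample cdf, computed directly from the joint pmf of
    three independent rolls."""
    return sum(
        pmf[a] * pmf[b] * pmf[c]
        for a, b, c in product(faces, repeat=3)
        if a <= x1 and b <= x2 and c <= x3
    )

# The factorisation holds at every point, e.g.:
assert abs(F3(2, 4, 5) - F(2) * F(4) * F(5)) < 1e-12
```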
The final displayed expression in the post, namely $[P(X_1 \le x_1)]^n$, is not correct, and cannot be correct, for it does not mention the variables $x_2$ to $x_n$. In the form $[F(x)]^n$, it does occur in the calculation of the distribution of the largest sample value, but that is not the problem you were asked to look at.
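To see where $[F(x)]^n$ *does* belong, here is a short enumeration (again with hypothetical fair dice, $n=3$) confirming that the cdf of the sample maximum is $F(x)^n$:

```python
from itertools import product

faces = range(1, 7)
n = 3

def F(x):
    """Population cdf of one fair die."""
    return sum(1 for k in faces if k <= x) / 6

def F_max(x):
    """P(max(X_1, ..., X_n) <= x) by direct enumeration: the maximum
    is <= x exactly when every component is <= x."""
    return sum(1 for t in product(faces, repeat=n) if max(t) <= x) / 6 ** n

for x in faces:
    assert abs(F_max(x) - F(x) ** n) < 1e-12
```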
If we want the joint density function $f(x_1,x_2,\dots,x_n)$, we just multiply the individual density functions $f(x_j)$.
The (marginal) distribution of any $X_j$ is, by independence, the same as the population distribution. So if you want to specify the distribution by using a cdf, the answer would be simply $F(x_j)$. If we are in a continuous situation, and $F'(x)=f(x)$, then the (marginal) density function of $X_j$ is $f(x_j)$.
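For a concrete instance, take a hypothetical exponential population with rate $\lambda$, so that $f(x)=\lambda e^{-\lambda x}$ for $x>0$. Multiplying the individual densities gives the joint density of the sample:

```latex
f(x_1,\dots,x_n)=\prod_{j=1}^{n}\lambda e^{-\lambda x_j}
               =\lambda^{n} e^{-\lambda(x_1+\cdots+x_n)},
\qquad x_1,\dots,x_n>0.
```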
You have computed $H(X_1,\ldots,X_n)$ using the Markov chain. Note that for each realisation of $(X_1,\ldots,X_n)$ there is a unique realisation of $(Y_1,\ldots,Y_n)$: $Y_i=f(X_i,Y_{i-1})$ is a deterministic function. Determinism alone only gives $H(Y_1,\ldots,Y_n)\le H(X_1,\ldots,X_n)$; but since each $X_i$ can also be recovered from $(Y_i,Y_{i-1})$, the map between the two sequences is a bijection, so entropy is preserved and $$H(Y_1,\ldots,Y_n)=H(X_1,\ldots,X_n),$$ by induction.
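A small numerical check of this entropy-preservation claim, with an invertible update of my own choosing (the post does not fix a particular $f$): take $Y_i=(Y_{i-1}+X_i)\bmod 3$ with $Y_0=0$ and the $X_i$ i.i.d. but non-uniform, and compare the two joint entropies by enumeration:

```python
from itertools import product
from math import log2

px = {0: 0.5, 1: 0.3, 2: 0.2}    # pmf of each X_i (illustrative values)
n = 3

def entropy(dist):
    """Shannon entropy (bits) of a pmf given as a dict of probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

joint_x, joint_y = {}, {}
for xs in product(px, repeat=n):
    p = 1.0
    for x in xs:
        p *= px[x]                # independence: joint pmf is a product
    ys, y = [], 0
    for x in xs:
        y = (y + x) % 3           # Y_i = f(X_i, Y_{i-1}), invertible in X_i
        ys.append(y)
    joint_x[xs] = joint_x.get(xs, 0.0) + p
    joint_y[tuple(ys)] = joint_y.get(tuple(ys), 0.0) + p

# The deterministic, invertible map preserves the joint entropy:
assert abs(entropy(joint_x) - entropy(joint_y)) < 1e-9
```

Because $X_i=(Y_i-Y_{i-1})\bmod 3$ recovers the input, the map is a bijection on realisations, which is exactly what the equality of entropies requires.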
If $$P(X_1 = z_1, \ldots, X_j = z_j,\ldots, X_n = z_n)=P(Y_1 = z_1, \ldots, Y_j = z_j,\ldots, Y_n = z_n)$$ for all $z_1,\ldots,z_j,\ldots,z_n$ is what is meant by "equal joint distributions", then you might say the marginal probabilities are $$\displaystyle P(X_i=z_i) = \sum_{z_1}\cdots \sum_{z_j,\, j\not = i} \cdots \sum_{z_n} P(X_1 = z_1, \ldots, X_i = z_i,\ldots, X_j = z_j,\ldots, X_n = z_n)$$ and $$\displaystyle P(Y_i=z_i) = \sum_{z_1}\cdots \sum_{z_j,\, j\not = i} \cdots \sum_{z_n} P(Y_1 = z_1, \ldots, Y_i = z_i, \ldots, Y_j = z_j, \ldots, Y_n = z_n)$$ and these are clearly equal to each other, since they are sums of equal probabilities over the same indices.
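The marginalisation argument can be sketched in code. The joint pmf below is made up for illustration; since the hypothesis is that the two joint distributions are equal, the same table serves for both $(X_1,X_2)$ and $(Y_1,Y_2)$, and summing out the other index gives identical marginals:

```python
# One joint pmf over pairs, used for both (X1, X2) and (Y1, Y2)
# because the joint distributions are assumed equal (values invented).
joint_x = {(0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
           (1, 0): 0.15, (1, 1): 0.05, (1, 2): 0.10,
           (2, 0): 0.10, (2, 1): 0.10, (2, 2): 0.10}
joint_y = dict(joint_x)          # equal joint distributions

def marginal(joint_pmf, i):
    """P(component i = z), obtained by summing out the other indices."""
    out = {}
    for z_tuple, p in joint_pmf.items():
        out[z_tuple[i]] = out.get(z_tuple[i], 0.0) + p
    return out

# Equal joints give equal marginals, term by term:
mx, my = marginal(joint_x, 0), marginal(joint_y, 0)
assert all(abs(mx[z] - my[z]) < 1e-12 for z in mx)
```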