Consider a quantum system with state (Hilbert) space $\mathcal H$. For simplicity, let the Hamiltonian $H$ of the system have discrete spectrum so that there exists a basis $|n\rangle$ with $n=0,1,2,\dots$ for the state space consisting of eigenvectors of the Hamiltonian. Let $\epsilon_n$ denote the energy corresponding to each eigenvector $|n\rangle$, namely
\begin{align}
H|n\rangle = \epsilon_n|n\rangle
\end{align}
Now, it may happen that one or more of the energy eigenvectors $|n\rangle$ have the same energy. In this case, we say that their corresponding shared energy eigenvalue is degenerate. It is therefore often convenient to have a the concept of the energy levels $E_j$ of the system which are simply defined as the sequence of distinct energy eigenvalues in the spectrum of the Hamiltonian. So, whereas one can have $\epsilon_n = \epsilon_m$ if $n\neq m$, one cannot have $E_n = E_m$ if $n\neq m$. Moreover, it is often convenient to label the energy levels in increasing index order so that $E_m < E_n$ whenever $m<n$.
The degeneracy $g_n$ of the energy level $E_n$ is defined as the number of distinct energy eigenvalues $\epsilon_m$ for which $\epsilon_m=E_n$. For simplicity, we assume that none of the levels is infinitely degenerate so that $g_n\geq 1$ is integer for all $n$.
The partition function of a system in the canonical ensemble is given by
\begin{align}
Z = \sum_n e^{-\beta\epsilon_n}
\end{align}
In other words, the sum is over the state labels, not over the energy levels. However, noting that whenever there is degeneracy, sum of the terms in the sum will be the same, we can rewrite the partition function as a sum over levels
\begin{align}
Z = \sum_{n} g_n e^{-\beta E_n}
\end{align}
The degeneracy factor is precisely what counts the number of terms in the sum that have the same energy.
As for a simple example, consider a system consisting of two, noninteracting one-dimensional quantum harmonic oscillators. The eigenstates of this system are $|n_1, n_2\rangle$ where $n_1,n_2 = 0, 1, 2, \dots$ and the corresponding energies are
\begin{align}
\epsilon_{n_1,n_2} = (n_1+n_2+1)\hbar\omega.
\end{align}
The canonical partition function is given by
\begin{align}
Z = \sum_{n_1,n_2=0}^\infty e^{-\beta \epsilon_{n_1, n_2}} = \underbrace{e^{-\beta\hbar\omega}}_{n_1=0,n_2=0} + \underbrace{e^{-\beta(2\hbar\omega)}}_{n_1=1,n_2=0} + \underbrace{e^{-\beta(2\hbar\omega)}}_{n_1=0,n_2=1} + \cdots
\end{align}
If you think about it for a moment, you'll notice that, in fact, the energy levels of this composite system are
\begin{align}
E_n = n\hbar\omega, \qquad n_1+n_2 = n
\end{align}
and that the degeneracy of the $n^\mathrm{th}$ energy level is
\begin{align}
g_n = n
\end{align}
so that the partition function can also be written in the form that uses energy levels and degeneracies as follows:
\begin{align}
Z = \sum_{n=1}^\infty g_n e^{-\beta E_n} = \underbrace{e^{-\beta\hbar\omega}}_{n=1} + \underbrace{2e^{-\beta(2\hbar\omega)}}_{n=2} + \underbrace{3e^{-\beta(3\hbar\omega)}}_{n=3} + \cdots
\end{align}
When considering a partition function of a system composed of several distinguishable subsystems you never add the separate partition functions up, and always multiply them.
The reason is that the partition function covers the possible states of a system, and when for a system composed of subsystem we can set subsystem $A$ to a certain state and then we have to cover all of the states of subsystem $B$. Then change the state of subsystem $A$ and again sum over all states of subsystem $B$. This is multiplication
$$ Z_{AB} = \sum_{A,B} e^{-\beta(E_A+E_B)} = \sum_{A} e^{-\beta E_A} \sum_B e^{-\beta E_B} = Z_A Z_B$$
and the generalization to more than two subsystems is immediate.
Note that for this to be valid the subsystems must be separate and distinguishable. If they are interacting, for example, then you might have $E_{AB} \neq E_A + E_B$. If the particles are identical and indistinguishable, then they cannot be separated into subsystems $A$ and $B$ to begin with.
By the way - this rule of multiplication of partition functions is valid for classical systems as well as quantum ones.
Best Answer
Maybe expanding the equation $Z=z^N$ explicitly will be helpful $$ \begin{aligned} Z&=z^N\\ &=(\sum_ie^{-\beta \epsilon_i})^N\\ &=(\sum_{i_1}e^{-\beta \epsilon_{i_1}})\cdot(\sum_{i_2}e^{-\beta \epsilon_{i_2}})\cdots(\sum_{i_N}e^{-\beta \epsilon_{i_N}})\\ &=\sum_{i_1,i_2,\cdots,i_N}e^{-\beta(\epsilon_{i_1}+\epsilon_{i_2}+\cdots+\epsilon_{i_N})} \end{aligned} $$ And now we can see $$ E_i = \epsilon_{i_1}+\epsilon_{i_2}+\cdots+\epsilon_{i_N} $$ So for the first question, $\epsilon$ is different from $E$.
For the second question, I think $$ \begin{aligned} U=\bar{E}&=\sum_iP_i(E_i)E_i\\ &=\sum_jP(E_j)E_j\\ &=\sum_j\frac{w(E_j)e^{-\beta E_j}}{Z}E_j \end{aligned} $$