If the Hamiltonian is time-dependent, a pure state evolves as
$$
|\psi(t)\rangle = \mathcal{T} \exp\left[ -\frac{i}{\hbar} \int_{t_0}^t d\tau\, H(\tau)\right] |\psi(t_0)\rangle = U(t; t_0) |\psi(t_0)\rangle
$$
where $\mathcal T$ is the time-ordering operator and the propagator $U(t; t_0)$ satisfies
$$
i\hbar\,\frac{\partial}{\partial t} U(t; t_0) = H(t)\, U(t; t_0), \qquad U(t_0; t_0) = 1
$$
Then a density matrix evolves according to
$$
\rho(t) = U(t; t_0)\, \rho(t_0)\, U^\dagger (t; t_0)
$$
and you can check that taking the time derivative gives
$$
i\hbar\dot{\rho} = \left[ H(t), \rho(t)\right]
$$
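As a sketch of that check, write $\rho_0$ for the initial density matrix (a shorthand introduced here) and use the adjoint of the propagator equation, $-i\hbar\,\partial_t U^\dagger = U^\dagger H(t)$, which assumes $H(t)$ is Hermitian. The product rule then gives
$$
i\hbar\dot{\rho} = \left(i\hbar\,\partial_t U\right)\rho_0 U^\dagger + U \rho_0 \left(i\hbar\,\partial_t U^\dagger\right) = H(t)\, U \rho_0 U^\dagger - U \rho_0 U^\dagger H(t) = \left[ H(t), \rho(t)\right]
$$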
The density operator is Hermitian. This means that you can find an orthonormal basis $|\phi_i\rangle$ of eigenvectors for it, i.e. there are real numbers $p_i$ such that
$$\rho |\phi_i\rangle = p_i |\phi_i\rangle.$$
Now, if $f(x)$ is an ordinary function and $A$ is a Hermitian operator with an orthonormal basis of eigenvectors $|a_i\rangle$ and eigenvalues $a_i$, then we define $f(A)$ on this basis by
$$f(A)|a_i\rangle=f(a_i)|a_i\rangle,$$
which in turn defines $f(A)$ on the whole Hilbert space, since the $|a_i\rangle$ form a basis.
This is how $f(\rho)=\rho \ln \rho$ is defined. In the eigenbasis of $\rho$ we have (with the usual convention $0 \ln 0 = 0$ for vanishing eigenvalues)
$$f(\rho)|\phi_i\rangle=p_i \ln p_i |\phi_i\rangle.$$
Now remember that you can compute the trace in any basis you want. We compute it in this basis, where $f(\rho)$ has matrix elements
$$\langle \phi_j |f(\rho)|\phi_i\rangle = p_i \ln p_i \delta_{ij}.$$
Thus the trace is exactly
$$S(\rho)=-\operatorname{Tr}(\rho\ln \rho)=-\sum_{i}p_i \ln p_i.$$
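The recipe above (diagonalize $\rho$, apply $f$ to the eigenvalues, sum) can be sketched numerically. This is a minimal illustration using NumPy; the function name `von_neumann_entropy`, the cutoff `tol`, and the two example states are assumptions made here, not part of the original argument.

```python
import numpy as np

def von_neumann_entropy(rho, tol=1e-12):
    """Return S = -sum_i p_i ln p_i from the eigenvalues p_i of rho."""
    # eigvalsh is meant for Hermitian matrices and returns real eigenvalues
    p = np.linalg.eigvalsh(rho)
    # Drop (numerically) zero eigenvalues: convention 0 * ln 0 = 0
    p = p[p > tol]
    # Adding 0.0 turns a possible -0.0 into 0.0
    return float(-np.sum(p * np.log(p))) + 0.0

# Maximally mixed qubit: p = (1/2, 1/2), so S = ln 2
rho_mixed = np.eye(2) / 2
# Pure state |0><0|: p = (1, 0), so S = 0
rho_pure = np.array([[1.0, 0.0], [0.0, 0.0]])

S_mixed = von_neumann_entropy(rho_mixed)
S_pure = von_neumann_entropy(rho_pure)
print(S_mixed, S_pure)
```

The trace never has to be taken explicitly: in the eigenbasis it reduces to the sum over eigenvalues, which is all the function computes.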
Now, as a remark, in my opinion things can be thought of the other way around. This latter expression for the entropy was known before QM, with the $p_i$ being the probabilities of the microstates. To generalize to quantum mechanics, remember that $\rho$ represents an ensemble, so that when you have a mixed state, you don't actually know the microstate. On the other hand, $\rho$ encodes the probabilities of the microstates as the $p_i$ above.
Thus one could start with the previous knowledge of what $S$ should be in terms of these $p_i$ and arrive at a general expression involving just $\rho$. In simple terms: $S(\rho)$ is defined to yield this result.
Best Answer
Hint: Use the spectral decomposition to write
$$\rho(0) := \sum\limits_k \lambda_k \,|k\rangle \langle k| \tag{1}$$
and then find an expression for $\rho(t)$ in terms of the $\lambda_k$. In particular, note that $\rho(t)$ has the same eigenvalues as $\rho(0)$. Finally, again using the spectral theorem, derive that
$$ S[\rho(t)] = -\mathrm{Tr} \sum\limits_k \lambda_k \ln \lambda_k \, U(t)|k\rangle\langle k| U^\dagger(t) \tag{2} $$
The cyclic property of the trace then yields the desired result, i.e. $S[\rho(t)]=S[\rho(0)]$.
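The hint can also be checked numerically. The sketch below assumes $\hbar = 1$, a hypothetical two-level Hamiltonian $H(t) = \sigma_z + \cos(t)\,\sigma_x$, and an arbitrary mixed initial state. It approximates the time-ordered exponential by a product of short-time propagators, with later times multiplied on the left. All names (`hamiltonian`, `propagator`, `entropy`) are illustrative choices, not from the original answer.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def hamiltonian(t):
    # Hypothetical time-dependent Hamiltonian (hbar = 1), chosen for illustration
    return sz + np.cos(t) * sx

def step_propagator(H, dt):
    # exp(-i H dt) for Hermitian H via its eigendecomposition H = V diag(w) V^dagger
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * dt)) @ V.conj().T

def propagator(t0, t, n_steps=2000):
    # Time-ordered product: each later step acts from the left
    dt = (t - t0) / n_steps
    U = np.eye(2, dtype=complex)
    for k in range(n_steps):
        t_mid = t0 + (k + 0.5) * dt  # midpoint of the k-th time slice
        U = step_propagator(hamiltonian(t_mid), dt) @ U
    return U

def entropy(rho, tol=1e-12):
    # S = -sum_k lambda_k ln lambda_k over the nonzero eigenvalues of rho
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > tol]
    return float(-np.sum(lam * np.log(lam)))

# Arbitrary mixed initial state: Hermitian, unit trace, positive eigenvalues
rho0 = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)

U = propagator(0.0, 3.0)
rho_t = U @ rho0 @ U.conj().T

S0, St = entropy(rho0), entropy(rho_t)
print(S0, St)  # the two values should agree up to rounding error
```

Each short-time factor is unitary, so the product is unitary whatever the step size. The discretization only affects how well `U` approximates the true propagator, not whether the eigenvalues of $\rho$, and hence the entropy, are preserved.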