First, for convenience rewrite $J(p,p')$ as
\begin{align}
J(p,p') &= -D(p\Vert p')+\sum_{i}p_{i}\sum_{j}Q_{ij}\log\frac{Q_{ij}}{q_{j}}+\sum_{j}q_{j}\log\frac{q_{j}}{q'_{j}} \\
&= I({p})-\left[D(p\Vert p')-D(q\Vert q')\right]\\
& = I(p) - \Delta D(p\Vert p')
\end{align}
where I have used your definition of $I(p)$, and defined
\begin{align}
\Delta D(p\Vert p')= D(p\Vert p')-D(q\Vert q')
\end{align}
$\Delta D(p\Vert p')$ reflects the "contraction" of Kullback-Leibler divergence between $p$ and $p'$ under the stochastic matrix $Q$. For any fixed $p'$, $\Delta D$ is convex in the first argument. To see why, consider any two distributions $p$ and $\bar{p}$, and define the convex mixture $p^\alpha = (1-\alpha) p + \alpha \bar{p}$. We will show convexity by demonstrating that the second derivative with respect to $\alpha$ is non-negative at $\alpha=0$, for every choice of $p$ and $\bar{p}$.
First, writing $q^{\alpha}_{j}=\sum_{i}p^{\alpha}_{i}Q_{ij}$ for the output distribution induced by $p^{\alpha}$, compute the first derivative w.r.t. $\alpha$ as
\begin{align*}
& {\textstyle \frac{d}{d\alpha}}\Delta D(p^{\alpha}\Vert p') ={\textstyle \frac{d}{d\alpha}}\left[\sum_{i}p_{i}^{\alpha}\log p_{i}^{\alpha}-\sum_{i}p_{i}^{\alpha}\sum_{j}Q_{ij}\log q_{j}^{\alpha}+\sum_{i}p_{i}^{\alpha}\sum_{j}Q_{ij}\log\frac{q_{j}'}{p'_{i}}\right]\\
& =\sum_{i}(\bar{p}_{i}-p_{i})\log p_{i}^{\alpha}-\sum_{i}(\bar{p}_{i}-p_{i})\sum_{j}Q_{ij}\log q_{j}^{\alpha}+\sum_{i}(\bar{p}_{i}-p_{i})\sum_{j}Q_{ij}\log\frac{q_{j}'}{p'_{i}}
\end{align*}
Then, we compute the second derivative at $\alpha=0$ as
\begin{align}
{\textstyle \frac{d^{2}}{d\alpha^{2}}}\Delta D(p^{\alpha}\Vert p')\vert_{\alpha=0}=\sum_{i}\frac{\left(\bar{p}_{i}-p_{i}\right)^{2}}{p_{i}}-\sum_{j}\frac{\left(\bar{q}_{j}-q_{j}\right)^{2}}{q_{j}}
\end{align}
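In more detail, the step from the first derivative to the second uses that the third term of the first derivative does not depend on $\alpha$, together with $\bar{q}_{j}-q_{j}=\sum_{i}(\bar{p}_{i}-p_{i})Q_{ij}$:
\begin{align*}
{\textstyle \frac{d}{d\alpha}}\sum_{i}(\bar{p}_{i}-p_{i})\log p_{i}^{\alpha} & =\sum_{i}\frac{\left(\bar{p}_{i}-p_{i}\right)^{2}}{p_{i}^{\alpha}},\\
{\textstyle \frac{d}{d\alpha}}\sum_{i}(\bar{p}_{i}-p_{i})\sum_{j}Q_{ij}\log q_{j}^{\alpha} & =\sum_{j}\frac{\bar{q}_{j}-q_{j}}{q_{j}^{\alpha}}\sum_{i}(\bar{p}_{i}-p_{i})Q_{ij}=\sum_{j}\frac{\left(\bar{q}_{j}-q_{j}\right)^{2}}{q_{j}^{\alpha}},
\end{align*}
and evaluating both expressions at $\alpha=0$ gives the result above.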
$\sum_{i}\frac{\left(\bar{p}_{i}-p_{i}\right)^{2}}{p_{i}}$ is the so-called $\chi^2$ divergence of $\bar{p}$ from $p$, and $\sum_{j}\frac{\left(\bar{q}_{j}-q_{j}\right)^{2}}{q_{j}}$ is the $\chi^2$ divergence of $\bar{q}$ from $q$, i.e., the same divergence after both distributions are passed through $Q$.
Note that the $\chi^2$ divergence is a special case of an $f$-divergence, and therefore obeys a data-processing inequality (see e.g. Liese and Vajda, IEEE Trans. on Info. Theory, 2006, Thm. 14): since $\bar{q}$ and $q$ are obtained from $\bar{p}$ and $p$ by applying $Q$, the second sum is at most the first. In particular, that means that ${\textstyle \frac{d^{2}}{d\alpha^{2}}}\Delta D(p^{\alpha}\Vert p')\vert_{\alpha=0} \ge 0$, which establishes convexity in the first argument.
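If you want to see this numerically, here is a small sketch (assuming numpy; the helper chi2 is defined only for this check) that samples random distributions and a random row-stochastic $Q$ and verifies that applying $Q$ does not increase the $\chi^2$ divergence:

```python
import numpy as np

# Check numerically that chi2(Q^T p_bar, Q^T p) <= chi2(p_bar, p)
# for random distributions and a random row-stochastic Q.
rng = np.random.default_rng(0)

def chi2(a, b):
    """Chi-squared divergence sum_i (a_i - b_i)^2 / b_i."""
    return np.sum((a - b) ** 2 / b)

for _ in range(1000):
    p = rng.dirichlet(np.ones(4))
    p_bar = rng.dirichlet(np.ones(4))
    Q = rng.dirichlet(np.ones(3), size=4)   # rows of Q are distributions over 3 outputs
    assert chi2(Q.T @ p_bar, Q.T @ p) <= chi2(p_bar, p) + 1e-12
print("chi-squared data-processing inequality held in all trials")
```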
At the same time, $\Delta D(p\Vert p')$ is not convex in the second argument. Consider $p = (0.5,0.5,0)$, $r=(0.5,0.25,0.25)$, and $Q = \left( \begin{smallmatrix} 0.95 & 0.05 \\ 1 & 0 \\ 0 & 1\end{smallmatrix} \right)$. Here is a plot of $\Delta D$ where the first argument is $p$ and the second argument is the convex mixture $\alpha p + (1-\alpha) r$, for different values of $\alpha$:
(see code at https://pastebin.com/q8XLnGK8)
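For a self-contained version of the same computation (a minimal sketch; the helper names kl and delta_D are mine, and the linked pastebin script may differ in details), one can evaluate $\Delta D$ along the mixture path and look at its second differences:

```python
import numpy as np
from scipy.special import rel_entr

# Counterexample set-up from the text above.
p = np.array([0.5, 0.5, 0.0])
r = np.array([0.5, 0.25, 0.25])
Q = np.array([[0.95, 0.05],
              [1.0,  0.0],
              [0.0,  1.0]])                     # row-stochastic channel

def kl(a, b):
    """KL divergence D(a||b); rel_entr uses the 0*log(0/0) = 0 convention."""
    return rel_entr(a, b).sum()

def delta_D(p, p_prime):
    """Delta D(p||p') = D(p||p') - D(Q^T p || Q^T p')."""
    return kl(p, p_prime) - kl(Q.T @ p, Q.T @ p_prime)

alphas = np.linspace(0.0, 1.0, 201)
vals = np.array([delta_D(p, a * p + (1 - a) * r) for a in alphas])

# A convex (concave) function has non-negative (non-positive) second differences;
# here both signs occur, so Delta D is neither in its second argument.
second_diffs = vals[2:] - 2 * vals[1:-1] + vals[:-2]
print(second_diffs.min(), second_diffs.max())
```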
By visual inspection of the plot (equivalently, from the sign change in the second differences computed above), it can be verified that $\Delta D(p\Vert p')$ is neither convex nor concave in its second argument.
Regarding your questions:
(1) $I({p})$ is known to be concave in $p$ (Theorem 2.7.4 in Cover and Thomas, 2006). As we've shown $\Delta D(p\Vert p')$ is convex in $p$, so $-\Delta D(p\Vert p')$ is concave in $p$. Since the sum of concave functions is concave, $J(p,p') = I(p) - \Delta D(p\Vert p')$ is concave in $p$.
At the same time, as a function of the second argument $p'$,
$J(p,p') = \mathrm{const} - \Delta D(p\Vert p')$, and we've shown above that $\Delta D(p\Vert p')$ is neither convex nor concave in the second argument. Thus, $J(p,p')$ is not concave in the second argument.
(2) $\Delta D(p\Vert p') \ge 0$ by the data-processing inequality for KL divergence (Csiszár and Körner, 2011, Lemma 3.11). That means that
\begin{align}
J(p,p') = I(p) - \Delta D(p\Vert p') \le I(p) ,
\end{align}
from which $J(p,p') \le \max_s I(s)$ follows immediately.
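As a quick numerical illustration (a sketch, assuming numpy/scipy; not a proof), the non-negativity of $\Delta D$ can be checked for random $p$, $p'$, and row-stochastic $Q$:

```python
import numpy as np
from scipy.special import rel_entr

# Check Delta D(p||p') >= 0 for random p, p' and a random row-stochastic Q --
# the data-processing inequality behind J(p,p') <= I(p).
rng = np.random.default_rng(1)

def kl(a, b):
    return rel_entr(a, b).sum()

for _ in range(1000):
    p = rng.dirichlet(np.ones(5))
    p_prime = rng.dirichlet(np.ones(5))
    Q = rng.dirichlet(np.ones(3), size=5)   # 5x3 row-stochastic matrix
    delta = kl(p, p_prime) - kl(Q.T @ p, Q.T @ p_prime)
    assert delta >= -1e-12
print("Delta D was non-negative in all trials")
```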
Best Answer
Entropy cannot decrease when a doubly stochastic matrix $A=(a_{ij})$ is applied, i.e., $H(p')\geq H(p)$ for $p'=Ap$:
$$\begin{align}
H(p') &= -\sum_i\left(\sum_j a_{ij}p_j\right)\log\left(\sum_k a_{ik}p_k\right)\\
&\geq -\sum_i\sum_j a_{ij}p_j\log p_j \quad\because\text{Jensen's inequality: } -x\log x \text{ is concave and } \textstyle\sum_j a_{ij}=1\\
&= -\sum_j\left(\sum_i a_{ij}\right)p_j\log p_j\\
&= H(p) \quad\because\text{doubly stochastic matrix: } \textstyle\sum_i a_{ij}=1
\end{align}$$
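As a quick numerical sanity check (a sketch, assuming numpy; building $A$ from permutation matrices is just one convenient way to sample a doubly stochastic matrix):

```python
import numpy as np

# Build a doubly stochastic A as a convex combination of permutation matrices
# and check H(Ap) >= H(p) for random p.
rng = np.random.default_rng(2)

def entropy(x):
    x = x[x > 0]
    return -np.sum(x * np.log(x))

n = 4
perms = [np.eye(n)[rng.permutation(n)] for _ in range(5)]   # permutation matrices
weights = rng.dirichlet(np.ones(5))
A = sum(w * P for w, P in zip(weights, perms))              # doubly stochastic by construction

for _ in range(1000):
    p = rng.dirichlet(np.ones(n))
    assert entropy(A @ p) >= entropy(p) - 1e-12
print("H(Ap) >= H(p) held in all trials")
```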