This is a continuation of my answer to your easier version of this question on MathOverflow. As I wrote out this argument, I realized that it's not quite as easy as I thought when I originally wrote my comment on the other post, and I can now only claim a weaker result, but it is still sufficient to answer this question.
In the MO answer, we already established that $\mathrm{Con}(\mathsf{PA})$ is a first-order sentence that $\mathsf{PA}(Q_{\mathrm{Brch}})$ entails but which is not entailed by $\mathsf{PA}$.
I will show that there is a c.e. extension $\mathsf{Z}_2^{US}$ of $\mathsf{Z}_2$ (in a bigger language) with the property that if $(M,S)$ is a model of $\mathsf{Z}_2^{US}$, then there is a (first-order) elementary extension $(M',S') \succeq (M,S)$ such that $M' \models \mathsf{PA}(Q_{\mathrm{Brch}})$. By Gödel, this establishes that $\mathsf{PA}(Q_{\mathrm{Brch}}) \not \models \mathrm{Con}(\mathsf{Z}_2^{US})$. (I will denote models of second-order arithmetic like $(M,S)$, with $M$ the numbers sort and $S$ the set sort.)
This argument is partially based on the proof of compactness for $\mathcal{L}(Q_{\mathrm{Brch}})$ in Section 3.1 of this paper of Mekler and Shelah. Although this argument is actually easier, because we can lean on the coding structure of arithmetic.
Let $\mathcal{L}^U$ be the language of second order arithmetic augmented with a single new unary predicate on the set sort. Let $\mathcal{L}^{US}$ be this language enriched with Skolem functions in both sorts. Let $\mathsf{Z}_2^{US}$ be the $\mathcal{L}^{US}$ theory which consists of
- $\mathsf{Z}_2$ with comprehension extended to all $\mathcal{L}^{US}$-formulas,
- the assertion that $U$ is an (internal) non-principal ultrafilter on the number sort (i.e., any set of the form $[a,\infty)$ satisfies $U$; if $X$ satisfies $U$, then any superset of $X$ satisfies $U$; and if $X$ and $Y$ satisfy $U$, then $X\cap Y$ satisfies $U$), and
- the assertion that the Skolem functions in $\mathcal{L}^{US}$ are Skolem functions.
By taking the standard model of $\mathsf{Z}_2$ and appending a non-principal ultrafilter and Skolem functions, we can see that $\mathsf{Z}_2^{US}$ is consistent. It is clearly a c.e. theory, and by Gödel we have that $\mathsf{Z}_2^{US}\not \vdash \mathrm{Con}(\mathsf{Z}_2^{US})$.
Note that for any $(M,S) \models \mathsf{Z}_2^{US}$, the predicate $U$ defines a complete type over $MS$ in the number sort. Call this type $p|_{MS}$. Note furthermore that if $a$ realizes this type, then every element of the Skolem hull of $MSa$ realizes an $MS$-definable type over $MS$.
Fix $(M,S)\models \mathsf{Z}_2^{US}$. Fix an uncountable regular cardinal $\kappa > |MS|$. Let $(M_0,S_0) = (M,S)$. For each positive ordinal $\alpha < \kappa$, given $(M_\beta,S_\beta)$ for $\beta < \alpha$,
- if $\alpha = \gamma +1$, then let $(M_\alpha,S_\alpha)$ be the Skolem hull of $M_\gamma S_\gamma a$, where $a$ is a realization of $p|_{M_\gamma S_\gamma}$ and
- if $\alpha$ is a limit ordinal, let $(M_\alpha,S_\alpha) = \bigcup_{\beta < \alpha} (M_\beta,S_\beta)$.
Finally, let $(M',S') = \bigcup_{\alpha < \kappa} (M_\alpha,S_\alpha)$. By a not-too-difficult induction argument, one can show that for any $\alpha$, and any $c \in (M',S')$, the type of $c$ over $M_\alpha S_\alpha$ is $M_\alpha S_\alpha$-definable. Note also that for each $\alpha < \kappa$, $|M_\alpha S_\alpha| < \kappa$.
Fact. For any formula $\varphi(x,y,\bar{z})$, there is a formula $\psi(x,y,\bar{z})$ such that for any $\bar{a}$, if $\varphi(x,y,\bar{a})$ defines a directed partial order on the number sort with no largest element, then $\psi(x,y,\bar{a})$ defines a function from the number sort to itself which is strictly order preserving (from the ordinary order to $\varphi(x,y,\bar{a})$) and which is cofinal in the partial order.
Proof. Easy induction argument. $\square$
This implies that in each $(M_{\alpha+1},S_{\alpha+1})$, for any directed $M_{\alpha}S_\alpha$-definable partial order $<$ on $M_\alpha$, there is some $a \in M_{\alpha+1}$ such is $<$-greater than every element of $M_{\alpha}$.
Lemma. If $(<_1,<_2,r)$ are a pair of $M'S'$-definable partial orders and an $M'S'$-definable function on the number sort that form a level tree and if $B \subseteq M'$ is a branch of this tree, then $B \in S'$ (in the sense that there is some element of $S'$ with the same members as $B$.
Proof. First note that since this is a level tree, definable initial segments of $B$ are cofinal in $B$ (consider sets of the form $x <_1 a$). For any $M_\alpha$, we can find $b \in B$ such that the rank $r(b)$ is greater than any of the ranks of elements of $M_\alpha \cap B$. This implies that $M_\alpha \cap B = M_\alpha \cap (-\infty, b]$ (since branches are linearly ordered). By one of the above comments, the type of $b$ over $M_\alpha S_\alpha$ is $M_\alpha S_\alpha$-definable. This implies that the set $M_\alpha \cap (-\infty,b]$ (and therefore $M_\alpha \cap B$) is actually definable in $(M_\alpha,S_\alpha)$. Since formulas only use finitely many parameters, this implies that for each limit ordinal $\alpha<\kappa$, there is a strictly smaller ordinal $\beta_\alpha$ such that $M_\alpha \cap B$ is definable in $M_\alpha$ with parameters from $M_{\beta_\alpha} S_{\beta_\alpha}$. By Fodor's lemma, there is a single $\delta <\kappa$ such that $M_\alpha \cap B$ is definable with parameters from $M_\delta S_\delta$ for a stationary (and therefore unbounded) set of $\alpha$'s. By the pigeonhole principle, there must be a single formula $\varphi(x,\bar{b})$ with $\bar{b} \in M_\delta S_\delta$ such that for an unbounded set of $\alpha$'s, $\varphi(M_\alpha, \bar{b}) = M_\alpha\cap B$. Therefore, by elementarity, $\varphi(x,\bar{b})$ actually defines $B$ in $(M',S')$. $\square$
For any formula $\varphi(x,\bar{y})$, we have that for any $\bar{b}$, there is a set $X \in S'$ such that $X = \varphi(M',\bar{b})$. (This is just comprehension.) This implies that given any $\mathcal{L}(Q_{\mathrm{Brch}})$ formula $\psi(\bar{x})$ and any $\bar{a} \in M'$, $M' \models \psi(\bar{a})$ if and only if $(M',S')$ satisfies the internal interpretation of it. Therefore, since no $M'S'$-definable subset of $M'$ is a cut, we have that $M' \models \mathsf{PA}(Q_{\mathrm{Brch}})$. If we start with a model $(M,S)$ of $\mathsf{Z}_2^{SU} + \neg \mathrm{Con}(\mathsf{Z}_2^{SU})$, then we get an $M' \models \mathsf{PA}(Q_{\mathrm{Brch}})$ which does not satisfy true arithmetic, establishing that $\mathcal{L}(Q_{\mathrm{Brch}})$ is $\mathsf{PA}$-intermediate.
I vaguely seem to recall that there is actually a forcing-style reduction of of 'second-order arithmetic + an ultrafilter predicate' to $\mathsf{Z}_2$ which gives a relative consistency argument, but that doesn't seem to be enough to get consistency of $\mathsf{Z}_2^{SU}$ from consistency of $\mathsf{Z}_2$.
Let $\sigma = \{P,(c_\alpha)_{\alpha<\omega_1},f\}$, and let $\sigma' = \{P,(c_\alpha)_{\alpha<\omega_1},(d_n)_{n<\omega}\}$. Here $P$ is a unary relation symbol, $f$ is a unary function symbol, and the $c_\alpha$ and $d_n$ are constant symbols. Then $\sigma\cap \sigma' = \{P,(c_\alpha)_{\alpha<\omega_1}\}$ Now define:
\begin{align*}\Psi &= \{P(c_\alpha)\land P(c_\beta)\land c_\alpha\neq c_\beta\mid \alpha<\beta<\omega_1\}\cup \{\exists x_1\dots\exists x_n(\bigwedge_{i=1}^n \lnot P(x_i)\land \bigwedge_{i\neq j} x_i\neq x_j)\mid n<\omega\}\\
\varphi &: (\forall x\, (P(x)\rightarrow \lnot P(f(x)))\land (\forall x\forall y\, (f(x) = f(y)\rightarrow x = y))\\
\varphi' &: \forall x\, (\lnot P(x)\rightarrow \bigvee_{n<\omega} (x = d_n))
\end{align*}
A model of $\Psi$ has $\omega_1$-many distinct elements named by constants and satisfying $P$ (as well as possibly other elements satisfying $P$), and infinitely many elements satisfying $\lnot P$. I claim that any two such models, say $M$ and $N$, are $\mathcal{L}_{\omega_1,\omega}(\sigma\cap \sigma')$-equivalent. Since any sentence of $\mathcal{L}_{\omega_1,\omega}$ only mentions countably many symbols, it suffices to show that for any countable signature $\sigma^*\subseteq (\sigma\cap \sigma')$, the reducts $M|_{\sigma^*}$ and $N|_{\sigma^*}$ are $\mathcal{L}_{\omega_1,\omega}(\sigma^*)$-equivalent. Now $M|_{\sigma^*}$ and $N|_{\sigma^*}$ consist of countably-many distinct elements named by constants and satisfying $P$, infinitely many other elements satisfying $P$, and infinitely many elements satisfying $\lnot P$. By the infinite Ehrenfeucht-Fraïssé game, $M|_{\sigma^*}$ and $N|_{\sigma^*}$ are $L_{\infty,\omega}(\sigma^*)$-equivalent.
A model of $\Psi\cup \{\varphi\}$ is a model $M$ of $\Psi$ together with an injective function $f\colon M\to M$ which maps $P$ into $\lnot P$. This is satisfiable, e.g. by taking $P$ to be $\omega_1$, with $c_\alpha = \alpha$, taking $\lnot P$ to be a disjoint set $X$ of size $\aleph_1$, and taking $f = g\cup g^{-1}$, where $g$ is a bijection $X\to \omega_1$.
A model of $\Psi\cup \{\varphi'\}$ is a model $M$ of $\Psi$ such that every element of $\lnot P$ is named by some constant $d_n$. This is satisfiable, e.g. by taking $P$ to be $\omega_1$, with $c_\alpha = \alpha$, taking $\lnot P$ to be a disjoint countably infinite set, each element of which is the interpretation of one of the constants $d_n$.
But $\Psi\cup \{\varphi,\varphi'\}$ is not satisfiable, because $\varphi$ forces $\lnot P$ to be uncountable, while $\varphi'$ forces $\lnot P$ to be countable.
Best Answer
It’s not symmetric. Here are two examples. The idea is to take $\Sigma,\phi$ to be $\Pi,\psi$ augmented with hard-to-compute but uniquely determined structure.
I’ll make use of results for $AC^0/\text{poly}$ circuits. A first-order formula can be implemented as $AC^0$ circuits by replacing quantifiers by conjunctions or disjunctions over the domain. Details are given in Immerman’s Descriptive Complexity Theorem 5.22 (and Meta-Proposition 11.19).
No use is made of the assumption that $\lambda$ is first-order.
First example:
$\Sigma,\phi$: a total order $<$ and a compatible BIT predicate, and a 1-ary relation $R_1$ that is an initial segment of $<$
$\Pi,\psi$: no axioms, just a 1-ary relation $S_1$
The BIT predicate can be axiomatized by specifying that the least element has all zero bits, and the bits of the successor of $n$ are calculated by flipping the least significant bits of $n$ up to and including the least significant zero.
There are $|X|+1$ models of each theory. $\langle \phi_1,\psi_1\rangle$ is definably solvable by taking $\theta_1(x)=R_1(x).$
Suppose that $\langle \psi,\phi\rangle$ were definably solvable.Then there would be quasipolynomial size constant depth circuits for computing the parity function, contradicting Håstad's switching lemma bound.
Fix $X$ and a $\Gamma$-structure satisfying $\lambda.$ Let $N_{X}$ be the set of $\lfloor n/2\rfloor$ possible values of $|R_1|$ that you get from inputs with $|S_1|$ odd. (More precisely: values of $|\theta_1^{\mathcal X}|$ where $\mathcal X$ is a $\Pi\cup\Gamma$ structure with the chosen $\Gamma$ structure as a reduct, and with $|S_1^{\mathcal X}|$ odd.) The circuit takes (the characteristic function of an interpretation of) $S_1$ as input, and can apply circuits for the $\theta_i$ formulas to compute $<,$ BIT, and $R_1$ as polynomial-size, constant depth subcircuits. It guesses the $<$-first $\log n$ elements, trying all $n^{\log n}$ options in parallel, verifying using circuits for $<.$ It then tests the BIT representation of the $<$-first element $x$ satisfying $\neg R_1(x)$ (if any) against each of the values in $N_X.$ The circuit outputs $1$ if any of these values is hit, or if $|X|\in N_X$ and $\forall x.R_1(x)$ holds.
Second example:
$\Sigma,\phi$: a total order, and a 1-ary relation $R_1,$ and a deterministic easily-verifiable derivation of the output of the function $h$ in Sampling Lower Bounds: Boolean Average-Case and Permutations, Emanuele Viola, Theorem 2, applied to the characteristic function of $R_1$
$\Pi,\psi$: a total order, and a 1-ary relation $S_1$
For example, the derivation could be an encoding of a complete record of every step taken by a fixed polynomial-time Turing machine computing $h,$ including the entire tape at each step as a polynomial-sized bitstring, together with basic arithmetic operations to make all this data verifiable by a first-order predicate.
There are $2^{|X|}$ models of each theory. $\langle \phi,\psi\rangle$ is definably solvable by taking $\theta_1(x)=R_1(x).$ Suppose that $\langle \psi,\phi\rangle$ were definably solvable. Then there would be polynomial size constant depth circuits giving the correct distribution on $(R_1,h(R_1))$ when given a uniformly chosen bitstring $S_1$ as input. Viola’s Theorem 2 says that such a small circuit cannot give anything like the correct distribution.