Remember that the word "capacitance" is just that, a word. The standard definition of capacitance given in introductory textbooks only applies when the two objects have charges $Q$ and $-Q$, in which case we define
$$C = \frac{Q}{V}$$
where $V$ is the potential difference between the objects. There is no inherently "correct" way to generalize this definition to unequal charges; there are just more and less useful ways. Furthermore, you never need the idea of capacitance. You can just solve for everything directly using Coulomb's and Gauss's laws. Capacitance is merely a definition, not a law.
That said, there can certainly be bad definitions. When you have unequal charges $Q_1$ and $Q_2$, it seems many people in this thread and others would like to define capacitance so that it stays the same as before, which they claim happens if you define $C = |Q_1 - Q_2| / (2 V)$. However, this claim is wrong. As a simple example, consider two distant spheres of different radii $r_i$. Their pairwise capacitance is certainly nonzero, but if you give the spheres the same charge, $Q_1 = Q_2 = Q$, then $V_1 \approx k Q_1 / r_1$ and $V_2 \approx k Q_2 / r_2$. The potential difference $V = V_1 - V_2$ is nonzero because the radii differ, while $Q_1 - Q_2 = 0$. As a result, their claimed definition of capacitance yields zero!
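To make the counterexample concrete, here is a minimal numerical sketch. The radii and charge are hypothetical values chosen only for illustration, the function names are my own, and the spheres are assumed far enough apart that each potential is just $kQ/r$.

```python
# Coulomb constant in N*m^2/C^2 (standard value)
K = 8.9875517923e9

def sphere_potential(charge, radius):
    """Potential of an isolated conducting sphere, V = kQ/r, with V = 0 at infinity."""
    return K * charge / radius

def claimed_capacitance(q1, q2, v1, v2):
    """The proposed definition C = |Q1 - Q2| / (2 |V1 - V2|)."""
    return abs(q1 - q2) / (2 * abs(v1 - v2))

# Hypothetical values: radii of 0.1 m and 0.2 m, charge of 1 nC
r1, r2 = 0.1, 0.2
q = 1e-9

# Equal charges: the potentials differ, but the numerator vanishes
v1_same = sphere_potential(q, r1)
v2_same = sphere_potential(q, r2)
c_same = claimed_capacitance(q, q, v1_same, v2_same)

# Opposite charges: the ordinary pairwise capacitance, which is nonzero
v1_opp = sphere_potential(q, r1)
v2_opp = sphere_potential(-q, r2)
c_opp = claimed_capacitance(q, -q, v1_opp, v2_opp)

print("equal charges:    V1 - V2 =", v1_same - v2_same, " C =", c_same)
print("opposite charges: V1 - V2 =", v1_opp - v2_opp, " C =", c_opp)
```

With equal charges the formula returns zero even though the potential difference is nonzero, whereas with opposite charges it reduces to the ordinary $Q/V$ and gives a nonzero value. So the proposed definition does not "stay the same" across charge configurations.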
The problem is that you can't summarize the response of two conductors to general charges with a single number; instead, you need three, which are roughly the original pairwise capacitance and the self-capacitances of each one alone. Only by using all three of these quantities can you compute the voltages. Similarly, for $n$ conductors you need to specify $\binom{n}{2} + n$ numbers.
In electrical engineering courses, this information is summarized in an elegant way. To motivate it, suppose you had $n$ conductors, which had charges $Q_i$. The charge on any one conductor affects the potentials and charge distributions on all of the other conductors in a complicated way. However, the equations governing the system are still linear, which means the superposition principle works. In particular, that means the potentials $V_i$ of the conductors in the general case can be found by adding together the potentials you get by only charging the first conductor, and then only charging the second conductor, and so on, assuming the potential is set to zero at infinity.
Therefore, the $Q_i$ and $V_i$ are always related by a linear transformation, and we define
$$Q_i = \sum_j C_{ij} V_j.$$
The matrix of $C_{ij}$ is called the Maxwell capacitance matrix. For example, in the special case where there are only two conductors with opposite charges, you can show that the "usual" simple definition of pairwise capacitance is related to these coefficients by
$$C = \frac{C_{11} C_{22} - C_{12}^2}{C_{11} + C_{22} + 2 C_{12}}.$$
Furthermore, it can be shown that $C_{ij} = C_{ji}$, and that the usual self-capacitances are the $C_{ii}$. For more about this, see section 3.6 of Purcell and Morin, Electricity and Magnetism.
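For anyone who wants to check the formula for $C$ above, here is a sketch of the algebra, assuming $C_{12} = C_{21}$ and charges $Q_1 = Q$, $Q_2 = -Q$. The defining relations become

$$Q = C_{11} V_1 + C_{12} V_2, \qquad -Q = C_{12} V_1 + C_{22} V_2.$$

Solving this pair of linear equations for the potentials gives

$$V_1 = \frac{(C_{22} + C_{12})\, Q}{C_{11} C_{22} - C_{12}^2}, \qquad V_2 = -\frac{(C_{11} + C_{12})\, Q}{C_{11} C_{22} - C_{12}^2},$$

so the potential difference is

$$V = V_1 - V_2 = \frac{(C_{11} + C_{22} + 2 C_{12})\, Q}{C_{11} C_{22} - C_{12}^2},$$

and $C = Q/V$ reproduces the expression above.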
Once again, you never need this idea to solve problems, because it is derived from more basic things, such as Gauss's law. But it's the most useful way to parametrize things in certain contexts.
You have already stated correctly that there is no potential difference between the plates and the battery terminals due to the (ideal) wire connections. Moving the plates doesn't change this observation. However, there are transient currents in the wires during the movement. For example, when you move the plates further apart, the capacitance decreases, and since the voltage is held fixed by the battery, the charge $Q = CV$ on the plates decreases as the separation grows. This means that a reverse current has to flow into the battery. If your wire has a finite resistance, this causes a voltage drop between the plates and the terminals during the movement.
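As a rough worked example, assume an ideal parallel-plate capacitor with plate area $A$ and time-dependent separation $d(t)$, held at battery voltage $V_0$ (these symbols are my own, not from the question). Then

$$Q(t) = \frac{\epsilon_0 A}{d(t)} V_0, \qquad I = \frac{dQ}{dt} = -\frac{\epsilon_0 A V_0}{d^2} \frac{dd}{dt}.$$

The current is negative while the separation is increasing, i.e. charge flows back into the battery. For a wire of resistance $R$, the voltage drop during the movement is then approximately $|I| R$, as long as this is small compared to $V_0$.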
Best Answer
1) I would not call this a capacitor. Your typical parallel plate capacitor has two charged plates kept at some potential difference (by being hooked up to opposite terminals of a battery, for example). These are just two charged plates that end up being connected, so the charges balance out on each side. In other words, I would not say you are "storing" charge here the way you would expect a capacitor to. You could still define a capacitance for the system, but it would not take the general form $C=\frac QV$, since we do not have a single $Q$ to reference.
In general, you can define a capacitance matrix $C_{ij}$ such that $$Q_1=C_{11}V_1+C_{12}V_2$$ $$Q_2=C_{21}V_1+C_{22}V_2$$
Of course, this is more useful when the potentials of the plates are given. However, there is also such a thing as an "elastance matrix" $P_{ij}$, which is the inverse of the capacitance matrix:
$$V_1=P_{11}Q_1+P_{12}Q_2$$ $$V_2=P_{21}Q_1+P_{22}Q_2$$
These matrices are symmetric so that $C_{12}=C_{21}$ and $P_{12}=P_{21}$. These terms are related to the mutual capacitance between the plates. The diagonal terms deal with the self capacitance.
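Here is a small sketch of going from a capacitance matrix to the elastance matrix and then to the potentials for given charges. The matrix entries and charges are hypothetical numbers picked for illustration (the off-diagonal entries are negative, as is usual for this form of the capacitance matrix), and the helper names are my own.

```python
def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """Multiply a matrix (nested lists) by a vector (list)."""
    return [sum(entry * x for entry, x in zip(row, v)) for row in m]

# Hypothetical capacitance matrix in farads (symmetric, C12 = C21)
C = [[3.0e-12, -1.0e-12],
     [-1.0e-12, 2.0e-12]]

# Elastance matrix P is the inverse of C
P = invert_2x2(C)

# Hypothetical charges in coulombs: the same charge on both conductors
Q = [1.0e-9, 1.0e-9]

# V_i = sum_j P_ij Q_j
V = mat_vec(P, Q)

# Round trip: Q_i = sum_j C_ij V_j should recover the charges
Q_check = mat_vec(C, V)

print("P =", P)
print("V =", V)
print("Q recovered =", Q_check)
```

Even with equal charges the two potentials come out different, which is why a single number $C$ is not enough to describe the system.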
2) Due to symmetry and the fact that we are dealing with perfect conductors, the charge on each plate must be equal
$$Q_1'=Q_2'=\frac{Q_1+Q_2}{2}$$
3) You can still figure out the potential difference between the two plates. If the plate separation is small, then between the plates we are looking at distances very close to the plates, so we can treat them as infinite planes of charge. Using Gauss's law, we get that $E_1=\frac{\sigma _1}{2\epsilon _0}$ and $E_2=\frac{\sigma _2}{2\epsilon _0}$. Therefore, in between the plates, the field is
$$E=E_1-E_2=\frac{\sigma _1-\sigma _2}{2\epsilon _0}$$
Therefore, the potential difference between the plates is just
$$V=Ed=\frac{\sigma _1-\sigma _2}{2\epsilon _0}d.$$
However, you cannot get the energy of the system from the energy density $u=\frac 12 \epsilon _0 E^2$ using only the field $E$ between the plates, because the field is not $0$ outside of the plates.
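To put the potential difference from 3) in terms of the charges, assume each plate has area $A$ and a uniform surface charge density $\sigma_i = Q_i / A$ (the symbol $A$ is my own addition). Then

$$V = \frac{(Q_1 - Q_2)\, d}{2 \epsilon_0 A}.$$

As a sanity check, $Q_1 = Q$ and $Q_2 = -Q$ gives $V = \frac{Q d}{\epsilon_0 A}$, the familiar parallel plate result with $C = \frac{\epsilon_0 A}{d}$, while equal charges $Q_1 = Q_2$ give $V = 0$, consistent with 2).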
4) There is no energy stored in the system, at least in the sense of energy typically stored in a typical capacitor. There is potential energy since the excess charges on each plate are interacting, but it would take no work to move one charge from one plate to the other since a perfect conductor is an equipotential surface. (Once you move that charge though, then moving another charge would require work, but this would involve some external force keeping the first charge in place, like from a battery, which then makes the system not an ideal conductor). Typically when you talk about energy being stored on a capacitor, you are talking about the energy needed to separate the charges and maintain that separation.
5) The electric field does work to move charges from one plate to the other. This is where the energy goes.
I am not an expert on capacitance, so anyone can correct my reasoning here if something is off. I think something we take for granted in relating the energy stored in the capacitor to the energy in the fields is that in the typical parallel plate capacitor the field is $0$ outside of the system, so that the potential difference, the energy, and the field are easy to relate. I think in your initial setup you have to be careful about whether you want to consider just the field between the plates or the overall field.