The surface can be any orientable surface whose boundary is the chosen loop.
For reference, Ampère's law with Maxwell's correction in integral form:
$$\oint_{\partial S} \mathbf{B} \cdot d\mathbf{l} = \mu_0 I_{enc} + \mu_0 \epsilon_0 \iint_S \frac{\partial\mathbf{E}}{\partial t} \cdot d\mathbf{a}$$
The value of the right-hand side is independent of the surface chosen. To see this, suppose two surfaces, $S_1$ and $S_2$, share the same boundary $\partial S$, and take the difference of the right-hand side evaluated on the two. Taken together, the two surfaces form a closed surface, so the difference has the form $\mu_0 I + \mu_0 \epsilon_0 \iint \partial\mathbf{E}/\partial t \cdot d\mathbf{a}$ evaluated on that closed surface, where $I$ is the net current flowing out through it. Using Gauss's law, the second term can be converted into $\mu_0 \partial Q/\partial t$, where $Q$ is the charge enclosed. But $\mu_0(I + \partial Q/\partial t) = 0$ by charge conservation.
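Written out as a sketch, the argument looks as follows. Here $S_1-S_2$ denotes the closed surface formed by the two surfaces (assumed to be oriented outward), and $I_\text{out}$ is my own label for the net current leaving it; neither symbol is used elsewhere in this answer.

$$
\begin{align}
\text{RHS}[S_1]-\text{RHS}[S_2]
&= \mu_0\oint_{S_1-S_2}\mathbf{J}\cdot d\mathbf{a}
+\mu_0\epsilon_0\frac{d}{dt}\oint_{S_1-S_2}\mathbf{E}\cdot d\mathbf{a}
\\ &= \mu_0 I_\text{out}+\mu_0\epsilon_0\frac{d}{dt}\frac{Q}{\epsilon_0}
= \mu_0\left(I_\text{out}+\frac{dQ}{dt}\right)=0.
\end{align}
$$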
Note that this depends crucially on the displacement current term. If it is omitted, then the right-hand side may differ for two choices of the surface, if charge is building up in the volume between them. In many textbooks, this discrepancy is used to motivate the presence of the displacement current term.
Ampère's law cannot really be proved in any meaningful fashion: it is the static version of the Ampère-Maxwell law, which is one of Maxwell's equations. Those are the heart of electrodynamics, and they are essentially postulated as they are, with validation for their form and content coming from the ultimate success of the theory.
However, there is some work to be done to connect Maxwell's electrodynamics to the 'naïve' electrodynamics of Coulomb, Faraday and Ampère. For example, it is an important exercise to prove the equivalence of Gauss' law for the electric field and Coulomb's force law, which guarantees that no information is lost in choosing the first one as the corresponding foundational corner of the theory. Similarly, one must prove that, in the magnetostatic régime, Ampère's law is equivalent to the Biot-Savart force law, and it is this proof which, in essence, you're asking about.
To make this goal explicit, one needs to prove that
$$
\oint_C \mathbf{B}\cdot\text d\mathbf{s}=\mu_0 I_\text{encl},
$$
where $I_\text{encl}$ is the current enclosed by the path $C$ and $\mathbf{B}$ is given by the Biot-Savart law from some specified set of currents. The problem with this is that this set of currents can be relatively hard to specify; in particular, they could be line, surface, or volume currents, and each of those requires a slightly different (but equivalent) treatment.
To keep things simple, I will work here with volume currents, i.e. some current density $\mathbf{J}$ throughout space. This can be reduced to surface or line currents by taking the appropriate limit of very high current densities over very small areas, so this is also quite general. In this setting, the Biot-Savart law can be expressed as
$$
\mathbf{B}(\mathbf{r})=\frac{\mu_0}{4\pi}\int
\frac{\mathbf{J}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^3}
\text d^3\mathbf{r}'.
$$
The proof then reduces to a calculation:
$$
\begin{align}
I_\text{encl}&\stackrel{?}{=}
\frac{1}{\mu_0}\oint_C \mathbf{B}(\mathbf{r})\cdot\text d\mathbf{s}
=
\frac{1}{4\pi}\oint_C \left(
\int
\frac{\mathbf{J}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^3}
\text d^3\mathbf{r}'
\right)\cdot\text d\mathbf{s}.
\end{align}
$$
The way to tackle this integral is to (1) change the line integral to a surface integral using Stokes' theorem, and then (2) change the order of integration, to perform the $\mathbf{r}$ integral first. Thus:
$$
\begin{align}
I_\text{encl}&\stackrel{?}{=}
\frac{1}{4\pi}\int_S \nabla_\mathbf{r}\times\left(
\int
\frac{\mathbf{J}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^3}
\text d^3\mathbf{r}'
\right)\cdot\text d\mathbf{A}
\\ & =
\frac{1}{4\pi}
\int \left[
\int_S \nabla_\mathbf{r} \times\left(
\frac{\mathbf{J}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^3}
\right)\cdot\text d\mathbf{A}
\right] \text d^3\mathbf{r}'.
\end{align}
$$
To calculate the curl, you can use a nifty trick: the fraction with the cross product can actually be expressed as a curl itself:
$$
\frac{\mathbf{J}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^3}
=\nabla_\mathbf{r}\times \frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}
$$
(which has a whole lot of physical content in itself: if you integrate with respect to $\mathbf{r}'$, it will return a vector potential $\mathbf{A}$ such that $\mathbf{B}=\nabla\times \mathbf{A}$). This is a direct application of a standard identity. If you now apply the double curl, you get
$$
\begin{align}
\nabla_\mathbf{r} \times\left(
\frac{\mathbf{J}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^3}
\right)
& =
\nabla_\mathbf{r} \times\left(
\nabla_\mathbf{r}\times \frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}
\right)
\\& =
\nabla_\mathbf{r} \left(\nabla_\mathbf{r} \cdot
\frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}
\right)
-\nabla_\mathbf{r}^2
\frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}.
\end{align}
$$
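For reference, here is a sketch of how the curl identity used in the first step comes about. I am assuming the "standard identity" is the product rule $\nabla\times(f\mathbf{c})=\nabla f\times\mathbf{c}$ for a vector $\mathbf{c}$ that does not depend on $\mathbf{r}$, which is the case for $\mathbf{J}(\mathbf{r}')$:

$$
\begin{align}
\nabla_\mathbf{r}\times\frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}
&= \left(\nabla_\mathbf{r}\frac{1}{|\mathbf{r}-\mathbf{r}'|}\right)\times\mathbf{J}(\mathbf{r}')
= -\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\times\mathbf{J}(\mathbf{r}')
\\ &= \frac{\mathbf{J}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|^3}.
\end{align}
$$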
These derivatives must be treated with care, because the Coulomb kernel being differentiated is singular at $\mathbf{r}=\mathbf{r}'$. The correct way to handle this is to take the derivative in the distributional sense:
$$
\nabla_\mathbf{r}^2 \frac{1}{|\mathbf{r}-\mathbf{r}'|}=-4\pi\delta(\mathbf{r}-\mathbf{r}').
$$
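The delta-function identity can be made plausible numerically. By the divergence theorem, the volume integral of $\nabla^2(1/|\mathbf{r}|)$ over a region equals the outward flux of $\nabla(1/|\mathbf{r}|)=-\mathbf{r}/|\mathbf{r}|^3$ through its boundary, so that flux should be close to $-4\pi$ when the region contains the singularity and close to zero otherwise. The following is only a rough sketch with a midpoint rule on the faces of a cube; the function names and the choice of a cube are mine, and the singularity is placed at the origin for convenience.

```python
import math


def grad_inv_r(x, y, z):
    """Gradient of 1/|r| away from the origin, i.e. -r/|r|^3."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-x / r3, -y / r3, -z / r3)


def flux_through_cube(center, half_width, n=100):
    """Outward flux of grad(1/|r|) through an axis-aligned cube (midpoint rule)."""
    h = 2.0 * half_width / n
    total = 0.0
    for axis in range(3):            # direction of the face normal
        for sign in (-1.0, 1.0):     # the two opposite faces along that axis
            for i in range(n):
                for j in range(n):
                    u = -half_width + (i + 0.5) * h
                    v = -half_width + (j + 0.5) * h
                    offset = [u, v]
                    offset.insert(axis, sign * half_width)
                    p = [center[k] + offset[k] for k in range(3)]
                    # Outward normal is sign * e_axis, area element is h*h.
                    total += sign * grad_inv_r(*p)[axis] * h * h
    return total


enclosing = flux_through_cube((0.0, 0.0, 0.0), 1.0)  # cube contains the origin
excluded = flux_through_cube((3.0, 0.0, 0.0), 1.0)   # cube misses the origin
print(enclosing, -4.0 * math.pi, excluded)
```

The first number should agree with $-4\pi$ to within the discretisation error of the midpoint rule, and the last should be close to zero, which is what one expects from a source term proportional to $\delta(\mathbf{r})$.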
With a bit of foresight, you can see that this is the term that matters, and that the other one must integrate out to zero. This is indeed the case; the simplest proof I can think of hinges on bouncing the derivatives over to $\mathbf{r}'$.
To do that, calculate the following derivative:
$$
\nabla_{\mathbf{r}'}\cdot\frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}
=
\frac{1}{|\mathbf{r}-\mathbf{r}'|}\nabla_{\mathbf{r}'}\cdot\mathbf{J}(\mathbf{r}')
+
\mathbf{J}(\mathbf{r}')\cdot\nabla_{\mathbf{r}'}\frac{1}{|\mathbf{r}-\mathbf{r}'|}
=
-\mathbf{J}(\mathbf{r}')\cdot\nabla_{\mathbf{r}}\frac{1}{|\mathbf{r}-\mathbf{r}'|},
$$
because the current is divergenceless in the magnetostatic régime, and you can swap differentiation with respect to $\mathbf{r}$ and $\mathbf{r}'$ on any function of the form $f(\mathbf{r}-\mathbf{r}')$ by simply changing the sign. Up to that sign, this is precisely the term I had before, since $\nabla_\mathbf{r}\cdot\left(\mathbf{J}(\mathbf{r}')/|\mathbf{r}-\mathbf{r}'|\right)=\mathbf{J}(\mathbf{r}')\cdot\nabla_\mathbf{r}\left(1/|\mathbf{r}-\mathbf{r}'|\right)$. I can therefore put this expression in, and after changing the order of integration again I get
$$
\int \left[
\int_S
\nabla_\mathbf{r} \left(\nabla_\mathbf{r} \cdot
\frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}
\right)
\cdot\text d\mathbf{A}
\right] \text d^3\mathbf{r}'
=
-\int_S
\nabla_\mathbf{r} \left(
\int
\nabla_{\mathbf{r}'}\cdot\frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}
\text d^3\mathbf{r}'
\right)
\cdot\text d\mathbf{A}.
$$
The volume integral over $\mathbf{r}'$ is now an integral of a divergence over all of space. That means I can apply the divergence theorem to get the flux of the integrand through a surface at infinity in $\mathbf{r}'$, and this will vanish if my source is confined to a finite volume or falls off fast enough at infinity.
With that term taken care of, all I have left is the important one:
$$
\begin{align}
I_\text{encl}&\stackrel{?}{=}
-\frac{1}{4\pi}
\int \left[
\int_S
\nabla_\mathbf{r}^2
\frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}
\cdot\text d\mathbf{A}
\right] \text d^3\mathbf{r}'
=
\int \left[
\int_S
\mathbf{J}(\mathbf{r}')\delta(\mathbf{r}-\mathbf{r}')
\cdot\text d\mathbf{A}
\right] \text d^3\mathbf{r}'.
\end{align}
$$
Finally, by changing the order of integration and killing off the volume $\mathbf{r}'$ integral with the delta function, you get
$$
\begin{align}
I_\text{encl}&\stackrel{?}{=}
\int_S
\mathbf{J}(\mathbf{r})
\cdot\text d\mathbf{A},
\end{align}
$$
and the integral on the right is precisely the definition of $I_\text{encl}$ for a volume current, so this completes the proof.
$$
\quad\tag{$\blacksquare$}
$$
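As a numerical sanity check of this result, here is a minimal sketch that computes $\mathbf{B}$ from the Biot-Savart law for a closed ring of current (so that the current is divergenceless) and then integrates $\mathbf{B}\cdot\text d\mathbf{s}$ around two circular paths, one threaded by the ring and one not. The geometry, the current of 2 A, and all function names are my own choices for illustration; the ring is treated as a line current, discretised into short segments.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T*m/A


def biot_savart_ring(point, current, n_src=200):
    """B at `point` from a unit-radius ring of current in the xy-plane."""
    x, y, z = point
    dphi = 2.0 * math.pi / n_src
    bx = by = bz = 0.0
    for k in range(n_src):
        phi = (k + 0.5) * dphi
        # Source position r' and line element dl' on the ring (z' = 0).
        sx, sy = math.cos(phi), math.sin(phi)
        dlx, dly = -math.sin(phi) * dphi, math.cos(phi) * dphi
        # Separation r - r'.
        rx, ry, rz = x - sx, y - sy, z
        inv_r3 = (rx * rx + ry * ry + rz * rz) ** -1.5
        # Cross product dl' x (r - r'), using dl'_z = 0.
        bx += dly * rz * inv_r3
        by += -dlx * rz * inv_r3
        bz += (dlx * ry - dly * rx) * inv_r3
    scale = MU0 * current / (4.0 * math.pi)
    return (scale * bx, scale * by, scale * bz)


def circulation(center_x, current, n_path=200):
    """Line integral of B around a unit circle in the xz-plane centred at (center_x, 0, 0)."""
    dt = 2.0 * math.pi / n_path
    total = 0.0
    for k in range(n_path):
        t = (k + 0.5) * dt
        point = (center_x + math.cos(t), 0.0, -math.sin(t))
        dsx, dsz = -math.sin(t) * dt, -math.cos(t) * dt  # path element (ds_y = 0)
        bx, _, bz = biot_savart_ring(point, current)
        total += bx * dsx + bz * dsz
    return total


current = 2.0
linked = circulation(1.0, current) / MU0    # path threaded once by the ring
unlinked = circulation(5.0, current) / MU0  # path well away from the ring
print(linked, unlinked)
```

With the orientations chosen here, the first value should come out close to the ring current and the second close to zero, in line with $\oint_C \mathbf{B}\cdot\text d\mathbf{s}=\mu_0 I_\text{encl}$. Reversing the direction of either the current or the path flips the sign of the first value.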
Here is an annotated picture of a butterfly net which shows that all you need is a loop and an open surface which adjoins the loop.
So the closed loop can be any shape you like, as can the open surface attached to it.
You can choose the shape of your surface to suit your problem. For the ideal capacitor, which has no edge effects, choose a surface part of which passes between the plates at right angles to the electric field. It makes the integration a lot easier.
This is because the electric field is present only between the plates, so if that part of the surface is at right angles to the electric field, and hence to its rate of change, the displacement current term becomes $\mu_0 \epsilon_0 \frac{dE}{dt}A$, where $A$ is the area of a capacitor plate.
Since $E=\frac{q}{\epsilon_0 A}$, this term equals $\mu_0 I$ because $\frac{dq}{dt}=I$, which makes it the "more familiar" right-hand side of Ampère's law.
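To put numbers on this, here is a small sketch comparing the right-hand side of the law for a surface cut by the wire with that for a surface passing between the plates. The plate area, final charge, time constant and the exponential charging curve are all made-up values chosen for illustration, and the rate of change of the field is estimated by a central difference rather than analytically.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
MU0 = 4e-7 * math.pi     # vacuum permeability in T*m/A

# Hypothetical ideal capacitor (no edge effects) charging through a resistor.
AREA = 1e-2    # plate area in m^2
Q_MAX = 1e-6   # final charge in C
TAU = 1e-3     # charging time constant in s


def charge(t):
    """Charge on the plates at time t."""
    return Q_MAX * (1.0 - math.exp(-t / TAU))


def wire_current(t):
    """Current in the wire, I = dq/dt, written out analytically."""
    return (Q_MAX / TAU) * math.exp(-t / TAU)


def e_field(t):
    """Uniform field between the plates, E = q / (eps0 * A)."""
    return charge(t) / (EPS0 * AREA)


def rhs_through_wire(t):
    """Surface cut by the wire: only the conduction current contributes."""
    return MU0 * wire_current(t)


def rhs_through_gap(t, h=1e-7):
    """Surface between the plates: only the displacement current contributes."""
    dE_dt = (e_field(t + h) - e_field(t - h)) / (2.0 * h)  # central difference
    return MU0 * EPS0 * dE_dt * AREA


t = 5e-4
print(rhs_through_wire(t), rhs_through_gap(t))
```

The two printed values should agree up to the small error of the finite-difference estimate, which is the statement that either surface gives the same right-hand side.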