Raymond Smullyan gave a very general formulation in terms of representation systems. They appear in his "Theory of Formal Systems", and in the first and last chapters of "Godel's Incompleteness Theorems". They generalise first- and higher-order systems of logic, type theories, and Post production systems.
A representation system consists of:
A countably infinite set $E$ of expressions.
A subset $S \subseteq E$, the set of sentences.
A subset $T \subseteq S$, the set of provable sentences.
A subset $R \subseteq S$, the set of refutable sentences.
A subset $P \subseteq E$, the set of (unary) predicates.
A function $\Phi : E \times \mathbb{N} \rightarrow E$ such that, whenever $H$ is a predicate, then $\Phi(H,n)$ is a sentence.
The system is complete iff every sentence is either provable or refutable. It is inconsistent iff some sentence is both provable and refutable.
We say a predicate $H$ represents the set $A \subseteq \mathbb{N}$ iff $A = \{ n : \Phi(H,n) \in T \}$.
Let $g$ be a bijection from $E$ to $\mathbb{N}$. We call $g(X)$ the Godel number of $X$.
We write $E_n$ for the expression with Godel number $n$.
Let $\overline{A} = \mathbb{N} \setminus A$ and $Q^* = \{ n : \Phi(E_n,n) \in Q \}$.
We have:
(Generalised Tarski Theorem) The set $\overline{T^*}$ is not representable.
(Generalised Godel Theorem) If $R^*$ is representable, then the system is either inconsistent or incomplete.
(Generalised Rosser Theorem) If some superset of $R^* $ disjoint from $T^*$ is representable, then the system is incomplete.
In case it's not clear: in a first-order system, we can take $P$ to be the set of formulas whose only free variable is $x_1$, and $\Phi(H,n) = [\overline{n}/x_1]H$.
Even if logic were extended to allow infinitely long proofs, your attempted proof of the countable axiom of choice would still have a gap or two. After the infinitely many steps asserting that there exists an $a_i$ in $A_i$ (one step for each $i$), you still need to justify the claim that there's a function assigning, to each $i$, the corresponding $a_i$. The immediate problem is that your infinitely many steps haven't exactly specified which (of the many possible) $a_i$'s are the corresponding ones; the $a_i$'s in your formulas are just bound variables. Worse, even if the meaning of "the corresponding $a_i$" were perfectly clear, so that there's no doubt about which ordered pairs $(i,a_i)$ you want to have in your choice function, you'd still need to prove that there is a set consisting of just those ordered pairs. No ZF axiom does that job. I think you'd need an infinitely long axiom saying "for all $x_1,x_2,\dots$, there exists a set whose members are exactly $x_1,x_2,\dots$."
If you're willing to accept not only proofs consisting of infinitely many statements but also single statements of infinite length, and if you're willing to add some such infinite statements as new axioms, then I think you can "prove" the countable axiom of choice (and fancier choice principles if you allow even longer new axioms). But, as long as you need to add some axioms to ZF for this purpose, it seems simpler to just add the countable axiom of choice. It's a finite statement, so you can reason with it using the usual rules of logic.
One could view the axiom of choice as a sort of finitary (and therefore usable) surrogate for the infinitely long axioms and proofs that would come up in your approach. In fact, some of Zermelo's later work (he introduced the axiom of choice in 1904, and the work I'm thinking of dates from the late 20's or early 30's) takes an "infinitary logic" approach to the foundations of set theory (and is, in my opinion, not entirely clear).
Best Answer
You've hit on an area of research that's picking up some momentum at the moment. It involves connections between proof theory, homotopy theory and higher categories. The idea is that a proof or deduction is something like a path (from the premiss to the conclusion), and when you "deform" one proof into another by a sequence of trivial steps, that's something like a homotopy between paths. Or, in the language of higher-dimensional categories, a deduction is a 1-morphism, and a deformation of deductions is a 2-morphism. You can keep going to higher deductions.
There are close connections with type theory too. If you have the right kind of background, the following papers might be helpful:
Awodey and Warren, Homotopy theoretic models of identity types, http://arxiv.org/abs/0709.0248
Van den Berg and Garner, Types are weak omega-groupoids, http://arxiv.org/abs/0812.0298