There are four definitions which are classically equivalent to being finite. Let $[n]$ be the set $\{0, 1, ..., n - 1\} = \{m \in \mathbb{N} \mid m < n\}$.
Definition 1: A set $S$ is finite if and only if there exists some $n \in \mathbb{N}$ and a bijection $S \cong [n]$.
Definition 2: A set $S$ is subfinite if and only if there is some $n \in \mathbb{N}$ and some injection $S \to [n]$.
Definition 3: A set $S$ is finitely enumerable if and only if there is some $n \in \mathbb{N}$ and some surjection $[n] \to S$.
Definition 4: A set $S$ is sub-finitely enumerable if and only if there is some finitely enumerable $S'$ and some injection $S \to S'$.
Definition 5: A set $S$ is subfinitely enumerable if and only if there is some subfinite set $S'$ and some surjection $S' \to S$.
Note that all these definitions are bijection-invariant. That is, if $A \cong B$ (that is, if there is a bijection $A \to B$), then $A$ satisfies definition $i$ if and only if $B$ does.
You may have noticed that definitions 4 and 5 are similarly named. This is because
Thm. $S$ is sub-finitely enumerable (definition 4) if and only if $S$ is subfinitely enumerable (definition 5).
Proof: Suppose that $S$ is sub-finitely enumerable (definition 4). Without loss of generality, suppose $S \subseteq K$ where $K$ is finitely enumerable. Take a surjection $f : [n] \to K$. Consider $S' = f^{-1}(S) \subseteq [n]$, which is clearly subfinite. Then $f|_{S'} : S' \to S$ is surjective. Therefore, $S$ is subfinitely enumerable (definition 5).
Now suppose that $S$ is subfinitely enumerable (definition 5). Without loss of generality, let $K \subseteq [n]$ and $f : K \to S$ be a surjection. Define $\sim \subseteq [n]^2$ by $a \sim b$ if and only if either $a = b$ or $a, b \in K$ and $f(a) = f(b)$. Consider the quotient set and map $\pi : [n] \to [n] / \sim$. Note that $\pi$ is surjective, demonstrating that $[n] / \sim$ is finitely enumerable.
Define $g : S \to [n] / \sim$ by $g(f(x)) = \pi(x)$. Note that $g$ is well-defined, since if $f(x) = f(y)$ then $x \sim y$ and thus $\pi(x) = \pi(y)$, so $g(f(x)) = \pi(x) = \pi(y) = g(f(y))$. And note that since $f$ is surjective, it's defined on all of $S$.
I claim that $g$ is an injection. For suppose that $g(a) = g(b)$. Write $a = f(x)$, $b = f(y)$. Then $\pi(x) = \pi(y)$. Then $x \sim y$. Then $f(x) = f(y)$. Then $a = b$.
Therefore, $S$ is sub-finitely enumerable (definition 4). $\square$
From here on out, we use sub-finitely enumerable and subfinitely enumerable interchangably.
We see that subfinitely enumerable is the definition you're looking for.
If $S$ is subfinitely enumerable, then we have an injection $f : S \to K$ where $K$ is finitely enumerable. If we have $S' \subseteq S$, then we have $f|_{S'} : S' \to K$ is also an injection, as the composition of the inclusion injection $S' \subseteq S$ and the injection $f : S \to K$. So $S'$ is subfinitely enumerable. So subfinite enumerability is closed under the taking of subsets.
If $S$ is subfinitely enumerable, then we have a surjection $f : K \to S$ where $K$ is subfinite. Suppose we have some equivalence relation $\sim$ on $S$. Then $\pi : S \to S / \sim$ is a surjection, so $\pi \circ f : K \to S / \sim$ is a surjection as the composition of surjections. Therefore, $S / \sim$ is subfinitely enumerable.
Note that "subfinitely enumerable" is therefore the smallest class $\mathbf{C}$ of sets such that (1) if $A \approx B$ and $A \in \mathbf{C}$, then $B \in \mathbf{C}$ (2) if $S \in \mathbf{C}$ and $S' \subseteq S$ then $S' \in \mathbf{C}$, (3) if $S \in \mathbf{C}$ and $\sim$ is an equivalence relation on $S$, then $S / \sim \in \mathbf{C}$, and (4) for all $n \in \mathbb{N}$, $n \in \mathbf{C}$.
Assuming the existence of power sets and of $\mathbb{N}$, the full subcategory of subfinitely enumerable sets and functions between them is essentially small. This is because every subfinitely enumerable set can be placed into a bijection with some subset of $[n] / \sim$ for some equivalence relation $\sim$ on $[n]$. So a subfinitely enumerable set is exactly a set which can be placed into bijection with some element of $\{S \mid S \subseteq [n] / \sim, n \in \mathbb{N}, \sim$ is an equivalence relation on $[n]\}$.
One final note: if you're familiar with topos theory, the proof that definitions (4) and (5) coincide can be made quite simple. For (4) implies (5), we can simply take the pullback of $[n] \to K \leftarrow S$ and use the fact that the pullback of a mono is a mono, and the pullback of an epi is an epi (in a topos). For (5) implies (4), we can take the pushout of $[n] \leftarrow k \to S$ and use the fact that the pushout of a mono is mono (in a topos), and the pushout of an epi is an epi.
This is a good question. Although there is something missing. When you state the Axiom of Extensionality, you are stating it as an “if and only if” condition. This axioms actually states that
$$\forall x \forall y ((\forall z (z \in x \leftrightarrow z \in y)) \rightarrow x = y)\,.$$
The converse of this axiom, which is also true, is a particular instance of a standard theorem in any first order theory with equality (in this case, we consider the $\in$ as our relational symbol).
Putting both together we in fact obtain that
$$\forall x \forall y ((\forall z (z \in x \leftrightarrow z \in y)) \leftrightarrow x = y)\,.$$
But this is not the definition of equality of sets. Rather, this is a characterization of the primitive notion of equality of sets.
Equality of sets is itself a primate notion (in this case a priority relation between sets), i.e. we don’t define it. Just in the same way that we don’t define the notion of set.
Best Answer
If one does not have equality as part of the underlying logic, then there is indeed a way of defining it inside set theory,
$x = y \overset{def}\Longleftrightarrow (\forall z(z \in x \iff z \in y) \land \forall z (x \in z \iff y \in z))$
which of course is just a modification of extensionality: the new bit about belonging to exactly the same sets is just so that one is able to recover the substitution principle of equality
$\forall x,y (x = y \implies (\varphi(x) \iff \varphi(y)))$
for any $\varphi(-)$, from separation/comprehension, which I suppose is not possible using only
$x = y \overset{def}\Longleftrightarrow \forall z(z \in x \iff z \in y)$
edit: if one defines equality by that first phrase up there, then it only makes sense that the 'new', appropriate axiom of extensionality is/becomes
Ax.Ex.: $\forall x,y (\forall z(z \in x \iff z \in y) \iff \forall z (x \in z \iff y \in z))$