Better late than never. Let me first list three (I think important) reasons why we focus on asymptotic unbiasedness (consistency) of estimators.
a) Consistency is a minimum criterion. If an estimator does not estimate the quantity of interest correctly even with an unlimited amount of data, then what good is it? This is the justification given in Wooldridge: Introductory Econometrics.
b) Finite sample properties are much harder to prove (or rather, asymptotic statements are easier). I am currently doing some research myself, and whenever you can rely on large sample tools, things get much easier. Laws of large numbers, martingale convergence theorems etc. are nice tools for getting asymptotic results, but don't help with finite samples. I believe something along these lines is mentioned in Hayashi (2000): Econometrics.
c) If estimators are biased in small samples, one can potentially correct for this, or at least improve matters, with so-called small-sample corrections. These are often theoretically complicated (it is hard to prove that they actually improve on the uncorrected estimator). Moreover, most people are content to rely on large samples, so small-sample corrections are often not implemented in standard statistics software, because only a few people require them (those who cannot get more data AND care about unbiasedness). Thus, there are certain barriers to using those uncommon corrections.
On your questions. What do we mean by "large sample"? This depends heavily on the context, and for specific tools it can be answered via simulation. That is, you artificially generate data and see how, say, the rejection rate or the bias behaves as a function of sample size. A specific example is here, where the authors see how many clusters it takes for OLS clustered standard errors, block-bootstrapped standard errors etc. to perform well. Some theorists also have statements on the rate of convergence, but for practical purposes the simulations appear to be more informative.
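Such a simulation is easy to sketch. Below is a minimal example of the rejection-rate-versus-sample-size idea (the Exponential distribution, the sample sizes, and the use of the normal critical value 1.96 are my own arbitrary choices): a nominal-5% two-sided test of the true mean over-rejects with skewed data in small samples, and the actual rejection rate approaches the nominal level as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)
reps = 10_000
rates = {}
for n in [10, 30, 100, 500]:
    x = rng.exponential(scale=1.0, size=(reps, n))
    # One-sample test of H0: mean = 1 (the true mean), using the
    # normal critical value 1.96 as a simple benchmark.
    t = (x.mean(axis=1) - 1.0) / (x.std(axis=1, ddof=1) / np.sqrt(n))
    rates[n] = (np.abs(t) > 1.96).mean()
    print(f"n={n:4d}  rejection rate ≈ {rates[n]:.3f}  (nominal 0.05)")
```

The point is not the specific numbers but the shape of the curve: "large enough" is wherever the actual rate is close enough to 5% for your purposes.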
Does it really take $n\to \infty$? If that's what the theory says, yes, but in application we can accept small, negligible bias, which we have with sufficiently large sample sizes with high probability. What sufficiently means depends on the context, see above.
On question 3: usually, the questions of unbiasedness (for all sample sizes) and consistency (convergence to the true value as the sample grows) are considered separately. An estimator can be biased but consistent, in which case indeed only the large-sample estimates are approximately unbiased. But there are also estimators that are both unbiased and consistent, which are in theory applicable at any sample size. (An estimator can also be unbiased but inconsistent; for instance, the first observation alone is an unbiased but inconsistent estimator of the population mean.)
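A toy simulation contrasting the three cases (the distribution, parameters, and estimators below are my own illustrative choices, not from the answer): the sample mean is unbiased and consistent; the plug-in variance (dividing by $n$) is biased but consistent; and using only the first observation is unbiased for the mean but inconsistent, since its spread never shrinks with $n$.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma2, reps = 5.0, 4.0, 20_000

results = {}
for n in [5, 500]:
    x = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
    results[n] = {
        "mean_bias": x.mean(axis=1).mean() - mu,          # ~0 at every n
        "plugin_var_bias": x.var(axis=1).mean() - sigma2, # negative, shrinks with n
        "first_obs_var": x[:, 0].var(),                   # spread of the "X_1" estimator
    }
    print(n, results[n])
```

The plug-in variance bias is $-\sigma^2/n$, so it vanishes as $n$ grows, while the spread of the first-observation "estimator" stays at $\sigma^2$ no matter how much data you collect.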
To side-step dependencies arising when we consider the sample variance, we write
$$(n-1)s^2 = \sum_{i=1}^n\Big((X_i-\mu) -(\bar x-\mu)\Big)^2$$
$$=\sum_{i=1}^n\Big(X_i-\mu\Big)^2-2\sum_{i=1}^n\Big((X_i-\mu)(\bar x-\mu)\Big)+\sum_{i=1}^n\Big(\bar x-\mu\Big)^2$$
and, since $\sum_{i=1}^n (X_i-\mu) = n(\bar x-\mu)$, after a little manipulation,
$$=\sum_{i=1}^n\Big(X_i-\mu\Big)^2 - n\Big(\bar x-\mu\Big)^2$$
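The identity can be sanity-checked numerically; the sample and the value chosen for $\mu$ below are arbitrary (the identity holds for any fixed $\mu$).

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=20)
mu = 3.7                      # any fixed number works; mu need not be the true mean
n = len(x)

lhs = (n - 1) * x.var(ddof=1)                             # (n-1) s^2
rhs = np.sum((x - mu) ** 2) - n * (x.mean() - mu) ** 2    # the decomposition
print(lhs, rhs)               # identical up to floating-point error
```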
Therefore
$$\sqrt n(s^2 - \sigma^2) = \frac {\sqrt n}{n-1}\sum_{i=1}^n\Big(X_i-\mu\Big)^2 -\sqrt n \sigma^2- \frac {\sqrt n}{n-1}n\Big(\bar x-\mu\Big)^2 $$
Manipulating,
$$\sqrt n(s^2 - \sigma^2) = \frac {\sqrt n}{n-1}\sum_{i=1}^n\Big(X_i-\mu\Big)^2 -\sqrt n \frac {n-1}{n-1}\sigma^2- \frac {n}{n-1}\sqrt n\Big(\bar x-\mu\Big)^2 $$
$$=\frac {n\sqrt n}{n-1}\frac 1n\sum_{i=1}^n\Big(X_i-\mu\Big)^2 -\sqrt n \frac {n-1}{n-1}\sigma^2- \frac {n}{n-1}\sqrt n\Big(\bar x-\mu\Big)^2$$
$$=\frac {n}{n-1}\left[\sqrt n\left(\frac 1n\sum_{i=1}^n\Big(X_i-\mu\Big)^2 -\sigma^2\right)\right] + \frac {\sqrt n}{n-1}\sigma^2 -\frac {n}{n-1}\sqrt n\Big(\bar x-\mu\Big)^2$$
The factor $n/(n-1)$ becomes unity asymptotically. The term $\frac {\sqrt n}{n-1}\sigma^2$ is deterministic and goes to zero as $n \rightarrow \infty$.
We also have $\sqrt n\Big(\bar x-\mu\Big)^2 = \left[\sqrt n\Big(\bar x-\mu\Big)\right]\cdot \Big(\bar x-\mu\Big)$. The first factor converges in distribution to a normal random variable; the second converges in probability to zero. Then by Slutsky's theorem the product converges in probability to zero,
$$\sqrt n\Big(\bar x-\mu\Big)^2\xrightarrow{p} 0$$
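A quick numerical check of this convergence (sample sizes and the standard normal data are arbitrary choices of mine): the average of $\sqrt n(\bar x-\mu)^2$ over many replications equals $\sigma^2/\sqrt n$ in theory, and indeed shrinks toward zero.

```python
import numpy as np

rng = np.random.default_rng(0)
reps, mu = 4_000, 0.0
avg_term = {}
for n in [100, 2_500]:
    xbar = rng.normal(mu, 1.0, size=(reps, n)).mean(axis=1)
    term = np.sqrt(n) * (xbar - mu) ** 2
    avg_term[n] = term.mean()      # theory: E[term] = sigma^2 / sqrt(n)
    print(f"n={n:5d}  average of sqrt(n)*(xbar - mu)^2 ≈ {avg_term[n]:.4f}")
```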
We are left with the term
$$\left[\sqrt n\left(\frac 1n\sum_{i=1}^n\Big(X_i-\mu\Big)^2 -\sigma^2\right)\right]$$
Alerted by a lethal example offered by @whuber in a comment to this answer, we want to make certain that $(X_i-\mu)^2$ is not constant. Whuber pointed out that if $X_i$ is a Bernoulli $(1/2)$, then $(X_i - \mu)^2 = 1/4$ for every observation. More generally, the squared deviation is constant for any two-point distribution placing probability $1/2$ on each support point, not just $0/1$ binary. Excluding such variables, for the rest we have
$$\mathrm{E}\Big(X_i-\mu\Big)^2 = \sigma^2,\;\; \operatorname {Var}\left[\Big(X_i-\mu\Big)^2\right] = \mu_4 - \sigma^4$$
where $\mu_4 = \mathrm{E}\Big(X_i-\mu\Big)^4$ is the fourth central moment, and so the term under investigation is a usual subject matter of the classical Central Limit Theorem, and
$$\sqrt n(s^2 - \sigma^2) \xrightarrow{d} N\left(0,\mu_4 - \sigma^4\right)$$
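The limiting result can be checked by simulation. For Exponential$(1)$ data, $\sigma^2 = 1$ and $\mu_4 = 9$, so the asymptotic variance is $\mu_4 - \sigma^4 = 8$ (the distribution and sample size below are my own choices):

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 1_000, 10_000
x = rng.exponential(scale=1.0, size=(reps, n))   # sigma^2 = 1, mu_4 = 9
z = np.sqrt(n) * (x.var(axis=1, ddof=1) - 1.0)   # sqrt(n) (s^2 - sigma^2)
print(f"mean ≈ {z.mean():.3f} (theory: 0),  variance ≈ {z.var():.3f} (theory: 8)")
```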
Note: the above result of course also holds for normally distributed samples, but in that case we additionally have a finite-sample chi-square distributional result available.
Best Answer
By the delta method, if
$${\sqrt{n}[\bar Z_n-\mu]\,\xrightarrow{D}\,\mathcal{N}(0,\sigma^2)}$$
then
$${\sqrt{n}[g(\bar Z_n)-g(\mu)]\,\xrightarrow{D}\,\mathcal{N}(0,\sigma^2\cdot[g'(\mu)]^2)}$$
I guess you can do the rest.
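For a numerical illustration with $g(x) = x^2$ (my own choice of $g$ and data, purely to show the mechanics): for Exponential$(1)$ data, $\mu = \sigma^2 = 1$ and $g'(\mu) = 2$, so $\sqrt n\,[\bar Z_n^2 - 1]$ should be approximately $N(0, 4)$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 1_000, 10_000
zbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
w = np.sqrt(n) * (zbar ** 2 - 1.0)   # g(x) = x^2, g'(mu) = 2 => asymptotic var 4
print(f"mean ≈ {w.mean():.3f} (theory: 0),  variance ≈ {w.var():.3f} (theory: 4)")
```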