Better late than never. Let me first list three (I think important) reasons why we focus on asymptotic unbiasedness (consistency) of estimators.
a) Consistency is a minimum criterion. If an estimator doesn't estimate its target correctly even with lots of data, then what good is it? This is the justification given in Wooldridge: Introductory Econometrics.
b) Finite sample properties are much harder to prove (or rather, asymptotic statements are easier). I am currently doing some research myself, and whenever one can rely on large-sample tools, things get much easier. Laws of large numbers, martingale convergence theorems, etc. are nice tools for getting asymptotic results, but don't help with finite samples. I believe something along these lines is mentioned in Hayashi (2000): Econometrics.
c) If estimators are biased for small samples, one can potentially correct, or at least improve, them with so-called small sample corrections. These are often complicated theoretically (to prove they improve on the estimator without the correction). Plus, most people are fine with relying on large samples, so small sample corrections are often not implemented in standard statistics software, because only a few people require them (those who can't get more data AND care about unbiasedness). Thus, there are certain barriers to using those uncommon corrections.
On your questions. What do we mean by "large sample"? This depends heavily on the context, and for specific tools it can be answered via simulation. That is, you artificially generate data and observe how, say, the rejection rate or the bias behaves as a function of sample size. A specific example is here, where the authors examine how many clusters it takes for OLS clustered standard errors, block-bootstrapped standard errors, etc. to perform well. Some theorists also have statements on the rate of convergence, but for practical purposes the simulations appear to be more informative.
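As a toy illustration of this kind of simulation (a made-up setup, not the one from the study mentioned above): track how the rejection rate of a nominal-5% one-sample t-test under the null approaches 5% as the sample grows, with skewed data so the approximation is imperfect in small samples.

```python
# Hypothetical Monte Carlo study: rejection rate of a nominal-5% one-sample
# t-test under the null, as a function of sample size n, when the data are
# skewed (exponential). The rate should drift toward 0.05 as n grows.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
reject = {}
for n in (5, 20, 80, 320):
    x = rng.exponential(1.0, size=(10_000, n))      # true mean is 1
    _, p = stats.ttest_1samp(x, popmean=1.0, axis=1)
    reject[n] = (p < 0.05).mean()                   # empirical rejection rate
    print(n, reject[n])
```

Plotting `reject` against `n` gives exactly the "rejection rate as a function of sample size" picture described above.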
Does it really take $n\to \infty$? If that's what the theory says, yes, but in applications we can accept a small, negligible bias, which we obtain with high probability at sufficiently large sample sizes. What "sufficiently large" means depends on the context; see above.
On question 3: usually, the question of unbiasedness (for all sample sizes) and consistency (convergence to the true value as the sample grows) is considered separately. An estimator can be biased but consistent, in which case the bias vanishes only as the sample grows large. But there are also estimators that are both unbiased and consistent, which are theoretically applicable at any sample size. (An estimator can also be unbiased but inconsistent, for technical reasons.)
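A small sketch of the distinction, on made-up data: the plug-in variance estimator (dividing by $n$) is biased at every finite $n$ but consistent, while a "use only the first observation" estimator of the mean is unbiased at every $n$ but inconsistent, since its variance never shrinks.

```python
# Contrasting the two properties with made-up data:
# - plug-in variance (ddof=0): biased at every finite n, but consistent;
# - "first observation only" estimator of the mean: unbiased for any n,
#   but inconsistent, because its variance never shrinks as n grows.
import numpy as np

rng = np.random.default_rng(1)
reps, n = 50_000, 200
x = rng.normal(5.0, 2.0, size=(reps, n))  # true mean 5, true variance 4

plugin_var = x.var(axis=1)  # expectation is (n-1)/n * 4, which -> 4 as n grows
first_obs = x[:, 0]         # expectation is 5 for any n, variance stays 4

print(plugin_var.mean())    # slightly below 4: the finite-sample bias
print(first_obs.std())      # stays near 2 no matter how large n gets
```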
Best Answer
It is useful, for example, for quantifying the sampling uncertainty of an estimator, or the null distribution of a test.
Recall that a normal random variable takes 95% of its realizations in the interval $\mu\pm1.96\sigma$. So if you can demonstrate that (typically, a scaled version of) an estimator is asymptotically normal, then you know it behaves normally at least in large samples, and you can easily construct confidence intervals, for example.
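For instance, a minimal sketch of such an interval for a mean, justified by the central limit theorem, on made-up (non-normal) data:

```python
# Minimal sketch: a 95% CLT-based confidence interval for a mean,
# on made-up non-normal data (exponential with true mean 2).
import numpy as np

rng = np.random.default_rng(42)
x = rng.exponential(2.0, size=500)

mean = x.mean()
se = x.std(ddof=1) / np.sqrt(len(x))       # estimated standard error
ci = (mean - 1.96 * se, mean + 1.96 * se)  # asymptotic 95% interval
print(ci)
```

Note that nothing here requires the data themselves to be normal; only the (scaled) sample mean needs to be approximately normal, which is exactly what the asymptotic result delivers.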
Whether the approximation is useful in settings where (as always in practice) your sample is finite is, in general, unfortunately not known analytically. If we could derive the finite-sample distribution analytically, that is what we would work with; unfortunately, that only works in very rare cases (for example, when sampling from a normal distribution, the t-statistic follows a t distribution).
Typically, simulations are then used to at least get an idea of the usefulness of the approximation in relevant cases.
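A sketch of what such a simulation might look like: generate many datasets, compute the nominal 95% interval in each, and record how often it actually covers the truth at several sample sizes (the exponential data here are just an assumed example).

```python
# Coverage simulation sketch: how often does the nominal 95% CLT interval
# actually cover the true mean? (Made-up exponential data.)
import numpy as np

rng = np.random.default_rng(7)
true_mean = 1.0
reps = 20_000
coverage = {}
for n in (10, 50, 500):
    x = rng.exponential(true_mean, size=(reps, n))
    m = x.mean(axis=1)
    se = x.std(axis=1, ddof=1) / np.sqrt(n)
    covered = (m - 1.96 * se <= true_mean) & (true_mean <= m + 1.96 * se)
    coverage[n] = covered.mean()
    print(n, coverage[n])  # creeps toward 0.95 as n grows
```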