We don't get to choose here. The "normalizing" factor is, in essence, a factor that stabilizes the expression to something finite, so that it neither collapses to zero nor blows up to infinity as the sample size goes to infinity, but instead maintains a non-degenerate distribution in the limit.
So it has to be whatever it has to be in each case. Of course, it is interesting that in many cases it turns out to be $\sqrt n$ (but see also @whuber's comment below).
A standard example where the normalizing factor has to be $n$, rather than $\sqrt n$, is when we have the model
$$y_t = \beta y_{t-1} + u_t, \;\; y_0 = 0,\; t=1,...,T$$
with $u_t$ white noise, and we estimate the unknown $\beta$ by Ordinary Least Squares.
If it so happens that the true value of the coefficient is $|\beta|<1$, then the OLS estimator is consistent and converges at the usual $\sqrt n$ rate.
But if instead the true value is $\beta=1$ (i.e. we have in reality a pure random walk), then the OLS estimator is consistent but will converge "faster", at rate $n$ (this is sometimes called a "superconsistent" estimator, since, I guess, so many estimators converge at rate $\sqrt n$).
In this case, to obtain its (non-normal) asymptotic distribution, we have to scale $(\hat \beta - \beta)$ by $n$ (if we scale only by $\sqrt n$, the expression will go to zero). Hamilton, ch. 17, has the details.
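A quick way to see the two rates is by simulation. Below is a minimal sketch (my own, not from Hamilton) that estimates $\beta$ by OLS for $\beta=0.5$ and $\beta=1$ at increasing sample sizes and reports the Monte Carlo spread of $\sqrt T(\hat\beta-\beta)$ and $T(\hat\beta-\beta)$; the function names and the choice $\beta=0.5$ are just for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_ar1(y):
    """OLS estimate of beta in y_t = beta * y_{t-1} + u_t (no intercept)."""
    y_lag, y_cur = y[:-1], y[1:]
    return np.dot(y_lag, y_cur) / np.dot(y_lag, y_lag)

def estimation_errors(beta, T, n_rep=2000):
    """Monte Carlo draws of (beta_hat - beta) for sample size T, with y_0 = 0."""
    errs = np.empty(n_rep)
    for r in range(n_rep):
        u = rng.standard_normal(T)
        y = np.zeros(T + 1)
        for t in range(1, T + 1):
            y[t] = beta * y[t - 1] + u[t - 1]
        errs[r] = ols_ar1(y) - beta
    return errs

for T in (100, 400, 1600):
    stat = estimation_errors(0.5, T)   # stationary case
    unit = estimation_errors(1.0, T)   # unit-root case
    print(f"T={T:5d}  sd of sqrt(T)*(bhat-b), beta=0.5: {np.sqrt(T) * stat.std():.3f}   "
          f"sd of T*(bhat-b), beta=1: {(T * unit).std():.3f}")
```

If this runs as intended, the first column stabilizes (near $\sqrt{1-\beta^2}$) and so does the second, whereas scaling the unit-root error by only $\sqrt T$ would shrink toward zero as $T$ grows.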
Efficiency is a "per se" concept in the sense that it measures how variable (and how biased) the estimator is relative to the "true" parameter. There is an actual numeric value for the efficiency of a given estimator at a given sample-size under a given loss function, and that number depends on the estimator AND the sample-size AND the loss function.
Asymptotic efficiency looks at how efficient the estimator becomes as the sample size increases. More important is how rapidly the estimator becomes efficient, but this can be more difficult to determine.
Relative efficiency looks at how efficient the estimator is relative to an alternative estimator (typically at a GIVEN sample-size).
Efficiency requires the specification of some loss function. Originally this was variance, when only unbiased estimators were considered. These days it is most often MSE (mean squared error, which accounts for both bias and variability). Other loss functions can be used. The classical Cramér-Rao bound was for unbiased estimators only but has been extended to many of these other loss functions (most especially MSE loss).
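As a concrete illustration of relative efficiency under MSE loss (my own sketch, not part of the answer): for i.i.d. normal data both the sample mean and the sample median are unbiased for the population mean, but the median's variance is roughly $\pi/2 \approx 1.57$ times larger, so its efficiency relative to the mean is about 0.64.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_rep = 100, 20000

# Many samples from N(0, 1); both estimators target the population mean, 0.
samples = rng.standard_normal((n_rep, n))
mse_mean = np.mean(samples.mean(axis=1) ** 2)          # unbiased, so MSE = variance ~ 1/n
mse_median = np.mean(np.median(samples, axis=1) ** 2)  # unbiased, variance ~ (pi/2)/n

print(f"MSE of sample mean:   {mse_mean:.5f}")
print(f"MSE of sample median: {mse_median:.5f}")
print(f"efficiency of median relative to mean: {mse_mean / mse_median:.2f}")
```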
An important adjunct concept is admissibility and domination of estimators.
The Wikipedia entry has many links.
Best Answer
Unbiasedness means that, under the assumptions regarding the population distribution, the estimator in repeated sampling will equal the population parameter on average. This is a nice property for the theory of minimum-variance unbiased estimators. However, I think unbiasedness is overemphasized. The mean square error is a good measure of the accuracy of an estimator; it equals the square of the estimator's bias plus its variance. Sometimes estimators with small bias have smaller mean square error than unbiased estimators that have large variances.
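A standard illustration of that last point (my addition, assuming i.i.d. normal data): the sample variance that divides by $n$ is biased downward, while dividing by $n-1$ gives the unbiased version, yet under normality the biased version has the smaller mean square error. A minimal check:

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_rep, true_var = 10, 50000, 1.0

samples = rng.standard_normal((n_rep, n))     # N(0, 1), so the true variance is 1
s2_unbiased = samples.var(axis=1, ddof=1)     # divide by n - 1 (unbiased)
s2_biased = samples.var(axis=1, ddof=0)       # divide by n (biased downward)

for name, est in [("divide by n-1", s2_unbiased), ("divide by n  ", s2_biased)]:
    bias = est.mean() - true_var
    mse = np.mean((est - true_var) ** 2)
    print(f"{name}: bias = {bias:+.4f}, MSE = {mse:.4f}")
```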
Estimators that are biased can be asymptotically unbiased, meaning the bias tends to 0 as the sample size gets large. If the estimator is both asymptotically unbiased and its variance goes to 0 as the sample size gets large, then the estimator is consistent (in probability). Technically, in measure theory there is a difference between convergence in probability and convergence almost surely. The Cramér-Rao lower bound is a mathematical result showing that, in a particular parametric family of distributions, no unbiased estimator can have a variance less than the bound. So if you can show that your estimator achieves the Cramér-Rao lower bound, you have an efficient estimator.
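As a small check of that efficiency claim (my sketch, not part of the answer): for i.i.d. $N(\mu, \sigma^2)$ data with $\sigma^2$ known, the Cramér-Rao lower bound for unbiased estimators of $\mu$ is $\sigma^2/n$, and the sample mean attains it, so it is efficient.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n, n_rep = 2.0, 1.5, 50, 20000

samples = rng.normal(mu, sigma, size=(n_rep, n))
var_xbar = samples.mean(axis=1).var()   # Monte Carlo variance of the sample mean
crlb = sigma**2 / n                     # Cramer-Rao lower bound for estimating mu

print(f"variance of sample mean: {var_xbar:.5f}")
print(f"Cramer-Rao lower bound:  {crlb:.5f}")
```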