The flaw in your argument is that $\hat\Lambda(t)-\Lambda(t)$ is not a martingale for all times $t$. It is only a martingale up to the time $T$ when the experiment ends, that is, when the last survivor either dies or becomes censored (i.e. drops out of the study). After that, $\Lambda(t)$ continues to increase but $\hat\Lambda(t)$ does not.
Now, if the last survivor in the sample actually dies, then at this point it doesn't matter that $\hat\Lambda(t)-\Lambda(t)$ stops being a martingale, because $\hat S(t)=0$ and therefore $\hat S(t)/S(t)$ will continue to be a martingale. So without censored data, the K-M estimator is in fact unbiased (it is pretty easy to show this directly without stochastic processes).
However, if censoring exists, this raises the possibility that the last remaining survivor will become censored rather than die. In this case $\hat S(t)$ will never drop to zero, and so $\hat S(t)/S(t)$ will cease to be a martingale at time $T$. This possibility -- that the last survivor becomes censored -- is the source of bias in the K-M estimator.
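To make the two cases concrete, here is a minimal sketch of the product-limit computation in plain Python. The function name `kaplan_meier` and the four-subject data set are made up for illustration and are not part of the question. It only shows that the estimate reaches zero when the last survivor dies and stays positive when the last survivor is censored.

```python
from fractions import Fraction

def kaplan_meier(times, events):
    """Return [(t, S_hat(t))] at each distinct death time (no left truncation).

    times  : observed times (death or censoring)
    events : 1 if the observation is a death, 0 if it is censored
    """
    data = sorted(zip(times, events))
    n = len(data)
    surv = Fraction(1)
    curve = []
    i = 0
    while i < n:
        t = data[i][0]
        at_risk = n - i              # everyone with observed time >= t
        deaths = 0
        j = i
        while j < n and data[j][0] == t:
            deaths += data[j][1]     # count deaths among observations tied at t
            j += 1
        if deaths:
            surv *= Fraction(at_risk - deaths, at_risk)
            curve.append((t, surv))
        i = j
    return curve

# Hypothetical data: same times, differing only in the fate of the last survivor.
last_dies = kaplan_meier([1, 2, 3, 4], [1, 1, 1, 1])
last_censored = kaplan_meier([1, 2, 3, 4], [1, 1, 1, 0])

print(last_dies[-1])      # (4, Fraction(0, 1)): the estimate hits zero
print(last_censored[-1])  # (3, Fraction(1, 4)): the estimate is stuck at 1/4 forever
```

In the second case the estimate stays at $1/4$ for all later $t$ while $S(t)$ keeps decreasing, which is exactly the situation described above.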
I'll give an explanation that is very close to that of Maarten Buis but just a little more elaborate. As always in survival analysis, different time scales can be applied. I think that age is maybe the more intuitive time scale in your setting, so that's where I'll start my answer. Afterwards, I'll try to use that intuition to answer the question.
Let $C_i$ be the time of birth of the $i$'th subject. From your data we can easily calculate the age at entering the study,
$$
A_i = t_0 - C_i
$$
and the age at exiting the study,
$$
B_i = \min\{T - C_i, D_i\},
$$
where $D_i$ is the age at death. Note that we have an age interval $(A_i, B_i]$ during which the $i$'th subject is under observation. On this time scale, the study subjects do not enter the study at the same time. Let's denote the minimum age at entering the study by
$$
\alpha = \min_i A_i.
$$
What survival information do we have before age $\alpha$? None. This is why we can't say anything about the probability of surviving the age interval $(0, \alpha]$. Necessarily, our Kaplan-Meier estimate must be conditional on survival until age $\alpha$. To give an example: let's say that $\alpha$ is $1$ year. Would we be able to calculate the survivor function at $5$ years, $S(5) = P(D > 5)$? Could we calculate how many children would live to see their fifth birthday? No, because we simply don't know how dangerous the first year is. We can calculate only the conditional survivor function $P(D > 5 \mid D > 1)$. This can again be explained by a change of time scale: there is nothing special about $0$, and your Kaplan-Meier estimate doesn't have to start at time zero. It can start at some other time, which corresponds to, e.g., the time scale defined by age minus $\alpha$. In your data, you write that $\alpha$ is very small because some children are included very young. Thus, for $s > \alpha$,
$$
P(D > s | D > \alpha) = S(s)/S(\alpha) \simeq S(s)
$$
and actually there is equality in the limit $\alpha \rightarrow 0$ if we assume $S$ to be continuous.
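The conditional interpretation can be sketched in code. Below is a minimal Kaplan-Meier computation with delayed entry on the age scale, where subject $i$ counts as at risk at age $t$ only if $A_i < t \le B_i$. The function name `km_delayed_entry` and the ages are hypothetical, chosen only to illustrate how the risk set is formed. The result estimates $P(D > s \mid D > \alpha)$, not $S(s)$.

```python
from fractions import Fraction

def km_delayed_entry(entry, exit_, event):
    """Kaplan-Meier with left truncation (delayed entry).

    Subject i is at risk at age t only if entry[i] < t <= exit_[i].
    Returns [(t, estimate)] at each distinct death age; the estimate
    targets P(D > t | D > alpha) with alpha = min(entry).
    """
    death_ages = sorted({b for b, d in zip(exit_, event) if d})
    surv = Fraction(1)
    curve = []
    for t in death_ages:
        at_risk = sum(1 for a, b in zip(entry, exit_) if a < t <= b)
        deaths = sum(1 for b, d in zip(exit_, event) if d and b == t)
        surv *= Fraction(at_risk - deaths, at_risk)
        curve.append((t, surv))
    return curve

# Hypothetical ages in years: entry A_i, exit B_i, death indicator.
A = [1, 1, 2, 5]
B = [4, 6, 5, 7]
died = [1, 0, 1, 0]

alpha = min(A)                        # 1: nothing is known about ages in (0, 1]
curve = km_delayed_entry(A, B, died)
print(alpha, curve)                   # 1 [(4, Fraction(2, 3)), (5, Fraction(1, 3))]
```

Note that the subject entering at age 5 does not contribute to the risk sets at ages 4 and 5, which is why the first factor is $2/3$ and not $3/4$.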
Let's change back to your original time scale, plain calendar time. You have no idea how dangerous the time before $t_0$ is, therefore your estimate must be conditional on surviving until $t_0$. This stems from the fact that no children are observed before time $t_0$. On this time scale, it doesn't make much of a difference how close the times of birth are to $t_0$, as we have assumed the same hazard for all ages (instead of an age-specific hazard as above). To sum up, on this time scale (using calendar time), the Kaplan-Meier estimate would be interpreted as an estimate of (for $t \in (t_0, T]$, writing $X$ for the time of death on this scale)
$$
P(X > t | X > t_0).
$$
This is not as intuitive as on the age time scale. However, it just means that when doing a study in calendar time, we condition on the subjects having survived the time from birth until the start of our study.
To answer the last part of the question: you condition neither on $T_{first}$ nor on $L_1$. You condition on survival until $t_0$, as this is the minimum of the entry times. I think part of the confusion is due to the fact that all the children enter your study at the same time, which is not necessarily the case in all applications, as is evident from using age as the time scale above.
Finally, you could easily say that non-truncation corresponds to the truncation time being smaller than or equal to 0 (or some other natural starting point on a time scale).
As I understand from a comment, the OP didn't realize that the Kaplan-Meier estimate is nothing but the empirical estimate of the survival function in the case where there is no censoring.
Let me say a word about that. Consider two independent random variables $X$ and $Y$ with continuous distributions, and independent replicated observations $x_i$ and $y_i$, $i=1, \ldots, n$. In the context of the Kaplan-Meier estimate, $Y$ is considered as the censoring variable, and one observes the minima $t_i=\min(x_i,y_i)$ together with the indicators $\delta_i={\boldsymbol 1}_{x_i \leq y_i}$. These are independent replicated observations of $T=\min(X,Y)$ and $\Delta={\boldsymbol 1}_{X \leq Y}$ respectively.
Note that, by independence of $X$ and $Y$, $\Pr(T >t)=\Pr(X>t,\, Y>t)=\Pr(X>t)\Pr(Y>t)$, that is to say $\boxed{S^T(t)=S^X(t)S^Y(t)}$, denoting by $S^T$, $S^X$ and $S^Y$ the survival functions of $T$, $X$ and $Y$ respectively.
The usual empirical survival function $\hat{S}^T$ of $T$ is available from the data. When seeking estimates $\hat{S}^X$ and $\hat{S}^Y$ of $S^X$ and $S^Y$, it is natural to require the empirical analogue of the previous boxed formula, that is to say $\boxed{\hat{S}^T(t)=\hat{S}^X(t)\hat{S}^Y(t)}$.
Then remember that:
The Kaplan-Meier estimates of $S^X$ and $S^Y$ satisfy this relation (at least when there are no ties; I have not checked the case with ties). The case $Y=+\infty$ corresponds to the absence of censoring. In this case $T=X$, $S^Y\equiv 1$, $\hat{S}^Y\equiv 1$, and one gets $\hat{S}^T(t)=\hat{S}^X(t)$: the Kaplan-Meier estimate is nothing but the empirical estimate of the survival function.
In fact (at least when there are no ties), the Kaplan-Meier estimates can even be derived from the required relation $\boxed{\hat{S}^T(t)=\hat{S}^X(t)\hat{S}^Y(t)}$, after requiring in addition that $\hat{S}^X$ and $\hat{S}^Y$ are step functions jumping at the observations of $x_i$ ($t_i$ when $\delta_i=1$) and $y_i$ ($t_i$ when $\delta_i=0$) respectively.
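The boxed relation is easy to check numerically on a small data set without ties. The sketch below uses made-up values for $x_i$ and $y_i$ and hypothetical helper names (`km_value`, `empirical_survival`). It computes the Kaplan-Meier estimates of $S^X$ (deaths are the events) and $S^Y$ (censorings are the events) and compares their product with the empirical survival function of $T$. It is a check on one example under the no-ties assumption, not a proof.

```python
from fractions import Fraction

def km_value(times, events, t):
    """Kaplan-Meier estimate at time t, assuming no tied observation times."""
    n = len(times)
    surv = Fraction(1)
    for rank, (u, d) in enumerate(sorted(zip(times, events))):
        if u > t:
            break
        if d:  # n - rank subjects are still at risk just before u
            surv *= Fraction(n - rank - 1, n - rank)
    return surv

def empirical_survival(times, t):
    """Proportion of observations strictly greater than t."""
    return Fraction(sum(1 for u in times if u > t), len(times))

# Hypothetical latent times; only t_obs and delta would be observed in practice.
x = [2, 9, 3, 12, 1]
y = [4, 5, 10, 7, 8]
t_obs = [min(a, b) for a, b in zip(x, y)]       # [2, 5, 3, 7, 1]
delta = [int(a <= b) for a, b in zip(x, y)]     # [1, 0, 1, 0, 1]
censor = [1 - d for d in delta]                 # roles of X and Y swapped

product_holds = all(
    km_value(t_obs, delta, t) * km_value(t_obs, censor, t)
    == empirical_survival(t_obs, t)
    for t in t_obs
)

# Without censoring (every observation is a death), KM is the empirical estimate.
no_censoring_holds = all(
    km_value(x, [1] * len(x), t) == empirical_survival(x, t) for t in x
)

print(product_holds, no_censoring_holds)  # True True
```

Exact fractions are used instead of floats so that the equality test is not affected by rounding.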