I'll give an explanation that is very close to that of Maarten Buis but just a little more elaborate. As always in survival analysis, different time scales can be applied. I think that age is maybe the more intuitive time scale in your setting, so that's where I'll start my answer. Afterwards, I'll try to use that intuition to answer the question.
Let $C_i$ be the time of birth of subject $i$. From your data we can easily calculate the age at entering the study,
$$
A_i = t_0 - C_i
$$
and the age at exiting the study,
$$
B_i = \min\{T - C_i, D_i\},
$$
where $D_i$ is the age at death. Note that we have an age interval $(A_i, B_i]$ during which the $i$'th subject is under observation. On this time scale, the study subjects do not enter the study at the same time. Let's denote the minimum age at entering the study by
$$
\alpha = \min_i A_i.
$$
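As a rough illustration of the definitions above, here is a minimal Python sketch that turns dates into the age interval $(A_i, B_i]$ and computes $\alpha$. The function name `age_interval`, the dates, and the choice of days as the unit are all assumptions made for this example; they are not taken from your data.

```python
from datetime import date

def age_interval(birth, t0, T, death=None):
    """Return (A, B, died): entry age, exit age (both in days) and a death indicator.

    Assumes the subject is alive at t0, i.e. death is None or later than t0.
    """
    entry_age = (t0 - birth).days                       # A_i = t_0 - C_i
    if death is not None and death <= T:
        return entry_age, (death - birth).days, True    # B_i = D_i (death observed)
    return entry_age, (T - birth).days, False           # B_i = T - C_i (censored at T)

# Hypothetical study window and subjects: (birth date, death date or None)
t0, T = date(2011, 1, 1), date(2015, 1, 1)
subjects = [
    (date(2010, 1, 1), date(2013, 1, 1)),
    (date(2010, 12, 1), None),
]
intervals = [age_interval(birth, t0, T, death) for birth, death in subjects]
alpha = min(a for a, _, _ in intervals)   # youngest age at entry
```

Each subject contributes information only on its own interval, and nobody contributes anything before age `alpha`.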
What survival information do we have before age $\alpha$? None. This is why we can't say anything about the probability of surviving the age interval $(0, \alpha]$. Necessarily, our Kaplan-Meier estimate must be conditional on survival until age $\alpha$. To give an example, let's say that $\alpha$ is $1$ year. Would we be able to calculate the survivor function at $5$ years, $S(5) = P(D > 5)$? Could we calculate how many children would live to see their fifth birthday? No, because we simply don't know how dangerous the first year is. We can calculate only the conditional survivor function $P(D > 5 \mid D > 1)$. This can again be explained by a change in time scale: there is nothing special about 0. Your Kaplan-Meier estimate doesn't have to start at time zero; it can start at some other time, which corresponds to, e.g., the time scale defined by age minus $\alpha$. In your data, you write that $\alpha$ is very small as some children are included very young; thus, for $s > \alpha$,
$$
P(D > s | D > \alpha) = S(s)/S(\alpha) \simeq S(s)
$$
and actually there is equality in the limit $\alpha \rightarrow 0$ if we assume $S$ to be continuous.
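To make the conditioning concrete, here is a minimal sketch of a Kaplan-Meier (product-limit) estimate with delayed entry on the age scale. It is an illustration under simplifying assumptions, not a replacement for a survival library: the function name `km_left_truncated` and the toy ages are made up, and every subject is assumed to have entry age strictly smaller than exit age. The only change from the usual estimator is the risk set, which at age $t$ contains the subjects with $A_i < t \le B_i$.

```python
def km_left_truncated(entry, exit_, event):
    """Product-limit estimate with delayed entry (left truncation).

    entry[i], exit_[i] : ages A_i and B_i, with A_i < B_i
    event[i]           : True if subject i died at age B_i, False if censored
    Returns a list of (age, estimate) pairs at the distinct death ages.
    The estimate is conditional on surviving to min(entry).
    """
    death_ages = sorted({b for b, d in zip(exit_, event) if d})
    surv = 1.0
    curve = []
    for t in death_ages:
        # Only subjects who have already entered and not yet left are at risk at t.
        at_risk = sum(1 for a, b in zip(entry, exit_) if a < t <= b)
        deaths = sum(1 for b, d in zip(exit_, event) if d and b == t)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

# Hypothetical ages in years; alpha = min(entry) = 1
entry = [1, 2, 3, 1]
exit_ = [4, 5, 6, 3]
event = [True, False, True, True]
curve = km_left_truncated(entry, exit_, event)
```

Because nobody is at risk before age 1 in this toy data, the resulting curve estimates $P(D > s \mid D > 1)$, not $S(s)$.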
Let's change back to your original time scale, plain calendar time. You have no idea how dangerous the time before $t_0$ is; therefore your estimate must be conditional on surviving until $t_0$. This stems from the fact that no children are observed before time $t_0$. On this time scale, it doesn't make much of a difference how close the times of birth are to $t_0$, as we have assumed the same hazard for all ages (instead of an age-specific hazard as above). To sum up, on this time scale (using calendar time), and with $X$ denoting the calendar time of death, the interpretation of the Kaplan-Meier estimate would be that of (for $t \in (t_0, T]$),
$$
P(X > t | X > t_0).
$$
This is not as intuitive as on the age time scale; however, it just means that when doing a study in calendar time, we condition on the subjects having survived the time from birth until the start of our study.
To answer the last part of the question, you do not condition on $T_{first}$ nor on $L_1$; you condition on survival until $t_0$, as this is the minimum of the entry times. I think part of the confusion is due to the fact that all the children enter your study at the same time, which is not necessarily the case in all applications, as is evident from using age as the time scale above.
Finally, you could easily say that non-truncation corresponds to the truncation time being smaller than or equal to 0 (or some other natural starting point on a time scale).
That "you will not start treatment until the end of accrual period" is not the case.
Participants enroll in clinical trials over time. The total time over which all participants enroll is the accrual time. Nevertheless, each individual begins treatment soon after enrollment, according to the terms of the trial design. The starting time = 0 for each participant might be defined as the date of the start of that participant's therapy. So there is no left truncation with respect to that choice of time = 0 in that situation.
There's a possibility of left truncation if you define time = 0 to be the date of initial diagnosis and there's a delay between diagnosis and study enrollment. Therneau and Grambsch discuss that in Section 3.7.3 of "Modeling Survival Data," for an example in which a patient entered a study at the Mayo Clinic referral center 1175 days after initial diagnosis at a local healthcare provider:
the patient was not at risk for an observable death during the first 1,175 days of that interval. Such data, where the patient enters the risk set after time 0, is said to be left truncated.
Usually there isn't such a large time difference between diagnosis and enrollment, so any minor left truncation might be ignorable.
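The effect of the choice of time = 0 can be sketched in a few lines of Python. The function `entry_exit` and the follow-up end at day 2000 are hypothetical; only the 1175-day delay comes from the example above.

```python
def entry_exit(diagnosis_day, therapy_start_day, end_day, origin):
    """Return (entry, exit) in days on the time scale with the chosen origin.

    origin = "therapy":   time 0 is the start of therapy, so entry is 0
    origin = "diagnosis": time 0 is the diagnosis, so entry is delayed
    """
    if origin == "therapy":
        return 0, end_day - therapy_start_day
    if origin == "diagnosis":
        return therapy_start_day - diagnosis_day, end_day - diagnosis_day
    raise ValueError("origin must be 'therapy' or 'diagnosis'")

# Diagnosis at day 0, therapy starts at day 1175, follow-up ends at day 2000 (assumed)
on_therapy_scale = entry_exit(0, 1175, 2000, "therapy")
on_diagnosis_scale = entry_exit(0, 1175, 2000, "diagnosis")
```

On the therapy scale the patient is at risk from time 0; on the diagnosis scale the patient joins the risk set only at day 1175, which is exactly the left truncation described in the quoted passage.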
Left truncation can also be an issue when you define time = 0 to be date of birth or something similar. Klein and Moeschberger provide many examples of different types of censoring and truncation and how to deal with them.
Best Answer
First a disclaimer: I've never had to use the time start/end variable in this way, and although I'm familiar with mixed effects models I have never really had to use them IRL. Feel free to correct me if I've made a mistake.
The problem consists of two parts as I see it:
For the first point I think using a mixed effects model is a must. I use R and there is a `coxme` package recently developed by Prof. Therneau. The vignette documentation is excellent and it seems easy to deploy. For the second point you just need to add the start and end point to the survival object. This is fairly easy in R although I have never had to use it myself. Below is an example that should work:
You might want to consider what you want to achieve with the Cox regression model in this case. I am not sure that hazards make sense in this setting, although this is very difficult to know without going through the full study protocol. Make sure that others have used Cox regressions in similar settings prior to this analysis. It seems to me that a good alternative would be a mixed effects logistic regression where you simply use odds for admission and add the number of days at risk as a predictor, preferably as a natural spline or something that allows a non-linear relationship.
Minor update from the discussion
When it comes to time-dependent covariates, I have found them a little tricky to deploy. I had a CV question a while ago on this subject that you may want to look into. As I wrote in the comments, in the end the time dependence was a little more than I could conveniently display and explain to my colleagues. Furthermore, the model was not strongly affected by this effect, so I dropped it and switched to an early and a late dataset. I recommend you consider who your audience is and whether the time-varying coefficients will add that much to the model.
You have a potentially very serious problem where some patients start their period discharged from the hospital while some are untainted. I think you need to think about possible effect modification between these two groups - do they belong to the same population or not? It is easy to make a case that medication-compliance has a much bigger admission-avoidance impact in the discharged population. I think you at least should have a variable indicating if the patient has started a period straight after hospitalization or not (I've added an example in the code).
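To illustrate what a start/stop layout with such an indicator could look like, here is a minimal sketch in Python rather than R. Everything in it is an assumption made for illustration: the function `at_risk_intervals`, the field names, and the rule that a patient is not at risk of admission while in hospital. Hospital stays are assumed to lie inside the follow-up window and not to overlap.

```python
def at_risk_intervals(patient_id, followup_start, followup_end, stays,
                      starts_after_discharge=False):
    """Build (start, stop, event) rows for one patient.

    stays: list of (admit_day, discharge_day) pairs inside the follow-up window.
    event = 1 marks an interval that ends with an admission, 0 a censored one.
    post_discharge flags intervals that begin straight after a hospital stay.
    """
    rows = []
    start = followup_start
    post_discharge = starts_after_discharge
    for admit_day, discharge_day in sorted(stays):
        rows.append({"id": patient_id, "start": start, "stop": admit_day,
                     "event": 1, "post_discharge": post_discharge})
        start = discharge_day        # back at risk once discharged
        post_discharge = True
    if start < followup_end:         # final interval ends without an admission
        rows.append({"id": patient_id, "start": start, "stop": followup_end,
                     "event": 0, "post_discharge": post_discharge})
    return rows

# Hypothetical patient: one year of follow-up, two hospital stays
rows = at_risk_intervals(1, 0, 365, [(100, 110), (200, 230)])
```

Rows of this shape correspond to the start, stop and event arguments of a counting-process survival object, with `id` available as the grouping variable for a random effect and `post_discharge` as the covariate (or stratifying variable) discussed above.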
I have recently done a medication adherence study; if you haven't read this article, I strongly recommend it. In my study I was also able to deduce from the prescription text 94 % of the cases using Python's very powerful regular expressions. I'm planning on doing a post on my blog once the article gets published. The text interpretation is in Swedish, but you can very easily use the structure as most prescriptions follow a similar pattern (let me know if this would be useful and I can write up the post a little earlier). The advantage is that you can identify exactly when a patient is expected to be without medication, which matters because you will probably have a very close relationship between that and readmission.
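To give a flavour of the regular-expression approach, here is a minimal sketch. It is not the code from my study: the English pattern, the function names, and the pack size are invented for this example, and real prescription text needs many more patterns than this one.

```python
import re

# Hypothetical pattern for texts such as "1 tablet 2 times daily"
DOSE_PATTERN = re.compile(
    r"(\d+)\s*tablets?\s+(\d+)\s*times?\s+(?:daily|a day|per day)",
    re.IGNORECASE,
)

def tablets_per_day(text):
    """Return tablets per day parsed from the text, or None if nothing matches."""
    match = DOSE_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)) * int(match.group(2))

def days_of_supply(text, pack_size):
    """Return whole days a pack lasts at the parsed dose, or None if unparseable."""
    daily = tablets_per_day(text)
    if not daily:
        return None
    return pack_size // daily

supply = days_of_supply("1 tablet 2 times daily", pack_size=100)
```

Adding the days of supply to the dispensing date gives the day on which the patient is expected to run out, which is the quantity you would then relate to readmission.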