The answer is that both are used, unfortunately. In the continuous case, you are right that the distinction is unimportant, because $Pr(T = t) = 0$ for any single time point. In the discrete case, the two definitions give different values at the observed time points, so it is important to state clearly which one is being used.
In my experience, the most common definition of the survival function is $S(t) = Pr(T>t)$, which matches your yellow column. This is the definition used in the derivation of the Kaplan-Meier estimator: $\hat{S}(t) = \prod_{j=1}^k{\left(1-\frac{d_j}{r_j}\right)}$, where the product runs over the $k$ intervals up to and including time $t$, $d_j$ is the number of events in interval $j$, $r_j$ is the number of individuals at risk in interval $j$, and $\frac{d_j}{r_j}$ is the estimated hazard for interval $j$. In the absence of censoring, this product reduces to the simple proportion $\frac{\text{individuals with } T>t}{\text{total individuals}}$.
An important note is that the survival function should start with $S(0) = 1$ if 0 is the first time point, in the absence of left censoring (i.e. assuming no one starts follow-up already having had the event). In the case of your example, I presume that someone who opens and closes a bank account in January 2012 opens the account before they close it; so if time intervals were shortened (for example using weeks or days as the time scale) then S(0) would equal 1 in both cases.
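To make the difference between the two definitions concrete, here is a minimal Python sketch. The event times are made up for illustration (they are not the data from the question), there is no censoring, and the function names are my own. It computes the empirical survival under both $Pr(T>t)$ and $Pr(T \geq t)$, and also the product-limit form, which agrees with the $Pr(T>t)$ proportion when nothing is censored.

```python
from fractions import Fraction


def empirical_survival(times, t, strict=True):
    """Proportion of subjects with T > t (strict) or T >= t (non-strict)."""
    if strict:
        survivors = sum(1 for x in times if x > t)
    else:
        survivors = sum(1 for x in times if x >= t)
    return Fraction(survivors, len(times))


def product_limit(times, t):
    """Product over event times t_j <= t of (1 - d_j / r_j), assuming no censoring."""
    s = Fraction(1)
    for tj in sorted(set(times)):
        if tj > t:
            break
        d = sum(1 for x in times if x == tj)  # events at t_j
        r = sum(1 for x in times if x >= tj)  # at risk just before t_j
        s *= 1 - Fraction(d, r)
    return s


# Hypothetical event times in whole months; two accounts close in month 0.
times = [0, 0, 1, 2, 2, 3, 5, 5, 5, 8]

for t in (0, 2, 5):
    print(
        t,
        empirical_survival(times, t, strict=True),   # Pr(T > t)
        empirical_survival(times, t, strict=False),  # Pr(T >= t)
        product_limit(times, t),
    )
```

With month-level data, events recorded in month 0 pull $Pr(T>0)$ below 1 while $Pr(T \geq 0)$ stays at 1, which is the discrepancy between the two columns in the question. On a finer time scale, where nobody has an event at exactly time 0, both start at 1.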
How much the distinction between the two definitions matters may depend on the specific application. The degree of divergence between the two calculations will likely depend on the length of follow-up, the frequency of the event, and the number of ties and how these are considered.
In addition, in many applications we are interested in comparing hazards or survival between two groups rather than the absolute survival or hazard in a specific group. In this case, I think the distinction should be even less important, but I would have to check into that to be sure.
For more detail on survival analysis where $S(t)$ is clearly defined as $Pr(T>t)$, see:
Allison: Survival Analysis Using the SAS System
For more detail, with $S(t) = Pr(T \geq t)$, see:
Collett: Modelling Survival Data in Medical Research, 2nd ed.
(note, most of the analytic details will be the same as in Allison, but the interpretations may differ)
I also found the word "censoring" to be confusing when I first started survival analysis.
"Censored" individuals aren't removed from analysis; they're just treated differently from those with events/deaths. If censoring is non-informative, as @swmo discussed, then a censored individual provides information that the event did not occur up to the censoring time. Just doesn't provide the exact time.
A standard survival curve includes the censored patients, noting the censoring times with a mark on the curve at the censoring time. The survival curve only drops at times of (noncensored) events, with a drop given by the ratio of events at that time to the total at risk at that time, including those with later censoring times. So the survival curves for the wonder drug in your example would in fact look quite good, with the few early events leading to small drops in the curve (as the fraction of individuals dying early was small) and then a high survival fraction thereafter.
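As a rough illustration of how censored individuals stay in the risk set, here is a small Python sketch of the Kaplan-Meier calculation. The data are invented (ten patients, two early deaths, eight censored at an assumed study end of 24 months), and the function name is my own, not from any library.

```python
from fractions import Fraction


def kaplan_meier(durations, observed):
    """Return [(event_time, S(t))] with S(t) = Pr(T > t).

    durations: follow-up time for each subject
    observed:  True if the event occurred, False if the subject was censored
    """
    curve = []
    s = Fraction(1)
    # The curve only changes at times where at least one event was observed.
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    for tj in event_times:
        d = sum(1 for t, e in zip(durations, observed) if e and t == tj)
        # Everyone still under follow-up at t_j is at risk,
        # including subjects who are censored later.
        r = sum(1 for t in durations if t >= tj)
        s *= 1 - Fraction(d, r)
        curve.append((tj, s))
    return curve


# Hypothetical "wonder drug" arm: deaths at months 3 and 5, the rest censored at month 24.
durations = [3, 5] + [24] * 8
observed = [True, True] + [False] * 8

curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(t, float(s))
```

The curve has only two steps, one per death, and each step is small because the denominator counts all patients still at risk. The eight censored patients never cause a drop; they only keep the estimated survival high through month 24.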
Also, you're not usually comparing censored to uncensored patients within a single treatment group, as the question seems to suggest. Rather, you're comparing the timing of events in treatment group A to those in treatment group B. So in a test of a poor drug A versus wonder drug B, there would be many events/deaths in group A and few in group B, or at least events would tend to happen earlier in group A.
If most patients in group B are "cured" and they are not otherwise at high risk of death, then the "survival times" of the censored individuals in that group would mostly be determined by the duration of the study. A longer survival time for censored versus non-censored individuals may just mean that the study went on long enough to pick up most of those who were not "cured" by drug B.
The simplest solution is to choose 1st January 2015 as your time = 0 for all stores. That might not seem as satisfying as modeling from the original store-opening date, but it might be the best you can do. With those time-varying covariates you don't have information about values prior to that date, so you couldn't properly model survival at prior times anyway. A Cox proportional hazards model has an advantage here in that the risk of an event is assumed to be a function of the instantaneous covariate values. Thus even with a 1st January 2015 time reference you will still get useful information about the hazards associated with your covariates since that date, given that a store was open on that date.
This thread discusses a similar situation in modeling customer churn in the insurance industry. Following on from that discussion, one thing that might help here would be to include the length of time that a store had been open prior to 1st January 2015 as a fixed-time covariate in your model, if you could get just that single piece of information for each store. If you can't, the best you can do is to work with the information that you have.
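A minimal sketch of that data preparation step in Python might look like the following. The function name, the argument names, and the example dates are all hypothetical; the only thing taken from the discussion above is the common origin of 1st January 2015 and the idea of carrying the prior open time as a fixed covariate. Stores that were not open on the origin date are simply excluded here, which is an assumption on my part.

```python
from datetime import date

ORIGIN = date(2015, 1, 1)  # common time = 0 for all stores


def rebase(opened, closed, study_end):
    """Return (prior_open_days, duration_days, event) measured from ORIGIN.

    closed is None if no closing date was recorded.
    Returns None for stores that were not open on ORIGIN.
    """
    if opened > ORIGIN or (closed is not None and closed <= ORIGIN):
        return None  # not in the risk set on the origin date
    event = closed is not None and closed <= study_end
    end = closed if event else study_end  # censor at study end if no closure seen
    prior_open_days = (ORIGIN - opened).days  # fixed-time covariate
    duration_days = (end - ORIGIN).days       # follow-up time from time = 0
    return prior_open_days, duration_days, event


# Hypothetical stores: (opening date, closing date or None)
stores = [
    (date(2014, 1, 1), date(2015, 1, 31)),  # closes during follow-up
    (date(2012, 3, 1), None),               # still open at study end, censored
    (date(2010, 5, 1), date(2014, 6, 1)),   # closed before the origin, excluded
]
study_end = date(2015, 12, 31)

rows = [rebase(opened, closed, study_end) for opened, closed in stores]
print(rows)
```

The resulting duration and event indicator would be the outcome for the Cox model, and the prior open time would enter as an ordinary covariate alongside the time-varying ones.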