As far as my understanding goes, it is the average lifetime of a collection of nuclei undergoing disintegration. But doesn't each nucleus take an infinite amount of time to decay? Is that not why we use the concept of half-life? Then, shouldn't the mean life be infinite as well?
Mean Life of Radioactive Substances – Nuclear Physics Insights
half-life, nuclear-physics, radioactivity
Related Solutions
The right way to think about this is that, over 5,730 years, each single carbon-14 atom has a 50% chance of decaying. Since a typical sample has a huge number of atoms¹, and since they decay more or less independently², we can statistically say, with very high accuracy, that after 5,730 years half of all the original carbon-14 atoms will have decayed, while the rest still remain.
To answer your next natural question, no, this does not mean that the remaining carbon-14 atoms would be "just about to decay". Generally speaking, atomic nuclei do not have a memory³: as long as it has not decayed, a carbon-14 nucleus created yesterday is exactly identical to one created a year ago or 10,000 years ago or even a million years ago. All those nuclei, if they're still around today, have the same 50% probability of decaying within the next 5,730 years.
If you like, you could imagine each carbon-14 nucleus repeatedly tossing a very biased imaginary coin very fast (faster than we could possibly measure): on each toss, with a very, very tiny chance, the coin comes up heads and the nucleus decays; otherwise, it comes up tails, and the nucleus stays together for now. Over a period of, say, a second or a day, the odds of any of the coin tosses coming up heads are still tiny — but, over 5,730 years, the many, many tiny odds gradually add up to a cumulative decay probability of about 50%.
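The coin-toss picture can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the toss rate of one toss per second (`TOSSES_PER_SECOND`) is hypothetical, and the per-toss probability is chosen so that the survival probability over one half-life is exactly 1/2. Strictly speaking the tiny odds compound rather than add, which is why two half-lives give about 75% and not 100%.

```python
import math

HALF_LIFE_YEARS = 5730.0                 # carbon-14 half-life quoted above
SECONDS_PER_YEAR = 365.25 * 24 * 3600
TOSSES_PER_SECOND = 1.0                  # hypothetical toss rate (an assumption)


def tosses_in(seconds):
    """Number of imaginary coin tosses made in the given time span."""
    return TOSSES_PER_SECOND * seconds


def per_toss_probability():
    """Chance of 'heads' per toss such that (1 - p)**n = 1/2 over one half-life."""
    n = tosses_in(HALF_LIFE_YEARS * SECONDS_PER_YEAR)
    # p = 1 - 2**(-1/n); expm1 keeps precision because p is extremely small
    return -math.expm1(-math.log(2) / n)


def decay_probability(seconds):
    """Cumulative chance of at least one 'heads' within the time span."""
    p = per_toss_probability()
    # 1 - (1 - p)**n, evaluated through logarithms to avoid round-off
    return -math.expm1(tosses_in(seconds) * math.log1p(-p))


half_life_seconds = HALF_LIFE_YEARS * SECONDS_PER_YEAR
for label, seconds in [("one second", 1.0),
                       ("one day", 86400.0),
                       ("one half-life", half_life_seconds),
                       ("two half-lives", 2 * half_life_seconds)]:
    print(f"{label:>15}: {decay_probability(seconds):.3e}")
```

Changing the assumed toss rate changes the per-toss probability but not the cumulative probabilities, which is the point of the analogy.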
¹ A gram of carbon contains about 0.08 moles, or about $5 \times 10^{22}$ atoms. In a typical natural sample, about one in a trillion ($1 / 10^{12}$) of these will be carbon-14, giving us about 50 billion ($5 \times 10^{10}$) carbon-14 atoms in each gram of carbon.
² Induced radioactive decay does occur, most notably in fission chain reactions. Carbon-14, however, undergoes spontaneous $\beta^-$ decay, whose rate is not normally affected by external influences to any significant degree.
³ Nuclear isomers and other excited nuclear states do exist, so it's not quite right to say that all nuclei of a given isotope are always identical. Still, even these can, in practice, be effectively modeled as discrete states, with spontaneous transitions between different states occurring randomly with a fixed rate over time, just as nuclear decay events do.
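The numbers in the first footnote can be reproduced with a short calculation. This is a rough sketch: the molar mass of 12.011 g/mol and the Avogadro constant are standard values not stated in the text, and the one-in-a-trillion carbon-14 fraction is taken from the footnote as given.

```python
AVOGADRO = 6.02214076e23      # atoms per mole (exact SI value)
MOLAR_MASS_CARBON = 12.011    # grams per mole (standard atomic weight, assumed here)
C14_FRACTION = 1e-12          # "one in a trillion", as quoted in the footnote

moles_per_gram = 1.0 / MOLAR_MASS_CARBON
atoms_per_gram = moles_per_gram * AVOGADRO
c14_atoms_per_gram = atoms_per_gram * C14_FRACTION

print(f"moles per gram:       {moles_per_gram:.3f}")
print(f"atoms per gram:       {atoms_per_gram:.2e}")
print(f"carbon-14 per gram:   {c14_atoms_per_gram:.2e}")
```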
Congratulations on deriving the exponential law for yourself; one learns a great deal about science by working like this. Now to your last question:
If I had a group of atoms that have an 'average lifetime' of say 5 seconds, after 5 seconds has elapsed, what is the 'average lifetime' of the remaining atoms? I don't think I can arbitrarily choose some reference time to begin ticking away at the atoms' remaining time, does that mean at any point of time that their 'average lifetime' or expected lifetime is always a constant, and never actually diminishes as time goes on?
Yes, indeed: the average lifetime is constant, and the exponential distribution you have derived is the unique lifetime distribution with this property. Another way of saying this is that the decaying particle is memoryless: it does not encode its "age", and there is nothing inside the particle that says "I've lived a long time, now it's time to die". Yet another take on this, as a discrete rather than continuous probability distribution, is the geometric distribution of the number of throws before a coin turns up heads, together with the observation that a coin has no memory, which counters the famous gambler's fallacy.
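A quick simulation makes the constant average lifetime concrete. The sketch below assumes exponentially distributed lifetimes with a mean of 5 seconds (the figure from the question), draws a hypothetical sample of 200,000 atoms with a fixed random seed, and measures the mean remaining lifetime of whichever atoms are still alive after 0, 5 and 10 seconds.

```python
import random

TAU = 5.0            # mean lifetime in seconds, from the question
N_ATOMS = 200_000    # hypothetical sample size (an assumption)


def mean_remaining_lifetime(lifetimes, elapsed):
    """Average further lifetime of the atoms that survive past `elapsed`."""
    remaining = [t - elapsed for t in lifetimes if t > elapsed]
    return sum(remaining) / len(remaining)


rng = random.Random(0)  # fixed seed so the run is repeatable
lifetimes = [rng.expovariate(1.0 / TAU) for _ in range(N_ATOMS)]

for elapsed in (0.0, 5.0, 10.0):
    survivors = sum(1 for t in lifetimes if t > elapsed)
    mean_left = mean_remaining_lifetime(lifetimes, elapsed)
    print(f"after {elapsed:4.1f} s: {survivors:6d} survivors, "
          f"mean remaining lifetime {mean_left:.2f} s")
```

The number of survivors shrinks at each step, but the mean remaining lifetime of the survivors should stay close to 5 seconds within sampling noise.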
To understand this uniqueness, we encode the memorylessness condition into the basic probability law
$$p(A\cap B) = p(A) \, p(B|A)$$
Suppose after time $\delta$ you observe that your particle has not decayed (event $A$). If $f(t)$ is the probability density of lifetimes, then the probability that the particle has lasted at least this long, i.e. the probability that it does not decay in the time interval $[0,\,\delta]$, is:
$$p(A) = 1-\int_0^\delta f(u)du$$
The a priori probability distribution function that the particle will last until time $t+\delta$ and then decay in the time interval $dt$ (event $B$) is
$$p(B\cap A) = f(t+\delta)\, dt.$$
This is events $B$ and $A$ observed together, which is the same as plain old $B$, since the particle cannot last until time $t + \delta$ without living to $\delta$ first! Therefore, the conditional probability density function is
$$p(B|A) = \frac{f(t+\delta)\,dt}{1-\int_0^\delta f(u)du}$$
But this must be the same as the unconditional probability density that the particle lasts a further time $t$ measured from any time, by assumption of memorylessness. Thus we must have:
$$\left(1 - \int_0^\delta f(u)du\right)\,f(t) = f(t+\delta),\;\forall \delta>0$$
Letting $\delta\rightarrow 0$, we get the differential equation $f^\prime(t) = - f(0) f(t)$, whose unique solution is $f(t) = \frac{1}{\tau}\exp\left(-\frac{t}{\tau}\right)$. You can readily check that this function fulfills the general functional equation $\left(1 - \int_0^\delta f(u)du\right)\,f(t) = f(t+\delta)$ for any $\delta > 0$ as well.
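One way to fill in the limiting step, sketched under the assumption that $f$ is differentiable, is to differentiate both sides of the functional equation with respect to $\delta$ and then set $\delta = 0$:

$$\frac{\partial}{\partial\delta}\left[\left(1 - \int_0^\delta f(u)\,du\right) f(t)\right]_{\delta=0} = -f(0)\,f(t), \qquad \frac{\partial}{\partial\delta} f(t+\delta)\bigg|_{\delta=0} = f^\prime(t)$$

Equating the two gives $f^\prime(t) = -f(0)\,f(t)$, so $f(t) = f(0)\,e^{-f(0)\,t}$, which is already normalized. Writing $\tau = 1/f(0)$ recovers the form above, and the mean lifetime follows by integration by parts:

$$\langle t \rangle = \int_0^\infty t\,\frac{1}{\tau}\exp\left(-\frac{t}{\tau}\right)dt = \tau$$

So the mean lifetime is finite even though the density never reaches exactly zero.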
As Akhmeteli's answer says, true memorylessness is actually incompatible with simple quantum models. For example, one can derive the exponential lifetime for an excited fluorophore from a simple model of a lone excited two state fluorophore equally coupled to all the modes of the electromagnetic field. The catch is that the derivation rests on approximating an integral over positive energy field modes by an integral over all energies, both positive and negative. This of course is unphysical, but an excellent approximation since only modes near to the two state atom's energy gap will be excited: the fluorophore "tries" to excite all modes equally, but destructive interference prevents significant coupling to modes of greatly different energy than the difference between the energies of the states on either side of the transition.
I show how this analysis is done in this answer here and here.
Linewidths are mostly extremely narrow compared to the frequencies of the photons concerned, so I find it surprising and quite wonderful that Akhmeteli cites a paper giving experimental evidence of the nonconstant lifetime.
Best Answer
What do you mean by this part of your question: "But doesn't each nucleus take an infinite amount of time to decay?"
As far as I know, this is not true. A nucleus will start in one state, and end in another "decayed state" + radiation ($\alpha^{2+}, \beta^\pm, \gamma$ or whatever), and this is not an infinitely long process.
A nucleus has a probability of decaying within the next time interval, say $\delta t$, or not. Thanks to how statistics and probability work, if we have a large number of these nuclei, they will collectively exhibit a "mean lifetime" (i.e. we are able to obtain an average time it takes for one nucleus to decay).
Perhaps you're getting confused by this formula:
$$N = N_0e^{-\lambda t} = N_0e^{-t/\tau}$$
where $N$ is the number of non-decayed nuclei present in your sample, and $N_0$ is the number of initial non-decayed nuclei.
In this case, yes it takes (in theory) an infinite amount of time for $N$ to reach $0$, though this assumes $N$ can vary continuously (such as taking values like $N=0.01$, which is non-physical - $N$ can only take integer values). As $N$ and $N_0$ get larger, this equation better describes the situation.
Here, $\tau = 1/\lambda$ is in fact the mean lifetime, and is related to the half-life, $\tau_{1/2}$, via
$$\tau = \frac{\tau_{1/2}}{\ln 2}$$
(from http://hyperphysics.phy-astr.gsu.edu/hbase/Nuclear/meanlif.html)
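As a rough numerical illustration of the points in this answer, the sketch below converts the carbon-14 half-life of 5,730 years (quoted elsewhere on this page) into a mean life, then simulates a hypothetical sample of 100,000 nuclei, each given its own random lifetime. The sample size and random seed are assumptions made only for the example. Every individual lifetime, including the longest, is finite, and the sample average comes out close to $\tau$.

```python
import math
import random

C14_HALF_LIFE_YEARS = 5730.0   # half-life quoted elsewhere on this page


def mean_life_from_half_life(half_life):
    """Mean life tau = half_life / ln 2."""
    return half_life / math.log(2)


def expected_survivors(n0, t, tau):
    """Continuous decay law N = N0 * exp(-t / tau)."""
    return n0 * math.exp(-t / tau)


tau = mean_life_from_half_life(C14_HALF_LIFE_YEARS)

# Hypothetical sample: integer number of nuclei, each with a random lifetime.
rng = random.Random(1)          # fixed seed so the run is repeatable
n0 = 100_000
lifetimes = [rng.expovariate(1.0 / tau) for _ in range(n0)]

survivors_at_tau = sum(1 for t in lifetimes if t > tau)
sample_mean = sum(lifetimes) / n0
last_decay = max(lifetimes)     # when the final nucleus in the sample decays

print(f"mean life tau:                {tau:.0f} years")
print(f"survivors at t = tau:         {survivors_at_tau} "
      f"(continuous law: {expected_survivors(n0, tau, tau):.0f})")
print(f"average simulated lifetime:   {sample_mean:.0f} years")
print(f"last nucleus decayed after:   {last_decay:.0f} years")
```

The continuous formula only approaches zero asymptotically, while the simulated integer sample runs out after a finite time of several mean lives.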