When you flip the first coin there are two equally probable results: $\rm H$ or $\rm T$. The probability for each is $1/2$.
Now if that result was a head (half of the total probability), you must flip the coin a second time, and again there are two equally probable results branching from that point. This gives the probabilities of two of the outcomes $\rm (H,H), (H,T)$ as each being half of the half: $1/4$.
Now if the first toss was a tail, you would toss a die. This time there would be six outcomes branching off that initial result, all equally likely from that point. This gives the probabilities of the remaining six outcomes $\rm (T,1), (T,2), (T,3), (T,4), (T,5), (T,6)$ as each being one sixth of the half: $1/12$.
This is nothing more than the definition of conditional probability.
$$\begin{aligned}\mathsf P((X,Y){=}(x,y)) ~&=~ \mathsf P(X{=}x)~\mathsf P(Y{=}y\mid X{=}x)\\ \mathsf P((X,Y){=}({\rm T},6)) ~&=~ \mathsf P(X{=}{\rm T})~\mathsf P(Y{=}6\mid X{=}{\rm T})\\ &=~\tfrac 1 2\times \tfrac 16 ~=~ \tfrac 1{12}\end{aligned}$$
The other outcomes follow in the same way.
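The branching argument can be checked with a few lines of Python. This is only a sketch: the function name `joint_distribution` and the tuple encoding of outcomes are illustrative choices, not part of the original problem. It multiplies the probability of the first toss by the conditional probability of the second step, using exact fractions.

```python
from fractions import Fraction

def joint_distribution():
    """Return {(first, second): probability} for the two-stage experiment."""
    half = Fraction(1, 2)  # probability of each result of the first toss
    dist = {}
    # First toss is a head: flip the coin again (two equally likely branches).
    for second in ("H", "T"):
        dist[("H", second)] = half * Fraction(1, 2)
    # First toss is a tail: roll the die (six equally likely branches).
    for face in range(1, 7):
        dist[("T", face)] = half * Fraction(1, 6)
    return dist

dist = joint_distribution()
for outcome, p in dist.items():
    print(outcome, p)
print("total:", sum(dist.values()))
```

The eight probabilities should sum to 1, which is a quick sanity check that no branch was missed.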
PS: the probability that the die shows greater than 4, given that there is at least one tail, is:
$$\mathsf P(Y\in\{5,6\}\mid \{X{=}{\rm T}\}\cup \{Y{=}{\rm T}\})=\dfrac{\mathsf P((X,Y)\in\{~({\rm T},5),({\rm T},6)~\}) }{\mathsf P((X,Y)\in\{~({\rm H},{\rm T}),({\rm T},1),({\rm T},2),({\rm T},3),({\rm T},4),({\rm T},5), ({\rm T},6)~\})}=\dfrac{2/12}{1/4+6/12}=\dfrac 29$$
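As a rough check of this ratio, the sketch below rebuilds the eight-outcome distribution and divides the probability of the joint event by the probability of the conditioning event. The helper names `prob`, `at_least_one_tail` and `die_above_four` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

# Joint distribution of (first toss, second step), as derived above.
dist = {("H", "H"): Fraction(1, 4), ("H", "T"): Fraction(1, 4)}
dist.update({("T", face): Fraction(1, 12) for face in range(1, 7)})

def prob(event):
    """Total probability of the outcomes for which event(outcome) is true."""
    return sum(p for outcome, p in dist.items() if event(outcome))

def at_least_one_tail(outcome):
    # True for (H, T) and for every (T, face) outcome.
    return "T" in outcome

def die_above_four(outcome):
    # The die is only rolled after a tail, so the first entry must be "T".
    return outcome[0] == "T" and outcome[1] in (5, 6)

numerator = prob(lambda o: die_above_four(o) and at_least_one_tail(o))
denominator = prob(at_least_one_tail)
conditional = numerator / denominator
print(conditional)  # 2/9
```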
> My thoughts: the **experiment** here is actually "choose a coin, flip it
> twice" meaning that a possible **outcome** of this experiment is, for
> example, "biased coin, two heads" or "fair coin, head, tail" etc.
> Therefore this means that this "biased coin, two heads" and this "fair
> coin, head, tail" are **elementary events**.
> Consequently our **sample space** is something like
> $$ S = \{ \text{(F, TT), (F, HT), (F, TH), (F, HH), (B, HH)} \} $$

Yes, your explanation and illustration of the terminology (which I've boldfaced) are exactly correct.

> Therefore events should be subsets of $S$,

Yes, and this experiment has $2^5=32$ distinct events associated with it.

> but the author's events are things
> like "choose a coin" or "flip coin first time", which definitely
> can't be subsets of $S$.

Yes, because these are experiment trials, not events.
Although it is common to associate trials with events (e.g., the event of landing Tail in the first flip is $\{\text{(F, TT), (F, TH)}\}$ and the event of landing Head in the first flip is $\{\text{(F, HT), (F, HH), (B, HH)}\}$), this clearly isn't what the author means either.
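To make the distinction concrete, here is a small sketch that treats events literally as subsets of $S$. The encoding of outcomes as `(coin, flips)` tuples and the helper names `all_events` and `first_flip` are assumptions made for this illustration. It lists every event (there should be $2^5=32$ of them) and builds the two "first flip" events mentioned above.

```python
from itertools import combinations

# Sample space from the discussion above: (coin, result of the two flips).
S = [("F", "TT"), ("F", "HT"), ("F", "TH"), ("F", "HH"), ("B", "HH")]

def all_events(sample_space):
    """Every subset of the sample space, i.e. every event."""
    return [frozenset(subset)
            for size in range(len(sample_space) + 1)
            for subset in combinations(sample_space, size)]

def first_flip(result):
    """The event 'the first flip shows `result`', as a subset of S."""
    return frozenset(o for o in S if o[1][0] == result)

events = all_events(S)
tail_first = first_flip("T")
head_first = first_flip("H")

print(len(events))
print(sorted(tail_first))
print(sorted(head_first))
```

Both `tail_first` and `head_first` appear in `events`, whereas a trial such as "choose a coin" has no corresponding subset at all.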
Best Answer
For any event $A$, a certain event $B$, and an impossible event $C$, where $A$, $B$ and $C$ are all independent, we need $A$ and $B$ both happening to be as probable as $A$, $B$ and $C$ both happening to be as probable as $C$, and $A$ and $C$ both happening to be as probable as $C$. Written out with the definition of independence, this means that:
$$P(AB) = P(A)P(B) = P(A)$$ $$P(BC) = P(B)P(C) = P(C)$$ $$P(AC) = P(A)P(C) = P(C)$$
The events $A$ and $C$ are also disjoint ($C$ won't happen whenever $A$ happens because $C$ can't happen), and since we need the probability of either happening to equal the probability of just $A$ happening, we need: $$P(A \cup C) = P(A) + P(C) = P(A)$$
These are all true only if $P(B) = 1$ and $P(C) = 0$. Put differently, in order for independence to distribute through probabilities, we need certainty to correspond with the multiplicative identity 1 and impossibility to correspond with the additive identity 0. Formally, this is true in any probability space where the events form a field.
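A quick numerical illustration, assuming a fair six-sided die as the probability space (this choice, and the particular event $A$, are not from the argument above): take $B$ to be the whole sample space and $C$ the empty set, then check each identity with exact fractions.

```python
from fractions import Fraction

# Assumed probability space for illustration: one roll of a fair die.
omega = frozenset(range(1, 7))

def P(event):
    """Uniform probability of an event (a subset of omega)."""
    return Fraction(len(event & omega), len(omega))

A = frozenset({2, 4, 6})  # an arbitrary event: "the roll is even"
B = omega                 # the certain event
C = frozenset()           # the impossible event

# The three independence identities from the text.
assert P(A & B) == P(A) * P(B) == P(A)
assert P(B & C) == P(B) * P(C) == P(C)
assert P(A & C) == P(A) * P(C) == P(C)
# Additivity for the disjoint events A and C.
assert P(A | C) == P(A) + P(C) == P(A)

print(P(A), P(B), P(C))  # 1/2 1 0
```

This only confirms that the values $P(B)=1$ and $P(C)=0$ satisfy the identities in one example; it does not replace the argument that they are the only values that do.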