Let $(S, \mathcal{F}, P)$ be a probability triplet: $S$ is the sample space, $\mathcal{F}$ is a sigma algebra on $S$ (containing all the events), and $P:\mathcal{F}\rightarrow \mathbb{R}$ is a probability measure.
Suppose $\{X_i\}_{i=1}^{\infty}$ is a sequence of mutually independent and identically distributed (i.i.d.) random elements. In particular
$$X_i:S\rightarrow \{H, T\}$$
is a measurable map from the sample space to the set $\{H,T\}$, so for each $i \in \{1, 2, 3,...\}$ we have
$$\{\omega \in S: X_i(\omega) = H\} \in \mathcal{F}$$
Assume $P[X_i=H]=P[X_i=T]=1/2$.
Claim 1: For each sequence $\{h_i\}_{i=1}^{\infty}$ with $h_i \in \{H,T\}$, we have
$$\bigcap_{i=1}^{\infty} \{X_i=h_i\}= \bigcap_{i=1}^{\infty} \{\omega \in S: X_i(\omega)=h_i\} \in \mathcal{F}$$
Proof: Each event $\{X_i=h_i\}$ belongs to $\mathcal{F}$ by the measurability of $X_i$, and since $\mathcal{F}$ is a sigma algebra, the countable intersection of events in $\mathcal{F}$ is in $\mathcal{F}$. $\Box$
Claim 2: It is possible to construct $(S, \mathcal{F}, P)$, for which such i.i.d. random elements $\{X_i\}$ exist, in these example cases:
a) $S = \{red, blue\} \cup [0,1)$
b) $S = A$, where $A=\{(h_1, h_2, h_3, ...) : h_i \in \{H,T\}\quad \forall i \in \{1, 2, 3,...\}\}$.
c) $S = A \setminus \{(T, T, T, T, ...)\}$.
In particular, examples (a) and (c) can be viewed as "counter-examples" to your claim that the sample space must contain all binary sequences of H/T. The example (c) contains all binary sequences of H/T except for the all-Tails sequence $(T, T, T, T,...)$ (just start with the probability triplet in part (b) but throw away the probability-0 outcome of all tails).
Quick justifications for (a)-(c):
a) Use $P[\{red\}]=P[\{blue\}]=0$ and choose $\omega \in [0,1)$ according to Lebesgue measure on the Borel sigma algebra of $[0,1)$. Write $\omega\in [0,1)$ as
$$ \omega = \sum_{i=1}^{\infty} \omega_i 2^{-i}$$
where $\{\omega_i\}$ is the unique binary expansion that does not contain an infinite tail of 1s. Define for each $i \in \{1,2,3,...\}$
$$X_i(red)=X_i(blue)=H$$
For $\omega \in [0,1)$ define
$$X_i(\omega) = \begin{cases}
H & \text{if } \omega_i=1\\
T & \text{otherwise}
\end{cases}$$
b) This is the standard construction: take $\mathcal{F}$ to be the sigma algebra generated by the cylinder sets of $A$, let $P$ be the product measure that assigns probability $2^{-n}$ to each cylinder fixing $n$ coordinates, and let $X_i$ be the $i$-th coordinate map.
c) Just start with (b) and throw away an outcome of probability 0.
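As a sanity check on construction (a), here is a minimal Python sketch of the digit map (the helper names `digit` and `X` are mine, not part of the construction), with a Monte Carlo check of the fair-coin property:

```python
import random

random.seed(3)

def digit(omega, i):
    """i-th binary digit of omega in [0,1), via the floor-based expansion
    (the one that never ends in an infinite tail of 1s)."""
    return int(omega * 2 ** i) % 2

def X(i, omega):
    """X_i(omega) = H if the i-th binary digit of omega is 1, else T."""
    return "H" if digit(omega, i) == 1 else "T"

# Sanity check: 0.25 = 0.01 in binary, so X_1 = T and X_2 = H there.
assert X(1, 0.25) == "T" and X(2, 0.25) == "H"

# Monte Carlo check that each X_i behaves like a fair coin.
n = 100_000
freq = {}
for i in (1, 2, 5):
    heads = sum(1 for _ in range(n) if X(i, random.random()) == "H")
    freq[i] = heads / n
```

Each empirical frequency in `freq` lands near $1/2$, as $P[X_i=H]=1/2$ requires.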
Your error is in this line:
My thought process here was the following: because the question says until there has been at least one success, then for k trials, we need at least one success, i.e., we can get up to k success.
The key detail is that you're counting until the first success. This doesn't mean you can have up to $k$ successes; it means instead that your first $k-1$ trials could not have had a success, but that your $k^{\text{th}}$ one did, making it the first of them. That's the role of the word until in this context; it implies that we stop when we find a success. (Yes, the use of "at least one" becomes necessarily strange in this formulation.)
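To see the difference concretely, here is a short Python simulation (with an illustrative success probability $p=3/4$; the helper name `first_success` is mine) comparing empirical frequencies of the first-success time against $(1-p)^{k-1}p$:

```python
import random

random.seed(1)

def first_success(p):
    """Flip until the first success; return the index of that flip."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

p = 0.75          # illustrative success probability
n = 200_000
counts = {}
for _ in range(n):
    k = first_success(p)
    counts[k] = counts.get(k, 0) + 1

# "First success on trial k" means k-1 failures followed by a success:
# P(K = k) = (1-p)^(k-1) * p  -- not "up to k successes".
pmf = {k: counts.get(k, 0) / n for k in (1, 2, 3)}
```

The empirical values match $p$, $(1-p)p$, $(1-p)^2 p$, confirming the "stop at the first success" reading.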
Best Answer
As David K states, your process is exactly a Bernoulli process with non-random success probability $p=3/4$. The expected number of flips is then $1/p = 4/3\approx1.333$.
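A quick Monte Carlo sketch of this expectation (variable names mine), estimating the mean number of flips until the first success with $p=3/4$:

```python
import random

random.seed(2)

p = 3 / 4        # success probability per flip
n = 200_000

total_flips = 0
for _ in range(n):
    flips = 1
    while random.random() >= p:   # keep flipping until the first success
        flips += 1
    total_flips += flips

mean_flips = total_flips / n      # should be close to 1/p = 4/3
```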
Your argument and approach are good. You can* construct an i.i.d. sequence $U_i$ of $U[0,1]$ variables and another i.i.d. sequence $S_i$ of $U[1/2,1]$ variables, independent of the $U_i$, and consider the sequence of coupled binary outcomes $(X_i,Y_i)$, where $X_i = 1$ exactly when $U_i\le 1/2$ and $Y_i = 1$ exactly when $U_i\le S_i$. Since $S_i\ge 1/2$ always, $U_i\le 1/2$ implies $U_i\le S_i$, and $P[Y_i=1]=E[S_i]=3/4$. Then the $X_i$ process has the same probability distribution as the standard Bernoulli process, the $Y_i$ process has the same probability distribution as your $P$ process, and $X_i\le Y_i$ with probability $1$.
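The coupling can be sketched in a few lines of Python (a simulation of the coupled pairs, not the measure-theoretic construction itself; variable names are mine), checking $X_i\le Y_i$ pointwise and the two marginal frequencies:

```python
import random

random.seed(4)

n = 200_000
x_ones = y_ones = 0
dominated = True
for _ in range(n):
    u = random.uniform(0.0, 1.0)   # U_i ~ U[0,1]
    s = random.uniform(0.5, 1.0)   # S_i ~ U[1/2,1], independent of U_i
    x = 1 if u <= 0.5 else 0       # X_i: fair Bernoulli(1/2)
    y = 1 if u <= s else 0         # Y_i: Bernoulli(E[S_i]) = Bernoulli(3/4)
    x_ones += x
    y_ones += y
    if x > y:                      # can never happen, since s >= 1/2
        dominated = False
```

Here `dominated` stays `True`, `x_ones / n` is near $1/2$, and `y_ones / n` is near $3/4$.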
Footnote: If you are afraid your original probability space $(\Omega,\mathcal A, P)$ is not rich enough to support all these newly constructed rvs, don't worry. It is rich enough to support a $U[1/2,1]$ random variable, and hence is a so-called standard probability space. If it supports a uniform rv, that rv's binary digits form an i.i.d. sequence of fair coin flips; by a Cantor-style interleaving argument, one such sequence splits into countably many independent such sequences, hence into countably many independent uniforms, and so on. The resulting $X_i$ and $Y_i$ constructed this way will not be equal $\omega$ by $\omega$ to what you started out with, but will have the same distributional properties.