The full, formal, exact definitions are very much not the point, it's a good exercise to do once in your life, partially. Other than that, to quote one of my professors when I was a freshman "convince yourself that it's true".
The idea is that terms, or formulas, or whatnot, which are defined recursively, have a sequence of previously-defined terms.
A string in a language is a term if and only if it is an atomic term, or there is a (finite) sequence such that every member of the sequence is either atomic or is obtained from terms appearing before it in the sequence, and the last thing in the sequence is the term itself.
So, $f(x+y)$ is given by $(x,y,x+y,f(x+y))$, for example. Note that these sequences need not be unique!
And we can write the whole thing into the language of set theory, it's just relatively long and ugly. Although it's getting slightly better once you've written some basic formulas such as "$f$ is a finite sequence" etc.
Upgrading my comments to an answer. I agree that Marker's definition appears slightly vague at first glance; what does the "smallest set" with that property mean? Let me flesh out this kind of construction a bit more generally. Given some property $P$, we often want to construct a set $\mathcal{T}$ that is the "smallest" set satisfying $P$. We mean by this that we want the following two things to hold:
- $\mathcal{T}$ satisfies $P$.
- If $Y$ is any set satisfying $P$, then $\mathcal{T}\subseteq Y$.
Note that such a set is necessarily unique; indeed, suppose $\mathcal{T}'$ is another set satisfying both of these conditions. Since $\mathcal{T}$ satisfies condition $2$ and $\mathcal{T}'$ satisfies condition $1$, we have $\mathcal{T}\subseteq\mathcal{T}'$, and since $\mathcal{T}'$ satisfies condition $2$ and $\mathcal{T}$ satisfies condition $1$, we have $\mathcal{T}'\subseteq\mathcal{T}$. In other words, $\mathcal{T}=\mathcal{T}'$ as claimed.
Now, how do we actually "construct" this set $\mathcal{T}$? The idea is to take $\mathcal{T}$ to be the intersection of all sets satisfying property $P$. Now, this is not necessarily well-defined; for instance, if there does not exist a set satisfying $P$, then we cannot do this, since the intersection of the empty set is not defined. (Why not?) So we need for there to exist at least one set satisfying $P$; in practice, there is often a somewhat "natural" choice for this. Suppose for the purpose of the discussion that we have a set $X$ in mind that satisfies property $P$. By condition $2$ above, our desired set $\mathcal{T}$ will necessarily be a subset of $X$. So, in order to construct $\mathcal{T}$, it suffices to think exclusively about subsets of $X$. In particular, we can define $\mathcal{T}$ as the intersection of all subsets of $X$ that satisfy property $P$.
Now, does $\mathcal{T}$ satisfy condition $1$? Unfortunately, not necessarily; for instance, if $P$ is the property of being non-empty, then the intersection of all subsets of (say) $\mathbb{N}$ satisfying $P$ will itself be empty and hence not satisfy $P$. There is an implicit assumption lurking in the background; namely, that $P$ is stable under intersections. In other words, if $(A_i)_{i\in I}$ is a family of sets satisfying $P$, then $\bigcap_{i\in I}A_i$ also satisfies $P$. So, suppose $P$ is stable under intersections. Then we immediately have that $\mathcal{T}$ satisfies property $P$, since by definition it is an intersection of sets satisfying $P$! So we're done with condition $1$.
Does $\mathcal{T}$ also satisfy condition $2$? Indeed; let $Y$ be any set satisfying $P$. Then $Y\cap X$ satisfies $P$, since $P$ is stable under intersections, and $Y\cap X$ is a subset of $X$. Thus, by definition – since $\mathcal{T}$ is the intersection of all subsets of $X$ satisfying $P$ – we have $\mathcal{T}\subseteq Y\cap X$ and hence $\mathcal{T}\subseteq Y$, as desired. So we are done!
Now let's specialize to the example of Marker. In this case, $P$ is defined by taking $Y$ to satisfy $P$ if and only if $(1)$ $v_i\in Y$ for each $i$, $(2)$ $c\in Y$ for each constant symbol $c\in\mathcal{L}$, and $(2)$ $f(t_1,\dots,t_{n_f})\in Y$ for each function symbol $f\in\mathcal{L}$ and all elements $t_i\in Y$.
Exercise: Show that $P$ is stable under intersections, in the sense defined above. $\blacksquare$
Now, to define $\mathcal{T}$ by the process above, we first need to find some set $X$ satisfying $P$. First define $\mathcal{L}'$ to be the union of $\mathcal{L}$ with the set of variables symbols and the symbols "$($", "$)$", and "$,$". Now let $X$ be the set of all finite strings (ie finite sequences) of symbols of $\mathcal{L}'$.
Exercise: Show that $X$ satisfies property $P$. $\blacksquare$
Once you have shown these two exercises, you are done! Since we can now define $\mathcal{T}$ to be the intersection of all subsets of $X$ that satisfy property $P$. In particular however, note that $\mathcal{T}\subseteq X$. Thus every element of $\mathcal{T}$ is a finite string of symbols of $\mathcal{L}'$, and so, to answer your question – no, there are no elements of $\mathcal{T}$ with infinite nesting as you describe.
Best Answer
I prefer to use the term word instead of string. Successive notions of words over an alphabet $A$ have been defined and studied: finite words, infinite words, bi-infinite words, words over ordinals and words over linear orderings. See [1] for a survey. Cyclic words were also considered: they are equivalence classes of cyclic permutations of finite words.
[1] Véronique Bruyère and Olivier Carton, Automata on linear orderings, J. Comput. System Sci. 73 (2007), no. 1, 1--24. (See also here).