The formal definition of ‘string’

definitionformal-languagesterminology

I just found out about uncountable and infinitary languages and I'm not sure what "string" means now. Originally, I had conceived of a string being, at most, a countable sequence of symbols with a least (leftmost) member. Since this fails to account for several types of infinite-length string (for example, strings that have no first or last symbol, uncountable strings), I'm at a loss as to what "string" means in the first place.

At the very least a string consists of a set [or class?] of symbols$^0$ with some kind of order, but I'm not sure where the line between "string" and "digraph" is drawn. If there are no restrictions, then the notion of "string" seems sufficiently general to be indistinguishable from "binary relation."

Edit:

To give something of a starting point, I had originally thought to define a general "closure" operation that gives the set of all strings up to a certain (possibly infinite) length.

Let $\gamma$ be some fixed ordinal, then…

$$Cl_\gamma(X)=\bigcup_{i<\gamma}X^i$$

This readily generates the set of $\omega$-length strings when $\gamma=\omega+1$. However, the closure described above excludes strings which are not "well ordered." For example, strings with no leftmost symbol, which can be generated by infintary left-recursive grammars. Even extending the above to count reverse-order types for each ordinal$^1$ excludes the case of "cyclic" and "dense" strings (provided that such things are permitted).


$^0$ when referring to the characters of a string as a set [class?], assume that each character is identified with a pair $(k,s)$, where $s$ is the symbol and $k$ is a "tag" distinguishing different instances of $s$. If you take this one step further, then a string is a function from some set [class?] $X$ to a set $S$ of primitive symbols, plus a binary relation [order?] on $X$.

$^1$ $X^{\omega^*}$, appropriately defined, would account for $\omega$-length "left-recursive" strings

Best Answer

I prefer to use the term word instead of string. Successive notions of words over an alphabet $A$ have been defined and studied: finite words, infinite words, bi-infinite words, words over ordinals and words over linear orderings. See [1] for a survey. Cyclic words were also considered: they are equivalence classes of cyclic permutations of finite words.

[1] Véronique Bruyère and Olivier Carton, Automata on linear orderings, J. Comput. System Sci. 73 (2007), no. 1, 1--24. (See also here).

Related Question