Integer-valued metrics from a topological viewpoint

general-topology, metric-spaces

I am working a bit in applied mathematics, but not in theoretical mathematics. I dabble a bit in general topology, but if possible would like to ask you to consider my level of know-how and add some intuition into your answer. I can handle some formulas, but since I am not a professional mathematician I am not fluent in all the lingo… so basic terminology is appreciated.


In data analytics / data science, we have premetrics, like Dynamic time warping, which can be calculated between two real-valued sequences, even if the sequences do not have the same length (so they are not vectors). As such, we can say that the space created by the premetric is a topological space.

When we are dealing with nominal-valued sequences (aka strings), like human words, the only operation we have for comparing elements of the sequences is equality. To calculate a metric distance between such sequences, we have e.g. the Levenshtein distance. This metric only outputs integer values as distances. This would create a metric space, I guess, however not in the normal sense that we have infinitesimally close neighbors for each sequence/point.

Usually a metric space would induce a topological space. Usually topological spaces concretize the notion of what "the closest next point" is. Here, the "closest next point" is not infinitesimally close. Can we still talk about topological spaces in this context?

I have not read anywhere that the codomain of the metric of a metric space needs to be a field (which integers are not), but are we still talking about a metric space for the Levenshtein distance?

How should I think about the fact that the Levenshtein distance has integers as output?

Best Answer

This question is predicated on a false assumption, namely that

in the normal sense [of "metric space"] we have infinitesimally close neighbors for each sequence/point.

Most basically, there are no such things as "infinitesimally close points" in a metric space: the distance between any two points is always a (nonnegative) real number, and there are no infinitesimal real numbers.

Moreover, the claim that topological spaces "concretize the notion of what 'the closest next point' is" is wrong for the same reason: we almost never think about "closest next points" in topology.

Even ignoring that, it sounds like you're thinking of metric spaces without isolated points (or maybe complete metric spaces, or complete and connected metric spaces, or something similar). These are just a particular type of metric space; there's no requirement that all metric spaces "look like that."

Indeed, a metric space is merely any set $X$ equipped with a function $\delta: X\times X\rightarrow\mathbb{R}_{\ge 0}$ satisfying three basic properties:

  • $\delta(x,y)=0\leftrightarrow x=y$.

  • $\delta(x,y)=\delta(y,x)$.

  • $\delta(x,y)+\delta(y,z)\ge\delta(x,z)$.

The metric function $\delta$ could be (nonnegative-)integer valued; that's totally fine.
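As a sanity check, the three axioms can be tested by brute force on a small sample of strings. The following is only a sketch: `levenshtein` is a standard unit-cost dynamic-programming implementation, `satisfies_metric_axioms` is a hypothetical helper name, and the word list is an arbitrary choice. Passing on a finite sample illustrates the axioms; it does not prove them.

```python
from itertools import product

def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]

def satisfies_metric_axioms(points, d) -> bool:
    """Brute-force check of the three axioms on a finite sample."""
    for x, y in product(points, repeat=2):
        if (d(x, y) == 0) != (x == y):  # d(x,y) = 0 iff x = y
            return False
        if d(x, y) != d(y, x):          # symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) + d(y, z) < d(x, z):  # triangle inequality
            return False
    return True

# Arbitrary sample of words, chosen only for illustration.
words = ["kitten", "sitting", "mitten", "sit", ""]
print(levenshtein("kitten", "sitting"))
print(satisfies_metric_axioms(words, levenshtein))
```

Every value returned is a nonnegative integer, and nothing in the axioms objects to that.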


That said, there is a sense in which integer-valued metrics are "less topology-flavored" than one might expect.

Suppose $(X,\delta)$ is a metric space with $\operatorname{ran}(\delta)\subseteq\mathbb{Z}$. Then the topology on $X$ induced by $\delta$ is discrete: every subset of $X$ is open. In particular, any two integer-valued metrics on the same set yield the same topology on that set.
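The reason is short enough to spell out. Writing $B(x,r)$ for the open ball of radius $r$ around $x$ (notation introduced here only for this step), an integer-valued distance that is less than $1$ must equal $0$, so

$$B(x,1)=\{y\in X:\delta(x,y)<1\}=\{y\in X:\delta(x,y)=0\}=\{x\}.$$

Hence every singleton is open, and an arbitrary subset $A\subseteq X$ is open because it is a union of open sets:

$$A=\bigcup_{x\in A}\{x\}=\bigcup_{x\in A}B(x,1).$$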

This means that if I want to compare integer-valued metrics (or more generally, discrete metric spaces, which are spaces where for each point $x$ there is some $u>0$ such that every point other than $x$ is at distance at least $u$ from $x$), I have to use "finer-grained" notions than the usual ones coming from topology. But that doesn't mean that integer-valued metrics aren't allowed; it just means that I might have to be careful about what questions I ask about them if I want to get meaningful answers.
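To make "same topology, different metric" concrete, here is a small sketch comparing two integer-valued metrics on the same finite set. The choice of metrics is an assumption made for illustration, not something taken from the question: the 0/1 metric (distance 1 between any two distinct points) and the Hamming distance on binary strings of length 3. The helper name `open_ball` is likewise hypothetical.

```python
from itertools import product

def zero_one(x: str, y: str) -> int:
    """0/1 metric: every pair of distinct points is at distance 1."""
    return 0 if x == y else 1

def hamming(x: str, y: str) -> int:
    """Number of positions where two equal-length strings differ."""
    return sum(a != b for a, b in zip(x, y))

def open_ball(points, d, center, radius):
    """All sample points at distance strictly less than radius from center."""
    return {p for p in points if d(center, p) < radius}

# All binary strings of length 3 (an arbitrary small example space).
points = ["".join(bits) for bits in product("01", repeat=3)]

# Under both metrics the ball of radius 1 is a singleton, so both
# induce the discrete topology on this set.
same_topology = all(
    open_ball(points, d, x, 1) == {x}
    for d in (zero_one, hamming)
    for x in points
)
print(same_topology)

# The metrics still differ as metrics: balls of radius 2 have different sizes.
print(len(open_ball(points, zero_one, "000", 2)))
print(len(open_ball(points, hamming, "000", 2)))
```

Topologically the two spaces are indistinguishable, but the ball of radius 2 around `"000"` contains all 8 points under the 0/1 metric and only 4 under the Hamming distance. Telling them apart requires looking at the distances themselves, for example via distance-preserving maps, rather than only at the open sets.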
