Understanding big O for multiple variables in the context of optimization theory.

asymptotics · definition · optimization

I am trying to understand how big O is used for multiple variables, especially within the context of mathematical optimization. For example, in this paper, they mention a previous iteration-complexity result (on page 2, under "Our results and their relation to previous work:") for a support vector machine optimization algorithm, which was shown to take $O(m^4/\epsilon)$ iterations to come within $\epsilon$ of optimality, where $m$ is the number of variables. This usage of big O involves multiple variables, namely $m$ and $\epsilon$. The exact line is quoted below:

Combining these results, we see that the decomposition algorithm of Hush and Scovel for problem SVO is within $\epsilon$ of optimality after $O(m^4/\epsilon)$ iterations.

I am currently confused by the exact meaning of this use of big O. So far, any references I have found that discuss big O with multiple variables solely discuss the case where the variables of interest approach infinity (see Wikipedia). However, for the example I have provided, I feel like they are interested in the case where $m\rightarrow \infty$, and $\epsilon \rightarrow 0$. I know that big O can be defined in the univariate case so that it covers both finite and infinite limits as follows:
$$ f(x) \text{ is } O(g(x)) \text{ as } x \rightarrow a \text{ if } \limsup_{x \rightarrow a} \frac{|f(x)|}{g(x)} < \infty,$$ where $a \in \overline{\mathbb{R}}$ with $\overline{\mathbb{R}}:=\mathbb{R}\cup\{-\infty,+\infty\}$. Intuitively speaking, I think this definition can be extended to the case where $f,g$ are defined on $\mathbb{R}^d$ as follows:
$$ f(x) \text{ is } O(g(x)) \text{ as } x \rightarrow a \text{ if } \limsup_{x \rightarrow a} \frac{|f(x)|}{g(x)} < \infty,$$
where $ a \in \overline{\mathbb{R}}^d$. Using this definition and letting $N(m,\epsilon)$ denote the iteration complexity, my current interpretation of the result in the example is that $N(m,\epsilon) \text{ is } O(m^4/\epsilon)$ as $(m,\epsilon)\rightarrow (+\infty,0)$.
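Spelling out the $\limsup$ in this multivariate definition at the point $a = (+\infty, 0)$, my interpretation would amount to the following explicit statement (just a restatement of the above, for concreteness): there exist constants $C, M, \delta > 0$ such that
$$ N(m,\epsilon) \le C\,\frac{m^4}{\epsilon} \quad \text{whenever } m \ge M \text{ and } 0 < \epsilon \le \delta. $$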

This leads me to several related questions:

  1. Is the definition I have provided for big O in multiple dimensions something that is standard, or is it just gibberish?
  2. Is my interpretation according to the definition I provided correct? If not, what is the correct interpretation of the iteration complexity result?
  3. It feels like most papers are written assuming readers already know the meaning of big O for iteration complexity. Does anyone ever state its precise meaning?

Best Answer

The definition you give is mostly a fine interpretation of big-$O$ computational complexity. It is not the standard definition because it's too general; we don't ever need to know the big-$O$ complexity of $f(x,y)$ as $(x,y) \to (3,4)$. I would think of the number of iterations as a function of $m$ and $\frac1{\varepsilon}$, and then we can go with a less general definition as $m, \frac1{\varepsilon} \to \infty$.
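Spelled out, one natural way to write that less general definition (both variables large simultaneously, in the style of the Wikipedia definition) is: there exist $C, M > 0$ such that
$$ N(m,\varepsilon) \le C\,\frac{m^4}{\varepsilon} \quad \text{whenever } m \ge M \text{ and } \frac1{\varepsilon} \ge M. $$
Taking $\delta = \frac1M$, this matches the unpacked form of your definition at $(m,\varepsilon) \to (+\infty, 0)$, so the two viewpoints agree for this problem.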

I say "mostly fine" because actually as soon as we start to do anything weird with the definition of big-$O$ (and multiple variables counts as "weird") the conclusions that people want to draw from big-$O$ statements are not always supported by the definitions. For example, I can imagine someone applying the result on iteration complexity that you quote to conclude that for a fixed problem with $m=10$, it takes $O(\frac1{\varepsilon})$ iterations to get within $\varepsilon$ of optimality. This does not follow from your definition or from Wikipedia's.

(In practice, I would expect that this is still true, and what saves us is that people don't typically hide any particularly strange terms under big-$O$. Maybe the actual function would be more like $\binom{m}{2}^2 (\frac1\varepsilon + 1) + 13m\log m$, but it's not going to have a secret piecewise term "plus $\frac1{\varepsilon^{10}}$ when $m \le 20$". That sort of term usually doesn't appear in algorithms, and if it did, I'd expect it to be called out.)
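To make this concrete, here is a small numerical sketch in Python. The function `N` below is my made-up illustration from the previous paragraph (benign main term plus the hypothetical hidden piecewise term), not anything taken from the paper:

```python
import math

def N(m, eps):
    # Made-up iteration count from the illustration above: a benign
    # main term plus a hypothetical hidden piecewise term for small m.
    benign = math.comb(m, 2) ** 2 * (1 / eps + 1) + 13 * m * math.log(m)
    hidden = 1 / eps ** 10 if m <= 20 else 0.0
    return benign + hidden

def ratio(m, eps):
    # N(m, eps) divided by the claimed bound m^4 / eps.
    return N(m, eps) / (m ** 4 / eps)

# Joint limit (m large AND eps small): the ratio stays bounded near 1/4,
# since binom(m, 2)^2 ~ m^4 / 4, so N = O(m^4 / eps) holds.
for m, eps in [(100, 1e-2), (1000, 1e-4), (10000, 1e-6)]:
    print(f"m = {m:>5}, eps = {eps:.0e}: ratio = {ratio(m, eps):.4f}")

# Fixed m = 10 slice: the hidden term dominates and the ratio blows up,
# so "O(1/eps) iterations for m = 10" does NOT follow from the bound.
for eps in [1e-1, 1e-2, 1e-3]:
    print(f"m =    10, eps = {eps:.0e}: ratio = {ratio(10, eps):.3g}")
```

The point is that the multivariate definition only constrains $N$ where $m$ is large and $\varepsilon$ is small at the same time; it says nothing about any fixed-$m$ slice.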

To address another part of the question: it is indeed common not to state explicitly that we are considering the limit as $m \to \infty$ and $\varepsilon \to 0$. There are three ways we deduce this from context:

  1. Notation: in a limit, a variable called $m$ always approaches $\infty$ and a variable called $\varepsilon$ always approaches $0$.
  2. Experience: in problems where we allow some tolerance $\varepsilon$ on the answer, having some $\frac1{\varepsilon}$ or $\log \frac1{\varepsilon}$ or $\text{poly}(\frac1{\varepsilon})$ factor in the big-$O$ complexity is pretty common, and it tracks what happens as $\varepsilon \to 0$.
  3. Motivation: an intuition for complexity analysis is "the more you want, the higher the cost will be" and the big-$O$ complexity is an upper bound on that cost. Taking $m \to \infty$ and $\varepsilon \to 0$ are both instances of "wanting more".