That's set theory. I honestly recommend that you study set theory, since nowadays essentially all of mathematics is built on sets. Just pick any introductory book on set theory; that will be (imo) enough.
Now, what you may be having difficulty with is the interpretation of those things. Let's take an example from that page:
Given the set of N observations Y = {yi ; i = 1, . . . , N } we want to identify which
observations belong to the same object.
This basically means that $Y$ is a set containing observations. A simpler (less formal) translation could be: "We have a bunch of observations".
We are looking for a partition ω of observations from Y into several
trajectories Yk ⊂ Y (subsets of Y ) such that each trajectory collects
observations believed to come from a single person.
This now means that $\omega$ (defined in the next paragraph) is a partition taken from a space of partitions, and you're looking for subsets, which they call $Y_k$, each believed to belong to the same person. Now the translation: "We are looking for observations that we believe belong to the same person, and we are going to separate the observations that belong to each person into different groups (sets)".
A valid partition expresses the set of all observations as an
exhaustive union of trajectories: Y = Y1 ∪ . . . ∪ YKω , where Kω is
the number of objects proposed by a partition ω. The trajectories must
be mutually exclusive: Yj ∩ Yk = ∅, when j ≠ k.
And last, they say that a valid $\omega$ must have trajectories that (all together) contain the initial set $Y$, and that any two distinct trajectories must have an empty intersection. The translation: "The partition must cover all observations, meaning we must have formed groups that contain every observation, and two different subsets of observations cannot share a member. That would obviously be absurd: if we had two observations belonging to person A and another two belonging to person B, and one of those was repeated, then A and B would be the same person."
That goes on for the whole article. The final paragraph means:
We have a bunch of observations. We are looking for observations that
we believe belong to the same person, and we are going to separate the
observations that belong to each person into different groups (sets);
that separation is called a "partition". The partition must cover all
observations, meaning we must have formed groups that contain every
observation, and two different subsets of observations cannot share a
member. That would obviously be absurd: if we had two observations
belonging to person A and another two belonging to person B, and one
of them was repeated, then A and B would be the same person.
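The two conditions on a valid partition (exhaustive union, mutually exclusive trajectories) can be checked mechanically. Here is a minimal sketch in Python; the function name and the observation labels are mine, not from the paper:

```python
def is_valid_partition(Y, trajectories):
    """Return True iff the trajectories are an exhaustive union of Y
    and mutually exclusive (no observation appears in two of them)."""
    union = set().union(*trajectories) if trajectories else set()
    # Exhaustive: Y = Y_1 ∪ ... ∪ Y_K
    covers_all = union == set(Y)
    # Mutually exclusive: total size equals size of the union,
    # so no observation is counted twice
    disjoint = sum(len(t) for t in trajectories) == len(union)
    return covers_all and disjoint

Y = {"y1", "y2", "y3", "y4"}
print(is_valid_partition(Y, [{"y1", "y2"}, {"y3", "y4"}]))        # True
print(is_valid_partition(Y, [{"y1", "y2"}, {"y2", "y3", "y4"}]))  # False: y2 is shared
```

The second call fails exactly for the "absurd" reason above: observation `y2` would belong to two different people at once.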
This notation is used because it leaves no room for ambiguity and is more rigorous. So besides studying set theory, which is basic, you should train your interpretation of these articles (just read more and try to do what I've done here).
BTW, I don't think there's any book specifically about mathematical notation for computer science (it would really surprise me if there were), because no such separate thing exists; that's just mathematical/formal notation.
We usually represent numbers as finite sequences of digits. In base $2$ each digit is called a bit and has value $0$ or $1$. So we write each number $a$ as $\sum_{i=0}^{n-1} a_i 2^i$, where the $a_i$ are digits; assuming $a_{n-1}\neq0$, the number $a$ is called an $n$-bit number. Notice that if $a$ is an $n$-bit number then $2^{n-1}\leq a<2^n$. So the number of bits of $a$ is $O(\log a)$.
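A quick way to sanity-check the bound $2^{n-1}\leq a<2^n$ is to count bits directly. A small sketch in Python (the function name is mine; Python's built-in `int.bit_length()` computes the same thing):

```python
def bit_count(a):
    """Number of binary digits of a positive integer a."""
    n = 0
    while a > 0:
        a //= 2   # drop the lowest bit
        n += 1
    return n

for a in [1, 5, 8, 255, 256]:
    n = bit_count(a)
    assert 2 ** (n - 1) <= a < 2 ** n   # the bound from the text
    print(a, "has", n, "bits")
```

The loop runs once per bit, which is itself an illustration of the length being $O(\log a)$.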
If an algorithm works on individual digits, then its complexity depends on the length of the given number (by length I mean the number of bits). For instance, how does the addition algorithm work? You want to add $a$ and $b$: first you add $a_0$ and $b_0$, write down the result, carry $1$ if necessary, continue with the next bit, and so on. As a result you do $n$ bit additions, so adding two $n$-bit numbers has complexity $O(n)$.
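The schoolbook addition just described can be sketched directly on lists of bits (least-significant bit first); the representation and names here are illustrative choices, not a standard API:

```python
def add_bits(a_bits, b_bits):
    """Add two numbers given as lists of bits, least-significant bit first.
    Does one single-bit addition per position: O(n) for n-bit inputs."""
    result, carry = [], 0
    for i in range(max(len(a_bits), len(b_bits))):
        ai = a_bits[i] if i < len(a_bits) else 0
        bi = b_bits[i] if i < len(b_bits) else 0
        s = ai + bi + carry      # one bit addition, possibly with a carry
        result.append(s % 2)     # write down the result bit
        carry = s // 2           # carry 1 if necessary
    if carry:
        result.append(carry)
    return result

# 6 is [0,1,1] (110 in binary), 3 is [1,1]; their sum 9 is [1,0,0,1]
print(add_bits([0, 1, 1], [1, 1]))  # [1, 0, 0, 1]
```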
So basically, saying that an algorithm takes time $O(n)$ for an $n$-digit number means that it is linear with respect to the length of the given number, and also that it takes time $O(\log n)$ with respect to the number itself.
Given a positive integer $n\in\textbf Z^+$, consider the list $$1,2,3,...,n$$ consisting of the first $n$ positive integers. You have to guess some positive integer $1\leq x\leq n$ contained somewhere in this list, and whenever you make a guess, you are told one of three things: $$1.\text{Your guess is too high :(}$$ $$2.\text{Your guess is too low :(}$$ $$3.\text{Your guess is correct!}$$ How can you guess what $x$ is? Let me go over two possible strategies.

The first one is called linear search. Basically, you guess that $x=1$, and if you get it correct then you celebrate and go drink a lot of beer, while if you are told that $1$ is too low, then you guess the next number in the list, i.e. you guess $x=2$. If this is correct then you celebrate and go drink a lot of beer, while if you are told that $2$ is too low, then you guess the next number, $3$, and this process keeps repeating until you guess the correct number (and go celebrate and drink a lot of beer).

For each positive integer $n\in\textbf Z^+$, I'm going to let $T_{\text{linear}}(n)$ denote the maximum number of guesses that you need to correctly guess what $x$ is in a list of length $n$ using the linear search algorithm described above. Convince yourself that the following table of values is indeed accurate: \begin{array}{r|c} n & T_{\text{linear}}(n) \\ \hline 1 & 1 \\ 2 & 2 \\ 3 & 3 \\ 4 & 4 \\ 5 & 5 \\ 6 & 6 \\ 7 & 7 \\ 8 & 8 \\ 9 & 9 \\ \vdots & \vdots \end{array} In short, for all positive integers $n\in\textbf Z^+$, $T_{\text{linear}}(n)=n$.
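The linear-search game can be sketched in a few lines of Python; `secret` plays the role of $x$, and the returned count is the number of guesses (names are mine, for illustration):

```python
def linear_guess(n, secret):
    """Guess 1, 2, 3, ... until we hit secret; return the guess count."""
    guesses = 0
    for guess in range(1, n + 1):
        guesses += 1
        if guess == secret:
            return guesses
    raise ValueError("secret not in 1..n")

# Worst case is secret == n, giving T_linear(n) = n guesses
print(linear_guess(9, 9))  # 9
print(linear_guess(9, 1))  # 1
```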
Now let's consider another strategy called binary search. Here, you start by guessing that $x$ is the "middle term" in the list, say $x=n/2$ if $n$ is even and $x=\frac{n+1}{2}$ if $n$ is odd. If the guess is correct, then you're done, while if the guess is too low then you have basically eliminated $\approx 1/2$ of the possibilities for what $x$ could be, and then you just apply the whole procedure again to the remaining half of the list, and you keep doing this until you guess correctly. For each possible list length $n\in\textbf Z^+$, let $T_{\text{binary}}(n)$ denote the maximum number of guesses that you need to correctly guess what $x$ is in a list of length $n$ using the binary search algorithm described above. In this case, convince yourself that the following table of values is correct:
\begin{array}{r|c} n & T_{\text{binary}}(n) \\ \hline 1 & 1 \\ 2 & 2 \\ 3 & 2 \\ 4 & 3 \\ 5 & 3 \\ 6 & 3 \\ 7 & 3 \\ 8 & 4 \\ 9 & 4 \\ \vdots & \vdots \end{array}
and that in general, for all positive integers $n\in\textbf Z^+$, we have that $T_{\text{binary}}(n)=\lfloor\log_2(n)\rfloor+1$. Below in red is the graph of the $T_{\text{linear}}$ function while in green is the graph of the $T_{\text{binary}}$ function:
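The binary-search game can be sketched the same way; this version counts guesses, and checking the worst case over all secrets reproduces $T_{\text{binary}}(n)=\lfloor\log_2(n)\rfloor+1$ (again, the names are illustrative):

```python
import math

def binary_guess(n, secret):
    """Repeatedly guess the middle of the remaining range 1..n;
    return the number of guesses needed to hit secret."""
    lo, hi, guesses = 1, n, 0
    while lo <= hi:
        guess = (lo + hi) // 2   # the "middle term" of the remaining list
        guesses += 1
        if guess == secret:
            return guesses
        elif guess < secret:     # too low: discard the lower half
            lo = guess + 1
        else:                    # too high: discard the upper half
            hi = guess - 1
    raise ValueError("secret not in 1..n")

for n in [1, 2, 7, 8, 9]:
    worst = max(binary_guess(n, s) for s in range(1, n + 1))
    assert worst == math.floor(math.log2(n)) + 1
    print(n, worst)
```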
Overall, it is clear that binary search is a much more efficient algorithm than linear search when it comes to this particular task of guessing a number in a list, especially as the list gets larger and larger (i.e. as $n\to\infty$). Computer scientists will often say that linear search runs in at worst $O(n)$ time (or linear time), while binary search runs in at worst $O(\log(n))$ time (or logarithmic time). This is because, using the notation from the original post, we have $X=\textbf Z^+$, $f(n)=T_{\text{linear}}(n)=n$ or $f(n)=T_{\text{binary}}(n)=\lfloor\log_2(n)\rfloor+1$, and $g(n)=n$ or $g(n)=\log(n)$ respectively, since in the first case we can choose $c:=1$ and $M:=1$ for instance, while in the latter case we can choose $c:=5$ and $M:=4$.