1.) What are poles and zeros of linear system {A,B,C,D} exactly? What does it mean for a system to have a pole at a certain value, or a zero at certain value?
Intuitively, I do not know exactly what poles or zeros are. All I know is that the poles are the roots of the denominator of the transfer function, and that they are eigenvalues of the $A$ matrix, like the one in your question (every pole is an eigenvalue of $A$, and the two sets coincide when the realization is minimal). Poles show up explicitly in the solutions of ordinary differential equations, and an example of this can be seen here:
http://www.math.oregonstate.edu/home/programs/undergrad/CalculusQuestStudyGuides/ode/laplace/solve/solve.html
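To make the "roots of the denominator" versus "eigenvalues of $A$" statement concrete, here is a minimal sketch in Python with NumPy. The second-order system and its companion-form $A$ matrix are a made-up example of mine, not something from your question.

```python
import numpy as np

# Hypothetical example system: x'' + 3 x' + 2 x = u, with output y = x.
# Its transfer function is H(s) = 1 / (s^2 + 3 s + 2).
denominator = [1.0, 3.0, 2.0]

# One state-space form of the same system (companion form, assumed here).
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])

# Poles computed in two ways: polynomial roots and eigenvalues.
poles_from_tf = np.sort(np.roots(denominator))
poles_from_A = np.sort(np.linalg.eigvals(A))

print("roots of denominator:", poles_from_tf)
print("eigenvalues of A:    ", poles_from_A)

# A continuous-time system is stable when every pole has negative real part.
is_stable = bool(np.all(poles_from_A.real < 0))
print("stable:", is_stable)
```

Both computations give the same two poles, and both lie in the left half plane, which is what answers the stability question below.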
So what kind of question can we answer using information about poles?
i) Is the system stable?
ii) If it is stable, is the response of the system oscillatory, is it like a rigid body?
iii) If it is unstable, is it possible to stabilize this system using output feedback? (you need information about zeros here)
Now, let's talk about zeros. Zeros show up in the literature because they affect the behavior of control systems.
i) They impose fundamental limitations on the performance of control systems.
ii) In adaptive control systems, zeros can cause your adaptive controller to go unstable.
iii) They tell you about the "internal stability" of a control system.
As far as I can tell, zeros are more subtle than poles. I cannot say I fully understand them.
2.) The author writes about 'poles of a transfer function matrix H(s)'. What is a transfer function matrix? The only thing I know is how to compute it and that it describes some relation between input/output of the system. But why do we need transfer function matrices?
Taking the Laplace transform of a differential equation that has a single input and a single output yields a transfer function. An example of this is in the link above. A transfer function describes the relationship between a single output and a single input. So if you have a system of differential equations that has, say, 2 inputs and 3 outputs, then the transfer matrix is a $3 \times 2$ matrix of transfer functions, so it contains 6 elements. Each element describes the relationship between one of the inputs and one of the outputs. (The superposition principle plays a big role here.)
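As a small sketch of the 2-input, 3-output case, the transfer matrix of a state-space system can be evaluated at a given $s$ with the standard formula $H(s) = C(sI - A)^{-1}B + D$. The matrices below are made up for illustration, and `transfer_matrix` is just a helper name I chose.

```python
import numpy as np

def transfer_matrix(A, B, C, D, s):
    """Evaluate H(s) = C (sI - A)^{-1} B + D at one complex frequency s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

# Hypothetical system with 2 states, 2 inputs and 3 outputs.
A = np.array([[-1.0, 0.0],
              [0.0, -2.0]])
B = np.eye(2)
C = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
D = np.zeros((3, 2))

H = transfer_matrix(A, B, C, D, s=1.0)
print(H.shape)  # (3, 2): one entry per (output, input) pair
print(H)
```

Entry `H[i, j]` is the scalar transfer function from input `j` to output `i`, evaluated at the chosen `s`. For this example `H[2, 0]` is $1/(s+1)$, which is $0.5$ at $s = 1$.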
But why would one want a transfer matrix? I believe it is because calculating zeros for a multi-input multi-output system is not easy. Here is an article that talks about all the different kinds of zeros and why they are important:
http://www.smp.uq.edu.au/people/YoniNazarathy/Control4406/resources/HoaggBernsteinNonMinimumPhaseZero.pdf
3.) To calculate the poles and zeros, the author says that we need the Smith and Smith-McMillan Forms. These are matrices that have only diagonal entries. What is exactly the algorithm to calculate the Smith-(McMillan)-form of a transfer matrix?
Sorry. I don't have much on this one.
4.) What is the relation between the poles of a system and the controllability, observability, stability and stabilizability ? The same for a zero ?
For me, poles and zeros are important to transfer functions, which describe the relationship between inputs and outputs, and they can tell you about stabilizability and stability. However, concepts like controllability and observability are state space concepts (At least for me). If you write a transfer function in state space form, as you have written in your question, then there is a very simple test for controllability and observability. You can find more about this in almost any course, for example in Stephen Boyd's introductory control course at Stanford.edu.
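The "very simple test" I have in mind is the Kalman rank test, sketched below with NumPy. The matrices are a made-up example and the function names are mine. The second output matrix `C_bad` is chosen so that the transfer function becomes $(s+1)/((s+1)(s+2))$: the zero cancels a pole, and the rank test reports that the system is no longer observable.

```python
import numpy as np

def controllability_matrix(A, B):
    """Stack [B, AB, ..., A^(n-1) B] side by side."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    rank = np.linalg.matrix_rank(controllability_matrix(A, B))
    return bool(rank == A.shape[0])

def is_observable(A, C):
    # By duality, (A, C) is observable iff (A^T, C^T) is controllable.
    return is_controllable(A.T, C.T)

# Hypothetical example in companion form, poles at -1 and -2.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])      # transfer function 1 / ((s+1)(s+2))
C_bad = np.array([[1.0, 1.0]])  # transfer function (s+1) / ((s+1)(s+2))

print("controllable:", is_controllable(A, B))
print("observable with C:", is_observable(A, C))
print("observable with C_bad:", is_observable(A, C_bad))
```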
5.) What is an invariant zero polynomial of the system {A,B,C,D} ?
A SISO system just has one kind of zero. A MIMO system has many kinds of zeros, one of which is an invariant zero. The roots of the invariant zero polynomial give you the invariant zeros. It makes me kind of sad that I do not know very much about zeros of MIMO systems.
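One standard characterization, which I am adding here as an assumption rather than something from the text you are reading, is that $z$ is an invariant zero when the Rosenbrock system matrix $P(s) = \begin{bmatrix} sI - A & -B \\ C & D \end{bmatrix}$ loses rank at $s = z$. A minimal NumPy sketch on a made-up SISO example with transfer function $(s+3)/(s^2+3s+2)$:

```python
import numpy as np

def system_matrix(A, B, C, D, s):
    """Rosenbrock system matrix P(s) = [[sI - A, -B], [C, D]]."""
    n = A.shape[0]
    return np.block([[s * np.eye(n) - A, -B],
                     [C, D]])

# Hypothetical example: H(s) = (s + 3) / (s^2 + 3 s + 2), so the zero is at -3.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[3.0, 1.0]])
D = np.zeros((1, 1))

# Rank at a generic point versus rank at the candidate zero.
rank_generic = np.linalg.matrix_rank(system_matrix(A, B, C, D, 0.0))
rank_at_zero = np.linalg.matrix_rank(system_matrix(A, B, C, D, -3.0))

print("rank of P(0): ", rank_generic)
print("rank of P(-3):", rank_at_zero)
```

The rank drops from 3 to 2 at $s = -3$. The same helper accepts MIMO matrices, but this sketch only checks a candidate value and does not compute the invariant zero polynomial itself.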
6.) What is 'a realization of a system'?
Let's say you start off with a differential equation. Then you take its Laplace transform, and obtain a transfer function. Then, for this transfer function, there are an infinite number of state space representations. That is, there are an infinite number of matrices $A, B, C, D$ that yield the same input-output relationship as the original transfer function. These representations are called realizations. We can go from one realization to another using "Similarity Transformations".
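Here is a small numerical sketch of that last point. The matrices and the transformation $T$ are made up, and I am using the usual convention $A_2 = TAT^{-1}$, $B_2 = TB$, $C_2 = CT^{-1}$, $D_2 = D$ for a similarity transformation.

```python
import numpy as np

def transfer_function(A, B, C, D, s):
    """Evaluate C (sI - A)^{-1} B + D at one complex frequency s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

# Hypothetical realization of H(s) = 1 / (s^2 + 3 s + 2).
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
D = np.zeros((1, 1))

# Any invertible T gives another realization of the same transfer function.
T = np.array([[1.0, 2.0],
              [0.0, 1.0]])
T_inv = np.linalg.inv(T)
A2, B2, C2, D2 = T @ A @ T_inv, T @ B, C @ T_inv, D

test_points = (0.5, 1.0 + 2.0j, 10.0)
same = all(
    np.allclose(transfer_function(A, B, C, D, s),
                transfer_function(A2, B2, C2, D2, s))
    for s in test_points
)
print("same input-output behavior at the test points:", same)
```

The state matrices look different, but the transfer function values agree at the test points, which is what it means for both to be realizations of the same system.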
7.) Where can I find more good information about this subject?
If you are a mathematician, then you should probably look for a more mathematical text on control systems. Most engineers use a classical control book (like the one by Ogata) in undergrad, which is mostly about transfer functions, zeros, poles, and various stability tests. Then, in grad school, engineers take a course called "Linear Systems Theory", where they learn about the state space theory of control systems. The book I used was by Chen, but I did not like it very much.
Best Answer
Although this is not a rigorous proof, it should at least demonstrate the limitation that RHP poles and zeros put on the bandwidth in combination with the peak of the sensitivity.
If you have a system with only one RHP pole or only one RHP zero then, although it is bad practice, you can always cancel the remaining poles, zeros and gain in the controller. In order to ensure that the controller has a proper transfer function one can always add a high-bandwidth low-pass filter of sufficiently high order. This low-pass filter shouldn't affect the closed loop much, since the closed-loop behavior is mainly determined by the frequency range where the magnitude of the open loop (system times controller) crosses the 0 dB line. Furthermore, by using time scaling the RHP pole or zero can always be normalized to $1$.
In the case of only one RHP pole the considered system can therefore, without loss of generality, be reduced to
$$ G(s) = \frac{1}{s - 1}. $$
By using a controller of the form
$$ C(s) = \frac{a\,s + b}{s} $$
then the sensitivity transfer function would look like
$$ S(s) = \frac{s(s - 1)}{s^2 + (a - 1)s + b}. $$
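For completeness, here are the intermediate steps, assuming the usual definition $S = 1/(1 + G\,C)$ of the sensitivity for a unity negative feedback loop (the definition is my assumption, since it is not written out above):

$$ S(s) = \frac{1}{1 + G(s)\,C(s)} = \frac{1}{1 + \frac{a\,s + b}{s(s - 1)}} = \frac{s(s - 1)}{s(s - 1) + a\,s + b} = \frac{s(s - 1)}{s^2 + (a - 1)s + b}. $$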
Using $b = \omega^2$ and $a=2\,\zeta\,\omega+1$ gives a more standard form
$$ S(s) = \frac{s(s - 1)}{s^2 + 2\,\zeta\,\omega\,s + \omega^2}, $$
where $\omega$ can be used as a measure of the bandwidth. As expected for a sensitivity transfer function, at really low frequencies the asymptote of $S(s)$ has a positive slope, namely $+1$, and at really high frequencies the asymptote of $S(s)$ has a slope of zero and a magnitude of 0 dB.
When $\omega < 1$ the slope of the asymptote of $S(s)$ decreases by two to $-1$ after a frequency of $\omega$, and eventually increases to the final asymptote with slope zero after a frequency of $1$. So, before the asymptote reaches the 0 dB line, the slope is negative, which means that the magnitude of $S(s)$ during that interval is above 0 dB. The further $\omega$ lies below one, the higher the magnitude of $S(s)$ will go above 0 dB.
When $\omega > 1$ the slope of the asymptote of $S(s)$ increases by one to $+2$ after a frequency of $1$, and eventually decreases to the final asymptote with slope zero after a frequency of $\omega$. So, before the asymptote reaches the 0 dB line, the slope is always positive, which means that the magnitude of $S(s)$ shouldn't go significantly above 0 dB.
The two cases above (and the case when $\omega=1$) are also illustrated in the figure below which uses $\zeta = \tfrac{1}{2}\sqrt{2}$:
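The same trend can also be checked numerically. The sketch below evaluates $|S(j w)|$ on a frequency grid for three values of $\omega$ with $\zeta = \tfrac{1}{2}\sqrt{2}$. The grid limits and the function name are arbitrary choices of mine.

```python
import numpy as np

def sensitivity_peak_db(omega, zeta=np.sqrt(2) / 2):
    """Peak of |S(jw)| in dB for S(s) = s(s-1) / (s^2 + 2 zeta omega s + omega^2)."""
    w = np.logspace(-4, 4, 20001)  # frequency grid in rad/s (assumed range)
    s = 1j * w
    S = s * (s - 1) / (s**2 + 2 * zeta * omega * s + omega**2)
    return 20 * np.log10(np.max(np.abs(S)))

# Bandwidth below, at, and above the RHP pole at 1.
peaks = {omega: sensitivity_peak_db(omega) for omega in (0.1, 1.0, 10.0)}
for omega, peak in peaks.items():
    print(f"omega = {omega:5.1f}: peak |S| = {peak:6.2f} dB")
```

For $\omega = 0.1$ the peak comes out at about 17 dB, for $\omega = 1$ it is under 1 dB, and for $\omega = 10$ it is practically 0 dB.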
In the case of only one RHP zero the considered system can, again without loss of generality, be reduced to
$$ G(s) = \frac{s - 1}{s + p}. $$
The pole at $-p$, with $p>0$, is just added to make the system proper. Now by using a controller of the form
$$ C(s) = \frac{a(s + p)}{s^2 + b\,s} $$
then the sensitivity transfer function would look like
$$ S(s) = \frac{s (s + b)}{s^2 + (a + b)s - a}. $$
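Written out, again assuming $S = 1/(1 + G\,C)$, the controller cancels the pole at $-p$ and the steps are

$$ G(s)\,C(s) = \frac{s - 1}{s + p} \cdot \frac{a(s + p)}{s(s + b)} = \frac{a(s - 1)}{s(s + b)}, \qquad S(s) = \frac{s(s + b)}{s(s + b) + a(s - 1)} = \frac{s (s + b)}{s^2 + (a + b)s - a}. $$

Since a second-order denominator is stable only when all its coefficients have the same sign, the closed loop requires $a < 0$ and $a + b > 0$.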
Using $a = -\omega^2$ and $b=\omega(2\,\zeta + \omega)$ again gives a more standard form
$$ S(s) = \frac{s (s + \omega(2\,\zeta + \omega))}{s^2 + 2\,\zeta\,\omega\,s + \omega^2}, $$
where $\omega$ can again be used as a measure of the bandwidth. As expected for a sensitivity transfer function, at really low frequencies the asymptote of $S(s)$ has a positive slope, namely $+1$, and at really high frequencies the asymptote of $S(s)$ has a slope of zero and a magnitude of 0 dB. The transition for the magnitude of the peak of the sensitivity now does not lie near $\omega=1$ but roughly at $\omega=2\,\zeta$.
When $\omega < 2\,\zeta$ the zero of $S(s)$ at $\omega(2\,\zeta + \omega)$ is of the same order of magnitude as the bandwidth (assuming a normal value for $\zeta$). This means that a little after a frequency of $\omega$ the slope of the asymptote of $S(s)$ will in total decrease by one to zero (decrease by two and increase by one). The damping coefficient can influence this a little, namely the zero might lie a little ahead of or behind $\omega$, but for realistic values of $\zeta$ the asymptote does not change much.
When $\omega > 2\,\zeta$ the zero of $S(s)$ scales with the square of the bandwidth. So between the frequencies $\omega$ and roughly $\omega^2$ the slope of the asymptote of $S(s)$ will be $-1$, and thus the magnitude of $S(s)$ will have a significant portion above 0 dB.
The two cases above (and the case when $\omega=1$) are also illustrated in the figure below which uses $\zeta = \tfrac{1}{2}\sqrt{2}$:
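A numerical check of the RHP zero case, in the same spirit: the sketch evaluates $|S(j w)|$ on a frequency grid with $\zeta = \tfrac{1}{2}\sqrt{2}$. The grid limits and the function name are again my own choices.

```python
import numpy as np

def sensitivity_peak_db(omega, zeta=np.sqrt(2) / 2):
    """Peak of |S(jw)| in dB for
    S(s) = s (s + omega (2 zeta + omega)) / (s^2 + 2 zeta omega s + omega^2)."""
    w = np.logspace(-4, 4, 20001)  # frequency grid in rad/s (assumed range)
    s = 1j * w
    zero = omega * (2 * zeta + omega)  # break frequency of the zero of S(s)
    S = s * (s + zero) / (s**2 + 2 * zeta * omega * s + omega**2)
    return 20 * np.log10(np.max(np.abs(S)))

# Bandwidth below, near, and above the RHP zero at 1.
peaks = {omega: sensitivity_peak_db(omega) for omega in (0.1, 1.0, 10.0)}
for omega, peak in peaks.items():
    print(f"omega = {omega:5.1f}: peak |S| = {peak:6.2f} dB")
```

Here the peak grows with the bandwidth: a few dB for $\omega = 0.1$, and roughly 18 dB for $\omega = 10$.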
For an RHP pole the peak magnitude of the sensitivity quickly goes up when the bandwidth is chosen below the break frequency of the pole. For an RHP zero the opposite is true: the peak goes up when the bandwidth is chosen above the break frequency of the zero. So it is possible to place the bandwidth anywhere you want if you have a single RHP pole or zero, but on the wrong side of that break frequency you will have poor performance, because a large peak magnitude of the sensitivity transfer function means large amplification of disturbances that act on the system. For that reason one would often want to keep the magnitude of the sensitivity below roughly 6 dB.