Let $\mathcal{X}$ denote your input space, i.e. the space where your data points reside. Consider a function $\Phi:\mathcal{X} \rightarrow \mathcal{F}$ that takes a point from your input space $\mathcal{X}$ and maps it to a point in $\mathcal{F}$. Now suppose we have mapped all the data points from $\mathcal{X}$ to this new space $\mathcal{F}$. If you try to solve the ordinary linear SVM in this new space $\mathcal{F}$ instead of $\mathcal{X}$, you will notice that all the earlier workings look exactly the same, except that every point $x_i$ is represented as $\Phi(x_i)$, and instead of using $x^Ty$ (the dot product, which is the natural inner product for Euclidean space), we use $\langle \Phi(x), \Phi(y) \rangle$, the natural inner product of the new space $\mathcal{F}$. So, at the end, your $w^*$ would look like,
$$
w^*=\sum_{i \in SV} h_i y_i \Phi(x_i)
$$
and hence,
$$
\langle w^*, \Phi(x) \rangle = \sum_{i \in SV} h_i y_i \langle \Phi(x_i), \Phi(x) \rangle
$$
Similarly,
$$
b^*=\frac{1}{|SV|}\sum_{i \in SV}\left(y_i - \sum_{j=1}^N\left(h_j y_j \langle \Phi(x_j), \Phi(x_i)\rangle\right)\right)
$$
and your classification rule becomes $c_x=\operatorname{sign}(\langle w^*, \Phi(x) \rangle+b^*)$.
So far, so good: there is nothing new here, since we have simply applied the ordinary linear SVM in a different space. The magic part is this:
Let us say that there exists a function $k:\mathcal{X}\times\mathcal{X}\rightarrow \mathbb{R}$ such that $k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$. Then we can replace all the inner products above with $k(x_i, x_j)$. Such a $k$ is called a kernel function.
Therefore, your $w^*$ and $b^*$ look like,
$$
\langle w^*, \Phi(x) \rangle = \sum_{i \in SV} h_i y_i k(x_i, x)
$$
$$
b^*=\frac{1}{|SV|}\sum_{i \in SV}\left(y_i - \sum_{j=1}^N\left(h_j y_j k(x_j, x_i)\right)\right)
$$
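As a concrete illustration (my own example, not part of the original answer): the degree-2 polynomial map on $\mathbb{R}^2$, $\Phi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$, satisfies $\langle \Phi(x), \Phi(z) \rangle = (x^T z)^2$, so the kernel can be evaluated without ever forming $\Phi$. A quick numerical check:

```python
import numpy as np

def phi(x):
    # Explicit degree-2 polynomial feature map on R^2:
    # Phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

def k(x, z):
    # The matching kernel: k(x, z) = (x^T z)^2 = <Phi(x), Phi(z)>
    return (x @ z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

print(phi(x) @ phi(z))  # inner product taken in the feature space F (~1.0)
print(k(x, z))          # same value, computed directly in the input space
```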
For which kernel functions is the above substitution valid? That is a slightly involved question, and you may want to consult proper reading material to understand the implications. I will just add that the above holds true for the RBF kernel.
To answer your question, "Is the situation so that all the support vectors are needed for the classification?"
Yes. As you may notice above, we compute the inner product of $w^*$ with $\Phi(x)$ instead of computing $w^*$ explicitly, which requires us to retain all the support vectors at classification time.
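To see this in practice, here is a minimal sketch using scikit-learn's `SVC` (the library choice is mine, not the answer's): the decision function is rebuilt from nothing but the support vectors, their coefficients $h_i y_i$, and $b^*$.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.RandomState(0)
X = rng.randn(40, 2)
y = np.where(X[:, 0] * X[:, 1] > 0, 1.0, -1.0)   # a non-linearly-separable toy problem

clf = SVC(kernel="rbf", gamma=1.0).fit(X, y)

# dual_coef_ stores h_i * y_i, and only for the support vectors
hy = clf.dual_coef_.ravel()
K = rbf_kernel(X, clf.support_vectors_, gamma=1.0)   # k(x, x_i) for i in SV

manual = K @ hy + clf.intercept_        # sum_{i in SV} h_i y_i k(x_i, x) + b
print(np.allclose(manual, clf.decision_function(X)))  # True
```

Note that `dual_coef_` holds exactly the $h_i y_i$ terms for the support vectors, which is why those points must be kept around to classify new data.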
Note: The $h_i$'s in the final section here are the solution to the dual of the SVM in the space $\mathcal{F}$, not $\mathcal{X}$. Does that mean we need to know the function $\Phi$ explicitly? Luckily, no. If you look at the dual objective, it consists only of inner products, and since we have $k$, which lets us compute those inner products directly, we do not need to know $\Phi$ explicitly. The dual objective simply looks like,
$$
\max_h \; \sum_i h_i - \frac{1}{2}\sum_{i,j} y_i y_j h_i h_j k(x_i, x_j) \\
\text{subject to: } \sum_i y_i h_i = 0, \quad h_i \geq 0
$$
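If you want to see the whole pipeline end to end, here is a rough sketch (my own toy example and solver choice, not part of the answer) that maximizes this dual with an off-the-shelf optimizer, recovers $b^*$ from the averaging formula above, and classifies with the kernelized rule:

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data in R^2, labels in {-1, +1}
X = np.array([[2.0, 2.0], [2.5, 1.5], [-2.0, -1.0], [-1.5, -2.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])

def rbf(A, B, gamma=0.5):
    # k(a, b) = exp(-gamma * ||a - b||^2)
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

K = rbf(X, X)

def neg_dual(h):
    # negative dual objective: -(sum_i h_i - 1/2 sum_ij y_i y_j h_i h_j K_ij)
    hy = h * y
    return -(h.sum() - 0.5 * hy @ K @ hy)

res = minimize(neg_dual, np.zeros(len(y)),
               bounds=[(0, None)] * len(y),
               constraints=[{"type": "eq", "fun": lambda h: h @ y}])
h = res.x
sv = h > 1e-6                       # support vectors are the points with h_i > 0

# b* from the support-vector averaging formula above
b = np.mean(y[sv] - K[sv] @ (h * y))

# classification rule: sign(sum_i h_i y_i k(x_i, x) + b)
pred = np.sign(rbf(X, X) @ (h * y) + b)
print(pred)   # should reproduce y on this toy set
```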
Step 1. For an arbitrary set of strings $\{x_i\}$, first sort them by their length. Then the kernel matrix is block-diagonal, since the kernel value between any two strings of different lengths is zero.
This lets us see that we only need to consider strings of the same length; how?
Step 2. So, we only need to consider a set of strings $\{ x_i \}$ all of length $m$. Then we can see that $$K(x, z) = \frac1m \sum_{i=1}^m \mathbb{1}(x_i = z_i), \tag{*}$$
where $\mathbb{1}(x_i = z_i)$ is 1 if $x_i = z_i$, 0 otherwise.
Looking at it this way lets us find an explicit feature map from the space of strings of length $m$ to vectors of a certain length: that is, $K(x, z) = \varphi(x)^T \varphi(z)$, for some function $\varphi$. Do you see what that function is?
Hint: if $a = (0, 1, 0, 1)$ and $b = (1, 1, 0, 0)$, then $a^T b = 0 + 1 + 0 + 0$; in general, for length-$k$ vectors whose entries are all either $0$ or $1$, $a^T b = \sum_{i=1}^k a_i b_i = \sum_{i=1}^k \mathbb 1(a_i = 1, b_i = 1)$. So you might want to come up with some function $\varphi$ that turns this into something that looks like (*).
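If you would rather check the hint numerically than work it out on paper, one such $\varphi$ (spelled out here, so stop reading if you want to find it yourself) one-hot encodes each character over the alphabet, scales by $1/\sqrt{m}$, and concatenates. A small sketch, assuming a lowercase alphabet:

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz"   # assumed character set

def phi(x):
    # One-hot encode each character, scale by 1/sqrt(m), and concatenate,
    # so that phi(x)^T phi(z) = (1/m) * sum_i 1(x_i == z_i) when |x| = |z| = m.
    m = len(x)
    out = np.zeros(m * len(ALPHABET))
    for i, ch in enumerate(x):
        out[i * len(ALPHABET) + ALPHABET.index(ch)] = 1.0 / np.sqrt(m)
    return out

def K(x, z):
    # The kernel (*): fraction of positions where the characters agree
    if len(x) != len(z):
        return 0.0
    return np.mean([a == b for a, b in zip(x, z)])

x, z = "kernel", "kennel"
print(phi(x) @ phi(z), K(x, z))   # both 5/6
```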
Step 3. From there, positive semidefiniteness follows quickly by a standard argument. Supposing that $\varphi$ maps into $\mathbb R^k$, let $\Phi$ be the $n \times k$ matrix whose $i$th row is $\varphi(x_i)$, so that $(\Phi \Phi^T)_{ij} = \varphi(x_i)^T \varphi(x_j) = K(x_i, x_j)$. Then the PSD condition is that for all $a \in \mathbb R^n$, we have that
$$
\sum_{i=1}^n \sum_{j=1}^n a_i K(x_i, x_j) a_j = a^T K a = a^T \Phi \Phi^T a = (\Phi^T a)^T (\Phi^T a) = \lVert \Phi^T a \rVert^2 \ge 0
.$$
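As a quick sanity check of Steps 1–3 (my own addition), you can build the Gram matrix for strings of mixed lengths and confirm both the block-diagonal structure from Step 1 and that the smallest eigenvalue is nonnegative:

```python
import numpy as np

def K(x, z):
    # Zero across different lengths; otherwise the fraction of matching positions
    if len(x) != len(z):
        return 0.0
    return sum(a == b for a, b in zip(x, z)) / len(x)

strings = sorted(["cat", "car", "dog", "kernel", "kennel", "colonel"], key=len)
G = np.array([[K(x, z) for z in strings] for x in strings])

print(np.round(G, 2))                         # block-diagonal after sorting by length
print(np.linalg.eigvalsh(G).min() >= -1e-10)  # True: the Gram matrix is PSD
```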
First, your definition should be corrected as $$k(x, x') = \langle x, x\color{red}{'}\rangle = \sum_{a = 1}^N x_a x_a'. $$ The problem with your derivation is that you didn't distinguish $x_i = (x_{i,1}, \ldots, x_{i,N})^T$ and $x_j = (x_{j, 1}, \ldots, x_{j, N})^T$ clearly enough. Say you have $p$ vectors $\{x_1, \ldots, x_p\}$ under consideration. It then follows that (what you provided was actually incorrect):
\begin{align}
& \sum_{i, j} c_i c_j k(x_i, x_j) \\
= & \sum_{i = 1}^p \sum_{j = 1}^p c_i c_j \sum_{a = 1}^N x_{i,a} x_{j, a} \\
= & \sum_{i = 1}^p \sum_{j = 1}^p \sum_{a = 1}^N c_i x_{i,a} c_j x_{j, a} \\
= & \sum_{a = 1}^N \left(\sum_{i = 1}^p c_i x_{i, a}\right) \left(\sum_{j = 1}^p c_j x_{j, a}\right) \qquad \text{change the order of summation} \\
= & \sum_{a = 1}^N \left(\sum_{i = 1}^p c_i x_{i, a}\right)^2 \geq 0. \qquad i, j \text{ are just dummy indices}
\end{align}
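A quick numerical spot-check of this chain of equalities (my own, using random data): the double sum collapses to $\lVert \sum_i c_i x_i \rVert^2$, which is automatically nonnegative.

```python
import numpy as np

rng = np.random.RandomState(0)
p, N = 5, 3                      # p vectors in R^N
X = rng.randn(p, N)              # rows are x_1, ..., x_p
c = rng.randn(p)

# sum_{i,j} c_i c_j <x_i, x_j>, computed directly from the Gram matrix
quad = c @ (X @ X.T) @ c

# ||sum_i c_i x_i||^2, the last line of the derivation
norm_sq = np.sum((c @ X) ** 2)

print(np.isclose(quad, norm_sq), quad >= 0)   # True True
```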