If you look at the optimization problem that SVM solves:
$\min_{\mathbf{w},\boldsymbol{\xi}, b } \left\{\frac{1}{2} \|\mathbf{w}\|^2 + C \sum_{i=1}^n \xi_i \right\}$
s.t.
$y_i(\mathbf{w}\cdot\mathbf{x_i} - b) \ge 1 - \xi_i, ~~~~\xi_i \ge 0,$
for all $i = 1, \dots, n$,
the support vectors are the $\mathbf{x}_i$ whose margin constraint is active at the optimum, i.e. those with $y_i(\mathbf{w}\cdot\mathbf{x}_i - b) \le 1$ (equivalently, those with a non-zero dual coefficient $\alpha_i$). Note that this includes points sitting exactly on the margin with $\xi_i = 0$, as well as every point with $\xi_i > 0$. In other words, they are the data points that are either misclassified, inside the margin, or exactly on it.
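This characterisation is easy to check empirically. A minimal sketch using scikit-learn's `SVC` (a linear soft-margin SVM on synthetic Gaussian blobs; the data and all parameter choices here are illustrative assumptions, not from the question):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
# Two overlapping Gaussian clusters, labels -1 and +1.
X = np.vstack([rng.normal(-1, 0.6, size=(40, 2)),
               rng.normal(1, 0.6, size=(40, 2))])
y = np.array([-1] * 40 + [1] * 40)

model = SVC(kernel="linear", C=1.0).fit(X, y)

# Functional margin y_i * f(x_i) for every training point.
margin = y * model.decision_function(X)

print("number of support vectors:", len(model.support_))
# Every support vector has functional margin <= 1 (up to solver tolerance);
# non-support vectors all lie strictly outside the margin.
print(np.all(margin[model.support_] <= 1 + 1e-3))
```

The second print should come out `True`: the points the solver retains as support vectors are exactly those on or inside the margin.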
Now let's compare the solution to this problem when you have the full set of features, to the case where you throw some features away. Throwing a feature away is functionally equivalent to keeping the feature but adding the constraint $w_j = 0$ for each feature $j$ that we want to discard.
When you compare these two optimization problems, and work through the math, it turns out there is no hard relationship between the number of features and the number of support vectors. It could go either way.
It's useful to think about a simple case. Imagine a two-dimensional problem where your negative and positive examples are clustered around (-1,-1) and (1,1), respectively, and are separable by a diagonal hyperplane with 3 support vectors. Now imagine dropping the y-axis feature, so your data is now projected onto the x-axis. If the data are still separable, say at x=0, you'd probably be left with only 2 support vectors, one on each side, so adding the y-feature back would increase the number of support vectors. However, if the data are no longer separable, you'd get at least one support vector for each point that's on the wrong side of x=0, in which case adding the y-feature would reduce the number of support vectors.
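The toy example above can be sketched directly with scikit-learn's `SVC`; the cluster spread, sample size, and `C` value are assumptions chosen so that both the 2-D data and its 1-D projection stay separable:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 50
# Negative class clustered around (-1, -1), positive class around (1, 1).
X = np.vstack([rng.normal(-1, 0.3, size=(n, 2)),
               rng.normal(1, 0.3, size=(n, 2))])
y = np.array([-1] * n + [1] * n)

# Fit once with both features, once with the y-axis feature dropped.
svm_2d = SVC(kernel="linear", C=1.0).fit(X, y)
svm_1d = SVC(kernel="linear", C=1.0).fit(X[:, [0]], y)

print("support vectors with both features:", len(svm_2d.support_))
print("support vectors with x-axis only:  ", len(svm_1d.support_))
```

Shrinking the cluster spread (or growing it until the 1-D projection overlaps) lets you reproduce both directions of the argument: the count can go up or down when a feature is removed.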
So, if this intuition is correct: when you're working in a very high-dimensional feature space, or using a kernel that maps into one, your data is more likely to be separable, so adding a feature will tend to just add another support vector. Whereas if your data is not currently separable, and you add a feature that significantly improves separability, then you're more likely to see a decrease in the number of support vectors.
I am not an R user, but I suspect it is because you are using the soft-margin support vector machine (which is what I presume "C-svc" means). The support vectors will only lie exactly on the margins for the hard margin SVM (where C is infinite). Essentially the C parameter penalises the degree to which the support vectors are allowed to violate the margin constraint, so if C is less than infinity, the support vectors are allowed to drift away from the margins in the interests of making the margin broader, which often leads to better generalisation.
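The effect of C is also easy to see numerically. A small sketch (using scikit-learn's `SVC` rather than R's kernlab, and assumed synthetic data): as C shrinks, the margin broadens and more points end up on or inside it, so the support-vector count grows.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Two mildly overlapping Gaussian clusters.
X = np.vstack([rng.normal(-1, 0.5, size=(60, 2)),
               rng.normal(1, 0.5, size=(60, 2))])
y = np.array([-1] * 60 + [1] * 60)

counts = []
for C in (0.01, 1.0, 100.0):
    model = SVC(kernel="linear", C=C).fit(X, y)
    counts.append(len(model.support_))
    print(f"C={C:>6}: {len(model.support_)} support vectors")
```

With a very large C the fit approaches the hard-margin solution, where the surviving support vectors sit exactly on the margin.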
Yes. The minimum number of support vectors is two for your scenario, one per class; you don't need more than two here.
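The two-support-vector minimum is easy to demonstrate on the smallest possible dataset (a sketch with scikit-learn's `SVC`; the two points and the large C are illustrative assumptions approximating a hard margin):

```python
import numpy as np
from sklearn.svm import SVC

# One point per class; both must define the margin.
X = np.array([[-1.0, 0.0], [1.0, 0.0]])
y = np.array([-1, 1])

model = SVC(kernel="linear", C=1e6).fit(X, y)
print(len(model.support_))  # both points are support vectors
```

With only one example on each side, the maximum-margin hyperplane is pinned by both, so the count is exactly two.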