It sounds right to me. People sometimes also use the word "Perceptron" to refer to the training algorithm together with the classifier. For example, someone explained this to me in the answer to this question. Also, there is nothing to stop you from using a kernel with the perceptron, and the resulting kernel perceptron is often a better classifier. See here for some slides (pdf) on how to implement the kernel perceptron.
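In case the slides aren't handy, here is a minimal sketch of the kernel perceptron in its dual form. This is my own illustration, not code from the slides: the function names, the RBF kernel choice, and the assumption of labels in {-1, +1} are all mine.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    # RBF kernel: exp(-gamma * ||x - y||^2)
    return np.exp(-gamma * np.sum((x - y) ** 2))

def train_kernel_perceptron(X, y, kernel=rbf, epochs=10):
    """Dual-form perceptron: alpha[i] counts the mistakes made on example i.
    Assumes labels y are in {-1, +1}."""
    n = len(X)
    alpha = np.zeros(n)
    # Precompute the Gram matrix of pairwise kernel values.
    K = np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])
    for _ in range(epochs):
        for i in range(n):
            # Predict with the current dual coefficients.
            pred = np.sign(np.sum(alpha * y * K[:, i]))
            if pred != y[i]:
                alpha[i] += 1  # mistake-driven update
    return alpha

def predict(X_train, y_train, alpha, x, kernel=rbf):
    # The decision function is a kernel-weighted vote of the training examples.
    return np.sign(sum(a * yi * kernel(xi, x)
                       for a, yi, xi in zip(alpha, y_train, X_train)))
```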
The major practical difference between a (kernel) perceptron and an SVM is that perceptrons can be trained online (i.e. their weights can be updated as new examples arrive one at a time), whereas standard SVM training is a batch procedure. See this question for information on whether SVMs can be trained online. So, even though an SVM is usually a better classifier, perceptrons can still be useful because they are cheap and easy to re-train in a situation in which fresh training data is constantly arriving.
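To make the online-training point concrete, here is a sketch using scikit-learn's `Perceptron`, which exposes `partial_fit` for exactly this streaming scenario. The data stream here is simulated, and the toy labeling rule is my own choice for illustration:

```python
import numpy as np
from sklearn.linear_model import Perceptron

rng = np.random.default_rng(0)
clf = Perceptron()
classes = np.array([0, 1])  # all classes must be declared on the first partial_fit

# Simulate a stream: examples arrive one mini-batch at a time, and the
# model is updated incrementally rather than retrained from scratch.
for _ in range(100):
    X_batch = rng.normal(size=(5, 2))
    y_batch = (X_batch[:, 0] + X_batch[:, 1] > 0).astype(int)
    clf.partial_fit(X_batch, y_batch, classes=classes)

print(clf.predict([[1.0, 1.0], [-1.0, -1.0]]))  # expect [1, 0]
```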
RUser4512 gave the correct answer: the RBF kernel works well in practice and it is relatively easy to tune. It's the SVM equivalent of "no one's ever been fired for estimating an OLS regression": it's accepted as a reasonable default method. Clearly OLS isn't perfect in every scenario (or even many of them), but it's a well-studied and widely understood method. Likewise, the RBF kernel is well-studied and widely understood, and many SVM packages include it as the default.
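For instance, in scikit-learn the RBF kernel is the default for `SVC` (this reflects the library's behavior, not anything specific to this answer):

```python
from sklearn.svm import SVC

clf = SVC()          # kernel='rbf' is the default, with gamma='scale'
print(clf.kernel)    # 'rbf'
```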
But the RBF kernel has a number of other properties. In these types of questions, when someone asks "why do we do things this way?", I think it's important to draw contrasts with other methods to develop context.
It is a stationary kernel, which means that it is invariant to translation. Suppose you are computing $K(x,y).$ A stationary kernel will yield the same value $K(x,y)$ for $K(x+c,y+c)$, where $c$ is any vector of the same dimension as the inputs. For the RBF kernel, this holds because the kernel depends only on the difference $x-y$ of the two vectors. For contrast, note that the linear kernel does not have the stationarity property.
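A quick numerical check of the stationarity property. This is a sketch with the kernels written out by hand; the test points and the translation $c$ are arbitrary values of my own:

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def linear_kernel(x, y):
    return np.dot(x, y)

x = np.array([1.0, 2.0])
y = np.array([0.5, -1.0])
c = np.array([3.0, 3.0])  # an arbitrary translation

# The RBF kernel depends only on x - y, so shifting both inputs changes nothing.
print(np.isclose(rbf_kernel(x, y), rbf_kernel(x + c, y + c)))        # True
# The linear kernel is not stationary: the shift changes its value.
print(np.isclose(linear_kernel(x, y), linear_kernel(x + c, y + c)))  # False
```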
The single-parameter version of the RBF kernel has the property that it is isotropic, i.e. the scaling by $\gamma$ is the same in all directions. This can be easily generalized, though, by slightly tweaking the RBF kernel to $K(x,y)=\exp\left(-(x-y)'\Gamma(x-y)\right)$ where $\Gamma$ is a p.s.d. matrix.
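A minimal sketch of the generalized, anisotropic form; the diagonal choice of $\Gamma$ below is a hypothetical example that gives each coordinate its own length-scale:

```python
import numpy as np

def anisotropic_rbf(x, y, Gamma):
    """K(x, y) = exp(-(x - y)' Gamma (x - y)) for a p.s.d. matrix Gamma."""
    d = x - y
    return np.exp(-d @ Gamma @ d)

# A diagonal Gamma scales each direction differently (anisotropic);
# Gamma = gamma * I recovers the usual isotropic single-parameter kernel.
Gamma = np.diag([10.0, 0.1])
x = np.array([1.0, 1.0])
y = np.array([0.0, 0.0])
print(anisotropic_rbf(x, y, Gamma))
```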
Another property of the RBF kernel is that it is infinitely smooth. This is aesthetically pleasing, and somewhat satisfying visually, but perhaps it is not the most important property. Compare the RBF kernel to the Matern kernel and you'll see that some kernels are quite a bit more jagged!
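To see the contrast concretely, one can compare the two kernels' profiles as a function of distance using scikit-learn's kernel implementations. This is a sketch; the length-scales and the choice of nu=0.5 (the especially jagged exponential case) are my own:

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF, Matern

X = np.linspace(0, 3, 7).reshape(-1, 1)
origin = np.array([[0.0]])

rbf = RBF(length_scale=1.0)                # infinitely differentiable
matern = Matern(length_scale=1.0, nu=0.5)  # nu=0.5: continuous but not smooth

# K(0, x) as x moves away from 0: the Matern-1/2 kernel has a kink at 0
# and decays like exp(-|x|), while the RBF decays like exp(-x^2/2)
# and is perfectly flat at 0.
print(rbf(origin, X).ravel())
print(matern(origin, X).ravel())
```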
The moral of the story is that kernel-based methods are very rich, and with a little bit of work, it's very practical to develop a kernel suited to your particular needs. But if you use the RBF kernel as a default, you'll have a reasonable benchmark for comparison.
Best Answer
The linear kernel is what you would expect: a linear model. I believe that the polynomial kernel is similar, but with a decision boundary that is a polynomial of some defined but arbitrary order
(e.g. order 3: $ a= b_1 + b_2 \cdot X + b_3 \cdot X^2 + b_4 \cdot X^3$).
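For reference, the polynomial kernel itself is usually written $K(x,y) = (\gamma \, x^\top y + c)^d$, where $d$ is the chosen degree; the boundary it induces is then a degree-$d$ polynomial surface in the input space, of which the order-3 curve above is a one-dimensional example.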
The RBF kernel places normal curves around the data points and sums them, so that the decision boundary can be defined by a level-set condition, e.g. the curves where the sum is above a value such as 0.5 (see this picture).
I am not certain what the sigmoid kernel is, unless it is similar to the logistic regression model, where a logistic function is used to define the curves and the boundary is drawn where the logistic value (modeling a probability) exceeds some threshold, such as 0.5 as in the normal case above.
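To put the four kernels side by side, here is a hedged sketch of fitting each one in scikit-learn on a toy problem. The dataset, noise level, and parameters are illustrative choices of mine, not tuned values:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The four kernels discussed above; degree only matters for 'poly'.
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    clf = SVC(kernel=kernel, degree=3).fit(X_tr, y_tr)
    print(f"{kernel:>8}: test accuracy = {clf.score(X_te, y_te):.2f}")
```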