Many kernel methods, the RBF kernel included, are based on distances between feature vectors. The RBF kernel function is $\kappa(\mathbf{u},\mathbf{v}) = \exp(-\|\mathbf{u}-\mathbf{v}\|^2)$ (using $\gamma=1$ for simplicity).
Given 3 feature vectors:
$$
\mathbf{x}_1 = [1000, 1, 2], \quad
\mathbf{x}_2 = [900, 1, 2], \quad
\mathbf{x}_3 = [1050, -10, 20].
$$
then $\kappa(\mathbf{x}_1, \mathbf{x}_2) = \exp(-10000) \ll \kappa(\mathbf{x}_1, \mathbf{x}_3) = \exp(-2945)$, that is, $\mathbf{x}_1$ is supposedly more similar to $\mathbf{x}_3$ than to $\mathbf{x}_2$.
The relative differences per feature, $|u_i - v_i|/|u_i|$, between $\mathbf{x}_1$ and the other two vectors are:
$$
\mathbf{x}_2 \rightarrow [0.1, 0, 0],\quad
\mathbf{x}_3 \rightarrow [0.05, 11, 9].
$$
So without scaling, we conclude that $\mathbf{x}_1$ is more similar to $\mathbf{x}_3$ than to $\mathbf{x}_2$, even though the relative differences between $\mathbf{x}_1$ and $\mathbf{x}_3$ in the second and third features are much larger than any of the relative differences between $\mathbf{x}_1$ and $\mathbf{x}_2$.
In other words, if you do not scale all features to comparable ranges, the features with the largest range will completely dominate the computation of the kernel matrix.
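To see this numerically, here is a minimal sketch using scikit-learn's `rbf_kernel` and `StandardScaler` (the data and $\gamma=1$ follow the example above):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.preprocessing import StandardScaler

X = np.array([[1000.0,   1.0,  2.0],   # x1
              [ 900.0,   1.0,  2.0],   # x2
              [1050.0, -10.0, 20.0]])  # x3

# Unscaled: the first feature dominates the squared distances.
K = rbf_kernel(X, gamma=1.0)
print(K[0, 1], K[0, 2])  # exp(-10000) and exp(-2945): both underflow to 0.0,
                         # with x3 nominally the "more similar" one

# Standardize each feature to zero mean and unit variance, then recompute.
Ks = rbf_kernel(StandardScaler().fit_transform(X), gamma=1.0)
print(Ks[0, 1], Ks[0, 2])  # ~7.6e-2 vs ~6.5e-5: x2 is now far more similar to x1
```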
You can find simple examples to illustrate this in the following paper: A Practical Guide to Support Vector Classification (Section 2.2).
There are multiple misunderstandings in both the question and the answer posted by @mp85.
There are two sets of parameters, but one of them is called the hyperparameters.
The SVM problem/formulation is
$$ \min_{w,\,b,\,\xi} \frac{1}{2}\|w\|^2 + C \sum_i \xi_i $$
subject to
$$ y_i(w^\top \phi(x_i)+b) \ge 1-\xi_i, \quad \xi_i \ge 0 $$
for all data $(x_i, y_i)$. $\phi(x)$ is a transformation on the input data.
So, you must set $\phi()$ and you must set $C$, and then the SVM solver (that is, the fit method of the SVC class in sklearn) will compute the $\xi_i$, the vector $w$, and the coefficient $b$. This is what is "fitted", i.e. what is computed by the method. And you must set $C$ and $\phi()$ before running the SVM solver.
But there is no way to set $\phi()$ directly. It turns out that one defines the transformation by defining a kernel: linear (no transformation), rbf, poly, or others. Each of these kernels is defined by one or more parameters: rbf by gamma; poly by gamma, coef0, and degree; and so on.
So to run the SVM you must set $C$, choose the kernel, and set the parameter (or parameters) appropriate for that kernel. These are collectively known as hyperparameters, and they are not computed by the SVM solver; they are set by you.
Finally, it is not 100% true that the SVM solver computes $w$, $b$, and the $\xi_i$. The SVC solver uses a different formulation of the SVM problem, the dual of the formulation above, and it computes different variables (the dual coefficients). The LinearSVC solver, which only works for the linear kernel, does compute $w$, $b$, and the $\xi_i$ (and returns $w$ and $b$).
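To make the split concrete, here is a minimal sklearn sketch (the dataset and the values of C and gamma are arbitrary placeholders):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=100, n_features=4, random_state=0)

# Hyperparameters: chosen by you *before* fitting.
clf = SVC(kernel='rbf', C=1.0, gamma=0.5)
clf.fit(X, y)  # the solver works on the dual problem
print(clf.dual_coef_, clf.intercept_)  # dual coefficients and b

# LinearSVC (linear kernel only) exposes w directly.
lin = LinearSVC(C=1.0, dual=False)
lin.fit(X, y)
print(lin.coef_, lin.intercept_)  # w and b
```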
Best Answer
The RBF kernel function is as follows, for two vectors $\mathbf{u}$ and $\mathbf{v}$: $$ \kappa(\mathbf{u},\mathbf{v}) = \exp(-\gamma \|\mathbf{u}-\mathbf{v}\|^2). $$ The hyperparameter $\gamma$ is used to configure the sensitivity to differences in feature vectors, which in turn depends on various things such as input space dimensionality and feature normalization.
If you set $\gamma$ too large, you will end up overfitting. In the limit $\gamma\rightarrow\infty$, the kernel matrix becomes the identity matrix, which leads to a perfect fit of the training data but an entirely useless model.
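A quick way to watch this happen, sketched with scikit-learn's `rbf_kernel` on some random data:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

for gamma in (0.1, 1.0, 10.0, 100.0):
    K = rbf_kernel(X, gamma=gamma)
    # Off-diagonal entries exp(-gamma * d^2) shrink toward 0 as gamma grows,
    # so K approaches the identity matrix.
    print(gamma, np.abs(K - np.eye(5)).max())
```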
The optimal value of $\gamma$ depends entirely on your data; any rules of thumb should be taken with a pound of salt. That said, you can use specialized libraries to optimize hyperparameters for you (e.g. Optunity (*)); in the case of an SVM with RBF kernel, that means $\gamma$ and $C$. You can find an example of optimizing these parameters automatically with Optunity and scikit-learn here.
(*) disclaimer: I'm the lead developer of Optunity.
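Optunity aside, a minimal sketch of the same idea using only scikit-learn's GridSearchCV to tune $C$ and $\gamma$ (the dataset and grid values are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

# Log-spaced candidate values for the two hyperparameters of an RBF-kernel SVM.
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [1e-3, 1e-2, 1e-1, 1]}
search = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5)
search.fit(X, y)  # cross-validated search over the grid
print(search.best_params_, search.best_score_)
```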