Honestly, I am not sure why you want to do this in Excel. Nonetheless, ...
A linear SVM requires solving a quadratic program with several linear constraints. You can check this answer [1] to see how the quadratic program is set up. Once you have set up the quadratic program and found a solver that can handle it in Excel, you are good to go.
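For reference, the standard soft-margin primal looks like this (the linked answer may use slightly different notation; $C$ is the misclassification penalty and the $\xi_i$ are slack variables):

$$
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \frac{1}{2}\lVert w\rVert^2 + C\sum_{i=1}^{n}\xi_i \\
\text{subject to}\quad & y_i\left(w^\top x_i + b\right) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,n.
\end{aligned}
$$

The decision variables are $w$, $b$ and the $\xi_i$, and the objective is quadratic with linear constraints, which is exactly the shape of problem a generic QP solver expects.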
On the other hand, the corresponding quadratic program has a dual, and it is the dual that gives rise to the notion of kernels. The objective function for the dual can be found here [2]. If you can find a quadratic program solver for Excel, you might as well solve the dual, which lets you go beyond the linear kernel.
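Concretely, for a kernel $K$ and box constraint $C$, the dual is the QP

$$
\begin{aligned}
\max_{\alpha}\quad & \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j\, y_i y_j\, K(x_i, x_j) \\
\text{subject to}\quad & 0 \le \alpha_i \le C,\quad i = 1,\dots,n, \qquad \sum_{i=1}^{n}\alpha_i y_i = 0,
\end{aligned}
$$

and the linear case is recovered by taking $K(x_i, x_j) = x_i^\top x_j$.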
If you don't have a QP solver at hand, you can implement the SMO algorithm [3], which solves the SVM dual; the linked handout gives pseudocode. SMO is one of the simplest algorithms for the SVM dual, but also one of the slowest. For a small training set, however, it should be fast enough.
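If you do end up writing it yourself, here is a minimal sketch of the simplified SMO variant from [3]. The structure and the names `C`, `tol`, `max_passes` follow that handout; the linear kernel and the random choice of `j` are simplifications the handout itself makes:

```python
import numpy as np

def simplified_smo(X, y, C=1.0, tol=1e-3, max_passes=5, seed=None):
    """Simplified SMO for the SVM dual, following the pseudocode in [3].
    X: (m, d) array of training points; y: (m,) labels in {-1, +1}.
    Returns the dual variables alpha and the bias b."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    K = X @ X.T                      # linear kernel; swap in any Gram matrix
    alpha = np.zeros(m)
    b = 0.0

    def f(i):                        # decision value at training point i
        return np.sum(alpha * y * K[:, i]) + b

    passes = 0
    while passes < max_passes:
        num_changed = 0
        for i in range(m):
            E_i = f(i) - y[i]
            # Only touch alpha_i if it violates the KKT conditions
            if (y[i] * E_i < -tol and alpha[i] < C) or (y[i] * E_i > tol and alpha[i] > 0):
                j = rng.integers(m - 1)
                if j >= i:           # pick j != i uniformly at random
                    j += 1
                E_j = f(j) - y[j]
                a_i_old, a_j_old = alpha[i], alpha[j]
                # Box bounds keeping both multipliers in [0, C]
                if y[i] != y[j]:
                    L, H = max(0.0, alpha[j] - alpha[i]), min(C, C + alpha[j] - alpha[i])
                else:
                    L, H = max(0.0, alpha[i] + alpha[j] - C), min(C, alpha[i] + alpha[j])
                if L == H:
                    continue
                eta = 2.0 * K[i, j] - K[i, i] - K[j, j]
                if eta >= 0:
                    continue
                # Optimize alpha_j analytically, then clip to the box
                alpha[j] -= y[j] * (E_i - E_j) / eta
                alpha[j] = np.clip(alpha[j], L, H)
                if abs(alpha[j] - a_j_old) < 1e-5:
                    continue
                alpha[i] += y[i] * y[j] * (a_j_old - alpha[j])
                # Recompute the bias from whichever multiplier is inside the box
                b1 = (b - E_i
                      - y[i] * (alpha[i] - a_i_old) * K[i, i]
                      - y[j] * (alpha[j] - a_j_old) * K[i, j])
                b2 = (b - E_j
                      - y[i] * (alpha[i] - a_i_old) * K[i, j]
                      - y[j] * (alpha[j] - a_j_old) * K[j, j])
                if 0 < alpha[i] < C:
                    b = b1
                elif 0 < alpha[j] < C:
                    b = b2
                else:
                    b = (b1 + b2) / 2.0
                num_changed += 1
        passes = passes + 1 if num_changed == 0 else 0
    return alpha, b
```

With a linear kernel you can then recover the weight vector as `w = (alpha * y) @ X` and classify new points with `np.sign(x @ w + b)`.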
[1] Given a set of points in two dimensional space, how can one design decision function for SVM?
[2] Non-linear SVM classification with RBF kernel
[3] http://cs229.stanford.edu/materials/smo.pdf
I do not believe it is the inverse either; the two approaches are very different.
LDA is optimal (in the Bayes sense) whenever the assumptions under which it is derived are met, namely: the data are generated from two multivariate Gaussians with equal covariance matrices. These assumptions are very restrictive.
A linear SVM, on the other hand, makes no assumptions about the distribution of the data, and has parameters that let one control the tolerance for outliers directly. This is a considerably more flexible approach.
Further, LDA runs into numerical problems in high dimensions, because the pooled covariance estimate becomes ill-conditioned (or outright singular) once the dimensionality approaches the sample size; SVMs are more robust in that setting. So, in general, yes, SVMs will tend to behave better.
As an example (more a starting point to experiment with than a proof), you can take a look at this notebook in R. There you can see how, in a more general case (where the LDA assumptions no longer hold and the dimensionality is higher), linear SVMs usually perform better.
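If R is not at hand, a minimal Python sketch of the same kind of experiment might look as follows (scikit-learn estimators; the data-generator settings are arbitrary choices of mine, and which model wins depends on them):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

# Higher-dimensional data with heavy class overlap, so LDA's
# "two Gaussians, equal covariances" assumptions are not cleanly met.
X, y = make_classification(n_samples=500, n_features=100, n_informative=20,
                           n_redundant=30, class_sep=0.8, random_state=0)

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "Linear SVM": make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=10_000)),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```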
Best Answer
Support vector machines focus only on the points that are the most difficult to tell apart, whereas other classifiers pay attention to all of the points.
The intuition behind the support vector machine approach is that if a classifier is good at the most challenging comparisons (the points in B and A that are closest to each other in Figure 2), then the classifier will be even better at the easy comparisons (comparing points in B and A that are far away from each other).
Perceptrons and other classifiers:
Perceptrons are built by taking one point at a time and adjusting the dividing line accordingly. As soon as all of the points are separated, the perceptron algorithm stops. But it could stop anywhere. Figure 1 shows that there are a bunch of different dividing lines that separate the data. The perceptron's stopping criterion is simple: "separate the points and stop improving the line once you get 100% separation." The perceptron is never explicitly told to find the best separating line. Logistic regression and linear discriminant models are built similarly to perceptrons.
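In code, that one-point-at-a-time procedure is only a few lines. This is a sketch, not anyone's reference implementation; it assumes labels in {-1, +1} and folds the bias into the weight vector:

```python
import numpy as np

def perceptron(X, y, max_epochs=100):
    """Rosenblatt perceptron: nudge the line after each mistake,
    stop as soon as one full pass makes no mistakes."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append 1 to fold in the bias
    w = np.zeros(Xb.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:      # misclassified (or exactly on the line)
                w += yi * xi            # nudge the line toward this point
                mistakes += 1
        if mistakes == 0:               # 100% separation: stop, wherever we are
            break
    return w
```

Note the stopping rule: whichever separating line the algorithm happens to hold when the last mistake is fixed is the one you get, however close to the data it sits.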
The best dividing line maximizes the distance between the B points closest to A and the A points closest to B. It's not necessary to look at all of the points to do this. In fact, incorporating feedback from points that are far away can bump the line a little too far, as seen below.
Support Vector Machines:
Unlike other classifiers, the support vector machine is explicitly told to find the best separating line. How? The support vector machine searches for the closest points (Figure 2), which it calls the "support vectors" (the name "support vector machine" is due to the fact that points are like vectors and that the best line "depends on" or is "supported by" the closest points).
Once it has found the closest points, the SVM draws a line connecting them (see the line labeled 'w' in Figure 2). It draws this connecting line by doing vector subtraction (point A - point B). The support vector machine then declares the best separating line to be the line that bisects -- and is perpendicular to -- the connecting line.
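A tiny numeric version of that construction, with two hypothetical support vectors `a` and `b`:

```python
import numpy as np

a = np.array([2.0, 3.0])      # closest point from class A (made-up coordinates)
b = np.array([4.0, 1.0])      # closest point from class B (made-up coordinates)

w = a - b                     # the connecting line, by vector subtraction
midpoint = (a + b) / 2        # where the separating line bisects it

# The separating line is every x with w . (x - midpoint) = 0, i.e. w . x = c:
c = w @ midpoint

def side(x):
    """Positive on A's side of the line, negative on B's side."""
    return np.sign(w @ x - c)

print(side(np.array([1.0, 4.0])), side(np.array([5.0, 0.0])))  # 1.0 -1.0
```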
The support vector machine is better because when you get a new sample (new points), you will already have a line that keeps B and A as far apart as possible, so it is less likely that one will spill over the line into the other's territory.
I consider myself a visual learner, and I struggled with the intuition behind support vector machines for a long time. The paper called Duality and Geometry in SVM Classifiers finally helped me see the light; that's where I got the images from.