I don't think I can really give you the intuition that you seek because I don't think I quite have it yet either. But I think that understanding the relevance of Nigel Higson's comment might help, and I can try to provide some insight. (Full disclosure: most of my understanding of these matters has been heavily influenced by Nigel Higson and John Roe.)
My first comment is that the index theorem should be regarded as a statement about K-theory, not as a cohomological formula. Understanding the theorem in this way suppresses many complications (such as the confusing appearance of the Todd class!) and lends itself most readily to generalization. Moreover, the K-theory proof of the index theorem parallels the "extrinsic" proof of the Gauss-Bonnet theorem, making the result seem a little more natural. The appearance of the Chern character and Todd class is explained in this context by two observations: the Chern character maps K-theory (vector bundles) to cohomology (differential forms), and the Todd class measures the difference between the Thom isomorphism in K-theory and the Thom isomorphism in cohomology. I unfortunately can't give you any better intuition for the latter statement than what can be obtained by looking at Atiyah and Singer's proof, but in any event my point is that the Todd class arises because we are trying to convert what ought to be a K-theory statement into a cohomological statement, not for a reason that is truly intrinsic to the index theorem.
Before I elaborate on the K-theory proof, I want to comment that there is also a local proof of the index theorem which relies on detailed asymptotic analysis of the heat equation associated to a Dirac operator. This is analogous to certain intrinsic proofs of the Gauss-Bonnet theorem, but according to my understanding the argument doesn't provide the same kind of intuition that the K-theory argument does. The basic strategy of the local argument, as simplified by Getzler, is to invent a symbolic calculus for the Dirac operator which reduces the theorem to a computation with a specific example. This example is a version of the quantum-mechanical harmonic oscillator operator, and a coordinate calculation directly produces the $\hat{A}$ genus (the appropriate "right-hand side" of the index theorem for the Dirac operator). There are some slightly more conceptual versions of this proof, but none that I have seen REALLY explain the geometric meaning of the $\hat{A}$ genus.
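To make the heat-equation strategy slightly more concrete, here is a sketch of the two formulas usually involved. The notation is an assumption not used above: $D^\pm$ are the two halves of the Dirac operator with respect to the grading of the spinor bundle, and $R$ is the Riemannian curvature (references differ on factors of $2\pi i$ in its normalization). The McKean-Singer formula expresses the index as a difference of heat traces, valid for every $t > 0$:

$$\operatorname{ind} D^+ = \operatorname{Tr}\, e^{-tD^-D^+} - \operatorname{Tr}\, e^{-tD^+D^-}.$$

The small-$t$ behaviour of the right-hand side is what the harmonic oscillator computation (via Mehler's formula) evaluates, and the result takes the form

$$\operatorname{ind} D^+ = \int_M \hat{A}(M), \qquad \hat{A}(M) = {\det}^{1/2}\left(\frac{R/2}{\sinh(R/2)}\right).$$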
So let's look at the K-theory argument. The first step is to observe that the symbol of an elliptic operator gives rise to a class in $K(T^*M)$. If the operator $D$ acts on smooth sections of a vector bundle $S$, then its symbol is a map $T^*M \to \operatorname{End}(S)$ which is invertible away from the zero section; Atiyah's "clutching" construction produces the relevant K-theory class. Second, one constructs an "analytic index" map $K(T^*M) \to \mathbb{Z}$ which sends the symbol class to the index of $D$. The crucial point about the construction of this map is that it is really just a jazzed up version of the basic case where $M = \mathbb{R}^2$, and in that case the analytic index map is the Bott periodicity isomorphism. Third, one constructs a "topological index map" $K(T^*M) \to \mathbb{Z}$ as follows. Choose an embedding $M \to \mathbb{R}^n$ (one must prove later that the choice of embedding doesn't matter) and let $E$ be the normal bundle of $T^*M$ in $T^*\mathbb{R}^n$. $E$ is diffeomorphic to a tubular neighborhood $U$ of $T^*M$, so we have a composition
$K(T^* M) \to K(E) \to K(U) \to K(T^*\mathbb{R}^n)$
Here the first map is the Thom isomorphism, the second is induced by the tubular neighborhood diffeomorphism, and the third is induced by inclusion of an open set (i.e. extension of a vector bundle on an open set to a vector bundle on the whole manifold). But $T^*\mathbb{R}^n \cong \mathbb{R}^{2n}$, so Bott periodicity gives $K(T^* \mathbb{R}^n) \cong K(\text{point}) = \mathbb{Z}$ (here $K$ means K-theory with compact supports), and we have obtained our topological index map from $K(T^*M)$ to $\mathbb{Z}$. The last step of the proof is to show that the analytic index map and the topological index map are equal, and here again the basic idea is to invoke Bott periodicity. Note that we expect Bott periodicity to be the relevant tool because it is crucial to the construction of both the analytic and topological index maps - in the topological index map it is hiding in the construction of the Thom isomorphism, which by definition is the product with the Bott element in K-theory.
To recover the cohomological formulation of the index theorem, just apply Chern characters to the composition of K-theory maps which defines the topological index. The K-theory formulation of the index theorem says that if you "plug in" the symbol class then you get out the index, and all squares with K-theory on top and cohomology on the bottom commute except for the "Thom isomorphism square", which introduces the Todd class. So the main challenge is to get an intuitive grasp of the K-theory formulation of the index theorem, and as I hope you can see the main idea is the Bott periodicity theorem.
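For concreteness, here is a sketch of what goes wrong in the Thom isomorphism square. It assumes that the K-theory Thom class of a complex rank-$n$ bundle $E \to X$ is the Koszul complex built from $\Lambda^* E$; other conventions change signs or replace $E$ by its conjugate. The notation is also an assumption, not taken from the text above: $\phi_K$ and $\phi_H$ are the Thom isomorphisms in K-theory and cohomology, $e(E)$ is the Euler class, and $x_1, \dots, x_n$ are the Chern roots of $E$. Restricting to the zero section and using the splitting principle,

$$\phi_H^{-1}\,\mathrm{ch}\,\phi_K(1) = \frac{\mathrm{ch}\left(\sum_{i} (-1)^i \Lambda^i E\right)}{e(E)} = \prod_{j=1}^{n} \frac{1 - e^{x_j}}{x_j} = (-1)^n\, \mathrm{Td}(\bar{E})^{-1},$$

where the last step uses $\mathrm{Td}(\bar{E}) = \prod_{j} \frac{-x_j}{1 - e^{x_j}}$. So the square commutes only up to multiplication by this class. Carrying the correction through the composition that defines the topological index produces a formula of the shape

$$\operatorname{ind} D = (-1)^{\dim M} \int_{T^*M} \mathrm{ch}(\sigma(D))\, \mathrm{Td}(TM \otimes \mathbb{C}),$$

with the sign depending on how $T^*M$ is oriented.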
I hope this helps!
The Fredholm index of an elliptic operator only depends on the symbol class. Here is the proof (which I remember from Lawson-Michelsohn and Atiyah-Singer).
If $D: \Gamma(E_0) \to \Gamma(E_1)$ has order $k \neq 0$, pick a connection $\nabla$ on $E_0$. Then $A=(1+\nabla^{\ast}\nabla)$ is a self-adjoint invertible operator of order $2$ and $D \circ A^{-k/2}$ has the same Fredholm index as $D$ and the same symbol class; but it has order $0$. So for any operator, there is an order $0$ operator with the same symbol class and the same index; and this reduces the problem to the order $0$ case.
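Spelled out, the reduction rests on two facts, sketched here with $\sigma_k$ denoting the principal symbol of an order-$k$ operator (this notation is an assumption, not used above). First, $A^{-k/2}$ is invertible and so has index $0$, and the index is additive under composition of Fredholm operators between the appropriate Sobolev spaces:

$$\operatorname{ind}(D \circ A^{-k/2}) = \operatorname{ind} D + \operatorname{ind} A^{-k/2} = \operatorname{ind} D.$$

Second, on the symbol side $\sigma_2(A)(\xi) = |\xi|^2$, so

$$\sigma_0(D \circ A^{-k/2})(\xi) = |\xi|^{-k}\, \sigma_k(D)(\xi).$$

This is homogeneous of degree $0$ and agrees with $\sigma_k(D)$ on the unit cosphere bundle. Rescaling by a positive scalar function does not change the symbol class.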
It has been mentioned that the index of an operator only depends on the homotopy class of its symbol. (By the way, the proof of this in Lawson-Michelsohn is incomplete. In the proof of Theorem 7.10, they say "to construct a family of operators $P_t$ with $\sigma(P_t)=\sigma_t$, [...] is evidently possible locally (in coordinates)". One needs to know that in $\mathbb{R}^n$, the operator norm of a pseudodifferential operator can be estimated by the symbol. You can see this by going through the proof of Prop. III.3.2 of Lawson-Michelsohn.)
Recall the (modified) definition of $K^0 (TX,TX-0)$: it is the group of all equivalence classes of triples $(E_0,E_1,f)$, where the $E_i \to X$ are vector bundles and $f: \pi^{\ast} E_0 \to \pi^{\ast} E_1$ is a bundle map that is an isomorphism away from the zero section and homogeneous of order $0$ outside the zero section. The equivalence relation is generated by (1) homotopy, (2) isomorphism and (3) addition of things of the form $(E,E,id)$. (1) and (2) preserve the index. (3) also preserves the index, since a pseudodifferential operator with symbol $(E,E,id)$ is e.g. the identity operator, which has index $0$.
The original paper by Atiyah and Singer at http://www.jstor.org/stable/1970715 is as good as anything.