Right, in general you're not going to see a straightforward equivalence there. Using Dirac notation with the spectral decompositions $\hat A = \sum_a a~|a\rangle\langle a|$ and $\hat B = \sum_b b~\hat P_b$, where $\hat P_b = |b\rangle\langle b|$, we see that $\langle \hat A \hat B \rangle = \sum_{a,b} a~b~\langle \psi | a \rangle~\langle a | b \rangle~\langle b | \psi \rangle$, and even inserting a resolution of the identity in the $B$ eigenbasis (call its label $b'$) gives: $$ \begin{align}\langle \hat A \hat B \rangle &= \sum_{a,b,b'} a~b~\langle \psi | b'\rangle~\langle b'|a \rangle~\langle a | b \rangle~\langle b | \psi \rangle\\&=\sum_{b,b'} b~\psi^*(b')~\psi(b) ~ \langle b'|\hat A| b\rangle\end{align}$$where $\psi(b) = \langle b | \psi \rangle$. Indeed, you'd need to insert some sort of $\delta_{bb'}$ into this last sum to get the $b ~ \langle b|\hat A | b\rangle$ sense of "measure $B$ first, then $A$," which this expression doesn't have unless it's hidden in that $\hat A$ term.
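As a quick sanity check of the two sums above, here is a minimal sketch in plain Python. The specific choices are assumptions made purely for illustration and are not from the question: $\hat A = \sigma_x$, $\hat B = \sigma_z$ (so the $B$ eigenbasis is just the standard basis, with eigenvalues $\pm 1$), and the state $\psi = (0.6,\ 0.8i)$. It compares the direct $\langle\psi|\hat A\hat B|\psi\rangle$, the double sum over $b, b'$, and the "measure $B$ first, then $A$" sum you would get after inserting $\delta_{bb'}$.

```python
# Illustrative (hypothetical) choices, not taken from the question:
# A = sigma_x, B = sigma_z = diag(+1, -1), psi = (0.6, 0.8i).
A = [[0, 1], [1, 0]]
b_vals = [1, -1]            # eigenvalues of B; its eigenbasis is the standard basis
psi = [0.6 + 0j, 0.8j]      # normalized: 0.36 + 0.64 = 1
dim = len(psi)

# Direct evaluation of <psi| A B |psi>.
B_psi = [b_vals[k] * psi[k] for k in range(dim)]
AB_psi = [sum(A[i][k] * B_psi[k] for k in range(dim)) for i in range(dim)]
direct = sum(psi[i].conjugate() * AB_psi[i] for i in range(dim))

# Double sum: sum_{b,b'} b psi*(b') psi(b) <b'|A|b>.
double_sum = sum(
    b_vals[b] * psi[bp].conjugate() * psi[b] * A[bp][b]
    for b in range(dim)
    for bp in range(dim)
)

# "Measure B first, then A": the same sum with delta_{bb'} inserted,
# i.e. sum_b b |psi(b)|^2 <b|A|b>.
sequential = sum(b_vals[b] * abs(psi[b]) ** 2 * A[b][b] for b in range(dim))

print("direct     :", direct)
print("double sum :", double_sum)
print("sequential :", sequential)
```

With these choices the first two agree (about $-0.96i$, so not even real), while the sequential sum is $0$ because $\sigma_x$ has no diagonal elements in the $B$ basis.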
There is a very simple reason why you do not see this straightforward equivalence. Let's work in a finite-dimensional Hilbert space, with $\psi \in \mathbb C^N.$ Then the matrix $\hat C = \hat A \hat B$ is really given by the Einstein sum $$C_{ik} = A_{ij} ~B_{jk}.$$This is Hermitian if and only if $C_{ik}^* = C_{ki}$, but since $\hat A$ and $\hat B$ are themselves Hermitian, the complex conjugate here is$$C_{ik}^* = A_{ij}^* ~B_{jk}^* = A_{ji}~B_{kj} = B_{kj}~A_{ji},$$and demanding that this equal $C_{ki} = A_{kj} B_{ji}$ is therefore demanding that $B_{kj} A_{ji} = A_{kj} B_{ji}$, that is, $[\hat A, \hat B] = 0.$
In other words, the product of two Hermitian matrices is Hermitian only if they commute. When they do not commute, the expectation $\langle \hat A \hat B \rangle$ is in general a complex number.
If you want something which is Hermitian (say you have a classical expression involving $\langle x~ p\rangle$ that you want to generalize into the quantum case) then you will probably do a symmetric product $\frac 12 (\hat A \hat B + \hat B \hat A)$, which is then again Hermitian if its constituent matrices are.
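Here is a small sketch in plain Python of the Hermiticity argument and the symmetric-product fix. The two matrices and the state are hypothetical examples picked so that they don't commute; the helper names (`matmul`, `dagger`, `is_hermitian`, `expectation`) are mine, not from any library.

```python
def matmul(X, Y):
    """Matrix product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(X):
    """Conjugate transpose."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]

def is_hermitian(X, tol=1e-12):
    Xd = dagger(X)
    n = len(X)
    return all(abs(X[i][j] - Xd[i][j]) <= tol for i in range(n) for j in range(n))

def expectation(psi, X):
    """<psi| X |psi> for a normalized state psi."""
    n = len(psi)
    return sum(complex(psi[i]).conjugate() * X[i][j] * psi[j]
               for i in range(n) for j in range(n))

# Hypothetical Hermitian matrices that do not commute.
A = [[1, 0], [0, -1]]
B = [[1, 1 - 1j], [1 + 1j, 2]]
psi = [0.6, 0.8]  # normalized real state, chosen for illustration

AB = matmul(A, B)
BA = matmul(B, A)
sym = [[(AB[i][j] + BA[i][j]) / 2 for j in range(2)] for i in range(2)]

print("A, B Hermitian:  ", is_hermitian(A), is_hermitian(B))
print("AB Hermitian:    ", is_hermitian(AB))
print("(AB+BA)/2 Herm.: ", is_hermitian(sym))
print("<AB>        =", expectation(psi, AB))
print("<(AB+BA)/2> =", expectation(psi, sym))
```

For this pair, $\langle \hat A \hat B\rangle$ comes out with a nonzero imaginary part, while the symmetrized expectation is real and equals the real part of $\langle \hat A \hat B\rangle$.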
Observables don't commute if they can't be simultaneously diagonalized, i.e. if they don't share an eigenvector basis. If you look at this condition the right way, the resulting uncertainty principle becomes very intuitive.
As an example, consider the two-dimensional Hilbert space describing the polarization of a photon moving along the $z$ axis. Its polarization is a vector in the $xy$ plane.
Let $A$ be the operator that determines whether a photon is polarized along the $x$ axis or the $y$ axis, assigning a value of 0 to the former option and 1 to the latter. You can measure $A$ using a simple polarizing filter, and its matrix elements are
$$A = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$
Now let $B$ be the operator that determines whether a photon is $+$ polarized (i.e. polarized southwest/northeast) or $-$ polarized (polarized southeast/northwest), assigning them values 0 and 1, respectively. Then
$$B = \begin{pmatrix} 1/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix}.$$
The operators $A$ and $B$ don't commute, so they can't be simultaneously diagonalized and thus obey an uncertainty principle. And you can immediately see why from geometry: $A$ and $B$ are picking out different sets of directions. If you had a definite value of $A$, you'd have to be either $x$ or $y$ polarized. If you had a definite value of $B$, you'd have to be $+$ or $-$ polarized. It's impossible to be both at once.
Or, if you rephrase things in terms of compass directions, the questions "are you going north or east" and "are you going northeast or southeast" do not have simultaneously well-defined answers. This doesn't mean compasses are incorrect, or incomplete, or that observing a compass 'interferes with orientation'. They're just different directions.
Position and momentum work exactly the same way. A position eigenstate is sharply localized, while a momentum eigenstate has infinite spatial extent. Thinking of the Hilbert space as a vector space, they're simply picking out different directions; no vector is an eigenvector of both at once.
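The polarization example can be checked numerically. The following is a minimal plain-Python sketch using the two matrices $A$ and $B$ given above; the helper names and the dictionary of states are my own scaffolding. It computes the commutator and then the variance of each observable in each of the four polarization states.

```python
A = [[0.0, 0.0], [0.0, 1.0]]      # x -> 0, y -> 1
B = [[0.5, -0.5], [-0.5, 0.5]]    # + -> 0, - -> 1

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(psi, X):
    """<psi| X |psi> for a real, normalized polarization vector."""
    n = len(psi)
    return sum(psi[i] * X[i][j] * psi[j] for i in range(n) for j in range(n))

def variance(psi, X):
    return expectation(psi, matmul(X, X)) - expectation(psi, X) ** 2

AB = matmul(A, B)
BA = matmul(B, A)
commutator = [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]
print("[A, B] =", commutator)     # nonzero, so no shared eigenbasis

s = 2 ** -0.5
states = {
    "x": [1.0, 0.0],
    "y": [0.0, 1.0],
    "+": [s, s],      # southwest/northeast
    "-": [s, -s],     # southeast/northwest
}
for name, psi in states.items():
    print(name, "var(A) =", variance(psi, A), " var(B) =", variance(psi, B))
```

Each state with a definite value of one observable (zero variance) has the maximal spread in the other, which is the "different directions" picture in numbers.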
Re question 2: A conserved quantity is one that does not change with time. If an operator (with no explicit time dependence) commutes with the Hamiltonian, the corresponding quantity is conserved; this follows directly from how the time evolution of operators is defined, i.e. via the Heisenberg equation. In quantum mechanics (and more generally in any science), those characteristics of a physical system which are conserved in time (or which at least do not change very rapidly or randomly) can be used as a natural label for that system. Non-conserved quantities are not good for this purpose; e.g. elementary particles are classified by qualities like charge, spin, etc., rather than by their position in spacetime. As another example, people usually remember each other by their faces rather than their hairstyles, because the latter may keep changing.
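For reference, here is a short sketch of that step in the standard textbook form. The symbol $\hat O$ is my own label for the operator in question, and I assume it has no explicit time dependence. The Heisenberg equation reads

$$\frac{d\hat O_H}{dt} = \frac{i}{\hbar}\,[\hat H, \hat O_H],$$

so $[\hat H, \hat O] = 0$ gives $d\hat O_H/dt = 0$, and hence $\frac{d}{dt}\langle \hat O \rangle = 0$ in any state.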