As Lubos has mentioned
$QP-PQ=i\hbar$
is one of the basic requirements of quantum mechanics. Classically, observables are functions of the variables $q$ and $p$, and the Poisson bracket relation reads
$\{q,p\}=1$ (note that $\{q,p\}$ is a unitless quantity).
In QM, observables are required to be Hermitian operators (so that they have real eigenvalues). In particular, for position we have an operator $Q$, and for momentum an operator $P$. The Poisson bracket is replaced by the commutator, and we require
$[Q,P]=i\hbar$
In direct analogy with the classical Poisson bracket, one might instead have required
$[Q,P]=1$
But this is not possible, because:
i) We have already required that $Q$ and $P$ be Hermitian. So $[Q,P]^\dagger=(QP-PQ)^\dagger=(QP)^\dagger-(PQ)^\dagger=PQ-QP=-[Q,P]$, i.e. the commutator is anti-Hermitian. So if we require $[Q,P]$ to be a constant (i.e. a constant multiple of the identity matrix), that constant must be purely imaginary.
ii) $[Q,P]$ has units of $ML^2T^{-1}$, the same units as $\hbar$. You can see this by noting that $Q$ is a position operator, so it has units of $L$, and that $P$ is a momentum operator, so it has units of $MLT^{-1}$.
Given these two constraints, the two natural choices are $[Q,P]=i\hbar$ and $[Q,P]=-i\hbar$. They are equivalent, and choosing $[Q,P]=i\hbar$ is just a convention.
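Point i) is easy to check numerically. The following is only a minimal sketch in plain Python: the helper names (`dagger`, `matmul`, `commutator`, `random_hermitian`, `max_abs_sum`) and the 4×4 random matrices are my own illustrative choices, not anything specific to $Q$ and $P$. It builds two random Hermitian matrices and confirms that their commutator $C$ satisfies $C^\dagger=-C$, so that $iC$ is Hermitian.

```python
import random

def dagger(m):
    """Conjugate transpose of a square matrix stored as a list of rows."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def random_hermitian(n, rng):
    """Hermitian part (M + M^dagger)/2 of a random complex matrix M."""
    m = [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
         for _ in range(n)]
    md = dagger(m)
    return [[(m[i][j] + md[i][j]) / 2 for j in range(n)] for i in range(n)]

def max_abs_sum(a, b):
    """Largest |a_ij + b_ij|; it vanishes exactly when a = -b."""
    n = len(a)
    return max(abs(a[i][j] + b[i][j]) for i in range(n) for j in range(n))

rng = random.Random(0)          # fixed seed, purely for reproducibility
A = random_hermitian(4, rng)
B = random_hermitian(4, rng)
C = commutator(A, B)

# Anti-Hermitian check: C^dagger + C should vanish up to rounding.
print(max_abs_sum(dagger(C), C))

# Consequently i*C is Hermitian: (iC)^dagger - iC should also vanish.
iC = [[1j * x for x in row] for row in C]
minus_iC = [[-x for x in row] for row in iC]
print(max_abs_sum(dagger(iC), minus_iC))
```

Both printed numbers should be at the level of floating-point rounding, which is the matrix version of the statement that a constant commutator of Hermitian operators must be purely imaginary.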
No two finite-dimensional matrices can satisfy $[Q,P]=i\hbar$. This can be seen by taking the trace of both sides: the trace of a commutator vanishes, while the trace of $i\hbar$ times the $n\times n$ identity is $i\hbar n\neq 0$. However, this relation can be satisfied by infinite-dimensional matrices. More explicitly, take the vector space to be the space of functions of $q$. Define $Q$ by $Qf=qf$, and $P$ by $Pf=-i\hbar\,\partial f/\partial q$. Then $[Q,P]f=-i\hbar\,q\,\partial f/\partial q+i\hbar\,\partial(qf)/\partial q=i\hbar f$, so these operators satisfy the required commutation relation. Moreover, if we define our inner product as
$(f,g)=\int f^* g\: dq$
then $Q$ and $P$ defined as above will also be Hermitian (for functions that vanish at infinity, as an integration by parts shows).
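To see the trace obstruction at work, here is a rough sketch that truncates the infinite matrices to $N\times N$. It assumes things not stated above: units with $\hbar=1$, and the harmonic-oscillator number basis, in which $Q=(a+a^\dagger)/\sqrt2$ and $P=i(a^\dagger-a)/\sqrt2$ with the lowering matrix $a_{n,n+1}=\sqrt{n+1}$. The size `N = 6` and the helper names are illustrative only.

```python
import math

N = 6  # truncation size (illustrative); the exact operators need N -> infinity

def dagger(m):
    """Conjugate transpose of a square matrix stored as a list of rows."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(x, y):
    """Product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Truncated lowering operator: a[n][n+1] = sqrt(n+1), zero elsewhere.
a = [[0j] * N for _ in range(N)]
for n in range(N - 1):
    a[n][n + 1] = complex(math.sqrt(n + 1))
ad = dagger(a)

# Q = (a + a^dagger)/sqrt(2),  P = i (a^dagger - a)/sqrt(2),  with hbar = 1.
s = math.sqrt(2.0)
Q = [[(a[i][j] + ad[i][j]) / s for j in range(N)] for i in range(N)]
P = [[1j * (ad[i][j] - a[i][j]) / s for j in range(N)] for i in range(N)]

QP, PQ = matmul(Q, P), matmul(P, Q)
comm = [[QP[i][j] - PQ[i][j] for j in range(N)] for i in range(N)]

diag = [comm[i][i] for i in range(N)]
print(diag)        # i on every diagonal entry except the last
print(sum(diag))   # the trace, which vanishes up to rounding
```

The truncated commutator equals $i$ on every diagonal entry except the last, which is forced to $-i(N-1)$ so that the trace vanishes. The "defect" is pushed to ever higher states as $N$ grows, which is how the infinite-dimensional operators escape the trace argument.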
Your running in circles will stop once you commit yourself to a choice.
What to regard as a postulate is always a matter of choice (by you or by whoever writes an exposition of the basics). One starts from a point where the development is in some sense simplest. And one may motivate the postulates by analogies or whatever. The canonical commutation relations (CCR) are a simple coordinate-independent starting point.
However, it is more sensible to introduce the momentum as the infinitesimal generator of a translation in position space. This is its fundamental meaning and is essential for Noether's theorem, and it has the CCR as a simple corollary.
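As a sketch of how the corollary comes about, assume (the sign and normalization here are a convention I am choosing, not something fixed above) that a translation by $a$ is represented by the unitary $T(a)=e^{-iaP/\hbar}$, acting on wave functions as $(T(a)\psi)(q)=\psi(q-a)$. Translating the state shifts the position operator, $T(a)^\dagger\,Q\,T(a)=Q+a$, and expanding to first order in $a$ gives

$$\left(1+\frac{ia}{\hbar}P\right)Q\left(1-\frac{ia}{\hbar}P\right)=Q+\frac{ia}{\hbar}\,(PQ-QP)+O(a^2)=Q+a .$$

Comparing the terms linear in $a$ yields $PQ-QP=-i\hbar$, i.e. $[Q,P]=i\hbar$. With the opposite sign convention for $T(a)$ one would obtain $[Q,P]=-i\hbar$ instead.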
Best Answer
There's a considerable danger of oversimplification here, but I find it helpful to think of the construction $e^{i\lambda \hat q}$ as an operator that generates the characteristic function of a probability distribution in a particular state. Being a little free with mathematical details, we can write the probability density of observing the value $q$ in a vector state $\left|\psi\right>$ as $\mathsf{Pr}(q)=\left<\psi\right|\delta(q-\hat q)\left|\psi\right>$, for which the characteristic function is the Fourier transform $$C(\lambda)=\int\left<\psi\right|\delta(q-\hat q)\left|\psi\right>e^{i\lambda q}\mathrm{d}q= \left<\psi\right|e^{i\lambda \hat q}\left|\psi\right>.$$ We can inverse Fourier transform this back to a probability density. What is commonly labeled $Q$, as you have here, is more helpfully labeled in a way that emphasizes its difference from the position operator and from the values that position measurements may take. Mathematically, $\lambda$ is a linear dual of the position. It's very clear what a Fourier transform of a probability density is algebraically, but perhaps not so clear what its physical meaning is. It's perhaps best to think of it as a formal device that uses $\lambda$ to keep track of all the moments of the probabilities of different measurement results, which allows us to say that $\lambda^n$ is associated with the $n$-th moment. [If you want to understand generating functions, you could do a lot worse, though this is a slightly idiosyncratic suggestion on my part, than to read John Baez's recent series on Network Theory with an open mind; a Google search for "network theory (part" baez finds them. It may well twist your head a little, however, so if you want a quick understanding this may not be for you.]
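To make the "moment bookkeeping" concrete, here is a small numerical sketch. Everything specific in it is an assumption for illustration: a Gaussian wave function with position spread `SIGMA`, a plain midpoint-rule integral, and the function names. It evaluates $C(\lambda)=\int|\psi(q)|^2e^{i\lambda q}\,\mathrm{d}q$, compares it with the known Gaussian result $e^{-\sigma^2\lambda^2/2}$, and recovers the second moment from $-C''(0)$.

```python
import cmath
import math

SIGMA = 0.7  # standard deviation of |psi|^2 (illustrative value)

def psi(q):
    """Gaussian wave function, normalized so that |psi|^2 integrates to 1."""
    return (2 * math.pi * SIGMA**2) ** -0.25 * math.exp(-q * q / (4 * SIGMA**2))

def density(q):
    """Pr(q) = |psi(q)|^2."""
    return abs(psi(q)) ** 2

def characteristic(lam, half_width=10.0, steps=4000):
    """Midpoint-rule estimate of C(lambda) = integral of Pr(q) e^{i lambda q} dq."""
    dq = 2 * half_width / steps
    total = 0j
    for k in range(steps):
        q = -half_width + (k + 0.5) * dq
        total += density(q) * cmath.exp(1j * lam * q) * dq
    return total

lam = 1.3
numeric = characteristic(lam)
exact = math.exp(-SIGMA**2 * lam**2 / 2)   # Gaussian characteristic function
print(numeric, exact)

# Second moment <q^2> = -C''(0), estimated with a central finite difference.
h = 1e-3
second_moment = -(characteristic(h) - 2 * characteristic(0.0)
                  + characteristic(-h)).real / h**2
print(second_moment, SIGMA**2)
```

The $n$-th derivative of $C$ at $\lambda=0$ equals $i^n\langle q^n\rangle$, which is the precise sense in which $\lambda^n$ "tracks" the $n$-th moment.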
Everything is essentially the same for the momentum operator taken alone, but if we introduce both $\hat q$ and $\hat p$, so that we consider an object such as $$\tilde W(\lambda,\mu)=\int\left<\psi\right|\delta(q-\hat q)\delta(p-\hat p)\left|\psi\right> e^{i\lambda q+i\mu p}\mathrm{d}q\mathrm{d}p= \left<\psi\right|e^{i\lambda \hat q+i\mu\hat p}\left|\psi\right>,$$ and then inverse Fourier transform this object, we obtain negative "probability densities" for some values of $p$ and $q$. This is the Wigner function, about which much has been written.
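The negativity is easy to exhibit numerically. The sketch below makes several assumptions of my own: it uses the standard position-space formula $W(q,p)=\frac{1}{\pi\hbar}\int\psi^*(q+y)\,\psi(q-y)\,e^{2ipy/\hbar}\,\mathrm{d}y$ rather than literally inverting $\tilde W$, sets $\hbar=1$, and takes the harmonic-oscillator ground and first excited states (in units with $m=\omega=1$) as test states.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = 1 (assumption for this sketch)

def psi0(q):
    """Oscillator ground state: a Gaussian."""
    return math.exp(-q * q / 2) / math.pi ** 0.25

def psi1(q):
    """First excited oscillator state: odd in q."""
    return math.sqrt(2.0) * q * math.exp(-q * q / 2) / math.pi ** 0.25

def wigner(psi, q, p, half_width=8.0, steps=4000):
    """Midpoint-rule estimate of the Wigner function W(q, p) of a pure state."""
    dy = 2 * half_width / steps
    total = 0j
    for k in range(steps):
        y = -half_width + (k + 0.5) * dy
        # float.conjugate() is a no-op for these real wave functions,
        # but keeps the formula valid if psi returns complex values.
        total += (psi(q + y).conjugate() * psi(q - y)
                  * cmath.exp(2j * p * y / HBAR) * dy)
    return (total / (math.pi * HBAR)).real

w0 = wigner(psi0, 0.0, 0.0)
w1 = wigner(psi1, 0.0, 0.0)
print(w0)  # positive, close to  1/pi
print(w1)  # negative, close to -1/pi
```

For the odd state $\psi_1$ the integrand at the origin is $\psi_1(y)\psi_1(-y)=-\psi_1(y)^2$, so $W(0,0)=-1/(\pi\hbar)$: a perfectly good quantum state whose "joint density" is negative at the origin.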