Formally, measure (resp. probability) theory requires us to work with a triple $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the space we are working on, $\mathcal{F}$ is a $\sigma$-algebra of subsets of $\Omega$, and $P$ is a (probability) measure which maps elements of $\mathcal{F}$ to numbers (between $0$ and $1$ in the probability case). We call the elements of $\mathcal{F}$ the "$\mathcal{F}$-measurable sets". For any non-trivial $\Omega$, there are many potential $\sigma$-algebras that you can use in the place of $\mathcal{F}$. As you say, one option is to take $\mathcal{F} = 2^\Omega$ (the power set of $\Omega$) as our $\sigma$-algebra. With this choice every subset of $\Omega$ is measurable. Why is that an issue? Among other things, the power set is often too big for $P$ to have nice properties. In many (most, honestly) cases, the nice properties we want $P$ to have are what really drive the probability theory, not the particulars of $\mathcal{F}$.
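To make "many potential $\sigma$-algebras" concrete, here is a minimal sketch on a finite $\Omega$, where closure under complements and pairwise unions is enough. The function name `generate_sigma_algebra` and the four-point space are illustrative assumptions, not anything from the discussion above.

```python
from itertools import combinations

def generate_sigma_algebra(omega, generators):
    """Smallest family containing the generators that is closed under
    complement and union. On a finite omega this is the sigma-algebra
    generated by the given sets."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # Close under complements.
        for a in current:
            if omega - a not in family:
                family.add(omega - a)
                changed = True
        # Close under pairwise unions.
        for a, b in combinations(current, 2):
            if a | b not in family:
                family.add(a | b)
                changed = True
    return family

omega = {1, 2, 3, 4}  # hypothetical toy sample space
trivial = generate_sigma_algebra(omega, [])
coarse = generate_sigma_algebra(omega, [{1, 2}])
power_set = generate_sigma_algebra(omega, [{1}, {2}, {3}])

print(len(trivial), len(coarse), len(power_set))  # 2 4 16
```

Already on four points there are several choices between the trivial $\sigma$-algebra and the full power set. On a finite space the power set causes no trouble; the obstruction discussed here only appears for spaces like $\mathbb{R}$.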
Stefan gives the standard non-probabilistic example of this in the comments. The Lebesgue measure, which is the natural notion of volume on the real line, is not compatible with the power set as the $\sigma$-algebra in our triple, so we need to pick a different one. The Borel $\sigma$-algebra is defined as the smallest $\sigma$-algebra containing the open intervals (which had better be measurable if we are going to define volume). Since this $\sigma$-algebra is compatible with the intuitive notion of volume, it is the smallest $\sigma$-algebra we can choose with the property that $\mu((a,b)) = b-a$ for all open intervals $(a,b)$. Why not stop here? Not all subsets of Borel sets of measure $0$ are Borel measurable, and it is often nice for the sake of theory not to have to worry about those sets. The Lebesgue $\sigma$-algebra is what you get if you additionally insist that all subsets of sets of measure zero are measurable. In this case, as in many cases, because this $\sigma$-algebra is so natural we often drop the formalism and just say that a set is "measurable" or "not measurable" on the real line, when what we really mean is that it is or is not measurable with respect to the Lebesgue $\sigma$-algebra. I believe that this last point is the source of your confusion: whatever you were reading did not say that it was referring to the Lebesgue $\sigma$-algebra.
It is neither very difficult nor entirely trivial to construct a set which is Lebesgue measurable but not Borel measurable. In general, most sets you can write down will end up being Borel. By contrast, constructing sets which are not Lebesgue measurable requires using something like the axiom of choice. Analysts are fond of saying that if you can write it down explicitly, it is Lebesgue measurable.
Let me make one quick comment about why probabilists like to use the Borel $\sigma$-algebra rather than the Lebesgue $\sigma$-algebra. For an analyst, the definition of a function being measurable is that the inverse image of every open set is measurable. Since we probabilists don't require our spaces to have topologies, this really doesn't work for us. For a probabilist, the definition of a function being measurable is that the inverse image of every measurable set is measurable. The Borel $\sigma$-algebra has the nice property that if you compose two Borel measurable functions, you get another Borel measurable function under either definition. This property fails badly for Lebesgue measurable functions with the analysts' definition of measurable.
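The probabilist's definition can be checked mechanically on finite spaces. The sketch below is only an illustration under assumed toy spaces and helper names (`preimage`, `is_measurable`). It shows why compositions behave well under that definition, but it cannot exhibit the Lebesgue counterexample, which needs the real line.

```python
def preimage(f, domain, target_set):
    """All points of the domain that f maps into target_set."""
    return frozenset(x for x in domain if f(x) in target_set)

def is_measurable(f, domain, sigma_domain, sigma_codomain):
    """Probabilist's definition: the preimage of every measurable
    set in the codomain is measurable in the domain."""
    return all(preimage(f, domain, b) in sigma_domain for b in sigma_codomain)

# Hypothetical toy spaces with explicitly listed sigma-algebras.
omega1 = frozenset({1, 2, 3, 4})
sigma1 = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), omega1}

omega2 = frozenset({"a", "b"})
sigma2 = {frozenset(), frozenset({"a"}), frozenset({"b"}), omega2}

omega3 = frozenset({0, 1})
sigma3 = {frozenset(), frozenset({0}), frozenset({1}), omega3}

f = lambda x: "a" if x in (1, 2) else "b"  # constant on the atoms of sigma1
g = lambda y: 0 if y == "a" else 1
h = lambda x: "a" if x == 1 else "b"       # splits the atom {1, 2}

f_ok = is_measurable(f, omega1, sigma1, sigma2)
g_ok = is_measurable(g, omega2, sigma2, sigma3)
gf_ok = is_measurable(lambda x: g(f(x)), omega1, sigma1, sigma3)
h_ok = is_measurable(h, omega1, sigma1, sigma2)

print(f_ok, g_ok, gf_ok, h_ok)  # True True True False
```

The composition is measurable because the preimage under $g \circ f$ is the preimage under $f$ of a preimage under $g$, and each step stays inside the relevant $\sigma$-algebra.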
Recall that $E$ is $\mu$-measurable iff
$$\forall A \subseteq X: \mu(A) = \mu(A \cap E) + \mu (A \cap E^\complement)\tag{1}$$
Dirac measure $\mu_0$: suppose $E$ is any subset of $\mathbb{R}$.
Let $A \subseteq \mathbb{R}$ be arbitrary.
If $0 \in A$, then $(1)$ reduces to $1 = \mu_0(A \cap E) + \mu_0(A \cap E^\complement)$. As $0$ lies in exactly one of $E$ and $E^\complement$, one term on the right-hand side is $1$ and the other is $0$, so their sum is $1$ too.
If $0 \notin A$, all three sets in $(1)$ have measure $0$, and $(1)$ checks out too.
As $A$ was arbitrary, $E$ is $\mu_0$-measurable.
$\mu$ the trivial $0$-$1$ measure:
Taking $E=\mathbb{R}$ or $E=\emptyset$ reduces $(1)$ to $0=0+0$ for $A=\emptyset$ and to $1=1+0$ for every other $A$. So $\emptyset$ and $\mathbb{R}$ are always $\mu$-measurable.
If $\emptyset \neq E \neq \mathbb{R}$, pick $p \in E$ and $q \notin E$ and define $A=\{p,q\}$. For this $A$, $(1)$ reduces to $1=1+1$ (all three sets are non-empty, so each has measure $1$), which is false. Ergo, $E$ is not measurable, and the measurable sets are exactly $\emptyset$ and $\mathbb{R}$.
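Both computations can be sanity-checked by brute force if we replace $\mathbb{R}$ by a small finite set. The following is a sketch under that assumption: the three-point set `X` and the function names are hypothetical stand-ins, and the check simply tests condition $(1)$ for every pair $(A, E)$.

```python
from itertools import chain, combinations

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def caratheodory_measurable(universe, outer_measure):
    """Sets E with mu(A) == mu(A & E) + mu(A - E) for every test set A."""
    all_sets = subsets(frozenset(universe))
    return [e for e in all_sets
            if all(outer_measure(a) == outer_measure(a & e) + outer_measure(a - e)
                   for a in all_sets)]

def dirac_at_zero(a):
    """Dirac outer measure: 1 if the set contains 0, else 0."""
    return 1 if 0 in a else 0

def zero_one(a):
    """Trivial 0-1 outer measure: 0 on the empty set, 1 otherwise."""
    return 1 if a else 0

X = {0, 1, 2}  # finite stand-in for the real line (an assumption)
dirac_sets = caratheodory_measurable(X, dirac_at_zero)
zero_one_sets = caratheodory_measurable(X, zero_one)

print(len(dirac_sets))                           # 8
print(sorted(sorted(s) for s in zero_one_sets))  # [[], [0, 1, 2]]
```

For the Dirac case every one of the $2^3 = 8$ subsets passes, while for the $0$-$1$ case only the empty set and the whole space survive, matching the arguments above.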
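The motivation comes from trying to define a measure on the real numbers. To be useful, it should reflect our intuitive notions about length. We are trying to find a function $\mu : \mathcal{P}(\mathbb{R})\rightarrow [0,\infty]$ that respects the following properties:
1. $\mu([a,b]) = b - a$ for every interval $[a,b]$;
2. $\mu$ is translation invariant: $\mu(A + x) = \mu(A)$ for all $A \subseteq \mathbb{R}$ and $x \in \mathbb{R}$;
3. $\mu$ is countably additive: $\mu\left(\bigcup_n A_n\right) = \sum_n \mu(A_n)$ for pairwise disjoint sets $A_1, A_2, \dots$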
It turns out that such a function cannot exist: An easy way to show this is by using a Vitali set, defined as follows. We define a relation $\sim$ on $[0,1]$ by $a \sim b$ if $a - b \in \mathbb{Q}$. It is easy to show this is an equivalence relation. Take the quotient $[0,1]/\sim$ and choose one representative for each equivalence class. Call the set of these representatives $V$.
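It is now easy to show that any two copies of $V$ translated by distinct rational numbers are disjoint, i.e. $(V + p) \cap (V + q) = \emptyset$ for any $p \neq q$, $p, q \in \mathbb{Q}$. Indeed, if $v + p = w + q$ with $v, w \in V$, then $v - w = q - p \in \mathbb{Q}$, so $v \sim w$, hence $v = w$ (as $V$ contains exactly one representative of each class) and $p = q$.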
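Thus, by property (3) (countable additivity), $$\mu\Big(\bigcup_{q \in \mathbb{Q}, |q| \le 1} (V + q)\Big) = \sum_{q \in \mathbb{Q}, |q| \le 1} \mu(V + q).$$ By property (2) (translation invariance), $\mu(V + q) = \mu(V)$ for every $q$, so $$\sum_{q \in \mathbb{Q}, |q| \le 1} \mu(V + q) = \sum_{q \in \mathbb{Q}, |q| \le 1} \mu(V).$$ However, it is easy to show that $$[0,1] \subseteq \bigcup_{q \in \mathbb{Q}, |q| \le 1} (V + q) \subseteq [-1,2]:$$ every $x \in [0,1]$ is equivalent to some $v \in V$, and then $q = x - v$ is rational with $|q| \le 1$. Since a length is nonnegative, countable additivity makes $\mu$ monotone, so property (1) applied to the intervals $[0,1]$ and $[-1,2]$ gives $$1 \le \sum_{q \in \mathbb{Q}, |q| \le 1} \mu(V) \le 3.$$ But there is no value of $\mu(V)$ for which this holds: if $\mu(V) = 0$ the sum is $0$, and if $\mu(V) > 0$ the sum has infinitely many equal positive terms and diverges.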
All this shows that a function defined on all of $\mathcal{P}(\mathbb{R})$ with the properties (1)-(3) cannot exist. Usually, one moves on from this by defining the Lebesgue outer measure and then finding "good" (measurable) sets for which these properties do hold. One then shows that these sets form a $\sigma$-algebra, which is useful for showing that certain sets are also measurable. When moving on to abstract measure spaces, one starts from the $\sigma$-algebra, having learned that one usually cannot define a good measure on the entire power set.