Too long for the comments:
You are confusing a circle with a disc. A circle in a topological context is (homeomorphic to) the boundary of a disc. A circle has empty interior, in part because it is a boundary. One can of course speak about the region of $\Bbb R^2$ that is enclosed by the circle as, in some sense, its "interior". However, when you embed a circle in say $\Bbb R^3$, this notion of interior no longer makes sense, while $S^1 \subseteq \Bbb R^3$ still has empty (topological) interior.
Furthermore, neither a circle nor a disc is both open and closed in the standard topology on $\Bbb R^2$. The circle is closed, as is the (closed) disc. You should try to convince yourself that they are not open.
There is no problem with using a metric to define a topology. And you don't need a topology to define a metric space, you just need a metric. Surely you don't need to talk about open sets to construct a function $d: X \times X \to \Bbb R$?
A topological space is not really "more fundamental," it just happens that every metric space has a topology that is compatible with the metric in some sense. This is what we should want to happen, since topological spaces are generalizations of metric spaces.
You should also bear in mind that what counts as an open set depends on which sets you choose to be open. There are many different ways of choosing those open sets, many of which are compatible with various other mathematical structures. In addition to metric spaces, if a set has a linear order, then we can use that order to define a topology on the set in a particular way. When we look at algebraic objects, like groups, it might be possible to find a topology on the underlying set such that the group operation and the map $g \mapsto g^{-1}$ are continuous. (This is called a topological group.)
In fact, $\Bbb R$ with the standard topology is an example of all of these. It is a metric space, linearly ordered, and is a group under addition. Just to be clear, the standard topology on $\Bbb R$ is generated by the Euclidean metric, but can also be generated by the linear order $<$, and makes addition continuous. This single, easily defined topology allows us to use three (actually more) extremely useful objects when discussing the topology of $\Bbb R$.
Remember that a topology on a point-set $X$ is just a collection of subsets of $X$ such that the collection satisfies certain properties. Look at the abstract definition of a topology. It only requires three axioms:
- $\emptyset$ and $X$ are in the collection.
- The collection is closed under arbitrary unions.
- The collection is closed under finite intersections.
So that it is easier to talk about members of the collection, we call them open sets.
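On a finite point-set the three axioms can be checked by brute force. The following is only a minimal sketch: the function name `is_topology` and the example collections are made up for illustration. It relies on the fact that a finite collection is closed under arbitrary unions and finite intersections as soon as it is closed under pairwise ones.

```python
from itertools import combinations


def is_topology(points, opens):
    """Check the three topology axioms for a collection of subsets of a finite set."""
    points = frozenset(points)
    opens = {frozenset(u) for u in opens}
    # Every member of the collection must be a subset of the point-set.
    if any(not u <= points for u in opens):
        return False
    # Axiom 1: the empty set and the whole set belong to the collection.
    if frozenset() not in opens or points not in opens:
        return False
    # Axioms 2 and 3: for a finite collection, pairwise closure is enough.
    for u, v in combinations(opens, 2):
        if (u | v) not in opens or (u & v) not in opens:
            return False
    return True


X = {1, 2, 3}
trivial = [set(), X]
discrete = [set(), {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, X]
not_closed_under_unions = [set(), {1}, {2}, X]  # the union {1, 2} is missing

print(is_topology(X, trivial))
print(is_topology(X, discrete))
print(is_topology(X, not_closed_under_unions))
```

Note that the check never asks what the elements of `X` are, only how the chosen subsets interact.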
Observe that nowhere in the definition of a topology do we say anything about what the open sets themselves look like. This is somewhat analogous to how the axioms of a group make no mention of what the group elements are, only how they interact.
The advantage of this flexibility is that it is easy to put different kinds of topologies on an arbitrary point-set. The downside is that we often have too many choices, and the easy ones (the discrete and trivial topologies) are rarely useful or what we want (but sometimes they are!).
Let's say you have a point-set $X$ and you want to put a topology on it. Chances are that you have some additional structure on $X$ and you want to use topological tools to study it. In that case, whatever topology you pick needs to be compatible with that structure in some sense that will vary a lot between structures.
The prototypical example is the metric topology. Suppose $(X,d)$ is a metric space. From analysis we have a definition of continuity on $X$, but we also have a different definition for arbitrary topological spaces:
- Analytic Continuity: A function $f: X \to Y$ is continuous at $x \in X$ if for all $\epsilon > 0$, there exists a $\delta > 0$ such that for all $y \in X$, if $d_X(x,y) < \delta$, then $d_Y(f(x),f(y)) < \epsilon$.
- Topological Continuity: A function $f: X \to Y$ is continuous if for all open sets $U \subseteq Y$, the preimage $f^{-1}(U)$ is open in $X$.
Whatever choice we make of topology on $X$, we want these two definitions to be equivalent, so that we can combine the analytic and topological machinery.
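For reference, the standard choice that achieves this is the metric topology, in which a set is open exactly when it contains a small ball around each of its points. The ball notation $B_d(x,r)$ is introduced here only for this statement and is not used elsewhere in the answer:

$$U \subseteq X \text{ is open} \iff \text{for every } x \in U \text{ there is an } r > 0 \text{ with } B_d(x,r) \subseteq U, \qquad B_d(x,r) = \{y \in X : d(x,y) < r\}.$$

With this choice on both $X$ and $Y$, a function is topologically continuous precisely when it is analytically continuous at every point of $X$.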
The take-away of all this is that when you just have a point-set and nothing else, it's not clear what the open sets should be or why you would even need these open set doohickeys. In fact, when we make purely abstract point-set topology arguments, it doesn't matter what an open set is, since we are reasoning about them abstractly. Once we start looking at (slightly) more concrete mathematical structures, like say metric spaces or groups, we can no longer reason about open sets abstractly, since there are certain compatibility properties we would like them to satisfy. Now we need to single out a particular collection of subsets that we would like to call the open sets and show that it forms a topology.
The other sensible candidate for the definition of a continuous function is what we call an open function:
- Open Function: A function $f: X \to Y$ is called open if for all open sets $U \subseteq X$, the image $f(U)$ is open in $Y$.
So why don't we call these functions continuous?
The short (unhelpful) answer is because the preimage definition is what works. To give a precise mathematical reason is not easy, because essentially our choice was based on what got us what we wanted. Any good definition of continuity ought to formalize our intuitive notions of continuity and the preimage definition does that better than the image definition.
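It may help to see that the two candidate definitions really do disagree. Here is a small sketch on a two-point set, where the identity map is open but not continuous in one direction and continuous but not open in the other. All names (`is_continuous`, `is_open_map`, and so on) are illustrative choices, not standard library functions.

```python
def preimage(f, domain, subset):
    """Points of the domain that f sends into `subset` (f is a dict)."""
    return frozenset(x for x in domain if f[x] in subset)


def image(f, subset):
    """Set of values f takes on `subset` (f is a dict)."""
    return frozenset(f[x] for x in subset)


def is_continuous(f, domain, domain_opens, codomain_opens):
    """Preimage definition: preimages of open sets are open."""
    return all(preimage(f, domain, u) in domain_opens for u in codomain_opens)


def is_open_map(f, domain_opens, codomain_opens):
    """Image definition: images of open sets are open."""
    return all(image(f, u) in codomain_opens for u in domain_opens)


points = frozenset({0, 1})
trivial = {frozenset(), points}
discrete = {frozenset(), frozenset({0}), frozenset({1}), points}
identity = {0: 0, 1: 1}

# Identity from the trivial topology to the discrete topology.
print(is_open_map(identity, trivial, discrete))
print(is_continuous(identity, points, trivial, discrete))

# Identity from the discrete topology to the trivial topology.
print(is_open_map(identity, discrete, trivial))
print(is_continuous(identity, points, discrete, trivial))
```

So neither property implies the other, and picking one of them as "continuity" is a genuine choice.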
It is also worth pointing out that preimages play much nicer with set operations than do images. To wit:
$f^{-1}(A \cap B) = f^{-1}(A) \cap f^{-1}(B)$, whereas $f(A \cap B) \subseteq f(A) \cap f(B)$ and
$f^{-1}(A \setminus B) = f^{-1}(A) \setminus f^{-1}(B)$, whereas $f(A \setminus B) \supseteq f(A) \setminus f(B)$.
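The inclusions for images can be strict. A quick sketch with the squaring map on $\{-1, 0, 1\}$ shows this; the helper names `image` and `preimage` and the particular sets are chosen here just for illustration.

```python
def image(f, subset):
    """Set of values f takes on `subset`."""
    return {f(x) for x in subset}


def preimage(f, domain, subset):
    """Points of `domain` that f sends into `subset`."""
    return {x for x in domain if f(x) in subset}


def square(x):
    return x * x


domain = {-1, 0, 1}
A, B = {-1}, {1}   # subsets of the domain
C, D = {0, 1}, {1}  # subsets of the codomain

# Images: both inclusions are strict for this choice of A and B.
print(image(square, A & B), image(square, A) & image(square, B))
print(image(square, A - B), image(square, A) - image(square, B))

# Preimages: equality holds for intersections and differences.
print(preimage(square, domain, C & D)
      == preimage(square, domain, C) & preimage(square, domain, D))
print(preimage(square, domain, C - D)
      == preimage(square, domain, C) - preimage(square, domain, D))
```

The failure for images comes from $f$ not being injective: $-1$ and $1$ are different points with the same square.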
And since whichever one we choose for the definition will be used a lot with sets and set operations, from a practical standpoint we kind of want the preimage to be the right definition.
In fact, in the early 20th century, when topology was being "invented" or "discovered" as a separate mathematical branch, there were several approaches to defining topologies in the first place, as a set with some extra structure. Kuratowski used axioms for closure (as did Čech) and also demanded that $\overline{\{x\}} = \{x\}$ in some of his texts (so $T_1$ was assumed throughout). Fréchet used convergence notions (of sequences) and also assumed that constant sequences had unique limits (so $T_1$ too). Hausdorff used an axiom system based on neighbourhood systems and assumed as one of the axioms that any two distinct points have disjoint neighbourhoods, the axiom that was later named after him. So in the early days people "sneaked in" low separation axioms as part of their definitions, mostly for convenience. Later the open sets/closed sets axioms developed and were quite generally found to be convenient (and similar to other structures being developed at that time, like $\sigma$-algebras etc.) and people proved the general "equivalence" of many of these approaches. At that time the separation axioms were formulated as separate assumptions.
The system with bare-bones axioms and simple extra assumptions (Trennungsaxiome like $T_0, T_1, T_2$ etc.) won out. That way we can easily tell which results need which extra assumptions etc.
And of course later many applications of non-Hausdorff spaces were found too, which helps.
Best Answer
I like to think of topological spaces as defining "semidecidable properties". Let me explain.
Imagine I have an object that I think weighs about one kilogram. Suppose that, as a matter of fact, the object weighs less than one kilogram. Then I can, using a sufficiently accurate scale, determine that the object weighs less than one kilogram. Even if the object weighs, say, 0.9999996 kilograms, all I need to do is find a scale that's accurate to within, say, 0.0000002 kilograms, and that scale will be able to tell me that the object weighs less than one kilogram.
This means that "weighing less than one kilogram" is a semidecidable property: if an object has the property, then I can determine that it has the property.
Suppose, on the other hand, that the object actually weighs exactly one kilogram. There's no way I can measure the object and determine that it weighs exactly one kilogram, because no matter how precisely I measure it, it's still possible that there's some amount of error which I haven't discovered yet. So "weighing exactly one kilogram" is not a semidecidable property.
What does this have to do with topological spaces? Well, an open set in a topological space corresponds to a semidecidable property of that space. This is why in the topological space of real numbers, the set $\{x : x \in \mathbb{R}, x < 1\}$ is an open set, but the set $\{x : x \in \mathbb{R}, x = 1\}$ is not.
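The weighing procedure can be sketched as a loop that keeps halving the measurement error. This is an idealized model under stated assumptions: the function name, the use of exact fractions, and the `max_rounds` cutoff (needed only so the program terminates) are all choices made for this illustration, and the "scale" is assumed to report the true weight up to a known error bound.

```python
from fractions import Fraction


def weighs_less_than_one(true_weight, max_rounds):
    """Return True once some measurement proves weight < 1, else None.

    Round n uses an idealized scale whose reading is guaranteed to be
    within 2**-n of the true weight, so reading + error is a certified
    upper bound on the weight.
    """
    for n in range(1, max_rounds + 1):
        error = Fraction(1, 2 ** n)
        reading = true_weight  # idealized: the reading sits at the true value
        if reading + error < 1:
            return True  # the property has been verified
    return None  # no verdict yet; more rounds might or might not help


print(weighs_less_than_one(Fraction(9999996, 10000000), max_rounds=40))
print(weighs_less_than_one(Fraction(1), max_rounds=40))
```

For any weight below one kilogram some round eventually succeeds, while for a weight of exactly one kilogram no number of rounds ever produces a verdict. That asymmetry is what "semidecidable" means here.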
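So, consider the "topological space" $X = \{a, b, c\}$ with open sets $\emptyset$, $\{a, b\}$, $\{b, c\}$, and $\{a, b, c\}$. In this "topological space", you are asserting that
- "being either $a$ or $b$" is a semidecidable property (since $\{a, b\}$ is open),
- "being either $b$ or $c$" is a semidecidable property (since $\{b, c\}$ is open), and
- "being $b$" is not a semidecidable property (since $\{b\}$ is not open).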
However, these assertions contradict each other. Suppose that you have the point $b$. Because of the first bullet point, there is some measurement you can make which will tell you that the point is either $a$ or $b$. And because of the second bullet point, there is another measurement you can make which will tell you that the point is either $b$ or $c$. If you simply make both of these two measurements, then you will have successfully determined that the point is (either $a$ or $b$, and either $b$ or $c$)—in other words, that the point is $b$. But the third bullet point asserts that this is impossible!
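The same clash can be seen mechanically: making two measurements corresponds to intersecting two open sets, and the collection above is not closed under that operation. A minimal sketch, with the helper name `missing_intersections` invented for this example:

```python
from itertools import combinations

X = frozenset("abc")
claimed_open = {frozenset(), frozenset("ab"), frozenset("bc"), X}


def missing_intersections(collection):
    """Pairwise intersections that are not themselves in the collection."""
    pairwise = {u & v for u, v in combinations(collection, 2)}
    return pairwise - set(collection)


missing = missing_intersections(claimed_open)
print([sorted(s) for s in missing])
```

The only missing set is $\{b\}$, which is exactly the property that the two measurements together would decide.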
For more explanation of this idea, see these two answers: