You are certainly not alone in wondering about this! I should ask: in what sense do you mean the question?
a) If your question -- "can a specific moment in time really have a rate of change?" -- is directed towards the physical world, and the words "time" or "moment" are to be taken as referring to those things from our daily experience, then I'd tell you not to forget what math does: it doesn't constitute the real world, it just models it.
Perhaps the space we live in is actually discrete; i.e. if you zoom in close enough, our world is made of atomic "cells", just like a Minecraft world. Suppose each cell is a cube $1.6 \times 10^{-45}$ meters (ten orders of magnitude below the Planck length) on an edge. We don't know whether this hypothesis is true: what experiment would disprove it? If it were true, then some things we learn in math about the real numbers (e.g. the idea the limit is based on: that for any positive number you name, I can always name a smaller one*) would be "wrong" for talking about objects on that size scale.
But it would still work just as well, as an approximation, for the things we currently use calculus for -- e.g. to calculate where to aim our spaceships. The rocket equations themselves are never going to fit the situation exactly (have you accounted for that dust particle? and that one?), and the numbers we put into them are never going to be measured precisely anyway.
A model cannot be judged right or wrong in itself; only the application of a model to a real-world situation can be judged, and then only in grades -- more appropriate or less appropriate. If speed comes in discrete chunks, then there may be no moment at which the volleyball, whose arc is described by $y = -x^2$, is ever moving at the $-4$ meters/second that calculus predicts at $x = 2$. Or maybe speed is continuous, and there is such a moment.
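To spell out where that $-4$ comes from (a quick side computation; I'm reading $x$ as time in seconds so the units work out):
$$\frac{dy}{dx} = \frac{d}{dx}\left(-x^2\right) = -2x, \qquad \left.\frac{dy}{dx}\right|_{x=2} = -4 \text{ meters/second}.$$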
There's no way, even in principle, to tell, so we stick with the model we've got and change it only when it predicts the real world incorrectly.
b) But to be less high-minded: it's helpful to have several ways to think about these things (and don't let anyone, including me, convince you that you have to think about it only their way).
As others have said, the derivative of a function $f(x)$ is a function $f'(x)$ which gives you the slope of the tangent line at $x$. If you believe that there can be a tangent line at a single point, then you can just think of that when others say "instantaneous rate of change".
*Here's the technical definition of a limit (ripped from Wikipedia), in case it helps. The statement
$$
\lim_{x \rightarrow 0} f(x) = L
$$
means that you can make $f(x)$ as close to $L$ as you like by making $x$ sufficiently close to $0$. With variables, that's:
For every $\epsilon > 0$, there exists a $\delta > 0$ such that if $0 < |x| < \delta$, then $|f(x) - L| < \epsilon$.
You can see how this would not work if there were a smallest positive real number -- then if I choose $\epsilon$ equal to that number, how are you going to make $|f(x) - L|$ smaller than it?
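In case it helps to see the definition in action, here's a tiny worked example (with a function I've picked purely for illustration): to show $\lim_{x\to 0} 3x = 0$, given any $\epsilon > 0$ I choose $\delta = \epsilon/3$. Then
$$0 < |x| < \delta \implies |3x - 0| = 3|x| < 3\cdot\frac{\epsilon}{3} = \epsilon.$$
Whatever $\epsilon$ you hand me, I can always answer with a $\delta$, precisely because there is no smallest positive real number to get stuck on.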
Note that I will confine my discussion to functions $f: D\subseteq \Bbb R \to \Bbb R$ and to ideas that should be simple enough for anyone who's taken a course in single-variable calculus to understand. Let me know if I haven't succeeded in some way.
First, it'll be convenient for us to define a new notation. It's called "little oh" notation.
Definition: A function $f$ is called little oh of $g$ as $x\to a$, denoted $f\in o(g)$ as $x\to a$, if
$$\lim_{x\to a}\frac {f(x)}{g(x)}=0$$
Intuitively this means that $f(x)$ becomes negligible compared to $g(x)$ as $x\to a$; in the common case where both tend to $0$, it says that $f$ goes to $0$ "faster" than $g$ does.
Here are some examples (a quick verification of the $x-\sin(x)$ ones follows the list):
- $x\in o(1)$ as $x\to 0$
- $x^2 \in o(x)$ as $x\to 0$
- $x\in o(x^2)$ as $x\to \infty$
- $x-\sin(x)\in o(x)$ as $x\to 0$
- $x-\sin(x)\in o(x^2)$ as $x\to 0$
- $x-\sin(x)\not\in o(x^3)$ as $x\to 0$
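To back up the three $x-\sin(x)$ claims, here's a verification via the Maclaurin series of $\sin$ (the standard expansion): since $\sin(x) = x - \frac{x^3}{6} + \frac{x^5}{120} - \cdots$, we get
$$x - \sin(x) = \frac{x^3}{6} - \frac{x^5}{120} + \cdots$$
so as $x\to 0$ the ratios $\frac{x-\sin(x)}{x}$ and $\frac{x-\sin(x)}{x^2}$ both tend to $0$, while $\frac{x-\sin(x)}{x^3} \to \frac{1}{6} \ne 0$.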
Now what is an affine approximation? (Note: I prefer to call it affine rather than linear -- if you've taken linear algebra then you'll know why.) It is simply a function $T(x) = A + Bx$ that approximates the function in question.
Intuitively it should be clear which affine function should best approximate the function $f$ very near $a$. It should be $$L(x) = f(a) + f'(a)(x-a).$$ Why? Well consider that any affine function really only carries two pieces of information: slope and some point on the line. The function $L$ as I've defined it has the properties $L(a)=f(a)$ and $L'(a)=f'(a)$. Thus $L$ is the unique line which passes through the point $(a,f(a))$ and has the slope $f'(a)$.
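For a concrete instance (my own choice of function, just to illustrate): take $f(x) = \sqrt{x}$ and $a = 4$. Then $f(4) = 2$ and $f'(4) = \frac{1}{2\sqrt{4}} = \frac{1}{4}$, so
$$L(x) = 2 + \tfrac{1}{4}(x-4), \qquad L(4.1) = 2.025 \approx \sqrt{4.1} = 2.0248\ldots$$
which is exactly the familiar tangent-line approximation.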
But we can be a little more rigorous. Below I give a lemma and a theorem that tell us that $L(x) = f(a) + f'(a)(x-a)$ is the best affine approximation of the function $f$ at $a$.
Lemma: If a differentiable function $f$ can be written, for all $x$ in some neighborhood of $a$, as $$f(x) = A + B\cdot(x-a) + R(x-a)$$ where $A, B$ are constants and $R\in o(x-a)$ as $x\to a$, then $A=f(a)$ and $B=f'(a)$.
Proof: First notice that because $f$, $A$, and $B\cdot(x-a)$ are continuous at $x=a$, $R$ must be too. Moreover $R(x-a) = (x-a)\cdot\frac{R(x-a)}{x-a} \to 0$ as $x\to a$, so by continuity $R(0) = 0$. Then setting $x=a$ we immediately see that $f(a)=A$.
Then, rearranging the equation we get (for all $x\ne a$)
$$\frac{f(x)-f(a)}{x-a} = \frac{f(x)-A}{x-a} = \frac{B\cdot (x-a)+R(x-a)}{x-a} = B + \frac{R(x-a)}{x-a}$$
Then taking the limit as $x\to a$ we see that $B=f'(a)$. $\ \ \ \square$
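To see the lemma at work on a specific function (again an example of my own choosing): write $f(x) = x^2$ around $a = 1$. Expanding,
$$x^2 = 1 + 2(x-1) + (x-1)^2,$$
so here $A = 1$, $B = 2$, and $R(x-1) = (x-1)^2$, which is in $o(x-1)$ because $\frac{(x-1)^2}{x-1} = x-1 \to 0$. The lemma then forces $A = f(1) = 1$ and $B = f'(1) = 2$, matching direct computation.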
Theorem: A function $f$ is differentiable at $a$ iff, for all $x$ in some neighborhood of $a$, $f(x)$ can be written as
$$f(x) = f(a) + B\cdot(x-a) + R(x-a)$$ where $B \in \Bbb R$ and $R\in o(x-a)$ as $x\to a$.
Proof: "$\implies$": If $f$ is differentiable then $f'(a) = \lim_{x\to a} \frac{f(x)-f(a)}{x-a}$ exists. This can alternatively be written $$f'(a) = \frac{f(x)-f(a)}{x-a} + r(x-a)$$ where the "remainder function" $r$ has the property $\lim_{x \to a} r(x-a)=0$. Rearranging this equation we get $$f(x) = f(a) + f'(a)(x-a) -r(x-a)(x-a).$$ Let $R(x-a):= -r(x-a)(x-a)$. Then clearly $R\in o(x-a)$ (confirm this for yourself). So $$f(x) = f(a) + f'(a)(x-a) + R(x-a)$$ as required.
"$\impliedby$": Simple rearrangement of this equation yields
$$B + \frac{R(x-a)}{x-a}= \frac{f(x)-f(a)}{x-a}.$$ The limit as $x\to a$ of the LHS exists (it equals $B$, since $R\in o(x-a)$), and thus the limit of the RHS exists as well. This means $f$ is differentiable at $a$ by the standard definition of differentiability, with $f'(a) = B$. $\ \ \ \square$
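One payoff of the theorem (illustrated on an example I've chosen) is that it lets you read off a derivative without ever forming the difference quotient. For $f(x) = x^3$, expand about $a$:
$$x^3 = a^3 + 3a^2(x-a) + \underbrace{3a(x-a)^2 + (x-a)^3}_{R(x-a)\ \in\ o(x-a)}$$
so by the theorem $f$ is differentiable at $a$, and by the lemma $f'(a) = 3a^2$.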
Taken together, the above lemma and theorem tell us not only that $L(x) = f(a) + f'(a)(x-a)$ is the only affine function whose remainder tends to $0$ faster than $x-a$ itself as $x\to a$ (this is the sense in which the approximation is the best), but also that we can even define the concept of differentiability by the existence of this best affine approximation.
Best Answer
I think this comes down mainly to a conceptual issue. Imagine that you have a way of approximating the behavior of something, call it a function, and that your approximation gets more and more accurate the closer together the points you sample are. If the points were infinitely close, the approximation would be infinitely accurate; it would cease to be an approximation and become exact.
We know the slope of the line between two points, $(x_1, y_1)$ and $(x_2, y_2)$, is $\displaystyle \frac{y_2 - y_1}{x_2 - x_1}$, which can also be viewed as the average rate of change of the function between those two points (or, the approximation of the function's rate of change between those two points). The closer these two points are to one another, the more accurate your approximation will be.
Let's take two points on an arbitrary function. We'll call these points $(x, f(x))$ and $(x + \delta{x}, f(x + \delta{x}))$, where $\delta{x}$ is defined to be an infinitely small quantity (an infinitesimal). That is, the two points are infinitely close to one another because their inputs differ by an infinitely small amount.
The derivative is defined as the slope of the line "between" these two infinitely close points. That is... $$\frac{df}{dx} = \frac{f(x + \delta{x}) - f(x)}{x +\delta{x} - x} = \frac{f(x + \delta{x}) - f(x)}{\delta{x}}$$ ... or, the more usual... $$\frac{df}{dx} = \lim_{\Delta{x} \rightarrow 0}\frac{f(x + \Delta{x}) - f(x)}{\Delta{x}}$$ (That is, as $\Delta{x}$ moves infinitely close to zero, becoming infinitely small).
Since these points are infinitely close, your approximation of the rate-of-change of the function becomes infinitely more accurate—or, in other words, it becomes exact.
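To make that concrete with a standard example (not from the original question): take $f(x) = x^2$. Then
$$\frac{f(x+\Delta{x}) - f(x)}{\Delta{x}} = \frac{(x+\Delta{x})^2 - x^2}{\Delta{x}} = \frac{2x\,\Delta{x} + (\Delta{x})^2}{\Delta{x}} = 2x + \Delta{x} \to 2x \quad\text{as } \Delta{x}\to 0.$$
The approximation $2x + \Delta{x}$ misses the exact slope $2x$ by exactly $\Delta{x}$, so as the two points merge, the error vanishes and the rate of change becomes exact: $\frac{df}{dx} = 2x$.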