The set of possible outcomes is now the set of pairs (what happens today, what happens tomorrow). What you want to count is the number of such pairs in which the event happens. I think an example will make this clearer.
Experiment: roll a 6-sided die. Number of outcomes: 6. Probability of rolling a $1$: 1/6.
Double experiment: roll today, roll tomorrow. Number of outcomes is now $6\cdot 6 = 36$, corresponding to the 36 pairs of outcomes (1,1), (1,2), ..., (1,6), (2,1), (2,2), ..., (2,6), ..., (6,1), (6,2), ..., (6,6). Probability of rolling a $1$ today only is still 1/6; but now I could see this as 6/36, because the total number of outcomes, taking both days into account, in which I roll a $1$ today, is 6, since I could roll any of 6 different things tomorrow: (1,1), (1,2), ..., (1,6). Likewise, there are 6 outcomes (taking both days into account) in which I roll a $1$ tomorrow, since I could roll anything today: (1,1), (2,1), ..., (6,1).
At first glance, this makes it seem that the number of outcomes in which I roll a $1$ on either day is 6+6=12, but actually it is 11, because I have to be careful not to double count the outcome (1,1), in which I roll a $1$ today and tomorrow. There are really only 11 outcomes in which I roll a $1$ on one or both days: the five in which I roll it today and not tomorrow, the five in which I roll it tomorrow and not today, and the one in which I roll it on both days.
Thus, the total probability of rolling a 1 on either day is 11/36.
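This count is small enough to check by direct enumeration; here is a quick sketch in Python:

```python
from itertools import product
from fractions import Fraction

# Enumerate all 36 (today, tomorrow) pairs of die rolls.
outcomes = list(product(range(1, 7), repeat=2))
assert len(outcomes) == 36

# Count the pairs in which a 1 appears on at least one day.
hits = [p for p in outcomes if 1 in p]
print(len(hits))                           # 11 favorable outcomes
print(Fraction(len(hits), len(outcomes)))  # 11/36
```

The filter `1 in p` automatically avoids double counting (1,1), which is why the result is 11 rather than 12.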
Similarly, in your example, if there is a 1/1000 chance of something happening in the experiment, and I try today and tomorrow, then out of the $1000\times 1000$ possible pairs of outcomes, in 999 the event happens today and not tomorrow; in 999 it happens tomorrow and not today; and in 1 case it happens on both days. The total probability of the event taking place is thus
$$\frac{999+1+999}{1000\cdot 1000} = \frac{1,999}{1,000,000}$$
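The same probability can be cross-checked two ways: by the direct count above, and by the complement rule $1-(999/1000)^2$ (the probability that the event happens on neither day). A sketch, with the 1/1000 per-day chance taken from the example:

```python
from fractions import Fraction

p = Fraction(1, 1000)  # per-day chance of the event

# Direct count over the 1000*1000 pairs of daily outcomes:
# 999 (today only) + 999 (tomorrow only) + 1 (both days).
direct = Fraction(999 + 999 + 1, 1000 * 1000)

# Complement rule: 1 minus the probability it happens on neither day.
complement = 1 - (1 - p) ** 2

print(direct)  # 1999/1000000
assert direct == complement
```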
Does that clarify?
If you roll a fair die, there are six possible outcomes, $1,2,3,4,5,6$, which are equally likely. The average of these six numbers is $(1+2+3+4+5+6)/6 = 7/2$. We might say that "on average", if you roll a die, the outcome should be $7/2$. That is of course absurd for a single die roll, but it becomes increasingly true of the sample mean (i.e., the sum of the rolled values, divided by the number of rolls) as we perform more rolls.
To express this mathematically, suppose we roll the die $N$ times, call the $n$th outcome $x_n$ (so $x_n$ is one of $1,2,3,4,5,6$), and compute the sample mean of the $N$ resulting numbers:
$$\frac{1}{N}\sum_{n=1}^{N}x_n$$
we expect the result to be near $3.5$ if $N$ is large.
This can be quantified more precisely by the law of large numbers, which says (stated informally) that the sample mean is increasingly likely to be close to the expected value, as $N$ grows large, and in fact the probability that the sample mean differs from the expected value approaches zero as $N \to \infty$.
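A small simulation illustrates this (the seed is fixed so the run is reproducible; the particular values of $N$ are arbitrary choices):

```python
import random

random.seed(42)  # fixed seed for a reproducible run

def sample_mean(n_rolls):
    """Roll a fair die n_rolls times and return the sample mean."""
    return sum(random.randint(1, 6) for _ in range(n_rolls)) / n_rolls

# The sample mean tends toward the expected value 3.5 as N grows.
for n in (10, 1000, 100000):
    print(n, sample_mean(n))
```

For small $N$ the sample mean can be far from $3.5$; for large $N$ it is very unlikely to be.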
Edited to respond to the comment by the OP:
Let's consider rolling the die $2$ times. There are $36$ possible outcomes, as follows:
$$\begin{aligned}
(1,1)\qquad(2,1)\qquad(3,1)\qquad(4,1)\qquad(5,1)\qquad(6,1) \\
(1,2)\qquad(2,2)\qquad(3,2)\qquad(4,2)\qquad(5,2)\qquad(6,2) \\
(1,3)\qquad(2,3)\qquad(3,3)\qquad(4,3)\qquad(5,3)\qquad(6,3) \\
(1,4)\qquad(2,4)\qquad(3,4)\qquad(4,4)\qquad(5,4)\qquad(6,4) \\
(1,5)\qquad(2,5)\qquad(3,5)\qquad(4,5)\qquad(5,5)\qquad(6,5) \\
(1,6)\qquad(2,6)\qquad(3,6)\qquad(4,6)\qquad(5,6)\qquad(6,6) \\
\end{aligned}$$
Each outcome is equally likely, with probability $1/36$. Now let's look at the sample mean of the two rolls in each case. So for example, for the first outcome, both rolls were $1$, so the sample mean is $(1+1)/2 = 1$. Computing the sample mean for all $36$ outcomes:
$$\begin{aligned}
1\qquad 1.5\qquad 2\qquad 2.5\qquad 3\qquad 3.5 \\
1.5\qquad 2\qquad 2.5\qquad 3\qquad 3.5\qquad 4 \\
2\qquad 2.5\qquad 3\qquad 3.5\qquad 4\qquad 4.5 \\
2.5\qquad 3\qquad 3.5\qquad 4\qquad 4.5\qquad 5 \\
3\qquad 3.5\qquad 4\qquad 4.5\qquad 5\qquad 5.5 \\
3.5\qquad 4\qquad 4.5\qquad 5\qquad 5.5\qquad 6 \\
\end{aligned}$$
Notice that some sample mean values are more likely than others.
For example, there is only one outcome, namely $(1,1)$, which results in a sample mean of $1$, and similarly, only $(6,6)$ gives a sample mean of $6$. So the probability of observing a sample mean of $1$ is only $1/36$, and similarly for a sample mean of $6$.
On the other hand, there are three outcomes that give a sample mean of $2$, namely $(3,1)$, $(2,2)$, and $(1,3)$. So the probability of observing a sample mean of $2$ is $3/36 = 1/12$.
Looking at the second table, the most likely observed sample mean is $3.5$. It occurs once in each row, for a total of $6$ out of $36$, so the probability that the sample mean is $3.5$ is $1/6$.
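The counts in these tables can be tallied mechanically; a sketch:

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# Tally the sample mean of every (roll1, roll2) pair.
means = Counter((a + b) / 2 for a, b in product(range(1, 7), repeat=2))

print(means[1.0])  # 1 outcome: (1,1)
print(means[2.0])  # 3 outcomes: (1,3), (2,2), (3,1)
print(means[3.5])  # 6 outcomes, the most likely sample mean
print(Fraction(means[3.5], 36))  # probability 1/6
```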
If we were to repeat the experiment with more rolls, we would see that a higher percentage of outcomes would result in sample means near $3.5$, and this percentage would grow closer and closer to $100\%$ as the number of rolls grows larger and larger.
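Exact enumeration for two versus four rolls makes this concentration visible (the window $|\text{mean} - 3.5| \le 0.5$ is an arbitrary illustrative choice):

```python
from itertools import product

def frac_near(n_rolls, center=3.5, window=0.5):
    """Fraction of equally likely outcomes whose sample mean
    falls within `window` of `center`, by exact enumeration."""
    outcomes = list(product(range(1, 7), repeat=n_rolls))
    near = sum(1 for o in outcomes
               if abs(sum(o) / n_rolls - center) <= window)
    return near / len(outcomes)

print(frac_near(2))  # 16/36, about 0.444
print(frac_near(4))  # a larger fraction: the mean concentrates near 3.5
assert frac_near(4) > frac_near(2)
```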
The difference between the classical definition and the empirical one is similar to the difference between theory and experiment in physics. The theory is developed in an abstract (ideal) way, while the experiments are practical observations. The same happens with probability.
In the classical definition, you assign a probability to an event based on abstract reasoning. For example: what is the probability of getting the result $2$ when performing the calculation $\frac{2x}{2}$, where $x$ is one of $1,2,\dots,100$? There are $n=100$ possible outcomes, but the result $2$ occurs only once, so $m=1$: since $\frac{2x}{2}=\frac{2}{2}x=x$, only $x=2$ gives the result $2$. In this case we say that the probability is $1/100$. So, in classical probability, you think about the space of outcomes and find an abstract reason to assign the probability (here we used mathematical reasoning to determine the number of favorable cases and the total number of outcomes).
In the empirical definition, on the other hand, you don't reason abstractly; you just do experiments and count. So, to solve the same problem, you would perform as many of the $100$ possible calculations as you can and count how many times you get $2$. For example, if you perform the experiment on the first $10$ numbers ($N=10$), you get the result $2$ only once ($N(A)=1$), so your estimate for the probability is $\frac{1}{10}$. This is not the true probability, but the more experiments you do, the better the estimate becomes: the closer you get to exhausting the possible outcomes, the closer you are to the true probability.
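The two definitions can be sketched side by side for this example (treating each value of $x$ as one equally likely outcome, which is the assumption implicit above):

```python
# Classical: count favorable cases among the n = 100 equally likely values of x.
n = 100
m = sum(1 for x in range(1, n + 1) if 2 * x / 2 == 2)
classical = m / n
print(classical)  # 0.01, i.e. 1/100

# Empirical: try only the first N = 10 values and count successes.
N = 10
N_A = sum(1 for x in range(1, N + 1) if 2 * x / 2 == 2)
empirical = N_A / N
print(empirical)  # 0.1 -- a rough estimate that improves with more trials
```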
Now, everything is fine when the possible outcomes are finite in number. The classical approach gives the exact result but may require complex reasoning, while the empirical approach effortlessly gives an estimate that improves with the number of "measurements/experiments".
What about when there are infinitely many outcomes? For example: what is the probability of selecting the number $6$ from a box containing all the natural numbers from $1$ to $100$? What if the box contains the numbers up to $n=10{,}000$, or up to some astronomically large $n$? The classical definition has an answer: since $6$ appears only once, the probabilities are $\frac{1}{100}$, $\frac{1}{10{,}000}$, and $\frac{1}{n}$. For very large $n$ the probability is almost zero, which is the case "$P(A)=\lim_{n\rightarrow \infty}\frac{m}{n}=0$".
The empirical definition will never give a good answer to this question, since it can never exhaust the possible outcomes. If in $N$ tries the experimenter never selects the number $6$, the estimate is indeed $\frac{N(A)}{N}=\frac{0}{N}=0$, but this result is "correct" only by chance. If instead she selects $6$ early in the experiment, the estimate is $\frac{N(A)}{N}=\frac{1}{N}$, and she will approach the true value only after a tremendous number of experiments. Note that it is never the empiricist's goal to reach infinity ($N\rightarrow\infty$): she always works with finite samples, and she seeks not perfect knowledge but useful approximations.
Another example: what is the probability of tails when flipping a fair coin? The classical approach argues that the probability of tails in one flip is $\frac{m}{n}=\frac{1}{2}$, because there are only two possible outcomes and tails is one of them. The empiricist performs $N$ experiments, counts how many times $A=$ tails occurs, and reports $\frac{N(\text{tails})}{N}$. This always yields a definite number, since $N$ is always a finite number of experiments.
Setting aside whether the empiricist ever wants to reach infinity, by the law of large numbers the average result of a large number of experiments gets closer and closer to the expected value of the phenomenon studied. By this law, $\lim_{N\rightarrow\infty}\frac{N(A)}{N}$ converges to (rather than equals at any finite $N$) the probability of the event $A$, which is a constant.
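A coin-flip simulation illustrates this convergence (the seed is fixed so the run is reproducible; without it the frequencies vary run to run):

```python
import random

random.seed(0)  # reproducible run

def empirical_tails(N):
    """Flip a fair coin N times and return the observed frequency of tails."""
    tails = sum(random.random() < 0.5 for _ in range(N))
    return tails / N

# The relative frequency N(A)/N drifts toward the classical value 1/2.
for N in (10, 1000, 100000):
    print(N, empirical_tails(N))
```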
The axiomatic definition is conceived in an abstract, self-consistent manner, so that no mathematical contradiction can occur. This makes it possible to build a solid theory using mathematical logic. The probability axioms were first proposed by Kolmogorov and can be found here: http://en.wikipedia.org/wiki/Probability_axioms.