In the frequentist approach, it is asserted that the only sense in which probabilities have meaning is as the limiting value of the proportion of successes in a sequence of trials, i.e. as
$$p = \lim_{n\to\infty} \frac{k}{n}$$
where $k$ is the number of successes and $n$ is the number of trials. In particular, it doesn't make any sense to associate a probability distribution with a parameter.
For example, consider samples $X_1, \dots, X_n$ from the Bernoulli distribution with parameter $p$ (i.e. they have value 1 with probability $p$ and 0 with probability $1-p$). We can define the sample success rate to be
$$\hat{p} = \frac{X_1+\cdots +X_n}{n}$$
and talk about the distribution of $\hat{p}$ conditional on the value of $p$, but it doesn't make sense to invert the question and start talking about the probability distribution of $p$ conditional on the observed value of $\hat{p}$. In particular, this means that when we compute a confidence interval, we interpret the ends of the confidence interval as random variables, and we talk about "the probability that the interval includes the true parameter", rather than "the probability that the parameter is inside the confidence interval".
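This frequentist reading can be checked by simulation: the interval endpoints are the random quantities, and over many repeated experiments roughly 95% of the intervals should cover the fixed, true $p$. A minimal sketch in Python (the true value, sample size, and Wald-style interval are illustrative assumptions, not part of the argument above):

```python
import math
import random

def wald_ci(k, n, z=1.96):
    """Normal-approximation (Wald) 95% confidence interval for a proportion."""
    p_hat = k / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

random.seed(0)
p_true, n, reps = 0.5, 1000, 2000   # assumed values for illustration
covered = 0
for _ in range(reps):
    k = sum(random.random() < p_true for _ in range(n))
    lo, hi = wald_ci(k, n)
    if lo <= p_true <= hi:          # the *interval* varies from run to run
        covered += 1
print(covered / reps)               # typically close to 0.95
```

Note that in each repetition it is the pair (lo, hi) that changes while $p$ stays fixed, which is exactly the phrasing "the probability that the interval includes the true parameter".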
In the Bayesian approach, we interpret probability distributions as quantifying our uncertainty about the world. In particular, this means that we can now meaningfully talk about probability distributions of parameters, since even though the parameter is fixed, our knowledge of its true value may be limited. In the example above, we can invert the probability distribution $f(\hat{p}\mid p)$ using Bayes' law, to give
$$\overbrace{f(p\mid \hat{p})}^\text{posterior} = \underbrace{\frac{f(\hat{p}\mid p)}{f(\hat{p})}}_\text{likelihood / evidence} \overbrace{f(p)}^\text{prior}$$
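For a concrete instance of this update, the Bernoulli likelihood is conjugate to the Beta family: a $\mathrm{Beta}(a, b)$ prior on $p$ combined with $k$ successes in $n$ trials yields a $\mathrm{Beta}(a+k,\, b+n-k)$ posterior. A short sketch (the uniform prior and the data are assumptions for illustration):

```python
# Beta-Bernoulli conjugacy: a Beta(a, b) prior plus k successes in n trials
# gives a Beta(a + k, b + n - k) posterior.
a, b = 1, 1              # uniform prior Beta(1, 1) -- an assumed choice
k, n = 7, 10             # hypothetical data: 7 successes in 10 trials

a_post, b_post = a + k, b + n - k
post_mean = a_post / (a_post + b_post)      # posterior mean (a + k) / (a + b + n)
print(a_post, b_post, round(post_mean, 3))  # → 8 4 0.667
```

The posterior mean sits between the prior mean (0.5) and the sample success rate (0.7), which is the usual shrinkage behaviour of conjugate updates.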
The snag is that we have to introduce the prior distribution into our analysis - this reflects our belief about the value of $p$ before seeing the actual values of the $X_i$. The role of the prior is often criticised by frequentists, who argue that it introduces subjectivity into the otherwise austere and objective world of probability.
In the Bayesian approach one no longer talks of confidence intervals, but instead of credible intervals, which have a more natural interpretation - given a 95% credible interval, we can assign a 95% probability that the parameter is inside the interval.
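To illustrate, a 95% credible interval can be read off directly from the posterior. Assuming (purely for illustration) a uniform $\mathrm{Beta}(1,1)$ prior and 7 successes in 10 trials, the posterior is $\mathrm{Beta}(8,4)$, and an equal-tailed interval can be estimated by Monte Carlo with the standard library:

```python
import random

random.seed(0)
a_post, b_post = 8, 4   # assumed posterior: Beta(1, 1) prior, 7/10 successes
draws = sorted(random.betavariate(a_post, b_post) for _ in range(100_000))
lo = draws[int(0.025 * len(draws))]   # 2.5% sample quantile
hi = draws[int(0.975 * len(draws))]   # 97.5% sample quantile
print(f"95% credible interval for p: ({lo:.3f}, {hi:.3f})")
```

Unlike a confidence interval, this one licenses the direct statement "with probability 0.95, $p$ lies between lo and hi" - conditional, of course, on the model and the prior.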
It is not necessary to call it frequentist material; rather, it is material from probability and statistics in general.
Here are some examples of prior knowledge that, in my opinion, would be handy:
- What are densities, (conditional) distributions, expectations etc.?
- Some specific distributional families (Beta, normal, uniform etc.)
- Most likely you will want to apply Bayesian methods to real data, so some familiarity with statistical software will be handy. My favorite: R
- Some mathematics: Matrix algebra, integration, ...
- Also, it could be handy to be familiar with some statistical models, such as the linear model $y=X\beta+u$.
- Given the heavy emphasis on the likelihood, it cannot hurt to have heard about maximum likelihood before.
The Bayesian paradigm being a subjective one, I am sure others will disagree with or add to this list...
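On the maximum-likelihood point in the list above: for the Bernoulli model, the log-likelihood for $k$ successes in $n$ trials is $\ell(p) = k\log p + (n-k)\log(1-p)$, and its maximiser is the sample success rate $k/n$. A tiny grid-search sketch (the data values are hypothetical):

```python
import math

k, n = 7, 10   # hypothetical data

def loglik(p):
    """Bernoulli log-likelihood for k successes in n trials."""
    return k * math.log(p) + (n - k) * math.log(1 - p)

# Crude grid search over (0, 1); the concave log-likelihood peaks at k/n.
grid = [i / 1000 for i in range(1, 1000)]
p_mle = max(grid, key=loglik)
print(p_mle)  # → 0.7, i.e. k/n
```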
Best Answer
Both Bayesian statistics and frequentist statistics are based on probability theory, but I'd say that the former relies more heavily on the theory from the start. On the other hand, surely the concept of a credible interval is more intuitive than that of a confidence interval, once the student has a good understanding of the concept of probability. So, whatever you choose, I advocate first of all strengthening their grasp of probability concepts, with all those examples based on dice, cards, roulette, the Monty Hall paradox, etc.
I would choose one approach or the other based on a purely utilitarian approach: are they more likely to study frequentist or Bayesian statistics at school? In my country, they would definitely learn the frequentist framework first (and last: never heard of high school students being taught Bayesian stats, the only chance is either at university or afterwards, by self-study). Maybe in yours it's different. Keep in mind that if they need to deal with NHST (Null Hypothesis Significance Testing), that more naturally arises in the context of frequentist statistics, IMO. Of course you can test hypotheses also in the Bayesian framework, but there are many leading Bayesian statisticians who advocate not using NHST at all, either under the frequentist or the Bayesian framework (for example, Andrew Gelman from Columbia University).
Finally, I don't know about the level of high school students in your country, but in mine it would be really difficult for a student to successfully assimilate (the basics of) probability theory and integral calculus at the same time. So, if you decide to go with Bayesian statistics, I'd really avoid the continuous random variable case, and stick to discrete random variables.
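The discrete-only route suggested here is easy to make concrete: put a prior on a handful of candidate values of $p$ and apply Bayes' law as a table computation, with no integrals involved. A sketch (candidate values, prior weights, and data are all assumptions for illustration):

```python
from math import comb

candidates = [0.25, 0.5, 0.75]                  # assumed candidate values of p
prior = {p: 1 / 3 for p in candidates}          # uniform prior over them
k, n = 7, 10                                    # hypothetical: 7 heads in 10 flips

def likelihood(p):
    """Binomial probability of the observed data given p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

unnorm = {p: likelihood(p) * prior[p] for p in candidates}
evidence = sum(unnorm.values())                 # normalising constant
posterior = {p: w / evidence for p, w in unnorm.items()}
for p, post in sorted(posterior.items()):
    print(p, round(post, 3))
```

The data favour the largest candidate: most of the posterior mass moves onto $p = 0.75$, and the whole update is just arithmetic a high-school student can follow.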