The threshold is chosen such that it ensures convergence of the hypergeometric distribution (whose SD is the binomial SD multiplied by $\sqrt{\frac{N-n}{N-1}}$), instead of a binomial distribution (for sampling with replacement), to a normal distribution (this is the Central Limit Theorem; see, e.g., The Normal Curve, the Central Limit Theorem, and Markov's and Chebychev's Inequalities for Random Variables). In other words, when $n/N\leq 0.05$ (i.e., $n$ is not 'too large' compared to $N$), the FPC can safely be ignored. It is easy to see how the correction factor evolves with varying $n$ for a fixed $N$: with $N=10,000$, we have $\text{FPC}=.9995$ when $n=10$, while $\text{FPC}=.3162$ when $n=9,000$. As $N\to\infty$, the FPC approaches 1 and we are close to the situation of sampling with replacement (i.e., as with an infinite population).
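If you want to check these numbers yourself, here is a minimal Python sketch. The function name `fpc` and the input checks are my own choices for illustration, not taken from any library.

```python
import math


def fpc(N: int, n: int) -> float:
    """Finite population correction sqrt((N - n) / (N - 1)).

    N is the population size and n the sample size (hypothetical helper,
    written only to reproduce the figures quoted in the text).
    """
    if N < 2 or not 0 < n <= N:
        raise ValueError("need N >= 2 and 0 < n <= N")
    return math.sqrt((N - n) / (N - 1))


if __name__ == "__main__":
    N = 10_000
    # Sampling fractions below, at, and far above the 5% rule of thumb.
    for n in (10, 500, 9_000):
        print(f"n = {n:>5}  n/N = {n / N:.3f}  FPC = {fpc(N, n):.4f}")
```

With $N=10,000$ the loop reproduces the two values quoted above ($n=10$ and $n=9,000$) and also shows the factor right at the $n/N = 0.05$ boundary.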
To understand this result, a good starting point is to read some online tutorials on sampling theory where sampling is done without replacement (simple random sampling). This online tutorial on Nonparametric statistics illustrates how to compute the expectation and variance of a total.
You will notice that some authors use $N$ instead of $N-1$ in the denominator of the FPC; in fact, it depends on whether you work with the sample or population statistic: for the variance, it will be $N$ instead of $N-1$ if you are interested in $S^2$ rather than $\sigma^2$.
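To make the two conventions concrete, here is the standard result for the variance of the sample mean $\bar{y}$ under simple random sampling without replacement, written both ways. The notation is an assumption on my part, following the usual sampling-theory convention: both $\sigma^2$ and $S^2$ are computed over the whole population of $N$ values $y_i$ with mean $\mu$, the first with divisor $N$ and the second with divisor $N-1$.

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(y_i-\mu)^2, \qquad S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i-\mu)^2 = \frac{N}{N-1}\,\sigma^2,$$

$$\operatorname{Var}(\bar{y}) = \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1} = \frac{S^2}{n}\cdot\frac{N-n}{N}.$$

Under that convention the variance is the same in both forms; only the denominator of the correction changes, from $N-1$ to $N$, depending on which quantity you plug in.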
As for online references, I can suggest you
The "margin of error" of 38% is computed using a formula for 0/1 results that are obtained independently and randomly from a large population. None of these apply. The analogous formula for the present case (with 1..7 results obtained from a small population assuming random non-response) would involve a sample standard deviation with a finite population correction. It would not be very helpful in sorting out the confusion. Instead, let's explore the concept starting from its foundations.
The purpose of a margin of error is to indicate how much the population might differ from the sample, assuming that the data are a random sample of the population.
The problem we have is we don't know what the four missing respondents would have said. We have to cover all possible cases that are consistent with the data.
An interesting possibility is that seven of the 11 CEOs would give a reply of 7, two of them would reply with 5, and the other two with 1: this is (hypothetically) the population. How consistent are the data with this scenario? In this case, the chance of observing six 7's and one 5 at random is
$$\frac{\binom{7}{6}\binom{2}{1}\binom{2}{0}}{\binom{11}{7}} = \frac{7 \times 2 \times 1}{330} \approx 4.24\%.$$
I imagine that observing seven 7's ($1/330$ chance) or five 7's and two 5's ($21/330$ chance) would also be considered a "very satisfied" overall rating. The total chance of observing this rating in the sample would therefore equal $(14 + 1 + 21)/330 \approx 10.9\%$.
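These chances are multivariate hypergeometric probabilities, and they are easy to recompute. Below is a small Python sketch; the helper `sample_prob` and the dictionary layout (reply value mapped to a count) are my own illustrative choices, and the population is the hypothetical one described above (seven 7's, two 5's, two 1's).

```python
from math import comb


def sample_prob(population: dict, sample: dict) -> float:
    """Chance of drawing `sample` without replacement from `population`.

    Both arguments map a reply value to a count (hypothetical helper).
    """
    if any(k not in population for k in sample):
        raise ValueError("sample contains a reply absent from the population")
    N = sum(population.values())   # population size (11 here)
    n = sum(sample.values())       # sample size (7 here)
    ways = 1
    for reply, count in population.items():
        # comb(count, k) is 0 when k > count, so impossible samples get 0.
        ways *= comb(count, sample.get(reply, 0))
    return ways / comb(N, n)


# Hypothetical population: seven 7's, two 5's, two 1's.
population = {7: 7, 5: 2, 1: 2}

p_six_sevens = sample_prob(population, {7: 6, 5: 1})    # 14/330
p_seven_sevens = sample_prob(population, {7: 7})        # 1/330
p_five_sevens = sample_prob(population, {7: 5, 5: 2})   # 21/330
total = p_six_sevens + p_seven_sevens + p_five_sevens   # 36/330

if __name__ == "__main__":
    print(f"six 7's and one 5:     {p_six_sevens:.4f}")
    print(f"any 'very satisfied':  {total:.4f}")
```

Changing the `population` dictionary lets you try other hypothetical scenarios, for instance three or four dissatisfied CEOs, and see how quickly the observed data become implausible.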
This is about the worst situation that is consistent with the recorded observations in the sense that the conclusion made from our sample ("very satisfied") has at least a 5% chance of arising from a random selection. A reasonable way to express the "margin of error," then, is to note that as many as two of the CEOs, but not more than that, might have been extremely dissatisfied.
It's a good idea to go further with the analysis, because we have no evidence to support the assumption that the seven respondents actually are a representative sample. In fact, most likely they are not. Perhaps, for example, four of the CEOs would reply with 1's but did not care to because they did not want to reveal their dissatisfaction: the missingness is not at random.
A frank and thoughtful exposition of the results would bring up both these points when assessing the credibility of the results and their applicability to the entire population. Its conclusions would necessarily be tentative. They might be stated thus:
In this self-selected sample of 7 of the 11 CEOs, six reported being "very satisfied" and one "somewhat satisfied" with their bonus. Nothing is known about what the remaining four CEOs feel. We can conclude that as a group, the majority of the 11 CEOs were highly satisfied, but there may be anywhere from none through a significant minority (four) who were dissatisfied.
One advantage of this plain exposition is that it makes no unnecessary technical demands on the reader (or the writer) by invoking a "margin of error" formula which would need to be explained and interpreted and might well be incorrect by not accounting for the small population size or possibility of non-random response.
The techniques illustrated here apply to similar problems with small population sizes, non-random responses, or qualitative ordinal scales.
Best Answer
With a 3% response rate, it is highly likely that self-selection has made this a biased sample. whuber has also pointed out that 10000 may not be the whole population.
If the population was 10000 and if you had a random sample of 300 then you could make a finite population correction. Wikipedia suggests a multiplicative factor for the standard error of $$\sqrt{\dfrac{N-n}{N-1}}$$ which with $N=10000$ and $n=300$ is about $0.985$, not something that is going to make a lot of difference.
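To see how little that factor changes things, here is a rough Python sketch comparing a margin of error for a proportion with and without the correction. Everything beyond $N$ and $n$ is an assumption made for illustration: a worst-case proportion $p = 0.5$, a normal-approximation multiplier $z = 1.96$, and the function name `margin_of_error`. It also presumes a genuinely random sample, which is exactly what is in doubt here.

```python
import math


def margin_of_error(p: float, n: int, N=None, z: float = 1.96) -> float:
    """Normal-approximation margin of error for a proportion.

    If a population size N is given, the standard error is multiplied by
    the finite population correction sqrt((N - n) / (N - 1)).
    Hypothetical helper; p and z are illustrative assumptions.
    """
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return z * se


if __name__ == "__main__":
    n, N = 300, 10_000
    plain = margin_of_error(0.5, n)           # ignores the population size
    corrected = margin_of_error(0.5, n, N)    # applies the FPC
    print(f"without FPC: {plain:.4f}")
    print(f"with FPC:    {corrected:.4f}")
    print(f"ratio:       {corrected / plain:.3f}")
```

The ratio of the two is just the correction factor itself, so the corrected margin is only about 1.5% narrower than the uncorrected one.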