Call your false positive rate p, the actual negative rate in your sample r, and n the number of trials in which no false positive has occurred. Given that you've observed no false positives so far, the lower bound of your confidence interval (and indeed the maximum likelihood estimate) is clearly zero. For the upper bound, ask: "What is the value of p for which there is an $\alpha$ probability of getting 0 failures out of my n trials?" Higher values of p are not in your $1-\alpha$ confidence interval.
The probability of a false positive for any one draw in your sample experiment is $p\times{r}$.
So solve
$\alpha=(1-pr)^n$
and you get
$p=\frac{1-e^{\frac{\log(\alpha)}{n}}}{r}$
With your $n=32000$ and $r=0.5$ (if I understand your question correctly), this suggests the upper bound of a 95% confidence interval for the false positive rate is 0.0001872245.
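As a quick numerical check, here is a small Python sketch of that bound (the function name is mine):

```python
import math

def fp_rate_upper_bound(n, r, alpha=0.05):
    """Upper bound of the (1 - alpha) confidence interval for the
    false positive rate p, given 0 false positives observed in n
    trials where a fraction r of the sample is actually negative.

    Solves alpha = (1 - p*r)**n for p."""
    return (1 - math.exp(math.log(alpha) / n)) / r

print(fp_rate_upper_bound(n=32000, r=0.5))  # ~0.0001872245
```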
I think you've discovered that the F-score is not a very good way to evaluate a classification scheme. From the Wikipedia page you linked, there is a simplification of the formula for the F-score:
$$
{F1} = \frac {2 {TP}} {2 {TP} + {FP} + {FN}}
$$
where $TP,FP,FN$ are numbers of true positives, false positives, and false negatives, respectively.
You will note that the number of true negative cases (equivalently, the total number of cases) is not considered at all in the formula. Thus you can have the same F-score whether you have a very high or a very low number of true negatives in your classification results. If you take your case 1, "# of predicted healthy patients over # of actual healthy patients", the "true negatives" are those who were correctly classified as having cancer, yet that success in identifying patients with cancer doesn't enter into the F-score. If you take case 2, "# of predicted cancer patients over # of actual cancer patients," then the number of patients correctly classified as not having cancer is ignored. Neither seems like a good choice in this situation.
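To see this concretely, here's a small Python sketch (the counts are made up for illustration) showing that the F-score is unchanged no matter how many true negatives the classifier produces:

```python
def f1_from_matrix(tp, fp, fn, tn):
    """F1 computed from a full confusion matrix.
    Note that tn never enters the formula."""
    return 2 * tp / (2 * tp + fp + fn)

# A classifier evaluated on 115 cases vs. one evaluated on 10,110 cases:
# identical TP/FP/FN counts, vastly different TN counts, identical F1.
print(f1_from_matrix(tp=90, fp=10, fn=10, tn=5))      # 0.9
print(f1_from_matrix(tp=90, fp=10, fn=10, tn=10000))  # 0.9
```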
If you look at any of my favorite easily accessible references on classification and regression, *An Introduction to Statistical Learning*, *Elements of Statistical Learning*, or Frank Harrell's *Regression Modeling Strategies* and associated course notes, you won't find much, if any, discussion of F-scores. What you will often find is a caution against evaluating classification procedures based simply on $TP,FP,FN,$ and $TN$ values. You are much better off focusing on an accurate assessment of likely disease status with an approach like logistic regression, which in this case would relate the probability of having cancer to the values of the predictors that you included in your classification scheme. Then, as Harrell says on page 258 of *Regression Modeling Strategies*, 2nd edition:
> If you make a classification rule from a probability model, you are being presumptuous. Suppose that a model is developed to assist physicians in diagnosing a disease. Physicians sometimes profess to desiring a binary decision model, but if given a probability they will rightfully apply different thresholds for treating different patients or for ordering other diagnostic tests.
A good model of the probability of being a member of a class, in this case of having cancer, is thus much more useful than any particular classification scheme.
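Harrell's point can be sketched in a few lines of Python (the probabilities and thresholds here are hypothetical, purely for illustration): the same fitted probabilities support different decision thresholds for different purposes, which a single forced binary rule throws away.

```python
# Hypothetical predicted probabilities of cancer for five patients
# (e.g., as produced by a fitted logistic regression model).
probs = [0.05, 0.20, 0.45, 0.60, 0.90]

# A single forced binary rule commits everyone to one threshold...
fixed_rule = [p >= 0.5 for p in probs]

# ...whereas the probabilities themselves let different users apply
# different thresholds: a screening context might flag anyone above
# 0.15, while a treatment decision might require 0.75.
screen = [p >= 0.15 for p in probs]
treat = [p >= 0.75 for p in probs]

print(fixed_rule)  # [False, False, False, True, True]
print(screen)      # [False, True, True, True, True]
print(treat)       # [False, False, False, False, True]
```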
Best Answer
So this is a confusion matrix in case any readers haven't seen one: $$ \begin{array}{l c c} & Predict + & Predict-\\ Actual + & a & b \\ Actual - & c & d \\ \end{array} $$
And this is the formula for calculating $F_1$ from a confusion matrix: $$F_1 = \frac{2a}{2a+b+c}$$

So if half the items are positive in reality and all are predicted positive: $$\begin{array}{l c c} & Predict + & Predict-\\ Actual + & 50 & 0 \\ Actual - & 50 & 0 \\ \end{array}$$

Then $F_1$ in this case is equal to $2/3$, as you stated in your question: $$F_1 = \frac{2(50)}{2(50)+50+0} = 0.67$$

But if half the items are positive in reality and all the items are randomly predicted, then there are many different ways for this to occur and the resulting $F_1$ score will range between $0.00$ and $1.00$. Let's imagine we have just two items; then there are four possible results for the random classifier: $$\begin{array}{l c c c c c} Item & Reality & Predict_1 & Predict_2 & Predict_3 & Predict_4 \\ \hline 1 & + & + & + & - & -\\ 2 & - & + & - & + & -\\ \hline F_1 & & 0.67 & 1.00 & 0.00 & 0.00\\ \end{array}$$

So all that was just to say that the value of $F_1$ for the random classifier is actually quite variable, unlike the always-positive classifier.

The alternative I'd suggest is the $S$ score first proposed by Bennett, Alpert, & Goldstein (1954). It assumes a 50% probability of classifying any given item into its correct class "by chance" alone and discounts this from the final score. It also uses all cells of the confusion matrix (unlike $F_1$, which ignores $d$, the number of "true negatives"): $$p_o = \frac{a+d}{a+b+c+d}$$ $$S = \frac{p_o - 0.5}{1-0.5}=2p_o-1$$

In the first example above, where $F_1=0.67$, the $S$ score would be equal to $0.00$ and would capture the idea that the predictions are doing no better than would be expected by chance: $$p_o = \frac{50+0}{50+0+50+0}=0.50$$ $$S = 2(0.50)-1 = 0.00$$

In the second example above, where $0.00 \le F_1 \le 1.00$ with a mean of $0.42$, the $S$ score would actually range between $-1.00$ and $1.00$ with a mean of $0.00$.
Thus, both classifiers would be deemed equal by the $S$ score: $$\begin{array}{l c c c c c} Item & Reality & Predict_1 & Predict_2 & Predict_3 & Predict_4 \\ \hline 1 & + & + & + & - & -\\ 2 & - & + & - & + & -\\ \hline F_1 & & 0.67 & 1.00 & 0.00 & 0.00\\ S & & 0.00 & 1.00 & -1.00 & 0.00 \\ \end{array}$$

You can find more information about classification reliability at my website, including a history of the $S$ score and functions for calculating it in MATLAB, and even generalizing it to multiple classifiers, multiple categories, and non-nominal categories.
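Both scores are easy to compute directly from the confusion-matrix cells. This Python sketch (function names are mine) reproduces the values in the table above:

```python
def f1(a, b, c, d):
    """F1 from confusion-matrix cells: a=TP, b=FN, c=FP, d=TN."""
    return 2 * a / (2 * a + b + c)

def s_score(a, b, c, d):
    """Bennett, Alpert, & Goldstein's S: chance-corrected accuracy,
    assuming a 50% probability of a correct classification by chance."""
    p_o = (a + d) / (a + b + c + d)
    return 2 * p_o - 1

# Confusion-matrix cells (a, b, c, d) for Predict_1..Predict_4
# in the two-item example above.
cases = [(1, 0, 1, 0), (1, 0, 0, 1), (0, 1, 1, 0), (0, 1, 0, 1)]
for cells in cases:
    print(round(f1(*cells), 2), s_score(*cells))
```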