Updated 4/17/11:
(Originally, this answer contained a different proof of the result below for $k=3$. Not only did the proof not generalize, but it was wrong.)
The maximum number of edges in a strongly-connected digraph with $n \geq k+1$ vertices and no cycles of length at most $k$ is $${\binom{n}{2}} - n(k-2) + \frac{(k+1)(k-2)}{2}.$$
(A digraph where every vertex is reachable from every other vertex by a directed path is called strongly connected.)
Gordon Royle conjectured this bound and gave an example achieving it for $k=3$. For general $k$ and $n$, the bound is attained by the following construction, almost identical to the one provided by Nathann Cohen in the comments:
Let vertices $x_1,x_2,\ldots,x_{n-k+2}$ form a transitive tournament with $x_i \to x_j$ being an edge for all $1 \leq i < j \leq n-k+2$. Now delete the edge $x_1 \to x_{n-k+2}$ and replace it with a path $x_{n-k+2} \to x_{n-k+3} \to \ldots \to x_n \to x_1$. (The vertices $x_{n-k+3},\ldots, x_n$ will have in-degree one and out-degree one in the resulting graph.)
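The construction is easy to sanity-check by machine. Below is a minimal Python sketch, not part of the original argument: it builds the edge set with vertices $1,\ldots,n$ standing for $x_1,\ldots,x_n$, then compares the edge count with the claimed bound and checks strong connectivity and the length of the shortest cycle. The function names `extremal_digraph`, `bfs_dist` and `check` are made up for this illustration, and the restriction $k \geq 2$ is an assumption of the sketch.

```python
from collections import deque
from math import comb


def extremal_digraph(n, k):
    """Edge set of the construction; vertex i stands for x_i (hypothetical helper)."""
    assert k >= 2 and n >= k + 1  # k >= 2 is an assumption of this sketch
    m = n - k + 2
    # Transitive tournament on x_1, ..., x_m with x_i -> x_j for i < j.
    edges = {(i, j) for i in range(1, m + 1) for j in range(i + 1, m + 1)}
    # Delete x_1 -> x_m and add the path x_m -> x_{m+1} -> ... -> x_n -> x_1.
    edges.discard((1, m))
    path = list(range(m, n + 1)) + [1]
    edges.update(zip(path, path[1:]))
    return edges


def bfs_dist(adj, source):
    """Directed BFS distances from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def check(n, k):
    """Return (edge count, claimed bound, strongly connected?, shortest cycle length)."""
    edges = extremal_digraph(n, k)
    adj = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].append(v)
    dists = {v: bfs_dist(adj, v) for v in adj}
    strongly_connected = all(len(d) == n for d in dists.values())
    # The shortest cycle through an edge u -> v has length 1 + dist(v, u).
    girth = min(1 + dists[v][u] for u, v in edges if u in dists[v])
    bound = comb(n, 2) - n * (k - 2) + (k + 1) * (k - 2) // 2
    return len(edges), bound, strongly_connected, girth


if __name__ == "__main__":
    for n, k in [(4, 3), (5, 3), (8, 4), (10, 6)]:
        print(n, k, check(n, k))
```

In each case the first two entries should agree, and the shortest cycle should have length $k+1$: every cycle must use the whole return path from $x_{n-k+2}$ to $x_1$ (that is $k-1$ edges), plus at least two edges to get from $x_1$ back to $x_{n-k+2}$.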
It remains to prove that the above number is a valid upper bound. The proof is by induction on $n$.
Simple counting shows that the bound is valid if $G$ is a directed cycle. It is tight if $G$ is a cycle of length $k+1$. Assume now that $G$ is not a cycle. Then there exists $\emptyset \neq X \subsetneq V(G)$ such that $G|X$ is strongly connected. (For example, one can choose the vertex set of any induced cycle in $G$.) Choose $X$ maximal subject to the above. Let $u \to v_1$ be an edge of $G$ with $u \in X$, $v_1 \not \in X$, and let $P$ be a shortest path in $G$ from $v_1$ to $X$. Let $P=v_1 \to v_2 \to \ldots \to v_l \to w$.
Note that adding to $G|X$ any path starting and ending in $X$ produces a strongly connected digraph. It follows from the choice of $X$ that any non-trivial such path must include all the vertices in $V(G)-X$. In particular, if $l\geq 3$, $v_2,\ldots,v_{l-1}$ have no neighbors in $X$.
Let us further assume that $u$ and $w$ are chosen so that the directed path $Q$ from $w$ to $u$ in $G|X$ is as short as possible. (Perhaps, $w=u$.) Then $V(P) \cup V(Q)$ induces a cycle in $G$, and so $v_1$ and $v_l$ have at least $k-2$ non-neighbors on $V(P) \cup V(Q)$. At least $k-l$ of those non-neighbors are in $X$ if $l\geq 2$. Therefore there are at least $k-2$ non-edges (pairs of non-adjacent vertices) between $X$ and $V(G)-X$ if $l=1$, and at least
$$2(k-l)+(l-2)(k+1) \geq l(k-2)$$
non-edges if $l \geq 2$.
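For completeness, the inequality for $l \geq 2$ can be expanded directly. My reading of the left-hand side, which the text does not spell out, is that $2(k-l)$ counts the non-neighbors of $v_1$ and $v_l$ in $X$, while $(l-2)(k+1)$ accounts for the $l-2$ internal vertices $v_2,\ldots,v_{l-1}$, each non-adjacent to all of $X$, with $|X| \geq k+1$ because $G|X$ contains a cycle. The arithmetic is then
$$2(k-l)+(l-2)(k+1) = l(k-1)-2 = l(k-2)+(l-2) \geq l(k-2).$$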
By the induction hypothesis there are at least $|X|(k-2)- \frac{(k+1)(k-2)}{2}$ non-edges between vertices of $X$, and therefore at least
$$(l+|X|)(k-2)- \frac{(k+1)(k-2)}{2}=n(k-2) - \frac{(k+1)(k-2)}{2}$$
non-edges in total, as desired.
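For very small cases the bound can also be confirmed by exhaustive search. The following Python sketch is only a sanity check, not a substitute for the proof. It enumerates every digraph on $n$ labelled vertices with at most one edge per pair of vertices (so 2-cycles never occur), keeps those that are strongly connected with no cycle of length at most $k$, and records the largest edge count. All function names are hypothetical, and the search is only practical for $n \leq 5$ since it visits $3^{\binom{n}{2}}$ digraphs.

```python
from itertools import combinations, product
from math import comb


def is_strongly_connected(n, adj):
    """True if every vertex is reachable from vertex 0 and vice versa."""
    def reaches_all(a):
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for v in a[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return len(seen) == n

    # Reverse adjacency: radj[v] lists all u with an edge u -> v.
    radj = [[u for u in range(n) if v in adj[u]] for v in range(n)]
    return reaches_all(adj) and reaches_all(radj)


def has_cycle_up_to(n, adj, k):
    """True if the digraph contains a directed cycle of length at most k."""
    def extend(start, u, used, length):
        # `length` is the number of edges on the current simple path start -> u.
        for v in adj[u]:
            if v == start and length + 1 <= k:
                return True
            if v not in used and length + 1 < k:
                if extend(start, v, used | {v}, length + 1):
                    return True
        return False

    return any(extend(s, s, {s}, 0) for s in range(n))


def max_edges(n, k):
    """Largest edge count of a strongly connected digraph on n vertices
    with no cycle of length at most k (brute force)."""
    pairs = list(combinations(range(n), 2))
    best = -1
    # State per pair {i, j}: 0 = no edge, 1 = edge i -> j, 2 = edge j -> i.
    for states in product((0, 1, 2), repeat=len(pairs)):
        m = sum(1 for s in states if s)
        if m <= best:
            continue
        adj = [set() for _ in range(n)]
        for (i, j), s in zip(pairs, states):
            if s == 1:
                adj[i].add(j)
            elif s == 2:
                adj[j].add(i)
        if is_strongly_connected(n, adj) and not has_cycle_up_to(n, adj, k):
            best = m
    return best


def bound(n, k):
    return comb(n, 2) - n * (k - 2) + (k + 1) * (k - 2) // 2


if __name__ == "__main__":
    for n, k in [(4, 3), (5, 3), (5, 4)]:
        print(n, k, max_edges(n, k), bound(n, k))
```

The two numbers printed on each line should coincide if the theorem holds for those parameters.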
I wrote a program to collect some data.
For $n=8$ and $10^5$ trials, here are statistics on the largest-weight cycles of each length $k$, together with counts of how often the cycle with the greatest normalized weight had length $k$.
| k | count | avg | std_dev |
|---|---|---|---|
| 3 | 50415 | 1.40995707256456 | 0.277702203891974 |
| 4 | 30427 | 1.3675029633889 | 0.248163593506348 |
| 5 | 13738 | 1.32184789116913 | 0.229675012490759 |
| 6 | 4428 | 1.26765935699902 | 0.215218146521779 |
| 7 | 916 | 1.20083001890189 | 0.202927859960246 |
| 8 | 76 | 1.11148487469463 | 0.190259341168933 |
In a few cases I inspected, the largest weight cycle of length $k+1$ often shared a directed chain of $k$ vertices with the largest weight cycle of length $k$, but of course this did not always happen. There seemed to be a high correlation between the largest weights of cycles of different lengths.
For $n=10, 12, 20$, I did a restricted optimization over the cycles of length at most $6$.
$n=10$, $10^5$ trials:

| k | count | avg | std_dev |
|---|---|---|---|
| 3 | 44788 | 1.56377702460182 | 0.258071707092035 |
| 4 | 30386 | 1.53787677069062 | 0.228885384830286 |
| 5 | 16974 | 1.50659766688642 | 0.212244752764919 |
| 6 | 7852 | 1.4715247336037 | 0.199249497688295 |
$n=12$, $10^5$ trials:

| k | count | avg | std_dev |
|---|---|---|---|
| 3 | 41207 | 1.67848840347225 | 0.244485830656911 |
| 4 | 29722 | 1.66261483794121 | 0.21443274525213 |
| 5 | 18687 | 1.64098125565814 | 0.198203806693267 |
| 6 | 10384 | 1.61519351038532 | 0.186681888604542 |
$n=20$, 2000 trials:

| k | count | avg | std_dev |
|---|---|---|---|
| 3 | 667 | 1.97010656830871 | 0.212728229010943 |
| 4 | 584 | 1.97273614009628 | 0.18001851348712 |
| 5 | 418 | 1.96707199503644 | 0.16332139093596 |
| 6 | 331 | 1.95442360307882 | 0.154839166051771 |
Best Answer
I will rephrase your question slightly. Let $K_{n}^{*}$ be the directed graph with $n$ vertices and two oppositely directed edges for each pair of vertices. Your question is then the following: for which $n$ can $K_n^*$ be decomposed into $n-1$ directed Hamiltonian cycles?
For $n=2k+1$ odd, it is an old theorem of Walecki that $K_n$ can be decomposed into $k$ Hamiltonian cycles, and hence $K_n^*$ can be decomposed into $2k$ directed Hamiltonian cycles.
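To make the odd case concrete, here is a short Python sketch of the zigzag form of Walecki's construction as it is usually presented in textbooks; the vertex labelling ($0,\ldots,2k-1$ plus a special vertex `"inf"`) and the function names are my own choices, not taken from the sources cited above. The base cycle visits `inf`, then $0, 1, -1, 2, -2, \ldots, k$ modulo $2k$; adding $i = 0, \ldots, k-1$ to the finite vertices gives $k$ Hamiltonian cycles, and traversing each one in both directions gives $2k = n-1$ directed Hamiltonian cycles.

```python
def walecki_cycles(n):
    """k Hamiltonian cycles on n = 2k + 1 vertices (zigzag construction).

    Vertices are 0..2k-1 together with the extra vertex "inf".
    """
    assert n % 2 == 1 and n >= 3
    k = (n - 1) // 2
    m = 2 * k
    # Zigzag Hamiltonian path 0, 1, -1, 2, -2, ..., k-1, -(k-1), k in Z_m.
    zigzag = [0]
    for j in range(1, k):
        zigzag += [j, (-j) % m]
    zigzag.append(k)
    # Rotate the path by i = 0..k-1 and close it up through "inf".
    return [["inf"] + [(v + i) % m for v in zigzag] for i in range(k)]


def directed_decomposition(n):
    """Each undirected cycle traversed in both directions: n - 1 directed cycles."""
    cycles = walecki_cycles(n)
    return cycles + [c[::-1] for c in cycles]


def arcs(cycle):
    """Arcs (u, v) of a directed cycle given as a list of vertices."""
    return list(zip(cycle, cycle[1:] + cycle[:1]))


if __name__ == "__main__":
    n = 7
    cycles = directed_decomposition(n)
    used = [a for c in cycles for a in arcs(c)]
    # A decomposition of K_n^* uses each of the n(n-1) arcs exactly once.
    print(len(cycles), len(used), len(set(used)), n * (n - 1))
```

The final check compares the number of arcs used, the number of distinct arcs, and $n(n-1)$; all three agreeing means every arc of $K_n^*$ lies on exactly one of the directed cycles.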
For $n=2k$ even, you are right to note that for $n=4$ we cannot achieve the upper bound of $n-1$. One can also check that we cannot achieve the upper bound for $n=6$. However, Tilson proved that for even $n \geq 8$, $K_n^*$ can be decomposed into $n-1$ directed Hamiltonian cycles.
This completely answers your question. Namely, $n=4$ and $n=6$ are the only exceptions.