We know that $k$ is a field. So in $k$, either $2$ is a unit or $3$ is a unit.
Let $p$ be a prime number which, in $k$, is a unit. Then consider the ring $O = k[\epsilon]$, where $\epsilon^p = 0$. Formally, we consider the ring $k[x]/(x^p)$. Note that $O$ is a local ring. $O$ is also a free vector space of dimension $p$, with a basis $\epsilon^0, \epsilon^1, \ldots, \epsilon^{p - 1}$.
Now let us consider what derivations $D$ exist on $O$. Note that since $\epsilon$ generates $O = k[\epsilon]$ as a $k$-algebra, we see that $D$ is uniquely determined by $D\epsilon$.
Let us note that $0 = D0 = D(\epsilon^p) = p \epsilon^{p - 1} D\epsilon$. Since $p$ is a unit in $k$, it follows that $\epsilon^{p - 1} D\epsilon = 0$. Therefore, $\epsilon \mid D\epsilon$.
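To spell out the last implication, here is a short worked computation. It expands $D\epsilon$ in the basis $\epsilon^0, \ldots, \epsilon^{p-1}$; the coefficients $c_i \in k$ are notation introduced only for this sketch.

$$
\begin{aligned}
D\epsilon &= \sum_{i=0}^{p-1} c_i \epsilon^i, \\
\epsilon^{p-1} D\epsilon &= c_0 \epsilon^{p-1} + \sum_{i=1}^{p-1} c_i \epsilon^{p-1+i} = c_0 \epsilon^{p-1},
\end{aligned}
$$

because $\epsilon^{p-1+i} = 0$ for $i \geq 1$. Since $\epsilon^{p-1}$ is a basis vector, $\epsilon^{p-1} D\epsilon = 0$ forces $c_0 = 0$, and so $D\epsilon = \epsilon \sum_{i=1}^{p-1} c_i \epsilon^{i-1}$ is a multiple of $\epsilon$.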
Conversely, we see that there is a derivation $D$ sending $\epsilon$ to $\epsilon$. This requires the following Lemma:
Suppose $R$ is a $k$-algebra which, as a $k$-vector space, is spanned by $B$. Then a linear map $g : R \to R$ is a derivation iff for all $a, b \in B$, $g(ab) = g(a) b + a g(b)$.
Proof: suppose $g(ab) = g(a) b + a g(b)$ for all $a, b \in B$. Fix $a \in B$. Then $\{x \in R \mid g(ax) = g(a) x + a g(x)\}$ is a subspace of $R$ containing $B$, hence is all of $R$. Consequently, $\{y \in R \mid \forall x \in R \, (g(yx) = g(y)x + y g(x))\}$ is a subspace of $R$ containing $B$, hence is all of $R$. So $g$ is a derivation. The other direction is immediate. $\square$
So in particular, note that $\{\epsilon^n\}_{n = 0, 1, \ldots, p - 1}$ is a basis of $k[\epsilon]$. Consider the unique linear map $g : k[\epsilon] \to k[\epsilon]$ sending $\epsilon^n$ to $n \epsilon^n$ for $n = 0, 1, \ldots, p - 1$. Then we see that $g$ is a derivation using the above Lemma.
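The check that the Lemma asks for can be written out as follows. This is a sketch of the verification on basis elements, where the exponents $a, b$ range over $0, 1, \ldots, p - 1$.

$$
\begin{aligned}
g(\epsilon^a)\,\epsilon^b + \epsilon^a\, g(\epsilon^b) &= a\,\epsilon^{a+b} + b\,\epsilon^{a+b} = (a+b)\,\epsilon^{a+b}, \\
g(\epsilon^a \epsilon^b) &= g(\epsilon^{a+b}) =
\begin{cases}
(a+b)\,\epsilon^{a+b} & \text{if } a + b \leq p - 1, \\
g(0) = 0 & \text{if } a + b \geq p.
\end{cases}
\end{aligned}
$$

In the second case the right-hand side of the Leibniz rule is also $0$, because $\epsilon^{a+b} = 0$ when $a + b \geq p$. So the two sides agree on all pairs of basis elements, and $g(\epsilon) = \epsilon$ as required.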
So the derivations on $O$ are, as an $O$-module, isomorphic to the ideal $(\epsilon)$, via $D \mapsto D\epsilon$. It is straightforward to show this is not a free module, since the only free module satisfying $\epsilon^{p - 1} x = 0$ for all $x$ is the zero module, which $(\epsilon)$ clearly is not.
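For concreteness, the non-freeness can be sketched in two ways. The first follows the annihilator argument above; the second is an alternative dimension count that is not part of the original argument. Here $n \geq 1$ and $f \in O$ are arbitrary.

$$
\begin{aligned}
\epsilon^{p-1} \cdot (\epsilon f) &= \epsilon^{p} f = 0 && \text{for every element } \epsilon f \in (\epsilon), \\
\epsilon^{p-1} \cdot (1, 0, \ldots, 0) &= (\epsilon^{p-1}, 0, \ldots, 0) \neq 0 && \text{in } O^n, \\
\dim_k (\epsilon) &= p - 1, \qquad \dim_k O^n = np.
\end{aligned}
$$

Since $p \geq 2$, we have $0 < p - 1 < p$, so $p - 1$ is not of the form $np$ for any $n \geq 0$, which again shows that $(\epsilon)$ is not free.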
Clearly, the above demonstration can be generalised to sheaves on any space. Given a field $k$, we can construct the sheaf $\mathcal{O}$ of locally constant functions with values in $k[\epsilon]$. This will be a local ring object in the category of sheaves. Its sheaf of derivations will be, as a sheaf of $\mathcal{O}$-modules, isomorphic to the sheaf of locally constant functions with values in $(\epsilon)$, which is locally free only if the space is empty.
It sounds to me like you are using the wrong definition of "sheaf of graded algebras". A sheaf of graded algebras should not be a sheaf of algebras where each algebra happens to have a grading which is preserved by the restriction maps. Instead, a sheaf of graded algebras is a sheaf which takes values in the category of graded algebras. This means that the sheaf gluing condition is interpreted in terms of limits in the category of graded algebras, which are not the same as limits in the category of algebras. In particular, this makes it not a problem if you have sections $p_i\in\mathcal{O}(U_i)$ of unbounded degree: this would mean that no graded algebra $A$ can have an element that maps to all of the $p_i$ under morphisms $A\to\mathcal{O}(U_i)$, and so $\mathcal{O}(U)$ does not have to have such an element in order to satisfy the gluing condition (since $\mathcal{O}(U)$ just has to be the universal graded algebra with compatible morphisms to each $\mathcal{O}(U_i)$).
Generally, all of this is easier to understand if you define graded objects as sequences rather than single objects together with a direct sum decomposition. That is, a graded vector space should be defined as a sequence $(V_n)$ of vector spaces (rather than as a single vector space $\bigoplus V_n$ equipped with a direct sum decomposition). There is then a tensor product monoidal structure on the category of graded vector spaces, and a graded algebra is a monoid object with respect to this monoidal structure. This perspective then has the advantage that the forgetful functor from graded algebras to graded sets (i.e., sequences of sets) preserves limits. So, limits in the category of graded algebras (which are what you care about for sheaves of graded algebras) are computed by just taking the ordinary limits in each graded piece separately. Concretely, this means that to check the gluing condition for a sheaf of graded algebras, you just have to check it for homogeneous elements of a fixed degree. (Or, if you like, this means that a sheaf of graded algebras is the same thing as a graded sheaf of algebras, i.e. a sequence of sheaves of vector spaces together with a monoid structure with respect to the tensor product of sequences of sheaves of vector spaces.)
(In contrast, the forgetful functor taking a graded algebra to its single underlying set which is the direct sum of the graded pieces does not preserve limits! This is basically because infinite products do not play well with infinite direct sums, so an infinite cartesian product of direct sums does not have a natural direct sum decomposition. This means that limits in the category of graded algebras cannot be computed by thinking "elementwise", if you are considering elements of the direct sum.)
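A small example may make the difference concrete. It is an illustration under assumptions chosen here, not something taken from the question: take $A_i = k[x]$ with $\deg x = 1$ for each $i \in \mathbb{N}$, and compare the product of the $A_i$ in graded algebras with the product in plain algebras.

$$
\begin{aligned}
\Bigl(\prod\nolimits^{\mathrm{gr}}_{i} A_i\Bigr)_d &= \prod_{i \in \mathbb{N}} k \cdot x^d && \text{(degree-$d$ piece, computed degreewise)}, \\
\bigoplus_{d \geq 0} \prod_{i \in \mathbb{N}} k \cdot x^d &\hookrightarrow \prod_{i \in \mathbb{N}} \bigoplus_{d \geq 0} k \cdot x^d = \prod_{i \in \mathbb{N}} k[x] && \text{(injective, not surjective)}.
\end{aligned}
$$

The family $(1, x, x^2, x^3, \ldots)$, whose $i$-th entry is $x^i$, lies in the right-hand side but not in the image of the left-hand side, because it has a nonzero component in every degree. This is exactly the situation of sections of unbounded degree: they glue in the category of algebras but not in the category of graded algebras.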
Best Answer
Your definition does produce a valid sheaf. There is another way of doing this which is conceptually more elegant. $\DeclareMathOperator{Hom}{Hom}$
Consider the sheaf $G(U) = \{D \in \Hom(\mathcal{O}|_U, \mathcal{O}|_U) \mid $ for each $V \subseteq U$, $D(V)$ is a $k$-linear derivation$\}$. In the internal logic of sheaves, we are constructing the “set” $G = \{D : \mathcal{O} \to \mathcal{O} \mid D$ is a $k$-linear derivation$\}$, which of course can become a Lie algebra.
Note that $G$ can be constructed regardless of whether $\mathcal{O}$ is flasque. However, when $\mathcal{O}$ is flasque, we see that $G$ and $\mathcal{F}$ are isomorphic.
To see this, consider the natural transformation $\theta : G \to \mathcal{F}$ defined by $\theta_U(D) = D(U)$. It is easy to see that $\theta$ is well-defined; if we have $D \in G(U)$, then we see that for all $f, g \in \mathcal{O}(U)$, for all $V \subseteq U$, if $f|_V = g|_V$, then $D(U)(f)|_{V} = D(V)(f|_V) = D(V)(g|_V) = D(U)(g)|_V$, so $D(U)$ is “well-behaved”. Now suppose we have $V \subseteq U$, $D \in G(U)$, and $f \in \mathcal{O}(V)$. Take some $f' \in \mathcal{O}(U)$ such that $f'|_V = f$ (one exists because $\mathcal{O}$ is flasque); then $\theta_U(D)|_V(f) = \theta_U(D)(f')|_V = D(U)(f')|_V = D(V)(f'|_V) = D(V)(f) = D|_V(V)(f) = \theta_V(D|_V)(f)$. This confirms the naturality of $\theta$.
Now I claim that $\theta$ is an isomorphism. To do this, we explicitly construct the inverse of $\theta_U$. Given $D \in \mathcal{F}(U)$, define $\eta(D)$ to be the natural transformation given by $\eta(D)(V) = D|_V$. This is a natural transformation by the definition of the restriction operator. We see immediately that $\theta_U \circ \eta$ and $\eta \circ \theta_U$ are both the identity.
So we see that $\mathcal{F}$ is just another way of constructing $G$ if we add an extra assumption - that $\mathcal{O}$ is flasque.
Now with our $G$, we can easily define the Lie bracket in the way you would expect. Given $D, E \in G(U)$, we can define $[D, E](V) = [D(V), E(V)]$ for each open $V \subseteq U$ (using the ordinary Lie bracket structure on derivations, i.e. the commutator). It is easy to verify that the Lie algebra axioms hold here; they follow from the corresponding set-theoretic facts.
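One of those set-theoretic facts is that the commutator of two derivations is again a derivation. As a sketch, for sections $f, g \in \mathcal{O}(V)$ and writing $D, E$ for $D(V), E(V)$:

$$
\begin{aligned}
[D, E](fg) &= D\bigl(E(f)\,g + f\,E(g)\bigr) - E\bigl(D(f)\,g + f\,D(g)\bigr) \\
&= DE(f)\,g + E(f)\,D(g) + D(f)\,E(g) + f\,DE(g) \\
&\quad - ED(f)\,g - D(f)\,E(g) - E(f)\,D(g) - f\,ED(g) \\
&= [D, E](f)\,g + f\,[D, E](g).
\end{aligned}
$$

The mixed terms cancel in pairs. Since $D$ and $E$ each commute with restriction maps, so does $[D, E]$, which is why the bracket lands in $G(U)$ again.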
As for your second question, I’ll have to think on it a bit more.