No, it is not true that a process W satisfying the properties (1), (3) and (4) has to be a Brownian motion. We can construct a counter-example as follows.
This construction is rather contrived, and I don't know whether a simpler example exists.
Start with a standard Brownian motion $W$. The idea is to apply a small bump to its distribution while retaining the required properties. I will do this by first reducing to the discrete-time case. So, choose a finite sequence of times $0 = t_0 < t_1 < \cdots < t_n$. Then define a piecewise linear process $X$ by $X_{t_k} = W_{t_k}$ ($k = 0,1,\ldots,n$), with $X$ linearly interpolated across each of the intervals $[t_{k-1},t_k]$ and constant over $[t_n,\infty)$.
Then, $Y = W - X$ is a continuous process independent of $X$. In fact, $Y$ is just a sequence of Brownian bridges across the intervals $[t_{k-1},t_k]$ and a standard Brownian motion on $[t_n,\infty)$. Also, by linear interpolation, for any time $t \ge 0$, $X_t$ is a linear combination of at most two of the random variables $X_{t_1},\ldots,X_{t_n}$. The increments of $W$,
$$
W_t-W_s = X_t-X_s + Y_t-Y_s,
$$
are then a linear combination of at most $4$ of the random variables $X_{t_1},\ldots,X_{t_n}$ (at most two from $X_t$ and two from $X_s$) plus an independent term. So, choosing $n \ge 5$: if it is possible to replace $(X_{t_1},\ldots,X_{t_n})$ by any other $\mathbb{R}^n$-valued random variable without changing the joint distribution of any $4$ of its elements, then the distributions of the increments $W_t - W_s$ are left unchanged. So, properties (1), (3) and (4) will still be satisfied, but the new process $W$ will not be a standard Brownian motion. It is indeed possible to change the distribution in this way:
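The decomposition $W = X + Y$ above can be checked numerically. Here is a minimal sketch (the grid resolution, knot times and random seed are my own arbitrary choices, not part of the argument):

```python
import numpy as np

rng = np.random.default_rng(0)

# Fine time grid on [0, 1] and a simulated standard Brownian path W
m = 1000
t = np.linspace(0.0, 1.0, m + 1)
dW = rng.normal(0.0, np.sqrt(np.diff(t)))
W = np.concatenate([[0.0], np.cumsum(dW)])

# Knot times 0 = t_0 < t_1 < ... < t_n (here n = 5, equally spaced)
knots = np.linspace(0.0, 1.0, 6)
W_at_knots = np.interp(knots, t, W)

# X: piecewise-linear interpolation of W across the knots
X = np.interp(t, knots, W_at_knots)

# Y = W - X is a string of Brownian bridges: it vanishes at every knot
Y = W - X
print(np.max(np.abs(np.interp(knots, t, Y))))  # ≈ 0
```

The bridge property $Y_{t_k} = 0$ is what makes $Y$ independent of the knot values, since a Brownian bridge is independent of the endpoints it is pinned to.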
Let $X = (X_1,X_2,\ldots,X_n)$ be an $\mathbb{R}^n$-valued random variable with a continuous and strictly positive probability density $p_X\colon \mathbb{R}^n \to \mathbb{R}$. Then there exists a random variable $Y = (Y_1,Y_2,\ldots,Y_n)$ with a distribution different from that of $X$, but for which the projection onto any $n-1$ of the coordinates has the same distribution as the corresponding projection of $X$.
That is, for any $k_1,k_2,\ldots,k_{n-1}$ in $\{1,\ldots,n\}$, $(Y_{k_1},Y_{k_2},\ldots,Y_{k_{n-1}})$ has the same distribution as $(X_{k_1},X_{k_2},\ldots,X_{k_{n-1}})$.
We can construct the probability density pY of Y by applying a bump to the probability distribution of X,
$$
p_Y(x)=p_X(x)+\epsilon f(x_1)f(x_2)\cdots f(x_n).
$$
Here, $\epsilon$ is a fixed real number and $f\colon \mathbb{R} \to \mathbb{R}$ is a continuous function of compact support and zero integral, $\int_{-\infty}^\infty f(x)\,dx=0$. Then $\int_{-\infty}^\infty p_Y(x)\,dx_k=\int_{-\infty}^\infty p_X(x)\,dx_k$ for each $k$. So the integral of $p_Y$ over $\mathbb{R}^n$ is $1$ and, by choosing $\epsilon$ small enough, $p_Y$ is strictly positive: this is possible because $p_X$ is continuous and strictly positive while the bump term is bounded with compact support. Hence $p_Y$ is a valid probability density function. Finally, as the integrals along the $k$th direction (for any $k$) agree for $p_X$ and $p_Y$, the projections of $X$ and $Y$ onto $\mathbb{R}^{n-1}$ along the $k$th direction give the same distribution.
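A numerical sanity check of the lemma, with concrete choices that are mine rather than forced by the argument: $n = 2$, $p_X$ a product of standard normal densities, and $f(x) = x(1-x^2)$ on $[-1,1]$, zero elsewhere (continuous, compactly supported, zero integral since it is odd):

```python
import numpy as np

# Gaussian density and the zero-integral bump f (illustrative choices)
def phi(x):
    return np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)

def f(x):
    return np.where(np.abs(x) < 1.0, x * (1.0 - x**2), 0.0)

eps = 0.05
x = np.linspace(-8.0, 8.0, 16001)  # integration grid
dx = x[1] - x[0]

# f integrates to zero ...
int_f = f(x).sum() * dx

# ... so the x1-marginal of p_Y agrees with that of p_X; test at x1 = 0.5
pX_row = phi(0.5) * phi(x)
pY_row = pX_row + eps * f(0.5) * f(x)
marg_X = pX_row.sum() * dx
marg_Y = pY_row.sum() * dx

print(abs(int_f))            # ≈ 0
print(abs(marg_Y - marg_X))  # ≈ 0: the marginals coincide
print(eps * f(0.5)**2)       # > 0: joint densities differ at (0.5, 0.5)
```

With $\epsilon = 0.05$ the bump never exceeds $p_X$ on $[-1,1]^2$, so $p_Y$ stays strictly positive.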
This is a partial answer, but it shows the kind of subtlety that makes the continuity of Brownian motion non-trivial. If you take the first three axioms of Brownian motion and try to prove that the process has continuous paths using a central limit theorem argument, what you get is that, on a probability space $(\Omega,\mathbb{P})$, for all $t > 0$,
$$
\mathbb{P}(s \mapsto B_s \text{ is discontinuous at } t) = 0.
$$
This means that there are null sets $\mathcal{N}_t \subset \Omega$ such that if $\omega \notin \mathcal{N}_t$ then the path $s \mapsto B_s(\omega)$ is continuous at time $t$. And here is the delicate part: this does not imply that there exists any single $\omega \in \Omega$ such that $t \mapsto B_t(\omega)$ is continuous for all $t$, because the union $\bigcup_{t} \mathcal{N}_t$ runs over uncountably many null sets and need not be null. I have seen this property called stochastic continuity in some places.
What you usually want is a single null set $\mathcal{N}$ such that, for $\omega \notin \mathcal{N}$, the path $t \mapsto B_t(\omega)$ is continuous for all $t \ge 0$.
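A standard toy example (my addition, not part of the answer above) makes the gap concrete: let $U$ be uniform on $[0,1]$, independent of a Brownian motion $B$, and set
$$
X_t = B_t + \mathbf{1}_{\{t = U\}}.
$$
For each fixed $t$, $\mathbb{P}(U = t) = 0$, so $X_t = B_t$ almost surely; $X$ is a modification of $B$ and is stochastically continuous, yet every single sample path of $X$ has a jump at the random time $U$, so no path of $X$ is continuous.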
Of course, the usual constructions of Brownian motion do take care of this subtlety, but sometimes without mentioning it.
Best Answer
Let me assume that you are seeking a generalization of the Gaussian distribution in order to generalize Brownian motion.
As far as I know, regarding the heat kernel as the generalization of the Gaussian distribution has long been standard in the literature. It comes from the following observation.
In $\mathbb{R}^1$, the following notions coincide:
(1) the Gaussian distribution $N(x,t)$ with density $f(t,x,y)=\frac{1}{\sqrt{2\pi t}}e^{-\frac{(y-x)^2}{2t}}$,
(2) the transition function $p(t,x,y)$ of Brownian motion $B_t$,
(3) the heat kernel, i.e. the fundamental solution $k_t(x,y)$ of the heat equation $\partial_t k=\frac{1}{2}\Delta_y k$, with initial data $\delta_x$ (the factor $\frac12$ matches the variance-$t$ normalization in (1)).
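The identification of (1) and (3) can be verified numerically with finite differences. A minimal sketch, assuming the variance-$t$ kernel above (with variance $2t$ the equation would read $\partial_t k = \Delta_y k$ instead); the evaluation point and step size are arbitrary:

```python
import math

# The Gaussian kernel with variance t, as in (1)
def k(t, x, y):
    return math.exp(-(y - x) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

t, x, y, h = 1.0, 0.0, 0.7, 1e-4

# Central difference in t and second central difference in y
dt = (k(t + h, x, y) - k(t - h, x, y)) / (2.0 * h)
dyy = (k(t, x, y + h) - 2.0 * k(t, x, y) + k(t, x, y - h)) / h**2

print(abs(dt - 0.5 * dyy))  # ≈ 0 up to discretization error
```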
Thus, on a manifold, one way to define Brownian motion is to construct a Markov process on the manifold whose transition function is exactly the heat kernel (let us identify the heat kernel with the Gaussian distribution in this setting). Since we always have the Laplace-Beltrami operator $\Delta$ on a manifold, it is justifiable to talk about the heat equation and hence the heat kernel, and Brownian motion in this sense is known to exist for a large class of manifolds.
But on metric spaces we no longer have the Laplace-Beltrami operator. So, in order to talk about the heat kernel/Gaussian distribution, we need to generalize it. The key concept along this line is the so-called Dirichlet form. A Dirichlet form on a metric measure space $(X,d,\mu)$ is a closed symmetric bilinear form $\mathcal{E}(\cdot,\cdot)$ defined on a dense subspace of $L^2(X,\mu)$. It should further satisfy a couple of conditions so that it behaves like its prototype $\mathcal{E}(f,g)=\int_{M} \nabla f\cdot \nabla g \,dx$ on a manifold $M$. Noticing that $\mathcal{E}(f,g)=(-\Delta f,g)_{L^2(M)}$ on manifolds, in the general case one obtains the desired "Laplacian" from the same formula. Therefore, every Dirichlet form corresponds to a "Laplacian" and thus to a Gaussian distribution (and thus to a Brownian motion). What's more, a reasonable Dirichlet form always exists provided the space is suitably good.
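The correspondence "Dirichlet form $\to$ Laplacian $\to$ heat kernel" already shows up in the simplest discrete setting, a finite graph; the sketch below uses a 4-cycle purely for illustration. Here $\mathcal{E}(f,g)=\sum_{\text{edges}}(f_i-f_j)(g_i-g_j)$ equals $(Lf,g)$ for the graph Laplacian $L$, and $e^{-tL}$ is a Markov transition kernel:

```python
import numpy as np

# Graph Laplacian of a 4-cycle (toy example)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4
L = np.zeros((n, n))
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1
    L[i, j] -= 1; L[j, i] -= 1

# The Dirichlet form agrees with (L f, g)
rng = np.random.default_rng(1)
f, g = rng.normal(size=n), rng.normal(size=n)
E = sum((f[i] - f[j]) * (g[i] - g[j]) for i, j in edges)
print(abs(E - f @ L @ g))  # ≈ 0

# Heat kernel e^{-tL} via the spectral decomposition of the symmetric L
w, V = np.linalg.eigh(L)
P = V @ np.diag(np.exp(-1.0 * w)) @ V.T   # t = 1
print(np.max(np.abs(P.sum(axis=1) - 1)))  # rows sum to 1: a transition kernel
print(P.min() >= 0)                       # entries are nonnegative
```

The row-stochasticity follows from $L\mathbf{1}=0$, mirroring the fact that the heat kernel on a manifold (with suitable conditions) integrates to $1$ in $y$.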
In sum, if the space you are considering has both metric and measure structures, then the theory of Dirichlet forms may provide satisfactory results regarding the construction and properties of the Gaussian distribution (and thus of Brownian motion). Roughly speaking, if we don't have a given measure, we may not be able to construct a reasonable probability space; if we don't have a metric, it is hard to measure the regularity and decay of the Gaussian distribution. So a metric measure structure might be the minimal structure needed for a reasonable construction of the Gaussian distribution.
Some reference books can be found in the link above. This paper by Sturm may give you a glance at the whole picture. I am not an expert in this field, so I apologize in advance for any mistakes or naivety.