Take the simple white noise process $Z_t$, $EZ_t=0$, $cov(Z_t,Z_{t-h})=0$ for all $h\neq 0$. Now take its difference $Y_t=Z_{t}-Z_{t-1}$ and calculate the first-lag autocovariance:
$$cov(Y_t,Y_{t-1})=cov(Z_t-Z_{t-1},Z_{t-1}-Z_{t-2})=-cov(Z_{t-1},Z_{t-1})=-var(Z_t)$$
Hence $corr(Y_t,Y_{t-1})=-1/2$, since $var(Y_t)=2\,var(Z_t)$.
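If you want a quick numerical check, here is a minimal simulation sketch (assuming numpy; the sample size and seed are arbitrary):

```python
# Difference simulated white noise and check that the lag-1
# autocorrelation of the differenced series is close to -1/2.
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=100_000)        # white noise Z_t
y = np.diff(z)                      # Y_t = Z_t - Z_{t-1}

# sample lag-1 autocorrelation of the differenced series
lag1_corr = np.corrcoef(y[1:], y[:-1])[0, 1]
print(lag1_corr)                    # roughly -0.5
```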
Now for any (causal) stationary process $X_t$ there exist a white noise process $Z_t$ and coefficients $\psi_j$ (with $\psi_0=1$) such that $X_t=\sum_{j=0}^{\infty}\psi_jZ_{t-j}$. This is courtesy of the Wold decomposition. Taking $var(Z_t)=1$ to keep the formulas uncluttered, we have
$$cov(X_t,X_{t+h})=\sum_{j=0}^\infty\psi_j\psi_{j+h}$$
For the differenced version $Y_t=X_t-X_{t-1}$ we have
$$Y_{t}=Z_{t}+(\psi_1-1)Z_{t-1}+\sum_{j=2}^{\infty}(\psi_{j}-\psi_{j-1})Z_{t-j}$$
and
$$cov(Y_t,Y_{t-1})=\psi_1-1+\sum_{j=2}^{\infty}(\psi_{j}-\psi_{j-1})(\psi_{j-1}-\psi_{j-2})$$
Now more often than not the coefficients $\psi_j$ are decreasing and less than one in absolute value. In that case $\psi_1-1<0$, and this negative term outweighs the remaining sum. This is one (very obvious) explanation of why the first autocovariance of the differenced series is negative. More can be said with a more careful analysis of the terms of the sum, but I think this conveys the general idea.
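As a sanity check on the formula above, here is a small sketch (assuming numpy, $var(Z_t)=1$, and an AR(1) process with $\psi_j=\phi^j$ so that the coefficients are decreasing; the truncation point, sample size, and seed are arbitrary):

```python
# Evaluate the lag-1 autocovariance formula for the differenced series
# using psi_j = phi**j, and compare with a simulated AR(1) that is then
# differenced. The infinite sum is truncated at J terms.
import numpy as np

phi, J = 0.6, 200
psi = phi ** np.arange(J)           # psi_0 = 1, psi_1 = phi, ...

# theoretical cov(Y_t, Y_{t-1}) = (psi_1 - 1)
#   + sum_{j>=2} (psi_j - psi_{j-1}) * (psi_{j-1} - psi_{j-2})
theory = (psi[1] - 1) + np.sum((psi[2:] - psi[1:-1]) * (psi[1:-1] - psi[:-2]))

# simulate X_t = phi * X_{t-1} + Z_t, then difference it
rng = np.random.default_rng(1)
z = rng.normal(size=100_000)
x = np.zeros_like(z)
for t in range(1, len(z)):
    x[t] = phi * x[t - 1] + z[t]
y = np.diff(x)

empirical = np.mean(y[1:] * y[:-1])  # sample lag-1 autocovariance (mean-zero series)
print(theory, empirical)             # both negative and roughly equal (about -0.25)
```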
This is akin to serial dependence: if the value of the series at some time $t$ is correlated with its value at another time $s$, the series is said to be serially correlated. When no such relationship exists, we say there is "no serial correlation".
Negative serial correlation implies that a positive error for one observation increases the chance of a negative error for another observation and a negative error for one observation increases the chances of a positive error for another.
You can test for this type of autocorrelation with the Durbin–Watson statistic.
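For example, a minimal sketch using the `durbin_watson` function from statsmodels (the statistic is normally computed on regression residuals; the simulated series below are just for illustration, and values near 2 suggest no serial correlation, below 2 positive, above 2 negative serial correlation):

```python
# Durbin-Watson statistic on uncorrelated vs. over-differenced residuals.
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)
resid_iid = rng.normal(size=1_000)   # uncorrelated residuals
resid_diff = np.diff(resid_iid)      # differencing induces negative correlation

print(durbin_watson(resid_iid))      # close to 2
print(durbin_watson(resid_diff))     # well above 2 (negative serial correlation)
```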
Best Answer
The short answer is yes, differencing will introduce a negative autocorrelation into the differenced series in most situations. Assuming a mean-centered variable to keep the notation a bit simpler, the lag-1 covariance of the differenced series can be represented as:
$$Cov(\Delta X_t,\Delta X_{t-1}) = E[\Delta X_t \cdot \Delta X_{t-1}]$$
where $\Delta X_t = X_t - X_{t-1}$.
Breaking this down into the original variables, we then have:
\begin{align} E[\Delta X_t \cdot \Delta X_{t-1}] &= E[(X_t - X_{t-1}) \cdot (X_{t-1} - X_{t-2}) ] \\ &= E[X_tX_{t-1} - X_tX_{t-2} - X_{t-1}X_{t-1} + X_{t-1}X_{t-2}] \end{align}
The multiplications are then just variances and covariances of the levels:
$$Cov(X_t,X_{t-1}) - Cov(X_t,X_{t-2}) - Var(X_{t-1}) + Cov(X_{t-1},X_{t-2})$$
So here we can see that many different situations will result in negative autocorrelation of the differenced series; basically, only when the autocorrelations of the levels are very large (e.g. an integrated series) will the differences end up with only a small negative autocorrelation.
With random data the lag-1 autocorrelation of the differences will be approximately $-0.5$: the covariance terms among the levels are all $0$, so the numerator is just $-Var(X_{t-1})$, while the denominator is $Var(\Delta X_t) = Var(X_t) + Var(X_{t-1}) = 2\,Var(X_t)$, giving a ratio of about $-1/2$.
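A quick numerical check of this decomposition and of the $-0.5$ value, assuming numpy (the helper `lag_cov` and the sample size are mine, purely for illustration):

```python
# For i.i.d. levels, the covariance terms in the decomposition vanish,
# the numerator reduces to -Var(X_{t-1}), and the lag-1 autocorrelation
# of the differenced series is about -0.5.
import numpy as np

def lag_cov(x, h):
    """Sample covariance between x_t and x_{t-h} for a mean-centered copy of x."""
    x = x - x.mean()
    return np.mean(x[h:] * x[:-h]) if h > 0 else np.mean(x * x)

rng = np.random.default_rng(3)
x = rng.normal(size=100_000)                 # "random data" levels
dx = np.diff(x)

numerator = lag_cov(x, 1) - lag_cov(x, 2) - lag_cov(x, 0) + lag_cov(x, 1)
print(numerator, -lag_cov(x, 0))             # roughly equal: about -Var(X)
print(lag_cov(dx, 1) / lag_cov(dx, 0))       # roughly -0.5
```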
This is typically called over-differencing. The solution is to not over-difference the data to begin with.