A single measurement like this has a lot of noise on it - and a random signal will always show some random correlation. You should definitely not pay too much attention to the stuff in the tail of the correlation distribution - it's all noise.
The fact that the built-in function does not produce negative values is because you are only looking at the first 10 lags (the default). If you plotted the full range, it would go negative just like the others - use
plt.acorr(f, maxlags=None)
Incidentally, it does return all the values - a tuple of four elements. If you do
result = plt.acorr(f, maxlags=None)
you can examine result[0] (the lags)
and result[1] (the correlation values)
which give you the values the function calculated - you will find they are essentially identical to the ones you get using the other functions. The reason the plot is different, as I said, is that you are not looking at the same range of lags.
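If you want to check that equivalence yourself, here is a quick sketch (np.correlate is used for the comparison; the random data is just illustrative):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # headless backend so this runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
f = rng.normal(size=100)

# plt.acorr returns (lags, correlations, ...) - grab the first two
lags, c = plt.acorr(f, maxlags=None)[:2]

# the same thing computed directly; acorr normalizes by the zero-lag value
manual = np.correlate(f, f, mode='full')
manual = manual / manual.max()

print(np.allclose(c, manual))   # the values agree
print(lags.min(), lags.max())   # full range of lags: -(N-1) .. N-1
```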
Let me demonstrate a few principles. I will generate some white noise, filter it with the response of an RC circuit, and plot the autocorrelation multiple times. Along the way we will see how to create white noise (and how to filter it); that the autocorrelation is never the same twice; that it can go negative, even if you take the mean of many repeats; and that the measured time constant varies by quite a bit. Bottom line - you will come away with a better feeling for this. There is much more to learn...
Here is the code:
# white noise
import numpy as np
import matplotlib.pyplot as plt
import math
from scipy import stats
def autocorr(f):
    # full autocorrelation; keep only the non-negative lags
    temp = np.correlate(f, f, mode='full')
    mid = temp.size // 2  # integer division so this works as an index
    return temp[mid:]
plt.close('all')
# simulate a simple RC circuit
# V = 1/jwC/(R+1/jwC) = 1/(jwRC+1)
C = 1e-6 # 1 uF
R = 1.0e3 # 1 kOhm
RC = R*C
fs = 100000 # sampling frequency
fc = 1.0 / RC # cutoff frequency
repeats = 100 # number of times experiment is repeated
ns = 10000 # number of samples per repeat (1/10th of a second)
nplot = 500 # number of samples in autocorrelation to plot
t = np.arange(ns)/(1.0*fs) # time corresponding to each sample
fr = fs / (2.0 * ns) # resolution per bin in the FFT
freq = np.arange(ns) * fr # frequency bins in FFT
response = np.divide(1, 1j * 2 * math.pi * freq * RC + 1) # complex response per bin
# plot response function:
fn = int(4 * np.floor(fc / freq[1])) # two octaves above the 3 dB point; int() so it can be used as a slice index
plt.figure()
plt.subplot(2,2,1)
plt.plot(freq[:fn], np.abs(response[:fn]))
plt.xlabel('frequency')
plt.title('Filter response: amplitude')
plt.subplot(2,2,2)
plt.plot(freq[:fn], np.angle(response[:fn], deg=1))
plt.title('Filter response: phase')
plt.ylabel('angle (deg)')
plt.xlabel('frequency')
fa = [] # array where we will store the filtered results
plt.subplot(2,2,3)
for j in range(repeats):
    e = np.random.normal(size=ns)  # white Gaussian noise
    # take Fourier transform:
    ff = np.fft.fft(e)
    # roll off the frequencies with the RC response:
    ffFiltered = ff * response
    # take inverse FFT back to the time domain:
    filtered = np.fft.ifft(ffFiltered)
    ac = autocorr(filtered)
    acNorm = np.abs(ac / ac.max()) # <<< using abs function here: get amplitude
    fa.append(acNorm)
    plt.plot(acNorm[:nplot])
plt.xlabel('correlation distance')
plt.title('Autocorrelation: %d repeats'%repeats)
plt.show()
# calculate the mean of the repeated samples:
repeatedMean = np.mean(fa, axis=0)
plt.subplot(2,2,4)
plt.semilogy(repeatedMean[:nplot])
plt.title('mean of %d repeats'%repeats)
plt.xlabel('correlation distance')
plt.show()
# fit an exponential to the data
# anything beyond the first 20 points does not make much sense - contains no useful information
nfit = int(fs*RC) # fit over roughly one time constant # <<<< added
slopes = []
plt.figure()
for j in range(repeats):
    # fit a straight line to log(autocorrelation) vs time:
    slope, intercept, r_value, p_value, std_err = stats.linregress(t[:nfit], np.log(fa[j][:nfit]))
    plt.plot(t[:nfit], fa[j][:nfit])
    slopes.append(slope)
plt.title('autocorrelation used for fitting') # <<< fixed typo
plt.xlabel('correlation time')
plt.show()
print('mean time constant is %.2f ms' % np.mean(np.divide(-1000, slopes))) # <<< changed
print('standard deviation is %.3f ms' % np.std(np.divide(1000, slopes))) # <<< changed
And here are the plots it produces:
The time constant derived from these fits (inverse of slope) averaged over 100 fits was 1.01 ± 0.15 ms - consistent with the time constant of 1.0 ms that was implied in the choice of components in my code (1 kOhm and 1 uF).
Play around with the code - and let me know if this clears things up for you.
Best Answer
The SNR in Shannon's equation is the signal power divided by the noise power, S/N. It is a pure ratio; since neither signal power nor noise power can be negative, neither can the ratio. What you are thinking of is the SNR in dB; label it SNR(dB). For SNR = 1 the SNR(dB) is indeed 0, and for SNR < 1 (i.e., noise power greater than signal power) SNR(dB) is negative.
Similarly, for SNR = 0 the log2(1+SNR) is 0, and for SNR < 1 (i.e., SNR(dB) negative) the log2(1+SNR) is small but still positive, approaching 0 as the SNR approaches 0.
So, yes, indeed one can read (or, as Shannon would say, perfectly decode) a signal or noisy information at negative SNR's in dB. It happens all the time, for instance in spread-spectrum communications (which is a way of coding a signal or, more generally, information). See an example further down. But there is a Shannon limit on what is called the SNR per bit, defined below as Eb/No. It is actually the most important quantity.
One can perfectly decode or read a signal or information down to that limit. Below the limit you cannot read without errors, and the error rate increases exponentially.
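To make the numbers concrete, here is a small sketch of the capacity formula at a few SNR's; the 1 MHz bandwidth is just an illustrative assumption:

```python
import math

B = 1e6  # bandwidth in Hz (illustrative value)

capacities = {}
for snr_db in (10, 0, -10, -20):
    snr = 10 ** (snr_db / 10)                    # convert dB to a power ratio
    capacities[snr_db] = B * math.log2(1 + snr)  # Shannon capacity, bit/s
    print('SNR = %6.1f dB -> C = %12.0f bit/s' % (snr_db, capacities[snr_db]))

# the capacity stays positive even for negative SNR(dB) - it just shrinks
```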
A good way to see what really happens is to write Shannon's equation
C = B $\log_2(1+SNR)$
as C/B = $\log_2(1+SNR)$,
and then using SNR = $S/(N_0 B)$ (with $N_0$ the noise power density) you get
C/B = $\log_2(1+S/(N_0 B))$. Then, setting S = signal power = $E_b C$, where $E_b$ is the energy per bit, and setting
z ≡ SNR = $(E_b/N_0)(C/B)$, (Eq. 1)
with $E_b/N_0$ the SNR per bit, one gets
C/B = $\log_2(1+E_b C/(N_0 B))$ = $\log_2(1+z)$ = z $\log_2(1+z)^{1/z}$
so
C/(Bz) = $\log_2(1+z)^{1/z}$
or, using Eq. 1,
$N_0/E_b$ = $\log_2(1+z)^{1/z}$
In the limit as B goes to infinity with C finite, C/B goes to zero, and so (by Eq. 1) z goes to zero. Since the limit of $(1+z)^{1/z}$ as z goes to zero is e, one gets
limit as B goes to infinity of $E_b/N_0$ = 1/$\log_2 e$ = ln 2 ≈ 0.693
that is, as B goes to infinity, Eb/No goes to 0.693, and Eb/No (in dB) = -1.6 dB.
You cannot decode a signal (or information) where the Eb/No (dB), the SNR per bit in dB, is below -1.6 dB. If you plot B/C vs Eb/No, the asymptote is at -1.6 dB.
That is called the Shannon limit. It is not possible to do better.
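You can verify that limit numerically; rearranging the relation above gives Eb/No = z / $\log_2(1+z)$, and this sketch just evaluates it as z goes to zero:

```python
import math

# Eb/No = z / log2(1+z); take z (the SNR) toward zero
for z in (1.0, 0.1, 0.01, 0.0001):
    ebno = z / math.log2(1 + z)
    print('z = %8.4f -> Eb/No = %.4f (%.2f dB)' % (z, ebno, 10 * math.log10(ebno)))

# the limiting value is ln 2 = 0.6931, i.e. about -1.59 dB
print('ln 2 = %.4f (%.2f dB)' % (math.log(2), 10 * math.log10(math.log(2))))
```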
But we do decode negative SNR's (in dB) all the time. A spread-spectrum signal with B/C = 100 has, from Eq. 1,
SNR = Eb/No × C/B, or in dB,
SNR(dB) = Eb/No(dB) - 20 dB
so at the Shannon limit we could get SNR = -1.6 - 20 = -21.6 dB
That could happen, for instance, with an extremely well coded spread-spectrum signal with 20 dB of (so called) processing gain. The limit would be -21.6 dB of SNR; usually we don't get all the way to the limit. The same is true with error-correcting codes: one uses them to work at a lower SNR while keeping enough Eb/No to decode well enough. Shannon's theorem does not tell us how to construct those codes, only that they exist.
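The spread-spectrum arithmetic is just addition in dB; a minimal sketch of the B/C = 100 example:

```python
import math

spread_ratio = 100  # B/C for the spread-spectrum example
processing_gain_db = 10 * math.log10(spread_ratio)  # 10*log10(100) = 20 dB

ebno_limit_db = -1.6  # Shannon limit on the SNR per bit
snr_limit_db = ebno_limit_db - processing_gain_db

print('processing gain:     %.1f dB' % processing_gain_db)
print('lowest workable SNR: %.1f dB' % snr_limit_db)  # -21.6 dB
```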
See Shannon in Wikipedia at https://en.wikipedia.org/wiki/Shannon%E2%80%93Hartley_theorem
See the Shannon limit explained starting at http://news.mit.edu/2010/explained-shannon-0115 and on YouTube at https://www.youtube.com/watch?v=Wq1-Iq9Vm28. See a Shannon limit graph, with R the actual bit rate and R/B the spectral efficiency, pasted from http://www.gaussianwaves.com/2008/04/channel-capacity/