In many applications, our digital signal is the result of sampling a continuous signal. Examples abound: the voltage of an electronic signal may be sampled with an A/D converter, a meteorologist may read a barometer at the same time each day, and a navigator may determine the position of his vessel with a noontime sun shot using a sextant or he may determine his position every second with electronic aids, such as satellite navigation. In each case, it may be desired to treat the resulting equally spaced data with an LTI operator. For example, the satellite navigation data may need to be smoothed in some manner to better estimate the ship’s position. Whatever LTI operation is performed on the data, it can be performed by either convolution in the time domain or multiplication in the frequency domain. Therefore, it will be beneficial to see the effects of sampling in both domains.

We start with the continuous signal that exists before sampling. It and its continuous magnitude spectrum are shown in Figure 1a; we omit the phase spectrum for simplicity. In many situations, the signal and its spectrum will be rather broad, as shown.

Figure 1b shows the sampling function with spacing T. Multiplication in the time domain produces the sampled signal; convolution in the frequency domain produces its resulting spectrum. The result is shown in Figure 1c. Convolution with a row of function just places the original function at each – function location, replicating the original function. Thus, sampling in the time domain has induced periodicity in the spectrum. If the spectrum is wider than , an overlap superimposes portions of the true spectrum causing a distortion. This overlapping of the spectrum is **aliasing**. **The amount of the overlap depends on the spectral width compared to ****, the spacing between repeated spectra.** **For any given spectrum, the effect can be reduced by more time domain sampling, which spaces the replicated spectra farther apart.** The **Nyquist sampling theorem** applies only to a signal that is bandlimited so that the spectrum has exactly zero components above frequency . We know from the time-limited/band-limited theorem that such a band limited signal would have to be infinitely long — an impossibillity. Thus, fundamentally, there are no band limited signals and aliasing becomes a matter of degree. In practice, however, the long tail of the spectrum may be masked by other effects, such as noise, so that a sufficiently high sample rate can separate the replicated spectra to a point where the only meaningless information exists in the vicinity of the Nyquist frequency. The result of any further increase in the sampling rate depends on the nature of the contaminating noise. For example, if the noise had large, high-frequency components, they may be folded down into the low-frequency portion of the signal’s spectrum by an insufficient sampling rate. In this case, a higher sampling rate separating the noise spectrum from the signal’s spectrum will improve the data’s usefulness for subsequent signal processing.

Putting noise considerations aside, there is another fundamental limitation that we must impose on the digitized data shown in Figure 1c. In many situations, we can only record a portion of the actual signal. In effect, this truncation is multiplication of the data stream with the boxcar data window of length shown in Figure 1d. In the frequency domain, the spectrum of the data is convolved with the Fourier transform of the data — a sinc function of width

. **This convolution with the sinc function smears out the sharp features of the spectrum, limiting the spectral resolution**. The two sharp lines in our example are now smeared out as shown in Figure 1c. **The longer the data window , the sharper is the sinc function, and hence, the less loss of resolution.** In the limit of an infinitely long data window, its associated sinc function becomes a function, resulting in the identity operation when convolved in the frequency domain, only then is there no loss of resolution. **The dependence of frequency resolution on data window length is a manifestation of the uncertainty principle: it requires a long sample to obtain a precise frequency determination. Note that the frequency resolution depends only on the length and the shape of the data window. An increased sampling rate only separates the replicated spectra, reducing aliasing, but it does not increase spectral resolution.**

In addition to limiting spectral resolution, this convolution of the spectrum with the sinc function has another effect. The slowly decaying side lobes of the sinc function mix spectral energy from one spectrum to its adjacent replicas, as shown in Figure 1e. This interspectral mixing is called **leakage **and is distinct from the loss of resolution that occurs within the spectral replica. **Unlike resolution, leakage can be reduced by increased time domain sampling because of the increased separation between spectral replicas.**

**Both resolution and leakage depend on the shape of the data window. **We have used the boxcar for this discussion; clearly, others would produce a different effect on the spectrum. It turns out that compared to other reasonable window shapes the boxcar’s sinc function has a relatively narrow central peak with comparatively large side lobes. It thus produces good spectral resolution at the expense of poorer leakage properties. Whenever we are using only a portion of a data stream, sampled over some length of time as discussed here, we can never observe the true spectrum, but we see it through this data window: we see an estimate of the spectrum, smoothed by a moving average forced on us by the convolution operation. The selection of the best shape for data window, which in turn determines the kind of smoothing done on the spectrum is called **windowing ( we will discuss an approach to windowing later in these blogs).**

Up to this point, we have only discussed the acquisition of sampled data over a finite window length. These data have a continuous spectrum that is distorted in both resolution and leakage. To view this spectrum, we need to calculate it from the sampled data, using a convenient computational scheme — the DFT, of course. The DFT samples the spectrum at equally spaced points in frequency. This frequency domain sampling is represented by multiplication with the sampling function, as shown in Figure 1f. In the time domain, the data is convolved with the sampling function, producing the replicated data samples shown in Figure 1g. Again, sampling in one domain forces periodicity in the other so that our data is now unavoidably periodic. In our example, we have sampled the spectrum at sufficiently close frequency intervals, , that time domain replicas have gaps (that is, zero values) between them. This development shows how to use the DFT to compute the spectrum at any number of frequency points — simply append zeros to the data. This is called **zero padding **and is normally a good practice in computing spectra of discrete data via the DFT. The spectrum of discrete data is a continuous function of frequency, zero padding is simply a method of computing the spectrum at a large number of points using the DFT. Figure 1 is an important summary of our discussion.

This relationship is sufficiently important for using the DFT in practical applications to warrant some examples. A particularly instructive case is no zero padding. Then, the data window, shown in Figure 2a, is replicated with no gaps. This condition occurs when the replicating spikes in the time domain are separated by the length of the window, T. Then, the frequency sampling points, which we will call the DFT frequencies, are spaced by

up to for an N point DFT. **The important observation is that the zeros of the data window’s sinc function exactly fall on the DFT frequencies at multiples of **.

The significance of this result can be seen by considering sinusoidal data with frequencies that are multiples of . Then, an integer number of periods, n, just fits inside the data window, as shown in Fig2c for . Convolution with the data window’s sinc function produces two superimposed sinc functions located at as shown in Figure 2c. But, because of the special frequency of the sine wave, the DFT’s frequency sampling just hits the zeros of these combined sinc functions at every frequency except

, where the sinc’s central peak is sampled. The resulting DFT spectrum, shown in Figure 2d, is deceptively accurate: the time domain signal has been smoothly joined by its replicas to make an infinitely long sine wave whose spectrum is properly described by two zero-width spikes.

Have we beaten the uncertainty principle in this example, demonstrating infinite resolution and zero leakage? By no means. The sinc function associated with the boxcar data window is really lurking in the background of the DFT spectrum. Zero padding reveals it. The spectrum is continuous, and it can be computed at any frequency points that we desire by using the DFT with zero padding. Figure 3 shows the result of zero padding for our example.

One may argue that the zero padding of Figure 3 is an abomination of our original sine wave data. This suggestion brings us to an extremely important point for interpreting DFT results. To properly use the DFT, one needs to have more information about the signal than just available in the data window. For example, if we know that our signal is just the cosine burst of Figure 3, then a lot of zero padding is the correct thing to do. It makes the repetitive time domain signal of the DFT look more like an isolated burst. Also, because we know that an isolated burst has an infinitely broad spectrum, the higher sampling rates will reduce aliasing. On the other hand, if it were known that the signal under study were periodic with a period exactly equal to the data window, then no zero padding is the correct application of the DFT. If this signal were also band limited and unaliased, the resulting DFT line spectrum would then be exact.

By using the proper procedure, it is possible to use the computational convenience of the DFT to compute all four kinds of Fourier transforms that we have discussed. In some cases, exact results are possible; in others, we must be content with approximations. To see the kind of thinking required to use the DFT in Fourier transform problems, consider the following scenario.

Imagine a digitized sinusoid as sketched in Figure 4a. Approximately, not exactly, one half of a period of the sinusoid occupies the data window. Therefore, energy will be spread out across the entire DFT spectrum because the frequency of the sinusoid does not fall right on a DFT computation frequency. What then, is the interpretation of the DFT spectrum sketched in Figure 4a? To answer this question, we require knowledge of the signal beyond the data window. If we assume it is periodic with a period just equal to the data, then the signal fits the DFT formulization. Is the spectrum exact then? No, because the periodic extension of the signal suggested by the dotted line in the figure, has discontinuities that require it to be broadbanded. This infinite bandwidth signal is therefore necessarily aliased, and the DFT spectrum is inexact. A higher sampling rate, as shown in Figure 4b, will reduce the aliasing and improve the spectral estimation. This demonstrates the method of using the DFT to estimate the Fourier series coefficients of a broadbanded continuous function.

We might ask what signal does have the exact spectrum shown in Figure 4b? Zero padding in the frequency domain, as shown in Figure 4c, gives the answer. This zero padding is saying that the DFT spectrum is exact out to Nyquist and there is no spectral energy beyond. That is, we are now saying that the signal is bandlimited. This bandlimited signal is recovered by taking the IDFT of the zero padded spectrum as shown in Figure 4c. This procedure is really interpolating between the discrete points in the time domain by assuming that the signal is band limited and unaliased. Any amount of interpolation is possible; it just requires more frequency domain zero padding to get more interpolated points in the time domain. In the limit, a continuous, band limited time domain signal is approached. It is this signal that has the DFT spectrum as Fourier series coefficients.

The last possibility that we consider is that the signal is a single burst of the sinusoid. Such a signal, of course, is broadbanded, having spectral components extending to infinity. Zero padding in the time domain will make the periodic DFT representation look more like a single burst, as suggested in Fig 4d. As the sampling rate is increased to lower aliasing and zero padding is increased, the DFT spectrum approaches the broadbanded, non periodic, continuous spectrum of the sinusoidal burst, but never completely reaches it.

In summary, then, essentially the same data can be used to generate DFT spectra with greatly different interpretations. In two cases, the bandlimited nature of a signal permits the DFT to produce an exact spectrum. The most common of these two cases is the discrete, finite length sequence whose spectrum is naturally periodic. Time domain zero padding reveals the exact spectrum. The less common case is the continuous periodic signal that is known to be band limited. If the data window can be matched to exactly one period and a sufficiently high sample rate is used, the DFT will produce the true line spectrum.

In two other cases, the broadband nature of the signals allows only an estimate of the true spectrum via the DFT. If the signal is periodic and sampled for an integer multiple of its fundamental period, the DFT spectrum will approach the exact line spectrum as the sample rate increases. If the signal is known to be zero outside of the data window, the DFT spectrum will approach the correct continuous Fourier spectrum with increased sampling and increased zero padding.

It is a central point in DFT signal processing that one needs to know information about the data beyond the data window to determine the proper DFT treatment. this formulation of a signal’s behaviour beyond the observed data window is called data modelling. In addition to the four models discussed here, many more are possible; we will examine them later in spectral estimation. Next, we will use the sampling function and the properties of zero padding that we have discussed to explore some important manipulations of data and their DFT spectra.

Until the next blog,

Nalin Pithwa