The Frequency Domain and the Fourier Transform
Throughout the class, we have thought about our signal in the time domain. That is, we have modeled our signal as a time-dependent process f(t) that we often described by the dynamics determining the transition from time step t to time step t+1. Now, we will move to the frequency domain, where we think of our process as a combination of periodic processes. That is, we will be interested in describing our more general process as a sum of processes that exhibit the property:
\begin{equation*} f(t+P)=f(t) \tag{10.1} \end{equation*}
where P is the period of the process. As we will see moving forward, the basis functions we will want to use are sinusoids with the generic equation:
\begin{equation*} f(t)=A \cos (2 \pi k t+\phi), \tag{10.2} \end{equation*}
where A is the amplitude, k is the frequency (inverse of the period), and \phi is the phase. A controls the maximum and minimum of our function, k controls our period of oscillation (how often the function repeats itself), and \phi determines at what point in our oscillations we find ourselves at t=0. It can often be useful to represent this basis function as one of:
\begin{align*} A \cos (2 \pi k t+\phi) & =A_{1} \cos (2 \pi k t)-A_{2} \sin \left(2 \pi kt\right) \tag{10.3}\\ & =\frac{A}{2}\left(e^{i(2 \pi k t+\phi)}+e^{-i(2 \pi k t+\phi)}\right) \tag{10.4} \end{align*}
The second equation comes from the identity \cos (\alpha+\beta)=\cos (\alpha) \cos (\beta)- \sin (\alpha) \sin (\beta) and has A_{1}=A \cos (\phi) and A_{2}=A \sin (\phi). The third equation comes from the Euler formula:
\begin{equation*} e^{i x}=\cos (x)+i \sin (x) \tag{10.5} \end{equation*}
where i is the imaginary unit.
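These three representations of a sinusoid are easy to check numerically. The sketch below (the amplitude, frequency, and phase values are illustrative choices) verifies Equations 10.3 and 10.4 on a grid of time points:

```python
import numpy as np

# Illustrative parameters for a single sinusoid.
A, k, phi = 1.5, 2.0, 0.7
t = np.linspace(0.0, 1.0, 101)

lhs = A * np.cos(2 * np.pi * k * t + phi)

# Equation 10.3: A1 = A cos(phi), A2 = A sin(phi).
A1, A2 = A * np.cos(phi), A * np.sin(phi)
rhs_trig = A1 * np.cos(2 * np.pi * k * t) - A2 * np.sin(2 * np.pi * k * t)

# Equation 10.4: exponential form via the Euler formula.
rhs_exp = (A / 2) * (np.exp(1j * (2 * np.pi * k * t + phi))
                     + np.exp(-1j * (2 * np.pi * k * t + phi)))

assert np.allclose(lhs, rhs_trig)
assert np.allclose(lhs, rhs_exp.real)
assert np.allclose(rhs_exp.imag, 0.0)   # the imaginary parts cancel
```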
Let’s start with the simple question: how do we represent some arbitrary function f(t) using a basis of periodic functions? This concept is known as the Fourier series expansion of our function. Essentially, we want to write our function as:
\begin{equation*} f(t)=\sum_{n=0}^{\infty}\left(A_{n} \cos \left(\frac{2 \pi n t}{P}\right)+B_{n} \sin \left(\frac{2 \pi n t}{P}\right)\right) \quad B_{0}=0 \tag{10.6} \end{equation*}
where P is the period on which we’ve observed our signal. We have not yet motivated why this is a good basis for our analysis. It turns out that we can extract these coefficients through the integrals:
\begin{align} A_{0} & =\frac{1}{P} \int_{0}^{P} f(t) d t \tag{10.7}\\ A_{n} & =\frac{2}{P} \int_{0}^{P} f(t) \cos \left(\frac{2 \pi n t}{P}\right) d t, \quad n \geq 1 \tag{10.8}\\ B_{n} & =\frac{2}{P} \int_{0}^{P} f(t) \sin \left(\frac{2 \pi n t}{P}\right) d t, \quad n \geq 1 \tag{10.9} \end{align}
Given Equation 10.6, the above equations can be derived by taking advantage of the properties of the sine and cosine function. We can understand A_{0} as capturing the mean of the function, whereas the remaining terms capture the amplitude of the signal for the periods \frac{P}{n}. We can think of the coefficients A_{n} and B_{n} as the values of our function in the frequency domain. That is \tilde{f}\left(k=\frac{n}{P}\right)=\left[A_{n}, B_{n}\right]. This is a little informal at the moment; making this statement more formal requires embedding our function in the complex plane.
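The coefficient integrals in Equations 10.7–10.9 can be approximated numerically. The sketch below builds a signal from known harmonics on a uniform grid over one period (the grid size and test amplitudes are illustrative choices) and recovers its coefficients; for a uniform grid, (1/P) times the integral over a full period is well approximated by the sample mean:

```python
import numpy as np

P, N = 1.0, 4096
t = np.arange(N) / N * P          # uniform grid over one period [0, P)
f = 1.0 + 2.0 * np.cos(2 * np.pi * 3 * t / P) + 0.5 * np.sin(2 * np.pi * 1 * t / P)

A0 = f.mean()                     # Equation 10.7: the mean of the signal

def A(n):                         # Equation 10.8
    return 2.0 * np.mean(f * np.cos(2 * np.pi * n * t / P))

def B(n):                         # Equation 10.9
    return 2.0 * np.mean(f * np.sin(2 * np.pi * n * t / P))

assert np.isclose(A0, 1.0)        # constant offset
assert np.isclose(A(3), 2.0)      # cosine amplitude at n = 3
assert np.isclose(B(1), 0.5)      # sine amplitude at n = 1
assert np.isclose(A(2), 0.0)      # absent harmonics integrate to ~0
```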
So why did we pick sines and cosines as the basis for our expansion? Most importantly, because it can be shown that the series expansion in Equation 10.6 converges to the function under relatively weak assumptions about the function (the exact assumptions required depend on the exact criteria for convergence). Another way to phrase this is that our transformation is invertible: we can convert to and from the coefficients of the Fourier series and the values of our function. The second reason is orthogonality: the sine and cosine basis functions are mutually orthogonal, which is what makes the coefficient formulas in Equation 10.7, Equation 10.8, and Equation 10.9 so simple.
Before we introduce the Fourier transform, we will want to write our Fourier series in a slightly less intuitive form:
\begin{align*} f(t) & =\sum_{n=0}^{\infty}\left(A_{n} \cos \left(\frac{2 \pi n t}{P}\right)+B_{n} \sin \left(\frac{2 \pi n t}{P}\right)\right) \tag{10.10}\\ & =\sum_{n=0}^{\infty}\left(A_{n}\left(\frac{e^{2 \pi i n t / P}+e^{-2 \pi i n t / P}}{2}\right)+B_{n}\left(\frac{e^{2 \pi i n t / P}-e^{-2 \pi i n t / P}}{2 i}\right)\right) \tag{10.11}\\ & =\sum_{n=0}^{\infty}\left(\frac{A_{n}-i B_{n}}{2} e^{\frac{2 \pi i n t}{P}}+\frac{A_{n}+i B_{n}}{2} e^{\frac{-2 \pi i n t}{P}}\right) \tag{10.12}\\ & =\sum_{n=-\infty}^{\infty} C_{n} e^{\frac{2 \pi i n t}{P}} \tag{10.13} \end{align*}
where we have:
\begin{align} C_{n} & =\frac{A_{n}-i B_{n}}{2}, \quad n>0 \tag{10.14}\\ C_{n} & =A_{0}, \quad n=0 \tag{10.15}\\ C_{n} & =\frac{A_{|n|}+i B_{|n|}}{2}, \quad n<0 \tag{10.16} \end{align}
This is sometimes called the exponential representation of the Fourier series.
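The correspondence between the trigonometric and exponential coefficients can be checked numerically. This sketch (the test frequency and amplitudes are illustrative) computes C_{3} via Equation 10.17 and compares it against Equations 10.14 and 10.16:

```python
import numpy as np

P, N = 1.0, 4096
t = np.arange(N) / N * P
f = 2.0 * np.cos(2 * np.pi * 3 * t / P) + 0.5 * np.sin(2 * np.pi * 3 * t / P)

A3 = 2.0 * np.mean(f * np.cos(2 * np.pi * 3 * t / P))   # Equation 10.8
B3 = 2.0 * np.mean(f * np.sin(2 * np.pi * 3 * t / P))   # Equation 10.9
C3 = np.mean(f * np.exp(-2j * np.pi * 3 * t / P))       # Equation 10.17

assert np.isclose(C3, (A3 - 1j * B3) / 2)               # Equation 10.14
assert np.isclose(np.mean(f * np.exp(2j * np.pi * 3 * t / P)),
                  (A3 + 1j * B3) / 2)                   # Equation 10.16
```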
Fourier Transform
With this foundation, we can now introduce the central tool of spectral analysis, the Fourier transform. First, let’s note that we can reconstruct the C_{n} terms for the series representation in Equation 10.13 with a similar integral as the A_{n} and B_{n} terms:
\begin{equation*} C_{n}=\frac{1}{P} \int_{0}^{P} f(t) e^{\frac{-2 \pi i n t}{P}} d t \tag{10.17} \end{equation*}
Now imagine that we take P \rightarrow \infty. In other words, let’s imagine that we have observed the function over all time. First, the frequencies we are considering go from \frac{n}{P} \rightarrow k \in \mathbb{R}. This gives us the Fourier transform:
\begin{equation*} \tilde{f}(k)=\int_{-\infty}^{\infty} f(t) e^{-2 \pi i k t} d t \tag{10.18} \end{equation*}
where \tilde{f} is a function over the complex numbers. Note that in this general framing, we don’t need f(t) to be a real-valued function. It can be complex. In practice, though, the signals we measure in the real world are real-valued. For a real-valued function, we have the property that:
\begin{equation*} \tilde{f}(k)=\tilde{f}(-k)^{\dagger} \tag{10.19} \end{equation*}
where \dagger represents the complex conjugate. This is similar to the result we found in Equation 10.14 and Equation 10.16. This means that, on real-valued functions, we only need to calculate the Fourier transform for nonnegative k.
For functions that are continuous and absolutely integrable, the Fourier transform is invertible:
\begin{equation*} f(t)=\int_{-\infty}^{\infty} \tilde{f}(k) e^{2 \pi i k t} d k \tag{10.20} \end{equation*}
Since we can transform between f(t) and \tilde{f}(k), both are valid representations of the underlying function. As we alluded to in the previous section, we can think of f(t) as the time-domain representation of the function and \tilde{f}(k) as the frequency-domain representation of the function. In general, techniques that work in the frequency domain are labeled Spectral Analysis methods.
As we will see in lab, for periodic signals the frequency-domain representation can often be more informative than the time-domain representation. A sine function has infinitely many non-zero values in the time domain, but only a single non-zero term in the frequency domain. Identifying the period of our signal is a simple exercise in the frequency domain and a more challenging problem in the time domain.
Similarly, the frequency-domain representation also allows us to filter our signal to certain frequencies. We can do this through the use of a window function in Fourier space:
\begin{equation*} f_{\tilde{w}}(t)=\int_{-\infty}^{\infty} \tilde{w}(k) \tilde{f}(k) e^{2 \pi i k t} d k \tag{10.21} \end{equation*}
where \tilde{w}(k) is our window function and f_{\tilde{w}}(t) is our filtered function in the time-domain. There are a number of common choices for this filter function:
Delta function - a filter function that has non-zero value for a single frequency:
\tilde{w}(k)= \begin{cases}1 & \text { if } k=k_{w} \tag{10.22}\\ 0 & \text { if } k \neq k_{w}\end{cases}
Tophat filter - a filter function that has non-zero value for a small range of frequencies:
\tilde{w}(k)= \begin{cases}1 & \text { if }\left|k-k_{w}\right| \leq l \tag{10.23}\\ 0 & \text { if }\left|k-k_{w}\right|>l\end{cases}
High-pass filter - a filter function that removes the low-frequency signal:
\tilde{w}(k)= \begin{cases}1 & \text { if }|k| \geq k_{w} \tag{10.24}\\ 0 & \text { if }|k|<k_{w}\end{cases}
Low-pass filter - a filter function that removes the high-frequency signal:
\tilde{w}(k)= \begin{cases}1 & \text { if }|k| \leq k_{w} \tag{10.25}\\ 0 & \text { if }|k|>k_{w}\end{cases}
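As a concrete sketch of Equation 10.21, the snippet below applies a low-pass window (Equation 10.25) in frequency space, using numpy's discrete transform as a stand-in for the integral; the cutoff k_{w} and the test-signal frequencies are illustrative choices:

```python
import numpy as np

T = 1024
t = np.arange(T)
signal = np.sin(2 * np.pi * 5 * t / T)          # slow oscillation we keep
noise = 0.5 * np.sin(2 * np.pi * 200 * t / T)   # fast oscillation we remove

x = signal + noise
xf = np.fft.fft(x)
k = np.fft.fftfreq(T)                           # frequencies in cycles/sample

k_w = 50 / T                                    # low-pass cutoff (assumed value)
w = (np.abs(k) <= k_w).astype(float)            # Equation 10.25
x_filtered = np.fft.ifft(w * xf).real           # Equation 10.21, discretized

assert np.allclose(x_filtered, signal, atol=1e-8)
```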
Discrete Fourier Transform
So far, our discussion of both the Fourier Series and the Fourier Transform has assumed that we have access to the continuous function f(t). In time-series analysis applications, we usually only have access to the function at discrete points in time. That is, we only have \left\{X_{0}, \ldots, X_{T-1}\right\}. This will impose two limits on us: we cannot consider the interval from ( -\infty, \infty ) and we can only sum, not integrate. Therefore, instead of the Fourier transform, we will use the discrete Fourier transform (DFT):
\begin{equation*} \tilde{X}_{n}=\frac{1}{T} \sum_{t=0}^{T-1} X_{t} e^{\frac{-2 \pi i n t}{T}} \tag{10.26} \end{equation*}
Each term in our transform corresponds to frequency k=\frac{n}{T}. This comes paired with the inverse discrete Fourier transform (iDFT):
\begin{equation*} X_{t}=\sum_{n=0}^{T-1} \tilde{X}_{n} e^{\frac{2 \pi i n t}{T}} . \tag{10.27} \end{equation*}
The fact that our DFT has a factor of \frac{1}{T} and our iDFT has no factor out front is a choice of convention. We pick this convention because then our DFT values \tilde{X}_{n} map directly to the discrete version of our Fourier series coefficients C_{n}. Notice that we have arbitrarily set the time difference between t=0 and t=1 to be one unit of time. We may often be interested in expressing our frequencies and times in other units. Thankfully, this conversion is simple. If the difference in time between t and t+1 is P_{s} in our units of interest, then:
\begin{align*} t^{\prime} & =P_{s} t \tag{10.28}\\ k^{\prime} & =\frac{k}{P_{s}} \tag{10.29} \end{align*}
where t^{\prime} and k^{\prime} are our time and frequency in the desired units. k_{s}^{\prime}=\frac{1}{P_{s}} or k_{s}=1 is called the sampling frequency.
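A short sketch of Equations 10.26 and 10.29, assuming a hypothetical sampling interval of P_{s}=0.1 seconds; the signal frequency is an illustrative choice:

```python
import numpy as np

T, P_s = 500, 0.1
t = np.arange(T)
X = np.cos(2 * np.pi * 25 * t / T)        # 25 cycles over the T samples

X_tilde = np.fft.fft(X) / T               # Equation 10.26: note the 1/T convention
k = np.fft.fftfreq(T)                     # frequencies k = n/T (cycles per sample)
k_prime = k / P_s                         # Equation 10.29: cycles per second

peak = np.argmax(np.abs(X_tilde[: T // 2]))
assert peak == 25                                  # peak at n = 25, i.e. k = 25/T
assert np.isclose(k_prime[peak], 25 / (T * P_s))   # 0.5 Hz in physical units
assert np.isclose(abs(X_tilde[peak]), 0.5)         # matches C_n = A/2 for a cosine
```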
At first pass, it looks like all we have done is transform our integral into a sum. In reality, we have made a number of subtle choices. Most notably, unlike our Fourier series, we only sum from t=0 until t=T-1. If we consider k=\frac{n}{T} our frequency, we seem to have made a substantial cut in frequency space. It turns out that if we sample a signal with some frequency k_{s}, then the maximum frequency we can reconstruct is \frac{k_{s}}{2}. This is known as the Nyquist frequency. Understanding this statement more deeply requires a discussion of the Nyquist-Shannon theorem, but we can build the necessary intuition by considering the example of sines. Imagine that we are sampling a function at the sampling frequency k_{s}^{\prime}. Let’s compare two sine functions, one with frequency k^{\prime} and another with frequency k^{\prime}+k_{s}^{\prime}. Since we are sampling the function with frequency k_{s}^{\prime}, we are measuring it at times t^{\prime}=0, \frac{1}{k_{s}^{\prime}}, \ldots, \frac{T-1}{k_{s}^{\prime}}. In other words, we are measuring at times t^{\prime}=\frac{j}{k_{s}^{\prime}} for some integer j. So:
\begin{align*} \sin \left(2 \pi\left(k^{\prime}+k_{s}^{\prime}\right) t^{\prime}\right) & =\sin \left(2 \pi \frac{k^{\prime} \cdot j}{k_{s}^{\prime}}+2 \pi \frac{k_{s}^{\prime} \cdot j}{k_{s}^{\prime}}\right) \tag{10.30}\\ & =\sin \left(2 \pi \frac{k^{\prime} \cdot j}{k_{s}^{\prime}}+2 \pi j\right) \tag{10.31}\\ & =\sin \left(2 \pi \frac{k^{\prime} \cdot j}{k_{s}^{\prime}}\right) \tag{10.32}\\ & =\sin \left(2 \pi k^{\prime} t^{\prime}\right) \tag{10.33} \end{align*}
where we arrived at the second-to-last line by taking advantage of the periodicity of the sine function. That means that our samples would have identical values for a sine function at frequency k^{\prime} or k^{\prime}+k_{s}^{\prime}. This same proof would work for any integer multiple of k_{s}^{\prime}, so we can only uniquely determine our Fourier transform in some interval of length k_{s}^{\prime}. In the units of k this is just 1. Therefore, our DFT sums from n=0, \cdots, T-1 since this corresponds to k=0, \cdots, \frac{T-1}{T}.
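This aliasing argument is easy to demonstrate numerically; the sampling frequency and signal frequency below are illustrative values:

```python
import numpy as np

k_s = 10.0                    # sampling frequency (assumed value)
k = 3.0                       # signal frequency, below the Nyquist frequency k_s / 2
j = np.arange(100)
t = j / k_s                   # sample times: integer multiples of 1/k_s

low = np.sin(2 * np.pi * k * t)
aliased = np.sin(2 * np.pi * (k + k_s) * t)

# Equations 10.30-10.33: the two sines are indistinguishable at these samples.
assert np.allclose(low, aliased)
```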
To get to the Nyquist frequency, \frac{k_{s}^{\prime}}{2}, we use that our observations X_{t} are real. Therefore we know that:
\begin{equation*} \tilde{X}_{-n}^{\dagger}=\tilde{X}_{n} . \tag{10.34} \end{equation*}
Our example with the sines has shown that our DFT coefficients are periodic. In other words:
\begin{align*} \tilde{X}_{T-n} & =\tilde{X}_{-n} \tag{10.35}\\ & =\tilde{X}_{n}^{\dagger} \tag{10.36} \end{align*}
Therefore the second half of the values in our interval are required to be the complex conjugate of the first half, meaning that we only really have control over T // 2+1 terms, where // is integer division. Many implementations of the real Fourier transform will only return the first T // 2+1 terms for this reason.
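For example, numpy's real-input transform returns exactly T // 2 + 1 coefficients; the sketch below also checks the conjugate symmetry of Equation 10.36 directly on the full DFT of a random real signal:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 64
X = rng.standard_normal(T)    # any real-valued signal

full = np.fft.fft(X)
half = np.fft.rfft(X)         # real-input transform: only the independent terms

assert len(half) == T // 2 + 1
assert np.allclose(half, full[: T // 2 + 1])

# Equation 10.36: the upper half mirrors the conjugate of the lower half.
for n in range(1, T // 2):
    assert np.isclose(full[T - n], np.conj(full[n]))
```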
Short-Time Fourier Transform
Throughout this discussion, we have been treating the amplitude of our Fourier series, or equivalently the terms of our Fourier transform, as time-independent. Sometimes, it can make more sense to model the data as a periodic signal modulated by a time-dependent amplitude. Essentially, rather than considering the Fourier transform of the full dataset, we take the Fourier transform of our data on discrete time intervals. Mathematically, we can express this through a window function in the time domain. In the continuous case this is:
\begin{equation*} \tilde{f}_{w}(k)=\int_{-\infty}^{\infty} w(t) f(t) e^{-2 \pi i k t} d t \tag{10.37} \end{equation*}
This is known as the short-time Fourier transform (STFT). The form of the window function can vary, but the easiest choice to understand is a top-hat window function:
w(t)= \begin{cases}1 & \text { if }\left|t-t_{w}\right| \leq l \tag{10.38}\\ 0 & \text { if }\left|t-t_{w}\right|>l\end{cases}
This picks out a window in time on which to do the Fourier transform, just like the top-hat filter in the frequency space picked out a window of frequencies. If we label our filters by t_{w}, then we can say that our full set of Fourier transform coefficients are:
\begin{equation*} \tilde{f}\left(k, t_{w}\right)=\int_{-\infty}^{\infty} w_{t_{w}}(t) f(t) e^{-2 \pi i k t} d t \tag{10.39} \end{equation*}
So long as our function is well-behaved and we make sensible choices for our filtering function w and our sampling scheme for t_{w}, the STFT is invertible (see the overlap-add method). As we will see in lab, this can often reveal more meaningful structure in our data than a simple discrete Fourier transform. However, because we are effectively running a DFT on smaller subsets of the data, we are also reducing the number of frequencies we are sensitive to.
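A minimal STFT can be written directly from the definition: slide a top-hat window along the signal and transform each segment. The window length, hop size, and test signal below are all illustrative choices, not prescribed by the text:

```python
import numpy as np

T, seg = 2048, 256
t = np.arange(T)
# A signal whose frequency switches halfway through: 10 cycles per
# segment-length in the first half, 40 in the second.
x = np.where(t < T // 2,
             np.sin(2 * np.pi * 10 * t / seg),
             np.sin(2 * np.pi * 40 * t / seg))

# Non-overlapping top-hat windows (Equation 10.38), one DFT per window.
starts = range(0, T - seg + 1, seg)
stft = np.array([np.fft.rfft(x[s:s + seg]) / seg for s in starts])

# The dominant frequency bin per window tracks the change over time,
# which a single DFT of the full signal would blur together.
peaks = np.abs(stft).argmax(axis=1)
assert all(p == 10 for p in peaks[: len(peaks) // 2])
assert all(p == 40 for p in peaks[len(peaks) // 2:])
```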
Additional Spectral Analysis Tools
The Fourier transform discussion in this lecture only scratches the surface of what can be done in spectral analysis. Here are a few other concepts for further reading:
Lomb–Scargle periodogram and non-uniform discrete Fourier transform
These techniques allow us to measure the signal’s representation in the frequency domain when the sampling of our signal is not uniform (i.e., the time interval between X_{t} and X_{t-1} is not necessarily equal to the time interval between X_{t^{\prime}} and X_{t^{\prime}-1}).
Bartlett’s method, Welch’s method, and multitaper
These techniques work on uniform data but attempt to account for the limitations of empirical measurements (i.e., noise).
ARMA
While we didn’t have time to cover it in this lecture, there is actually a connection between ARMA modeling and spectral analysis. While it is a little difficult to parse, a discussion can be found in Shumway and Stoffer.