Overview of spectral analysis
What do ghosts, Fourier analysis and landscape structure all have in common? Honestly…not much, but I’m here to talk to you about most of those topics anyway! Enter spectral analysis 👻.
Spectral analysis is the process of taking complex signals (e.g., sound waves, images, or even landscape composition via rasters) and decomposing them into periodic functions (i.e., sine and cosine waves) that can be more easily analyzed. Although there are many uses for spectral analysis, in this blog post I will just examine one particular use in the context of ecology: analyzing landscape structure. I probably don’t need to tell you, dear reader, that habitat (and thereby landscape structure) is critical to an animal’s ability to thrive and reproduce. As such, it’s fundamental for wildlife managers to understand how landscape structure (and our decisions to manage or alter that structure) may or may not impact the wild populations we care about!
Motivation aside, to do spectral analysis we need a landscape. I’ll quickly go ahead and generate one with a handy function that generates landscapes as Gaussian random fields (GRFs).
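The generator itself is not reproduced here, so the following is only a minimal sketch of one way to get a GRF-like landscape in base R. It smooths white noise with a Gaussian kernel (using an FFT-based circular convolution, so the field wraps around at the edges). The helper name `make_grf()` and its arguments are hypothetical, and this is not necessarily the method used for the landscapes in this post.

```r
# Hypothetical helper: simulate a periodic Gaussian random field on an n x n grid
# by smoothing white noise with a Gaussian kernel whose width is set by `a` (in cells).
make_grf <- function(n = 100, a = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  noise <- matrix(rnorm(n * n), n, n)
  # Wrapped distance (in cells) from the origin along one axis
  d <- pmin(0:(n - 1), n - 0:(n - 1))
  # Gaussian smoothing kernel: larger `a` means a wider kernel and larger patches
  kernel <- outer(d, d, function(dr, dc) exp(-(dr^2 + dc^2) / (2 * a^2)))
  # Circular convolution via the convolution theorem (R's inverse FFT is unnormalised)
  field <- Re(fft(fft(noise) * fft(kernel), inverse = TRUE)) / (n * n)
  # Standardise to mean 0 and standard deviation 1
  (field - mean(field)) / sd(field)
}

grf_a_10 <- make_grf(n = 100, a = 10, seed = 42)
dim(grf_a_10)
```

Calling `image(grf_a_10)` is a quick way to eyeball the result. Because the convolution is circular, the left and right (and top and bottom) edges match up, which real landscapes will not do.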
Let’s start off by generating a few landscapes with different range parameters (a).
Note that I assumed we were only working with square matrices here…this is almost never the case in real data, but it’s an easy fix and makes things look neater for demonstration. Let’s go ahead and generate (and visualize) what the parameter `a` does…
OK great. That was a lot of work just to look at some GRFs…but hey–that’s ok.
OK so a couple of things to notice here. Remember the value `a` from the previous section? Just by inspection we can see that the landscape is showing larger (more distinct) patches as the `a` parameter increases. In fancier terms, landscape ecologists might say that the landscape is becoming more **autocorrelated**.
OK. OK. I hear what you’re thinking: “Alex, this blog post is about spectral analysis…and so far all we’ve done is plot some fancy looking fake lands…”, and you’d be correct! That is in fact all we’ve done, but now that we have some fake landscapes with varying degrees of structure (another way of talking about how autocorrelated your landscape is) we can start to analyze them…this time I promise.
To do spectral analysis let’s start with a concept…Fourier transforms. I’m not scared…are you scared?…(ok I’m a little scared). So the big idea behind the Fourier transform is that we can take a signal (again, could be sound, habitat quality, or information compression in computer files) and decompose it into a set of basis functions. In essence what we’re doing is taking the original data and conducting what’s called a change of basis (more on that here, or you could check out Khan Academy). In short, a change of basis changes a dataset from one coordinate system to another one. In our case we take a system of coordinates from our landscape (could be real coordinates or the imaginary coordinates of the landscape I made up) and correlate each coordinate pair to a value in the frequency domain. Let’s look at an example to make this more concrete.
Remember our handy landscapes from before?
A couple of things to notice here. The object `freq_a_10` is an object of size 100 × 100…the same dimensions as our GRF landscape `grf_a_10`!! Coincidence…? Of course not. By the way, since you’re about to run into some math, let’s refer to the landscape `grf_a_10` as $ L $ from here on out, deal? Under the hood the function `fft()` (which stands for fast Fourier transform, by the way) from the R package stats is doing a change of basis from our original space (habitat values on a coordinate grid) to a totally different space (frequencies on a separate grid of indices with the same dimensions). The long and short of it is that each cell of `freq_a_10` holds the correlation between the landscape’s structure and a pair of cosine and sine functions.
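To make the "same dimensions" point concrete, here is a small sketch using a stand-in landscape of random values (an assumption, not the GRF from this post; the names `L` and `freq` are just illustrative). It also peeks at the very first cell of the transform, which for R's unnormalised `fft()` is simply the sum of all the habitat values.

```r
set.seed(1)
# Stand-in landscape: a 100 x 100 matrix of habitat values (not the post's GRF)
L <- matrix(rnorm(100 * 100), nrow = 100, ncol = 100)

# For a matrix, stats::fft() computes the two-dimensional discrete Fourier transform
freq <- fft(L)

dim(freq)         # same shape as the landscape
is.complex(freq)  # every cell is a complex number

# The first cell is the zero-frequency term: the sum of all habitat values
all.equal(Re(freq[1, 1]), sum(L))
```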
Let’s zoom in a minute:
OK, this is weird, right? Why is there an i at the end of the number? And why are there two numbers here if it’s just a correlation between two things? Well, the answer comes from the Fourier transform itself. Let’s introduce $ \phi(r,c) $ (a basis function), which correlates our landscape index $ (r,c) $ (e.g., (1,0), (53, 15), …) to a value in the frequency space I mentioned earlier.
$$ \phi(r, c) = e^{2\cdot \pi \cdot i \cdot (\frac{kr}{N} + \frac{lc}{M})} \qquad (1) $$
Here $ k $ and $ l $ are the row and column indices of the cell in frequency space, and $ N $ and $ M $ are the number of rows and columns in the landscape.
So this gets me my correlating value in frequency space…but how do we get those values we saw in the object `freq_a_10`? Well I’m glad you asked…let’s calculate just one of those values as an example.
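As a rough illustration of that calculation, the sketch below builds the basis function from (1) for one frequency index on a small stand-in landscape (random values on an 8 × 8 grid, which is an assumption made to keep things quick) and compares the hand-rolled sum against `fft()`. One thing to watch for: R's `fft()` uses a negative exponent, so the hand-rolled value built from (1) comes out as the complex conjugate of the `fft()` value (same real part, opposite sign on the imaginary part).

```r
set.seed(1)
N <- 8
M <- 8
L <- matrix(rnorm(N * M), N, M)  # stand-in landscape, not the post's GRF

k <- 2  # frequency index along rows (zero-based)
l <- 3  # frequency index along columns (zero-based)
row_idx <- 0:(N - 1)
col_idx <- 0:(M - 1)

# Basis function from equation (1), evaluated at every cell (r, c)
phi <- outer(row_idx, col_idx,
             function(rr, cc) exp(2 * pi * 1i * (k * rr / N + l * cc / M)))

# "Correlate" the landscape with the basis function
manual <- sum(L * phi)

# R indices are one-based, so frequency (k, l) lives at [k + 1, l + 1]
from_fft <- fft(L)[k + 1, l + 1]

# Equal once we account for the sign convention in fft()
all.equal(Conj(manual), from_fft)
```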
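OK now if you and I haven’t completely lost our sanity…these values should match. Let’s check the value returned by `fft()`, -671.53 - 1029.25i, against the value we just calculated, -671.53 + 1029.25i. Bingo! Same real part and same magnitude; the only difference is the sign of the imaginary part. That is because `fft()` uses a negative exponent, so its output is the complex conjugate of what we get from (1). So behind the scenes we’re just getting correlations between our landscape and the basis function. In math this means (the bar denotes the complex conjugate):
$$ F(k, l) = \sum_{r=0}^{N-1} \sum_{c=0}^{M-1} L(r, c) \cdot \overline{\phi(r, c)} $$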
And remember from Euler’s Formula we can write (1) as:
$$ \phi(r, c) = e^{2 \cdot \pi \cdot i \cdot (\frac{kr}{N} + \frac{lc}{M})} = \cos(2 \cdot \pi \cdot (\frac{kr}{N} + \frac{lc}{M})) + i \sin (2 \cdot \pi \cdot (\frac{kr}{N} + \frac{lc}{M})) $$
So, our landscape can be written as a decomposition of periodic basis vectors!! Yippee! This is cool…but what can we actually do with this? I’m glad you asked, dear reader. One thing we can now do is numerically quantify the spatial structure of our landscape (or whatever else you personally happen to be interested in). Let’s learn about the utility of what we just did with one spectral measure of landscape structure: the spectral centroid.
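If the "decomposition" claim feels abstract, a quick sanity check is to go there and back again. The sketch below (stand-in random landscape; the variable names are illustrative) checks Euler's formula numerically and then rebuilds the landscape from its frequency representation with the inverse transform.

```r
# Euler's formula: exp(i * theta) = cos(theta) + i * sin(theta)
theta <- 0.7
all.equal(exp(1i * theta), complex(real = cos(theta), imaginary = sin(theta)))

set.seed(1)
L <- matrix(rnorm(16 * 16), 16, 16)  # stand-in landscape

L_hat <- fft(L)  # forward transform: landscape -> frequency space

# Inverse transform: frequency space -> landscape.
# R's inverse is unnormalised, so divide by the number of cells.
L_back <- Re(fft(L_hat, inverse = TRUE)) / length(L)

all.equal(L, L_back)  # the waves add back up to the original landscape
```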
The Spectral Centroid
OK so now we have a frequency-domain representation of our landscape. Each cell `F[k,l]` in `freq_a_10` tells us how strongly the landscape correlates with a wave that completes k cycles north-south and l cycles east-west. But we want a single number that summarises the dominant spatial scale, and this is where the spectral centroid comes in. Unfortunately, to discuss the spectral centroid we’re going to have to introduce more actors into this play.
Remember how each of the components in our frequency matrix was an ugly complex number with two parts (a real one and an imaginary one)? Yeah…let’s fix that. To do that I’ll introduce the power spectrum: the squared magnitude of each element of the frequency matrix from `fft()`:
The function `Mod()` collapses the complex number to a single amplitude. Squaring gives us power. Let’s take a gander at the cell we looked at earlier: 1,510,299. See, now it’s just a nice real number. We like real numbers.
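Here is a small sketch of that step. It plugs in the rounded value quoted above (so the result lands near, but not exactly on, 1,510,299) and then computes the power spectrum for a whole stand-in landscape of random values, which is an assumption for illustration. The last line checks Parseval's relation for R's unnormalised `fft()`: total power equals the number of cells times the sum of squared habitat values.

```r
# The complex value quoted in the text, rounded to two decimals
z <- complex(real = -671.53, imaginary = -1029.25)

# Power = squared magnitude = real part squared + imaginary part squared
Mod(z)^2
all.equal(Mod(z)^2, Re(z)^2 + Im(z)^2)

# The same idea applied to every cell of a frequency matrix
set.seed(1)
L <- matrix(rnorm(8 * 8), 8, 8)  # stand-in landscape
power <- Mod(fft(L))^2

is.numeric(power)  # plain real numbers, no imaginary part left

# Parseval's relation for the unnormalised transform
all.equal(sum(power), length(L) * sum(L^2))
```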
Next we need to know the actual spatial frequency that each [k,l] index corresponds to: how many cycles per spatial unit, rather than per domain. To do that I will introduce the dc component. The term dc actually stands for direct current, borrowed from electrical engineering. In our context it represents the zero-frequency term (with `fft()`, the sum of all cells, i.e., the global mean habitat value across the landscape times the number of cells). It is directly analogous to (but not necessarily equal to) the nugget in a variogram or the intercept in regression analysis. But the dc component…it’s got to go.
Now the `freq_mat` component gets a little confusing. Since we’ve done a change of basis we are no longer working in Euclidean space in the classical sense. We’re instead working in frequency or cycle space. So the Euclidean distance we take, `sqrt(fr^2 + fc^2)`, is a measure of how many cycles distant each cell is from dc. This is our unit of measurement in this space. The radial frequency collapses the 2D frequency grid to a single number per cell: the combined spatial frequency regardless of direction. A wave with k=3, l=4 has the same radial frequency as k=4, l=3: they both complete the same number of cycles per unit, just oriented differently.
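One way to build such a frequency grid is sketched below. The names `fr`, `fc` and `freq_mat` follow the text, but the helper `fft_freqs()` is hypothetical, and I am assuming frequencies measured in cycles per cell with the usual FFT ordering (zero first, then positive frequencies, then negative ones).

```r
N <- 100  # rows
M <- 100  # columns

# Hypothetical helper: signed frequency (cycles per cell) for each FFT index.
# Indices past the halfway point correspond to negative frequencies.
fft_freqs <- function(n) {
  j <- 0:(n - 1)
  ifelse(j <= n / 2, j, j - n) / n
}

fr <- fft_freqs(N)  # frequencies along rows (north-south)
fc <- fft_freqs(M)  # frequencies along columns (east-west)

# Radial frequency: Euclidean distance of each cell from dc in frequency space
freq_mat <- sqrt(outer(fr^2, fc^2, "+"))

freq_mat[1, 1]  # dc sits at zero frequency

# k = 3, l = 4 and k = 4, l = 3 share a radial frequency (indices are one-based in R)
all.equal(freq_mat[4, 5], freq_mat[5, 4])
freq_mat[4, 5]  # 5 cycles per 100 cells
```

Dropping dc then amounts to zeroing the `[1, 1]` cell of the power matrix before taking any weighted averages.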
Now the spectral centroid (i.e., the power-weighted mean frequency):
$$ \omega = \frac{\sum_{k,l} f_{kl} \cdot P_{kl}}{\sum_{k,l} P_{kl}} $$
Here $ \omega $ is the term I’m using for the spectral centroid metric. Let’s go from the numerator to the denominator. In the numerator, $ \sum_{k,l} f_{kl} \cdot P_{kl} $, the sum tells us to repeat the operation across all of the frequency-space indices $ k,l $. At each of these indices we weight the frequency $ f_{kl} $ by its corresponding power $ P_{kl} $. Remember that $ P_{kl} $ is the squared magnitude of the corresponding cell of the correlation (frequency) matrix. The denominator, $ \sum_{k,l} P_{kl} $, is the total power, so dividing by it turns the weighted sum into a weighted mean.
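To see the formula behave, it helps to feed it a landscape where we already know the answer. The sketch below is an illustration rather than the GRF workflow from this post: it builds a pure sine wave that repeats every 20 cells north-south, so the centroid should land at 1/20 = 0.05 cycles per cell. The variable names are my own.

```r
n <- 100
rows <- matrix(0:(n - 1), n, n)  # row index of every cell (filled column-wise)
L <- sin(2 * pi * rows / 20)     # wave repeating every 20 cells north-south

# Power spectrum, with the dc term removed
power <- Mod(fft(L))^2
power[1, 1] <- 0

# Radial frequency (cycles per cell) of every cell in the frequency grid
j <- 0:(n - 1)
f1 <- ifelse(j <= n / 2, j, j - n) / n
freq_mat <- sqrt(outer(f1^2, f1^2, "+"))

# Spectral centroid: the power-weighted mean frequency
omega <- sum(freq_mat * power) / sum(power)

omega      # approximately 0.05 cycles per cell
1 / omega  # approximately 20 cells per cycle
```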
In short what we did:
1. `fft()`: correlate the landscape against every basis wave simultaneously.
2. `Mod()^2`: how strong is each correlation? (power)
3. `freq_mat`: what frequency does each correlation correspond to? (ordered from zero outward)
4. `sum(f * p) / sum(p)` (or `f_centroid`): where is the centre of mass of those correlations on the frequency axis? (Note we are talking about how many times the dominant pattern repeats per cell.)
5. `1 / f_centroid`: convert to wavelength. `f_centroid` gives us a frequency, but remember that we’re interested in the scale of landscape autocorrelation, so we need to convert it to how many spatial units it takes to complete a cycle.
The final conversion from frequency to wavelength ($ \lambda = 1/\omega $) gives us a spatial scale in the same units as the landscape. Let’s visualise what we just computed, the full radial power spectrum with the centroid marked:
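The plotting code is not shown here, so what follows is just one plausible way to get a radial power spectrum: average the power within rings of similar radial frequency. The stand-in landscape is white noise, and the choice of 20 bins and the names `radial`, `f_centroid` and `wavelength` are assumptions for the sketch.

```r
set.seed(1)
n <- 64
L <- matrix(rnorm(n * n), n, n)  # stand-in landscape (white noise)

# Power spectrum without the dc term
power <- Mod(fft(L))^2
power[1, 1] <- 0

# Radial frequency of each cell (cycles per cell)
j <- 0:(n - 1)
f1 <- ifelse(j <= n / 2, j, j - n) / n
freq_mat <- sqrt(outer(f1^2, f1^2, "+"))

# Group cells into 20 rings of similar radial frequency
breaks <- seq(0, max(freq_mat), length.out = 21)
ring <- cut(as.vector(freq_mat), breaks = breaks, include.lowest = TRUE)

# Mean frequency and mean power within each ring
radial <- data.frame(
  freq  = as.vector(tapply(as.vector(freq_mat), ring, mean)),
  power = as.vector(tapply(as.vector(power), ring, mean))
)

# Centroid and its wavelength, for marking on the plot
f_centroid <- sum(freq_mat * power) / sum(power)
wavelength <- 1 / f_centroid

# To draw it:
# plot(radial$freq, radial$power, type = "l", log = "y",
#      xlab = "Radial frequency (cycles per cell)", ylab = "Mean power")
# abline(v = f_centroid, lty = 2)
```

For white noise the spectrum is roughly flat, so the centroid sits near the middle of the frequency range; a strongly autocorrelated landscape piles its power up near zero and pulls the centroid left.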
OK great, so for this particular GRF landscape this plot tells us that it takes 14.09 units to complete a cycle on average. But let’s jump back to the original problem.
Did we make a way to do spectral analysis?
We wanted a property that would tell us the scale of landscape structure regardless of the actual structure itself…let’s see if we’ve actually done that. Let’s go ahead and write a convenient function to summarise the steps we just took to calculate the spectral centroid.
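A minimal sketch of what such a wrapper could look like is below. The function name `spectral_centroid()` and its return format are hypothetical, not the function used for the results in this post. Unlike the square-matrix shortcut from earlier, this version builds separate row and column frequencies, so rectangular landscapes work too.

```r
# Hypothetical wrapper: spectral centroid (cycles per cell) and the matching
# wavelength (cells per cycle) for a numeric matrix.
spectral_centroid <- function(landscape) {
  stopifnot(is.matrix(landscape), is.numeric(landscape))

  # Signed FFT frequencies (cycles per cell) for an axis of length n
  signed_freq <- function(n) {
    j <- 0:(n - 1)
    ifelse(j <= n / 2, j, j - n) / n
  }

  # Power spectrum with the dc term removed
  power <- Mod(fft(landscape))^2
  power[1, 1] <- 0

  # Radial frequency of every cell in the frequency grid
  freq_mat <- sqrt(outer(signed_freq(nrow(landscape))^2,
                         signed_freq(ncol(landscape))^2, "+"))

  # Power-weighted mean frequency (NaN for a perfectly flat landscape)
  f_centroid <- sum(freq_mat * power) / sum(power)
  c(frequency = f_centroid, wavelength = 1 / f_centroid)
}

# Usage: a 100 x 50 landscape with a wave repeating every 25 cells east-west
cols <- matrix(0:49, nrow = 100, ncol = 50, byrow = TRUE)
res <- spectral_centroid(sin(2 * pi * cols / 25))
res
```

On this test wave the wavelength should come back as roughly 25 cells, which is a handy check before pointing the function at messier landscapes.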
Here comes the kind of underwhelming finale.
Great! So it’s admittedly not a perfect metric (there’s some scatter about the one-to-one line), but considering the messiness of the dataset we are getting a pretty good idea of the scale of the landscape in a way that’s independent of the structure and values present!
References & Sources
Pebesma, E. & Bivand, R. (2023). Spatial Data Science: With Applications in R. Chapman and Hall. Boca Raton FL, United States. https://doi.org/10.1201/9780429459016.
Taboga, Marco (2021). “Discrete Fourier Transform - Frequencies”, Lectures on matrix algebra. https://www.statlect.com/matrix-algebra/discrete-Fourier-transform-frequencies.
Taboga, Marco (2021). “Discrete Fourier transform”, Lectures on matrix algebra. https://www.statlect.com/matrix-algebra/discrete-Fourier-transform.
Taboga, Marco (2021). “Change of basis”, Lectures on matrix algebra. https://www.statlect.com/matrix-algebra/change-of-basis.