# Spectral Analysis ### Will Styler - LIGN 168 --- ### Today's Plan - What are complex sounds? - Spectral Analysis and the Fourier Transform - The Time-Frequency Tradeoff --- ### Complex Sounds are everywhere! - As we said, pure sine waves are incredibly uncommon in nature - So, we need another way to characterize these sounds beyond just 'Duration, Amplitude, Period, Frequency, Phase' --- ### Speech is very spectrally complex - The voice generates a 'fundamental' frequency and harmonics - The most important features are not single frequencies, but distributions of energy across frequencies - Just looking at the waveform is not (generally) enough ---
--- ### Clearly, we need to do better than waveforms --- # Spectral Analysis! --- ### How do we find the many components of a complicated signal? --- ## Fourier Analysis Using a Mathematical Process called a Fourier Transform which breaks a signal down into its component frequencies at a certain time, with their phase --- ### We're going to 'blackbox' Fourier Analysis - We're not diving deep on the math - We're not going to go into the difference between Discrete Fourier Analysis and Fast Fourier Analysis - We're just going to tell you what it is, how it works, and what it does --- ### The Core Use of Fourier Analysis - "What frequencies make up this chunk of this signal, and what are their amplitudes and phases?" - We have a signal in the *time domain*, but we want to see it in the *frequency domain* - **Fourier analysis for speech is generally done on a *window* (chunk) of the signal** - This is a "Short Term Fourier Transform (STFT)" --- ### Windowing
--- ### Windowing
--- ### Then we break the wave inside the window apart into components! ---
---
---
--- ## FFT from the middle of "Noise"
--- ### Fourier Transforms show us the component frequencies - It shows us the components in a *given window of the signal* - This is *much* better information for analyzing speech - So, we're going to use them all the time here --- ### Let's walk through this process in more detail --- ### Sound W - 100 Hz - Waveform
--- ### Sound W - 100 Hz - Spectrum
--- ### Sound T - 500 Hz - Waveform
--- ### Sound T - 500 Hz - Spectrum
--- ### Sound F - 1000 Hz - Waveform
--- ### Sound F - 1000 Hz - Spectrum
--- ### What if we were to play these all at once? --- ### Sound WTF - Waveform
--- ### Sound WTF - Spectrum
--- ### That's much easier to cope with! --- ### Let's see how it works in the real world --- ### ... but what about time? - Spectra only show one 'moment' of the signal, captured by a small window --- ### "Noise" - Waveform
--- ### "Noise" - Spectral Slice (FFT)
--- ### How can we look at an entire word? --- ### Question: Why not just use a window which is the length of the entire signal? --- ## Spectrogram Displays a series of FFTs, with peaks arrayed vertically, to show changes in frequency over time --- ### "Noise" - Waveform
--- ### "Noise" - Spectral Slice (FFT)
--- ### "Noise" - Spectrogram
--- ### Sound WTF - Waveform
--- ### Sound WTF - Spectrum
--- ### Sound WTF - Spectrogram
--- ### ... but this is a constant signal, what about changing signals? - Now, we have the sound varying over time - Each *window* will look a little different from the last one - This leads us to a problem called the **Time-Frequency Tradeoff** --- ### The Time-Frequency Tradeoff - "You can have detailed information about changes over time, or detailed information about frequency, **but never both at once**" --- ### Imagine your project group is stalking Will - *Please don't* - You want to know how often I go to Vons - So, you talk to your friendly neighborhood data broker --- ### The SuperStalker Pro Packages - Hourly purchase count for one day - Daily purchase count for two weeks - Weekly walk-in count for three months - Monthly walk in count for a year - **Which gives the best temporal data? Which gives the best frequency data?** --- ### Smaller windows mean better temporal resolution - You'll get more details about changes in frequency over time, but frequency itself is blurrier - This produces a "Broadband" spectrogram --- ### Larger windows mean better frequency resolution - You'll get more details about the exact frequencies present, but it's harder to see the temporal boundaries and changes - This produces a "Narrowband" spectrogram --- ### Smaller Window (Broadband)
--- ### Larger Window (Narrowband)
--- ### Smaller Window (Broadband)
--- ### Larger Window (Narrowband)
--- ### Smaller Window (Broadband)
--- ### Larger Window (Narrowband)
--- ### This forces you to make choices - "Does time or frequency matter more here?" - You can't have both! --- ### Let's have a bit of spectrogram fun --- ### You can have fun on your own - SpectrumView on iOS, Spectroid on Android - https://musiclab.chromeexperiments.com/Spectrogram/ -
- Praat (which you should already have installed) --- ### We're not the only people using Spectrograms - Folks doing Sonar and Submarine Detection love them!
--- ### Here's a nice video on Fourier Transforms
"But what is the Fourier Transform? A visual introduction" by 3Blue1Brown --- ### Our cochleas are doing something like fourier analysis!
--- ### There are other ways to break signals into component parts - **Discrete Cosine Transform (DCT)** approximates the signal as a series of summed cosines - "I don't care what's really there, make me a version from fewer components" - DCT is great for *compression*, more later - **Mel-Frequency Cepstral Coefficients (MFCC)** break the signal down into an opaque matrix of numbers which express changes over time - 'Turn the signal into an opaque matrix of numbers which characterizes it' - More on this later! - **Functional Principal Component Analysis (fPCA)** allows you to find dominant patterns of variation in curves - ... but it's pretty inelegant for periodic signals --- ### We're going to spend a *ton* of time in the Spectral Domain - Most things that matter in speech show up as variations of frequency and amplitude over time - Nearly any approach to ASR or TTS will need to make use of frequency-domain information - ... and regardless, we'll need fourier analysis to evaluate the results - We'll be looking at spectrograms a fair bit this quarter! --- ### Wrapping Up - Complex waves have more than one frequency component - Fourier Analysis helps us understand the frequency components of complex signals - There's an inherent tradeoff between time and frequency knowledge in fourier analysis - Spectral Analysis is *wildly useful* for everything we're doing here --- ### For Next Time - We'll look at what speech looks like, and think a bit more about speech acoustics ---
Thank you!