+++
title = 'Sound processing'
template = 'page-math.html'
+++
# Sound processing
Problems with sound capture:
1. Acquired signals are very noisy
2. Context information is hidden

How do we process sound to classify and extract information?

Basic features of sound:

- Volume (air pressure) or loudness (dB): amplitude of the wave
- Frequency (Hz) or pitch: frequency of the wave

## Periodic signal: Fourier

When the signal is sinusoidal, it’s simple to calculate the frequency: it is the inverse of the period, $f = 1/T$.

But if it’s not sinusoidal, what do you do? Analyse the frequency spectrum. Enter Fourier.

Fourier: almost every periodic signal can be broken down into a sum of sinusoidal waves with different frequencies and amplitudes.

Instead of representing the signal amplitude as a function of time, represent it as a function of frequency.

![screenshot.png](2f8ad86778ebb0b91e9ebc527decb0d4.png)

Then you end up with a Fourier series: a sum of simple sinusoidal waves with frequencies $k f_{0}$, amplitudes $A_{k}$ and phase shifts $\phi_{k}$:

$x(t) = A_{0} + \sum_{k=1}^N A_{k} \sin (2 \pi k f_{0} t + \phi_{k})$

The periodic signal has a frequency spectrum made up of various harmonics:

![screenshot.png](8ecb6e39f786a6738ceaea52c1640948.png)

The component frequencies are integer multiples of the fundamental frequency $f_{0}$; these components are called harmonics.

You can calculate the amplitudes $A_{k}$, as a vector, with an algorithm called FFT (Fast Fourier Transform).
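
As a rough illustration of the Fourier series above, the sketch below builds a signal from three harmonics and then recovers their amplitudes from the samples. It uses a naive O(N²) discrete Fourier transform written with the standard library rather than an actual FFT; an FFT computes the same values, only faster. The function names and all numeric values (`FS`, `N`, `F0`, `HARMONICS`) are assumptions chosen for the example, not part of the notes.

```python
import cmath
import math


def synthesize(t, a0, f0, harmonics):
    """Evaluate x(t) = A0 + sum_k A_k * sin(2*pi*k*f0*t + phi_k).

    harmonics is a list of (A_k, phi_k) pairs for k = 1, 2, ...
    """
    return a0 + sum(
        a * math.sin(2 * math.pi * k * f0 * t + phi)
        for k, (a, phi) in enumerate(harmonics, start=1)
    )


def dft(samples):
    """Naive discrete Fourier transform (same result as an FFT, but O(N^2))."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


def amplitudes(samples):
    """One-sided amplitude spectrum: A_0 at index 0, then bins 1 .. N/2 - 1."""
    n = len(samples)
    spectrum = dft(samples)
    amps = [abs(spectrum[0]) / n]                            # DC component
    amps += [2 * abs(c) / n for c in spectrum[1 : n // 2]]   # remaining bins
    return amps


# Illustrative values (assumptions for this example only).
FS = 64   # sampling rate in Hz
N = 64    # number of samples: one second of signal, so bin k sits at k Hz
F0 = 4    # fundamental frequency in Hz
HARMONICS = [(1.0, 0.0), (0.5, math.pi / 4), (0.25, 0.0)]

samples = [synthesize(i / FS, 0.3, F0, HARMONICS) for i in range(N)]
amps = amplitudes(samples)

for k in (0, 4, 8, 12):
    print(f"{k:2d} Hz: amplitude {amps[k]:.3f}")
```

The harmonics were placed at 4, 8 and 12 Hz so that each one falls exactly on a frequency bin; a component that falls between bins would spread its energy over neighbouring bins.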

You put in the vector of samples and the number of samples $N$, and you get out a vector of $N$ complex values whose magnitudes give the amplitudes.

- The first element is the DC component, with frequency 0
- Only the first half of the vector is useful: for a real-valued signal, the second half mirrors the first

Formulas:

<table>
<tr><td>Frequency step</td>
<td>Frequency of the k-th amplitude</td>
<td>Nyquist frequency</td>
<td>Last useful amplitude</td>
</tr>
<tr>
<td>$\Delta f = \frac{F_s}{N}$</td>
<td>$f_{k} = k \Delta f = \frac{kF_{s}}{N}$</td>
<td>$F_{s}/2$</td>
<td>$f_{N/2} = \frac{N}{2} \Delta f = \frac{F_{s}}{2}$</td>
</tr>
</table>

Nyquist frequency ($f_c$): the maximum frequency that can be detected using FFT; it is half the sampling rate $F_s$.

## Not periodic: short-time analysis
Some sound signals, such as speech, are periodic only for a very short time.

![screenshot.png](5a9081f841b448d241811917f4eea3e3.png)![screenshot.png](fb0360fdcbdf2c0fa8c15ce7ddbe6670.png)

Cut the speech into segments (frames). Then you can apply FFT to each of those pieces.
This is called segmentation or windowing.

### Spectrogram
The frequency spectrum varies over time.

A spectrogram is a graph with time on the x-axis, frequency on the y-axis, and colour showing the amplitude of each frequency.

![screenshot.png](fe629573739f7ff022dd7c5ae666c281.png)

![screenshot.png](e90248e66991c5183a713e851b9fbda8.png)

### Digital filtering
Time domain: moving average filter

Frequency domain:

- Low-pass: allow only frequencies below a cutoff
- High-pass: allow only frequencies above a cutoff
- Band-pass: allow only a certain frequency band
- Band-reject (notch filter): allow everything but a certain frequency band

To filter in the frequency domain: sample the signal, compute its spectrum using FFT, set to zero the portions of the spectrum that are just noise, and apply the inverse FFT to synthesise the improved signal.
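
The two filtering approaches can be sketched as follows. This is a minimal illustration, assuming a naive O(N²) DFT and inverse DFT from the standard library in place of a real FFT (the results are the same, only slower), and assuming the "noise" is a single 20 Hz tone added to a 3 Hz signal. The helper names `low_pass` and `moving_average`, the cutoff, and the sampling values are all hypothetical choices for the example.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse transform (stand-in for an inverse FFT)."""
    n = len(spectrum)
    return [
        sum(c * cmath.exp(2j * math.pi * k * i / n) for k, c in enumerate(spectrum)) / n
        for i in range(n)
    ]


def low_pass(samples, fs, cutoff_hz):
    """Frequency-domain low-pass: zero all bins above cutoff_hz, then resynthesise."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins above N/2 mirror the lower half, so bin k has frequency min(k, n-k) * fs / n.
        freq = min(k, n - k) * fs / n
        if freq > cutoff_hz:
            spectrum[k] = 0
    # The input was real, so the imaginary parts are only rounding error.
    return [value.real for value in idft(spectrum)]


def moving_average(samples, width):
    """Time-domain smoothing: mean of each sample and up to width-1 samples before it."""
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - width + 1) : i + 1]
        out.append(sum(window) / len(window))
    return out


# Illustrative values (assumptions for this example only).
FS = 64   # sampling rate in Hz
N = 64    # number of samples
clean = [math.sin(2 * math.pi * 3 * i / FS) for i in range(N)]         # 3 Hz signal
noise = [0.4 * math.sin(2 * math.pi * 20 * i / FS) for i in range(N)]  # 20 Hz "noise"
noisy = [c + d for c, d in zip(clean, noise)]

filtered = low_pass(noisy, FS, cutoff_hz=10)
max_error = max(abs(f - c) for f, c in zip(filtered, clean))
smoothed = moving_average(noisy, width=4)
```

Zeroing bins works cleanly here because both tones fall exactly on frequency bins. A high-pass, band-pass or band-reject filter follows the same pattern with a different condition on `freq`.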