Audio Processing in Matlab Simulink
2006
About Me
Experiences:
1993-1995: The MathWorks, Inc.
1995-now: CS Dept., Tsing Hua Univ., Taiwan
Research interests
Speech/Audio Signal Processing, Fuzzy Logic, Neural Networks, Pattern Recognition, Biometric Identification, Document Classification, Web-based Technologies
Programming languages:
MATLAB, C, JavaScript, VBScript, Perl
2010/8/26
Outline
Wave file manipulation
Reading, writing, recording ...
Time-domain processing
Delay, filtering, sptool
Frequency-domain processing
Spectrogram
Pitch determination
Auto-correlation, SIFT, AMDF, HPS ...
Others
Formant estimation, speech coding
Toolbox/Blockset Used
MATLAB
Simulink
Signal Processing Toolbox
DSP Blockset
MATLAB Primer
Before you start, you need to be familiar with MATLAB. Please read the MATLAB Primer at the following page: http://neural.cs.nthu.edu.tw/jang/demo/demoDownload.asp
Exercise:
1. Plot the two curves y=sin(2*t) and y=cos(3*t) in the same figure.
2. Plot x vs. y, where x=sin(2*t) and y=cos(3*t).
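One possible way to approach the two plotting exercises is sketched below. The range of t (0 to 2*pi) and the number of points are assumptions, since the slide does not specify them.

```matlab
% Exercise sketch: two curves in one figure, then a parametric (x vs. y) plot.
t  = linspace(0, 2*pi, 500);     % assumed range and resolution for t
y1 = sin(2*t);
y2 = cos(3*t);

figure;
subplot(2, 1, 1);
plot(t, y1, t, y2);              % both curves against t in the same axes
legend('sin(2t)', 'cos(3t)');
xlabel('t');

subplot(2, 1, 2);
plot(y1, y2);                    % x = sin(2t) against y = cos(3t)
xlabel('x = sin(2t)');
ylabel('y = cos(3t)');
axis equal;
```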
Exercise
1. Plot the waveform of rrrrr.wav. Use MATLAB's zoom button to find where the consecutive rolled R sounds occur.
2. Plot the two-channel waveform in flanger.wav.
Example (wavPlay01.m)
[y, fs] = wavread('rrrrr.wav');   % read the samples and the sample rate
wavplay(y, fs);                   % play the signal back
Exercise
Example:
Frame-based operation!
Exercise:
Create a model as shown above.
Solution
Solution to the previous exercise: slWavFilePlay01.mdl
Example (wavWrite01.m)
[y, fs] = wavread('rrrrr.wav');
wavwrite(y, fs*1.2, 8, 'testout.wav');   % 8-bit output, sample rate raised by 20%
!start testout.wav
Exercise
Try out the above example.
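wavread and wavwrite are no longer available in recent MATLAB releases, where audioread and audiowrite take their place. The sketch below repeats the idea of wavWrite01.m with those functions. It is not the original script: the 440 Hz test tone (standing in for rrrrr.wav), the 8000 Hz sample rate, and the temporary output file are all assumptions made so the example is self-contained.

```matlab
% Write a tone with a sample rate 1.2 times the true rate, then read it back.
fs = 8000;                           % assumed sample rate (Hz)
t  = (0:fs-1)' / fs;                 % one second of samples
y  = 0.5 * sin(2*pi*440*t);          % 440 Hz test tone standing in for rrrrr.wav

outFile = [tempname '.wav'];         % hypothetical temporary output path
audiowrite(outFile, y, round(fs*1.2), 'BitsPerSample', 8);

[y2, fs2] = audioread(outFile);
durationIn  = numel(y)  / fs;        % length of the original signal in seconds
durationOut = numel(y2) / fs2;       % same samples played on a faster clock
fprintf('fs: %d -> %d Hz, duration: %.3f -> %.3f s\n', fs, fs2, durationIn, durationOut);
% sound(y2, fs2);                    % uncomment to listen
delete(outFile);
```

Because the samples are unchanged and only the stored sample rate grows, playback is shorter and the pitch rises by the same factor of 1.2.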
Example
1. Go ahead and try the WinXP recording utility!
2. Try wavRecord01.m
3. Try slWavFileRecord01.mdl
Exercise:
Try out the above examples.
Exercise:
Try wavPlay01.m and trace the code. Create wavPlay02.m such that you can record your own voice on the fly.
Synthetic Sounds
Use a sine wave generator (in the DSP Blockset) to produce sounds
Single frequency:
Multiple frequencies:
Amplitude modulation:
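The three kinds of synthetic sound can also be sketched in plain MATLAB, outside Simulink. The sample rate, the frequencies (440 Hz, 660 Hz, and a 4 Hz envelope), and the scaling factors are assumptions chosen only for illustration.

```matlab
fs = 8000;                                   % assumed sample rate (Hz)
t  = (0:2*fs-1)' / fs;                       % two seconds of samples

% Single frequency: one sine wave.
single = sin(2*pi*440*t);

% Multiple frequencies: sum of two sine waves, scaled to stay within [-1, 1].
multi = (sin(2*pi*440*t) + sin(2*pi*660*t)) / 2;

% Amplitude modulation: a slow 4 Hz envelope applied to a 440 Hz carrier.
envelope = 1 + 0.5*sin(2*pi*4*t);
am = envelope .* sin(2*pi*440*t) / 1.5;      % 1.5 is the largest envelope value

% sound(am, fs);                             % uncomment to listen
```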
Exercise:
Solution
Solution to the previous exercise: sineSource01, sineSource02, sineSource03
Delay in Speech/Audio
What is a delay in a signal?
y(n) --> y(n-k)
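A delay of k samples shifts the signal later in time by k/fs seconds. A minimal sketch on a toy signal follows; the value k = 3 and the input 1..8 are assumptions chosen to make the shift easy to see.

```matlab
k = 3;                                        % delay in samples (assumed value)
y = (1:8)';                                   % toy input signal y(n)

% Direct shift: y(n-k), with zeros filling the first k samples.
yDelayed = [zeros(k, 1); y(1:end-k)];

% The same delay written as the FIR filter z^-k.
yFilter = filter([zeros(1, k) 1], 1, y);

disp([y yDelayed yFilter]);                   % columns: input, shifted, filtered
```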
[Figure: delay block z^-k]
Simulink model:
Exercise:
Create the above model.
[Figure: delay block z^-k with output y(n)]
Simulink model:
Exercise:
Create the above model and change some parameters to see their effects.
Modify the model to take microphone input (so you can start singing karaoke now!).
Use a configurable subsystem to include all possible input files and the microphone. (See next page.)
2. Get a Configurable Subsystem block.
3. Fill in the dialog box with the library name.
Audio Flanging
Flanging sound:
A sound similar to that of a jet plane flying overhead, or a "whooshing" sound
Pitch modulation due to a variable delay
Simulink demo:
dspafxf.mdl (all platforms)
dspafxf_nt.mdl (for 95/98/NT)
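The variable-delay idea behind flanging can be sketched in a few lines of MATLAB. This is not the dspafxf demo; it is a minimal illustration in which the test tone, the 3 ms maximum delay, the 0.5 Hz sweep rate, and the mix level are all assumed values.

```matlab
fs = 8000;                             % assumed sample rate (Hz)
t  = (0:fs-1)' / fs;                   % one second of samples
x  = sin(2*pi*440*t);                  % test tone standing in for real audio

maxDelay = round(0.003 * fs);          % 3 ms maximum delay (assumed)
lfoRate  = 0.5;                        % delay sweep rate in Hz (assumed)
depth    = 0.7;                        % mix level of the delayed copy (assumed)

% Delay in samples, sweeping slowly between 0 and maxDelay.
d = round((maxDelay/2) * (1 + sin(2*pi*lfoRate*t)));

y = x;
for n = 1:numel(x)
    m = n - d(n);                      % index of the delayed sample
    if m >= 1
        y(n) = x(n) + depth * x(m);    % original plus variably delayed copy
    end
end
y = y / (1 + depth);                   % keep the output within [-1, 1]
% sound(y, fs);                        % uncomment to listen
```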
Audio Flanging
Simulink model:
Original spectrogram:
Modified spectrogram:
Speech Production
How is speech produced?
Speech is produced when air is forced from the lungs through the vocal cords (glottis) and along the vocal tract.
Two important characteristics of the model are the fundamental (pitch) frequency (f0) and the formants (F1, F2, F3, ...).
Zoom in
Overlap Frame
Spectrogram
Spectrogram (specgram.m) displays short-time frequency contents:
Waveform:
Spectrogram:
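To make "short-time frequency contents" concrete, the sketch below builds a spectrogram by hand: the signal is cut into overlapping frames, each frame is windowed, and the FFT magnitude of each frame becomes one column of the image. It does not call specgram.m. The chirp test signal, frame size, and hop size are assumptions.

```matlab
fs = 8000;                                       % assumed sample rate (Hz)
t  = (0:fs-1)' / fs;
x  = sin(2*pi*(500*t + 750*t.^2));               % chirp rising from 500 Hz to 2000 Hz

frameSize = 256;                                 % samples per frame (assumed)
hop       = 128;                                 % 50% overlap between frames (assumed)
win = 0.5 - 0.5*cos(2*pi*(0:frameSize-1)'/frameSize);   % Hann window, written out
nFrames = floor((numel(x) - frameSize) / hop) + 1;

S = zeros(frameSize/2 + 1, nFrames);
for i = 1:nFrames
    seg = x((i-1)*hop + (1:frameSize)') .* win;  % one windowed frame
    X = fft(seg);
    S(:, i) = abs(X(1:frameSize/2 + 1));         % keep 0 Hz up to fs/2
end

f      = (0:frameSize/2)' * fs / frameSize;      % frequency axis (Hz)
tFrame = ((0:nFrames-1) * hop + frameSize/2) / fs;   % frame centers (s)
imagesc(tFrame, f, 20*log10(S + eps));
axis xy;
xlabel('Time (s)');
ylabel('Frequency (Hz)');
```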
Real-time Spectrogram
Try dspstfft_win32:
Spectrum:
Spectrogram:
Spectrogram Reading
http://cslu.cse.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html
Waveform:
Spectrogram:
Frequency-domain:
Cepstrum (Noll 1964)
Harmonic product spectrum (Schroeder 1968)
Others:
SIFT (Simple inverse filter tracking)
Maximum likelihood
Neural network approach
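As an illustration of one frequency-domain method, here is a minimal sketch of the harmonic product spectrum: the magnitude spectrum is compressed by factors 1, 2, 3 and the results are multiplied, so that the harmonics line up at the fundamental. The synthetic four-harmonic frame, the FFT length, the number of compressed spectra, and the 50-500 Hz search range are assumptions.

```matlab
fs = 8000;                                  % assumed sample rate (Hz)
f0 = 200;                                   % true pitch of the synthetic frame (assumed)
n  = (0:1023)';
x  = zeros(size(n));
for h = 1:4
    x = x + sin(2*pi*h*f0*n/fs);            % four equal-amplitude harmonics
end

N   = 4096;                                 % zero-padded FFT length (assumed)
win = 0.5 - 0.5*cos(2*pi*n/numel(n));       % Hann window, written out
X   = abs(fft(x .* win, N));
X   = X(1:N/2);                             % keep bins 0 .. N/2-1

nHarm = 3;                                  % number of spectra multiplied together
len   = floor(numel(X) / nHarm);
hps   = ones(len, 1);
for r = 1:nHarm
    Xr  = X(1:r:end);                       % spectrum compressed by factor r
    hps = hps .* Xr(1:len);
end

lo = ceil(50 * N / fs) + 1;                 % search from 50 Hz ...
hi = floor(500 * N / fs) + 1;               % ... up to 500 Hz
[~, idx] = max(hps(lo:hi));
pitchHz = (idx + lo - 2) * fs / N;          % convert the 1-based index back to Hz
fprintf('Estimated pitch: %.1f Hz\n', pitchHz);
```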
[Figure: a frame s(k), its shifted copy s(k-L) for L=30, and the autocorrelation x(L); the pitch period is marked at lag 30]
Exercise:
Construct the above model and try it.
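The autocorrelation method, in which x(L) measures how well s(k) matches its shifted copy s(k-L), can also be sketched in plain MATLAB. The synthetic two-harmonic frame, the maximum lag, and the 500 Hz upper pitch limit are assumptions; a real speech frame would replace s.

```matlab
fs = 8000;                                    % assumed sample rate (Hz)
f0 = 200;                                     % true pitch of the synthetic frame (assumed)
n  = (0:511)';
s  = sin(2*pi*f0*n/fs) + 0.5*sin(2*pi*2*f0*n/fs);   % frame with two harmonics

maxLag = 200;                                 % largest lag examined (assumed)
acf = zeros(maxLag + 1, 1);
for L = 0:maxLag
    acf(L + 1) = sum(s(1:end-L) .* s(1+L:end));     % x(L): sum of s(k) * s(k-L)
end

minLag = round(fs / 500);                     % skip lags that imply a pitch above 500 Hz
[~, idx] = max(acf(minLag + 1:end));
pitchPeriod = idx + minLag - 1;               % lag of the strongest peak, in samples
pitchHz = fs / pitchPeriod;
fprintf('Pitch period: %d samples (%.1f Hz)\n', pitchPeriod, pitchHz);
```

The search starts at minLag because x(0) is always the largest value and would otherwise be picked as the peak.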
Formant Analysis
Characteristics of formants:
Formants are perceptually defined.
The corresponding physical property is the set of resonance frequencies of the vocal tract.
Formant analysis is useful because the positions of the first two formants largely identify a vowel.
Computation methods:
Peak picking on the smoothed spectrum
Peak picking on the LP spectrum
Factoring for the LP roots
Fitting of mixture of Gaussians
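The "factoring for the LP roots" approach can be sketched as follows: fit a linear-prediction (LP) polynomial to a frame, find its roots, and convert the root angles to frequencies. To keep the example self-contained, the frame is the impulse response of a synthetic all-pole filter with assumed resonances at 700 Hz and 1200 Hz (bandwidth 80 Hz); the LP coefficients are obtained by solving the autocorrelation normal equations directly rather than by a toolbox call.

```matlab
fs = 8000;                                    % assumed sample rate (Hz)
F  = [700 1200];                              % assumed formant frequencies (Hz)
B  = 80;                                      % assumed formant bandwidth (Hz)
r  = exp(-pi*B/fs);                           % pole radius for that bandwidth
poles = r * exp(1j*2*pi*F/fs);
aTrue = real(poly([poles conj(poles)]));      % synthetic all-pole "vocal tract"
x = filter(1, aTrue, [1; zeros(511, 1)]);     % impulse response used as the frame

p = 4;                                        % LP order: two poles per formant
R = zeros(p + 1, 1);
for k = 0:p
    R(k + 1) = sum(x(1:end-k) .* x(1+k:end)); % autocorrelation at lag k
end
a = [1; -(toeplitz(R(1:p)) \ R(2:p+1))];      % LP polynomial from the normal equations

z = roots(a);
z = z(imag(z) > 0);                           % keep one root per conjugate pair
formants = sort(angle(z) * fs / (2*pi));      % root angle converted to Hz
disp(formants');
```

On real speech the frame would normally be windowed and the order p raised, and roots with very wide bandwidths would be discarded before being read as formants.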
Formant Analysis
Track Draw:
A package for formant synthesis with options to sketch formant tracks on a spectrogram.
http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html
Conclusions
Ideal tools for speech/audio signal processing:
MATLAB
Simulink
Signal Processing Toolbox
DSP Blockset
Advantages:
Reliable functions: well-established and tested
Visible graphical algorithm design tools
High-level programming language yet C-compatible
Powerful visualization capabilities
References
[1] Discrete-Time Processing of Speech Signals, by Deller, Proakis and Hansen, Prentice Hall, 1993
[2] Fundamentals of Speech Recognition, by Rabiner and Juang, Prentice Hall, 1993
[3] Effects Explained, http://www.harmonycentral.com/Effects/effects-explained.html
[4] TrackDraw, http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html
[5] Speech Coding in MATLAB, http://www.eas.asu.edu/~speech/education/educ1.html