Audio Processing in Matlab Simulink

2006
Speech/Audio Signal Processing in MATLAB/Simulink

J.-S. Roger Jang, CS Dept., Tsing Hua Univ., Taiwan
http://www.cs.nthu.edu.tw/~jang [email protected]
About Me
Experience:
1993-1995: The MathWorks, Inc.
1995-now: CS Dept., Tsing Hua Univ., Taiwan
Research interests:
Speech/Audio Signal Processing, Fuzzy Logic, Neural Networks, Pattern Recognition, Biometric Identification, Document Classification, Web-based Technologies
Programming languages:
MATLAB, C, JavaScript, VBScript, Perl
2010/8/26
Outline
Wave file manipulation
Reading, writing, recording ...
Time-domain processing
Delay, filtering, sptool
Frequency-domain processing
Spectrogram
Pitch determination
Auto-correlation, SIFT, AMDF, HPS ...
Others
Formant estimation, speech coding
Toolbox/Blockset Used
MATLAB
Simulink
Signal Processing Toolbox
DSP Blockset
MATLAB Primer
Before you start, you need to get familiar with MATLAB. Please read the MATLAB Primer at the following page:
http://neural.cs.nthu.edu.tw/jang/demo/demoDownload.asp
Exercise:
1. Plot two curves, y=sin(2*t) and y=cos(3*t), in the same figure.
2. Plot x vs. y, where x=sin(2*t) and y=cos(3*t).
To Read a Wave File

To read an MS .wav file (PCM format only): wavread
y = wavread(file)
[y, fs, nbits] = wavread(file)
[y, fs, nbits, opts] = wavread(file)
[...] = wavread(file, n): read only the first n samples
[...] = wavread(file, [n1, n2]): read only samples n1 through n2
If the wav file is stereo, y will be a two-column matrix.
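Later MATLAB releases replace wavread and wavwrite with audioread and audiowrite. The following is a minimal sketch, assuming those newer functions are available; it writes a synthetic stereo tone to a temporary file (the name from tempname is only a placeholder) and reads it back to show the two-column result. The tone frequencies and sampling rate are illustrative assumptions.

```matlab
% Sketch: round-trip a synthetic stereo signal through a WAV file.
% Assumption: audiowrite/audioread exist in your release; on the older
% releases used in these slides, wavwrite/wavread play the same role.
fs = 8000;                                    % sampling rate in Hz (assumed)
t  = (0:fs-1)'/fs;                            % one second of time stamps
x  = 0.5*[sin(2*pi*440*t), sin(2*pi*660*t)];  % two columns = stereo

wavFile = [tempname '.wav'];                  % hypothetical temporary file
audiowrite(wavFile, x, fs);                   % note: file name comes first
[y, fsRead] = audioread(wavFile);             % y is N-by-2 for a stereo file
delete(wavFile);

fprintf('Read %d samples x %d channels at %d Hz\n', ...
        size(y,1), size(y,2), fsRead);
```

Note that the argument order of audiowrite differs from wavwrite(y, fs, nbits, wavefile).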
To Read a Wav File

Example (wavRead01.m):
[y, fs] = wavread('singapore.wav');
plot((1:length(y))/fs, y);
xlabel('Time in seconds');
ylabel('Amplitude');
Exercise
1. Plot the waveform of rrrrr.wav. Use MATLAB's zoom button to find where the consecutive rolled R sounds occur.
2. Plot the two-channel waveform in flanger.wav.
Solution to the Previous Exercise

wavRead02.m:
[y, fs] = wavread('flanger.wav');
subplot(2,1,1), plot((1:length(y))/fs, y(:,1));
subplot(2,1,2), plot((1:length(y))/fs, y(:,2));
To Play Wav Files

To play sound using the Windows audio output device: wavplay, sound, soundsc
wavplay(y, fs)
wavplay(y, fs, 'async'): non-blocking call
wavplay(y, fs, 'sync'): blocking call
sound(y, fs)
soundsc(y, fs): autoscale the sound
Example (wavPlay01.m)
[y, fs] = wavread('rrrrr.wav');
wavplay(y, fs);
Exercise
Follow the example to play flanger.wav.
To Read/Play Using DSP Blocks

To read/play sound using DSP Blockset:
DSP Blockset/DSP Sources/From Wave File
DSP Blockset/DSP Sinks/To Wave Device
Example:
[Figure: Simulink model using the two blocks above]
Frame-based operation!
Exercise:
Create a model as shown above.
Solution
Solution to the previous exercise: slWavFilePlay01.mdl
To Write a Wave File

To write MS wave files: wavwrite
wavwrite(y, fs, nbits, wavefile)
nbits must be 8 or 16.
y must have two columns for stereo data.
Amplitude values outside [-1,1] are clipped.
Example (wavWrite01.m)
[y, fs] = wavread('rrrrr.wav');
wavwrite(y, fs*1.2, 8, 'testout.wav');
!start testout.wav
Exercise
Try out the above example.
To Record a Wave File

To record wave files:
1. Use the recording utility under WinXP.
2. Use wavrecord under MATLAB.
3. Use From Wave Device under Simulink, under DSP Blockset/Platform Specific IO/Windows (Win32).
Example
1. Go ahead and try the WinXP recording utility!
2. Try wavRecord01.m
3. Try slWavFileRecord01.mdl
Exercise:
Try out the above examples.
Time-Domain Speech Signals

A typical time-domain plot of speech signals:
Amplitude: volume or intensity
Frequency: pitch
Changing Wave Playback Param.

To control the play of a sound:
Normal: wavplay(y, fs)
High volume: wavplay(2*y, fs)
Low volume: wavplay(0.5*y, fs)
High pitch (and faster): wavplay(y, 1.2*fs)
Low pitch (and slower): wavplay(y, 0.8*fs)
Exercise:
Try wavPlay01.m and trace the code. Create wavPlay02.m such that you can record your own voice on the fly.
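The link between playback rate, duration, and pitch can be checked with a little arithmetic. This is a small sketch using a synthetic tone; the values of fs and f0 are assumptions chosen for illustration, not taken from the course files.

```matlab
% Sketch: how the playback rate changes duration and perceived pitch.
fs = 8000;                          % original sampling rate in Hz (assumed)
f0 = 200;                           % frequency of a test tone in Hz (assumed)
y  = sin(2*pi*f0*(0:2*fs-1)/fs);    % two seconds of tone

rateFactor = [1.0 1.2 0.8];         % normal, faster/higher, slower/lower
duration   = length(y) ./ (rateFactor*fs);   % playback time in seconds
heardPitch = f0 * rateFactor;                % pitch scales with the rate

for i = 1:numel(rateFactor)
    fprintf('factor %.1f: %.3f s, %.0f Hz\n', ...
            rateFactor(i), duration(i), heardPitch(i));
end
% sound(y, rateFactor(2)*fs) would play the faster, higher-pitched version.
```

Because the same samples are simply sent out at a different rate, pitch and duration always change together with this method.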
Time-Domain Signal Processing

Take-home exercise:
How can you raise the pitch while keeping the same duration?
Synthetic Sounds
Use a sine wave generator (under DSP Blockset) to produce sounds
Single frequency:
Multiple frequencies:
Amplitude modulation:
Exercise:
Create the above models.
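The three kinds of synthetic sound can also be generated at the MATLAB command line. This is a minimal sketch, not the Simulink models themselves; the frequencies, modulation depth, and scaling factors are assumptions.

```matlab
% Sketch: single tone, two-tone mixture, and amplitude modulation.
fs = 8000;                                 % sampling rate in Hz (assumed)
t  = (0:fs-1)/fs;                          % one second

single   = sin(2*pi*440*t);                                % one frequency
multiple = (sin(2*pi*440*t) + sin(2*pi*660*t)) / 2;        % two frequencies
am       = (1 + 0.5*sin(2*pi*5*t)) .* sin(2*pi*440*t) / 1.5;  % 5 Hz tremolo

% Each signal is scaled to stay within [-1, 1] so it can be played as is,
% for example with sound(am, fs).
```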
Solution
Solution to the previous exercise:
sineSource01
sineSource02
sineSource03
Delay in Speech/Audio
What is a delay in a signal?
y(n) --> y(n-k)
What effects can delay generate?

Echo
Reverberation
Chorus
Flanging
Single Delay in Audio Signal

Block diagram:
[Figure: input u(n) is delayed by k samples (z^-k), scaled by gain a, and added to the direct path]
Output: y(n) = u(n) + a*u(n-k)
Simulink model:
Exercise:
Create the above model.
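Outside Simulink, the single-delay equation y(n) = u(n) + a*u(n-k) can be sketched in a few lines. A unit impulse is used as input so the echo is easy to see; the delay time and gain are assumed values.

```matlab
% Sketch: single echo, y(n) = u(n) + a*u(n-k).
fs = 8000;                       % sampling rate in Hz (assumed)
k  = round(0.25*fs);             % delay of 0.25 s, in samples (assumed)
a  = 0.5;                        % echo gain (assumed)

u = [1, zeros(1, fs-1)];                 % unit impulse, one second long
delayed = [zeros(1, k), u(1:end-k)];     % u(n-k), zero before the delay
y = u + a*delayed;

% y is nonzero only at n = 1 (direct sound) and n = k+1 (the echo).
```

Replacing u with a recorded signal of the same orientation gives an audible echo.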
Multiple Delay in Audio Signal

How to create karaoke effects:
[Figure: block diagram with input u(n), a feedback path through a delay of k samples (z^-k) with gain a, and output y(n)]
y(n) = u(n) + a*u(n-k) + a^2*u(n-2k) + a^3*u(n-3k) + ...
Simulink model:

Parameter values:
Feedback gain a < 1
Actual delay time = k/fs
Exercise:
Create the above model and change some parameters to see their effects.
Modify the model to take microphone input (so you can start singing karaoke now!).
Use a configurable subsystem to include all possible input files and the microphone. (See next page.)
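The repeated echoes a*u(n-k), a^2*u(n-2k), ... are what a feedback loop y(n) = u(n) + a*y(n-k) produces, and that recursion maps directly onto MATLAB's filter function. This is a sketch under that reading of the block diagram; the parameter values are assumptions.

```matlab
% Sketch: multiple echoes through feedback, y(n) = u(n) + a*y(n-k).
fs = 8000;                       % sampling rate in Hz (assumed)
k  = round(0.2*fs);              % delay in samples, i.e. 0.2 s (assumed)
a  = 0.5;                        % feedback gain, must satisfy a < 1

u = [1, zeros(1, fs-1)];         % unit impulse as test input
den = [1, zeros(1, k-1), -a];    % denominator of 1/(1 - a*z^-k)
y = filter(1, den, u);

% The impulse response shows echoes of size a^m at n = m*k + 1.
echoes = y(1:k:end);
disp(echoes);
```

With a >= 1 the echoes would not decay, which is why the gain is kept below 1.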

How to use a configurable subsystem block:
1. Create a library (say, wavinput.mdl)
2. Get a block of configurable subsystem
3. Fill the dialog box with the library name
Audio Flanging
Flanging sound:
A sound similar to the sound of a jet plane flying overhead, or a "whooshing" sound
Pitch modulation due to a variable delay
Simulink demo:
dspafxf.mdl (all platforms)
dspafxf_nt.mdl (for 95/98/NT)
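A variable delay can be sketched in plain MATLAB by letting the delay length follow a slow sine wave. This is only an illustration of the idea, not the dspafxf demo: it uses integer delays without interpolation, and the sweep rate, maximum delay, and mix gain are assumptions.

```matlab
% Sketch: flanging as y(n) = x(n) + g*x(n - d(n)) with a slowly varying d(n).
fs = 8000;                           % sampling rate in Hz (assumed)
t  = (0:fs-1)/fs;                    % one second
x  = sin(2*pi*440*t);                % test tone (assumed input)

maxDelay = round(0.003*fs);          % up to 3 ms of delay (assumed)
lfoRate  = 1;                        % delay sweeps once per second (assumed)
d = round(maxDelay/2 * (1 + sin(2*pi*lfoRate*t)));   % 0..maxDelay samples
g = 0.7;                             % gain of the delayed copy (assumed)

y = x;
for n = 1:length(x)
    m = n - d(n);                    % index of the delayed sample
    if m >= 1
        y(n) = x(n) + g*x(m);
    end
end
```

Rounding d(n) to whole samples keeps the sketch short; a smoother effect would interpolate between samples.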
Audio Flanging
Simulink model:
Original spectrogram:
Modified spectrogram:
Signal Processing Using sptool

To invoke sptool, type sptool.
Speech Production
How is speech produced?
Speech is produced when air is forced from the lungs through the vocal cords (glottis) and along the vocal tract.
Analogy to System Theory:

Input: air forced into the vocal cords
Output: vibration of the medium (air)
System (or filter): vocal tract
Pitch frequency: frequency of the input
Formant frequency: resonant frequency of the system
Source Filter Model of Speech

The source-filter model of speech production:
Speech is split into a rapidly varying excitation signal and a slowly varying filter.
The envelope of the power spectrum contains the vocal tract information.
Two important characteristics of the model are the fundamental (pitch) frequency (f0) and the formants (F1, F2, F3, ...).
Frame Analysis of Speech Signal

Speech waveform:
[Figure: speech waveform, with a zoomed-in view showing overlapping frames]
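Splitting a signal into overlapping frames can be sketched as follows. The frame size and overlap are assumed values, and a synthetic tone stands in for a speech recording.

```matlab
% Sketch: cut a signal into overlapping frames, one frame per column.
fs = 8000;                               % sampling rate in Hz (assumed)
x  = sin(2*pi*200*(0:fs-1)'/fs);         % one second of test signal (column)

frameSize = 256;                         % samples per frame (assumed)
overlap   = 128;                         % samples shared by neighbors (assumed)
hop       = frameSize - overlap;         % step between frame starts

numFrames = floor((length(x) - overlap) / hop);
frames = zeros(frameSize, numFrames);
for i = 1:numFrames
    start = (i-1)*hop + 1;
    frames(:, i) = x(start : start+frameSize-1);
end

fprintf('%d frames of %d samples\n', numFrames, frameSize);
```

Samples left over at the end that do not fill a whole frame are dropped in this sketch.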
Spectrogram
The spectrogram (specgram.m) displays the short-time frequency content of a signal:
Wave form :
Spectrogram :
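What a spectrogram computes can be sketched with fft alone, without relying on specgram. This is an illustrative short-time Fourier transform on a synthetic tone; the frame size, hop, and Hamming window are assumptions.

```matlab
% Sketch: magnitude spectrogram from windowed frames and fft.
fs = 8000;                               % sampling rate in Hz (assumed)
t  = (0:fs-1)'/fs;
x  = sin(2*pi*1000*t);                   % 1 kHz test tone (assumed input)

frameSize = 256;                         % samples per frame (assumed)
hop       = 128;                         % step between frames (assumed)
win = 0.54 - 0.46*cos(2*pi*(0:frameSize-1)'/(frameSize-1));   % Hamming

numFrames = floor((length(x) - frameSize)/hop) + 1;
S = zeros(frameSize/2 + 1, numFrames);   % rows: frequency, columns: time
for i = 1:numFrames
    idx = (i-1)*hop + (1:frameSize)';
    X = fft(x(idx) .* win);
    S(:, i) = abs(X(1:frameSize/2 + 1)); % keep 0 .. fs/2
end

f = (0:frameSize/2)' * fs / frameSize;   % frequency of each row in Hz
[~, peakBin] = max(S(:, 1));
fprintf('Strongest component in frame 1: %.1f Hz\n', f(peakBin));
% imagesc((0:numFrames-1)*hop/fs, f, 20*log10(S + eps)); axis xy
```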
Real-time Spectrogram
Try dspstfft_win32:
Spectrum:
Spectrogram:
Pitch and Formants

Pitch and formants can be defined visually:
[Figure labels: first formant F1, second formant F2, pitch period = 1/f0]
Spectrogram Reading
Spectrogram Reading
http://cslu.cse.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html
[Figure: waveform and spectrogram, labeled "compute"]
Pitch Determination Algorithms

Time-domain:
Auto-correlation
AMDF (Average Magnitude Difference Function)
Gold-Rabiner algorithm (1969)
Frequency-domain:
Cepstrum (Noll 1964)
Harmonic product spectrum (Schroeder 1968)
Others:
SIFT (Simple inverse filter tracking)
Maximum likelihood
Neural network approach
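As one example from the time-domain group, AMDF looks for the lag at which a frame differs least from a shifted copy of itself. This is a minimal sketch on a synthetic periodic signal; the frame length, lag range, and test signal are assumptions.

```matlab
% Sketch: pitch period via AMDF (Average Magnitude Difference Function).
fs = 8000;                               % sampling rate in Hz (assumed)
period = 40;                             % true period in samples (200 Hz)
n = (0:511)';
s = sin(2*pi*n/period) + 0.5*sin(4*pi*n/period);   % tone plus one harmonic

minLag = 20;                             % search range in samples (assumed)
maxLag = 60;
amdf = zeros(maxLag, 1);
for L = 1:maxLag
    amdf(L) = mean(abs(s(1+L:end) - s(1:end-L)));
end

% The deepest valley inside the search range marks the pitch period.
[~, idx] = min(amdf(minLag:maxLag));
pitchPeriod = idx + minLag - 1;
fprintf('Period %d samples, f0 = %.1f Hz\n', pitchPeriod, fs/pitchPeriod);
```

The search range is kept below two periods here because AMDF also dips at multiples of the true period.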
Autocorrelation of Each Frame

Let s(k) be a frame of size 128.

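[Figure: the frame s(k) and its shifted copy s(k-L) for L=30; only the overlapped part contributes]
x(30) = dot product of the overlapped part = sum(s(31:128).*s(1:98))
Autocorrelation x(L):
[Figure: plot of x(L); the peak at L=30 marks the pitch period]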
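The frame-wise autocorrelation described above can be sketched directly from its definition. The frame is a synthetic sine with a period of 30 samples, chosen to match the lag used on the slide; the search range is an assumption.

```matlab
% Sketch: autocorrelation of one 128-sample frame and its pitch period.
N = 128;                                 % frame size, as on the slide
period = 30;                             % period of the test frame (assumed)
s = sin(2*pi*(0:N-1)'/period);

maxLag = 60;                             % largest lag examined (assumed)
x = zeros(maxLag, 1);
for L = 1:maxLag
    % dot product of the overlapped parts of s(k) and s(k-L)
    x(L) = sum(s(L+1:N) .* s(1:N-L));
end

% Skip very small lags, where x(L) is large simply because L is near 0.
minLag = 20;                             % assumed lower bound of the search
[~, idx] = max(x(minLag:maxLag));
pitchPeriod = idx + minLag - 1;
fprintf('Pitch period: %d samples\n', pitchPeriod);
```

Because fewer samples overlap at larger lags, x(L) shrinks as L grows, which is one reason the search range matters.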
Autocorrelation via DSP Blockset

Real-time autocorrelation demo:
Exercise:
Construct the above model and try it.
Pitch Tracking via Autocorrelation

Real-time pitch tracking via autocorrelation: pitch2.mdl
Formant Analysis
Characteristics of formants:
Formants are perceptually defined.
The corresponding physical property is the resonance frequencies of the vocal tract.
Formant analysis is useful because the positions of the first two formants largely identify a vowel.
Computation methods:
Peak picking on the smoothed spectrum
Peak picking on the LP spectrum
Factoring for the LP roots
Fitting a mixture of Gaussians
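The "factoring for the LP roots" method can be sketched with base MATLAB only. The test signal here is the impulse response of a synthetic all-pole filter with two known resonances, so the result can be checked; the resonance frequencies, pole radius, and model order are assumptions, and real speech would need windowing and a higher order.

```matlab
% Sketch: formant estimation from the roots of a linear prediction polynomial.
fs = 8000;                                   % sampling rate in Hz (assumed)
trueF = [700 1200];                          % resonances of the test filter
r = 0.97;                                    % pole radius (assumed)
p = r * exp(1j*2*pi*trueF/fs);
a = real(poly([p, conj(p)]));                % all-pole "vocal tract"
x = filter(1, a, [1, zeros(1, 1999)]);       % its impulse response

order = 4;                                   % two pole pairs
R = zeros(order+1, 1);
for m = 0:order
    R(m+1) = sum(x(1+m:end) .* x(1:end-m));  % autocorrelation at lag m
end

% Autocorrelation method: solve the normal equations for the LP coefficients.
alpha = toeplitz(R(1:order)) \ (-R(2:order+1));
z = roots([1; alpha]);
z = z(imag(z) > 0);                          % one root per conjugate pair
formants = sort(angle(z) * fs / (2*pi));
disp(formants');
```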
Formant Analysis
Track Draw:
A package for formant synthesis with options to sketch formant tracks on a spectrogram.
http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html
Formant Location Algorithm
MATLAB code by Michelle Jamrozik
http://ece.clemson.edu/speech/files.htm
Speech Waveform Coding

Time domain coding
PCM: Pulse Code Modulation
DPCM: Differential PCM
ADPCM: Adaptive Differential PCM (dspadpcm.mdl)
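The idea behind DPCM, coding the difference from a prediction rather than the sample itself, can be sketched as follows. This is a simplified illustration with a previous-sample predictor and a fixed uniform quantizer; the step size and test signal are assumptions, and ADPCM would additionally adapt the step size.

```matlab
% Sketch: closed-loop DPCM with a previous-sample predictor.
fs = 8000;                               % sampling rate in Hz (assumed)
x  = 0.8*sin(2*pi*200*(0:fs-1)/fs);      % test signal (assumed)
step = 0.02;                             % quantizer step size (assumed)

codes = zeros(size(x));                  % integers that would be transmitted
xr    = zeros(size(x));                  % decoder's reconstruction
pred  = 0;                               % prediction = last reconstructed value
for n = 1:length(x)
    d = x(n) - pred;                     % prediction error
    codes(n) = round(d/step);            % quantize the difference
    xr(n) = pred + step*codes(n);        % what the decoder rebuilds
    pred = xr(n);                        % predictor tracks the decoder
end

fprintf('Largest code magnitude: %d, max error: %.4f\n', ...
        max(abs(codes)), max(abs(x - xr)));
```

Because the encoder predicts from the reconstructed signal, the error never accumulates and stays within half a quantizer step.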
Frequency domain coding

Sub-band coding
Transform coding
Speech Coding in MATLAB
http://www.eas.asu.edu/~speech/education/educ1.html
Conclusions
Ideal tools for speech/audio signal processing:
MATLAB
Simulink
Signal Processing Toolbox
DSP Blockset
Advantages:
Reliable functions: well-established and tested
Visible graphical algorithm design tools
High-level programming language yet C-compatible
Powerful visualization capabilities
Easy debugging
Integrated environment
References
[1] Discrete-Time Processing of Speech Signals, by Deller, Proakis and Hansen, Prentice Hall, 1993
[2] Fundamentals of Speech Recognition, by Rabiner and Juang, Prentice Hall, 1993
[3] Effects Explained, http://www.harmonycentral.com/Effects/effects-explained.html
[4] TrackDraw, http://www.utdallas.edu/~assmann/TRACKDRAW/trackdraw.html
[5] Speech Coding in MATLAB, http://www.eas.asu.edu/~speech/education/educ1.html