Commit 3db7ff5

Add files via upload
1 parent a4802c8 commit 3db7ff5

4 files changed

+350
-0
lines changed

LICENSE

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
Copyright (c) 2022, The MathWorks, Inc.
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. In all cases, the software is, and all modifications and derivatives of the software shall be, licensed to you solely for use in conjunction with MathWorks products and service offerings.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

SECURITY.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
# Reporting Security Vulnerabilities

If you believe you have discovered a security vulnerability, please report it to
[security@mathworks.com](mailto:security@mathworks.com). Please see
[MathWorks Vulnerability Disclosure Policy for Security Researchers](https://www.mathworks.com/company/aboutus/policies_statements/vulnerability-disclosure-policy.html)
for additional information.

convertLibrosaToMATLAB.mlx

283 KB
Binary file not shown.

convertLibrosaToMATLABCode.m

Lines changed: 336 additions & 0 deletions
@@ -0,0 +1,336 @@
%% Convert librosa Audio Feature Extraction To MATLAB
%% This Example Shows How to:
%%
% * Convert librosa Python feature extraction code to MATLAB.
% * Using the MATLAB feature extraction code, translate a Python speech command
% recognition system to a MATLAB system where Python is not required.
%% Overview
% Audio and speech AI systems often include feature extraction. Importing AI
% audio models trained in non-MATLAB frameworks into MATLAB usually consists of
% two steps:
%%
% * Import the pretrained network to MATLAB.
% * Translate the feature extraction performed in the non-MATLAB framework to
% MATLAB code.
%%
% This example focuses on the second step of this process. In particular, you
% learn how to translate librosa feature extraction functions to their MATLAB
% equivalents.
%
% The example covers three of the most popular audio feature extraction algorithms:
%%
% * Short-time Fourier transform (STFT) and its inverse (ISTFT).
% * Mel spectrogram.
% * Mel-frequency cepstral coefficients (MFCC).
%%
% You also leverage the converted feature extraction code to translate a Python
% deep learning speech command recognition system to MATLAB. The Python system
% uses PyTorch for the pretrained network, and librosa for mel spectrogram feature
% extraction.
%% Requirements
%%
% * <https://www.mathworks.com/ MATLAB®> R2021b or later
% * <https://www.mathworks.com/products/audio.html Audio Toolbox™>
% * <https://www.mathworks.com/products/deep-learning.html Deep Learning Toolbox™>
%%
% The Python code uses the following packages:
%%
% * librosa version 0.9.2
% * PyTorch version 1.10.2
%% Mapping librosa Code to MATLAB Code
41+
% STFT and ISTFT
42+
% Execute librosa Code
43+
% You start with translating STFT and ISTFT librosa code to MATLAB.
44+
%
45+
% The Python script <./PythonCode\librosastft.py |librosaSTFT.py|> uses the
46+
% librosa functions |stft| and |istft|.
47+
%
48+
% Inspect the contents of the script.
49+
50+
addpath("PythonCode\")
51+
pythonScript = fullfile(pwd,"PythonCode","librosastft.py");
52+
type(pythonScript)
53+
%%
54+
% You can execute Python scripts and commands from MATLAB. For more information
55+
% about this functionality, see <https://www.mathworks.com/help/matlab/call-python-libraries.html
56+
% Call Python from MATLAB> in the documentation. In this example, you use <https://www.mathworks.com/help/matlab/ref/pyrunfile.html
57+
% pyrunfile> to run a Python script in MATLAB.
58+
%
59+
% Use pyrunfile to call the Python script. Pass the name of the test audio file
60+
% as an input argument. Return variables computed in the Python script to MATLAB
61+
% by specifying them as output arguments.
62+
63+
filename = fullfile(pwd,"samples","yes.flac");
64+
[stftOut1, istftOut1] = pyrunfile(pythonScript,["stftOut","istftOut"],filename=filename);
65+
stftOut1 = single(stftOut1);
66+
istftOut1 = single(istftOut1);
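%%
% If the |pyrunfile| call fails, it can help to confirm which Python installation
% MATLAB is using. The following optional check is a sketch added for illustration
% and is not part of the original example. It relies only on the documented
% |pyenv| function; the variable name |pe| is arbitrary.

pe = pyenv;
fprintf("Python version: %s\nExecutable: %s\n", pe.Version, pe.Executable);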
% Implement Equivalent MATLAB Code
% To perform the equivalent STFT and ISTFT computations in MATLAB, you use the
% MATLAB functions <./HelperFiles\+librosa\stft.m librosa.stft> and <./HelperFiles\+librosa\istft.m
% librosa.istft>. The name-value arguments of these functions match the name-value
% arguments of their librosa counterparts.
%
% Load the sample audio signal in MATLAB.

addpath(fullfile(pwd,"HelperFiles"))
[samples,fs] = audioread(filename);
samples = single(samples);
%%
% Now compute the STFT. Use the same name-value arguments as in the Python script.

stftOut2 = librosa.stft(samples,FFTLength=512,HopLength=160,...
    WindowLength=512,Window="hann",...
    Center=true);
%%
% Compare the Python and MATLAB STFT values by computing the error.

fprintf("STFT error: %f\n", norm(stftOut1(:)-stftOut2(:)));
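%%
% As an optional complement that is not part of the original example, a relative
% error can be easier to interpret than the absolute norm, because it does not
% grow with the signal length or amplitude. This sketch assumes that |stftOut1|
% and |stftOut2| have the same size; the name |stftRelErr| is illustrative.

stftRelErr = norm(stftOut1(:)-stftOut2(:))/norm(stftOut1(:));
fprintf("STFT relative error: %g\n", stftRelErr);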
%%
% Note that calling librosa.stft with no output arguments plots the magnitude
% of the STFT.

figure;
librosa.stft(samples,FFTLength=512,HopLength=160,...
    WindowLength=512,Window="hann",...
    Center=false);
%%
% Now compute the ISTFT in MATLAB by using the same name-value arguments as
% the Python script.

istftOut2 = librosa.istft(stftOut2,FFTLength=512,HopLength=160,...
    WindowLength=512,Window="hann",...
    Center=true);
%%
% Compare the MATLAB and librosa ISTFT values.

figure
subplot(2,1,1)
L = length(istftOut1);
t = (0:L-1)/fs;
plot(t,istftOut1)
grid on
xlabel("Time (s)")
title("librosa")
subplot(2,1,2)
% Build the time axis from the MATLAB output so the plot does not depend on
% both signals having the same length.
t = (0:length(istftOut2)-1)/fs;
plot(t,istftOut2)
grid on
xlabel("Time (s)")
title("MATLAB")
%%
% Compute the error.

fprintf("ISTFT error: %f\n", norm(istftOut1(:)-istftOut2(:)));
% Generate MATLAB Code from librosa.stft and librosa.istft
% To generate MATLAB code that implements librosa's STFT with documented MATLAB
% functions, specify |GenerateMATLABCode=true| in the call to |librosa.stft|. In
% this case, the generated MATLAB code uses the function <https://www.mathworks.com/help/signal/ref/stft.html
% stft>.

out = librosa.stft(samples,FFTLength=512,HopLength=160,...
    WindowLength=512,Window="hann",...
    Center=true,GenerateMATLABCode=true);
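%%
% For orientation, the following is a minimal sketch of how a centered,
% librosa-style STFT might be written directly with the documented |stft| function.
% It is an assumption made for illustration and is not the code that |librosa.stft|
% generates. It assumes a mono signal, reflect padding of FFTLength/2 samples on
% each side (the librosa 0.9.2 default for center=True), and a periodic Hann
% window. The names |nfft|, |hop|, |x|, |padded|, and |stftSketch| are hypothetical.

nfft = 512;
hop = 160;
x = samples(:);
% Mirror the signal around its first and last samples, excluding the edge sample.
padded = [flipud(x(2:nfft/2+1)); x; flipud(x(end-nfft/2:end-1))];
stftSketch = stft(padded,Window=cast(hann(nfft,"periodic"),"like",x),...
    OverlapLength=nfft-hop,FFTLength=nfft,FrequencyRange="onesided");
% Compare with the helper output only if the layouts agree.
if isequal(size(stftSketch),size(stftOut2))
    fprintf("Sketch vs. librosa.stft error: %f\n", norm(stftSketch(:)-stftOut2(:)));
end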
% Mel Filter Bank
% Next, you map librosa's mel filter bank function to MATLAB. Mel filter banks
% are integral to mel spectrograms and MFCC computations.
%
% Inspect the Python script that builds the filter bank.

pythonScript = fullfile(pwd,"PythonCode","librosamel.py");
type(pythonScript)
%%
% Execute the script.

melOut1 = pyrunfile(pythonScript,"melOut");
melOut1 = single(melOut1);
%%
% Use <./HelperFiles\+librosa\mel.m librosa.mel> to construct the same filter
% bank in MATLAB.

melOut2 = librosa.mel(SampleRate=fs,FFTLength=512,NumBands=50,...
    Normalization="Slaney",HTK=true);
%%
% Plot and compare the librosa and MATLAB filter banks.

% Frequencies spaced evenly on the mel scale between 0 Hz and fs/2 (not used in
% the plots below). The limits are converted to mel before spacing them.
Fc = mel2hz(linspace(hz2mel(0),hz2mel(fs/2),50));
figure;
subplot(2,1,1)
plot(melOut1.')
grid on
title("librosa Mel Filter Bank")
xlabel("Frequency Bin #")
subplot(2,1,2)
plot(melOut2.')
grid on
title("MATLAB Mel Filter Bank")
xlabel("Frequency Bin #")
%%
% Compute the error.

fprintf("Mel filter bank error: %f\n", norm(melOut1(:)-melOut2(:)))
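%%
% As background for the |HTK=true| setting, the following sketch computes band
% center frequencies with the HTK mel formula, mel = 2595*log10(1 + f/700). It is
% an illustration rather than the helper implementation, and it assumes the band
% edges span 0 Hz to fs/2. The names |hz2melHTK|, |mel2hzHTK|, |numBands|,
% |bandEdges|, and |centerFreqs| are hypothetical.

hz2melHTK = @(f) 2595*log10(1 + f/700);
mel2hzHTK = @(m) 700*(10.^(m/2595) - 1);
numBands = 50;
% NumBands triangular filters need NumBands+2 edges spaced evenly in mel.
bandEdges = mel2hzHTK(linspace(hz2melHTK(0),hz2melHTK(double(fs)/2),numBands+2));
centerFreqs = bandEdges(2:end-1);
fprintf("First and last band centers: %.1f Hz, %.1f Hz\n",centerFreqs(1),centerFreqs(end));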
%%
% Similar to |librosa.stft| and |librosa.istft|, specify |GenerateMATLABCode=true|
% to generate MATLAB code that uses documented functions. In this case, the generated
% code uses <https://www.mathworks.com/help/audio/ref/designauditoryfilterbank.html
% designAuditoryFilterBank>.

librosa.mel(SampleRate=fs,FFTLength=512,NumBands=50,...
    Normalization="Slaney",HTK=true,...
    GenerateMATLABCode=true);
% Mel Spectrogram
% Next, you map librosa's mel spectrogram function to MATLAB.
%
% Inspect the Python script that computes the mel spectrogram.

pythonScript = fullfile(pwd,"PythonCode","librosamelspectrogram.py");
type(pythonScript)
%%
% Execute the script.

melSpectrogramOut1 = pyrunfile(pythonScript,"melSpectrogramOut",filename=filename);
melSpectrogramOut1 = single(melSpectrogramOut1);
%%
% Use <./HelperFiles\+librosa\melSpectrogram.m librosa.melSpectrogram> to compute
% the same mel spectrogram in MATLAB.

melSpectrogramOut2 = librosa.melSpectrogram(samples,SampleRate=fs,FFTLength=512,NumBands=50,...
    Center=false,HopLength=160,WindowLength=512,Window="hann",...
    Normalization="Slaney", HTK=true, Power=2);
%%
% Compute the error.

fprintf("Mel spectrogram error: %f\n", norm(melSpectrogramOut1(:)-melSpectrogramOut2(:)))
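%%
% Conceptually, a mel spectrogram is the mel filter bank applied to the power
% spectrogram. The sketch below checks that relationship with the helper outputs
% and is not part of the original example. It assumes that |melOut2| is
% NumBands-by-(FFTLength/2+1) and that |librosa.stft| returns a
% (FFTLength/2+1)-by-frames matrix, as the librosa functions do. The names
% |powerSpec| and |melCheck| are illustrative.

powerSpec = abs(librosa.stft(samples,FFTLength=512,HopLength=160,...
    WindowLength=512,Window="hann",...
    Center=false)).^2;
melCheck = melOut2*powerSpec;
% Compare only if the layouts agree.
if isequal(size(melCheck),size(melSpectrogramOut2))
    fprintf("Filter bank product error: %f\n", norm(melCheck(:)-melSpectrogramOut2(:)))
end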
%%
% Similar to other functions, specify |GenerateMATLABCode=true| to generate
% MATLAB code that uses documented MATLAB functions. In this case, the generated
% code uses <https://www.mathworks.com/help/signal/ref/stft.html stft> and <https://www.mathworks.com/help/audio/ref/designauditoryfilterbank.html
% designAuditoryFilterBank>.

librosa.melSpectrogram(samples,SampleRate=fs,FFTLength=512,NumBands=50,...
    Center=false,HopLength=160,WindowLength=512,Window="hann",...
    Normalization="Slaney", HTK=true, Power=2,...
    GenerateMATLABCode=true);
% MFCC
% Finally, you map librosa's MFCC computation function to MATLAB.
%
% Inspect the Python script that computes MFCC.

pythonScript = fullfile(pwd,"PythonCode","librosamfcc.py");
type(pythonScript)
%%
% Execute the script.

mfccOut1 = pyrunfile(pythonScript,"mfccOut",filename=filename);
mfccOut1 = single(mfccOut1);
%%
% Use <./HelperFiles\+librosa\mfcc.m librosa.mfcc> to compute the same MFCC
% in MATLAB.

mfccOut2 = librosa.mfcc(samples,SampleRate=fs,FFTLength=512,NumBands=50,FMin=10,...
    HopLength=160,WindowLength=512,Window="hann",...
    HTK=true,Power=2,DCTType=2,Lifter=0.2);
%%
% Compute the error.

fprintf("MFCC error: %f\n", norm(mfccOut1(:)-mfccOut2(:)))
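%%
% For reference, MFCCs are commonly obtained by converting a mel power spectrogram
% to decibels and applying a DCT along the mel-band dimension. The sketch below
% illustrates those two steps on the mel spectrogram computed earlier. It is an
% assumption-laden illustration, not the helper implementation: it assumes a
% bands-by-frames layout, an 80 dB dynamic range, and 20 coefficients, and it
% omits liftering. Because |melSpectrogramOut2| was computed with different
% arguments, the result is not expected to match |mfccOut2|. The names |logMel|
% and |cepstra| are hypothetical.

logMel = 10*log10(max(melSpectrogramOut2,1e-10));   % power to dB with a floor
logMel = max(logMel,max(logMel(:))-80);             % limit the dynamic range
cepstra = dct(logMel,[],1,Type=2);                  % DCT-II along the band dimension
cepstra = cepstra(1:20,:);                          % keep the first 20 coefficients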
%%
% Similar to other functions, specify |GenerateMATLABCode=true| to generate
% MATLAB code that uses documented functions. In this case, the generated code
% uses <https://www.mathworks.com/help/signal/ref/stft.html stft>, <https://www.mathworks.com/help/signal/ref/dct.html
% dct>, and <https://www.mathworks.com/help/audio/ref/designauditoryfilterbank.html
% designAuditoryFilterBank>.

librosa.mfcc(samples,SampleRate=fs,FFTLength=512,NumBands=50,FMin=10,...
    HopLength=160,WindowLength=512,Window="hann",...
    HTK=true,Power=2,DCTType=2,Lifter=0.2,...
    GenerateMATLABCode=true);
%% Import Python Speech Command System to MATLAB
% You now use the feature extraction mapping functionality to translate a pretrained
% Python speech command recognition system to MATLAB.
% System Description
% The deep learning speech command recognition system was trained in Python.
%
% The system recognizes the following commands:
%%
% * "yes"
% * "no"
% * "up"
% * "down"
% * "left"
% * "right"
% * "on"
% * "off"
% * "stop"
% * "go"
%%
% The system consists of a convolutional neural network. The network accepts
% mel spectrograms as input.
%
% Training follows a supervised learning approach, in which mel spectrograms
% labeled with commands are fed to the network.
%
% The following were used to train the command recognition system:
%%
% * *PyTorch* to design and train the model.
% * librosa to perform feature extraction (auditory spectrogram computation).
%%
% You perform speech recognition in Python by first extracting a mel spectrogram
% from an audio signal, and then feeding the spectrogram to the trained convolutional
% network.
%
% Perform Speech Command Recognition in Python
% The Python script <./PythonCode\InferSpeechCommands.py |InferSpeechCommands.py|>
% performs speech command recognition.
%
% Execute Python inference in MATLAB. The Python script prints out the recognized
% keyword. Return the network activations.

cd("PythonCode")
pythonScript = "InferSpeechCommands.py";
[pytorchActivations,mm] = pyrunfile(pythonScript,["activations","z"],filename=filename);
cd ..
% Convert the Pretrained Network to MATLAB
% You first import the PyTorch pretrained network to MATLAB using MATLAB's <https://www.mathworks.com/help/deeplearning/deep-learning-import-and-export.html?s_tid=CRUX_lftnav
% model import-export functionality>. In this example, you use <https://www.mathworks.com/help/deeplearning/ref/importonnxnetwork.html
% importONNXNetwork>. The function imports a version of the network that was saved
% in the Open Neural Network Exchange (ONNX) format. To see how the PyTorch model
% can be saved in the ONNX format, refer to <./PythonCode\convertModelToONNX.py
% convertModelToONNX.py>.

onnxFile = "cmdRecognitionPyTorch.onnx";
%%
% Import the network to MATLAB.

net = importONNXNetwork(onnxFile)
% Perform Speech Command Recognition in MATLAB
% Use |librosa.melSpectrogram| to perform feature extraction. Call the function
% with the same name-value arguments as the Python inference.

spect = librosa.melSpectrogram(samples,SampleRate=fs, FFTLength=512,NumBands=50,...
    Center=false,HopLength=160,WindowLength=512,Window="hann",...
    Normalization="Slaney", HTK=true, Power=2);
spect = log10(spect + 1e-6);
MATLABActivations = predict(net,spect.');
%%
% Compare MATLAB and PyTorch activations.

figure
plot(MATLABActivations,"b*-")
hold on
grid on
plot(pytorchActivations,"ro-")
xlabel("Activation #")
legend("MATLAB", "Python")
%%
% Verify the spoken command in MATLAB.

% Class names without stray leading spaces, so the printed command reads cleanly.
CLASSES = ["unknown" "yes" "no" "up" "down" "left" "right" "on" "off" "stop" "go"];
[~,ind] = max(MATLABActivations);
fprintf("Recognized command: %s\n",CLASSES(ind))
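%%
% In addition to the plot, the two sets of activations can be compared numerically.
% This sketch is not part of the original example. It assumes that
% |pytorchActivations| can be converted with |single| (for example, because the
% Python script returns a NumPy array) and that both outputs have the same number
% of elements. The names |pyAct| and |indPy| are illustrative.

pyAct = single(pytorchActivations);
fprintf("Activation error: %f\n", norm(MATLABActivations(:)-pyAct(:)));
[~,indPy] = max(pyAct(:));
fprintf("Same predicted class index: %d\n", ind == indPy);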
%%
% _Copyright 2022 The MathWorks, Inc._
