R. Brigola


Supplement to my lectures on Fourier analysis
Test of Speaker Recognition

Here we consider an example of generating vectors of amplitude means for a series of test speakers.
These vectors form the data base for a speaker recognition test.
We test whether a speaker from that data base population can be identified from a 6 s speech sample.
To test text-independent speaker recognition, the text passages used for generating the data base differ from those
used in the speech tests (all were taken arbitrarily from a newspaper).



The file generate_speaker_db.m computes a characterizing vector for each recording in a series of speaker recordings.
For an unknown speaker, the analogous vector has to be computed and compared with the data base.


We test the following type of spectral characterization:

For the frequency band 0 - 4000 Hz a vector is computed whose components are amplitude means in 8.33 Hz steps.
For each speaker, 6 time frames of 3 s duration are used, i.e. an 18 s speech recording as a wav-file with a sampling rate of 44100 Hz.
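
The characterization above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the code of generate_speaker_db.m: it assumes that each 3 s frame is transformed as a whole (which gives a frequency resolution of 1/3 Hz, so that 25 neighboring bins span 8.33 Hz and the band 0 - 4000 Hz yields 480 components), and that the amplitudes are averaged both over the 6 frames and within each 8.33 Hz band. All variable names are made up, and a synthetic two-tone signal stands in for a real recording.

```matlab
% Sketch only: band-averaged single-sided amplitude spectrum of a signal.
% Names and averaging scheme are assumptions, not taken from generate_speaker_db.m.
fs          = 44100;   % sampling rate in Hz
frameDur    = 3;       % frame duration in s
nFrames     = 6;       % frames per speaker (18 s in total)
fMax        = 4000;    % upper band limit in Hz
binsPerBand = 25;      % 25 bins of 1/3 Hz each give 8.33 Hz steps

% Synthetic stand-in for a recording (tones at 200 Hz and 1000 Hz).
% With real data one would read the samples from a wav-file instead.
t = (0:fs*frameDur*nFrames-1)' / fs;
x = sin(2*pi*200*t) + 0.5*sin(2*pi*1000*t);

N      = fs * frameDur;          % samples per frame
nBins  = fMax * frameDur;        % FFT bins covering 0 Hz up to (but excluding) 4000 Hz
nBands = nBins / binsPerBand;    % 12000 / 25 = 480 components

frames = reshape(x(1:N*nFrames), N, nFrames);  % one frame per column
A      = abs(fft(frames)) / N;                 % amplitude spectrum of each frame
A      = A(1:nBins, :);                        % keep the bins below fMax
A(2:end, :) = 2 * A(2:end, :);                 % single-sided scaling (DC unchanged)

Amean = mean(A, 2);                                     % mean over the 6 frames
v     = mean(reshape(Amean, binsPerBand, nBands), 1)';  % mean within each band
```

For this synthetic input, v has 480 components, and the two tones show up in the bands starting at 200 Hz and 1000 Hz (components 25 and 121).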


For recognition, an analogous vector is computed for the test speaker. The components of that vector in the directions
of the data base vectors are then computed. The data base person with the largest component is chosen as the "identification".
This is done in recognize_speaker.m.
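
The decision rule can be illustrated with a small sketch. It is not the code of recognize_speaker.m; it assumes that the component of the test vector x in the direction of a data base vector d is the scalar product of x and d divided by the Euclidean length of d. The matrix D and the vector x are made-up toy data with 4 components instead of the real spectral vectors.

```matlab
% Sketch only: identification by the largest component along data base vectors.
% D and x are toy data, not the contents of speaker_fingerprint_db.mat.
D = [4 1 0;      % each column is the characterizing vector of one speaker
     1 3 1;
     0 1 5;
     2 2 2];
x = [0.5; 2.9; 1.2; 2.1];        % vector of the unknown test speaker

colNorms = sqrt(sum(D.^2, 1));   % Euclidean length of each data base vector
c = (D' * x) ./ colNorms';       % component of x along each data base direction

[cMax, id] = max(c);             % the largest component decides the identification
fprintf('identified speaker %d (component %.3f)\n', id, cMax);
```

With these toy numbers the second data base vector gives the largest component, so speaker 2 is chosen. Dividing by the lengths of the data base vectors matters here: without it, a data base vector with large overall amplitudes would be favored regardless of its direction.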


For more details, please read the text and comments in the m-files. The m-files are:

m-file generate_speaker_db.m for the generation of a test data base with characterizing speaker spectra
m-file recognize_speaker.m for the recognition test
m-file plot_speaker_spectrum.m to plot the computed test speaker amplitude spectrum

Example of a single-sided amplitude speaker spectrum computed with generate_speaker_db.m



For new Matlab users:
You can open the m-files in the Matlab editor by double-clicking them, and run them with
"Evaluate current cell", a menu item found by clicking "cell" in the top line of the Matlab window.


Tests

A test data base generated with 23 test speakers
List of speakers and corresponding numbers of their identities and test_track numbers

To try it out and compare the results with the speaker list, you can download
the test tracks with the filenames test_trackN.wav (each about 6 s) from the folder

test_track1 test_track2 test_track3 test_track4
test_track5 test_track6 test_track7 test_track8
test_track9 test_track10 test_track11 test_track12
test_track13 test_track14 test_track15 test_track16
test_track17 test_track18 test_track19 test_track20
test_track21 test_track22 test_track23

The m-files, test_tracks and the data base speaker_fingerprint_db.mat must be located in the same folder.