Speaker identification - prototype development and performance

Watts, David Michael Graeme (2006) Speaker identification - prototype development and performance. [USQ Project]




Human speech is our most natural form of communication and conveys both meaning
and identity. The identity of a speaker can be determined from the information
contained in the speech signal through speaker identification.
Speaker identification is concerned with identifying unknown speakers from a database
of speaker models previously enrolled in the system. The general process of speaker
identification involves two stages. The first stage extracts features from speakers that
are to be enrolled into the system. The second stage determines the identity of
a speaker by extracting features from the speech and comparing these to the speaker
models.
Several techniques are available for feature extraction, including Linear Predictive Coding
(LPC), Mel-Frequency Cepstral Coefficients (MFCC) and LPC Cepstral Coefficients (LPCC). These
features are used with a classification technique to create a speaker model. Vector
Quantization (VQ) is commonly used in speaker identification and produces reliable results.
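
To illustrate how LPC cepstral coefficients relate to LPC, the sketch below applies the standard recursion that converts predictor coefficients into cepstral coefficients. It is not taken from the project: the function name lpc_to_lpcc, the sign convention H(z) = G / (1 - sum_k a_k z^-k), and the omission of the gain term c_0 are all assumptions made for this example.

```python
def lpc_to_lpcc(a, n_ceps):
    """Convert LPC predictor coefficients to LPC cepstral coefficients.

    a      : list [a_1, ..., a_p] for the all-pole model
             H(z) = G / (1 - sum_k a_k z^-k)  (assumed sign convention)
    n_ceps : number of cepstral coefficients c_1..c_n to return
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] (gain term) is left unused here
    for n in range(1, n_ceps + 1):
        # Direct term a_n exists only while n <= p.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k/n) * c_k * a_{n-k}, with 1 <= n-k <= p.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


# A 2nd-order predictor expanded to a 20-dimensional cepstral vector.
features = lpc_to_lpcc([0.9, -0.4], 20)
print(len(features))
```

For a single-pole model with coefficient a, the recursion reduces to c_n = a^n / n, which gives a quick sanity check on any implementation.
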
This project demonstrates a prototype speaker identification system tailored for utterances
containing fewer than ten words and target sets of fewer than eight voice profiles.
VQ (codebook size = 128) with 20-dimensional LPCC features achieves accuracies of 83% and
100% using 12 speakers with the NTIMIT and Alternative (own) corpora, respectively.
Tests were conducted using 30 s of training speech and 3 s of testing speech.
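
The two-stage process described above can be sketched with a toy VQ classifier: enrolment trains one codebook per speaker, and identification picks the speaker whose codebook gives the lowest average distortion on the test features. This is a minimal sketch under stated assumptions, not the project's implementation: plain k-means training, squared Euclidean distortion, the function names, and the tiny 2-dimensional data are all hypothetical (the project reports a codebook size of 128 and 20-dimensional LPCC features).

```python
def sq_dist(x, y):
    """Squared Euclidean distance between two feature vectors."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def train_codebook(vectors, size, iters=20):
    """Enrolment: build a codebook with plain k-means (assumed training method)."""
    # Deterministic initialisation from evenly spaced training vectors.
    step = max(1, len(vectors) // size)
    codebook = [list(v) for v in vectors[::step][:size]]
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(len(codebook)),
                          key=lambda i: sq_dist(v, codebook[i]))
            cells[nearest].append(v)
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous centroid
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def avg_distortion(vectors, codebook):
    """Mean distance from each test vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def identify(vectors, codebooks):
    """Identification: return the enrolled speaker with the lowest distortion."""
    return min(codebooks, key=lambda name: avg_distortion(vectors, codebooks[name]))


# Toy enrolment data for two hypothetical speakers.
alice = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]
bob = [[1.0, 1.0], [1.1, 1.0], [1.0, 1.1], [1.1, 1.1]]
codebooks = {"alice": train_codebook(alice, 2), "bob": train_codebook(bob, 2)}

print(identify([[0.05, 0.05], [0.02, 0.08]], codebooks))
```

In a real system the feature vectors would come from frames of enrolment and test speech, and the codebook size would be chosen by experiment.
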

Item Type: USQ Project
Refereed: No
Item Status: Live Archive
Faculty/School / Institute/Centre: Historic - Faculty of Engineering and Surveying - Department of Electrical, Electronic and Computer Engineering (Up to 30 Jun 2013)
Date Deposited: 11 Oct 2007 01:03
Last Modified: 09 Nov 2022 01:16
Uncontrolled Keywords: speech; linear predictive coding (LPC); vector quantization (VQ); gaussian mixture models; NTIMIT
Fields of Research (2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080107 Natural Language Processing
Fields of Research (2020): 46 INFORMATION AND COMPUTING SCIENCES > 4602 Artificial intelligence > 460208 Natural language processing
URI: https://sear.unisq.edu.au/id/eprint/2338
