Watts, David Michael Graeme (2006) Speaker identification - prototype development and performance. [USQ Project]
PDF: WATTS_David_2006.pdf (4MB)
Abstract
Human speech is our most natural form of communication and conveys both meaning
and identity. The identity of a speaker can be determined from the information
contained in the speech signal through speaker identification.
Speaker identification is concerned with identifying unknown speakers from a database
of speaker models previously enrolled in the system. The general process of speaker
identification involves two stages. The first stage extracts features from speakers that
are to be enrolled into the system. The second stage determines the identity of
a speaker by extracting features from the speech and comparing them to the enrolled
speaker models.
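The two stages can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it assumes each speaker model is a Vector Quantization codebook, uses plain k-means as a stand-in for whatever codebook training the project used, and replaces real speech features with synthetic vectors. All function names (train_codebook, avg_distortion, identify, synthetic_speaker) are hypothetical.

```python
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def train_codebook(vectors, size, iters=20, seed=0):
    """Stage 1 (enrolment): build a VQ codebook with plain k-means.

    Assumption: k-means stands in for the project's actual training method.
    """
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iters):
        cells = [[] for _ in range(size)]
        for v in vectors:
            nearest = min(range(size), key=lambda i: sq_dist(v, codebook[i]))
            cells[nearest].append(v)
        for i, cell in enumerate(cells):
            if cell:  # keep the old codeword if its cell is empty
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def avg_distortion(vectors, codebook):
    """Mean distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def identify(test_vectors, models):
    """Stage 2 (identification): pick the enrolled model with least distortion."""
    return min(models, key=lambda name: avg_distortion(test_vectors, models[name]))


def synthetic_speaker(center, n, dim, seed):
    """Hypothetical stand-in for real features: Gaussian cloud around `center`."""
    rng = random.Random(seed)
    return [[rng.gauss(center, 0.5) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    models = {
        "speaker_a": train_codebook(synthetic_speaker(0.0, 200, 4, 1), size=8),
        "speaker_b": train_codebook(synthetic_speaker(3.0, 200, 4, 2), size=8),
    }
    print(identify(synthetic_speaker(3.0, 20, 4, 3), models))
```

The unknown utterance is assigned to whichever enrolled codebook quantizes its feature vectors with the lowest average distortion, which is why enrolment and identification must use the same feature type.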
Several techniques are available for feature extraction, including Linear Predictive
Coding (LPC), Mel-Frequency Cepstral Coefficients (MFCC) and LPC Cepstral Coefficients
(LPCC). These features are used with a classification technique to create a speaker
model. Vector Quantization (VQ) is commonly used in speaker identification and
produces reliable results.
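As an illustration of how LPC cepstral coefficients relate to LPC, the sketch below applies the standard recursion that converts predictor coefficients into cepstral coefficients. It assumes the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k), so the sign convention of the input matters, and it omits the gain term c_0. The function name lpc_to_lpcc is hypothetical and is not taken from the project.

```python
def lpc_to_lpcc(a, n_ceps):
    """Convert LPC predictor coefficients to LPC cepstral coefficients.

    a      : [a_1, ..., a_p] for H(z) = 1 / (1 - sum_k a_k z^-k)  (assumed convention)
    n_ceps : number of cepstral coefficients to return (c_1 .. c_n_ceps)
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] (gain term) is left unused
    for n in range(1, n_ceps + 1):
        # Direct term a_n exists only while n does not exceed the LPC order.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k / n) * c_k * a_(n-k).
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


if __name__ == "__main__":
    # Illustrative coefficients only, not values from the project.
    print(lpc_to_lpcc([0.9, -0.2, 0.05], 20))
```

For a single-pole model with a_1 = a, the recursion reproduces the known closed form c_n = a^n / n, which is a convenient sanity check. The number of cepstral coefficients may exceed the LPC order, as the recursion continues past n = p without the direct term.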
This project demonstrates a prototype speaker identification system tailored for utterances
containing fewer than ten words and target sets of fewer than eight voice profiles.
VQ (codebook size = 128) with 20-dimensional LPCC features achieves accuracies of 83% and
100% using 12 speakers with the NTIMIT and Alternative (own) corpora, respectively.
Tests were conducted using 30 s of training speech and 3 s of testing speech.
Item Type: | USQ Project |
---|---|
Refereed: | No |
Item Status: | Live Archive |
Faculty/School / Institute/Centre: | Historic - Faculty of Engineering and Surveying - Department of Electrical, Electronic and Computer Engineering (Up to 30 Jun 2013) |
Date Deposited: | 11 Oct 2007 01:03 |
Last Modified: | 09 Nov 2022 01:16 |
Uncontrolled Keywords: | speech; linear predictive coding (LPC); vector quantization (VQ); gaussian mixture models; NTIMIT |
Fields of Research (2008): | 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080107 Natural Language Processing |
Fields of Research (2020): | 46 INFORMATION AND COMPUTING SCIENCES > 4602 Artificial intelligence > 460208 Natural language processing |
URI: | https://sear.unisq.edu.au/id/eprint/2338 |