Text Independent Speaker Recognition
P. Thévenaz
AGEN Mitteilungen, no. 52, pp. 35–45, November 1990.
The author describes an automatic, text-independent speaker recognition technique. Text independence means that no password is required, which reduces the risk of loss or mimicry. Four methods are selected and their performances are reviewed individually; the author then shows how to combine them to improve the overall result. The first method characterizes a speaker by the speaker's mean cepstrum, averaged over time. The second method accumulates the vector quantization error, using cepstral vectors as parameters. The third method is identical to the second, except that it uses the time derivatives of the cepstral vectors instead. The fourth and last method exploits the histogram of entries in a universal cepstrum codebook, obtained by vector quantization. Finally, the author combines the results by Fisher linear discriminant analysis. A series of telephone-bandwidth experiments shows that the methods behave well, except for the third, which had to be rejected because of its poor performance. The combination of the other three methods is more successful than any single method.
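The four scores and their combination described in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's implementation: the function names, the squared Euclidean distance, the first-difference approximation of the time derivative, and the per-frame averaging of the quantization error are all assumptions made for illustration. Cepstral frames are assumed to be given as an array of shape (frames, coefficients), and codebooks as arrays of shape (codewords, coefficients).

```python
import numpy as np

def mean_cepstrum(frames):
    """Method 1: cepstral vector averaged over time for one utterance."""
    return frames.mean(axis=0)

def nearest_codeword(frames, codebook):
    """Index of, and squared distance to, the closest codeword for each frame."""
    d2 = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    return idx, d2[np.arange(len(frames)), idx]

def vq_distortion(frames, codebook):
    """Method 2: quantization error accumulated over the utterance.
    Assumption: the accumulated error is normalized by the frame count."""
    _, err = nearest_codeword(frames, codebook)
    return err.mean()

def delta(frames):
    """Method 3 input. Assumption: first differences stand in for the
    time derivatives; the result is scored with vq_distortion as well."""
    return np.diff(frames, axis=0)

def codebook_histogram(frames, universal_codebook):
    """Method 4: normalized histogram of the universal codebook entries used."""
    idx, _ = nearest_codeword(frames, universal_codebook)
    counts = np.bincount(idx, minlength=len(universal_codebook))
    return counts / counts.sum()

def fisher_direction(scores_same, scores_diff):
    """Fisher linear discriminant for two classes of score vectors
    (rows = trials, columns = methods): w = Sw^{-1} (m_diff - m_same)."""
    m_same = scores_same.mean(axis=0)
    m_diff = scores_diff.mean(axis=0)
    # Pooled within-class scatter matrix.
    sw = (np.cov(scores_same, rowvar=False) * (len(scores_same) - 1)
          + np.cov(scores_diff, rowvar=False) * (len(scores_diff) - 1))
    return np.linalg.solve(sw, m_diff - m_same)
```

Under these assumptions, a combined score for a trial would be the dot product of its per-method score vector with the direction returned by `fisher_direction`, compared against a threshold chosen on training data.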
@ARTICLE(http://bigwww.epfl.ch/publications/thevenaz9002.html, AUTHOR="Th{\'{e}}venaz, P.", TITLE="Text Independent Speaker Recognition", JOURNAL="{AGEN} Mitteilungen", YEAR="1990", volume="52", number="", pages="35--45", month="November", note="")