Speaker Search and Indexing for Multimedia Databases

Silva, T.; Karunarathne, D.D.; Wikramanayake, G.N.; Hewagamage, K.P.; Dias, G.K.A.

Please use this identifier to cite or link to this item: http://archive.cmb.ac.lk:8080/xmlui/handle/70130/142

Full metadata record

DC Field	Value	Language
dc.contributor.author	Silva, T.	-
dc.contributor.author	Karunarathne, D.D.	-
dc.contributor.author	Wikramanayake, G.N.	-
dc.contributor.author	Hewagamage, K.P.	-
dc.contributor.author	Dias, G.K.A.	-
dc.date.accessioned	2011-10-04T10:31:08Z	-
dc.date.available	2011-10-04T10:31:08Z	-
dc.date.issued	2004	-
dc.identifier.uri	http://archive.cmb.ac.lk:8080/xmlui/handle/70130/142	-
dc.description.abstract	This paper proposes an approach for indexing a collection of multimedia clips by the speakers in their audio tracks. A Bayesian Information Criterion (BIC) procedure is used for segmentation, and Mel-Frequency Cepstral Coefficients (MFCC) are extracted and sampled as metadata for each segment. Silence detection is also carried out during segmentation. A Gaussian Mixture Model (GMM) is trained for each speaker, and an ensemble technique is proposed to reduce errors caused by the probabilistic nature of GMM training. The indexing system uses the sampled MFCC features as segment metadata and maintains the speakers' metadata separately, so that modifications and additions can be made independently. The system achieves a True Miss Rate (TMR) of around 20% and a False Alarm Rate (FAR) of around 10% for segments between 15 and 25 seconds in length, with performance degrading as segment length decreases.	en_US
dc.language.iso	en	en_US
dc.subject	databases	en_US
dc.subject	multimedia	en_US
dc.title	Speaker Search and Indexing for Multimedia Databases	en_US
dc.type	Research paper	en_US
Appears in Collections:	University of Colombo School of Computing

Files in This Item:

File	Description	Size	Format
Silva2004[1].pdf		183.54 kB	Adobe PDF
