Human Voice and Speech: Creating a Simple Speech Recognizer for the Sinhala Language

Sonnadara, D.U.J.; Warusawithana, M.P.; Jayananda, M.K.; Nanayakkara, A.

dc.contributor.author	Sonnadara, D.U.J.
dc.contributor.author	Warusawithana, M.P.
dc.contributor.author	Jayananda, M.K.
dc.contributor.author	Nanayakkara, A.
dc.date.accessioned	2012-12-19T07:42:38Z
dc.date.available	2012-12-19T07:42:38Z
dc.date.issued	1998
dc.identifier.citation	Proceedings of the Annual Sessions, SLAAS, 54 (1998) E1-17	en_US
dc.identifier.uri	http://archive.cmb.ac.lk:8080/xmlui/handle/70130/3275
dc.description.abstract	A speech recognizer that converts spoken words to text has numerous applications in industry, in academic institutions, and in the home. Since the standard computer keyboard is designed for the English language, constructing such a system for the Sinhala language is of particular importance to us. The work described in this paper is directed towards building a simple speech recognizer that can convert voice signals to text. First, the raw data corresponding to the speech signals were extracted from pre-recorded sound files. The entire data set was divided into windows of a finite number of samples. A Hamming window was applied to each window to reduce the errors due to discontinuities at its boundaries. By applying a Fast Fourier Transform (FFT), the frequency spectra of these sound files were extracted. To limit the number of frequencies, a Mel-scaled filter bank was applied to the extracted frequency spectra. Finally, a lookup table was constructed with the mean values of the selected frequencies for basic sounds in the Sinhala language. The comparison of the test sounds with the lookup table produced quite remarkable results. With a limited set of sounds, this technique can be used to produce a speaker-independent speech recognizer. A direct industrial application of such a system would be a voice-command process control system. The accuracy of such a recognizer can be further improved by using a higher number of frequencies or by increasing the sampling frequency.	en_US
dc.language.iso	en	en_US
dc.subject	Voice recognition	en_US
dc.title	Human Voice and Speech: Creating a Simple Speech Recognizer for the Sinhala Language	en_US
dc.type	Research paper	en_US
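
The processing chain outlined in the abstract (framing, Hamming window, FFT, Mel-scaled filter bank, lookup table of mean values) can be illustrated with a minimal sketch. This is not the authors' implementation: the frame length of 256 samples, the 12 filters, the 8000 Hz sampling rate, the Euclidean distance used for matching, and all function names (`features`, `recognize`, `tone`) are assumptions chosen for illustration, and synthetic sine tones stand in for recorded Sinhala sounds.

```python
import cmath
import math

def hamming(n):
    """Hamming window coefficients for a frame of n samples."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    tw = [cmath.exp(-2j * math.pi * k / n) * odd[k] for k in range(n // 2)]
    return ([even[k] + tw[k] for k in range(n // 2)] +
            [even[k] - tw[k] for k in range(n // 2)])

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, frame_len, rate):
    """Triangular filters spaced evenly on the mel scale, as weights per FFT bin."""
    top = hz_to_mel(rate / 2)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for b in range(frame_len // 2 + 1):
            f = b * rate / frame_len  # centre frequency of FFT bin b
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def features(samples, rate, frame_len=256, n_filters=12):
    """Mean mel-band energy over all non-overlapping frames of one recording."""
    win = hamming(frame_len)
    bank = mel_filterbank(n_filters, frame_len, rate)
    totals = [0.0] * n_filters
    n_frames = 0
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = [s * w for s, w in zip(samples[start:start + frame_len], win)]
        power = [abs(c) ** 2 for c in fft(frame)[:frame_len // 2 + 1]]
        for j, row in enumerate(bank):
            totals[j] += sum(p * w for p, w in zip(power, row))
        n_frames += 1
    return [t / n_frames for t in totals]

def recognize(feat, table):
    """Return the label whose stored mean vector is closest (Euclidean, assumed)."""
    def dist(label):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(feat, table[label])))
    return min(table, key=dist)

def tone(freq, rate=8000, n=2048):
    """Synthetic sine tone standing in for a recorded sound (illustration only)."""
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

# Lookup table: one mean feature vector per reference sound.
table = {"low": features(tone(300), 8000), "high": features(tone(2500), 8000)}
print(recognize(features(tone(320), 8000), table))
```

In this sketch the lookup table holds a single mean vector per label, so a test sound is matched by finding the nearest stored vector. With real recordings, each entry would presumably be averaged over several utterances of the same sound, and the frame length and number of filters would need to be tuned to the data.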