Speech Recognition System Using Wav2vec Model (Punjabi Language) / Capt Kashif Yaseen, Capt Adeel Zafar, Maj Awais Ali.

By: Yaseen, Kashif
Contributor(s): Supervisor Dr. Shibli Nisar
Material type: Text
Publisher: MCS, NUST Rawalpindi 2024
Description: 55 p
Subject(s): UG EE Project | BEE-57
DDC classification: 621.382,YAS
Contents:
Speech recognition offers a natural means of communication between humans and machines. The purpose of a speech recognition system is to convert a sequence of sound units into text. Computer understanding of spoken words has improved considerably in recent years, but for languages such as Punjabi it remains difficult. The complexity of Punjabi phonology, compounded by variations in accent and pronunciation, poses substantial challenges for automatic speech recognition systems. As a result, the need for a robust Punjabi speech recognition system has become increasingly evident. Our project addresses this problem using the Wav2Vec model, which we train on Punjabi speech so that it transcribes more accurately. To our knowledge, no prior work has been done on Punjabi speech recognition systems. Our approach involves pre-processing Punjabi audio data, training the Wav2Vec model, and fine-tuning it using transfer learning techniques. The final output is presented through a user-friendly Graphical User Interface (GUI) that displays the transcribed speech clearly and is accessible to users of varying technical abilities. This paper focuses on the development of a spontaneous speech model for recognition of the Punjabi language. A GUI for the Punjabi speech model has also been created and tested. Recognition accuracy is good for Punjabi sentences and much higher for Punjabi words. The speech model for live Punjabi speech is built in Python.
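The abstract reports recognition accuracy for sentences and for words but does not name the metric used. As an illustration only, the sketch below computes word error rate (WER), a measure commonly used for speech recognition output, with the Python standard library. The function names `edit_distance` and `word_error_rate` and the placeholder transcripts are assumptions made for this example and are not taken from the project.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the first i-1 reference
    # tokens and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    # Placeholder tokens stand in for a reference transcript and a
    # model transcript; they are not real project data.
    reference = "w1 w2 w3 w4"
    hypothesis = "w1 wX w3"
    print(word_error_rate(reference, hypothesis))
```

A lower WER means a closer match to the reference transcript. Because the function splits on whitespace, the same code could be applied to Punjabi text in any script, provided the reference and the model output are normalized the same way.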
No physical items for this record


© 2023 Central Library, National University of Sciences and Technology. All Rights Reserved.