Urdu Text Recognition in TV Channel Streams / Hira Afzal

By: Afzal, Hira
Contributor(s): Supervisor: Dr. Hasan Sajid
Material type: Text
Publication details: Islamabad : SMME-NUST; 2022
Description: 88p. Soft Copy 30cm
Subject(s): MS Robotics and Intelligent Machine Engineering
DDC classification: 629.8

Text recognition from images has received considerable attention from researchers for over three decades and is one of the most studied problems in pattern classification. Optical character recognition (OCR) systems can recognize digits, letters, words and sentences in any language; they can be applied to scene text to assist visually impaired people and to support the navigation of autonomous vehicles. Despite the extensive work on Latin scripts, recognizing text written in non-Latin scripts such as Urdu, Arabic and Pashto remains challenging because of the complex cursive nature of the script. Urdu OCR systems can be used to digitize data, which in turn enables search and retrieval of specific information and gives easy access to content-based information retrieval. Extracting embedded text from images is also an active area of research in the document analysis community. Currently available OCR frameworks focus mainly on Latin scripts such as English and cannot be applied to non-Latin scripts. We therefore implement a deep learning solution for line-level recognition of Urdu text in order to extract useful information from news tickers. This work focuses on designing an end-to-end system that detects and recognizes the Urdu text embedded in TV channel streams, which is commonly written in the Nasta‘liq style. Developing an Urdu OCR system consists mainly of two subtasks: text detection and text recognition. For robust recognition systems, the availability of a large quantity of annotated data is the first and foremost requirement.
The dataset used here was therefore collected from different news channels and covers both low- and high-resolution images. It includes distorted, low-quality and faded news tickers, making it well suited to testing the performance of any Urdu news OCR system. Once text has been detected (localized) in an image, it can be cropped and passed to the recognition stage. For recognition, a language-independent, end-to-end architecture based on a Convolutional Recurrent Neural Network (CRNN) with a CTC loss function is proposed for line-level recognition of Urdu text embedded in the news tickers of TV channel streams. A large number of data augmentation techniques are used in the proposed system; these data variations help prevent the model from overfitting and help it generalize better. Finally, results are reported on the test set: 0.63% CER, 6.43% WER, 5.14% LER and a Levenshtein distance of 0.02 on the Urdu Ticker Text dataset. These results indicate that the proposed methodology outperforms commercially available recognition systems and can be applied to a variety of other non-Latin scripts as well. The thesis also discusses common problems faced when building recognizers for low-resource languages such as Urdu and Arabic. The outcomes of this study are expected to be useful for researchers working on the recognition of non-Latin languages written in cursive scripts.
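
The character and word error rates quoted in the abstract are conventionally derived from the Levenshtein (edit) distance between the recognized line and its ground-truth transcription. The following is a minimal sketch of that conventional computation, not the thesis's own evaluation code. The function names are hypothetical, and two choices are assumptions: characters are counted as Unicode code points, and words are split on whitespace. The thesis may tokenize and normalize differently.

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions and substitutions
    needed to turn the sequence `hyp` into the sequence `ref`."""
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance against an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, tokenize):
    """Total edit distance divided by total reference length (corpus level)."""
    edits = 0
    length = 0
    for ref, hyp in zip(refs, hyps):
        ref_tokens = tokenize(ref)
        hyp_tokens = tokenize(hyp)
        edits += levenshtein(ref_tokens, hyp_tokens)
        length += len(ref_tokens)
    return edits / length if length else 0.0


def cer(refs, hyps):
    """Character error rate; assumption: one character = one code point."""
    return error_rate(refs, hyps, list)


def wer(refs, hyps):
    """Word error rate; assumption: words are whitespace-separated."""
    return error_rate(refs, hyps, str.split)


if __name__ == "__main__":
    # Illustrative strings only; they are not taken from the dataset.
    truth = ["پاکستان زندہ باد"]
    predicted = ["پاکستان زندہ بار"]
    print(f"CER: {cer(truth, predicted):.4f}")
    print(f"WER: {wer(truth, predicted):.4f}")
```

Aggregating edits over the whole test set before dividing, rather than averaging per-line rates, keeps short ticker lines from dominating the score. Either convention is in use, so reported figures should state which one applies.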
