Urdu Digital Text Optical Character Recognition / Mustafa Ahmed

By: Ahmed, Mustafa
Contributor(s): Supervisor : Dr. Karam Dad Kallu
Material type: Text
Publication details: Islamabad : SMME-NUST; 2023
Description: 46p. ; 30cm
Subject(s): MS Robotics and Intelligent Machine Engineering
DDC classification: 629.8
Summary: This thesis introduces an innovative word-level Optical Character Recognition (OCR) model designed specifically for digital Urdu text recognition. Leveraging the power of transformer-based architectures and attention mechanisms, the proposed model was trained on a comprehensive dataset comprising approximately 160,000 Urdu text images. Remarkably, the model achieved a commendable character error rate (CER) of 0.242, indicating its superior accuracy in recognizing Urdu characters. The key strength of the model lies in its unique architecture, incorporating the permuted autoregressive sequence (PARSeq) model. This advanced approach enables context-aware inference and iterative refinement, leveraging bidirectional context information to enhance recognition accuracy. Additionally, the model's ability to handle a diverse range of Urdu text styles, fonts, and variations further enhances its applicability in real-world scenarios. While the model demonstrates promising results, it does have some limitations. Blurred images, non-horizontal orientations, and the overlay of patterns, lines, or other text can occasionally lead to suboptimal results. Additionally, trailing or following punctuation marks may cause noise in the recognition process. Addressing these challenges will be a focal point of future research. The proposed model's exceptional performance and its ability to adapt to various text styles make it a valuable tool for applications that require accurate and efficient Urdu text recognition. Future work will focus on refining the model, exploring data augmentation techniques, optimizing hyperparameters, and integrating context-aware language models to further improve its overall performance and robustness.
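Note on the reported metric: the summary quotes a character error rate (CER) without defining it. CER is conventionally the character-level edit distance between the recognized text and the reference text, divided by the number of reference characters. The Python sketch below illustrates that conventional definition only; it is not taken from the thesis. The function names and the sample words are hypothetical, and counting Unicode code points (rather than grapheme clusters) is an assumption that may differ from how the thesis scored Urdu text. The record also does not say whether 0.242 is a fraction or a percentage.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(references: Sequence[str],
                         hypotheses: Sequence[str]) -> float:
    """Total edits over total reference characters (corpus-level CER)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Hypothetical word-level samples: the second prediction drops one letter.
    refs = ["کتاب", "پاکستان"]
    hyps = ["کتاب", "پاکستن"]
    print(f"CER = {character_error_rate(refs, hyps):.4f}")
```

Because Urdu diacritics are encoded as separate combining marks, a code-point count and a grapheme count can give different CER values for the same prediction, so any comparison with the figure above should first confirm which unit was used.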
Item type: Thesis
Current location: School of Mechanical & Manufacturing Engineering (SMME)
Home library: School of Mechanical & Manufacturing Engineering (SMME)
Shelving location: E-Books
Call number: 629.8
Status: Available
Barcode: SMME-TH-927
Total holds: 0


© 2023 Central Library, National University of Sciences and Technology. All Rights Reserved.