Benchmarking and enhancing Bengali OCR: a hybrid OCR system with analytic hierarchy process-based evaluation
Publisher
BRAC University
Abstract
This study benchmarks the performance of three OCR systems—Tesseract OCR,
EasyOCR, and a hybrid approach combining Tesseract OCR, EasyOCR, and the
Google Vision API—for Bengali text recognition. The evaluation was conducted
on a diverse, real-world dataset comprising 216 images across nine categories of
Bengali documents, totaling 19,064 words. Each OCR engine was independently assessed
using multiple performance metrics, including Character Error Rate (CER),
Word Error Rate (WER), Character-Level Accuracy (CLA), Word-Level Accuracy
(WLA), and processing time. The preprocessing pipeline included grayscale
conversion, resizing, noise removal, and adaptive thresholding, among other
techniques; however, these steps did not consistently improve the performance of
the standalone OCR engines. To address the limitations of single-engine systems, a hybrid OCR
framework was developed that processes raw images and employs a multi-criteria
decision-making approach based on the Analytic Hierarchy Process (AHP). A user
study involving 41 participants was conducted to determine the relative importance
of CER versus WER. Using Saaty’s scale, over 70% of participants assigned a value
of 5 or higher in favor of CER. This resulted in a CER-to-WER importance ratio of
4.76, which was then used to compute AHP-based weights. For each image, the OCR
outputs were scored using a weighted combination of CER and WER, and the output of
the engine with the lowest (best) score was selected as the result. The hybrid system
demonstrated strong performance under optimal conditions, achieving a Character-
Level Accuracy (CLA) of 96.63% and a Word-Level Accuracy (WLA) of 80.34%,
corresponding to a Character Error Rate (CER) of 3.37% and a Word Error Rate
(WER) of 19.66%. At the character level this clearly outperformed Tesseract OCR (CLA: 88.54%,
CER: 11.46%; WLA: 79.99%, WER: 20.01%) and EasyOCR (CLA: 90.98%, CER:
9.02%; WLA: 78.06%, WER: 21.94%), while the word-level gain was smaller. These results were obtained from specific
document categories where OCR performance tends to be highest. While recognition
accuracy may vary across different document types, the findings highlight
the potential of the AHP-guided hybrid approach to substantially improve Bengali
OCR performance in favorable scenarios and provide a strong foundation for further
enhancement in more challenging, real-world conditions.
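The selection step described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the function names are hypothetical, CER is computed here over Unicode code points (the report may count Bengali characters differently, for example by grapheme cluster), and a ground-truth reference text is assumed to be available for scoring. For two criteria, the AHP pairwise matrix [[1, r], [1/r, 1]] is always consistent, so its normalized priority vector is simply (r/(1+r), 1/(1+r)).

```python
from typing import Dict, Sequence, Tuple


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate; assumption: counted over Unicode code points."""
    return edit_distance(reference, hypothesis) / max(len(reference), 1)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate over whitespace-separated tokens."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(len(ref_words), 1)


def ahp_weights(ratio: float) -> Tuple[float, float]:
    """Weights (w_cer, w_wer) for a 2x2 pairwise matrix [[1, r], [1/r, 1]]."""
    return ratio / (1.0 + ratio), 1.0 / (1.0 + ratio)


def select_engine(reference: str,
                  outputs: Dict[str, str],
                  ratio: float = 4.76) -> Tuple[str, Dict[str, float]]:
    """Score each engine's text and return the engine with the lowest score."""
    w_cer, w_wer = ahp_weights(ratio)
    scores = {
        name: w_cer * cer(reference, text) + w_wer * wer(reference, text)
        for name, text in outputs.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # With the reported ratio of 4.76, CER receives roughly 0.826 of the
    # weight and WER roughly 0.174.
    print(ahp_weights(4.76))
    # Toy outputs (illustrative only, not data from the report).
    best, scores = select_engine(
        "আমি ভাত খাই",
        {"tesseract": "আমি ভাত খাই", "easyocr": "আমি ভত খাই"},
    )
    print(best, scores)
```

Because both criteria are error rates, a lower weighted score is better, which is why the sketch takes the minimum. The heavy weight on CER means an engine with fewer character mistakes usually wins even if its word-level error is slightly higher.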
Description
Cataloged from PDF version of project report.
Includes bibliographical references (pages 71-73).
This project report is submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering, 2025.
Type
Project Report