Efficient smart OCR solution for banking document digitization
Date
Publisher
BRAC University
Authors
Citation
Abstract
The digitization of multilingual banking documents, particularly those
containing handwritten Bengali and English scripts, poses significant
challenges due to variable handwriting styles, document noise, and
domain-specific terminology. This study presents a hybrid Optical
Character Recognition (OCR) and language model–based pipeline designed
to achieve high-fidelity text extraction and correction for banking
document digitization. The proposed system integrates two state-of-the-art
OCR architectures (Tesseract and EasyOCR for robust extraction of
unstructured raw text, and GPT-3.5 and LLaMA-2 for end-to-end
handwritten text recognition) with advanced language models for
post-processing. Bengali text correction is performed using Gemma-7B
and BLOOM-7B, while English text is refined through GPT-3.5
and LLaMA-2 (7B-chat). The dataset, comprising paired images and
annotations for both languages, undergoes preprocessing, binarization,
noise reduction, skew correction, and redundancy filtering before
model training and evaluation. Experimental results show substantial
improvements in linguistic accuracy and semantic preservation
compared to baseline OCR outputs, demonstrating the system’s applicability
for real-world multilingual banking document digitization.
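
The abstract compares corrected text against baseline OCR output but does not name the metric used. As an illustration only, one common way to quantify such a comparison is the character error rate (CER): the edit distance between the OCR output and the ground-truth annotation, divided by the annotation length. The sketch below assumes CER as the metric; the function names and the sample strings are hypothetical and are not taken from the report.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the reference (ground-truth) length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Made-up strings for illustration; not data from the report.
    truth = "Account No 12345"
    raw_ocr = "Acc0unt N0 l2345"    # hypothetical baseline OCR output
    corrected = "Account No 12345"  # hypothetical post-corrected output
    print(character_error_rate(truth, raw_ocr))
    print(character_error_rate(truth, corrected))
```

A lower CER after post-processing would indicate that the language model repaired recognition errors. Note that Python strings are compared here by Unicode code point, so for Bengali text, where vowel signs and conjuncts span several code points, a grapheme-aware comparison may be more appropriate.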
Description
Cataloged from PDF version of internship report.
Includes bibliographical references (page 48).
This internship report is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2025.
Publisher Link
Type
Internship Report