A survey on script segmentation for Bangla OCR

Publisher

Center for Research on Bangla Language Processing (CRBLP), BRAC University

Abstract

Script segmentation is an important preliminary task for any Optical Character Recognition (OCR) software, and it matters most in off-line OCR of printed characters. Segmentation breaks a large image of a written document into many small pieces, which are then used for pattern matching to determine the expected sequence of characters. Script segmentation can likewise play a vital role in implementing Bangla OCR. For accurate segmentation, however, it is necessary to identify the properties of Bangla script as well as its exceptions. This paper describes the most important and useful properties, advantages, and disadvantages of various Bangla scripts, especially printed scripts. It also offers some ideas on the prospects of Bangla OCR and its applications.
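
As a rough illustration of how a page image can be fragmented into smaller pieces, the sketch below uses projection profiles, a common generic segmentation technique. It is not taken from the report: the function names (`runs`, `segment_lines`, `segment_pieces`) and the binary-image representation (1 = ink, 0 = background) are assumptions made for this example, and the sketch ignores every Bangla-specific property and exception the abstract refers to.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed representation: rows of 0 (background) / 1 (ink)


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end_exclusive) for each maximal run of non-zero values."""
    found = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            found.append((start, i))       # a blank gap ends the run
            start = None
    if start is not None:                  # run reaches the end of the profile
        found.append((start, len(profile)))
    return found


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Split a page into text lines using the horizontal projection profile."""
    row_profile = [sum(row) for row in image]  # ink pixels per row
    return runs(row_profile)


def segment_pieces(image: Image, top: int, bottom: int) -> List[Tuple[int, int]]:
    """Split one text line (rows top..bottom-1) at blank columns."""
    width = len(image[0]) if image else 0
    col_profile = [
        sum(image[y][x] for y in range(top, bottom)) for x in range(width)
    ]
    return runs(col_profile)


# Hypothetical 5x5 "page" with two text lines.
page = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]

for top, bottom in segment_lines(page):
    print((top, bottom), segment_pieces(page, top, bottom))
```

Splitting at blank columns only separates pieces that do not touch. Where characters in a script are joined, each piece would be a larger unit than a single character, and dividing it further would depend on the script properties the report surveys.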

Description

Includes bibliographical references (page 5).

Type

Technical Report