Towards devising an efficient VQA in the Bengali Language

This paper aims to provide insight into how visual question answering (VQA) might work on Bangla datasets compared with English datasets. Several studies have applied deep learning methods to Bangla datasets. However, a Bangla dataset that pairs images with questions has yet to be created. Through our research, we attempted to create a Bangla dataset suitable for this task. The step-by-step procedures in our work demonstrate how various barriers can be overcome while developing such datasets. Because no Bangla datasets had been created for this specific task, we attempted to use existing visual question answering datasets. In the end, we successfully created our own Bangla visual question answering datasets and proposed a model to train on them and compare against existing datasets. We then provide a comparison showing how the Bangla dataset differs from the English datasets with respect to the VQA model. Our work should open the way for future research on, and implementation of, visual question answering tasks in Bangla.

Keywords

Natural Language Processing, CLEVR, VQA V1, Visual Question Answering

LC Subject Headings

Natural language processing (Computer science).

Description

Cataloged from PDF version of thesis.
Includes bibliographical references (pages 50-53).
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2021.

Department

Department of Computer Science and Engineering

Type

Thesis

Collections

Thesis (Bachelor of Science in Computer Science)
