
dc.contributor.author: Jha, Sudan
dc.contributor.author: Dey, Anirban
dc.contributor.author: Kumar, Raghvendra
dc.contributor.author: Kumar-Solanki, Vijender
dc.date: 2019-06
dc.date.accessioned: 2022-02-24T10:50:56Z
dc.date.available: 2022-02-24T10:50:56Z
dc.identifier.issn: 1989-1660
dc.identifier.uri: https://reunir.unir.net/handle/123456789/12505
dc.description.abstract: Visual Question Answering (VQA) is a stimulating task at the intersection of Natural Language Processing (NLP) and Computer Vision (CV), in which a machine answers a natural language question about an image. Questions can be open-ended or multiple-choice. VQA datasets contain three main components: questions, images, and answers. Researchers have addressed the VQA problem with deep learning architectures that combine two networks, a Convolutional Neural Network (CNN) for the visual (image) representation and a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) for the textual (question) representation, and train the combined network end to end to generate the answer. Such models can answer common, simple questions that relate directly to the image's content, but different types of questions require different levels of understanding to produce correct answers. To address this, we use Faster Region-based CNN (Faster R-CNN) to extract image features, with an extra fully connected layer whose weights are obtained dynamically by an LSTM cell according to the question. We claim in this paper that a single R-CNN architecture can solve VQA problems by modifying the weights in the parameter prediction layer. We trained the network end to end with Stochastic Gradient Descent (SGD), using pretrained Faster R-CNN and LSTM models, and tested it on benchmark VQA datasets. [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: International Journal of Interactive Multimedia and Artificial Intelligence (IJIMAI) [es_ES]
dc.relation.ispartofseries: ;vol. 5, nº 5
dc.relation.uri: https://www.ijimai.org/journal/bibcite/reference/2688 [es_ES]
dc.rights: openAccess [es_ES]
dc.subject: computer vision [es_ES]
dc.subject: neural network [es_ES]
dc.subject: natural language processing [es_ES]
dc.subject: stochastic gradient descent [es_ES]
dc.subject: long short term memory [es_ES]
dc.subject: IJIMAI [es_ES]
dc.title: A Novel Approach on Visual Question Answering by Parameter Prediction using Faster Region Based Convolutional Neural Network [es_ES]
dc.type: article [es_ES]
reunir.tag: ~IJIMAI [es_ES]
dc.identifier.doi: http://doi.org/10.9781/ijimai.2018.08.004
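
The parameter prediction idea described in the abstract, a fully connected layer whose weights depend on the question, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy dimensions, the `predictor` and `classifier` matrices, the tanh nonlinearity, and the random vectors standing in for Faster R-CNN image features and LSTM question encodings are all assumptions made for illustration.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_layer_weights(question_vec, predictor, out_dim, in_dim):
    """Map a question encoding to the weights of a fully connected layer.

    `predictor` is a hypothetical learned matrix with
    out_dim * in_dim rows and len(question_vec) columns; its output
    is reshaped into an out_dim x in_dim weight matrix.
    """
    flat = matvec(predictor, question_vec)
    return [flat[r * in_dim:(r + 1) * in_dim] for r in range(out_dim)]


def answer_distribution(image_feat, question_vec, predictor, classifier, hidden_dim):
    """Apply the question-dependent layer, then a fixed answer classifier."""
    weights = predict_layer_weights(question_vec, predictor, hidden_dim, len(image_feat))
    hidden = [math.tanh(h) for h in matvec(weights, image_feat)]
    return softmax(matvec(classifier, hidden))


# Toy dimensions; random values stand in for learned parameters and encoder outputs.
rng = random.Random(0)
IMG_DIM, Q_DIM, HIDDEN_DIM, NUM_ANSWERS = 6, 4, 3, 5
predictor = [[rng.uniform(-1, 1) for _ in range(Q_DIM)] for _ in range(HIDDEN_DIM * IMG_DIM)]
classifier = [[rng.uniform(-1, 1) for _ in range(HIDDEN_DIM)] for _ in range(NUM_ANSWERS)]
image_feat = [rng.uniform(-1, 1) for _ in range(IMG_DIM)]  # stand-in for image features
question_a = [rng.uniform(-1, 1) for _ in range(Q_DIM)]    # stand-in for one question encoding
question_b = [rng.uniform(-1, 1) for _ in range(Q_DIM)]    # stand-in for another question

probs_a = answer_distribution(image_feat, question_a, predictor, classifier, HIDDEN_DIM)
probs_b = answer_distribution(image_feat, question_b, predictor, classifier, HIDDEN_DIM)
```

In this sketch the same image features yield different answer distributions for the two questions, because only the predicted weight matrix changes between the calls.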

