A Preliminary Reading List of Visual Question Answering (VQA) Papers
Survey articles I recently found (I have not read these two closely, so I cannot say how good they are):
Visual Question Answering: Datasets, Algorithms, and Future Challenges
Visual Question Answering: A Survey of Methods and Datasets
Papers:
A multi-world approach to question answering about real-world scenes based on uncertain input. NIPS, 2014.
A relatively early paper.
Ask Your Neurons: A Neural-based Approach to Answering Questions about Images. ICCV 2015.
Also a fairly early paper. The method is basic and typical of the approaches used in the early days of VQA.
Where To Look: Focus Regions for Visual Question Answering.
A paper that adds an attention mechanism.
Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction. CVPR 2016.
A good idea; one of my earlier papers built further on this one.
Exploring models and data for image question answering. NIPS, 2015.
A paper from a group in Canada that proposes a dataset. It is fairly small, but it can be counted among the earliest datasets.
Learning to Answer Questions From Image Using Convolutional Neural Network. AAAI, 2016.
I believe this is from Hang Li's group; I found it unremarkable.
Compositional Memory for Visual Question Answering.
Hierarchical Question-Image Co-Attention for Visual Question Answering. NIPS 2016.
Uses the image to attend over the question, and then the question to attend over the image.
Dynamic Memory Networks for Visual and Textual Question Answering.
An impressive paper: the same model handles both textual and visual question answering, and performs well on both.
Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources. CVPR 2016.
From Chunhua Shen's group; this one incorporates an external knowledge base.
Visual7W: Grounded Question Answering in Images. CVPR 2016.
From Fei-Fei Li's group; this one proposes a new dataset, Visual7W.
Stacked Attention Networks for Image Question Answering
Applies attention over several rounds to progressively localize the image regions relevant to the question.
VQA: Visual Question Answering
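Proposes the VQA dataset, built on MS COCO images and the largest dataset at the time of writing (not to be confused with the smaller COCO-QA dataset). Website: http://www.visualqa.org/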
Neural Module Networks
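Another impressive paper. It has a companion paper by the same authors that is very similar; this is the better of the two. CVPR 2016. It dynamically assembles the network according to the question being asked.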
Image Captioning and Visual Question Answering Based on Attributes and Their Related External Knowledge
From Chunhua Shen's group; extracts image features that encode high-level semantic concepts.
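There should be newer papers by now; searching arXiv is a good way to find them. I have not been working on visual question answering lately, so I have not kept up with recent papers in this area.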
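To make the "several rounds of attention" idea noted under Stacked Attention Networks a little more concrete, here is a minimal sketch in plain Python. It is not the paper's model: the scoring function is a plain dot product rather than a learned layer, and the region features, the question vector, and the function names (`attention_hop`, `stacked_attention`) are all made up for illustration. It only shows the general pattern of attending, folding the attended context back into the query, and attending again.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_hop(regions, query):
    """One attention round: weight each region by its similarity to the query.

    Assumption: similarity is a plain dot product; a real model would use
    learned projections here.
    """
    weights = softmax([dot(r, query) for r in regions])
    dim = len(query)
    context = [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]
    return weights, context


def stacked_attention(regions, question, hops=2):
    """Refine the query over several hops by adding the attended context to it."""
    query = list(question)
    history = []
    for _ in range(hops):
        weights, context = attention_hop(regions, query)
        history.append(weights)
        query = [q + c for q, c in zip(query, context)]
    return query, history


# Toy input: three image regions with 2-d features (hypothetical numbers).
regions = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
question = [1.0, 0.2]
refined, history = stacked_attention(regions, question, hops=2)
for hop, weights in enumerate(history, start=1):
    print(f"hop {hop}:", [round(w, 3) for w in weights])
```

In this toy setup the region least similar to the question receives less attention on the second hop than on the first, which is the sharpening effect that repeated attention is meant to produce.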