Paper notes: Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui*1,2, Dong Huk Park*1, Daylen Yang*1, Anna Rohrbach*1,3, Trevor Darrell1, Marcus Rohrbach1
1 UC Berkeley EECS, CA, United States; 2 Sony Corp., Tokyo, Japan; 3 Max Planck Institute for Informatics, Saarbrücken, Germany
arXiv:1606.01847v2 [cs.CV] 23 Jun 2016
Abstract:
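In recent years, vector representations trained on large language or visual datasets have been used successfully to model textual and visual information.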
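However, VQA requires "fusing" these vector representations. Existing approaches to multimodal pooling include element-wise product, element-wise sum, and concatenation.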
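The paper hypothesizes that these methods are less expressive than an outer product of the two vectors; the outer product, however, is usually infeasible in practice because of its high dimensionality.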
The paper proposes MCB (Multimodal Compact Bilinear pooling) to combine multimodal features efficiently and expressively.
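The paper also proposes an architecture that uses MCB twice: once to predict attention over spatial features, and once to fuse the "attended representation" with the "question representation".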
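To make the idea concrete, here is a minimal NumPy sketch of compact bilinear pooling as I understand it: each input vector is projected with a Count Sketch, and the two sketches are combined by circular convolution computed through the FFT, which equals the Count Sketch of the full outer product without ever building it. The function names (`make_sketch_params`, `count_sketch`, `mcb`) and the small dimensions are my own assumptions for illustration, not the authors' code. For scale, the outer product of two 2048-dimensional vectors (an illustrative size) would already have 2048 × 2048 ≈ 4.2 million entries.

```python
import numpy as np

def make_sketch_params(n, d, rng):
    """Draw a fixed random hash h: [n] -> [d] and signs s: [n] -> {-1, +1}."""
    h = rng.integers(0, d, size=n)
    s = rng.choice(np.array([-1.0, 1.0]), size=n)
    return h, s

def count_sketch(v, h, s, d):
    """Project v (length n) down to length d: y[h[i]] += s[i] * v[i]."""
    y = np.zeros(d)
    np.add.at(y, h, s * v)  # unbuffered add, so colliding indices accumulate
    return y

def mcb(x, q, params_x, params_q, d):
    """Approximate the outer product of x and q with a d-dimensional vector."""
    sx = count_sketch(x, params_x[0], params_x[1], d)
    sq = count_sketch(q, params_q[0], params_q[1], d)
    # Circular convolution of the two sketches, done in the frequency domain.
    return np.fft.irfft(np.fft.rfft(sx) * np.fft.rfft(sq), n=d)

# Hypothetical usage: x stands in for a visual feature, q for a question feature.
rng = np.random.default_rng(0)
n, d = 8, 16                      # toy sizes, chosen only for the demo
x = rng.standard_normal(n)
q = rng.standard_normal(n)
params_x = make_sketch_params(n, d, rng)   # sampled once, then kept fixed
params_q = make_sketch_params(n, d, rng)
z = mcb(x, q, params_x, params_q, d)       # fused feature of length d
```

The hash and sign vectors are sampled once and reused for every input, so the projection is a fixed (non-learned) linear map; only the output dimension `d` needs to be chosen.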