Deep Reinforcement Learning-based Image Captioning with Embedding Reward
Source: Internet · Published: 基站离线数据库 2017 · Editor: 程序博客网 · Time: 2024/06/09 00:29
https://arxiv.org/abs/1704.03899
Image captioning is a challenging problem owing to the complexity in understanding the image content and diverse ways of describing it in natural language. Recent advances in deep neural networks have substantially improved the performance of this task. Most state-of-the-art approaches follow an encoder-decoder framework, which generates captions using a sequential recurrent prediction model. However, in this paper, we introduce a novel decision-making framework for image captioning. We utilize a "policy network" and a "value network" to collaboratively generate captions. The policy network serves as a local guidance by providing the confidence of predicting the next word according to the current state. Additionally, the value network serves as a global and lookahead guidance by evaluating all possible extensions of the current state. In essence, it adjusts the goal of predicting the correct words towards the goal of generating captions similar to the ground truth captions. We train both networks using an actor-critic reinforcement learning model, with a novel reward defined by visual-semantic embedding. Extensive experiments and analyses on the Microsoft COCO dataset show that the proposed framework outperforms state-of-the-art approaches across different evaluation metrics.
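To make the abstract's idea of "collaborative" generation more concrete, here is a minimal sketch of how a local policy confidence and a global value estimate might be blended when picking the next word, and how a reward could be read off a visual-semantic embedding. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the linear blend, the weight `lam`, the function names, and the use of cosine similarity as the reward. The toy numbers stand in for the outputs of trained networks.

```python
import math


def combined_score(log_prob, value, lam=0.4):
    """Blend the policy's local confidence with the value network's global estimate.

    Assumption: a simple linear mix weighted by `lam` (hypothetical parameter).
    """
    return lam * log_prob + (1.0 - lam) * value


def choose_next_word(policy_log_probs, values, lam=0.4):
    """Return the candidate word with the highest blended score.

    policy_log_probs: dict mapping word -> log p(word | current state)
    values: dict mapping word -> value estimate of the state extended by that word
    """
    return max(
        policy_log_probs,
        key=lambda w: combined_score(policy_log_probs[w], values[w], lam),
    )


def embedding_reward(image_vec, caption_vec):
    """Cosine similarity between an image embedding and a caption embedding.

    Assumption: both vectors already live in a shared embedding space.
    """
    dot = sum(a * b for a, b in zip(image_vec, caption_vec))
    norm = math.sqrt(sum(a * a for a in image_vec)) * math.sqrt(
        sum(b * b for b in caption_vec)
    )
    return dot / norm if norm else 0.0


# Toy example: the policy slightly prefers "dog", but the value estimate
# says extending the caption with "cat" leads to a better final caption.
policy_log_probs = {"dog": math.log(0.6), "cat": math.log(0.4)}
values = {"dog": 0.1, "cat": 0.9}

policy_only = choose_next_word(policy_log_probs, values, lam=1.0)
blended = choose_next_word(policy_log_probs, values, lam=0.4)
```

With `lam=1.0` the choice depends only on the policy and picks "dog"; with `lam=0.4` the value estimate outweighs the small gap in policy confidence and the choice becomes "cat". That is the sense in which a value estimate can shift the objective from predicting the locally likeliest word towards producing a better overall caption.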