Online Sequence-to-Sequence Active Learning for Open-Domain Dialogue Generation

Source: Internet  Editor: 程序博客网  Time: 2024/05/21 22:59

Introduction

  • Seq2Seq drawback: generates short, dull, and inconsistent responses.
  • DRL (deep reinforcement learning):
    • reward function: mostly hand-crafted

This paper proposes an end-to-end, neural-network-based generative conversational model that learns open-domain conversation skills through online interaction with human users.

Model

  • Offline Two-Phase Supervised Learning
    • responses are short and dull
    • use Online Active Learning to tackle this issue
  • Online Active Learning
    • interacts with real users and learns incrementally from their feedback at each turn of dialog
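
The online loop above can be illustrated with a minimal sketch. This is a toy, not the paper's model: the class `ToyOnlineResponder`, the function `run_turn`, the score table, and the feedback callback are all hypothetical names introduced here. It assumes that at each turn the system proposes a few candidate responses, the user either picks one or supplies a better one, and the model is updated immediately on that single piece of feedback.

```python
from collections import defaultdict


class ToyOnlineResponder:
    """Hypothetical stand-in for a dialogue model (NOT the paper's network).

    It keeps a preference score per (prompt, response) pair and is
    updated one turn at a time from user feedback.
    """

    def __init__(self, candidates, lr=1.0):
        self.candidates = list(candidates)   # known responses
        self.lr = lr                         # assumed step size per feedback
        self.scores = defaultdict(float)     # (prompt, response) -> score

    def propose(self, prompt, k=2):
        # Rank known responses by learned preference for this prompt.
        # sorted() is stable, so ties keep their original order.
        ranked = sorted(
            self.candidates,
            key=lambda r: self.scores[(prompt, r)],
            reverse=True,
        )
        return ranked[:k]

    def update(self, prompt, chosen):
        # Incremental update from a single turn of feedback.
        if chosen not in self.candidates:
            self.candidates.append(chosen)   # user supplied a new response
        self.scores[(prompt, chosen)] += self.lr


def run_turn(model, prompt, get_feedback, k=2):
    """One dialogue turn: propose, collect feedback, learn immediately."""
    options = model.propose(prompt, k)
    chosen = get_feedback(prompt, options)   # user picks or writes a reply
    model.update(prompt, chosen)
    return chosen


if __name__ == "__main__":
    model = ToyOnlineResponder(["I don't know.", "OK."])
    prompt = "What do you do for fun?"
    # Simulated user who rejects the dull options and writes their own.
    run_turn(model, prompt, lambda p, opts: "I love hiking.")
    print(model.propose(prompt, k=1))
```

In a real system the score table would be replaced by gradient updates to the Seq2Seq model's parameters; the sketch only shows the propose, feedback, update cycle at each turn.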

datasets: considerably small (300K and 8K, respectively)
