What regularization methods can be used with RNN (LSTM) networks?


L2 regularization, weight decay
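A minimal PyTorch sketch of weight decay, using the optimizer's built-in weight_decay argument (the model and the hyperparameter values here are illustrative assumptions):

```python
import torch
import torch.nn as nn

# toy LSTM model; sizes are illustrative
model = nn.LSTM(input_size=100, hidden_size=256, num_layers=2)

# weight_decay adds an L2 penalty on all parameters to the SGD update
optimizer = torch.optim.SGD(model.parameters(), lr=1.0, weight_decay=1.2e-6)
```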

input dropout
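A minimal sketch of input dropout, applied to the embedded inputs before they reach the LSTM (the module name and sizes are assumptions for illustration):

```python
import torch
import torch.nn as nn

class InputDropoutLSTM(nn.Module):  # hypothetical module name
    def __init__(self, vocab_size=10000, emb_dim=100, hidden_dim=256, p=0.4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.input_drop = nn.Dropout(p)      # dropout on the input features
        self.lstm = nn.LSTM(emb_dim, hidden_dim)

    def forward(self, tokens):               # tokens: (seq_len, batch)
        x = self.input_drop(self.embed(tokens))
        out, _ = self.lstm(x)
        return out
```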

mask dropout (presumably variational/locked dropout: one mask sampled per sequence and reused at every time step)
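Reading "mask dropout" as variational/locked dropout, a minimal sketch: a single mask is sampled once and broadcast over the whole time axis, rather than resampled at each step (a (seq_len, batch, features) tensor layout is assumed):

```python
import torch
import torch.nn as nn

class LockedDropout(nn.Module):
    def forward(self, x, p=0.5):
        if not self.training or p == 0.0:
            return x
        # one mask over (1, batch, features), shared across all time steps
        mask = x.new_empty(1, x.size(1), x.size(2)).bernoulli_(1 - p) / (1 - p)
        return x * mask
```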

weight dropout: DropConnect (Wan et al., 2013) applied to the RNN's hidden-to-hidden weight matrices
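A minimal sketch of weight dropout, written with an explicit step loop so the DropConnect-ed hidden-to-hidden matrix can be used functionally (the helper names and the hand-rolled single-layer cell are simplifying assumptions):

```python
import torch
import torch.nn.functional as F

def lstm_step(x_t, h, c, w_ih, w_hh, b):
    # standard LSTM cell; w_hh may already have connections dropped
    gates = x_t @ w_ih.t() + h @ w_hh.t() + b
    i, f, g, o = gates.chunk(4, dim=1)
    c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
    h = torch.sigmoid(o) * torch.tanh(c)
    return h, c

def forward_with_weight_drop(x, h, c, w_ih, w_hh, b, p=0.5, training=True):
    # DropConnect: zero individual entries of the hidden-to-hidden matrix
    # once per forward pass; every step then reuses the same dropped weights
    w_hh_dropped = F.dropout(w_hh, p=p, training=training)
    outputs = []
    for x_t in x:  # x: (seq_len, batch, input_size)
        h, c = lstm_step(x_t, h, c, w_ih, w_hh_dropped, b)
        outputs.append(h)
    return torch.stack(outputs), (h, c)
```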

activation regularization (AR): an L2 penalty on the RNN's hidden activations

temporal activation regularization (TAR): an L2 penalty on the difference between hidden states at consecutive time steps
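A minimal sketch of the AR and TAR penalties as typically added to the training loss (the alpha/beta values are illustrative; dropped_h is assumed to be the LSTM output after dropout and raw_h the output before it):

```python
import torch

def ar_tar_penalty(dropped_h, raw_h, alpha=2.0, beta=1.0):
    # AR: L2 penalty on the (dropped-out) activations themselves
    ar = alpha * dropped_h.pow(2).mean()
    # TAR: L2 penalty on how much the hidden state changes per time step
    tar = beta * (raw_h[1:] - raw_h[:-1]).pow(2).mean()
    return ar + tar
```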

adversarial dropout, fraternal dropout

Fraternal Dropout trains two identical copies of an RNN (sharing parameters) with different dropout masks while minimizing the difference between their (pre-softmax) predictions.
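A minimal sketch of the fraternal dropout objective, assuming a model whose forward pass samples a fresh dropout mask on every call and returns pre-softmax logits of shape (seq_len, batch, vocab); kappa is an illustrative penalty weight:

```python
import torch
import torch.nn.functional as F

def fraternal_dropout_loss(model, inputs, targets, kappa=0.1):
    logits1 = model(inputs)  # first pass: one dropout mask
    logits2 = model(inputs)  # second pass: a different mask, same weights
    # task loss averaged over both passes
    ce = 0.5 * (F.cross_entropy(logits1.flatten(0, 1), targets.flatten())
                + F.cross_entropy(logits2.flatten(0, 1), targets.flatten()))
    # fraternal penalty: squared difference of the two pre-softmax predictions
    penalty = kappa * (logits1 - logits2).pow(2).mean()
    return ce + penalty
```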