Caffe training tricks

  • Implement the model that won the classification task of ImageNet 2013 #33
  • choosing batch sizes and tuning sgd #218
  • Training imagenet: loss does not decrease #401
  • How to train imagenet with reduced memory and batch size? #430
  • Again: Training imagenet: loss does not decrease #3243

Originally base_lr = 0.01 with batch_size = 128; we have also used it with batch_size = 256 and it still works. In theory, when you reduce the batch_size by a factor of X you should reduce the base_lr by a factor of sqrt(X), but Alex used a factor of X (see http://arxiv.org/abs/1404.5997).
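The scaling rule above can be sketched in a few lines. This is a minimal illustration, not part of the original discussion; the function name `scale_base_lr` and the `mode` parameter are invented here. `mode="sqrt"` applies the theoretical square-root scaling, while `mode="linear"` applies the linear rule Alex used in the paper cited above.

```python
def scale_base_lr(base_lr, old_batch, new_batch, mode="sqrt"):
    """Suggest a new base_lr when the batch size changes by a factor k.

    mode="sqrt":   theoretical rule, lr scales with sqrt(k)
    mode="linear": linear rule used by Alex Krizhevsky (arXiv:1404.5997)
    """
    k = new_batch / old_batch
    if mode == "sqrt":
        return base_lr * k ** 0.5
    return base_lr * k

# Halving the batch size from 128 to 64 with base_lr = 0.01:
print(scale_base_lr(0.01, 128, 64, mode="sqrt"))    # ~0.00707
print(scale_base_lr(0.01, 128, 64, mode="linear"))  # 0.005
```

Either way, shrinking the batch shrinks the learning rate; doubling the batch (128 to 256) would likewise increase it, which matches the observation that base_lr = 0.01 still works at batch_size = 256.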

by sguada
