[PyTorch] RuntimeError: arguments are located on different GPUs
Source: Internet  Editor: 程序博客网  Time: 2024/05/18 00:55
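0x00 Preface
An optimizer in PyTorch keeps track of internal state such as the step count,
so you will sometimes want to save that state and pick up from it later.
The natural choice is the serialization API that ships with PyTorch:

    torch.save(object, path)
    torch.load(path)

combined with:

    optimizer.state_dict()
    optimizer.load_state_dict(obj)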
So you confidently type something like this:

    torch.save(optimizer.state_dict(), path)
    optimizer.load_state_dict(torch.load(path))

and are rewarded with the following error:
    RuntimeError                              Traceback (most recent call last)
    <ipython-input-160-19f8d61b5e53> in <module>()
         37 optimizer.zero_grad()
         38 loss.backward()
    ---> 39 optimizer.step()
         40 print(model.state_dict()['linear_layer.weight'])
         41

    /usr/local/anaconda2/lib/python2.7/site-packages/torch/optim/adam.pyc in step(self, closure)
         63
         64 # Decay the first and second moment running average coefficient
    ---> 65 exp_avg.mul_(beta1).add_(1 - beta1, grad)
         66 exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)
         67

    RuntimeError: arguments are located on different GPUs at /opt/conda/conda-bld/pytorch_1503966894950/work/torch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:215
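The traceback points at Adam's exp_avg buffer and the gradient sitting on different devices. As a rough way to confirm that before applying any fix, the sketch below compares each parameter's device with the devices of the tensors stored in its optimizer state. The helper name find_device_mismatches is made up for this illustration, and the demo runs on CPU only. One caveat: recent PyTorch versions may keep Adam's step counter as a CPU tensor on purpose, so a mismatch reported for step alone is not necessarily a problem.

```python
import torch


def find_device_mismatches(optimizer):
    """Return (param_device, state_key, state_device) for every state tensor
    that lives on a different device than the parameter it belongs to.

    Hypothetical helper for illustration; not part of the PyTorch API.
    """
    mismatches = []
    for param, state in optimizer.state.items():
        for key, value in state.items():
            if torch.is_tensor(value) and value.device != param.device:
                mismatches.append((param.device, key, value.device))
    return mismatches


# Tiny CPU-only demo: one step so that Adam creates its state buffers.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
model(torch.randn(8, 4)).sum().backward()
optimizer.step()

# Everything lives on the CPU here, so no mismatch is reported.
print(find_device_mismatches(optimizer))
```

Calling the helper right after optimizer.load_state_dict(...) on a GPU machine should list the buffers that would trigger the error above.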
0x01 Solution
I like to lead with the fix, so here it is.
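    import torch

    # Restore the optimizer state in one of two ways.
    # Load from dict
    optimizer.load_state_dict(check_point['optim'])
    # Load from file
    optimizer.load_state_dict(torch.load(optim_path))

    # Add this: move every tensor in the optimizer state to the target GPU.
    # cuda_id is the index of the GPU the model is on.
    for state in optimizer.state.values():
        for k, v in state.items():
            print(type(v))  # debug output: shows what each state entry is
            if torch.is_tensor(v):
                state[k] = v.cuda(cuda_id)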
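The loop above can also be wrapped in a small function and exercised end to end. The following is a minimal sketch under a few assumptions: the helper name move_optimizer_state is invented here, it uses value.to(device) instead of v.cuda(cuda_id) so the same code also runs on a machine without a GPU, and it saves to an in-memory buffer rather than a file. As far as I know, newer PyTorch releases already move loaded state to each parameter's device inside load_state_dict, in which case the extra loop is harmless but usually unnecessary.

```python
import io

import torch


def move_optimizer_state(optimizer, device):
    """Move every tensor held in optimizer.state onto `device`.

    Hypothetical helper mirroring the loop from the post.
    """
    for state in optimizer.state.values():
        for key, value in state.items():
            if torch.is_tensor(value):
                state[key] = value.to(device)


# Falls back to the CPU when no GPU is available.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.nn.Linear(4, 2).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
model(torch.randn(8, 4, device=device)).sum().backward()
optimizer.step()

# Save the optimizer state to an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
torch.save(optimizer.state_dict(), buffer)
buffer.seek(0)

# Restore into a fresh optimizer, then move its state to the target device.
restored = torch.optim.Adam(model.parameters(), lr=1e-3)
restored.load_state_dict(torch.load(buffer))
move_optimizer_state(restored, device)

# The restored optimizer can take another step on the same device.
restored.zero_grad()
model(torch.randn(8, 4, device=device)).sum().backward()
restored.step()
```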
0x02 How it works
Now for the slower part: why this works.
First off, I found this fix while browsing the issue tracker:
Thanks to pytorch/issues/2830
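Here is one way to picture it. Say the state used to live on GPU 3, so its type is cuda.Tensor (GPU 3).
But who knows which machine's GPU 3 that was? The machine asks its GPU 3, "Is this one of yours?"
GPU 3 glances at the battlefield, swept clean after the earlier computation and now completely empty, and answers, "Don't think so, nobody's home here."
And so it politely turns the tensor away.
Put plainly, the tensors that come back from torch.load are not guaranteed to sit on the device where the parameters and gradients now live, and optimizer.step() fails as soon as it mixes the two.
That is why we walk through the freshly loaded optimizer and ask about each entry in turn: anything that is a tensor gets introduced to GPU 3 once more.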
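A related option, which is not the approach taken in this post, is to tell torch.load where the tensors should go at load time through its map_location argument. The sketch below assumes a CPU target so it runs anywhere; on a GPU machine the target would be something like "cuda:0". The variable names are illustrative only.

```python
import io

import torch

# Build an optimizer with some state and save it to an in-memory buffer.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
model(torch.randn(8, 4)).sum().backward()
optimizer.step()

buffer = io.BytesIO()
torch.save(optimizer.state_dict(), buffer)
buffer.seek(0)

# map_location remaps every stored tensor onto the given device while loading.
# Assumption: CPU target for the demo; use e.g. "cuda:0" on a GPU machine.
target = torch.device("cpu")
state_dict = torch.load(buffer, map_location=target)

restored = torch.optim.Adam(model.parameters(), lr=1e-3)
restored.load_state_dict(state_dict)
```

The loop from the solution section fixes the state after loading; map_location avoids putting it on an unexpected device in the first place. Either one can be enough on its own, depending on the PyTorch version in use.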