Metal Performance Shaders: Usage Notes

Source: Internet. Editor: 程序博客网. Time: 2024/04/30 13:03

Introduction to Metal Performance Shaders

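Metal Performance Shaders (MPS) is a framework from Apple for running deep learning on iOS on top of Metal. Its core data container is MPSImage, which stores data and manages memory and plays the same role as Blob in Caffe or NDArray in MXNet. It also implements the layers commonly used in convolutional neural networks, such as Convolution, Pooling, FullyConnected, and ReLU. MPS is arguably the most convenient and most efficient way to run a CNN on the GPU on iOS: Apple's MPS API is cleanly designed, and it is heavily optimized for the specific hardware architecture of the iPhone and iPad.
On iOS, deep learning usually means inference (applying a trained model), not training. The usual approach is to train on a PC or server with tools such as MXNet, Caffe, TensorFlow, or Torch, convert the trained network model into a storage format that MPS can read, and then run inference on the iOS device with the converted parameters.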

Model conversion

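Model parameters trained on the server must be converted before MPS can use them. In a typical CNN, the only layers with trainable parameters are essentially Convolution, FullyConnected, and Normalization, so it is enough to extract the parameters of these three layer types and convert them into the format MPS expects. The MXNet example below has two main steps: folding each batch normalization layer into the preceding Conv layer by transforming the Conv weights and bias, which removes the batch normalization layer; and reordering the Cin, Cout, kernel_width, and kernel_height dimensions of the Conv weights. (MPS expects weight[channel_out][kernel_height][kernel_width][channel_in].)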
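The folding step can be justified by writing out a Conv layer followed by batch normalization for one output channel i. This is a sketch under stated assumptions: inference-time batch normalization using the stored moving mean μ and moving variance σ², with the same ε that was used in training; the symbols W, b, a, and x are notation introduced here for illustration.

```latex
\begin{aligned}
y_i &= \gamma_i \,\frac{(W_i * x + b_i) - \mu_i}{\sqrt{\sigma_i^2 + \epsilon}} + \beta_i \\
    &= (a_i W_i) * x + \bigl(\beta_i + a_i\,(b_i - \mu_i)\bigr),
\qquad a_i = \frac{\gamma_i}{\sqrt{\sigma_i^2 + \epsilon}}
\end{aligned}
```

So scaling the weights of output channel i by a_i and replacing the bias with β_i + a_i (b_i − μ_i) gives a single Conv layer with the same output as the original Conv + BN pair.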

import math
import os

import mxnet as mx
import numpy as np

out_dir = './params'  # directory the converted parameters are written to
BN_EPS = 0.001        # should match the eps of the BatchNorm layers

# Load the trained MXNet model parameters.
arg_params = mx.nd.load('models/simple1_args.nd')
aux_params = mx.nd.load('models/simple1_auxs.nd')

# Conv layers that are followed by a batch normalization layer.
conv_bn_layers = {'conv0': 'conv0_bn',
                  'down0': 'down0_bn',
                  'down1': 'down1_bn',
                  'conv1': 'conv1_bn',
                  'conv2': 'conv2_bn',
                  'conv3': 'conv3_bn'}
# Conv layers without batch normalization.
conv_layers = ['conv4']


def save_for_mps(name, weight, bias):
    # MXNet: weight[Cout][Cin][H][W] -> MPS: weight[Cout][H][W][Cin]
    weight = np.transpose(weight, (0, 2, 3, 1))
    # tofile() writes the raw values in C order, with no header.
    bias.tofile(os.path.join(out_dir, name + '_bias.dat'))
    weight.tofile(os.path.join(out_dir, name + '_weight.dat'))


for conv_layer, bn_layer in conv_bn_layers.items():
    weight = arg_params[conv_layer + '_weight'].asnumpy()
    bias = arg_params[conv_layer + '_bias'].asnumpy()
    gamma = arg_params[bn_layer + '_gamma'].asnumpy()
    beta = arg_params[bn_layer + '_beta'].asnumpy()
    var = aux_params[bn_layer + '_moving_var'].asnumpy()
    mean = aux_params[bn_layer + '_moving_mean'].asnumpy()
    # Fold BN into the conv layer, one output channel at a time.
    for i in range(weight.shape[0]):
        a = gamma[i] / math.sqrt(var[i] + BN_EPS)
        bias[i] = beta[i] + a * (bias[i] - mean[i])
        weight[i, :, :, :] = a * weight[i, :, :, :]
    save_for_mps(conv_layer, weight, bias)

for conv_layer in conv_layers:
    weight = arg_params[conv_layer + '_weight'].asnumpy()
    bias = arg_params[conv_layer + '_bias'].asnumpy()
    save_for_mps(conv_layer, weight, bias)
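Before shipping converted parameters to the device, it can help to check the two steps of the script above on synthetic data. The following is a minimal sketch, not part of the original script: it uses only NumPy (no MXNet or MPS), random arrays stand in for real model parameters, and the helpers conv2d and fold_bn are hypothetical names introduced for illustration. It compares Conv followed by inference-time BN against the folded Conv, then checks that the reordered weights survive a raw float32 round trip.

```python
import numpy as np


def conv2d(x, w, b):
    """Naive 'valid' convolution (illustrative helper).

    x: [Cin, H, W], w: [Cout, Cin, kH, kW], b: [Cout] -> y: [Cout, oH, oW]
    """
    cout, cin, kh, kw = w.shape
    oh, ow = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    y = np.empty((cout, oh, ow), dtype=x.dtype)
    for i in range(oh):
        for j in range(ow):
            patch = x[:, i:i + kh, j:j + kw]
            # Contract the Cin, kH, kW axes of w with the patch.
            y[:, i, j] = np.tensordot(w, patch, axes=3) + b
    return y


def fold_bn(w, b, gamma, beta, mean, var, eps=1e-3):
    """Fold inference-time batch normalization into conv weights and bias."""
    a = gamma / np.sqrt(var + eps)
    return w * a[:, None, None, None], beta + a * (b - mean)


rng = np.random.default_rng(0)
cout, cin, k = 4, 2, 3
x = rng.standard_normal((cin, 6, 6))
w = rng.standard_normal((cout, cin, k, k))
b = rng.standard_normal(cout)
gamma = rng.standard_normal(cout)
beta = rng.standard_normal(cout)
mean = rng.standard_normal(cout)
var = rng.uniform(0.5, 2.0, cout)
eps = 1e-3

# Reference: Conv followed by batch normalization with moving statistics.
y = conv2d(x, w, b)
scale = (gamma / np.sqrt(var + eps))[:, None, None]
ref = scale * (y - mean[:, None, None]) + beta[:, None, None]

# Folded: a single Conv with transformed weights and bias.
w_f, b_f = fold_bn(w, b, gamma, beta, mean, var, eps)
folded = conv2d(x, w_f, b_f)
max_err = np.abs(ref - folded).max()
print("max abs difference:", max_err)

# Layout change: [Cout][Cin][H][W] -> [Cout][H][W][Cin], stored as float32.
w_mps = np.ascontiguousarray(np.transpose(w_f, (0, 2, 3, 1)), dtype=np.float32)

# Round trip through raw bytes, which is what tofile() would write.
raw = w_mps.tobytes()
restored = np.frombuffer(raw, dtype=np.float32).reshape(cout, k, k, cin)
layout_ok = np.array_equal(
    restored, np.transpose(w_f, (0, 2, 3, 1)).astype(np.float32))
print("layout round trip ok:", layout_ok)
```

The raw files carry no shape or dtype information, so the reader on the iOS side has to know the shape and element type of every layer in advance.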

MPS usage workflow

mps https://developer.apple.com/reference/metalperformanceshaders
inception-v3_demo https://developer.apple.com/library/content/samplecode/MetalImageRecognition/Introduction/Intro.html
metal https://developer.apple.com/library/content/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html

Tips

Metal_debugger_tools https://developer.apple.com/library/content/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Dev-Technique/Dev-Technique.html

https://developer.apple.com/videos/play/wwdc2015/610/

0 0