Understanding Confusing Points in NumPy

Source: Internet  Editor: 程序博客网  Date: 2024/05/16 18:03

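While working through Andrew Ng's machine learning course I ran into a few confusing points, so I ran some experiments and wrote them down here for future reference.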

shape

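It helps to think of this by analogy with tensors in TensorFlow. Take shape (x, y, z) as an example. Suppose one large block of space is packed into a huge box. First, split it into x parts and put each part into its own box. Next, split the space inside each of those boxes into y parts and again put each part into its own box. Finally, split the space inside each of those boxes into z parts; these z parts are the individual elements sitting in the innermost box. This shape therefore has 3 dimensions (or axes). Suppose x, y, z are 4, 3, 2: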

import numpy as np
big_big_box = np.ones([4,3,2])
big_box = big_big_box[0]
box = big_box[0]
print('------------')
print(big_big_box)
print('------------')
print(big_box)
print('------------')
print(box)
print('------------')

------------
[[[ 1.  1.]
  [ 1.  1.]
  [ 1.  1.]]

 [[ 1.  1.]
  [ 1.  1.]
  [ 1.  1.]]

 [[ 1.  1.]
  [ 1.  1.]
  [ 1.  1.]]

 [[ 1.  1.]
  [ 1.  1.]
  [ 1.  1.]]]
------------
[[ 1.  1.]
 [ 1.  1.]
 [ 1.  1.]]
------------
[ 1.  1.]
------------

Each pair of square brackets [ ] stands for one box, so the output above shows how the big box is structured.
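As a small sketch of the box picture (the checks below are illustrative additions, not part of the original experiments), the nesting can be confirmed through the array's shape, ndim, and size attributes, and by looking at what indexing does to the shape:

```python
import numpy as np

big_big_box = np.ones([4, 3, 2])

# shape lists how many boxes there are at each nesting level, outermost first
print(big_big_box.shape)        # (4, 3, 2)
# ndim is the number of axes, i.e. the length of the shape tuple
print(big_big_box.ndim)         # 3
# size is the total number of elements: 4 * 3 * 2
print(big_big_box.size)         # 24

# Indexing the outermost axis opens one box and removes that axis
big_box = big_big_box[0]
box = big_box[0]
print(big_box.shape)            # (3, 2)
print(box.shape)                # (2,)
```

Each level of indexing strips the leading entry from the shape, which matches opening one box at a time.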

reshape

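reshape repacks the boxes while keeping the total amount of space (the number of elements) unchanged. To have something to compare against, first generate a random big box of shape [4, 3, 2]: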

big_big_box = np.random.random([4,3,2])
big_box = big_big_box[0]
box = big_box[0]
print('------------')
print(big_big_box)
print('------------')
print(big_box)
print('------------')
print(box)
print('------------')

------------
[[[ 0.57147915  0.20856683]
  [ 0.24160326  0.91972301]
  [ 0.84122294  0.50314707]]

 [[ 0.10866992  0.37181455]
  [ 0.64949617  0.07105958]
  [ 0.95536348  0.01871577]]

 [[ 0.39776123  0.86779423]
  [ 0.73960679  0.5238423 ]
  [ 0.73531635  0.07187795]]

 [[ 0.51312551  0.47686621]
  [ 0.9699309   0.97417971]
  [ 0.43582324  0.47256879]]]
------------
[[ 0.57147915  0.20856683]
 [ 0.24160326  0.91972301]
 [ 0.84122294  0.50314707]]
------------
[ 0.57147915  0.20856683]
------------

Take every element out of its box and put them all into one big box. Watch the order!

box1 = big_big_box.reshape(4*3*2)
print(box1)

[ 0.57147915  0.20856683  0.24160326  0.91972301  0.84122294  0.50314707
  0.10866992  0.37181455  0.64949617  0.07105958  0.95536348  0.01871577
  0.39776123  0.86779423  0.73960679  0.5238423   0.73531635  0.07187795
  0.51312551  0.47686621  0.9699309   0.97417971  0.43582324  0.47256879]

Now pack them into two boxes, again watching the order:

box2 = big_big_box.reshape(2, 4*3)
print(box2)

[[ 0.57147915  0.20856683  0.24160326  0.91972301  0.84122294  0.50314707
   0.10866992  0.37181455  0.64949617  0.07105958  0.95536348  0.01871577]
 [ 0.39776123  0.86779423  0.73960679  0.5238423   0.73531635  0.07187795
   0.51312551  0.47686621  0.9699309   0.97417971  0.43582324  0.47256879]]

What about three boxes?

box3 = big_big_box.reshape(3, 4*2)
print(box3)

[[ 0.57147915  0.20856683  0.24160326  0.91972301  0.84122294  0.50314707
   0.10866992  0.37181455]
 [ 0.64949617  0.07105958  0.95536348  0.01871577  0.39776123  0.86779423
   0.73960679  0.5238423 ]
 [ 0.73531635  0.07187795  0.51312551  0.47686621  0.9699309   0.97417971
   0.43582324  0.47256879]]

As the outputs show, repacking never changes the order of the elements; only the box each element belongs to changes.
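The random values make the ordering a little hard to follow by eye. As a minimal sketch, assuming NumPy's default row-major (order='C') layout and using np.arange in place of the random data above, the same observation can be checked programmatically:

```python
import numpy as np

# arange makes the element order easy to read: 0, 1, 2, ..., 23
a = np.arange(24).reshape(4, 3, 2)

flat = a.reshape(24)
two = a.reshape(2, 12)
three = a.reshape(3, 8)

# Flattening every repacked array gives back the same sequence
print(np.array_equal(flat, np.arange(24)))   # True
print(np.array_equal(two.ravel(), flat))     # True
print(np.array_equal(three.ravel(), flat))   # True

# The second of the three boxes simply holds elements 8 through 15
print(three[1])                              # [ 8  9 10 11 12 13 14 15]
```

Only the box boundaries move; the elements themselves stay in their original sequence.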

# -1 means that dimension is inferred automatically
box4 = box3.reshape(2,-1)
box5 = box3.reshape(4,-1)
box6 = box3.reshape(6,-1)
box7 = box3.reshape(2,2,-1)
print(box4)
print('-----------')
print(box5)
print('-----------')
print(box6)
print('-----------')
print(box7)

[[ 0.57147915  0.20856683  0.24160326  0.91972301  0.84122294  0.50314707
   0.10866992  0.37181455  0.64949617  0.07105958  0.95536348  0.01871577]
 [ 0.39776123  0.86779423  0.73960679  0.5238423   0.73531635  0.07187795
   0.51312551  0.47686621  0.9699309   0.97417971  0.43582324  0.47256879]]
-----------
[[ 0.57147915  0.20856683  0.24160326  0.91972301  0.84122294  0.50314707]
 [ 0.10866992  0.37181455  0.64949617  0.07105958  0.95536348  0.01871577]
 [ 0.39776123  0.86779423  0.73960679  0.5238423   0.73531635  0.07187795]
 [ 0.51312551  0.47686621  0.9699309   0.97417971  0.43582324  0.47256879]]
-----------
[[ 0.57147915  0.20856683  0.24160326  0.91972301]
 [ 0.84122294  0.50314707  0.10866992  0.37181455]
 [ 0.64949617  0.07105958  0.95536348  0.01871577]
 [ 0.39776123  0.86779423  0.73960679  0.5238423 ]
 [ 0.73531635  0.07187795  0.51312551  0.47686621]
 [ 0.9699309   0.97417971  0.43582324  0.47256879]]
-----------
[[[ 0.57147915  0.20856683  0.24160326  0.91972301  0.84122294  0.50314707]
  [ 0.10866992  0.37181455  0.64949617  0.07105958  0.95536348  0.01871577]]

 [[ 0.39776123  0.86779423  0.73960679  0.5238423   0.73531635  0.07187795]
  [ 0.51312551  0.47686621  0.9699309   0.97417971  0.43582324  0.47256879]]]
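To make the inference rule explicit, here is a short sketch (np.arange stands in for the random data, and the failing 5-row case is an added illustration): the inferred size is the total element count divided by the product of the sizes that were given, so the division has to come out exact.

```python
import numpy as np

a = np.arange(24).reshape(3, 8)

# The missing size is 24 divided by the product of the given sizes
print(a.reshape(2, -1).shape)      # (2, 12)
print(a.reshape(4, -1).shape)      # (4, 6)
print(a.reshape(6, -1).shape)      # (6, 4)
print(a.reshape(2, 2, -1).shape)   # (2, 2, 6)

# 24 is not divisible by 5, so no size can be inferred and NumPy raises
try:
    a.reshape(5, -1)
    inferred = True
except ValueError as err:
    inferred = False
    print('reshape failed:', err)
```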

axis

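Functions such as np.sum and np.linalg.norm take an axis parameter that specifies which axis to operate along. If it is omitted, the operation runs over every element; if it is given, the operation runs along that axis. The axes here are the axes of the shape: shape (x, y, z) has three axes, numbered 0, 1, 2.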

First generate an all-ones array with three axes:

box = np.ones([4,3,2])
print(box)

[[[ 1.  1.]
  [ 1.  1.]
  [ 1.  1.]]

 [[ 1.  1.]
  [ 1.  1.]
  [ 1.  1.]]

 [[ 1.  1.]
  [ 1.  1.]
  [ 1.  1.]]

 [[ 1.  1.]
  [ 1.  1.]
  [ 1.  1.]]]

# No axis given: all elements are added together
print(np.sum(box))
print(np.sum(box, keepdims=True))

24.0
[[[ 24.]]]

# With an axis given
print(np.sum(box, axis=0))
print(np.sum(box, axis=0, keepdims=True))
print('------------')
print(np.sum(box, axis=1))
print(np.sum(box, axis=1, keepdims=True))
print('------------')
print(np.sum(box, axis=2))
print(np.sum(box, axis=2, keepdims=True))

[[ 4.  4.]
 [ 4.  4.]
 [ 4.  4.]]
[[[ 4.  4.]
  [ 4.  4.]
  [ 4.  4.]]]
------------
[[ 3.  3.]
 [ 3.  3.]
 [ 3.  3.]
 [ 3.  3.]]
[[[ 3.  3.]]

 [[ 3.  3.]]

 [[ 3.  3.]]

 [[ 3.  3.]]]
------------
[[ 2.  2.  2.]
 [ 2.  2.  2.]
 [ 2.  2.  2.]
 [ 2.  2.  2.]]
[[[ 2.]
  [ 2.]
  [ 2.]]

 [[ 2.]
  [ 2.]
  [ 2.]]

 [[ 2.]
  [ 2.]
  [ 2.]]

 [[ 2.]
  [ 2.]
  [ 2.]]]

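With axis = i, the operation runs over the data along axis i, and that axis disappears from the result (or is kept with length 1 when keepdims=True).
With axis = 0, the four [3, 2] blocks are added together element by element, and likewise for the other axes.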
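With an all-ones array every sum looks alike, so the sketch below uses np.arange instead (an illustrative substitution, not the original experiment) and compares each axis sum against the same sum written out by hand. The last lines apply np.linalg.norm along an axis in the same way.

```python
import numpy as np

box = np.arange(24).reshape(4, 3, 2)

# axis=0: add the four (3, 2) blocks together element by element
s0 = np.sum(box, axis=0)
print(np.array_equal(s0, box[0] + box[1] + box[2] + box[3]))   # True

# axis=2: add the two entries inside each innermost box
s2 = np.sum(box, axis=2)
print(np.array_equal(s2, box[:, :, 0] + box[:, :, 1]))         # True

# The summed axis disappears unless keepdims=True keeps it with length 1
s0_keep = np.sum(box, axis=0, keepdims=True)
print(s0.shape, s0_keep.shape)                                 # (3, 2) (1, 3, 2)

# np.linalg.norm follows the same axis convention:
# each innermost box [1, 1] has Euclidean norm sqrt(2)
ones = np.ones([4, 3, 2])
n2 = np.linalg.norm(ones, axis=2)
print(n2.shape)                                                # (4, 3)
print(np.allclose(n2, np.sqrt(2)))                             # True
```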

Storing images

Suppose I have 100 images, each of size 64*32 with 3 channels. The array that stores them then has shape [100,64,32,3].

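To flatten each image into a column vector, use reshape(100,-1).T. Would flattening with reshape(64*32*3,100) be correct instead? It would not. The result has the same shape, but because reshape keeps the elements in their original order, each column would mix pixels from different images instead of holding exactly one image.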
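The difference is easy to verify on a small stand-in. The sketch below assumes 5 images of size 4*2 with 3 channels instead of the 100 images above, and the names good and bad are only illustrative:

```python
import numpy as np

# Smaller illustrative sizes: m images of height h, width w, c channels
m, h, w, c = 5, 4, 2, 3
images = np.arange(m * h * w * c).reshape(m, h, w, c)

good = images.reshape(m, -1).T        # flatten each image, then transpose
bad = images.reshape(h * w * c, m)    # same shape, but elements are regrouped

print(good.shape, bad.shape)          # (24, 5) (24, 5)

# Column i of good is exactly image i, flattened
print(np.array_equal(good[:, 1], images[1].ravel()))   # True
# Column i of bad takes every m-th element of the flat data instead
print(np.array_equal(bad[:, 1], images[1].ravel()))    # False
```

Both results have the same shape, so the mistake raises no error; only the column contents reveal it.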
