Dimensionality Reduction Notes: Non-negative Matrix Factorization (NMF) and Feature Extraction on Face Data

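Dimensionality reduction: non-negative matrix factorization (NMF)

I. Principle

Non-negative Matrix Factorization (NMF) is a matrix factorization method in which every element of every matrix involved is constrained to be non-negative.

NMF is widely used in areas such as image analysis, text mining, and speech processing.

Basic idea: given a non-negative matrix V, NMF finds a non-negative matrix W and a non-negative matrix H whose product approximates V.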

$$V_{F \times N} \approx W_{F \times K} \, H_{K \times N}$$

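W: the basis matrix (the basis images in the face example), i.e. the features extracted from the original matrix V.

H: the coefficient matrix. Each column holds the weights that combine the basis vectors in W to approximate the corresponding column of V.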
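
To make the shapes concrete, here is a minimal NumPy sketch. The sizes F, N and K and the random matrices are illustrative assumptions, not data from these notes. V is built directly from a known W and H, so the factorization is exact here; real data would only be approximated.

```python
import numpy as np

rng = np.random.default_rng(0)
F, N, K = 6, 5, 2  # illustrative sizes: F features, N samples, K components

W = rng.random((F, K))  # basis matrix: K non-negative basis vectors of length F
H = rng.random((K, N))  # coefficient matrix: K weights for each of the N columns
V = W @ H               # exact product, because V is constructed from W and H

print(V.shape)                            # (6, 5)
print(bool((V >= 0).all()))               # True
# Column 0 of V is the combination of W's columns weighted by column 0 of H.
print(np.allclose(V[:, 0], W @ H[:, 0]))  # True
```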

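Optimization objective: minimize the difference between the product of W and H and the original matrix V. Two loss functions are commonly used:

1. A loss function based on Euclidean distance

2. A loss function based on KL divergence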
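
For reference, the two losses are commonly written as below. These are the standard textbook forms rather than formulas taken from these notes. Some sources add a constant factor such as 1/2 to the Euclidean loss, and the KL form shown is the generalized KL divergence used for NMF. Both are minimized subject to W ≥ 0 and H ≥ 0.

```latex
% Euclidean (squared Frobenius) loss
D_{\mathrm{EU}}(V \,\|\, WH) = \sum_{i,j} \bigl( V_{ij} - (WH)_{ij} \bigr)^2

% Generalized KL divergence loss
D_{\mathrm{KL}}(V \,\|\, WH) = \sum_{i,j} \left( V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \right)
```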

For details on how the factorization is solved, see: http://blog.csdn.net/acdreamers/article/details/44663421/
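
As a rough illustration of one common solver, the sketch below implements the multiplicative update rules of Lee and Seung for the Euclidean loss. This is an assumption about a typical approach, not necessarily the method in the linked article or the default solver in scikit-learn. The function name nmf_multiplicative, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=200, eps=1e-10, seed=0):
    """Approximate non-negative V (F x N) as W @ H using multiplicative updates."""
    rng = np.random.default_rng(seed)
    F, N = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((F, k)) + eps
    H = rng.random((k, N)) + eps
    for _ in range(n_iter):
        # Each factor is multiplied by a non-negative ratio, so W and H
        # stay non-negative; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic non-negative matrix of rank 3 (illustrative data only).
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 15))

W, H = nmf_multiplicative(V, k=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape)  # (20, 3) (3, 15)
print(rel_err)           # relative reconstruction error
```

The updates only ever rescale entries, which is why a non-negative initialization is enough to satisfy the constraint without any explicit projection step.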

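Usage:

In scikit-learn, the NMF algorithm is provided by sklearn.decomposition.NMF. Main parameters:

n_components: the inner dimension k of the factorization, i.e. the number of components to extract;
init: the initialization method for the W and H matrices. These notes give "nndsvdar" as the default, but the default has changed across scikit-learn releases, so check the documentation for your installed version.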
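
A minimal usage sketch on synthetic data follows. The array X and the parameter values are illustrative assumptions. Note that scikit-learn stores samples as rows, so the data matrix is the transpose of V in the notation above: fit_transform returns the per-sample coefficients, and components_ holds the basis vectors (the extracted features).

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((30, 12))  # synthetic non-negative data: 30 samples, 12 features

model = NMF(n_components=4, init="nndsvda", max_iter=500, random_state=0)
coeffs = model.fit_transform(X)  # shape (n_samples, n_components)
basis = model.components_        # shape (n_components, n_features)

print(coeffs.shape, basis.shape)  # (30, 4) (4, 12)
print(model.reconstruction_err_)  # error between X and coeffs @ basis
```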

II. Example

Feature extraction on the Olivetti faces dataset (400 images of 64×64 pixels):

from numpy.random import RandomState
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces
from sklearn import decomposition

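# Grid layout used to display the images
n_row, n_col = 2, 3
# Number of features (components) to extract: 6
n_components = n_row * n_col
# Size of each face image in pixels
image_shape = (64, 64)
# Load the faces data and shuffle the order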
dataset = fetch_olivetti_faces(shuffle=True, random_state=RandomState(0))
faces = dataset.data

###############################################################################
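def plot_gallery(title, images, n_col=n_col, n_row=n_row):
    """Plot flattened images in an n_row x n_col grid of subplots."""
    plt.figure(figsize=(1. * n_col, 1.26 * n_row))
    plt.suptitle(title, size=16)

    for i, comp in enumerate(images):
        plt.subplot(n_row, n_col, i + 1)
        # Symmetric value range around zero for the gray colormap
        vmax = max(comp.max(), -comp.min())

        # Reshape the flat vector to 64x64 and show it as a grayscale image
        plt.imshow(comp.reshape(image_shape), cmap=plt.cm.gray,
                   interpolation='nearest', vmin=-vmax, vmax=vmax)
        plt.xticks(())  # remove the axis tick labels of the subplot
        plt.yticks(())
    # Adjust subplot positions and spacing
    plt.subplots_adjust(left=0.01, bottom=0.05, right=0.99, top=0.94,
                        wspace=0.04, hspace=0.)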


# The faces are plotted as loaded (they are not centered in this script)
plot_gallery("First Olivetti faces", faces[:n_components])
###############################################################################

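# (name, estimator) pairs to compare; both extract n_components features
estimators = [
    ('Eigenfaces - PCA using randomized SVD',
     decomposition.PCA(n_components=n_components, svd_solver='randomized',
                       whiten=True)),

    ('Non-negative components - NMF',
     decomposition.NMF(n_components=n_components, init='nndsvda', tol=5e-3))
]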

###############################################################################

for name, estimator in estimators:
    print("Extracting the top %d %s..." % (n_components, name))
    print(faces.shape)  # (n_samples, n_features)
    # Fit the model; the extracted features are stored in components_
    estimator.fit(faces)
    components_ = estimator.components_
    plot_gallery(name, components_[:n_components])

plt.show()

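Output:

Extracting the top 6 Eigenfaces - PCA using randomized SVD...
(400, 4096)
Extracting the top 6 Non-negative components - NMF...
(400, 4096)

(The three figures produced by plt.show(), showing the first six faces, the six PCA eigenfaces, and the six NMF components, are not reproduced here.)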
