Learning Notes TF058: Face Recognition


Face recognition is a biometric technology that identifies people from facial features. A camera captures face images or a video stream, automatically detects and tracks the faces in it, and performs face-related processing: face detection, facial landmark detection, face verification, and so on. In MIT Technology Review's 2017 list of ten breakthrough technologies, Alipay's "Paying with Your Face" made the cut.

Advantages of face recognition: it is non-mandatory (the capture is hard to notice, and face images can be acquired without the subject's active cooperation), contactless (the user does not need to touch the device), and concurrent (multiple faces can be detected, tracked, and recognized at once). Before deep learning, face recognition took two steps: high-dimensional hand-crafted feature extraction followed by dimensionality reduction. Traditional face recognition works on visible-light images. Deep learning plus big data (massive labeled face datasets) is now the mainstream approach: a neural network trains the recognition model on large numbers of sample images, learning features during training instead of relying on hand-picked ones, and can reach 99% recognition accuracy.

The face recognition pipeline.

Face image acquisition and detection. Acquisition: a camera captures face images, static or moving, at different positions and with different expressions; when the user is within the capture device's shooting range, the device automatically searches for and photographs the face. Face detection is a form of object detection: probability statistics over the target class yield the features of the objects to be detected, from which a detection model is built; the model is matched against the input image, and matching regions are output. Face detection is the preprocessing step for face recognition, precisely locating the position and size of the face in the image. Face image patterns are rich in features: histogram features, color features, template features, structural features, and Haar-like features. Detection picks out the useful information and uses these features to find faces. Mainstream detection algorithms include template-matching models and the AdaBoost model; AdaBoost has the best combined speed and accuracy, training slowly but detecting quickly, fast enough for real-time detection on video streams.
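For illustration, a minimal sliding-window detection sketch using OpenCV's pretrained Haar cascade (an AdaBoost-trained detector); this assumes the opencv-python package and is not code from the book, with file names as placeholders:

import cv2

# Load the pretrained frontal-face AdaBoost cascade bundled with OpenCV.
cascade_path = cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
detector = cv2.CascadeClassifier(cascade_path)

img = cv2.imread('test.jpg')                      # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# detectMultiScale slides windows over the image at multiple scales
# and returns the bounding boxes judged to contain faces.
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('detected.jpg', img)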

Face image preprocessing. Based on the detection result, the image is processed to serve feature extraction. Captured face images are subject to all kinds of constraints and random interference, so they usually need preprocessing: scaling, rotation, stretching, illumination compensation, gray-level transformation, histogram equalization, normalization, geometric correction, filtering, sharpening, and so on.
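A minimal sketch of several of these steps with OpenCV (my own illustration; file name and target size are placeholders):

import cv2
import numpy as np

img = cv2.imread('face.jpg')                      # hypothetical face crop
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)      # gray-level transformation
equalized = cv2.equalizeHist(gray)                # histogram equalization
resized = cv2.resize(equalized, (160, 160))       # scale to a fixed input size
normalized = resized.astype(np.float32) / 255.0   # normalize pixels to [0, 1]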

Face image feature extraction. The face image information is digitized: the image is turned into a string of numbers (a feature vector). For example, the positions of the left of an eye, the right of the lips, the nose, and the chin are located; Euclidean distances, curvatures, and angles between these landmarks are extracted as feature components, and the relevant features are concatenated into a long feature vector.

Face image matching and recognition. The extracted feature data is matched against the face feature templates stored in a database, and identity is judged by similarity: a threshold is set, and when the similarity exceeds the threshold the match is output. Verification is a one-to-one (1:1) comparison proving "you are you", used for identity checks in finance and information security. Identification is a one-to-many (1:N) match, "finding you among N people"; on a video stream, recognition completes as soon as a person walks into range, used in security and surveillance.
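A minimal sketch of threshold-based matching over feature vectors (illustrative only; the 128-dimensional embeddings, the random stand-in "database", and the 1.1 threshold are assumptions, in the spirit of the FaceNet embeddings discussed below):

import numpy as np

def is_same_person(emb1, emb2, threshold=1.1):
    # Squared Euclidean distance between two face embeddings; below the
    # threshold, the two faces are judged to belong to the same person.
    return np.sum(np.square(emb1 - emb2)) < threshold

# 1:1 verification
emb_a = np.random.rand(128)            # placeholder embeddings
emb_b = np.random.rand(128)
print(is_same_person(emb_a, emb_b))

# 1:N identification: nearest template wins, if it is close enough.
database = np.random.rand(1000, 128)   # placeholder template database
query = np.random.rand(128)
dists = np.sum(np.square(database - query), axis=1)
best = np.argmin(dists)
print(best if dists[best] < 1.1 else 'no match')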

Categories of face recognition tasks.

Face detection. Detects and localizes faces in an image and returns high-precision face bounding-box coordinates; it is the first step of any face analysis or processing. The "sliding window" approach selects a rectangular region of the image as the window, extracts features from the window to describe that region, and decides from those features whether the window contains a face, traversing every window that needs to be examined.

Facial landmark detection. Localizes and returns the coordinates of key points on the facial features and contours: the face outline and the outlines of the eyes, eyebrows, lips, and nose. Face++ provides up to 106 landmarks. A standard landmark localization technique is cascaded shape regression (CSR). For recognition, the DeepID network structure is used. DeepID resembles an ordinary convolutional neural network, except that the second-to-last layer, the DeepID layer, is fully connected to both convolutional layer 4 and max pooling layer 3; since higher convolutional layers have larger receptive fields, the network takes both local and global features into account. Layers: input 31x39x1; conv1 28x36x20 (4x4x1 kernels); max pool 1 14x18x20 (2x2 filter); conv2 12x16x40 (3x3x20 kernels); max pool 2 6x8x40 (2x2 filter); conv3 4x6x60 (3x3x40 kernels); max pool 3 2x3x60 (2x2 filter); conv4 1x2x80 (2x2x60 kernels); DeepID layer 1x160; fully connected Softmax layer. See "Deep Learning Face Representation from Predicting 10,000 Classes", http://mmlab.ie.cuhk.edu.hk/pdf/YiSun_CVPR14.pdf .
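A rough TF 1.x sketch of this DeepID-style topology (my own illustration derived from the layer sizes above, not code from the book or the paper; the class count is a placeholder):

import tensorflow as tf

def deepid(images, n_classes=10000):
    # images: [batch, 31, 39, 1]
    conv1 = tf.layers.conv2d(images, 20, 4, activation=tf.nn.relu)  # 28x36x20
    pool1 = tf.layers.max_pooling2d(conv1, 2, 2)                    # 14x18x20
    conv2 = tf.layers.conv2d(pool1, 40, 3, activation=tf.nn.relu)   # 12x16x40
    pool2 = tf.layers.max_pooling2d(conv2, 2, 2)                    # 6x8x40
    conv3 = tf.layers.conv2d(pool2, 60, 3, activation=tf.nn.relu)   # 4x6x60
    pool3 = tf.layers.max_pooling2d(conv3, 2, 2)                    # 2x3x60
    conv4 = tf.layers.conv2d(pool3, 80, 2, activation=tf.nn.relu)   # 1x2x80
    # The DeepID layer is fully connected to BOTH pool3 (local features)
    # and conv4 (global features with the largest receptive field).
    flat = tf.concat([tf.layers.flatten(pool3), tf.layers.flatten(conv4)], axis=1)
    deepid_layer = tf.layers.dense(flat, 160, activation=tf.nn.relu)
    logits = tf.layers.dense(deepid_layer, n_classes)  # Softmax over identities
    return deepid_layer, logits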

Face verification. Analyzes how likely two faces are to belong to the same person. Given two face images, it produces a confidence score, which is compared against a threshold to evaluate similarity.

Face attribute detection. Attribute identification and emotion analysis. https://www.betaface.com/wpa/ offers an online face recognition demo; it reports a person's age, whether they have a beard, emotion (happy, neutral, angry, furious), gender, whether they wear glasses, and skin tone.

Face recognition applications: beautification in Meitu-style photo apps, Jiayuan.com's "face compatibility" check between potential partners, "pay with your face" in payments, and face-based authentication in security. Face++ and SenseTime both provide face recognition SDKs.

Face detection and recognition with FaceNet: https://github.com/davidsandberg/facenet .

Florian Schroff, Dmitry Kalenichenko, and James Philbin's paper "FaceNet: A Unified Embedding for Face Recognition and Clustering", https://arxiv.org/abs/1503.03832 ; validation instructions at https://github.com/davidsandberg/facenet/wiki/Validate-on-lfw .

The LFW (Labeled Faces in the Wild) dataset, http://vis-www.cs.umass.edu/lfw/ , compiled by the computer vision lab at the University of Massachusetts Amherst: 13,233 images of 5,749 people; 4,069 people have only one image, and 1,680 have more than one. Each image is 250x250. The face images sit in a folder named after each person.

Data preprocessing. Alignment code: https://github.com/davidsandberg/facenet/blob/master/src/align/align_dataset_mtcnn.py .
The dataset used for validation is aligned to the same image size as the dataset used by the pretrained model.
Set the environment variable:

export PYTHONPATH=[...]/facenet/src

Alignment command (launches four processes in parallel, each limited to a quarter of GPU memory):

for N in {1..4}; do python src/align/align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done

Pretrained model: 20170216-091149.zip, https://drive.google.com/file/d/0B5MzpY9kBtDVZ2RpVDYwWmxoSUk .
Training set: the MS-Celeb-1M dataset, https://www.microsoft.com/en-us/research/project/ms-celeb-1m-challenge-recognizing-one-million-celebrities-real-world/ , Microsoft's face recognition database: the top one million celebrities by popularity, with roughly 100 face images per celebrity collected via search engines. The pretrained model reaches an accuracy of 0.993+-0.004.

Validation: python src/validate_on_lfw.py datasets/lfw/lfw_mtcnnpy_160 models
The benchmark comparison uses facenet/data/pairs.txt, the officially provided, randomly generated list of matching and non-matching person names and image numbers.

Ten-fold cross validation (10-fold cross validation), an accuracy testing method: the dataset is split into 10 parts; in turn, 9 parts serve as the training set and 1 as the test set, and the mean of the 10 results estimates the algorithm's accuracy. Usually ten-fold cross validation is itself repeated several times and the means averaged.
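A minimal sketch of the procedure with scikit-learn's KFold (illustrative only; the iris data and logistic regression stand in for the real model and data):

import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)          # placeholder dataset
kf = KFold(n_splits=10, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in kf.split(X):
    # Train on 9 folds, evaluate on the held-out fold.
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))
# The mean of the 10 fold accuracies estimates the algorithm's accuracy.
print('accuracy: %.3f +- %.3f' % (np.mean(scores), np.std(scores)))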

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np
import argparse
import facenet
import lfw
import os
import sys
import math
from sklearn import metrics
from scipy.optimize import brentq
from scipy import interpolate

def main(args):
    with tf.Graph().as_default():
        with tf.Session() as sess:
            # 1. Read the file containing the pairs used for testing;
            # entries look like ['Abel_Pacheco', '1', '4']
            pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))
            # Get the paths for the corresponding images and the
            # ground-truth is-same flags for each pair
            paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs, args.lfw_file_ext)
            # 2. Load the model
            facenet.load_model(args.model)
            # Get input and output tensors
            images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
            embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
            phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
            #image_size = images_placeholder.get_shape()[1]  # For some reason this doesn't work for frozen graphs
            image_size = args.image_size
            embedding_size = embeddings.get_shape()[1]
            # 3. Run the forward pass to calculate embeddings
            print('Running forward pass on LFW images')
            batch_size = args.lfw_batch_size
            nrof_images = len(paths)
            nrof_batches = int(math.ceil(1.0*nrof_images / batch_size))  # total number of batches
            emb_array = np.zeros((nrof_images, embedding_size))
            for i in range(nrof_batches):
                start_index = i*batch_size
                end_index = min((i+1)*batch_size, nrof_images)
                paths_batch = paths[start_index:end_index]
                images = facenet.load_data(paths_batch, False, False, image_size)
                feed_dict = {images_placeholder: images, phase_train_placeholder: False}
                emb_array[start_index:end_index, :] = sess.run(embeddings, feed_dict=feed_dict)
            # 4. Compute accuracy and validation rate via ten-fold cross validation
            tpr, fpr, accuracy, val, val_std, far = lfw.evaluate(emb_array,
                actual_issame, nrof_folds=args.lfw_nrof_folds)
            print('Accuracy: %1.3f+-%1.3f' % (np.mean(accuracy), np.std(accuracy)))
            print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))
            # Area under the ROC curve
            auc = metrics.auc(fpr, tpr)
            print('Area Under Curve (AUC): %1.3f' % auc)
            # Equal error rate (EER)
            eer = brentq(lambda x: 1. - x - interpolate.interp1d(fpr, tpr)(x), 0., 1.)
            print('Equal Error Rate (EER): %1.3f' % eer)

def parse_arguments(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('lfw_dir', type=str,
        help='Path to the data directory containing aligned LFW face patches.')
    parser.add_argument('--lfw_batch_size', type=int,
        help='Number of images to process in a batch in the LFW test set.', default=100)
    parser.add_argument('model', type=str,
        help='Could be either a directory containing the meta_file and ckpt_file or a model protobuf (.pb) file')
    parser.add_argument('--image_size', type=int,
        help='Image size (height, width) in pixels.', default=160)
    parser.add_argument('--lfw_pairs', type=str,
        help='The file containing the pairs to use for validation.', default='data/pairs.txt')
    parser.add_argument('--lfw_file_ext', type=str,
        help='The file extension for the LFW dataset.', default='png', choices=['jpg', 'png'])
    parser.add_argument('--lfw_nrof_folds', type=int,
        help='Number of folds to use for cross validation. Mainly used for testing.', default=10)
    return parser.parse_args(argv)

if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))

Gender and age recognition. https://github.com/dpressel/rude-carnie .

The Adience dataset, http://www.openu.ac.il/home/hassner/Adience/data.html#agegender : 26,580 images of 2,284 subjects, with ages grouped into 8 ranges (0-2, 4-6, 8-13, 15-20, 25-32, 38-43, 48-53, 60+), with variation in noise, pose, and lighting. aligned holds the cropped and aligned data; faces holds the raw data. fold_0_data.txt through fold_4_data.txt label all of the data; fold_frontal_0_data.txt through fold_frontal_4_data.txt label only faces in near-frontal pose. Record fields: user_id (the subject's Flickr account ID, https://www.flickr.com/ ), original_image (image filename), face_id (person identifier), age, gender, x, y, dx, dy (face bounding box), tilt_ang (tilt angle), fiducial_yaw_angle (yaw angle), fiducial_score (fiducial score).
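A minimal sketch of reading one of these fold files (assuming, as I believe is the case, that the files are tab-separated with a header row naming the fields listed above):

import csv

with open('fold_0_data.txt') as f:
    reader = csv.DictReader(f, delimiter='\t')
    for row in reader:
        # Face bounding box (x, y, dx, dy) plus age/gender labels.
        print(row['original_image'], row['age'], row['gender'],
              row['x'], row['y'], row['dx'], row['dy'])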

Data preprocessing. A script converts the data to TFRecords format: https://github.com/dpressel/rude-carnie/blob/master/preproc.py . The https://github.com/GilLevi/AgeGenderDeepLearning/tree/master/Folds folder already provides the train/test split and labels; using the image lists gender_train.txt and gender_val.txt, the Adience data is processed into TFRecords files. Images are processed into 256x256 JPEG-encoded RGB, and tf.python_io.TFRecordWriter writes them to the TFRecords output file output_file.
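A minimal sketch of the TFRecords-writing step (my own simplification, not preproc.py itself; the feature keys, file paths, and label value are assumptions):

import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

writer = tf.python_io.TFRecordWriter('output_file')  # hypothetical path
with open('face.jpg', 'rb') as f:                    # 256x256 JPEG-encoded RGB
    image_data = f.read()
example = tf.train.Example(features=tf.train.Features(feature={
    'image/encoded': _bytes_feature(image_data),
    'image/class/label': _int64_feature(1),          # e.g. gender class index
}))
writer.write(example.SerializeToString())
writer.close()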

Building the model. The age and gender models come from Gil Levi and Tal Hassner's paper "Age and Gender Classification Using Convolutional Neural Networks", http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.722.9654&rank=1 . Model code: https://github.com/dpressel/rude-carnie/blob/master/model.py , built on tensorflow.contrib.slim.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
import re
from tensorflow.contrib.layers import *
from tensorflow.contrib.slim.python.slim.nets.inception_v3 import inception_v3_base

TOWER_NAME = 'tower'

def select_model(name):
    if name.startswith('inception'):
        print('selected (fine-tuning) inception model')
        return inception_v3
    elif name == 'bn':
        print('selected batch norm model')
        return levi_hassner_bn
    print('selected default model')
    return levi_hassner

def get_checkpoint(checkpoint_path, requested_step=None, basename='checkpoint'):
    if requested_step is not None:
        model_checkpoint_path = '%s/%s-%s' % (checkpoint_path, basename, requested_step)
        if not os.path.exists(model_checkpoint_path):
            print('No checkpoint file found at [%s]' % checkpoint_path)
            exit(-1)
        print(model_checkpoint_path)
        return model_checkpoint_path, requested_step
    ckpt = tf.train.get_checkpoint_state(checkpoint_path)
    if ckpt and ckpt.model_checkpoint_path:
        # Restore checkpoint as described in top of this program
        print(ckpt.model_checkpoint_path)
        global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
        return ckpt.model_checkpoint_path, global_step
    else:
        print('No checkpoint file found at [%s]' % checkpoint_path)
        exit(-1)

def _activation_summary(x):
    tensor_name = re.sub('%s_[0-9]*/' % TOWER_NAME, '', x.op.name)
    tf.summary.histogram(tensor_name + '/activations', x)
    tf.summary.scalar(tensor_name + '/sparsity', tf.nn.zero_fraction(x))

def inception_v3(nlabels, images, pkeep, is_training):
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.00004
    stddev = 0.1
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("InceptionV3", "InceptionV3", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [tf.contrib.slim.conv2d, tf.contrib.slim.fully_connected],
                weights_regularizer=weights_regularizer,
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [tf.contrib.slim.conv2d],
                    weights_initializer=tf.truncated_normal_initializer(stddev=stddev),
                    activation_fn=tf.nn.relu,
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                net, end_points = inception_v3_base(images, scope=scope)
                with tf.variable_scope("logits"):
                    shape = net.get_shape()
                    net = avg_pool2d(net, shape[1:3], padding="VALID", scope="pool")
                    net = tf.nn.dropout(net, pkeep, name='droplast')
                    net = flatten(net, scope="flatten")
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.truncated_normal([2048, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(net, weights), biases, name=scope.name)
        _activation_summary(output)
    return output

def levi_hassner_bn(nlabels, images, pkeep, is_training):
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassnerBN", "LeviHassnerBN", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01),
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                conv1 = convolution2d(images, 96, [7, 7], [4, 4], padding='VALID', biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                conv2 = convolution2d(pool1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                conv3 = convolution2d(pool2, 384, [3, 3], [1, 1], padding='SAME', biases_initializer=tf.constant_initializer(0.), scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                # can use tf.contrib.layer.flatten
                flat = tf.reshape(pool3, [-1, 384*6*6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output

def levi_hassner(nlabels, images, pkeep, is_training):
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassner", "LeviHassner", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01)):
                conv1 = convolution2d(images, 96, [7, 7], [4, 4], padding='VALID', biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                norm1 = tf.nn.local_response_normalization(pool1, 5, alpha=0.0001, beta=0.75, name='norm1')
                conv2 = convolution2d(norm1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                norm2 = tf.nn.local_response_normalization(pool2, 5, alpha=0.0001, beta=0.75, name='norm2')
                conv3 = convolution2d(norm2, 384, [3, 3], [1, 1], biases_initializer=tf.constant_initializer(0.), padding='SAME', scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                flat = tf.reshape(pool3, [-1, 384*6*6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output

Training the model. https://github.com/dpressel/rude-carnie/blob/master/train.py .

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from six.moves import xrange
from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
from model import select_model
import json
import re

LAMBDA = 0.01
MOM = 0.9
tf.app.flags.DEFINE_string('pre_checkpoint_path', '',
                           """If specified, restore this pretrained model """
                           """before beginning any training.""")
tf.app.flags.DEFINE_string('train_dir', '/home/dpressel/dev/work/AgeGenderDeepLearning/Folds/tf/test_fold_is_0',
                           'Training directory')
tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            """Whether to log device placement.""")
tf.app.flags.DEFINE_integer('num_preprocess_threads', 4,
                            'Number of preprocessing threads')
tf.app.flags.DEFINE_string('optim', 'Momentum',
                           'Optimizer')
tf.app.flags.DEFINE_integer('image_size', 227,
                            'Image size')
tf.app.flags.DEFINE_float('eta', 0.01,
                          'Learning rate')
tf.app.flags.DEFINE_float('pdrop', 0.,
                          'Dropout probability')
tf.app.flags.DEFINE_integer('max_steps', 40000,
                            'Number of iterations')
tf.app.flags.DEFINE_integer('steps_per_decay', 10000,
                            'Number of steps before learning rate decay')
tf.app.flags.DEFINE_float('eta_decay_rate', 0.1,
                          'Learning rate decay')
tf.app.flags.DEFINE_integer('epochs', -1,
                            'Number of epochs')
tf.app.flags.DEFINE_integer('batch_size', 128,
                            'Batch size')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint',
                           'Checkpoint name')
tf.app.flags.DEFINE_string('model_type', 'default',
                           'Type of convnet')
tf.app.flags.DEFINE_string('pre_model',
                           '',  # './inception_v3.ckpt',
                           'checkpoint file')
FLAGS = tf.app.flags.FLAGS

# Every 5k steps cut learning rate in half
def exponential_staircase_decay(at_step=10000, decay_rate=0.1):
    print('decay [%f] every [%d] steps' % (decay_rate, at_step))
    def _decay(lr, global_step):
        return tf.train.exponential_decay(lr, global_step,
                                          at_step, decay_rate, staircase=True)
    return _decay

def optimizer(optim, eta, loss_fn, at_step, decay_rate):
    global_step = tf.Variable(0, trainable=False)
    optz = optim
    if optim == 'Adadelta':
        optz = lambda lr: tf.train.AdadeltaOptimizer(lr, 0.95, 1e-6)
        lr_decay_fn = None
    elif optim == 'Momentum':
        optz = lambda lr: tf.train.MomentumOptimizer(lr, MOM)
        lr_decay_fn = exponential_staircase_decay(at_step, decay_rate)
    return tf.contrib.layers.optimize_loss(loss_fn, global_step, eta, optz, clip_gradients=4., learning_rate_decay_fn=lr_decay_fn)

def loss(logits, labels):
    labels = tf.cast(labels, tf.int32)
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels, name='cross_entropy_per_example')
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    tf.add_to_collection('losses', cross_entropy_mean)
    losses = tf.get_collection('losses')
    regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    total_loss = cross_entropy_mean + LAMBDA * sum(regularization_losses)
    tf.summary.scalar('tl (raw)', total_loss)
    #total_loss = tf.add_n(losses + regularization_losses, name='total_loss')
    loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
    loss_averages_op = loss_averages.apply(losses + [total_loss])
    for l in losses + [total_loss]:
        tf.summary.scalar(l.op.name + ' (raw)', l)
        tf.summary.scalar(l.op.name, loss_averages.average(l))
    with tf.control_dependencies([loss_averages_op]):
        total_loss = tf.identity(total_loss)
    return total_loss

def main(argv=None):
    with tf.Graph().as_default():
        model_fn = select_model(FLAGS.model_type)
        # Open the metadata file (md.json, generated during data
        # preprocessing) and figure out nlabels and the size of an epoch
        input_file = os.path.join(FLAGS.train_dir, 'md.json')
        print(input_file)
        with open(input_file, 'r') as f:
            md = json.load(f)
        images, labels, _ = distorted_inputs(FLAGS.train_dir, FLAGS.batch_size, FLAGS.image_size, FLAGS.num_preprocess_threads)
        logits = model_fn(md['nlabels'], images, 1-FLAGS.pdrop, True)
        total_loss = loss(logits, labels)
        train_op = optimizer(FLAGS.optim, FLAGS.eta, total_loss, FLAGS.steps_per_decay, FLAGS.eta_decay_rate)
        saver = tf.train.Saver(tf.global_variables())
        summary_op = tf.summary.merge_all()
        sess = tf.Session(config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement))
        tf.global_variables_initializer().run(session=sess)
        # This is total hackland, it only works to fine-tune iv3:
        # a pretrained Inception V3 checkpoint can be supplied for fine-tuning
        if FLAGS.pre_model:
            inception_variables = tf.get_collection(
                tf.GraphKeys.VARIABLES, scope="InceptionV3")
            restorer = tf.train.Saver(inception_variables)
            restorer.restore(sess, FLAGS.pre_model)
        if FLAGS.pre_checkpoint_path:
            if tf.gfile.Exists(FLAGS.pre_checkpoint_path) is True:
                print('Trying to restore checkpoint from %s' % FLAGS.pre_checkpoint_path)
                restorer = tf.train.Saver()
                tf.train.latest_checkpoint(FLAGS.pre_checkpoint_path)
                print('%s: Pre-trained model restored from %s' %
                      (datetime.now(), FLAGS.pre_checkpoint_path))
        # Checkpoint files are stored under a run-(pid) directory
        run_dir = '%s/run-%d' % (FLAGS.train_dir, os.getpid())
        checkpoint_path = '%s/%s' % (run_dir, FLAGS.checkpoint)
        if tf.gfile.Exists(run_dir) is False:
            print('Creating %s' % run_dir)
            tf.gfile.MakeDirs(run_dir)
        tf.train.write_graph(sess.graph_def, run_dir, 'model.pb', as_text=True)
        tf.train.start_queue_runners(sess=sess)
        summary_writer = tf.summary.FileWriter(run_dir, sess.graph)
        steps_per_train_epoch = int(md['train_counts'] / FLAGS.batch_size)
        num_steps = FLAGS.max_steps if FLAGS.epochs < 1 else FLAGS.epochs * steps_per_train_epoch
        print('Requested number of steps [%d]' % num_steps)
        for step in xrange(num_steps):
            start_time = time.time()
            _, loss_value = sess.run([train_op, total_loss])
            duration = time.time() - start_time
            assert not np.isnan(loss_value), 'Model diverged with loss = NaN'
            # Print progress every 10 steps; write a summary every 100
            # steps and a checkpoint every 1000 steps
            if step % 10 == 0:
                num_examples_per_step = FLAGS.batch_size
                examples_per_sec = num_examples_per_step / duration
                sec_per_batch = float(duration)
                format_str = ('%s: step %d, loss = %.3f (%.1f examples/sec; %.3f sec/batch)')
                print(format_str % (datetime.now(), step, loss_value,
                                    examples_per_sec, sec_per_batch))
            # Loss only actually evaluated every 100 steps?
            if step % 100 == 0:
                summary_str = sess.run(summary_op)
                summary_writer.add_summary(summary_str, step)
            if step % 1000 == 0 or (step + 1) == num_steps:
                saver.save(sess, checkpoint_path, global_step=step)

if __name__ == '__main__':
    tf.app.run()
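With the flags defined above, a training run might be launched along these lines (the fold directory path is a placeholder):

python train.py --train_dir /path/to/Folds/tf/test_fold_is_0 --model_type default --max_steps 40000 --batch_size 128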

Running the model for inference. https://github.com/dpressel/rude-carnie/blob/master/guess.py .

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import math
import time
from data import inputs
import numpy as np
import tensorflow as tf
from model import select_model, get_checkpoint
from utils import *
import os
import json
import csv

RESIZE_FINAL = 227
GENDER_LIST = ['M', 'F']
AGE_LIST = ['(0, 2)', '(4, 6)', '(8, 12)', '(15, 20)', '(25, 32)', '(38, 43)', '(48, 53)', '(60, 100)']
MAX_BATCH_SZ = 128

tf.app.flags.DEFINE_string('model_dir', '',
                           'Model directory (where training data lives)')
tf.app.flags.DEFINE_string('class_type', 'age',
                           'Classification type (age|gender)')
tf.app.flags.DEFINE_string('device_id', '/cpu:0',
                           'What processing unit to execute inference on')
tf.app.flags.DEFINE_string('filename', '',
                           'File (Image) or File list (Text/No header TSV) to process')
tf.app.flags.DEFINE_string('target', '',
                           'CSV file containing the filename processed along with best guess and score')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint',
                           'Checkpoint basename')
tf.app.flags.DEFINE_string('model_type', 'default',
                           'Type of convnet')
tf.app.flags.DEFINE_string('requested_step', '', 'Within the model directory, a requested step to restore e.g., 9000')
tf.app.flags.DEFINE_boolean('single_look', False, 'single look at the image or multiple crops')
tf.app.flags.DEFINE_string('face_detection_model', '', 'Do frontal face detection with model specified')
tf.app.flags.DEFINE_string('face_detection_type', 'cascade', 'Face detection model type (yolo_tiny|cascade)')
FLAGS = tf.app.flags.FLAGS

def one_of(fname, types):
    return any([fname.endswith('.' + ty) for ty in types])

def resolve_file(fname):
    if os.path.exists(fname): return fname
    for suffix in ('.jpg', '.png', '.JPG', '.PNG', '.jpeg'):
        cand = fname + suffix
        if os.path.exists(cand):
            return cand
    return None

def classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer):
    try:
        num_batches = math.ceil(len(image_files) / MAX_BATCH_SZ)
        pg = ProgressBar(num_batches)
        for j in range(num_batches):
            start_offset = j * MAX_BATCH_SZ
            end_offset = min((j + 1) * MAX_BATCH_SZ, len(image_files))
            batch_image_files = image_files[start_offset:end_offset]
            print(start_offset, end_offset, len(batch_image_files))
            image_batch = make_multi_image_batch(batch_image_files, coder)
            batch_results = sess.run(softmax_output, feed_dict={images: image_batch.eval()})
            batch_sz = batch_results.shape[0]
            for i in range(batch_sz):
                output_i = batch_results[i]
                best_i = np.argmax(output_i)
                best_choice = (label_list[best_i], output_i[best_i])
                print('Guess @ 1 %s, prob = %.2f' % best_choice)
                if writer is not None:
                    f = batch_image_files[i]
                    writer.writerow((f, best_choice[0], '%.2f' % best_choice[1]))
            pg.update()
        pg.done()
    except Exception as e:
        print(e)
        print('Failed to run all images')

def classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer):
    try:
        print('Running file %s' % image_file)
        image_batch = make_multi_crop_batch(image_file, coder)
        batch_results = sess.run(softmax_output, feed_dict={images: image_batch.eval()})
        output = batch_results[0]
        batch_sz = batch_results.shape[0]
        for i in range(1, batch_sz):
            output = output + batch_results[i]
        output /= batch_sz
        best = np.argmax(output)  # most likely class
        best_choice = (label_list[best], output[best])
        print('Guess @ 1 %s, prob = %.2f' % best_choice)
        nlabels = len(label_list)
        if nlabels > 2:
            output[best] = 0
            second_best = np.argmax(output)
            print('Guess @ 2 %s, prob = %.2f' % (label_list[second_best], output[second_best]))
        if writer is not None:
            writer.writerow((image_file, best_choice[0], '%.2f' % best_choice[1]))
    except Exception as e:
        print(e)
        print('Failed to run image %s ' % image_file)

def list_images(srcfile):
    with open(srcfile, 'r') as csvfile:
        delim = ',' if srcfile.endswith('.csv') else '\t'
        reader = csv.reader(csvfile, delimiter=delim)
        if srcfile.endswith('.csv') or srcfile.endswith('.tsv'):
            print('skipping header')
            _ = next(reader)
        return [row[0] for row in reader]

def main(argv=None):  # pylint: disable=unused-argument
    files = []
    if FLAGS.face_detection_model:
        print('Using face detector (%s) %s' % (FLAGS.face_detection_type, FLAGS.face_detection_model))
        face_detect = face_detection_model(FLAGS.face_detection_type, FLAGS.face_detection_model)
        face_files, rectangles = face_detect.run(FLAGS.filename)
        print(face_files)
        files += face_files
    config = tf.ConfigProto(allow_soft_placement=True)
    with tf.Session(config=config) as sess:
        label_list = AGE_LIST if FLAGS.class_type == 'age' else GENDER_LIST
        nlabels = len(label_list)
        print('Executing on %s' % FLAGS.device_id)
        model_fn = select_model(FLAGS.model_type)
        with tf.device(FLAGS.device_id):
            images = tf.placeholder(tf.float32, [None, RESIZE_FINAL, RESIZE_FINAL, 3])
            logits = model_fn(nlabels, images, 1, False)
            init = tf.global_variables_initializer()
            requested_step = FLAGS.requested_step if FLAGS.requested_step else None
            checkpoint_path = '%s' % (FLAGS.model_dir)
            model_checkpoint_path, global_step = get_checkpoint(checkpoint_path, requested_step, FLAGS.checkpoint)
            saver = tf.train.Saver()
            saver.restore(sess, model_checkpoint_path)
            softmax_output = tf.nn.softmax(logits)
            coder = ImageCoder()
            # Support a batch mode if no face detection model
            if len(files) == 0:
                if (os.path.isdir(FLAGS.filename)):
                    for relpath in os.listdir(FLAGS.filename):
                        abspath = os.path.join(FLAGS.filename, relpath)
                        if os.path.isfile(abspath) and any([abspath.endswith('.' + ty) for ty in ('jpg', 'png', 'JPG', 'PNG', 'jpeg')]):
                            print(abspath)
                            files.append(abspath)
                else:
                    files.append(FLAGS.filename)
                    # If it happens to be a list file, read the list and clobber the files
                    if any([FLAGS.filename.endswith('.' + ty) for ty in ('csv', 'tsv', 'txt')]):
                        files = list_images(FLAGS.filename)
            writer = None
            output = None
            if FLAGS.target:
                print('Creating output file %s' % FLAGS.target)
                output = open(FLAGS.target, 'w')
                writer = csv.writer(output)
                writer.writerow(('file', 'label', 'score'))
            image_files = list(filter(lambda x: x is not None, [resolve_file(f) for f in files]))
            print(image_files)
            if FLAGS.single_look:
                classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer)
            else:
                for image_file in image_files:
                    classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer)
            if output is not None:
                output.close()

if __name__ == '__main__':
    tf.app.run()
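An inference run might then look like this, using the flags defined above (the checkpoint directory and image path are placeholders):

python guess.py --model_type default --model_dir /path/to/run-12345 --class_type age --filename test_image.jpg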

Microsoft's how-old.net, http://how-old.net/ , recognizes gender and age from face photos; it can also search for images by query.

Reference:
《TensorFlow技术解析与实战》 (TensorFlow: Technical Analysis and Practice)

Recommendations for machine learning positions in Shanghai are welcome. My WeChat: qingxingfengzi.
