Detecting Pornographic Videos with Deep Learning

Source: Internet | Published by: Python Max iteritem | Editor: 程序博客网 | Time: 2024/05/17 14:25

http://blog.csdn.net/willduan1/article/details/54577351

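Pornographic videos are an unavoidable fact of life, so identifying them is a problem we need to solve. This post builds on the NSFW project (see the references below) and wraps it with some improvements so that it can check whether a video is pornographic. The project is based on Caffe, and the network architecture is a ResNet (see the paper in the references).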

To check a video, I use FFmpeg to extract frames from it, one image every 20 seconds. For a more precise check, the interval can be shortened later.

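Detection uses three levels: a score < 0.2 means the frame is very safe, a score > 0.8 means it is very likely pornographic, and anything in between is uncertain.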

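The program finally prints:

The total number of frames extracted from the video and checked

score < 0.2: safe; the count and its share of all frames

0.2 <= score <= 0.8: medium, between safe and dangerous; the count and its share

score > 0.8: dangerous, very likely pornographic; the count and its share

Based on the share of dangerous frames, we can then decide whether the video is pornographic.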
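The three levels and the final decision can be sketched in a few lines of plain Python. This is only an illustration of the thresholds described above: the function names are hypothetical, and the 50% cut-off in `looks_pornographic` is an assumption, since the post does not fix a specific value for the dangerous share.

```python
def classify(score):
    """Map an NSFW score in [0, 1] to one of the three levels."""
    if score < 0.2:
        return "safe"
    if score <= 0.8:
        return "medium"
    return "dangerous"


def summarize(scores):
    """Count frames per level and compute each level's share in percent."""
    counts = {"safe": 0, "medium": 0, "dangerous": 0}
    for score in scores:
        counts[classify(score)] += 1
    total = len(scores)
    shares = {}
    for level, count in counts.items():
        # Guard against an empty frame list to avoid dividing by zero.
        shares[level] = round(100.0 * count / total, 3) if total else 0.0
    return counts, shares


def looks_pornographic(scores, dangerous_share_threshold=50.0):
    """Hypothetical rule: flag the video when the dangerous share (in
    percent) reaches the threshold. The 50.0 default is an assumption."""
    _, shares = summarize(scores)
    return shares["dangerous"] >= dangerous_share_threshold
```

In practice the threshold would have to be tuned on videos with known labels; a film with a few borderline scenes should not be flagged on the strength of a handful of frames.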

Enough talk; let's get hands-on.

First, install ffmpeg. I use Ubuntu 14, and installing it took some effort; fortunately I eventually found a working source (a PPA) and the installation succeeded.

[plain] view plain copy
sudo add-apt-repository ppa:mc3man/trusty-media  
sudo apt-get update  
sudo apt-get install ffmpeg gstreamer0.10-ffmpeg  

Next, install Caffe. If you have never installed it, that is fine: just use Docker.

Install Docker:

I won't go into detail here.

Build the Caffe container image (CPU version):

[plain] view plain copy
docker build -t caffe:cpu https://raw.githubusercontent.com/BVLC/caffe/master/docker/cpu/Dockerfile  

Check the Caffe version:

[plain] view plain copy
docker run caffe:cpu caffe --version  

Download open_nsfw:

[plain] view plain copy
git clone https://github.com/yahoo/open_nsfw  

Enter the working directory:

[plain] view plain copy
cd open_nsfw  

Note that when starting Docker we need to mount the working directory into the container, for example:

[plain] view plain copy
docker run -ti --volume=/home/duan/open_nsfw/:/workspace caffe:cpu bash

There is no need to run this step yet.
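Since the mount path differs from machine to machine, it can help to build the command from a variable instead of hard-coding it. The following is a minimal sketch; the helper name `docker_run_command` and its parameters are made up for illustration, and only the `docker run -ti --volume=...:/workspace caffe:cpu bash` shape comes from the command above.

```python
import os


def docker_run_command(workdir, image="caffe:cpu", inner_command=None):
    """Build the argument list for `docker run`, mounting workdir at /workspace."""
    cmd = [
        "docker", "run", "-ti",
        # Bind mounts need an absolute host path.
        "--volume={}:/workspace".format(os.path.abspath(workdir)),
        image, "bash",
    ]
    if inner_command is not None:
        # Run a single command in the container instead of an interactive shell.
        cmd += ["-c", inner_command]
    return cmd


if __name__ == "__main__":
    print(" ".join(docker_run_command("/home/duan/open_nsfw/")))
```

The returned list can be passed directly to `subprocess.call`, which avoids shell quoting problems when the path contains spaces.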

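Next comes the frame-extraction code (the launcher script, run later as launch_video_detact.py). It extracts one frame every 20 seconds and stores the images in the open_nsfw/picture folder: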

[python] view plain copy
# -*- encoding:utf-8 -*-  
__date__ = "17/1/16"  
__author__  = "duan"  
  
  
import os
import shutil
import subprocess
from argparse import ArgumentParser


def video_to_frames(video_path, frames_path, step_size=20):
    # Start from an empty output folder so stale frames are never scored.
    if os.path.exists(frames_path):
        shutil.rmtree(frames_path)
    os.makedirs(frames_path)

    output_file = os.path.join(frames_path, "out%05d.jpg")
    # Extract one image every step_size seconds (fps=1/step_size).
    # For one frame every 10 seconds, pass step_size=10.
    cmd = ["ffmpeg", "-i", video_path, "-f", "image2",
           "-vf", "fps=fps=1/{}".format(step_size), output_file]
    print(" ".join(cmd))
    # Passing a list (no shell) keeps paths with spaces intact.
    subprocess.call(cmd)
  
  
if __name__ == '__main__':  
  
    parser = ArgumentParser()  
  
    parser.add_argument('--content',
            dest='content', help='file name of the video to check',
            metavar='CONTENT', required=True)
    parser.add_argument('--step', type=int, default=20,
            dest='step', help='seconds between extracted frames',
            metavar='STEP')
  
    options = parser.parse_args()  
  
    video_name = options.content # the video name you want to detect  
  
    step_size = options.step # the video step you want to use  
  
    #video_name = "1994.mp4" # the video name you want to detect  
  
    video_path = "./"  # the video path, i put the video at current folder  
    frames_path = "picture"  
      
    video_to_frames(video_path + video_name, frames_path, step_size)  
  
    # start the docker and set the workspace as "/home/duan/open_nsfw"  
    # set as your own path  
    #launch the docker  
    os.system("docker run -ti --volume={}:/workspace caffe:cpu bash -c \"python video_detect.py\"".format("/home/duan/open_nsfw/"))    

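Next, create a new file, video_detect.py, to score the extracted images. Once the program above has launched Docker, it runs the following inside the container: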

[plain] view plain copy
python video_detect.py  

[python] view plain copy
# -*- encoding:utf-8 -*-  
__date__ = "17/1/16"  
__author__  = "duan"  
  
import os

import video_nsfw

frames_path = "picture"

results = []
safe = 0
medium = 0
dangerous = 0

for name in sorted(os.listdir(frames_path)):
    path = os.path.join(frames_path, name)
    # Check the full path; the bare file name is relative to the wrong folder.
    if os.path.isdir(path):
        continue
    res = video_nsfw.detact("nsfw_model/deploy.prototxt",
                            "nsfw_model/resnet_50_1by2_nsfw.caffemodel", path)
    if res < 0.2:
        safe += 1
    elif res <= 0.8:  # 0.2 <= score <= 0.8 counts as medium
        medium += 1
    else:
        dangerous += 1
    results.append(res)

total = len(results)
print(total)
if total:  # avoid dividing by zero when no frame was extracted
    # 100.0 forces float division under Python 2 as well.
    print("safe count: {}, proportion: {}%".format(safe, round(100.0 * safe / total, 3)))
    print("medium count: {}, proportion: {}%".format(medium, round(100.0 * medium / total, 3)))
    print("dangerous count: {}, proportion: {}%".format(dangerous, round(100.0 * dangerous / total, 3)))

The core detection code lives in video_nsfw.py:

[python] view plain copy
# -*- encoding:utf-8 -*-  
__date__ = "17/1/16"  
__author__  = "duan"  
  
  
import os

import numpy as np
from PIL import Image
from StringIO import StringIO  # Python 2 module, as used by the Caffe image
import caffe
  
  
  
def resize_image(data, sz=(256, 256)):  
    """ 
    Resize image. Please use this resize logic for best results instead of the  
    caffe, since it was used to generate training dataset  
    """  
    img_data = str(data)  
    im = Image.open(StringIO(img_data))  
    if im.mode != "RGB":  
        im = im.convert('RGB')  
    imr = im.resize(sz, resample=Image.BILINEAR)  
    fh_im = StringIO()  
    imr.save(fh_im, format='JPEG')  
    fh_im.seek(0)  
    return bytearray(fh_im.read())  
  
def caffe_preprocess_and_compute(pimg, caffe_transformer=None, caffe_net=None,  
    output_layers=None):  
    """ 
    Run a Caffe network on an input image after preprocessing it to prepare 
    it for Caffe. 
    """  
    if caffe_net is not None:  
  
        # Grab the default output names if none were requested specifically.  
        if output_layers is None:  
            output_layers = caffe_net.outputs  
  
        img_data_rs = resize_image(pimg, sz=(256, 256))  
        image = caffe.io.load_image(StringIO(img_data_rs))  
  
        H, W, _ = image.shape  
        _, _, h, w = caffe_net.blobs['data'].data.shape  
        h_off = max((H - h) / 2, 0)  
        w_off = max((W - w) / 2, 0)  
        crop = image[h_off:h_off + h, w_off:w_off + w, :]  
        transformed_image = caffe_transformer.preprocess('data', crop)  
        transformed_image.shape = (1,) + transformed_image.shape  
  
        input_name = caffe_net.inputs[0]  
        all_outputs = caffe_net.forward_all(blobs=output_layers,  
                    **{input_name: transformed_image})  
  
        outputs = all_outputs[output_layers[0]][0].astype(float)  
        return outputs  
    else:  
        return []  
  
  
def detact(model_def, pretrained_model, input_file):  
      
    # Read the JPEG as raw bytes; the context manager closes the file.
    with open(input_file, "rb") as f:
        image_data = f.read()
  
    # Pre-load caffe model.  
    nsfw_net = caffe.Net(model_def,  # pylint: disable=invalid-name  
        pretrained_model, caffe.TEST)  
  
    # Load transformer  
    # Note that the parameters are hard-coded for best results  
    caffe_transformer = caffe.io.Transformer({'data': nsfw_net.blobs['data'].data.shape})  
    caffe_transformer.set_transpose('data', (2, 0, 1))  # move image channels to outermost  
    caffe_transformer.set_mean('data', np.array([104, 117, 123]))  # subtract the dataset-mean value in each channel  
    caffe_transformer.set_raw_scale('data', 255)  # rescale from [0, 1] to [0, 255]  
    caffe_transformer.set_channel_swap('data', (2, 1, 0))  # swap channels from RGB to BGR  
  
    # Classify.  
    scores = caffe_preprocess_and_compute(image_data, caffe_transformer=caffe_transformer, caffe_net=nsfw_net, output_layers=['prob'])  
  
    # Scores is the array containing SFW / NSFW image probabilities  
    # scores[1] indicates the NSFW probability  
    print("NSFW score: {}".format(scores[1]))
  
    return scores[1]  
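
The center-crop step in `caffe_preprocess_and_compute` is easy to check in isolation. The sketch below recomputes the crop window with plain integers. The helper name `center_crop_box` is hypothetical, and the 224 x 224 input size in the example is an assumption: the real code reads the size from the network's `data` blob. Note the use of `//`: the original `/` only yields integer offsets under Python 2.

```python
def center_crop_box(height, width, crop_h, crop_w):
    """Return (top, left, bottom, right) of a centered crop window."""
    # Integer division keeps the offsets usable as slice indices
    # in both Python 2 and Python 3.
    top = max((height - crop_h) // 2, 0)
    left = max((width - crop_w) // 2, 0)
    return top, left, top + crop_h, left + crop_w


if __name__ == "__main__":
    # A 256 x 256 resized frame cropped to an assumed 224 x 224 input.
    print(center_crop_box(256, 256, 224, 224))
```

With those numbers, 16 pixels are trimmed from each side, which corresponds to the `image[h_off:h_off + h, w_off:w_off + w, :]` slice in the code above.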

Finally, run:

[plain] view plain copy
python launch_video_detact.py --content 1995.mp4 --step 20  

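The --step option is the number of seconds between extracted frames. It can be omitted and defaults to 20. The final result naturally depends on the value chosen.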
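
To get a feel for how the step affects the workload, the number of frames to score can be estimated from the video length. This is a rough sketch with a hypothetical helper name; the exact count ffmpeg produces may differ by a frame or two depending on how it rounds at the start and end of the video.

```python
def expected_frame_count(duration_seconds, step_seconds=20):
    """Approximate number of frames when sampling one frame per step."""
    if step_seconds <= 0:
        raise ValueError("step_seconds must be positive")
    return int(duration_seconds // step_seconds)


if __name__ == "__main__":
    two_hours = 2 * 60 * 60
    for step in (20, 10, 5):
        print(step, expected_frame_count(two_hours, step))
```

Halving the step doubles the number of frames, and with it the time spent in the classifier, so a smaller step trades runtime for finer coverage.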

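To finish, I ran the check on The Shawshank Redemption, with the following result:

Extracting one frame every 20 seconds gave 429 frames in total, and from the result we can say with 93.7% probability that The Shawshank Redemption is very safe.

Of course, you can also test pornographic videos yourself, hehe.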

When reposting, please credit the source: http://blog.csdn.net/willduan1/article/details/54577351

------------------------EOF------------------------------

References:

https://github.com/yahoo/open_nsfw/blob/master/README.md

He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep residual learning for image recognition.” arXiv preprint arXiv:1512.03385 (2015).

Simonyan, Karen, and Andrew Zisserman. “Very deep convolutional networks for large-scale image recognition.” arXiv preprint arXiv:1409.1556 (2014).

Iandola, Forrest N., Matthew W. Moskewicz, Khalid Ashraf, Song Han, William J. Dally, and Kurt Keutzer. “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and 1MB model size.” arXiv preprint arXiv:1602.07360 (2016).

He, Kaiming, and Jian Sun. “Convolutional neural networks at constrained time cost.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5353-5360. 2015.

Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. “Going deeper with convolutions.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9. 2015.

Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “ImageNet classification with deep convolutional neural networks.” In Advances in Neural Information Processing Systems, pp. 1097-1105. 2012.
