Studying the open-source project domain-transfer-network-master


date: 2017.5.6

This is my first time writing one of these in Markdown. The screen is small and editing feels a bit awkward; if it doesn't work out, I'll switch back to the old format later.

Source domain

A question about conv2d

    net = slim.conv2d(net, 128, [3, 3], scope='conv2')     # (batch_size, 8, 8, 128)
    net = slim.batch_norm(net, scope='bn2')
    net = slim.conv2d(net, 256, [3, 3], scope='conv3')     # (batch_size, 4, 4, 256)

Shouldn't an 8x8 feature map convolved with a 3x3 kernel come out as 6x6?
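A hedged answer to my own question: 6x6 would be correct for an unpadded ('VALID') convolution, but slim.conv2d pads with 'SAME' by default, so at stride 1 the spatial size is preserved; judging by the (8, 8) → (4, 4) shape comments, the repo presumably sets stride=2 through an arg_scope, giving ceil(8/2) = 4. A minimal sketch with raw tf.nn.conv2d to confirm the shape rules:

    import tensorflow as tf

    x = tf.zeros([1, 8, 8, 1])    # dummy 8x8 single-channel input
    k = tf.zeros([3, 3, 1, 1])    # 3x3 kernel

    same    = tf.nn.conv2d(x, k, strides=[1, 1, 1, 1], padding='SAME')   # zero-padded
    valid   = tf.nn.conv2d(x, k, strides=[1, 1, 1, 1], padding='VALID')  # no padding
    strided = tf.nn.conv2d(x, k, strides=[1, 2, 2, 1], padding='SAME')   # stride 2

    print(same.get_shape())      # (1, 8, 8, 1) -- SAME keeps the spatial size
    print(valid.get_shape())     # (1, 6, 6, 1) -- VALID gives the 6x6 I expected
    print(strided.get_shape())   # (1, 4, 4, 1) -- SAME with stride 2 halves it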

TensorBoard notes

    loss_summary = tf.summary.scalar('classification_loss', self.loss)
    accuracy_summary = tf.summary.scalar('accuracy', self.accuracy)
    self.summary_op = tf.summary.merge([loss_summary, accuracy_summary])

The code above sets up visualization: each scalar gets a summary op, and merge bundles them into a single op that can be evaluated in one sess.run call.
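A minimal usage sketch (assuming a session `sess`, a log directory `log_dir`, and a step counter `step`, none of which appear in the snippet above): the merged op is evaluated like any other tensor, and the serialized result is written to disk for TensorBoard to read.

    summary_writer = tf.summary.FileWriter(log_dir, graph=tf.get_default_graph())
    summary, _ = sess.run([model.summary_op, model.train_op], feed_dict)
    summary_writer.add_summary(summary, step)
    # then inspect with: tensorboard --logdir=<log_dir>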

Training

    elif self.mode == 'train':
        self.src_images = tf.placeholder(tf.float32, [None, 32, 32, 3], 'svhn_images')
        self.trg_images = tf.placeholder(tf.float32, [None, 32, 32, 1], 'mnist_images')

        # source domain (svhn to mnist)
        self.fx = self.content_extractor(self.src_images)
        self.fake_images = self.generator(self.fx)
        self.logits = self.discriminator(self.fake_images)
        self.fgfx = self.content_extractor(self.fake_images, reuse=True)

        # loss
        self.d_loss_src = slim.losses.sigmoid_cross_entropy(self.logits, tf.zeros_like(self.logits))
        self.g_loss_src = slim.losses.sigmoid_cross_entropy(self.logits, tf.ones_like(self.logits))
        self.f_loss_src = tf.reduce_mean(tf.square(self.fx - self.fgfx)) * 15.0

        # optimizer
        self.d_optimizer_src = tf.train.AdamOptimizer(self.learning_rate)
        self.g_optimizer_src = tf.train.AdamOptimizer(self.learning_rate)
        self.f_optimizer_src = tf.train.AdamOptimizer(self.learning_rate)

        t_vars = tf.trainable_variables()
        d_vars = [var for var in t_vars if 'discriminator' in var.name]
        g_vars = [var for var in t_vars if 'generator' in var.name]
        f_vars = [var for var in t_vars if 'content_extractor' in var.name]

        # train op
        with tf.name_scope('source_train_op'):
            self.d_train_op_src = slim.learning.create_train_op(self.d_loss_src, self.d_optimizer_src, variables_to_train=d_vars)
            self.g_train_op_src = slim.learning.create_train_op(self.g_loss_src, self.g_optimizer_src, variables_to_train=g_vars)
            self.f_train_op_src = slim.learning.create_train_op(self.f_loss_src, self.f_optimizer_src, variables_to_train=f_vars)

        # summary op
        d_loss_src_summary = tf.summary.scalar('src_d_loss', self.d_loss_src)
        g_loss_src_summary = tf.summary.scalar('src_g_loss', self.g_loss_src)
        f_loss_src_summary = tf.summary.scalar('src_f_loss', self.f_loss_src)
        origin_images_summary = tf.summary.image('src_origin_images', self.src_images)
        sampled_images_summary = tf.summary.image('src_sampled_images', self.fake_images)
        self.summary_op_src = tf.summary.merge([d_loss_src_summary, g_loss_src_summary,
                                                f_loss_src_summary, origin_images_summary,
                                                sampled_images_summary])

        # target domain (mnist)
        self.fx = self.content_extractor(self.trg_images, reuse=True)
        self.reconst_images = self.generator(self.fx, reuse=True)
        self.logits_fake = self.discriminator(self.reconst_images, reuse=True)
        self.logits_real = self.discriminator(self.trg_images, reuse=True)

        # loss
        self.d_loss_fake_trg = slim.losses.sigmoid_cross_entropy(self.logits_fake, tf.zeros_like(self.logits_fake))
        self.d_loss_real_trg = slim.losses.sigmoid_cross_entropy(self.logits_real, tf.ones_like(self.logits_real))
        self.d_loss_trg = self.d_loss_fake_trg + self.d_loss_real_trg
        self.g_loss_fake_trg = slim.losses.sigmoid_cross_entropy(self.logits_fake, tf.ones_like(self.logits_fake))
        self.g_loss_const_trg = tf.reduce_mean(tf.square(self.trg_images - self.reconst_images)) * 15.0
        self.g_loss_trg = self.g_loss_fake_trg + self.g_loss_const_trg

        # optimizer
        self.d_optimizer_trg = tf.train.AdamOptimizer(self.learning_rate)
        self.g_optimizer_trg = tf.train.AdamOptimizer(self.learning_rate)

        # train op
        with tf.name_scope('target_train_op'):
            self.d_train_op_trg = slim.learning.create_train_op(self.d_loss_trg, self.d_optimizer_trg, variables_to_train=d_vars)
            self.g_train_op_trg = slim.learning.create_train_op(self.g_loss_trg, self.g_optimizer_trg, variables_to_train=g_vars)

        # summary op
        d_loss_fake_trg_summary = tf.summary.scalar('trg_d_loss_fake', self.d_loss_fake_trg)
        d_loss_real_trg_summary = tf.summary.scalar('trg_d_loss_real', self.d_loss_real_trg)
        d_loss_trg_summary = tf.summary.scalar('trg_d_loss', self.d_loss_trg)
        g_loss_fake_trg_summary = tf.summary.scalar('trg_g_loss_fake', self.g_loss_fake_trg)
        g_loss_const_trg_summary = tf.summary.scalar('trg_g_loss_const', self.g_loss_const_trg)
        g_loss_trg_summary = tf.summary.scalar('trg_g_loss', self.g_loss_trg)
        origin_images_summary = tf.summary.image('trg_origin_images', self.trg_images)
        sampled_images_summary = tf.summary.image('trg_reconstructed_images', self.reconst_images)
        self.summary_op_trg = tf.summary.merge([d_loss_trg_summary, g_loss_trg_summary,
                                                d_loss_fake_trg_summary, d_loss_real_trg_summary,
                                                g_loss_fake_trg_summary, g_loss_const_trg_summary,
                                                origin_images_summary, sampled_images_summary])
        for var in tf.trainable_variables():
            tf.summary.histogram(var.op.name, var)

data

First get the dataset; then the content extractor produces fx, the generator turns fx into fake images, the discriminator computes logits for those fake images, and finally fgfx is computed by running the fake images back through the content extractor.

loss

The sigmoid cross-entropy losses for D and G are computed, along with the mean squared error for F; the factor 15.0 is a hyperparameter. And why do the cross-entropy losses use zeros_like and ones_like as targets — to push D(G(z)) and D(x) apart as far as possible? Roughly, yes: D is trained to score generated images as fake (target 0), while G is trained to make D score those same images as real (target 1).
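For reference, these targets implement the standard (non-saturating) GAN losses term by term: a zeros_like target on the fake logits gives D the $-\log(1 - D(G(z)))$ term, and a ones_like target on the same logits gives G's loss:

$$
\mathcal{L}_D = -\log D(x) - \log\bigl(1 - D(G(z))\bigr), \qquad
\mathcal{L}_G = -\log D\bigl(G(z)\bigr)
$$

(In the source-domain phase only the fake term of $\mathcal{L}_D$ appears; the target-domain phase has both.)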

While on this topic, I looked up the four related cross-entropy losses: sigmoid, weighted sigmoid, softmax, and sparse softmax (a blessing for the lazy — no one-hot encoding required beforehand). The first two allow multiple labels per example, while the last two are single-label: the first two can classify one person as both a driver and a husband, while the last two must pick exactly one class, such as female or male.
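A small sketch of the four variants under their TF 1.x names (the logits and labels here are made up):

    import tensorflow as tf

    logits = tf.constant([[2.0, -1.0, 0.5]])    # one example, three classes

    # multi-label: each class is an independent yes/no, so labels may overlap
    multi_hot = tf.constant([[1.0, 0.0, 1.0]])  # e.g. both 'driver' and 'husband'
    sig  = tf.nn.sigmoid_cross_entropy_with_logits(labels=multi_hot, logits=logits)
    wsig = tf.nn.weighted_cross_entropy_with_logits(targets=multi_hot, logits=logits,
                                                    pos_weight=2.0)

    # single-label: exactly one class per example
    one_hot = tf.constant([[0.0, 1.0, 0.0]])
    soft   = tf.nn.softmax_cross_entropy_with_logits(labels=one_hot, logits=logits)
    sparse = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=tf.constant([1]),
                                                            logits=logits)  # no one-hot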

Optimization

Adam optimizers are set up for D, G, and F.
This seemed a bit magical at first, and so did the t_vars filtering right after. Is the iterative optimization already happening here? It finally turned out that the actual computation lives in the solver: this code only builds the graph, and nothing runs until sess.run() is called.
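A toy illustration of that deferred-execution model in TF 1.x:

    import tensorflow as tf

    a = tf.constant(2)
    b = tf.constant(3)
    c = a + b                  # only adds a node to the graph; no arithmetic yet
    with tf.Session() as sess:
        print(sess.run(c))     # the addition actually runs here, printing 5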

Target domain

target domain (mnist)

    self.fx = self.content_extractor(self.trg_images, reuse=True)
    self.reconst_images = self.generator(self.fx, reuse=True)
    self.logits_fake = self.discriminator(self.reconst_images, reuse=True)
    self.logits_real = self.discriminator(self.trg_images, reuse=True)

    # loss
    self.d_loss_fake_trg = slim.losses.sigmoid_cross_entropy(self.logits_fake, tf.zeros_like(self.logits_fake))
    self.d_loss_real_trg = slim.losses.sigmoid_cross_entropy(self.logits_real, tf.ones_like(self.logits_real))
    self.d_loss_trg = self.d_loss_fake_trg + self.d_loss_real_trg
    self.g_loss_fake_trg = slim.losses.sigmoid_cross_entropy(self.logits_fake, tf.ones_like(self.logits_fake))
    self.g_loss_const_trg = tf.reduce_mean(tf.square(self.trg_images - self.reconst_images)) * 15.0
    self.g_loss_trg = self.g_loss_fake_trg + self.g_loss_const_trg

    # optimizer
    self.d_optimizer_trg = tf.train.AdamOptimizer(self.learning_rate)
    self.g_optimizer_trg = tf.train.AdamOptimizer(self.learning_rate)

Features fx are extracted from the MNIST images; the generator maps fx to reconst_images; reconst_images go through the discriminator to get the fake logits, and the original images go through the discriminator to get the real logits. The flow:

data set (mnist) → content_extractor → generator → discriminator → logits_fake
data set (mnist) → discriminator → logits_real

Then D's sigmoid cross-entropy losses on the real and fake logits are computed, G's sigmoid cross-entropy loss on the fake logits, and G's reconstruction consistency loss, the mean squared error between trg_images and reconst_images (again scaled by 15.0).

prepro

    def save_pickle(data, path):
        with open(path, 'wb') as f:
            pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
            print('Saved %s..' % path)

The dump method

    pickle.dump(obj, file[, protocol]) — saves the object obj to the file file.
    protocol is the protocol version used for serialization: 0 is the ASCII protocol (the serialized object is represented in printable ASCII); 1 is the old binary format; 2 is the newer binary protocol introduced in Python 2.3, more efficient than the earlier ones. Protocols 0 and 1 are compatible with older versions of Python. The default is 0.
    file: the file-like object the data is written to. It must have a write() interface; it can be a file opened in 'w' mode, a StringIO object, or anything else that implements write(). If protocol >= 1, the file must be opened in binary mode.

    pickle.load(file) — reads from file and reconstructs the original Python object.
    file: a file-like object with read() and readline() interfaces.

Its counterpart is the load method, which loads the object back from the file.
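A quick round trip using the helper above (the file name 'train.pkl' and the dict contents are just illustrative):

    import pickle

    data = {'images': [1, 2, 3], 'labels': [0, 1, 0]}
    save_pickle(data, 'train.pkl')       # serialize with the helper defined above

    with open('train.pkl', 'rb') as f:
        restored = pickle.load(f)        # reconstruct the original object
    assert restored == data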

    for idx, (s, t) in enumerate(zip(sources, targets)):

For how zip is used, there was a link to a tutorial here — the explanation was simple and clear; see also the sketch below.
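In case the link rots, a toy illustration (the file names are made up): zip pairs the two lists element by element, and enumerate adds a running index.

    sources = ['s0.png', 's1.png', 's2.png']
    targets = ['t0.png', 't1.png', 't2.png']
    for idx, (s, t) in enumerate(zip(sources, targets)):
        print(idx, s, t)
    # 0 s0.png t0.png
    # 1 s1.png t1.png
    # 2 s2.png t2.png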

solver

    for step in range(self.pretrain_iter+1):
        i = step % int(train_images.shape[0] / self.batch_size)
        batch_images = train_images[i*self.batch_size:(i+1)*self.batch_size]
        batch_labels = train_labels[i*self.batch_size:(i+1)*self.batch_size]
        feed_dict = {model.images: batch_images, model.labels: batch_labels}
        sess.run(model.train_op, feed_dict)

Wait, step is never given an initial value? Actually it is: range(self.pretrain_iter+1) starts counting from 0 by default, so step runs 0, 1, ..., pretrain_iter.
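A one-line check:

    print(list(range(4)))   # [0, 1, 2, 3] -- counting starts at 0 by default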

feed_dict in sess.run()

We all know that feed_dict is used to assign values to tensors created with placeholder. In fact its use is broader: a feed temporarily replaces the output of any op with a given value. You supply the feed data as an argument to the run() call, and the feed is only valid within that call — once the method returns, the feed disappears.

    import tensorflow as tf

    y = tf.Variable(1)
    b = tf.identity(y)
    with tf.Session() as sess:
        tf.global_variables_initializer().run()
        # the feed replaces the output of tf.Variable(1) with 3, so this prints 3
        print(sess.run(b, feed_dict={y: 3}))
        # the feed only lives inside the run() call that uses it, so this prints 1
        print(sess.run(b))

The output is 3, then 1.

Summary

The whole project feeds the MNIST and SVHN datasets into the GAN for training at the same time, continually updating G, D, and F to minimize the loss functions. In terms of what it can do, it goes a step beyond the earlier neural style transfer and is more broadly applicable. Stumbling along, I've finally gotten GANs straight, instead of the half-understanding I had before.
