First Blog Post: Learning TensorFlow, Part 1

Source: Internet | Editor: 程序博客网 | Time: 2024/05/16 03:52

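Let me organize my thoughts. My research topic is word sense disambiguation (WSD). I read a paper by researchers at Google that uses neural networks for WSD and tried to reproduce it in Keras. The Keras results stayed poor no matter what I tried, and after I talked it over with a senior labmate, he strongly advised me to reproduce the authors' experiments in TensorFlow instead. So I started learning TF. Over the holidays I watched 莫凡's tutorial videos at home, but they cover only very basic material, and reading code on GitHub is still a struggle: unlike Keras, whose documentation has been well localized into Chinese, TF's methods come with no Chinese usage notes.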

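So what to do? I contacted the paper's authors to ask for the source code and was told it would be open-sourced soon. I am therefore changing strategy: instead of rushing to write the experiment code, I will concentrate on learning TF properly. (If my advisor pushes, so be it; rushing gets nothing done.) Learning this requires a clear head, so I plan to read one piece of code from start to finish and understand what every step does. Reading the English documentation directly is overwhelming, and there are too many methods to memorize at once, which is why I chose this approach.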

I will start with the code from this tutorial:

http://www.tensorfly.cn/tfdoc/tutorials/recurrent.html

https://github.com/tensorflow/models/tree/master/tutorials/rnn/ptb

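Start from if __name__ == '__main__'. I never knew what this line was for. It means that the block under it runs only when the module is executed directly; if the module is imported by another module, the block is skipped.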

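Now for the actual code, followed step by step in the order the program runs. Simple explanations go in comments to the right of the code; more complex functions are explained below the code.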

# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Example / benchmark for building a PTB LSTM model.

Trains the model described in:
(Zaremba, et. al.) Recurrent Neural Network Regularization                       # trains the RNN from this paper
http://arxiv.org/abs/1409.2329

There are 3 supported model configurations:
===========================================
| config | epochs | train | valid  | test
===========================================
| small  | 13     | 37.99 | 121.39 | 115.91
| medium | 39     | 48.45 |  86.16 |  82.07
| large  | 55     | 37.87 |  82.62 |  78.29
The exact results may vary depending on the random initialization.               # results change with the random initialization

The hyperparameters used in the model:                                           # parameters used by the model
- init_scale - the initial scale of the weights
- learning_rate - the initial value of the learning rate
- max_grad_norm - the maximum permissible norm of the gradient                   # gradients with a larger norm are scaled down (gradient clipping)
- num_layers - the number of LSTM layers
- num_steps - the number of unrolled steps of LSTM                               # this is the time_step, i.e. the number of input words
- hidden_size - the number of LSTM units
- max_epoch - the number of epochs trained with the initial learning rate        # epochs run at the initial learning rate
- max_max_epoch - the total number of epochs for training
- keep_prob - the probability of keeping weights in the dropout layer            # i.e. 1 - dropout rate
- lr_decay - the decay of the learning rate for each epoch after "max_epoch"     # learning rate decay
- batch_size - the batch size

The data required for this example is in the data/ dir of the
PTB dataset from Tomas Mikolov's webpage:

$ wget http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz
$ tar xvf simple-examples.tgz

To run:

$ python ptb_word_lm.py --data_path=simple-examples/data/

"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import time

import numpy as np
import tensorflow as tf

import reader

# Before building and training the model we first set some parameters.
# In TF, tf.flags can be used to define global, command-line parameters.
flags = tf.flags
logging = tf.logging

flags.DEFINE_string(
    "model", "small",
    "A type of model. Possible options are: small, medium, large.")          # flag "model" with default "small"; the last argument is the help text
flags.DEFINE_string("data_path", None,
                    "Where the training/test data is stored.")               # where the downloaded data is stored
flags.DEFINE_string("save_path", None,
                    "Model output directory.")                               # where the model output is written
flags.DEFINE_bool("use_fp16", False,
                  "Train using 16-bit floats instead of 32bit floats")       # whether to use the float16 format

FLAGS = flags.FLAGS  # FLAGS.model now gives the value of the flag "model"
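For readers without a TF 1.x installation at hand, the idea behind these flag definitions can be tried out with the standard-library `argparse` module. This is only a stand-in to illustrate "name, default value, help text", not how `tf.flags` is implemented, and `parse_flags` is a hypothetical name introduced here.

```python
import argparse


def parse_flags(argv):
    """Hypothetical stand-in for the tf.flags definitions, built on argparse."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", default="small",
                        help="A type of model: small, medium, large.")
    parser.add_argument("--data_path", default=None,
                        help="Where the training/test data is stored.")
    parser.add_argument("--save_path", default=None,
                        help="Model output directory.")
    parser.add_argument("--use_fp16", action="store_true",
                        help="Train using 16-bit floats instead of 32-bit.")
    return parser.parse_args(argv)


# Parse an explicit argument list instead of sys.argv so the sketch is repeatable.
FLAGS = parse_flags(["--data_path=simple-examples/data/"])
print(FLAGS.model)      # default value, because --model was not passed
print(FLAGS.data_path)  # value taken from the argument list
```

As with `tf.flags`, a flag that is not given on the command line keeps its default, which is why `data_path` is `None` unless `--data_path` is passed.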
Here is a fairly good blog post on this code; read it first before continuing: http://www.cnblogs.com/wuzhitj/p/6297992.html
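# Values of the "small" configuration
init_scale = 0.1        # initial parameter values are drawn uniformly from [-init_scale, +init_scale]
learning_rate = 1.0     # initial learning rate; it is lowered once the number of passes over the text exceeds max_epoch
max_grad_norm = 5       # controls exploding gradients: if the L2 norm of the gradient exceeds max_grad_norm, the gradient is scaled down proportionally
num_layers = 2          # number of LSTM layers
num_steps = 20          # sequence length of a single example
hidden_size = 200       # number of units in the hidden layer
max_epoch = 4           # for the first max_epoch epochs the decay factor is 1; after that the learning rate shrinks every epoch
max_max_epoch = 13      # total number of passes over the text
keep_prob = 1.0         # for dropout: on each batch every unit is switched off with probability 1 - keep_prob, which helps against overfitting
lr_decay = 0.5          # learning rate decay factor
batch_size = 20         # size of each batch: 20 examples
vocab_size = 10000      # vocabulary size: 10K words in total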
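Two of these parameters describe small computations that are easier to see in numbers. The sketch below is plain Python, not the TensorFlow code: the learning-rate function follows the schedule as I read it in the tutorial script (learning_rate * lr_decay ** max(epoch + 1 - max_epoch, 0), with a 0-based epoch index), and the clipping function treats the gradient as one flat list of numbers, which is a simplification. Both function names are made up for this illustration.

```python
import math


def learning_rate_for_epoch(epoch, learning_rate=1.0, lr_decay=0.5, max_epoch=4):
    """Learning rate for a 0-based epoch index: constant first, then decaying."""
    exponent = max(epoch + 1 - max_epoch, 0)
    return learning_rate * lr_decay ** exponent


def clip_by_global_norm(grads, max_grad_norm=5.0):
    """Scale all values by one factor so that their joint L2 norm is at most max_grad_norm."""
    global_norm = math.sqrt(sum(g * g for g in grads))
    if global_norm <= max_grad_norm:
        return list(grads)          # small enough: left unchanged
    scale = max_grad_norm / global_norm
    return [g * scale for g in grads]


print([learning_rate_for_epoch(e) for e in range(6)])  # [1.0, 1.0, 1.0, 1.0, 0.5, 0.25]
print(clip_by_global_norm([6.0, 8.0]))                 # norm 10 > 5, so [3.0, 4.0]
```

Clipping keeps the direction of the gradient and only shortens it, which is why every component is multiplied by the same factor.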
if __name__ == "__main__":
  tf.app.run()
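The __main__ block contains only this one statement; see this blog post: http://blog.csdn.net/helei001/article/details/51859423 . I do not fully understand it myself yet, so let's keep reading.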
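As far as I understand it, `tf.app.run()` parses the command-line flags, calls the module's `main` function with the remaining arguments, and exits with whatever `main` returns. The sketch below imitates only the last two steps in plain Python; `run` is a hypothetical, simplified imitation and not the TensorFlow implementation.

```python
import sys


def run(main, argv=None):
    """Simplified imitation of tf.app.run: call main(argv), then exit."""
    argv = sys.argv if argv is None else argv
    # The real function also parses the command-line flags before this point.
    sys.exit(main(argv))


def main(_):
    # The argument list is unused here, hence the name "_" as in the tutorial.
    print("main was called")
    return 0  # becomes the process exit code


# Catch SystemExit only so that the demonstration can print the exit code.
try:
    run(main, ["ptb_word_lm.py"])
except SystemExit as exc:
    print("exit code:", exc.code)
```

This also explains the signature `def main(_)`: the function receives the argument list but ignores it, because the settings are read from FLAGS instead.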
The main function:
(The API documentation is on the official English site, which requires a VPN to reach.)
def main(_):
  if not FLAGS.data_path:                                            # raise an error if --data_path was not set (its default is None)
    raise ValueError("Must set --data_path to PTB data directory")

  raw_data = reader.ptb_raw_data(FLAGS.data_path)
The source of reader.ptb_raw_data is as follows:
import collections   # used by _build_vocab further down
import os            # used for os.path.join

def ptb_raw_data(data_path=None):
  """Load PTB raw data from data directory "data_path".

  Reads PTB text files, converts strings to integer ids,
  and performs mini-batching of the inputs.

  The PTB dataset comes from Tomas Mikolov's webpage:

  http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz

  Args:
    data_path: string path to the directory where simple-examples.tgz has
      been extracted.

  Returns:
    tuple (train_data, valid_data, test_data, vocabulary)
    where each of the data objects can be passed to PTBIterator.
  """
  # Paths and names of the three data files
  train_path = os.path.join(data_path, "ptb.train.txt")
  valid_path = os.path.join(data_path, "ptb.valid.txt")
  test_path = os.path.join(data_path, "ptb.test.txt")

  word_to_id = _build_vocab(train_path)
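def _build_vocab(filename):
  data = _read_words(filename)                                      # replace every newline with <eos>, then split(); returns all words in order as one list
                                                                    # e.g. "I have a pen . \n" becomes ['I', 'have', 'a', 'pen', '.', '<eos>']

def _read_words(filename):
  with tf.gfile.GFile(filename, "r") as f:
    # Note: .decode("utf-8") is Python 2 style; on Python 3 f.read() already returns str here.
    return f.read().decode("utf-8").replace("\n", "<eos>").split()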
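  # (_build_vocab, continued)
  counter = collections.Counter(data)                               # count occurrences, e.g. Counter({'N': 2, '<eos>': 2}); the Counter itself is not sorted
  count_pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))  # list of (word, count) tuples, e.g. [('<eos>', 2), ('N', 2), ...]:
                                                                    # by count in descending order, ties by word in ascending order
  words, _ = list(zip(*count_pairs))                                # two tuples, one with all words and one with all counts, aligned one to one,
                                                                    # e.g. ('<eos>', 'N') and (2, 2)
                                                                    # zip is explained at http://www.cnblogs.com/frydsh/archive/2012/07/10/2585370.html
  word_to_id = dict(zip(words, range(len(words))))                  # dict such as {'<eos>': 0, 'N': 1}; the more frequent a word, the smaller its id
  return word_to_id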
  # (ptb_raw_data, continued)
  train_data = _file_to_word_ids(train_path, word_to_id)
  valid_data = _file_to_word_ids(valid_path, word_to_id)
  test_data = _file_to_word_ids(test_path, word_to_id)

def _file_to_word_ids(filename, word_to_id):
  data = _read_words(filename)                                      # replace the newlines of the whole file with <eos> and split into words
  return [word_to_id[word] for word in data if word in word_to_id]  # turn the word list into the list of ids; words missing from the vocabulary are skipped
  # (ptb_raw_data, continued)
  vocabulary = len(word_to_id)                                     # number of distinct words
  return train_data, valid_data, test_data, vocabulary             # id lists of the three files and the vocabulary size
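The whole pipeline (read words, count, sort, assign ids, convert text to ids) can be tried on a toy string without TensorFlow or the PTB files. In the sketch below the helper names mirror the ones in reader.py but take a string instead of a file name, and the sample sentences are made up.

```python
import collections


def read_words(text):
    """Same transformation as _read_words, applied to a string instead of a file."""
    return text.replace("\n", "<eos>").split()


def build_vocab(words):
    """Map each word to an integer id; more frequent words get smaller ids."""
    counter = collections.Counter(words)
    count_pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))
    vocab_words, _ = zip(*count_pairs)
    return dict(zip(vocab_words, range(len(vocab_words))))


def to_word_ids(words, word_to_id):
    """Replace words by ids, silently skipping words missing from the vocabulary."""
    return [word_to_id[w] for w in words if w in word_to_id]


# Toy text: every line starts and ends with a space, so that "<eos>"
# ends up as a separate token after split().
train_text = " I have a pen . \n I have an apple . \n"
words = read_words(train_text)
word_to_id = build_vocab(words)

print(words)
print(word_to_id)
# "banana" is not in the vocabulary, so it is dropped: [2, 3, 4, 0, 1]
print(to_word_ids(read_words(" I have a banana . \n"), word_to_id))
```

One detail worth noticing: the newline is replaced without adding spaces, so a text such as "pen.\nnext" would produce the single token "pen.<eos>next". The function relies on the whitespace around each line break in the data files.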






                                                     0        0           	
					
					   第一篇博文--TensorFlow学习1
	  	   Tensorflow的学习（第一篇）
	  	   第一篇：tensorflow入门
	  	   第一篇博文
	  	   第一篇博文
	  	   第一篇博文
	  	   第一篇博文！
	  	   第一篇博文
	  	   第一篇博文
	  	   第一篇博文
	  	   第一篇博文
	  	   第一篇博文？
	  	   第一篇博文
	  	   第一篇博文。
	  	   第一篇博文
	  	   第一篇博文
	  	   第一篇博文.
	  	   第一篇博文
	     		  
	  	   thinkphp中的url参数传值问题
	  	   Java开发者常犯的10个错误
	  	   JavaScript 函数
	  	   Android的manifest文件中的application中的android:supportsRtl="true"
	  	   重入锁
	  	   第一篇博文--TensorFlow学习1
	  	   WPF DatePicker自定义时间格式
	  	   HDU1498-二分图行列匹配
	  	   微信退款 坑-1
	  	   mcfw框架相关
	  	   android opencv2.4.10使用SIFT编译出libnonfree.so
	  	   Linux下编译Android源码问题汇总
	  	   ls命令详细使用
	  	   如何用CorelDRAW做扁平化扇形统计图