

news 2026/1/13 17:52:08

# RNN-Based Language Models

## Theoretical foundations of RNN language models

### Limitations of CBOW / skip-gram

#### Solution

### RNN model details

#### Mathematical representation

A network with a single input and a single output is not a recurrent neural network.

## RNN language models in practice

### demo1

#### 1A. Improving the RNN model from the previous lesson

In this first version we wrap the code from the previous lesson in a class and use TensorFlow's built-in RNN to implement forward propagation. We also discuss how to do cross-validation.

```python
import time
import numpy as np
import tensorflow as tf


class BatchGenerator(object):
    """Produces mini-batches; each call returns one mini-batch."""

    def __init__(self, tensor_in, tensor_out, batch_size, seq_length):
        """Initialize the mini-batch generator (base class).

        Input:
            batch_size: number of samples in each mini-batch.
            seq_length: length of each sample; together with batch_size it
                        determines the amount of data in each mini-batch.
        """
        self.batch_size = batch_size
        self.seq_length = seq_length
        self.tensor_in = tensor_in
        self.tensor_out = tensor_out
        self.create_batches()
        self.reset_batch_pointer()

    def reset_batch_pointer(self):
        self.pointer = 0

    def create_batches(self):
        self.num_batches = int(self.tensor_in.size / (self.batch_size * self.seq_length))
        self.tensor_in = self.tensor_in[:self.num_batches * self.batch_size * self.seq_length]
        self.tensor_out = self.tensor_out[:self.num_batches * self.batch_size * self.seq_length]

        # When the data (tensor) is too small, give a clearer error message.
        if self.num_batches == 0:
            raise ValueError("Not enough data. Make seq_length and batch_size small.")

        # np.split cuts the reshaped tensor along axis 1 into num_batches groups.
        self.x_batches = np.split(self.tensor_in.reshape(self.batch_size, -1), self.num_batches, 1)
        self.y_batches = np.split(self.tensor_out.reshape(self.batch_size, -1), self.num_batches, 1)

    def next_batch(self):
        x, y = self.x_batches[self.pointer], self.y_batches[self.pointer]
        self.pointer += 1
        return x, y


class CopyBatchGenerator(BatchGenerator):
    def __init__(self, data, batch_size, seq_length):
        """Initialize the mini-batch generator from one sequence of length T.

        The first T-1 elements of the sequence are the input and the last
        T-1 elements are the output. Used to train an RNNLM.

        Input:
            batch_size: number of samples in each mini-batch.
            seq_length: length of each sample; together with batch_size it
                        determines the amount of data in each mini-batch.
        """
        self.batch_size = batch_size
        self.seq_length = seq_length
        tensor_in = np.array(data)
        tensor_out = np.copy(tensor_in)
        # Shift by one position: the target at step t is the input at step t+1.
        tensor_out[:-1] = tensor_in[1:]
        tensor_out[-1] = tensor_in[0]
        super(CopyBatchGenerator, self).__init__(tensor_in, tensor_out, batch_size, seq_length)


class PredBatchGenerator(BatchGenerator):
    def __init__(self, data_in, data_out, batch_size, seq_length):
        """Initialize the mini-batch generator from two sequences of length T,
        one being the input sequence and the other the output sequence.

        Input:
            batch_size: number of samples in each mini-batch.
            seq_length: length of each sample; together with batch_size it
                        determines the amount of data in each mini-batch.
        """
        self.batch_size = batch_size
        self.seq_length = seq_length
        tensor_in = np.array(data_in)
        tensor_out = np.array(data_out)
        super(PredBatchGenerator, self).__init__(tensor_in, tensor_out, batch_size, seq_length)
```

#### Defining the CharRNN model

As in the previous lesson, the input and output of our RNN model are sequences of the same length; we call this a char-level RNN model. Next week we will look at input and output at the sentence level.

`BasicRNNCell` is the simplest implementation of the abstract class `RNNCell`. It is part of TensorFlow's own Python code. Note that the cell returns its output twice: one copy goes to the outside, the other is passed on to the next cell as the state.

```python
# Excerpt from the TensorFlow source; RNNCell, math_ops and _Linear are
# TensorFlow internals, so this snippet is for reading, not for running alone.
class BasicRNNCell(RNNCell):
    def __init__(self, num_units, activation=None, reuse=None):
        super(BasicRNNCell, self).__init__(_reuse=reuse)
        self._num_units = num_units
        self._activation = activation or math_ops.tanh
        self._linear = None

    @property
    def state_size(self):
        return self._num_units

    @property
    def output_size(self):
        return self._num_units

    def call(self, inputs, state):
        if self._linear is None:
            self._linear = _Linear([inputs, state], self._num_units, True)
        output = self._activation(self._linear([inputs, state]))
        return output, output
```

The depth equals the dimension of the word vectors.

```python
class CharRNNLM(object):
    def __init__(self, batch_size, num_unrollings, vocab_size,
                 hidden_size, embedding_size, learning_rate):
        """Character-2-Character RNN model.

        The training data for this model are two sequences of the same
        length: one is the input and the other is the output.
        """
        self.batch_size = batch_size
        self.num_unrollings = num_unrollings
        self.hidden_size = hidden_size
        self.vocab_size = vocab_size
        self.embedding_size = embedding_size

        self.input_data = tf.placeholder(tf.int64, [self.batch_size, self.num_unrollings], name='inputs')
        self.targets = tf.placeholder(tf.int64, [self.batch_size, self.num_unrollings], name='targets')

        cell_fn = tf.nn.rnn_cell.BasicRNNCell
        params = dict()
        cell = cell_fn(self.hidden_size, **params)

        with tf.name_scope('initial_state'):
            self.zero_state = cell.zero_state(self.batch_size, tf.float32)
            self.initial_state = tf.placeholder(tf.float32,
                                                [self.batch_size, cell.state_size],
                                                'initial_state')

        with tf.name_scope('embedding_layer'):
            # Define the word-vector parameters and look up an embedding
            # vector for every element of the integer input sequence.
            # If an embedding dimension is given, declare an embedding
            # parameter (the word-vector matrix, one word vector per row);
            # otherwise use the identity matrix as the word-vector matrix.
            if embedding_size > 0:
                self.embedding = tf.get_variable('embedding', [self.vocab_size, self.embedding_size])
            else:
                self.embedding = tf.constant(np.eye(self.vocab_size), dtype=tf.float32)
            inputs = tf.nn.embedding_lookup(self.embedding, self.input_data)

        with tf.name_scope('slice_inputs'):
            # static_rnn needs the sequence of length num_unrollings cut into
            # num_unrollings pieces stored in a list, i.e. the input format is
            # [num_unrollings, (batch_size, embedding_size)]
            sliced_inputs = [tf.squeeze(input_, [1]) for input_ in tf.split(
                axis=1, num_or_size_splits=self.num_unrollings, value=inputs)]

        # Call static_rnn for forward propagation.
        # For readability, notes on static_rnn are pasted here:
        # tf.nn.rnn builds an unrolled graph of fixed length. With 200 input
        # steps you get a static graph with 200 RNN steps. First, building the
        # graph is slow. Second, you cannot pass in sequences longer than the
        # length originally specified (200). tf.nn.dynamic_rnn solves this:
        # when executed it uses a loop to build the graph dynamically, so graph
        # creation is faster and batches of variable size can be fed.
        # A static RNN builds all cells first and only then starts running.
        #
        # Inputs
        #   inputs: A length T list of inputs, each a Tensor of shape [batch_size, input_size]
        #   initial_state: An initial state for the RNN.
        #       If cell.state_size is an integer, this must be a Tensor of
        #       appropriate type and shape [batch_size, cell.state_size]
        # Outputs
        #   outputs: a length T list of outputs (one for each input), or a nested tuple of such elements.
        #   state: the final state
        outputs, final_state = tf.nn.static_rnn(
            cell=cell,                 # internal structure of the RNN cell
            inputs=sliced_inputs,      # inputs
            initial_state=self.initial_state)
        self.final_state = final_state

        with tf.name_scope('flatten_outputs'):
            flat_outputs = tf.reshape(tf.concat(axis=1, values=outputs), [-1, hidden_size])

        with tf.name_scope('flatten_targets'):
            flat_targets = tf.reshape(tf.concat(axis=1, values=self.targets), [-1])

        with tf.variable_scope('softmax') as sm_vs:
            softmax_w = tf.get_variable('softmax_w', [hidden_size, vocab_size])
            softmax_b = tf.get_variable('softmax_b', [vocab_size])
            self.logits = tf.matmul(flat_outputs, softmax_w) + softmax_b
            self.probs = tf.nn.softmax(self.logits)

        with tf.name_scope('loss'):
            loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
                logits=self.logits, labels=flat_targets)
            self.mean_loss = tf.reduce_mean(loss)

        with tf.name_scope('loss_montor'):
            count = tf.Variable(1.0, name='count')
            sum_mean_loss = tf.Variable(1.0, name='sum_mean_loss')

            self.reset_loss_monitor = tf.group(sum_mean_loss.assign(0.0),
                                               count.assign(0.0), name='reset_loss_monitor')
            self.update_loss_monitor = tf.group(sum_mean_loss.assign(sum_mean_loss + self.mean_loss),
                                                count.assign(count + 1), name='update_loss_monitor')

            with tf.control_dependencies([self.update_loss_monitor]):
                self.average_loss = sum_mean_loss / count
                self.ppl = tf.exp(self.average_loss)

        self.global_step = tf.get_variable('global_step', [], initializer=tf.constant_initializer(0.0))
        self.learning_rate = tf.placeholder(tf.float32, [], name='learning_rate')

        tvars = tf.trainable_variables()
        grads = tf.gradients(self.mean_loss, tvars)
        optimizer = tf.train.AdamOptimizer(self.learning_rate)
        self.train_op = optimizer.apply_gradients(zip(grads, tvars), global_step=self.global_step)

    # Run one epoch.
    # Note that the session is passed in as an input argument.
    # See the figure below for an explanation.
    def run_epoch(self, session, batch_generator, learning_rate, freq=10):
        epoch_size = batch_generator.num_batches
        extra_op = self.train_op

        state = self.zero_state.eval()
        self.reset_loss_monitor.run()
        batch_generator.reset_batch_pointer()
        start_time = time.time()
        for step in range(epoch_size):
            x, y = batch_generator.next_batch()
            ops = [self.average_loss, self.ppl, self.final_state, extra_op, self.global_step]
            feed_dict = {self.input_data: x, self.targets: y,
                         self.initial_state: state,
                         self.learning_rate: learning_rate}
            results = session.run(ops, feed_dict)
            # Option 1: use the final state of the previous mini-batch
            # as the initial state of the next mini-batch.
            average_loss, ppl, state, _, global_step = results
            # Option 2: always use a zero tensor as the initial state
            # of the next mini-batch.
            # average_loss, ppl, final_state, _, global_step = results
        return ppl, global_step
```

Import the module that generates the synthetic data:

```python
from data.synthetic.synthetic_binary import gen_data
```

#### Demonstrating a variable scope conflict

If the code cell below is run twice in a row, the following error is raised (note `reuse`):

```
ValueError: Variable embedding already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

  File "<ipython-input-1-2c5b9a1002a7>", line 36, in __init__
    self.embedding = tf.get_variable('embedding', [self.vocab_size, self.embedding_size])
  File "<ipython-input-2-44e044623871>", line 9, in <module>
    vocab_size, hidden_size, embedding_size, learning_rate)
  File "/home/dong/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
```

Let's test the model:

```python
batch_size = 16
num_unrollings = 20
vocab_size = 2
hidden_size = 16
embedding_size = 16
learning_rate = 0.01

model = CharRNNLM(batch_size, num_unrollings,
                  vocab_size, hidden_size, embedding_size, learning_rate)

dataset = gen_data(size=1000000)
batch_size = 16
seq_length = num_unrollings
batch_generator = PredBatchGenerator(data_in=dataset[0],
                                     data_out=dataset[1],
                                     batch_size=batch_size,
                                     seq_length=seq_length)
# batch_generator = BatchGenerator(dataset[0], batch_size, seq_length)

session = tf.Session()
with session.as_default():
    for epoch in range(1):
        session.run(tf.global_variables_initializer())
        ppl, global_step = model.run_epoch(session, batch_generator, learning_rate, freq=10)
        print(ppl)
```

Output:

```
1.58694
1.59246
1.59855
1.59121
1.59335
```

Print the variables:

```python
all_vars = [node.name for node in tf.global_variables()]
for var in all_vars:
    print(var)
```

Printed result:

```
embedding:0
rnn/basic_rnn_cell/kernel:0
rnn/basic_rnn_cell/bias:0
softmax/softmax_w:0
softmax/softmax_b:0
loss_montor/count:0
loss_montor/sum_mean_loss:0
global_step:0
beta1_power:0
beta2_power:0
embedding/Adam:0
embedding/Adam_1:0
rnn/basic_rnn_cell/kernel/Adam:0
rnn/basic_rnn_cell/kernel/Adam_1:0
rnn/basic_rnn_cell/bias/Adam:0
rnn/basic_rnn_cell/bias/Adam_1:0
softmax/softmax_w/Adam:0
softmax/softmax_w/Adam_1:0
softmax/softmax_b/Adam:0
softmax/softmax_b/Adam_1:0
```

#### Improvement: how to cross-validate?

Define another CharRNN object and use the validation data to compute ppl:

```python
tf.get_variable_scope().reuse_variables()
valid_model = CharRNNLM(batch_size, num_unrollings,
                        vocab_size, hidden_size, embedding_size, learning_rate)
```

Key point, a debugging exercise:

```
ValueError: Variable embedding/Adam_2/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?
```

The key is `Adam_2`: compare with the variable list above and note `Adam_1`. With reuse enabled, building a second model also builds a second optimizer, which tries to look up Adam slot variables that were never created. When we create the validation and test objects we should disable the optimizer, because we do not want to update parameters while evaluating the model. Removing the optimizer avoids the error above. In addition, we add summary functionality; see version 2.

To solve the problem above, add an `is_training` argument:

```python
import time
import numpy as np
import tensorflow as tf


class CharRNNLM(object):
    def __init__(self, is_training, batch_size, num_unrollings, vocab_size,
                 hidden_size, embedding_size, learning_rate):
        """New arguments:
            is_training: whether we are in the training phase.
        """
```

The optimizer is defined only when training. When not training it is not defined, so the error from version 1 no longer occurs:

```python
        # mark: update from version 1 to version 2
        if is_training:
            tvars = tf.trainable_variables()
            grads = tf.gradients(self.mean_loss, tvars)
            optimizer = tf.train.AdamOptimizer(self.learning_rate)
            self.train_op = optimizer.apply_gradients(zip(grads, tvars), global_step=self.global_step)
```

Summary functionality is added:

```python
        # mark: version 1 --> version 2
        # Add summaries so the training process can be observed in TensorBoard.
        average_loss_summary = tf.summary.scalar(
            name='average_loss', tensor=self.average_loss)
        ppl_summary = tf.summary.scalar(
            name='perplexity', tensor=self.ppl)
        self.summaries = tf.summary.merge(
            inputs=[average_loss_summary, ppl_summary], name='loss_monitor')
```

#### Notes

- ppl stands for perplexity. Does it have a computable theoretical value? It will not be 0.
- It does not necessarily converge monotonically to one value. On the training data it does, but the ppl on test and validation data may not: it can first fall and then rise again (overfitting).
- The optimizer is applied after the loss function has been computed.
- char-rnn is a special case of RNNLM; the idea behind the two is the same.
- Tuning: when the train and validation data are generated by the same rule, the results are very good.
- What a deep-learning (DNN, RNN) bot will say is not known in advance, whereas what a template (rule-based) bot will say can be predicted. Fixed-domain bots.
- The task is to predict what the next character is.
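
The perplexity that the model tracks as `tf.exp(self.average_loss)` can be reproduced with plain NumPy. The following is a minimal sketch, not the course code: the function name `perplexity` and the arrays `uniform` and `confident` are made up for illustration. It shows why perplexity cannot reach 0: a model that predicts uniformly over a vocabulary of size 2 has perplexity 2, and better predictions only push the value toward 1.

```python
import numpy as np


def perplexity(probs, targets):
    """Perplexity = exp(mean negative log-likelihood of the target symbols).

    probs:   array of shape [N, vocab_size]; each row is a probability distribution
    targets: integer array of shape [N] holding the true next symbols
    (Both names are illustrative; they do not appear in the original code.)
    """
    probs = np.asarray(probs, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.int64)
    # Probability the model assigned to each correct symbol.
    picked = probs[np.arange(len(targets)), targets]
    # Same quantity as the mean sparse softmax cross-entropy.
    mean_nll = -np.mean(np.log(picked))
    return float(np.exp(mean_nll))


targets = np.array([0, 1, 1, 0])

# A model that knows nothing: 0.5 for each of the two symbols.
uniform = np.full((4, 2), 0.5)

# A model that puts 0.9 on the correct symbol every time.
confident = np.array([[0.9, 0.1],
                      [0.1, 0.9],
                      [0.1, 0.9],
                      [0.9, 0.1]])

ppl_uniform = perplexity(uniform, targets)      # equals vocab_size for a uniform model
ppl_confident = perplexity(confident, targets)  # moves toward 1 as predictions improve
print(ppl_uniform, ppl_confident)
```

With `vocab_size = 2`, as in the experiment above, any ppl below 2 means the model is doing better than a uniform guess.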


http://www.yutouwan.com/news/152198/