
1. Background

In some scenarios you start out with very little data. If you train a brand-new deep learning text classification model on just a few thousand examples, the results will not be good. At that point you have two options: 1. train a traditional machine learning model, or 2. use transfer learning and train on top of a pretrained model. This post shows how to use TensorFlow Hub and Keras to train a text classification model on a small amount of data.

2. Practice

2.1. Download the IMDB dataset

See the following post for an introduction to the IMDB movie review dataset and where to download it:

Imdb影评的数据集介绍与下载_imdb影评数据集-CSDN博客

2.2. Preprocess the data

Replace imdb_raw_data_dir with the path to your own aclImdb directory, and create a dataset directory.

```python
import os
import re

import numpy as np
from sklearn.model_selection import train_test_split

vocab_size = 30000
maxlen = 200
imdb_raw_data_dir = '/Users/harry/Documents/apps/ml/aclImdb'
save_dir = 'dataset'


def get_data(datapath=r'D:\train_data\aclImdb\aclImdb\train'):
    pos_files = os.listdir(datapath + '/pos')
    neg_files = os.listdir(datapath + '/neg')
    print(len(pos_files))
    print(len(neg_files))

    pos_all = []
    neg_all = []
    for pf, nf in zip(pos_files, neg_files):
        with open(datapath + '/pos/' + pf, encoding='utf-8') as f:
            s = f.read()
            s = process(s)
            pos_all.append(s)
        with open(datapath + '/neg/' + nf, encoding='utf-8') as f:
            s = f.read()
            s = process(s)
            neg_all.append(s)
    print(len(pos_all))
    print(len(neg_all))

    X_orig = np.array(pos_all + neg_all)
    Y_orig = np.array([1 for _ in range(len(pos_all))] + [0 for _ in range(len(neg_all))])
    print('X_orig:', X_orig.shape)
    print('Y_orig:', Y_orig.shape)
    return X_orig, Y_orig


def generate_dataset():
    X_orig, Y_orig = get_data(imdb_raw_data_dir + r'/train')
    X_orig_test, Y_orig_test = get_data(imdb_raw_data_dir + r'/test')
    X_orig = np.concatenate([X_orig, X_orig_test])
    Y_orig = np.concatenate([Y_orig, Y_orig_test])

    X = X_orig
    Y = Y_orig
    np.random.seed(1)
    random_indexs = np.random.permutation(len(X))
    X = X[random_indexs]
    Y = Y[random_indexs]
    X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
    print('X_train:', X_train.shape)
    print('y_train:', y_train.shape)
    print('X_test:', X_test.shape)
    print('y_test:', y_test.shape)
    np.savez(save_dir + '/train_test',
             X_train=X_train, y_train=y_train, X_test=X_test, y_test=y_test)


def rm_tags(text):
    # Strip HTML tags such as <br />
    re_tag = re.compile(r'<[^>]+>')
    return re_tag.sub(' ', text)


def clean_str(string):
    string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string)
    string = re.sub(r"\'s", " \'s", string)    # it's -> it 's
    string = re.sub(r"\'ve", " \'ve", string)  # I've -> I 've
    string = re.sub(r"n\'t", " n\'t", string)  # doesn't -> does n't
    string = re.sub(r"\'re", " \'re", string)  # you're -> you 're
    string = re.sub(r"\'d", " \'d", string)    # you'd -> you 'd
    string = re.sub(r"\'ll", " \'ll", string)  # you'll -> you 'll
    string = re.sub(r"\'m", " \'m", string)    # I'm -> I 'm
    string = re.sub(r",", " , ", string)
    string = re.sub(r"!", " ! ", string)
    string = re.sub(r"\(", " \( ", string)
    string = re.sub(r"\)", " \) ", string)
    string = re.sub(r"\?", " \? ", string)
    string = re.sub(r"\s{2,}", " ", string)
    return string.strip().lower()


def process(text):
    # Strip HTML tags first, otherwise clean_str replaces the angle
    # brackets with spaces and rm_tags never finds a tag to remove.
    text = rm_tags(text)
    text = clean_str(text)
    return text


if __name__ == '__main__':
    generate_dataset()
```

After this runs, a train_test.npz file is produced.

2.3. Train the model

1. Load the dataset:

```python
def get_dataset_to_train():
    train_test = np.load('dataset/train_test.npz', allow_pickle=True)
    x_train = train_test['X_train']
    y_train = train_test['y_train']
    x_test = train_test['X_test']
    y_test = train_test['y_test']
    return x_train, y_train, x_test, y_test
```

2. Build the model. It is based on the nnlm-en-dim50/2 pretrained text embedding, with two fully connected layers added on top.

```python
def get_model():
    hub_layer = hub.KerasLayer(embedding_url, input_shape=[],
                               dtype=tf.string, trainable=True)

    # Build the model
    model = Sequential([
        hub_layer,
        Dense(16, activation='relu'),
        Dropout(0.5),
        Dense(2, activation='softmax')
    ])
    print(model.summary())
    model.compile(optimizer=keras.optimizers.Adam(),
                  loss=keras.losses.SparseCategoricalCrossentropy(),
                  metrics=[keras.metrics.SparseCategoricalAccuracy()])
    return model
```

There are many other pretrained text embeddings from TFHub you could use instead:

- google/nnlm-en-dim128/2 - trained on the same data and with the same NNLM architecture as google/nnlm-en-dim50/2, but with a larger embedding dimension. Larger embeddings can improve results on your task, but the model may take longer to train.
- google/nnlm-en-dim128-with-normalization/2 - the same as google/nnlm-en-dim128/2, but with additional text normalization, such as removing punctuation. This helps if the text in your task contains extra characters or punctuation.
- google/universal-sentence-encoder/4 - a much larger model producing 512-dimensional embeddings, trained with a deep averaging network (DAN) encoder.

And many more: look for additional text embedding models on TFHub.

3. Evaluate the model:

```python
def evaluate_model(test_data, test_labels):
    model = load_trained_model()
    # Evaluate the model
    results = model.evaluate(test_data, test_labels, verbose=2)
    print('Test accuracy:', results[1])


def load_trained_model():
    # model = get_model()
    # model.load_weights('./models/model_new1.h5')
    model = tf.keras.models.load_model('models_pb')
    return model
```

4. Try a few examples:

```python
def predict(real_data):
    model = load_trained_model()
    probabilities = model.predict([real_data])
    print('probabilities:', probabilities)
    result = get_label(probabilities)
    return result


def get_label(probabilities):
    index = np.argmax(probabilities[0])
    print('index:', str(index))
    result_str = index_dic.get(str(index))
    # result_str = list(index_dic.keys())[list(index_dic.values()).index(index)]
    return result_str


def predict_my_module():
    # review = "I don't like it"
    # review = "this is bad movie"
    # review = "This is good movie"
    review = "this is terrible movie"
    # review = "This isn't a great movie"
    # review = "i think this is bad movie"
    # review = "I'm not very disappointed with this movie"
    # review = "I am very happy with this movie"
    # neg: 0, positive: 1
    s = predict(review)
    print(s)


if __name__ == '__main__':
    x_train, y_train, x_test, y_test = get_dataset_to_train()
    model = get_model()
    model = train(model, x_train, y_train, x_test, y_test)
    evaluate_model(x_test, y_test)
    predict_my_module()
```

Complete code:

```python
import numpy as np
import tensorflow as tf
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.callbacks import EarlyStopping, ModelCheckpoint
import tensorflow_hub as hub

embedding_url = 'https://tfhub.dev/google/nnlm-en-dim50/2'

index_dic = {'0': 'negative', '1': 'positive'}


def get_dataset_to_train():
    train_test = np.load('dataset/train_test.npz', allow_pickle=True)
    x_train = train_test['X_train']
    y_train = train_test['y_train']
    x_test = train_test['X_test']
    y_test = train_test['y_test']
    return x_train, y_train, x_test, y_test


def get_model():
    hub_layer = hub.KerasLayer(embedding_url, input_shape=[],
                               dtype=tf.string, trainable=True)

    # Build the model
    model = Sequential([
        hub_layer,
        Dense(16, activation='relu'),
        Dropout(0.5),
        Dense(2, activation='softmax')
    ])
    print(model.summary())
    model.compile(optimizer=keras.optimizers.Adam(),
                  loss=keras.losses.SparseCategoricalCrossentropy(),
                  metrics=[keras.metrics.SparseCategoricalAccuracy()])
    return model


def train(model, train_data, train_labels, test_data, test_labels):
    train_data = [tf.compat.as_str(tf.compat.as_bytes(str(x))) for x in train_data]
    test_data = [tf.compat.as_str(tf.compat.as_bytes(str(x))) for x in test_data]
    train_data = np.asarray(train_data)  # Convert to numpy array
    test_data = np.asarray(test_data)    # Convert to numpy array
    print(train_data.shape, test_data.shape)

    early_stop = EarlyStopping(monitor='val_sparse_categorical_accuracy',
                               patience=4, mode='max', verbose=1)
    # ModelCheckpoint callback (HDF5 weights variant):
    # checkpoint = ModelCheckpoint('./models/model_new1.h5',
    #                              monitor='val_sparse_categorical_accuracy',
    #                              save_best_only=True, mode='max', verbose=1)
    checkpoint_pb = ModelCheckpoint(filepath='./models_pb/',
                                    monitor='val_sparse_categorical_accuracy',
                                    save_weights_only=False, save_best_only=True)
    history = model.fit(train_data[:2000], train_labels[:2000],
                        epochs=45, batch_size=45,
                        validation_data=(test_data, test_labels), shuffle=True,
                        verbose=1, callbacks=[early_stop, checkpoint_pb])
    print('history', history)
    return model


def evaluate_model(test_data, test_labels):
    model = load_trained_model()
    # Evaluate the model
    results = model.evaluate(test_data, test_labels, verbose=2)
    print('Test accuracy:', results[1])


def predict(real_data):
    model = load_trained_model()
    probabilities = model.predict([real_data])
    print('probabilities:', probabilities)
    result = get_label(probabilities)
    return result


def get_label(probabilities):
    index = np.argmax(probabilities[0])
    print('index:', str(index))
    result_str = index_dic.get(str(index))
    return result_str


def load_trained_model():
    # model = get_model()
    # model.load_weights('./models/model_new1.h5')
    model = tf.keras.models.load_model('models_pb')
    return model


def predict_my_module():
    review = "this is terrible movie"
    # neg: 0, positive: 1
    s = predict(review)
    print(s)


if __name__ == '__main__':
    x_train, y_train, x_test, y_test = get_dataset_to_train()
    model = get_model()
    model = train(model, x_train, y_train, x_test, y_test)
    evaluate_model(x_test, y_test)
    predict_my_module()
```
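The contraction-splitting step in clean_str above is easy to get subtly wrong, so here is a standalone sketch of the same idea on a reduced subset of the rules. This is illustrative only: split_contractions is a made-up name, and the real clean_str applies more substitutions.

```python
import re

def split_contractions(s):
    # Split common English contractions into separate tokens,
    # mirroring clean_str above (e.g. "it's" -> "it 's").
    s = re.sub(r"\'s", " 's", s)      # it's -> it 's
    s = re.sub(r"n\'t", " n't", s)    # isn't -> is n't
    s = re.sub(r"\'re", " 're", s)    # they're -> they 're
    s = re.sub(r"\s{2,}", " ", s)     # collapse repeated whitespace
    return s.strip().lower()

print(split_contractions("It's a movie, isn't it?"))
# -> it 's a movie, is n't it?
```

The order matters: "n't" must be split before any generic apostrophe handling so that "isn't" becomes "is n't" rather than "isn 't".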
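generate_dataset above shuffles with a fixed seed before splitting so that the train/test split is reproducible. A minimal pure-Python sketch of the same idea, with illustrative names and without NumPy or scikit-learn:

```python
import random

def shuffled_split(items, labels, test_ratio=0.3, seed=1):
    # Apply the same random permutation to items and labels,
    # then hold out roughly test_ratio of them as the test set.
    idx = list(range(len(items)))
    random.Random(seed).shuffle(idx)
    items = [items[i] for i in idx]
    labels = [labels[i] for i in idx]
    n_test = round(len(items) * test_ratio)
    return items[:-n_test], items[-n_test:], labels[:-n_test], labels[-n_test:]

x = list("abcdefghij")
y = list(range(10))
x_tr, x_te, y_tr, y_te = shuffled_split(x, y)
print(len(x_tr), len(x_te))  # -> 7 3
```

The key point is that items and labels are permuted with the same index list, so each review keeps its label through the shuffle.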
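The get_label helper above turns the softmax output into a class name by taking the argmax and looking it up in index_dic. Since the lookup uses index_dic.get(str(index)), the dictionary must be keyed by strings. A minimal sketch of that logic, assuming string keys and using a pure-Python argmax so it runs without the trained model:

```python
index_dic = {'0': 'negative', '1': 'positive'}

def get_label(probabilities):
    # Pick the class with the highest probability and map it to its name.
    row = probabilities[0]
    index = max(range(len(row)), key=lambda i: row[i])  # argmax without NumPy
    return index_dic.get(str(index))

print(get_label([[0.2, 0.8]]))  # -> positive
```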