

news 2026/1/13 21:26:07

Huazi's contents

Introduction to Requests
Installing the requests module
Common methods
Common attributes
Introductory example
Request methods
Basic GET request
GET request with parameters
Recommended approach
Basic POST request
Adding headers
content: getting binary data
The bytes type
Getting JSON data
    First way
    Second way
The response object
Checking status codes
Advanced operations
    Session persistence
        Maintaining a session with cookies
        Maintaining a session with a Session object
    Proxy settings
    Timeout settings
    Exception handling

Introduction to Requests

Purpose: send network requests and obtain the response data.

Official documentation: https://requests.readthedocs.io/zh_CN/latest/index.html

Requests is an HTTP library written in Python on top of urllib3 and released under the Apache2 License. It is more convenient than urllib, saves a great deal of work, and fully covers the needs of HTTP testing.

Installing the requests module

Type cmd to open a command prompt, then enter:

Windows:

    pip install requests

Linux:

    sudo pip install requests

Common methods

The most commonly used methods are get and post, which send GET and POST requests respectively. Each call returns a response object.

Common attributes

The response object exposes the response source (text), the response status code (status_code) and the response URL (url).

Introductory example

    import requests

    # https://www.baidu.com/
    response = requests.get("https://www.baidu.com/")
    print(response)              # the response object
    print(response.text)         # the response body
    print(type(response.text))   # the body is of type str
    print(response.status_code)  # the response status code
    print(response.url)          # the URL that answered the request

Output (the HTML of the page is abbreviated):

    <Response [200]>
    <!DOCTYPE html>
    <!--STATUS OK--><html> <head> ... </head> <body link=#0000cc> ... </body> </html>
    <class 'str'>
    200
    https://www.baidu.com/

Request methods

    import requests

    url = "http://httpbin.org/put"
    print(requests.get(url))
    print(requests.post(url))
    print(requests.put(url))
    print(requests.delete(url))
    print(requests.head(url))
    print(requests.options(url))

Basic GET request

    import requests

    url = "http://httpbin.org/get"  # target site
    re = requests.get(url)
    print(re.status_code)
    print(re.text)
    print(type(re.text))

Output:

    200
    {
      "args": {},
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.31.0",
        "X-Amzn-Trace-Id": "Root=1-6550ee3e-1138be3d1596f4b820f87a82"
      },
      "origin": "111.18.40.246",
      "url": "http://httpbin.org/get"
    }

    <class 'str'>

GET request with parameters

    import requests

    url = "http://httpbin.org/get?age=21&name=huazi"  # target site
    re = requests.get(url)
    print(re.status_code)
    print(re.text)
    print(type(re.text))

Output:

    200
    {
      "args": {
        "age": "21",
        "name": "huazi"
      },
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.31.0",
        "X-Amzn-Trace-Id": "Root=1-6550eff0-11976ad80c73c287054a519e"
      },
      "origin": "111.18.40.239",
      "url": "http://httpbin.org/get?age=21&name=huazi"
    }

    <class 'str'>

Recommended approach

Build the parameters separately in a dictionary and pass them with params.

    import requests

    param = {
        "name": "huazi",
        "age": 10
    }
    url = "http://httpbin.org/get?age=21&name=huazi"  # target site
    re = requests.get(url, params=param)  # params carries the GET parameters
    print(re.status_code)
    print(re.text)
    print(type(re.text))

Output:

    200
    {
      "args": {
        "age": [
          "21",
          "10"
        ],
        "name": [
          "huazi",
          "huazi"
        ]
      },
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.31.0",
        "X-Amzn-Trace-Id": "Root=1-6550f2a3-7e41a0ad12af5b99601cefda"
      },
      "origin": "111.18.40.234",
      "url": "http://httpbin.org/get?age=21&name=huazi&name=huazi&age=10"
    }

    <class 'str'>

Note that the URL in this example still carries its own query string, so requests appends the params to it and httpbin reports every parameter twice. With the bare URL http://httpbin.org/get each parameter is sent once.

Basic POST request

    import requests

    url = "http://httpbin.org/post"
    d = {
        "age": 10,
        "name": "huazi"
    }
    re = requests.post(url, data=d)  # data carries the POST parameters
    print(re.status_code)
    print(re.url)
    print(re.text)

Output:

    200
    http://httpbin.org/post
    {
      "args": {},
      "data": "",
      "files": {},
      "form": {
        "age": "10",
        "name": "huazi"
      },
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Content-Length": "17",
        "Content-Type": "application/x-www-form-urlencoded",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.31.0",
        "X-Amzn-Trace-Id": "Root=1-6550f5eb-73f133fb497a4aca38ae755c"
      },
      "json": null,
      "origin": "111.18.40.243",
      "url": "http://httpbin.org/post"
    }

Adding headers

The User-Agent header identifies the browser user. If it is missing, the server may decide that the client is not an ordinary browser user but a crawler program. A typical value:

    user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36

Example:

    import requests

    # Put the parameters name and age in the dictionary params
    params = {
        "name": "tony",
        "age": 20
    }
    url = "http://httpbin.org/get"

    # Define the HTTP header information: cookie, UA and referer.
    # The standard header name is "Cookie"; "Cookies" is sent as an ordinary custom header.
    headers = {
        "User-agent": "Mozilla/5.0 (Linux; Android 8.1.0; SM-P585Y) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36",
        "referer": "https://www.abidu.com",
        "Cookies": "1234565678"
    }

    # Send the request; headers carries the disguise parameters
    res = requests.get(url=url, params=params, headers=headers)

    # Print the text of the returned object
    print(res.text)

Output:

    {
      "args": {
        "age": "20",
        "name": "tony"
      },
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Cookies": "1234565678",
        "Host": "httpbin.org",
        "Referer": "https://www.abidu.com",
        "User-Agent": "Mozilla/5.0 (Linux; Android 8.1.0; SM-P585Y) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36",
        "X-Amzn-Trace-Id": "Root=1-6550fcb0-7316ea826ef4c4664b0c1dff"
      },
      "origin": "111.18.40.215",
      "url": "http://httpbin.org/get?name=tony&age=20"
    }

content: getting binary data

    import requests

    # Target: the Baidu logo image, https://www.baidu.com/img/baidu_jgylogo3.gif
    url = "https://www.baidu.com/img/baidu_jgylogo3.gif"
    re = requests.get(url)
    print(re.text)

The result is a pile of unreadable characters, because the image bytes are being decoded as text:

    (ɨt{,w| BZaK7|MPh %n8FN:F|V1~wyr 9khlOj!s\m\AZPQ~yXRż WEz85 .Da,L vٱ#Uamf*L03]x\y2)JhiHtHKDK ;

This is where response.content is needed to get the binary data:

    import requests

    # Target: the Baidu logo image, https://www.baidu.com/img/baidu_jgylogo3.gif
    url = "https://www.baidu.com/img/baidu_jgylogo3.gif"
    re = requests.get(url)
    print(re.content)  # content: the raw binary data
    # Create baidu.gif in the current directory (the source image is a GIF)
    with open("./baidu.gif", "wb") as f:
        f.write(re.content)

The bytes type

A bytes object is a sequence of bytes. In Python, a string literal prefixed with b is of type bytes. What the bytes type is for:

1. In Python, data converted to binary is not shown directly as 010101 but is represented by a type called bytes.
2. Computers can only store binary data. Characters, images, video, music and so on must be correctly encoded into binary before they can be stored on disk.
3. Remember one sentence: in Python, a string must be encoded to bytes before it can be written to disk.

Getting JSON data

First way

Use json.loads() from the built-in json module to deserialize the JSON string into Python objects (dict, list, str, numbers, bool, None).

    import requests
    import json

    url = "http://httpbin.org/get"
    re = requests.get(url)
    a = re.text  # the JSON data, returned as str
    # Use the built-in json module
    print(a)
    dict_data = json.loads(a)  # str -> dict
    print(dict_data)
    print(type(dict_data))     # the data is now a dict
    res = dict_data["url"]
    print(res)
    host = dict_data["headers"]["Host"]
    print(host)

Second way

Use the response.json() method to convert the response body into a dict.

    import requests

    url = "http://httpbin.org/get"
    re = requests.get(url)
    dict_data = re.json()  # convert the response body into a dict
    print(dict_data)
    print(type(dict_data))

Note: both ways convert the JSON data into a dict, because values are easy to extract from a dict.

The response object

    import requests

    url = "https://www.jianshu.com"
    h = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36"
    }
    re = requests.get(url, headers=h)
    print(re.status_code)  # status code
    print(re.headers)      # response headers
    print(re.url)          # URL
    print(re.history)      # redirect history; [] means no redirect happened

Output:

    200
    {'Date': 'Sun, 12 Nov 2023 17:21:03 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Vary': 'Accept-Encoding', 'X-Frame-Options': 'SAMEORIGIN', 'X-XSS-Protection': '1; mode=block', 'X-Content-Type-Options': 'nosniff', 'ETag': 'W/"41ecb3f916a6731629ac139b5e2cc204"', 'Cache-Control': 'max-age=0, private, must-revalidate', 'Set-Cookie': 'locale=zh-CN; path=/', 'X-Request-Id': '4b3cc972-e9c3-4326-859d-d13ad5a7b556', 'X-Runtime': '0.003260', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload', 'Content-Encoding': 'gzip'}
    https://www.jianshu.com/
    []

Checking status codes

    200            request succeeded
    301, 302       request was redirected
    404            page not found
    500, 502, 503  server-side error

The names that requests maps to each status code (the lookup table behind requests.codes):

    # Informational.
    100: ("continue",),
    101: ("switching_protocols",),
    102: ("processing",),
    103: ("checkpoint",),
    122: ("uri_too_long", "request_uri_too_long"),
    200: ("ok", "okay", "all_ok", "all_okay", "all_good", "\\o/", "✓"),
    201: ("created",),
    202: ("accepted",),
    203: ("non_authoritative_info", "non_authoritative_information"),
    204: ("no_content",),
    205: ("reset_content", "reset"),
    206: ("partial_content", "partial"),
    207: ("multi_status", "multiple_status", "multi_stati", "multiple_stati"),
    208: ("already_reported",),
    226: ("im_used",),

    # Redirection.
    300: ("multiple_choices",),
    301: ("moved_permanently", "moved", "\\o-"),
    302: ("found",),
    303: ("see_other", "other"),
    304: ("not_modified",),
    305: ("use_proxy",),
    306: ("switch_proxy",),
    307: ("temporary_redirect", "temporary_moved", "temporary"),
    308: ("permanent_redirect", "resume_incomplete", "resume"),  # These 2 to be removed in 3.0

    # Client Error.
    400: ("bad_request", "bad"),
    401: ("unauthorized",),
    402: ("payment_required", "payment"),
    403: ("forbidden",),
    404: ("not_found", "-o-"),
    405: ("method_not_allowed", "not_allowed"),
    406: ("not_acceptable",),
    407: ("proxy_authentication_required", "proxy_auth", "proxy_authentication"),
    408: ("request_timeout", "timeout"),
    409: ("conflict",),
    410: ("gone",),
    411: ("length_required",),
    412: ("precondition_failed", "precondition"),
    413: ("request_entity_too_large",),
    414: ("request_uri_too_large",),
    415: ("unsupported_media_type", "unsupported_media", "media_type"),
    416: ("requested_range_not_satisfiable", "requested_range", "range_not_satisfiable"),
    417: ("expectation_failed",),
    418: ("im_a_teapot", "teapot", "i_am_a_teapot"),
    421: ("misdirected_request",),
    422: ("unprocessable_entity", "unprocessable"),
    423: ("locked",),
    424: ("failed_dependency", "dependency"),
    425: ("unordered_collection", "unordered"),
    426: ("upgrade_required", "upgrade"),
    428: ("precondition_required", "precondition"),
    429: ("too_many_requests", "too_many"),
    431: ("header_fields_too_large", "fields_too_large"),
    444: ("no_response", "none"),
    449: ("retry_with", "retry"),
    450: ("blocked_by_windows_parental_controls", "parental_controls"),
    451: ("unavailable_for_legal_reasons", "legal_reasons"),
    499: ("client_closed_request",),

    # Server Error.
    500: ("internal_server_error", "server_error", "/o\\", "✗"),
    501: ("not_implemented",),
    502: ("bad_gateway",),
    503: ("service_unavailable", "unavailable"),
    504: ("gateway_timeout",),
    505: ("http_version_not_supported", "http_version"),
    506: ("variant_also_negotiates",),
    507: ("insufficient_storage",),
    509: ("bandwidth_limit_exceeded", "bandwidth"),
    510: ("not_extended",),
    511: ("network_authentication_required", "network_auth", "network_authentication"),

Advanced operations

Session persistence
    Maintaining a session with cookies
    Maintaining a session with a Session object
Proxy settings
Timeout settings
Exception handling
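The advanced topics named above (session persistence, proxies, timeouts, exception handling) are only listed in this excerpt, so the following is a minimal sketch of how they might fit together, not the author's own code. The helper name fetch_text, the cookie name and value, and the commented-out proxy address are all assumptions made for illustration. The example deliberately avoids real network traffic: it prepares a request without sending it and then passes an invalid URL to show the error path.

```python
import requests


def fetch_text(url, session=None, timeout=5.0, proxies=None):
    """Hypothetical helper: return the body as str, or None if the request fails."""
    client = session if session is not None else requests
    try:
        # timeout is in seconds; proxies is an optional dict such as
        # {"http": "http://127.0.0.1:8080"} (hypothetical address).
        response = client.get(url, timeout=timeout, proxies=proxies)
        # Turn 4xx/5xx status codes into requests.exceptions.HTTPError.
        response.raise_for_status()
    except requests.exceptions.Timeout:
        print("request timed out:", url)
        return None
    except requests.exceptions.RequestException as exc:
        # Base class of the requests errors (connection, HTTP, invalid URL, ...).
        print("request failed:", exc)
        return None
    return response.text


# A Session keeps cookies between requests, which is what maintains a login.
session = requests.Session()
session.cookies.set("number", "123456")  # made-up cookie for illustration

# prepare_request() merges the session cookies into the request without sending it.
prepared = session.prepare_request(
    requests.Request("GET", "http://httpbin.org/cookies")
)
print(prepared.headers.get("Cookie"))

# A URL without a scheme raises MissingSchema, a RequestException subclass,
# so the helper reports the failure instead of crashing.
print(fetch_text("not-a-url", session=session))
```

Catching Timeout before the broader RequestException lets a slow server be reported separately from other failures; a real crawler might retry in the first case and give up in the second.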


http://www.yutouwan.com/news/32672/