Python Network Programming in Practice: Fetching Data with requests and Writing It to CSV

Published: 2022-09-24 21:00:31  Author: 快盘下载

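About this example:

Use Python's requests library to fetch hotel data from 同程 (Tongcheng, ly.com) and save it to a CSV file.
Disclaimer: this example program is for learning and discussion only; do not use it for commercial purposes.
Goal: learn how to fetch data over the network and how to write CSV files.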

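Implementation steps and notes:

Configure the request headers; the values can be read from the requests the page makes in your browser (for example, in the developer tools Network panel).
Analyze the page's URLs and the content each one returns.
Fetch the data with requests and write it to a CSV file.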
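
The URL analysis step can be sketched with the standard library alone. The snippet below is a minimal illustration, not part of the original program: it splits the list page URL (the same one used as the Referer value in the code later in this article) into its host, path, and query parameters, then re-encodes the query with a different pageIndex. The variable names are chosen here for illustration.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Page URL copied from the Referer header used in this article's code.
page_url = (
    "https://www.ly.com/hotel/hotellist?pageSize=20&city=394"
    "&inDate=2022-06-05&outDate=2022-06-06&filterList=8888_1"
    "&pageIndex=6&t=1654414710837"
)

parts = urlsplit(page_url)
# parse_qsl returns (name, value) pairs; dict() indexes them by name.
query = dict(parse_qsl(parts.query))

print(parts.netloc)  # host part of the URL
print(parts.path)    # path part of the URL
for name, value in query.items():
    print(name, "=", value)

# Changing one parameter and re-encoding gives the query string for another page.
query["pageIndex"] = "1"
first_page_query = urlencode(query)
print(first_page_query)
```

Seeing the parameters listed this way makes it easier to decide which ones to pass in the params dictionary of a request and which one to vary when paging.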

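Notes:
Commonly configured request headers:

"Accept": "application/json, text/plain, */*",  // response formats the client accepts
"Accept-Encoding": "gzip, deflate, br",  // content encodings (compression) the client accepts
"Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",  // preferred languages
"Cookie": "elongUser=userid=xxxxx",  // cookie
"Referer": "https://www.ly.com/hotel/hotellist?xxxxx",  // referring page; sites check it for hotlink protection
"User-Agent": "xxxxxx"  // identifies the user's browser and operating system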

Code:

Purpose: fetch hotel data from 同程 (Tongcheng).
The relevant URL is:

url = "https://www.ly.com/tapi/v2/list"

Python code:
import requests
import os
import csv

# Request headers
hearders_param = {
    "Accept": "application/json, text/plain, */*",
    # Note: requests can only decode "br" responses if a brotli package is installed
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    "appfrom": "16",
    "Cache-Control": "no-cache",
    "cluster": "idc",
    "Connection": "keep-alive",
    "Cookie": "xxxxxx",  # use your own value
    "Host": "www.ly.com",
    "Pragma": "no-cache",
    "Referer": "https://www.ly.com/hotel/hotellist?pageSize=20&city=394&inDate=2022-06-05&outDate=2022-06-06&filterList=8888_1&pageIndex=6&t=1654414710837",
    "sec-ch-ua": '" Not A;Brand";v="99", "Chromium";v="102", "Google Chrome";v="102"',
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": '"macOS"',
    "Sec-Fetch-Dest": "empty",
    "Sec-Fetch-Mode": "cors",
    "Sec-Fetch-Site": "same-origin",
    "Tmapi-Client": "tpc",
    "traceid": "a8015495-8733-4649-8e44-fb8be958f5da",
    "User-Agent": "xxxxxx"  # use your own value
}

# Request the hotel list data
def request_data():
    # range(1, 3) requests pages 1 and 2; use range(1, 6) for pages 1 to 5
    for page in range(1, 3):
        print("page======", page)
        url = "https://www.ly.com/tapi/v2/list"
        params = {
            "pageSize": "20",
            "city": "394",
            "inDate": "2022-06-05",
            "outDate": "2022-06-06",
            "filterList": "8888_1",
            "pageIndex": page,
            "sugActInfo": "",
            "traceToken": "|*|cityId:401|*|qId:43d77d90-6614-4af7-a78b-c1a7bc6afb88|*|st:city|*|sId:401|*|scene_ids:0|*|bkt:r3|*|"
        }
        resp = requests.get(url, params=params, headers=hearders_param).json()
        print(resp)
        # Print the fields of the first hotel on the page as a quick check
        first = resp["data"]["hotelList"][0]
        print(first["hotelName"])
        print(first["hotelAddress"])
        print(first["commentScore"])
        print(first["starLevelDes"])
        print(first["hotelId"])
        print(first["commentCount"])
        print(first["discountPrice"])
        parse_response(resp)

# Parse the hotel list and request the detail data for each hotel
def parse_response(resp):
    for i in range(len(resp["data"]["hotelList"])):
        url2 = "https://www.ly.com/tapi/gethotelDetailInfo"
        params2 = {
            "indate": "2022-06-05",
            "outdate": "2022-06-06",
            "hotelid": resp["data"]["hotelList"][i]["hotelId"],
            "couponOrder": ""
        }
        resp2 = requests.get(url2, params=params2, headers=hearders_param).json()
        print(resp2)
        print(resp2["data"]["hotelOpenDate"])
        print(resp2["data"]["roomNum"])

        write_file(resp, i, resp2)

# Write the data to a CSV file
def write_file(resp, i, resp2):
    path_file_name = "data1.csv"  # CSV file name
    hotel = resp["data"]["hotelList"][i]
    detail = resp2["data"]
    row = [hotel["hotelName"], hotel["hotelAddress"], hotel["commentScore"],
           hotel["starLevelDes"], hotel["commentCount"], hotel["discountPrice"],
           detail["hotelOpenDate"], detail["roomNum"], hotel["hotelId"]]
    # Check before opening: mode "a" creates the file if it does not exist
    is_new_file = not os.path.exists(path_file_name)
    with open(path_file_name, "a", encoding="utf_8_sig", newline="") as csvfile:
        writer = csv.writer(csvfile)
        if is_new_file:
            print("Creating the file and writing the header")
            # Nine columns, matching the nine values in each row
            writer.writerow(["Hotel name", "Address", "Rating", "Type", "Review count",
                             "Price", "Opening date", "Room count", "Hotel ID"])
        writer.writerow(row)
        print("%s written" % hotel["hotelName"])

# Main entry point
if __name__ == "__main__":
    request_data()
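
The code above indexes the response directly, so an empty hotel list or a missing field stops the run with an exception. The following is a minimal sketch of a more tolerant extraction step, assuming only the field names that the program reads. The helper hotel_rows, the LIST_FIELDS constant, and the sample dictionary are hypothetical and made up for illustration; the sample is not real API output.

```python
# Made-up sample shaped like the fields the article reads; NOT real API output.
sample_list_resp = {
    "data": {
        "hotelList": [
            {"hotelName": "Hotel A", "hotelAddress": "1 Example Road",
             "commentScore": "4.6", "hotelId": "1001"},
            {"hotelName": "Hotel B", "hotelId": "1002"},  # several fields missing
        ]
    }
}

# Field names taken from the list response keys used in the article's code.
LIST_FIELDS = ["hotelName", "hotelAddress", "commentScore", "starLevelDes",
               "commentCount", "discountPrice", "hotelId"]


def hotel_rows(resp):
    """Yield one list of column values per hotel, using "" for missing fields."""
    data = resp.get("data") or {}
    for hotel in data.get("hotelList") or []:
        yield [hotel.get(field, "") for field in LIST_FIELDS]


rows = list(hotel_rows(sample_list_resp))
print(rows)
print(list(hotel_rows({"data": None})))  # no hotels: empty list, no exception
```

Whether a blank cell or a skipped row is the better response to a missing field depends on how the CSV will be used; the sketch picks blank cells so that every row keeps the same number of columns.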

 
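Code notes:
 hearders_param: the request header configuration, passed in with each requests call.
 request_data: the function that requests the hotel list pages.
 parse_response: walks the list response and requests the detail data for each hotel.
 write_file: writes the required fields to the CSV file.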
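
As an alternative to tracking column order by hand in write_file, the standard library's csv.DictWriter writes rows from dictionaries and can emit the header itself. The sketch below is one possible variant, not the article's method: append_row and FIELDNAMES are hypothetical names, the hotel values are made up, and a temporary directory stands in for the real output location.

```python
import csv
import os
import tempfile

FIELDNAMES = ["hotelName", "hotelAddress", "commentScore"]


def append_row(path, row):
    """Append one dict as a CSV row, writing the header only for a new file."""
    is_new = not os.path.exists(path)  # check before open(): mode "a" creates the file
    with open(path, "a", encoding="utf_8_sig", newline="") as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=FIELDNAMES)
        if is_new:
            writer.writeheader()
        writer.writerow(row)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "hotels.csv")
    append_row(path, {"hotelName": "Hotel A", "hotelAddress": "1 Example Road",
                      "commentScore": "4.6"})
    append_row(path, {"hotelName": "Hotel B", "hotelAddress": "2 Example Road",
                      "commentScore": "4.2"})
    # Read the file back to confirm there is one header line and two data rows.
    with open(path, encoding="utf_8_sig", newline="") as csvfile:
        saved = list(csv.reader(csvfile))

print(saved)
```

Because the header comes from the same FIELDNAMES list that orders each row, the header and the data columns cannot drift apart when a field is added.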
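Supplement:
 CSV file characteristics:
 CSV is a general-purpose, relatively simple file format; a CSV file is plain text. It can be converted to and from structured formats such as Excel spreadsheets, which makes the data easy to view and use.
 "CSV" is not a single, well-defined format (although RFC 4180 provides a commonly used definition). In practice, the term "CSV" refers to any file with the following characteristics:
 It is plain text in some character set, such as ASCII, Unicode, EBCDIC, or GB2312.
 It consists of records (typically one record per line).
 Each record is split into fields by a delimiter (typically a comma, semicolon, or tab; sometimes the delimiter may include optional spaces).
 Every record has the same sequence of fields.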
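
The role of the delimiter can be seen in a small experiment with Python's csv module. This is an illustrative sketch with made-up values: the same record is written once with the default comma delimiter and once with a tab delimiter, and the field that itself contains a comma is quoted only in the comma-delimited output.

```python
import csv
import io

record = ["Hotel A", "1 Example Road, Building 2", "4.6"]  # made-up values

# Comma-delimited: the field containing a comma is wrapped in quotes.
comma_buf = io.StringIO()
csv.writer(comma_buf).writerow(record)
comma_text = comma_buf.getvalue()

# Tab-delimited: the same field needs no quoting.
tab_buf = io.StringIO()
csv.writer(tab_buf, delimiter="\t").writerow(record)
tab_text = tab_buf.getvalue()

print(repr(comma_text))
print(repr(tab_text))

# Reading back with the matching delimiter recovers the original fields.
from_comma = next(csv.reader(io.StringIO(comma_text)))
from_tab = next(csv.reader(io.StringIO(tab_text), delimiter="\t"))
print(from_comma == record, from_tab == record)
```

A reader has to be told which delimiter a file uses; parsing the tab-delimited text with the default comma delimiter would split the address field in the wrong place.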


