[Untitled]_python

Scraping zbj.com (猪八戒网)
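I wrote this code while following a video on Bilibili, but the code in the video has some problems. For example, under Python 3.8 I could not use from lxml import etree. My understanding was that lxml dropped etree after some version, although lxml 4.8.0 still ships lxml.etree, so the real cause may be an IDE import warning or a broken install. Either way, the video's code did not run for me as written.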

The versions I used are Python 3.8.9, lxml 4.8.0, and requests 2.27.1.
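Since the import problem seems to depend on the environment, it may help to print the installed versions first. The following is a minimal sketch using only the standard library (importlib.metadata is available from Python 3.8); the helper name installed_version is an illustrative assumption, not something from the video.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("python", sys.version.split()[0])
for name in ("lxml", "requests"):
    print(name, installed_version(name))

# Try the direct import that the video uses; report instead of crashing.
try:
    from lxml import etree  # noqa: F401
    print("from lxml import etree: ok")
except ImportError as exc:
    print("from lxml import etree failed:", exc)
```

If the direct import succeeds here but the editor still flags it, the warning probably comes from the IDE rather than from lxml itself.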

The code is as follows:

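import requests
from lxml import html

# Workaround for the import problem described above: reach etree through lxml.html.
# (It is not used below; html.fromstring does the parsing.)
etree = html.etree

url = "https://shanghai.zbj.com/search/f/"
input_content = input("Enter what you want to search for: ")
# The search keyword is sent as the "kw" query parameter.
param = {
    "kw": input_content
}
resp = requests.get(url, params=param)
doc = html.fromstring(resp.text)
# Each matched div is expected to be one listing on the results page.
divs = doc.xpath("/html/body/div[6]/div/div/div[2]/div[5]/div[1]/div")
for div in divs:
    # xpath() with text() returns a list of strings, so price and place print as lists.
    price = div.xpath("./div/div/a[2]/div[2]/div[1]/span[1]/text()")
    # The title text appears to come back in pieces split around the keyword,
    # so the pieces are joined back together with the keyword itself.
    title = input_content.join(div.xpath("./div/div/a[2]/div[2]/div[2]/p/text()"))
    place = div.xpath("./div/div/a[1]/div[1]/div/span/text()")
    print(price, title, place)
resp.close()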
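Because xpath() with text() returns a list, the script above prints price and place as Python lists. A small helper can pick out the first non-blank string instead. This is only a sketch: first_text is a hypothetical helper, and the sample values are made up to mimic the shape of the xpath results, not real data from the site.

```python
def first_text(values, default=""):
    """Return the first non-blank string in values (stripped), or default."""
    for value in values:
        text = value.strip()
        if text:
            return text
    return default


# Hypothetical sample values, shaped like what div.xpath(".../text()") returns.
sample_price = ["  ¥500 "]
sample_place = []  # an empty list is what xpath returns when nothing matches

print(first_text(sample_price), first_text(sample_place, default="unknown"))
```

In the loop above, price = first_text(div.xpath(...)) would then yield a plain string rather than a list.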

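One more thing: this site behaves a little oddly (probably a mistake on my side).
Take the .../div[2]/... step in the path: when input_content is java, div[2] seems to return nothing, and the div[2] in doc.xpath has to be changed to div[3].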
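One way to cope with the div[2] versus div[3] difference is to try several candidate paths and keep the first one that matches. The sketch below assumes the only difference between pages is that one index; first_matching and CANDIDATES are hypothetical names, and the second path is simply the div[3] variant described above.

```python
from typing import Callable, Iterable, List


def first_matching(query: Callable[[str], List], candidates: Iterable[str]) -> List:
    """Run each XPath through query and return the first non-empty result."""
    for path in candidates:
        nodes = query(path)
        if nodes:
            return nodes
    return []


# The original path first, then the div[3] variant (an assumption based on the note above).
CANDIDATES = [
    "/html/body/div[6]/div/div/div[2]/div[5]/div[1]/div",
    "/html/body/div[6]/div/div/div[3]/div[5]/div[1]/div",
]

# With the script above, this would replace the single doc.xpath(...) call:
#     divs = first_matching(doc.xpath, CANDIDATES)
```

Absolute paths like these still break whenever the page layout changes, so this only covers the one case observed here.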

Feel free to share; when reposting, please credit the source: 内存溢出

Original URL: https://54852.com/langs/942544.html
