
You can use requests and bs4 to fetch the data. Almost all (if not all) ASP.NET sites require a few extra POST parameters, such as __EVENTTARGET and __EVENTVALIDATION:

from bs4 import BeautifulSoup
import requests

# Form fields that drive the grid's asynchronous postback.
data = {
    "__EVENTTARGET": "ctl00$ContentPlaceHolder$ctl00$ctl00$RadAjaxPanel_GV",
    "__EVENTARGUMENT": "LISTINGS;0",
    "ctl00$ContentPlaceHolder$ctl00$ctl00$ctl00$hdnProductID": "139",
    "ctl00$ContentPlaceHolder$ctl00$ctl00$hdnProductID": "139",
    "ctl00$ContentPlaceHolder$ctl00$ctl00$drpSortField": "Listing Number",
    "ctl00$ContentPlaceHolder$ctl00$ctl00$drpSortDirection": "A-Z, Low-High",
    "__ASYNCPOST": "true",
}

For the actual POST, we need to add a few more values to the form data. These are read from the hidden fields of the page itself:

post = "https://seahawks.strmarketplace.com/Charter-Seat-Licenses/Charter-Seat-Licenses.aspx"

with requests.Session() as s:
    s.headers.update({"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:47.0) Gecko/20100101 Firefox/47.0"})
    # Load the page first so the hidden state fields can be read from it.
    soup = BeautifulSoup(s.get(post).content, "html.parser")
    data["__VIEWSTATEGENERATOR"] = soup.select_one("#__VIEWSTATEGENERATOR")["value"]
    data["__EVENTVALIDATION"] = soup.select_one("#__EVENTVALIDATION")["value"]
    data["__VIEWSTATE"] = soup.select_one("#__VIEWSTATE")["value"]
    # Post the form data in the same session and parse the response.
    r = s.post(post, data=data)
    soup2 = BeautifulSoup(r.content, "html.parser")
    table = soup2.select_one("div.GridListings")
    print(table)

When you run the code, you will see the table printed.
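The hidden-field step can also be tried offline. The following is only a minimal sketch: it swaps bs4 for the standard library's html.parser so it runs without any network access or third-party packages, and the names HiddenFieldParser, extract_hidden_fields and SAMPLE_HTML, along with the field values in the sample markup, are made up for illustration. It collects every hidden input into a dict instead of looking up each field by id.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <input ... />.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of all hidden form fields found in the given HTML."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


# Hypothetical markup standing in for a real page; the values are invented.
SAMPLE_HTML = """
<form method="post" action="./Example.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="dDwtMTA4" />
  <input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="CA0B0334" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="wEWAgK" />
  <input type="text" name="search" value="ignored" />
</form>
"""

hidden = extract_hidden_fields(SAMPLE_HTML)

# Start from the fields you set yourself, then merge in the page's state fields.
payload = {"__EVENTTARGET": "", "__EVENTARGUMENT": ""}
payload.update(hidden)
print(sorted(payload))
```

Collecting all hidden inputs at once means the payload does not silently miss a state field that a given page happens to add, at the cost of sending fields you may not strictly need.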