
另外html和Python都支持使用单引号来表示字符串。这样你可以吧html中统一用单引号‘,python用双引号,或者反过来。
import os,redef check_flag(flag):
regex = re.compile(r'images\/')
result = True if regex.match(flag) else False
return result
#soup = BeautifulSoup(open('index.html'))
from bs4 import BeautifulSoup
html_content = '''
<a href="https://xxx.com">测试01</a>
<a href="https://yyy.com/123">测试02</a>
<a href="https://xxx.com">测试01</a>
<a href="https://xxx.com">测试01</a>
'''
file = open(r'favour-en.html','r',encoding="UTF-8")
soup = BeautifulSoup(file, 'html.parser')
for element in soup.find_all('img'):
if 'src' in element.attrs:
print(element.attrs['src'])
if check_flag(element.attrs['src']):
#if element.attrs['src'].find("png"):
element.attrs['src'] = "michenxxxxxxxxxxxx" +'/'+ element.attrs['src']
print("##################################")
with open('index.html', 'w',encoding="UTF-8") as fp:
fp.write(soup.prettify()) # prettify()的作⽤是将sp美化⼀下,有可读性
没法在源html上直接改。。。open这个html
for 循环遍历读到的每一行
如果发现要替换的内容
则进行替换
写到新的html里
新的html覆盖掉原来的html
欢迎分享,转载请注明来源:内存溢出
微信扫一扫
支付宝扫一扫
评论列表(0条)