[Python Web Scraping] Extracting the titles/links of a specific class from a web page

A quick note for future reference.


Extracting elements with a specific class

python
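from bs4 import BeautifulSoup

# Parse the saved HTML page; naming the parser explicitly avoids a bs4 warning
with open(r"C:\Users\ouoholly\Downloads\ACS.htm", encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")

# Find every element whose class attribute includes "title"
c = soup.find_all(class_='title')
for cc in c:
    txt = cc.text
    x = txt.strip()  # trim surrounding whitespace
    print(x)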
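
To try the same lookup without a saved page, it can be run on an inline HTML string. This is only a minimal sketch: the sample markup and the `extract_titles` helper are made up for illustration and are not taken from the original ACS page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the saved page.
SAMPLE_HTML = """
<div>
  <h4 class="title"> <a href="/doi/1">First paper</a> </h4>
  <h4 class="title"><a href="/doi/2">Second paper</a></h4>
  <p class="summary">Not a title</p>
</div>
"""


def extract_titles(html):
    """Return the stripped text of every element whose class includes 'title'."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.find_all(class_="title")]


print(extract_titles(SAMPLE_HTML))  # ['First paper', 'Second paper']
```

`get_text(strip=True)` does the same job as `.text` followed by `.strip()` in the snippet above, in a single call.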

Extracting URLs

Extracting the links under h4 tags

python
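from bs4 import BeautifulSoup

# Parse the saved HTML page; naming the parser explicitly avoids a bs4 warning
with open(r"C:\Users\ouoholly\Downloads\ACS.htm", encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")

# CSS selector: <a> tags that are direct children of an <h4>
c = soup.select('h4 > a')
for cc in c:
    print(cc.get('href'))  # None if the <a> has no href attribute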
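
The `>` in the selector means "direct child", so it can miss links that are wrapped in another tag. The sketch below, on made-up markup, compares `h4 > a` with the descendant form `h4 a`, and shows `a[href]` as one way to skip anchors that have no `href`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup covering three cases.
SAMPLE_HTML = """
<h4><a href="/a">direct child</a></h4>
<h4><span><a href="/b">nested deeper</a></span></h4>
<h4><a>no href here</a></h4>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# 'h4 > a' matches only <a> tags that are direct children of an <h4>.
direct = [a.get("href") for a in soup.select("h4 > a")]

# 'h4 a' matches <a> tags at any depth inside an <h4>.
nested = [a.get("href") for a in soup.select("h4 a")]

# 'a[href]' keeps only anchors that actually carry an href attribute.
with_href = [a["href"] for a in soup.select("h4 > a[href]")]

print(direct)     # ['/a', None]
print(nested)     # ['/a', '/b', None]
print(with_href)  # ['/a']
```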

Extracting the links under a specific class

python
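from bs4 import BeautifulSoup

# Parse the saved HTML page; naming the parser explicitly avoids a bs4 warning
with open(r"C:\Users\ouoholly\Downloads\ACS.htm", encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")

# CSS selector: <a> tags that are direct children of an element with class "title"
c = soup.select('.title > a')
for cc in c:
    print(cc.get('href'))  # None if the <a> has no href attribute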
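
Links in a saved page are often relative, so it may help to collect each title together with an absolute URL. The following is a rough sketch under assumptions: `BASE_URL`, the sample markup, and the `title_links` helper are all hypothetical, and the base URL should be replaced with the site the page was actually saved from.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumed base URL, for illustration only.
BASE_URL = "https://example.com/journal/"

# Hypothetical markup standing in for the saved page.
SAMPLE_HTML = """
<h4 class="title"><a href="/doi/1">First paper</a></h4>
<h4 class="title"><a href="https://example.org/doi/2">Second paper</a></h4>
"""


def title_links(html, base_url):
    """Return (title, absolute_url) pairs for links directly under .title elements."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for a in soup.select(".title > a[href]"):
        # urljoin resolves relative hrefs and leaves absolute ones unchanged
        pairs.append((a.get_text(strip=True), urljoin(base_url, a["href"])))
    return pairs


for title, url in title_links(SAMPLE_HTML, BASE_URL):
    print(title, url)
```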


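Author: ouoholly
Article link: https://ouoholly.github.io/post/python-web-crawl-class-title-urls/
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Partial quotation and introductions are welcome (to repost an article in full, please leave a comment to ask first). When reposting or quoting, please credit the source, ouoholly 的倉庫. Thank you!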
