Website Scraping with Python
Parsing robots.txt
- page 28
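
A minimal sketch of robots.txt parsing with the standard library's `urllib.robotparser`. It is not taken from the book: the robots.txt content, the `my-scraper` user agent, and the example.com URLs are all made up for illustration, and a real scraper would fetch the file from the target site instead of using an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example needs no network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_text):
    """Return a RobotFileParser loaded from robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "my-scraper" is an assumed user-agent name; it falls under the "*" rules.
allowed_public = parser.can_fetch("my-scraper", "https://example.com/index.html")
allowed_private = parser.can_fetch("my-scraper", "https://example.com/private/data.html")
delay = parser.crawl_delay("my-scraper")

print(allowed_public, allowed_private, delay)
```

Against a live site, `set_url()` followed by `read()` would replace the inline `parse()` call.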
Using Beautiful Soup
Used to parse HTML
- page 56 - Usage
- page 101 - Use a strainer (SoupStrainer) to parse only the data you want
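
A small sketch of the strainer idea, assuming the `bs4` package is installed. The HTML snippet is invented for illustration and this is not the book's own example. `SoupStrainer("a")` tells Beautiful Soup to build only the `<a>` elements, so the rest of the page never enters the tree.

```python
from bs4 import BeautifulSoup, SoupStrainer

# Hypothetical page content, used only for illustration.
HTML = """
<html><body>
<h1>Title</h1>
<a href="/a">A</a>
<p>text <a href="/b">B</a></p>
</body></html>
"""

# Keep only <a> tags while parsing; everything else is skipped.
only_links = SoupStrainer("a")
soup = BeautifulSoup(HTML, "html.parser", parse_only=only_links)

links = [a["href"] for a in soup.find_all("a")]
heading = soup.find("h1")  # the <h1> was filtered out during parsing

print(links, heading)
```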
Exporting the Data
- page 80 - CSV
- page 87 - JSON
- page 90 - SQLite
- page 97 - MongoDB
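
A rough sketch of three of the export targets above, using only the standard library (`csv`, `json`, `sqlite3`). The rows, the field names, and the `items` table are assumptions made up for this example, not the book's data. MongoDB is left out because it needs a third-party driver and a running server.

```python
import csv
import io
import json
import sqlite3

# Hypothetical scraped rows; real data would come from the scraper.
ROWS = [
    {"name": "Widget", "price": 9.99},
    {"name": "Gadget", "price": 19.5},
]
FIELDS = ["name", "price"]


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize rows to a JSON array."""
    return json.dumps(rows, indent=2)


def to_sqlite(rows, connection):
    """Insert rows into an 'items' table, creating it if needed."""
    connection.execute("CREATE TABLE IF NOT EXISTS items (name TEXT, price REAL)")
    connection.executemany(
        "INSERT INTO items (name, price) VALUES (:name, :price)", rows
    )
    connection.commit()


csv_text = to_csv(ROWS)
json_text = to_json(ROWS)

# An in-memory database keeps the example free of files on disk.
connection = sqlite3.connect(":memory:")
to_sqlite(ROWS, connection)
stored = connection.execute("SELECT name, price FROM items ORDER BY name").fetchall()

print(csv_text)
print(json_text)
print(stored)
```

To write to disk instead, open a file with `newline=""` for the CSV writer and pass a file path to `sqlite3.connect()`.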
Using Scrapy
- page 111
Handling JavaScript
- page 186 - Splash
- page 196 - Selenium
Cloud
- page 206 - Scrapy Cloud
- page 211 - mLab
- page 216 - PythonAnywhere