Juan Riaza - Dive into Scrapy
[EuroPython 2015]
[21 July 2015]
[Bilbao, Euskadi, Spain]
Scrapy is a fast high-level screen scraping and web crawling
framework, used to crawl websites and extract structured data from
their pages. It can be used for a wide range of purposes, from data
mining to monitoring and automated testing.
This talk shows some advanced techniques based on how Scrapy is used
at Scrapinghub.
Goals:
- Understand why it's necessary to _Scrapy-ify_ early on.
- Anatomy of a Scrapy Spider.
- Using the interactive shell.
- What are items and how to use item loaders.
- Examples of pipelines and middlewares.
- Techniques to avoid getting banned.
- How to deploy Scrapy projects.
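
To give a flavour of the pipelines goal above, here is a minimal sketch of an item pipeline. It is not taken from the talk: the class name `NormalizePricePipeline` and the `title` and `price` fields are hypothetical, chosen only for illustration. It relies on the fact that a Scrapy item pipeline is a plain Python class exposing a `process_item(self, item, spider)` method, and it assumes items behave like dicts.

```python
# Hypothetical pipeline: the class name and the "title"/"price" fields
# are illustrative assumptions, not part of the talk.
class NormalizePricePipeline:
    """Item pipeline sketch: Scrapy calls process_item() for each scraped item."""

    def process_item(self, item, spider):
        # Strip surrounding whitespace from the title, if present.
        if item.get("title") is not None:
            item["title"] = item["title"].strip()
        # Convert a price string such as "$1,299.00" to a float.
        price = item.get("price")
        if isinstance(price, str):
            item["price"] = float(price.replace("$", "").replace(",", "").strip())
        # Returning the item hands it on to the next pipeline stage.
        return item


if __name__ == "__main__":
    # Stand-alone check with a plain dict; the spider argument is unused here.
    pipeline = NormalizePricePipeline()
    print(pipeline.process_item({"title": "  Widget ", "price": "$1,299.00"}, spider=None))
```

In a real project such a class would be enabled through the `ITEM_PIPELINES` setting, for example `ITEM_PIPELINES = {"myproject.pipelines.NormalizePricePipeline": 300}`, where the module path is again an assumption.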