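Preface
This article is part of a series on learning Scrapy; this chapter covers Scrapy's exceptions.
This article is the author's original work; please credit the source when reposting.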
Overview of built-in exceptions
DropItem
exception scrapy.exceptions.DropItem
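The exception that must be raised by item pipeline stages to stop processing an Item. For more information, see the Item Pipeline article in this series (Scrapy Crawler Learning Series, Part 9: Item Pipeline).
Put differently: to abandon an Item partway through the pipeline, a stage raises DropItem, and the Item is not passed on to later pipeline components.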
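A minimal sketch of a pipeline stage that drops items may make this concrete. The class name `PricePipeline` and the `price` field are made up for illustration, and the `ImportError` fallback is only there so the snippet can be tried without Scrapy installed.

```python
try:
    from scrapy.exceptions import DropItem
except ImportError:  # stand-in so the sketch runs without Scrapy installed
    class DropItem(Exception):
        pass


class PricePipeline:
    """Hypothetical pipeline stage: drops items that have no price."""

    def process_item(self, item, spider):
        if not item.get("price"):
            # Raising DropItem stops processing of this item; later
            # pipeline components never see it.
            raise DropItem(f"Missing price in {item!r}")
        # Returning the item hands it to the next pipeline component.
        return item
```

An item with a price is returned unchanged, while an item without one causes `DropItem` to be raised.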
CloseSpider
exception scrapy.exceptions.CloseSpider(reason='cancelled')
This exception can be raised from a spider callback to request that the current spider be closed and stop running.
Parameters: reason (str) – the reason for closing
def parse_page(self, response):
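The callback above is cut off in the source. The following is a sketch of what such a callback could look like, not the author's original code. The trigger text "Bandwidth exceeded", the reason string, and the holder class `PageCallbacks` are assumptions. In a real project the method would sit on a `scrapy.Spider` subclass, and the `ImportError` fallback only exists so the snippet runs without Scrapy installed.

```python
try:
    from scrapy.exceptions import CloseSpider
except ImportError:  # stand-in so the sketch runs without Scrapy installed
    class CloseSpider(Exception):
        def __init__(self, reason="cancelled"):
            super().__init__()
            self.reason = reason


class PageCallbacks:
    """Hypothetical holder for a spider callback."""

    def parse_page(self, response):
        # Assumed stop condition: the site reports that we are over quota.
        if "Bandwidth exceeded" in response.text:
            # The argument becomes the spider's close reason.
            raise CloseSpider("bandwidth_exceeded")
        return []  # a real callback would yield items or requests here
```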
IgnoreRequest
exception scrapy.exceptions.IgnoreRequest
This exception can be raised by the Scheduler or any downloader middleware to indicate that the request should be ignored.
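As an illustration, a downloader middleware might raise IgnoreRequest for URLs it does not want fetched. This is only a sketch: the class name `BlocklistDownloaderMiddleware` and the blocked path fragments are invented, and the `ImportError` fallback is there so the snippet can be run without Scrapy installed.

```python
try:
    from scrapy.exceptions import IgnoreRequest
except ImportError:  # stand-in so the sketch runs without Scrapy installed
    class IgnoreRequest(Exception):
        pass


class BlocklistDownloaderMiddleware:
    """Hypothetical downloader middleware that skips blocklisted URLs."""

    # Assumed path fragments, purely for illustration.
    BLOCKED_SUBSTRINGS = ("/logout", "/admin")

    def process_request(self, request, spider):
        if any(part in request.url for part in self.BLOCKED_SUBSTRINGS):
            # Tell Scrapy to skip this request instead of downloading it.
            raise IgnoreRequest(f"Blocked URL: {request.url}")
        # Returning None lets Scrapy keep processing the request normally.
        return None
```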
NotConfigured
exception scrapy.exceptions.NotConfigured
This exception can be raised by some components to indicate that they will remain disabled; in other words, it signals that the component is currently unavailable. Those components include:
- Extensions
- Item pipelines
- Downloader middlewares
- Spider middlewares
The exception must be raised in the component's __init__ method.
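A small sketch of an extension that disables itself this way is shown below. The extension name `ItemCounterExtension` and the setting `ITEM_COUNTER_ENABLED` are hypothetical, and the `ImportError` fallback only exists so the snippet can be tried without Scrapy installed.

```python
try:
    from scrapy.exceptions import NotConfigured
except ImportError:  # stand-in so the sketch runs without Scrapy installed
    class NotConfigured(Exception):
        pass


class ItemCounterExtension:
    """Hypothetical extension, active only when ITEM_COUNTER_ENABLED is set."""

    def __init__(self, settings):
        # Raised inside __init__, as the text above requires, so the
        # component stays disabled when the setting is missing or falsy.
        if not settings.get("ITEM_COUNTER_ENABLED"):
            raise NotConfigured("ITEM_COUNTER_ENABLED is not set")
        self.count = 0

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy builds components through from_crawler when it is defined.
        return cls(crawler.settings)
```

In a real project, `settings.getbool("ITEM_COUNTER_ENABLED")` is probably a better fit than `get`, since it also handles string values such as "0" or "False".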
NotSupported
exception scrapy.exceptions.NotSupported
This exception is raised to indicate an unsupported feature.
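Project code can raise the same exception for features it does not implement. The helper below is purely hypothetical (neither `export_items` nor its `fmt` parameter comes from Scrapy) and only sketches the pattern. The `ImportError` fallback lets the snippet run without Scrapy installed.

```python
import json

try:
    from scrapy.exceptions import NotSupported
except ImportError:  # stand-in so the sketch runs without Scrapy installed
    class NotSupported(Exception):
        pass


def export_items(items, fmt):
    """Hypothetical helper: only the 'json' format is implemented."""
    if fmt != "json":
        # Signal clearly that the requested feature is not available.
        raise NotSupported(f"Unsupported export format: {fmt}")
    return json.dumps(list(items))
```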