If you manually define cookies on a Request object, and that Request object gets a redirect response, the new Request object scheduled to follow the redirect keeps those user-defined cookies, regardless of the target domain.
Upgrade to Scrapy 2.6.0, which resets cookies when creating Request objects to follow redirects¹, and drops a manually defined Cookie header if the domain name of the redirect target URL does not match the domain name of the source URL².
If you are using Scrapy 1.8 or a lower version, and upgrading to Scrapy 2.6.0 is not an option, you may upgrade to Scrapy 1.8.2 instead.
¹ At that point the original, user-set cookies have been processed by the cookie middleware into the global or request-specific cookiejar, with their domain restricted to the domain of the original URL, so when the cookie middleware processes the new (redirect) request it will incorporate those cookies into the new request as long as the domain of the new request matches the domain of the original request.
² This prevents cookie leaks to unintended domains even if the cookie middleware is not used.
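The domain check described in footnote ² can be illustrated with a minimal sketch. This is not Scrapy's actual code: the function name `headers_for_redirect` and the use of a plain dictionary for headers are assumptions made for illustration only.

```python
from urllib.parse import urlparse


def headers_for_redirect(source_url, redirect_url, headers):
    """Return a copy of headers, dropping Cookie when the host name changes.

    Hypothetical helper: it only mirrors the idea of comparing the source
    and redirect host names before forwarding a manually set Cookie header.
    """
    new_headers = dict(headers)
    if urlparse(source_url).hostname != urlparse(redirect_url).hostname:
        # Header names are case-insensitive, so match "cookie" in any casing.
        for name in list(new_headers):
            if name.lower() == "cookie":
                del new_headers[name]
    return new_headers
```

With this sketch, a redirect from one host to a different host loses the Cookie header, while a redirect within the same host keeps it.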
If you cannot upgrade, set your cookies using a list of dictionaries instead of a single dictionary, as described in the Request documentation, and set the right domain for each cookie.
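As a rough sketch of this workaround, the helper below converts a single cookie dictionary into the list-of-dictionaries form, with each cookie pinned to the host name of the request URL. The names `scoped_cookies` and `make_request` are hypothetical, and choosing the URL's own host name as the cookie domain is an assumption; adjust the domain if your cookies belong to a parent domain.

```python
from urllib.parse import urlparse


def scoped_cookies(url, cookies):
    """Turn a {name: value} dict into a list of cookie dicts bound to url's host.

    Hypothetical helper: each entry uses the name, value, domain and path
    keys accepted by the cookies argument of Request.
    """
    host = urlparse(url).hostname
    return [
        {"name": name, "value": value, "domain": host, "path": "/"}
        for name, value in cookies.items()
    ]


def make_request(url, cookies):
    """Build a Request whose cookies are explicitly restricted to url's host."""
    import scrapy  # imported lazily so scoped_cookies works without Scrapy

    return scrapy.Request(url, cookies=scoped_cookies(url, cookies))
```

For example, `make_request("https://shop.example/login", {"session": "abc"})` would attach a `session` cookie whose domain is `shop.example`, rather than a cookie with no domain restriction.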
Alternatively, you can disable cookies altogether, or limit target domains to domains that you trust with all your user-set cookies.
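Disabling cookies can be done through the COOKIES_ENABLED setting. The fragment below is a minimal sketch that assumes a standard project settings module, and that cookies are only set through the cookies argument of Request, not through a manually defined Cookie header.

```python
# settings.py (project settings module)

# Turn off the cookie middleware so that cookies passed to Request
# through its cookies argument are not sent at all.
COOKIES_ENABLED = False
```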
If you have any questions or comments about this advisory:
* Open an issue
* Email us