GHSA-23j4-mw76-5v7h

Source
https://github.com/advisories/GHSA-23j4-mw76-5v7h
Import Source
https://github.com/github/advisory-database/blob/main/advisories/github-reviewed/2024/05/GHSA-23j4-mw76-5v7h/GHSA-23j4-mw76-5v7h.json
JSON Data
https://api.osv.dev/v1/vulns/GHSA-23j4-mw76-5v7h
Published
2024-05-14T20:14:49Z
Modified
2024-05-14T20:30:42.185054Z
Severity
  • 6.5 (Medium) CVSS_V3 - CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N
Summary
Scrapy allows redirect following in protocols other than HTTP
Details

Impact

Scrapy followed redirects regardless of the URL scheme of the redirect target, so a redirect could lead to data://, file://, ftp://, s3://, or any other scheme defined in the DOWNLOAD_HANDLERS setting.

However, HTTP redirects should only work between URLs that use the http:// or https:// schemes.
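
The rule above can be illustrated with a minimal sketch that uses only the Python standard library. This is not the code from the Scrapy fix; the function name `is_safe_redirect` and the constant `ALLOWED_REDIRECT_SCHEMES` are hypothetical names chosen for illustration.

```python
from urllib.parse import urljoin, urlparse

# Assumption: only these schemes are acceptable redirect targets.
ALLOWED_REDIRECT_SCHEMES = {"http", "https"}


def is_safe_redirect(request_url: str, location: str) -> bool:
    """Return True if following `location` from `request_url` stays on HTTP(S)."""
    # Resolve relative Location values against the URL that was requested.
    target = urljoin(request_url, location)
    # urlparse() lowercases the scheme, so "FILE://" is treated like "file://".
    return urlparse(target).scheme in ALLOWED_REDIRECT_SCHEMES


if __name__ == "__main__":
    for location in ("/next", "https://example.org/", "file:///etc/passwd", "s3://bucket/key"):
        print(location, is_safe_redirect("https://example.com/start", location))
```

A redirect handler that applied a check like this would leave the original response untouched whenever the resolved target uses any other scheme.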

A malicious actor, given write access to the start requests (e.g. ability to define start_urls) of a spider and read access to the spider output, could exploit this vulnerability to:

- Redirect to any local file using the file:// scheme to read its contents.
- Redirect to an ftp:// URL of a malicious FTP server to obtain the FTP username and password configured in the spider or project.
- Redirect to any s3:// URL to read its content using the S3 credentials configured in the spider or project.

For file:// and s3://, how the spider implements its parsing of input data into an output item determines what data would be vulnerable. A spider that always outputs the entire contents of a response would be completely vulnerable, while a spider that extracted only fragments from the response could significantly limit vulnerable data.

Patches

Upgrade to Scrapy 2.11.2.
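
To check whether an environment still runs an affected release, a rough sketch like the following may help. It assumes a simplified comparison on the leading numeric part of the version string, so a hypothetical pre-release of 2.11.2 would be reported as patched; the helper names `release_tuple` and `is_affected` are illustrative and not part of Scrapy.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# First release that contains the fix, per this advisory.
FIXED = (2, 11, 2)


def release_tuple(v: str) -> tuple:
    """Keep only the leading dotted-numeric part, e.g. "1.0.0rc1" -> (1, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", v)
    if match is None:
        raise ValueError(f"unrecognized version string: {v!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_affected(v: str) -> bool:
    """Return True if version string `v` sorts before the fixed release."""
    return release_tuple(v) < FIXED


if __name__ == "__main__":
    try:
        installed = version("scrapy")
    except PackageNotFoundError:
        print("Scrapy is not installed in this environment")
    else:
        print(installed, "is affected" if is_affected(installed) else "is patched")
```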

Workarounds

Replace the built-in redirect middlewares (RedirectMiddleware and MetaRefreshMiddleware) with custom ones that implement the fix from Scrapy 2.11.2, and verify that they work as intended.
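
Swapping the middlewares is done through the DOWNLOADER_MIDDLEWARES setting. The fragment below is a sketch of what that might look like in a project's settings.py. The `myproject.middlewares.*` paths and class names are hypothetical placeholders for custom classes that you would have to write and test yourself; the priorities 600 and 580 are assumed to match the defaults Scrapy assigns to the two built-in middlewares.

```python
# settings.py (fragment)

DOWNLOADER_MIDDLEWARES = {
    # Disable the built-in redirect middlewares.
    "scrapy.downloadermiddlewares.redirect.RedirectMiddleware": None,
    "scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware": None,
    # Hypothetical replacements that only follow http:// and https:// targets,
    # registered at the priorities the built-in ones normally use.
    "myproject.middlewares.HttpOnlyRedirectMiddleware": 600,
    "myproject.middlewares.HttpOnlyMetaRefreshMiddleware": 580,
}
```

Setting a built-in middleware's value to None disables it, so only the replacements handle redirects.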

References

This security issue was reported by @mvsantos at https://github.com/scrapy/scrapy/issues/457.

Affected packages

PyPI / scrapy

Package

Affected ranges

Type
ECOSYSTEM
Events
Introduced
0 (Unknown introduced version / All previous versions are affected)
Fixed
2.11.2

Affected versions

0.*

0.7
0.8
0.9
0.10.4.2364
0.12.0.2550
0.14.1
0.14.2
0.14.3
0.14.4
0.16.0
0.16.1
0.16.2
0.16.3
0.16.4
0.16.5
0.18.0
0.18.1
0.18.2
0.18.3
0.18.4
0.20.0
0.20.1
0.20.2
0.22.0
0.22.1
0.22.2
0.24.0
0.24.1
0.24.2
0.24.3
0.24.4
0.24.5
0.24.6

1.*

1.0.0rc1
1.0.0rc2
1.0.0rc3
1.0.0
1.0.1
1.0.2
1.0.3
1.0.4
1.0.5
1.0.6
1.0.7
1.1.0rc1
1.1.0rc2
1.1.0rc3
1.1.0rc4
1.1.0
1.1.1
1.1.2
1.1.3
1.1.4
1.2.0
1.2.1
1.2.2
1.2.3
1.3.0
1.3.1
1.3.2
1.3.3
1.4.0
1.5.0
1.5.1
1.5.2
1.6.0
1.7.0
1.7.1
1.7.2
1.7.3
1.7.4
1.8.0
1.8.1
1.8.2
1.8.3
1.8.4

2.*

2.0.0
2.0.1
2.1.0
2.2.0
2.2.1
2.3.0
2.4.0
2.4.1
2.5.0
2.5.1
2.6.0
2.6.1
2.6.2
2.6.3
2.7.0
2.7.1
2.8.0
2.9.0
2.10.0
2.10.1
2.11.0
2.11.1