Label Studio, an open source data labeling tool, had a remote import feature that allowed users to import data from a remote web source; the data was downloaded and could be viewed on the website. Prior to version 1.10.1, this feature could have been abused to download an HTML file that executed malicious JavaScript code in the context of the Label Studio website. Executing arbitrary JavaScript could result in an attacker performing malicious actions on Label Studio users if they visited the crafted file. For example, an attacker could craft a JavaScript payload that adds a new Django Super Administrator user if a Django administrator visits the file.
In data_import/uploader.py, lines 125C5 through 146 showed that if a URL passed the server-side request forgery verification checks, the contents of the file would be downloaded using the filename in the URL. The downloaded file path could then be retrieved by sending a request to /api/projects/{project_id}/file-uploads?ids=[{download_id}], where {project_id} was the ID of the project and {download_id} was the ID of the downloaded file. Once the downloaded file path had been retrieved from that API endpoint, data_import/api.py lines 595C1 through 616C62 demonstrated that the Content-Type of the response was determined by the file extension, since mimetypes.guess_type guesses the Content-Type based on the file extension. Because the Content-Type was determined by the file extension of the downloaded file, an attacker could import an .html file that would execute JavaScript when visited.
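The extension-driven behavior described above can be illustrated with a minimal sketch. This is not the code from data_import/api.py; the helper name guessed_content_type and the fallback value are assumptions made for illustration. Only mimetypes.guess_type comes from the advisory.

```python
import mimetypes


def guessed_content_type(filename: str) -> str:
    """Hypothetical helper: derive a Content-Type from the file name alone.

    mimetypes.guess_type looks only at the extension, never at the file
    contents, so whoever controls the file name controls the result.
    """
    content_type, _encoding = mimetypes.guess_type(filename)
    # The fallback for unknown extensions is an assumption of this sketch.
    return content_type or "application/octet-stream"


if __name__ == "__main__":
    # A file name taken from an attacker-supplied URL ending in .html
    # is served as an HTML document, so the browser runs its scripts.
    print(guessed_content_type("payload.html"))
```

Under these assumptions, a file imported with an .html name would be returned with a text/html Content-Type and rendered by the browser in the site's origin.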
Version 1.10.1 contains a patch for this issue. Other remediation strategies are also available. For all user-provided files that are downloaded by Label Studio, set the Content-Security-Policy: sandbox; response header when they are viewed on the site. The sandbox directive restricts a page's actions to prevent popups and the execution of plugins and scripts, and enforces a same-origin policy. Alternatively, restrict the allowed file extensions that may be downloaded.
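Both remediation strategies can be sketched in a few lines of framework-independent Python. This is an illustrative sketch only and not the patch shipped in version 1.10.1; the names ALLOWED_EXTENSIONS, is_allowed_import_url, and headers_for_uploaded_file, as well as the particular extensions listed, are assumptions.

```python
import mimetypes
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would choose its own set.
ALLOWED_EXTENSIONS = {".csv", ".json", ".txt", ".png", ".jpg"}


def is_allowed_import_url(url: str) -> bool:
    """Accept a remote import only if the URL path ends in an allowed extension."""
    # Use only the path component so query strings cannot smuggle an extension.
    extension = PurePosixPath(urlparse(url).path).suffix.lower()
    return extension in ALLOWED_EXTENSIONS


def headers_for_uploaded_file(filename: str) -> dict:
    """Build response headers for serving a user-provided file."""
    content_type, _encoding = mimetypes.guess_type(filename)
    return {
        "Content-Type": content_type or "application/octet-stream",
        # Sandboxes the document so scripts in it do not execute.
        "Content-Security-Policy": "sandbox;",
    }


if __name__ == "__main__":
    print(is_allowed_import_url("https://example.com/data/tasks.json"))
    print(is_allowed_import_url("https://attacker.example/payload.html"))
    print(headers_for_uploaded_file("payload.html"))
```

In a Django view, the returned dictionary would be copied onto the response object before it is sent. The two measures are complementary: the allowlist stops unexpected file types from being stored at all, while the sandbox header limits the damage if an HTML file is served anyway.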