Prior to Spark 2.3.3, in certain situations Spark would write user data to local disk unencrypted, even if spark.io.encryption.enabled=true. The affected cases are: cached blocks that are fetched to disk (controlled by spark.maxRemoteBlockSizeFetchToMem); parallelize in SparkR; broadcast and parallelize in PySpark; and Python UDFs.
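As a rough illustration of the condition described above, the following sketch flags a deployment that requests I/O encryption but runs a release older than 2.3.3. The helper names `parse_version` and `may_write_unencrypted` are hypothetical and not part of Spark; the check looks only at the version string and the spark.io.encryption.enabled setting, and does not detect whether any of the affected code paths are actually used.

```python
import re

# First release containing the fix, as stated in the advisory text.
FIXED_VERSION = (2, 3, 3)


def parse_version(version):
    """Return (major, minor, patch) from the leading digits of a version string.

    Suffixes such as "-SNAPSHOT" are ignored; a missing patch number counts as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def may_write_unencrypted(version, conf):
    """Hypothetical check: encryption is requested but the version predates the fix.

    `conf` is a plain dict of Spark configuration keys to values.
    """
    enabled = str(conf.get("spark.io.encryption.enabled", "false")).lower() == "true"
    return enabled and parse_version(version) < FIXED_VERSION
```

With a live PySpark session, the inputs could presumably be taken from `spark.version` and `dict(spark.sparkContext.getConf().getAll())`. A result of `False` for a configuration with encryption disabled only means the encryption expectation does not apply, not that data on local disk is protected.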
{ "nvd_published_at": "2019-08-07T17:15:00Z", "cwe_ids": [ "CWE-312" ], "severity": "HIGH", "github_reviewed": true, "github_reviewed_at": "2019-08-08T15:16:27Z" }