In the Linux kernel, the following vulnerability has been resolved:
net/mlx5e: Avoid field-overflowing memcpy()
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields.
Use flexible arrays instead of zero-element arrays (which look like they are always overflowing) and split the cross-field memcpy() into two halves that can be appropriately bounds-checked by the compiler.
We were doing:
#define ETH_HLEN 14
#define VLAN_HLEN 4
...
#define MLX5E_XDP_MIN_INLINE (ETH_HLEN + VLAN_HLEN)
...
struct mlx5e_tx_wqe *wqe = mlx5_wq_cyc_get_wqe(wq, pi);
...
struct mlx5_wqe_eth_seg *eseg = &wqe->eth;
struct mlx5_wqe_data_seg *dseg = wqe->data;
...
memcpy(eseg->inline_hdr.start, xdptxd->data, MLX5E_XDP_MIN_INLINE);
target is wqe->eth.inlinehdr.start (which the compiler sees as being 2 bytes in size), but copying 18, intending to write across start (really vlantci, 2 bytes). The remaining 16 bytes get written into wqe->data[0], covering byte_count (4 bytes), lkey (4 bytes), and addr (8 bytes).
struct mlx5etxwqe { struct mlx5wqectrlseg ctrl; /* 0 16 */ struct mlx5wqeethseg eth; /* 16 16 / struct mlx5_wqe_data_seg data[]; / 32 0 */
/* size: 32, cachelines: 1, members: 3 */
/* last cacheline: 32 bytes */
};
struct mlx5wqeethseg { u8 swpouterl4offset; /* 0 1 / u8 swp_outer_l3_offset; / 1 1 / u8 swp_inner_l4_offset; / 2 1 / u8 swp_inner_l3_offset; / 3 1 / u8 cs_flags; / 4 1 / u8 swp_flags; / 5 1 / __be16 mss; / 6 2 / __be32 flow_table_metadata; / 8 4 / union { struct { __be16 sz; / 12 2 / u8 start[2]; / 14 2 / } inline_hdr; / 12 4 / struct { __be16 type; / 12 2 / __be16 vlan_tci; / 14 2 / } insert; / 12 4 / __be32 trailer; / 12 4 / }; / 12 4 */
/* size: 16, cachelines: 1, members: 9 */
/* last cacheline: 16 bytes */
};
struct mlx5wqedataseg { _be32 bytecount; /* 0 4 */ _be32 lkey; /* 4 4 / __be64 addr; / 8 8 */
/* size: 16, cachelines: 1, members: 3 */
/* last cacheline: 16 bytes */
};
So, split the memcpy() so the compiler can reason about the buffer sizes.
"pahole" shows no size nor member offset changes to struct mlx5etxwqe nor struct mlx5eumrwqe. "objdump -d" shows no meaningful object code changes (i.e. only source line number induced differences and optimizations).