In the Linux kernel, the following vulnerability has been resolved:
drm/i915/ttm: fix CCS handling
Crucible + recent Mesa seems to sometimes hit:
GEM_BUG_ON(num_ccs_blks > NUM_CCS_BLKS_PER_XFER)
And it looks like we can also trigger this with gem_lmem_swapping, if we modify the test to use slightly larger object sizes.
Looking closer it looks like we have the following issues in migrate_copy():
- We are using plain integers in various places, which we can easily overflow with a large object.
- We pass the entire object size (when the src is lmem) into emit_pte() and then try to copy it, which doesn't work, since we only have a few fixed-size windows in which to map the pages and perform the copy. With an object > 8M we therefore aren't properly copying the pages. And then with an object > 64M we trigger the GEM_BUG_ON(num_ccs_blks > NUM_CCS_BLKS_PER_XFER).
So it looks like our copy handling for any object > 8M (which is our CHUNK_SZ) is currently broken on DG2.
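As a rough illustration of the two points above (this is not the actual i915 code), the sketch below walks an object in passes of at most CHUNK_SZ bytes and tracks the remaining size in a 64-bit counter. The helper names next_chunk_len() and count_chunks() are hypothetical, and the 8M value for CHUNK_SZ is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* 8M per pass, matching the CHUNK_SZ mentioned in the text (assumed value). */
#define CHUNK_SZ (8ull << 20)

/* Hypothetical helper: bytes to handle in the next pass, never above CHUNK_SZ. */
static uint64_t next_chunk_len(uint64_t remaining)
{
	return remaining < CHUNK_SZ ? remaining : CHUNK_SZ;
}

/*
 * Hypothetical helper: number of passes needed to cover an object of
 * `size` bytes. The size is kept in a 64-bit type so that objects of
 * several GiB do not overflow the counter, unlike a plain int.
 */
static uint64_t count_chunks(uint64_t size)
{
	uint64_t remaining = size;
	uint64_t chunks = 0;

	while (remaining) {
		uint64_t len = next_chunk_len(remaining);

		/* Each pass must fit in one fixed-size window. */
		assert(len > 0 && len <= CHUNK_SZ);

		/* A real implementation would map and copy `len` bytes here. */
		remaining -= len;
		chunks++;
	}

	return chunks;
}
```

With these assumptions, a 64M object is handled as eight separate 8M passes instead of one oversized transfer, which is the property the fixed-size windows require.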
Testcase: igt@gem_lmem_swapping
(cherry picked from commit 8676145eb2f53a9940ff70910caf0125bd8a4bc2)