diff options
author | Jordan Justen <[email protected]> | 2018-03-06 23:28:00 -0800 |
---|---|---|
committer | Emil Velikov <[email protected]> | 2018-03-20 16:57:25 +0000 |
commit | 611a88d4a6c89e3ab4793bea4bb78da4de76d200 (patch) | |
tree | 349dae57835302b74912f54e5bb32e4e33e95091 /src | |
parent | 412ea8789e6fba6f979bb8db386b462cdd505e10 (diff) |
intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there was actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.
Cc: <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
(cherry picked from commit 24b415270ffeef873ba4772d1b3c7c185c9b1958)
Diffstat (limited to 'src')
-rw-r--r-- | src/intel/vulkan/anv_allocator.c | 45 |
1 files changed, 28 insertions, 17 deletions
diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_allocator.c index fe14d6cfabd..a27af4eccc4 100644 --- a/src/intel/vulkan/anv_allocator.c +++ b/src/intel/vulkan/anv_allocator.c @@ -1097,24 +1097,35 @@ anv_scratch_pool_alloc(struct anv_device *device, struct anv_scratch_pool *pool, &device->instance->physicalDevice; const struct gen_device_info *devinfo = &physical_device->info; - /* WaCSScratchSize:hsw - * - * Haswell's scratch space address calculation appears to be sparse - * rather than tightly packed. The Thread ID has bits indicating which - * subslice, EU within a subslice, and thread within an EU it is. - * There's a maximum of two slices and two subslices, so these can be - * stored with a single bit. Even though there are only 10 EUs per - * subslice, this is stored in 4 bits, so there's an effective maximum - * value of 16 EUs. Similarly, although there are only 7 threads per EU, - * this is stored in a 3 bit number, giving an effective maximum value - * of 8 threads per EU. - * - * This means that we need to use 16 * 8 instead of 10 * 7 for the - * number of threads per subslice. - */ const unsigned subslices = MAX2(physical_device->subslice_total, 1); - const unsigned scratch_ids_per_subslice = - device->info.is_haswell ? 16 * 8 : devinfo->max_cs_threads; + + unsigned scratch_ids_per_subslice; + if (devinfo->is_haswell) { + /* WaCSScratchSize:hsw + * + * Haswell's scratch space address calculation appears to be sparse + * rather than tightly packed. The Thread ID has bits indicating + * which subslice, EU within a subslice, and thread within an EU it + * is. There's a maximum of two slices and two subslices, so these + * can be stored with a single bit. Even though there are only 10 EUs + * per subslice, this is stored in 4 bits, so there's an effective + * maximum value of 16 EUs. Similarly, although there are only 7 + * threads per EU, this is stored in a 3 bit number, giving an + * effective maximum value of 8 threads per EU. + * + * This means that we need to use 16 * 8 instead of 10 * 7 for the + * number of threads per subslice. + */ + scratch_ids_per_subslice = 16 * 8; + } else if (devinfo->is_cherryview) { + /* Cherryview devices have either 6 or 8 EUs per subslice, and each EU + * has 7 threads. The 6 EU devices appear to calculate thread IDs as if + * it had 8 EUs. + */ + scratch_ids_per_subslice = 8 * 7; + } else { + scratch_ids_per_subslice = devinfo->max_cs_threads; + } uint32_t max_threads[] = { [MESA_SHADER_VERTEX] = devinfo->max_vs_threads, |