mesa.git

	Commit message	Author	Age	Files	Lines
*	intel/blorp: Use wide formats for nicely aligned stencil clears	Jason Ekstrand	2019-09-06	2	-0/+122
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the case where the stencil clear is nicely aligned, we can clear stencil much more efficiently by mapping it as a wide format (say RGBA32_UINT) and blasting out the stencil clear value with a repclear. On Unigine Heaven, this makes one stencil clear go from non-trivial to unnoticeable when looking at per-draw timings. In order for this change to work properly, ANV needs to do a bit more flushing around depth and stencil clears. i965 and iris already have the cache tracking logic to handle this so no changes are required there. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/blorp: Expose surf_fake_interleaved_msaa internally	Jason Ekstrand	2019-09-06	2	-5/+8
*	intel/blorp: Expose surf_retile_w_to_y internally	Jason Ekstrand	2019-09-06	2	-5/+8
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	blorp: Memset surface info to zero when initializing it	Jason Ekstrand	2019-09-06	1	-0/+1
\| \| \| \| \| \| \| \|	This isn't known to fix any current bugs but it does prevent a regression in a subsequent commit. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/tools: Decode PS kernels on SNB	Jason Ekstrand	2019-09-06	1	-1/+4
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/tools: Decode 3DSTATE_BINDING_TABLE_POINTERS on SNB	Jason Ekstrand	2019-09-06	1	-0/+15
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	anv: add support for vk_x11_override_min_image_count	Eric Engestrom	2019-09-06	1	-0/+3
\| \| \| \| \| \| \|	Cc: [email protected] Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	anv: add support for driconf	Eric Engestrom	2019-09-06	4	-3/+19
\| \| \| \| \| \| \| \| \|	No option is supported yet, this is just the boilerplate. Cc: [email protected] Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	anv,iris: L3ALLOC register replaces L3CNTLREG for gen12	Jordan Justen	2019-09-06	2	-7/+16
\| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/gen12: Add L3 configurations	Anuj Phogat	2019-09-06	1	-1/+12
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	anv: Bump maxComputeWorkgroupSize	Jason Ekstrand	2019-09-06	1	-4/+6
\| \| \| \| \| \|	Fixes: 9a129510f56f "anv: Bump maxComputeWorkgroupInvocations" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111552 Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel: Stop redirecting state cache to command streamer cache section	Kenneth Graunke	2019-09-06	1	-12/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This bit redirects the state cache from the unified/RO sections of the L3 cache to the "CS command buffer" section of the cache, which would be set up via TCCNTLREG. The documentation says: "Additionaly, this redirection should be enabled only if there is a non-zero allocation for the CS command buffer section." We don't allocate any cache to the CS command buffer section, so enabling this redirection effectively disabled the state cache. The Windows driver only sets up that section when using POSH, which we do not currently use. So, leave it unallocated and disable the redirection to get a functional state cache again. Improves performance in Civilization VI by 18%, Manhattan 3.0 by 6%, and Car Chase by 2%.
*	Revert "intel/fs: Move the scalar-region conversion to the generator."	Jason Ekstrand	2019-09-06	4	-5/+5
\| \| \| \| \| \| \| \| \| \|	This reverts commit c0504569eac5e5c305e9f0c240e248aca9d8891f. Now that we're doing interpolation lowering in NIR, we can continue to stride the FS input registers directly in the brw_fs_nir code like we did before. This fixes SIMD32 fragment shaders which broke because lower_simd_width depended on the 0 stride to split PLN instructions correctly. Reviewed-by: Francisco Jerez <[email protected]>
*	intel/fs: Fix FB write inst groups	Jason Ekstrand	2019-09-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This commit does two things. First, it simplifies the way we compute the FB write group bit. There's no reason to use a ternary because inst->group / 16 can only be 0 or 1. Second, it fixes an order-of-operations bug where the ternary wasn't selecting between (1 << 11) and 0 but between (1 << 11) and 0 \| brw_dp_write_desc(...). Fixes: 0d9648416 "intel/compiler: Use generic SEND for Gen7+ FB writes" Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
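The precedence problem described above can be reproduced in isolation. The following is a minimal sketch, not the code from the commit: `GROUP_BIT`, `fb_write_desc_buggy`, and `fb_write_desc_fixed` are hypothetical names, and `desc` stands in for whatever `brw_dp_write_desc(...)` returns. It relies only on the fact that `?:` binds more loosely than `|` in C.

```c
#include <stdint.h>

/* Hypothetical stand-in for the FB write group bit, (1 << 11) in the message. */
#define GROUP_BIT (1u << 11)

/* Buggy shape: parses as (group / 16) ? GROUP_BIT : (0 | desc),
 * so the descriptor bits are dropped whenever the group bit is selected. */
static inline uint32_t
fb_write_desc_buggy(unsigned group, uint32_t desc)
{
   return (group / 16) ? GROUP_BIT : 0 | desc;
}

/* Fixed shape: group / 16 is assumed to be 0 or 1, so it can be shifted
 * into bit 11 directly and OR'd with the rest of the descriptor. */
static inline uint32_t
fb_write_desc_fixed(unsigned group, uint32_t desc)
{
   return ((group / 16) << 11) | desc;
}
```

With `group == 16` and `desc == 0x5`, the buggy form yields `0x800` while the fixed form yields `0x805`; for `group == 0` both return `0x5`.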
*	nir: allow specifying filter callback in lower_alu_to_scalar	Vasily Khoruzhick	2019-09-06	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	A fixed set of opcodes doesn't have enough flexibility in certain cases. E.g. Utgard PP has a vector conditional select operation, but the condition is always scalar. Lowering all the vector selects to scalar increases the instruction count, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	anv: fix format string in error message	Eric Engestrom	2019-09-04	1	-1/+1
\| \| \| \| \| \|	Fixes: 9775894f102535a79186 ("anv: Move size check from anv_bo_cache_import() to caller (v2)") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	anv: build libanv for gen12 in android build	Tapani Pälli	2019-08-28	1	-0/+23
\| \| \| \| \| \| \|	Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	anv: Build for gen12	Jordan Justen	2019-08-28	6	-1/+29
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/l3: Don't assert on gen12 (use gen11 config temporarily)	Jordan Justen	2019-08-28	1	-0/+1
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Lionel Landwerlin <[email protected]>
*	intel/compiler: Disable compaction on gen12 for now	Jordan Justen	2019-08-28	1	-1/+7
\| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	intel/isl: build android libmesa_isl for gen12	Tapani Pälli	2019-08-28	1	-0/+20
\| \| \| \| \| \| \|	Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/isl: Build gen12 using gen11 code paths	Jordan Justen	2019-08-28	4	-1/+11
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/genxml: generate pack files for gen12 on android builds	Tapani Pälli	2019-08-28	1	-0/+5
\| \| \| \| \| \| \|	Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/genxml: Build gen12 genxml	Jordan Justen	2019-08-28	5	-2/+11
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/genxml: Add gen12.xml as a copy of gen11.xml	Jordan Justen	2019-08-28	1	-0/+7171
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/genxml: Run sort_xml.sh to tidy gen9.xml and gen11.xml	Jordan Justen	2019-08-28	2	-38/+36
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/genxml/gen11: Add spaces in EnableUnormPathInColorPipe	Jordan Justen	2019-08-28	1	-1/+1
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	intel/genxml: Handle field names with different spacing/hyphen	Jordan Justen	2019-08-28	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	If a field name differs slightly between two generations then this change will still add the fields into the same group. For example, these will be treated as equal: * "Software Exception" and "Software Exception" * "Per Thread" and "Per-Thread" Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
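One way to picture the matching rule is a comparison that skips spaces and hyphens on both sides. This is only an illustrative sketch in C, not the genxml tooling itself; `field_names_match` is a hypothetical helper, and the exact set of ignored characters is an assumption based on the examples in the message.

```c
#include <stdbool.h>

/* Hypothetical helper: compare two field names while ignoring any
 * spaces and hyphens, so "Per Thread" and "Per-Thread" compare equal. */
static bool
field_names_match(const char *a, const char *b)
{
   for (;;) {
      /* Skip separator characters on both sides. */
      while (*a == ' ' || *a == '-')
         a++;
      while (*b == ' ' || *b == '-')
         b++;

      if (*a != *b)
         return false;
      if (*a == '\0')
         return true;   /* both strings ended together */

      a++;
      b++;
   }
}
```

Under these assumptions, names that differ only in the number of spaces also match, while names with different letters do not.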
*	intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware	Ian Romanick	2019-08-28	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	See the previous commit for the explanation of the Fixes tag. Hurts 21 shaders in shader-db. All of the hurt shaders are in Unreal Engine 4 tech demos. Reviewed-by: Matt Turner <[email protected]> Fixes: 7afa26d4e39 ("nir: Add lowering for nir_op_bitfield_reverse.")
*	intel/compiler: Use new Gen11 headerless RT writes for MRT cases	Kenneth Graunke	2019-08-27	1	-2/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Gen11 adds support for specifying the render target index and src0 alpha present bits in the extended message descriptor. Previously, we had to use a message header for this, requiring extra instructions to write the fields, and two registers of extra payload. Improves performance on my ICL 8x8 frequency locked to 700MHz, on iris: GfxBench5 Manhattan 3.0: 2.13635% +/- 0.159859% (n=5) GfxBench5 Aztec Ruins: 1.57173% +/- 0.128749% (n=5) Synmark2 OglDeferred: 2.86914% +/- 0.191211% (n=10) Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: Use generic SEND for Gen7+ FB writes	Kenneth Graunke	2019-08-27	2	-6/+28
\| \| \| \| \| \| \| \|	This takes care of generate_fb_write/fire_fb_write/brw_fb_WRITE's stuff earlier in the visitor. It will also make it easier to generate SENDSC messages with indirect extended descriptors in a few patches. Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: Refactor FB write message control setup into a helper.	Kenneth Graunke	2019-08-27	3	-26/+37
\| \| \| \| \| \|	This will be used by visitor code to convert directly to SEND in a bit. Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: Handle bits 15:12 in brw_send_indirect_split_message()	Kenneth Graunke	2019-08-27	1	-2/+12
\| \| \| \| \| \| \| \| \| \| \| \|	Annoyingly, these bits exist in some extended message descriptors (in particular render target writes), but they don't have any corresponding bits in the ISA encoding. So we can't use an immediate and have to fall back to an indirect extended descriptor. Thanks to Jason Ekstrand for reminding me that you can still set these bits via an indirect descriptor, even if they don't exist in the ISA. Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: Fix src0/desc setter ordering	Kenneth Graunke	2019-08-27	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	src0 vstride and type overlap with bits of the extended descriptor. brw_set_desc() also sets the extended descriptor to 0. So by setting the descriptor, then setting src0, we were accidentally setting a bunch of extended descriptor bits unintentionally. When using this infrastructure for framebuffer writes (in a future patch), this ended up setting the extended descriptor bit 20, which is "Null Render Target" on Icelake, causing nothing to be written to the framebuffer. Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/fs: grab fail_msg from v32 instead of v16 when v32->run_cs fails	Paulo Zanoni	2019-08-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Looks like a copy/paste error. This patch prevents a segfault when running the following on BDW: INTEL_DEBUG=no8,no16,do32 ./deqp-vk -n \ dEQP-VK.subgroups.arithmetic.compute.subgroupmin_dvec4 For the curious, the message we're getting is: CS compile failed: Failure to register allocate. Reduce number of live scalar values to avoid this. Fixes: 864737ce6cd5 ("i965/fs: Build 32-wide compute shader when needed.") Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Paulo Zanoni <[email protected]>
*	isl: Don't set UnormPathInColorPipe for integer surfaces.	Kenneth Graunke	2019-08-26	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes dEQP-GLES3.functional.texture.specification subtests on iris: - texsubimage3d_depth.depth24_stencil8_2d_array - texsubimage3d_depth.depth32f_stencil8_2d_array - texsubimage3d_depth.depth_component32f_2d_array - texsubimage3d_depth.depth_component24_2d_array - texstorage2d.format.depth24_stencil8_2d - texstorage2d.format.depth32f_stencil8_2d - texstorage2d.format.depth_component24_2d - texstorage2d.format.depth_component32f_2d - texstorage3d.format.depth24_stencil8_2d_array - texstorage3d.format.depth32f_stencil8_2d_array - texstorage3d.format.depth_component24_2d_array - texstorage3d.format.depth_component32f_2d_array Here, something appears to be going wrong with having this bit set during blorp_copy operations for texture upload, which override the format to R8G8B8A8_UINT. AFAICT this bit should have no effect for integer surfaces, as it has to do with blending, and integer blending is not a thing. So it should be harmless to disable it. The Windows driver appears to be setting this bit universally, so I am unclear why we would need to. Perhaps they simply haven't run into this issue. Fixes: f741de236b5 ("isl: Enable Unorm Path in Color Pipe") Reviewed-by: Jason Ekstrand <[email protected]>
*	isl: Drop UnormPathInColorPipe for buffer surfaces.	Kenneth Graunke	2019-08-26	1	-4/+0
\| \| \| \| \| \| \| \|	Jason suggested I remove this in review, and he's right. AFAICT this affects blending, and that just isn't going to happen on buffers. Fixes: f741de236b5 ("isl: Enable Unorm Path in Color Pipe") Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/fs: Drop the gl_program from fs_visitor	Jason Ekstrand	2019-08-25	12	-27/+13
\| \| \| \| \| \| \| \| \|	It's not used by anything anymore now that so much lowering has been moved into NIR. Sadly, we still need one in brw_compile_gs() for geometry shaders on Sandy Bridge. Short of a lot of pointless work, that one's probably not going away. Reviewed-by: Kenneth Graunke <[email protected]>
*	anv: Only re-emit non-dynamic state that has changed.	Rafael Antognolli	2019-08-23	2	-24/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On commit f6e7de41d7b, we started emitting 3DSTATE_LINE_STIPPLE as part of the non-dynamic state. That gets re-emitted every time we bind a new VkPipeline. But that instruction is non-pipelined, and it caused a perf regression of about 9-10% on Dota2. This commit makes anv_dynamic_state_copy() return a mask with only the state that has changed when copying it. 3DSTATE_LINE_STIPPLE won't be emitted anymore unless it has changed, fixing the problem above. v2: Improve commit message and add documentation about skipped checks (Jason) Fixes: f6e7de41d7b ("anv: Implement VK_EXT_line_rasterization") Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
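The idea of a copy that reports only what changed can be sketched as follows. This is a simplified illustration, not `anv_dynamic_state_copy()` itself: the struct layout, the `DYN_*` bits, and `dyn_state_copy` are hypothetical and cover just two pieces of state.

```c
#include <stdint.h>

/* Hypothetical dirty bits, one per piece of dynamic state. */
enum dyn_bits {
   DYN_LINE_WIDTH   = 1u << 0,
   DYN_LINE_STIPPLE = 1u << 1,
};

/* Hypothetical, heavily reduced dynamic state. */
struct dyn_state {
   float    line_width;
   uint32_t stipple_factor;
   uint16_t stipple_pattern;
};

/* Copy the state selected by copy_mask from src to dst and return a mask
 * of the bits whose values actually differed. Callers can then re-emit
 * only the commands for the returned bits. */
static uint32_t
dyn_state_copy(struct dyn_state *dst, const struct dyn_state *src,
               uint32_t copy_mask)
{
   uint32_t changed = 0;

   if ((copy_mask & DYN_LINE_WIDTH) && dst->line_width != src->line_width) {
      dst->line_width = src->line_width;
      changed |= DYN_LINE_WIDTH;
   }

   if ((copy_mask & DYN_LINE_STIPPLE) &&
       (dst->stipple_factor != src->stipple_factor ||
        dst->stipple_pattern != src->stipple_pattern)) {
      dst->stipple_factor = src->stipple_factor;
      dst->stipple_pattern = src->stipple_pattern;
      changed |= DYN_LINE_STIPPLE;
   }

   return changed;
}
```

In this sketch, binding a pipeline whose stipple values equal the current ones returns a mask without `DYN_LINE_STIPPLE`, so a non-pipelined command such as 3DSTATE_LINE_STIPPLE would not need to be emitted again.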
*	intel/decoders: Avoid uninitialized variable warnings	Caio Marcelo de Oliveira Filho	2019-08-23	1	-2/+2
\| \| \| \| \| \| \| \| \|	Initialize `next_batch_addr` and `second_level`. If the batch is well formed, those values will be overridden; if not, they are as good as uninitialized garbage. Acked-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	anv: Drop unused local variable	Caio Marcelo de Oliveira Filho	2019-08-23	1	-1/+0
\| \| \| \| \| \| \| \|	Leftover from 021fa28163a ("intel/nir: Add a helper for getting BRW_AOP from an intrinsic"). Acked-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: Silence maybe-uninitialized warning in GCC 9.1.1	Caio Marcelo de Oliveira Filho	2019-08-23	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Compiler can't see that d is initialized. ../src/intel/compiler/brw_vec4_nir.cpp: In function ‘int brw::try_immediate_source(const nir_alu_instr, brw::src_reg, bool, const gen_device_info*)’: ../src/intel/compiler/brw_vec4_nir.cpp:984:12: warning: ‘d’ may be used uninitialized in this function [-Wmaybe-uninitialized] 984 \| d = MAX2(-d, d); Assert that we expect at least one component, hence d is going to be set. That by itself is not enough, so also zero initialize the variable. Acked-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/nir: Add a helper for getting BRW_AOP from an intrinsic	Jason Ekstrand	2019-08-21	4	-170/+78
\| \| \| \| \| \|	So many duplicated switch statements.... Reviewed-by: Kenneth Graunke <[email protected]>
*	nir: Add explicit signs to image min/max intrinsics	Jason Ekstrand	2019-08-21	4	-22/+46
\| \| \| \| \| \| \| \| \| \| \|	This better matches all the other atomic intrinsics such as those for SSBOs and shared variables where the sign is part of the intrinsic opcode. Both generators (GLSL and SPIR-V) know the sign from the type of the image variable or handle. In SPIR-V, signed min/max are separate opcodes from unsigned. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	anv: inline uniforms blocks don't count toward descriptor set limits	Arcady Goldmints-Orlov	2019-08-20	1	-0/+23
\| \| \| \| \| \| \| \|	In a descriptor set, inline uniform blocks don't use up any bindings. However, the presence of any inline uniform blocks does require the use of the descriptor buffer, which takes up one binding. Reviewed-by: Jason Ekstrand <[email protected]>
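The counting rule described above can be sketched like this. It is a simplified model, not the anv implementation: `desc_kind`, `desc_binding`, and `count_surface_slots` are hypothetical names, and only three descriptor kinds are modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced set of descriptor kinds. */
enum desc_kind {
   DESC_KIND_SAMPLER,
   DESC_KIND_UNIFORM_BUFFER,
   DESC_KIND_INLINE_UNIFORM_BLOCK,
};

/* Hypothetical binding description. For inline uniform blocks, Vulkan
 * uses the count field as a size in bytes, so it must not be added to
 * the slot total. */
struct desc_binding {
   enum desc_kind kind;
   uint32_t       count;
};

/* Count the slots a set of bindings consumes: inline uniform blocks
 * contribute nothing individually, but if any are present the descriptor
 * buffer is needed, and that costs exactly one slot. */
static uint32_t
count_surface_slots(const struct desc_binding *bindings, uint32_t n)
{
   uint32_t slots = 0;
   bool needs_descriptor_buffer = false;

   for (uint32_t i = 0; i < n; i++) {
      if (bindings[i].kind == DESC_KIND_INLINE_UNIFORM_BLOCK)
         needs_descriptor_buffer = true;
      else
         slots += bindings[i].count;
   }

   return slots + (needs_descriptor_buffer ? 1 : 0);
}
```

For example, two uniform buffers plus two inline uniform blocks of any size would count as three slots in this model, while a set with no inline uniform blocks is counted as before.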
*	isl: Enable Unorm Path in Color Pipe	Kenneth Graunke	2019-08-15	2	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Improves performance on my Icelake 8x8 locked to 700MHz. For example, some GfxBench5 subtests have the following results: - [i965] gl_manhattan: ................ 7.01119% +/- 0.180971% (n=5) - [i965] gl_4 (Car Chase): 4.24351% +/- 0.175622% (n=5) - [i965] gl_blending: ................ 3.36327% +/- 0.180267% (n=5) - [i965] gl_5_normal (Aztec Ruins): 1.67962% +/- 0.243534% (n=10) - [iris] gl_manhattan: ................ 3.92357% +/- 0.073965% (n=25) - [iris] gl_4 (Car Chase): 2.17746% +/- 0.0826858% (n=5) - [iris] gl_blending: ................ 2.79599% +/- 0.803652% (n=15) - [iris] gl_5_normal (Aztec Ruins): 1.30930% +/- 0.106523% (n=25) Reviewed-by: Jason Ekstrand <[email protected]>
*	anv: Properly initialize device->slice_hash.	Rafael Antognolli	2019-08-15	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	When subslices_delta == 0 and we take the early return, device->slice_hash is not initialized on GEN11. It then causes a segfault when going through anv_DestroyDevice, if compiled with valgrind. Fixes: 7bc022b4bbc ("anv/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.") Reviewed-by: Jason Ekstrand <[email protected]>
*	intel/compiler: Fix resource leak in error path	Danylo Piliaiev	2019-08-15	1	-0/+1
\| \| \| \| \| \| \| \| \|	CID: 1452261 Fixes: 04a99515 "intel/compiler: add ability to override shader's assembly" Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
*	intel/tools: Fix aub_file initialization in intel_dump_gpu	Caio Marcelo de Oliveira Filho	2019-08-12	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The `device` can be set earlier, either by a command line option or by intercepting an ioctl call to get the I915_PARAM_CHIPSET_ID done by the application early. In both cases `aub_file` and `devinfo` would not be initialized. Fix by splitting the conditions: - `device == 0`: use the FD to get both device and devinfo. - Or `devinfo.gen == 0`: use `device` to initialize it. And separately, initialize aub_file the first time it is needed. Fixes: d594d2a0524 ("intel/tools: use device info initializer") Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	anv/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.	Rafael Antognolli	2019-08-12	3	-0/+78
\| \| \| \| \| \| \|	If the pixel pipes have a different number of subslices, emit a slice hashing table that will ensure proper workload distribution. v2: Don't need to set the mask - it's mbo (Ken).