path: root/src/intel
Commit message | Author | Age | Files | Lines
* anv: Move get_fast_clear_state_address into anv_private.hJason Ekstrand2017-11-272-50/+33
| | | | | | | While we're at it, we break it into two nicely named functions. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp: Take a range of layers in blorp_ccs_resolveJason Ekstrand2017-11-272-3/+7
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp: Add initial support for indirect clear colorsJason Ekstrand2017-11-275-0/+96
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/blorp: Add fast-clear to the special case in MSAA resolvesJason Ekstrand2017-11-271-2/+9
| | | | | | | | | | | | This doesn't go all the way to avoiding the txf_ms if it's fast-cleared, however it does at least make us only do it once. This should improve performance of MSAA resolves in the presence of lots of clear color. Without the patch, enabling fast-clears in the multisampling Sascha demo drops the framerate by about 10%. With this patch, enabling fast-clears increases the demo's framerate by 25%. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp/blit: Rename blorp_nir_txf_ms_mcsJason Ekstrand2017-11-271-4/+5
| | | | | | | | | That name is already taken by one of the helpers in blorp_nir_builder.h and, while we haven't moved the guts of blorp_blit.c there yet, we'd like to start using some things from that header. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* i965/vec4: fix splitting of interleaved attributesIago Toral Quiroga2017-11-241-1/+6
| | | | | | | | | | | | | | | | | | | When we split an instruction that reads a uniform value (vstride 0) we need to respect the vstride on the second half of the instruction (that is, the second half should read the same region as the first). We were doing this already, but we didn't account for stages that have interleaved input attributes which also have a vstride of 0 and need the same treatment. Fixes the following on Haswell: KHR-GL45.enhanced_layouts.varying_locations KHR-GL45.enhanced_layouts.varying_array_locations KHR-GL45.enhanced_layouts.varying_structure_locations Reviewed-by: Matt Turner <[email protected]> Acked-by: Andres Gomez <[email protected]>
* genxml: fix assert guardsEric Engestrom2017-11-231-5/+5
| | | | | | | This removes a few hundred warnings on debug builds with asserts off. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv: flag batch & instruction BOs for captureLionel Landwerlin2017-11-222-2/+6
| | | | | | | | | | | | | When the kernel supports flagging our BOs, let's mark batch & instruction BOs for capture so they can be included in the error state. v2: Only add EXEC_CAPTURE if supported (Kristian) v3: Fix operator precedence issue (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv: setup BO flags at state_pool/block_pool creationLionel Landwerlin2017-11-227-22/+41
| | | | | | | | This will allow setting the flags on any anv_bo created/filled from a state pool or block pool later. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/genxml: Add helpers for determining field typeKristian H. Kristensen2017-11-211-6/+17
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/fs: Check ADD/MAD with immediates in satprop unit testMatt Turner2017-11-211-1/+125
| | | | | | | | | The gen had to be changed from 4 to 6 so that we could test MAD, which is new on Gen6. mad_imm_float_neg_mov_sat tests the case fixed by the previous commit. Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Handle negating immediates on MADs when propagating saturatesMatt Turner2017-11-211-2/+8
| | | | | | | | | | | MADs don't take immediate sources, but we allow them in the IR since it simplifies a lot of things. I neglected to consider that case. Fixes: 4009a9ead490 ("i965/fs: Allow saturate propagation to propagate negations into MADs.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616 Reported-and-Tested-by: Ruslan Kabatsayev <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* intel: fix disasm_info memory leaksTapani Pälli2017-11-212-2/+2
| | | | | | | | Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code") Cc: Matt Turner <[email protected]> Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Stop including brw_cfg.h in brw_disasm_info.hJason Ekstrand2017-11-171-1/+5
| | | | | | | | | | | | The brw_disasm_info header is included by certain tools in order to get shader assembly from binaries so it's a semi-external header. Including brw_cfg.h also pulls in brw_shader.h so you end up getting quite a bit of our back-end compiler internals. Instead, make the couple of forward declarations we need and make the header more stand-alone. This fixes the meson build. Reviewed-by: Matt Turner <[email protected]> Fixes: 4f82b17287194ca7d10816f6cfe4712a3e0a03fc
* i965: Correct disasm_info usage in eu_validate testAndres Gomez2017-11-181-6/+6
| | | | | | | | Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code") Cc: Matt Turner <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Rename intel_asm_annotation -> brw_disasm_infoMatt Turner2017-11-176-8/+8
| | | | | | | It was the only file named intel_* in the compiler. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Rewrite disassembly annotation codeMatt Turner2017-11-1710-170/+173
| | | | | | | | | | | | | | | The old code used an array to store each "instruction group" (the new, better name than the old overloaded "annotation"), and required a memmove() to shift elements over in the array when we needed to split a group so that we could add an error message. This was confusing and difficult to get right, not the least of which was because the array has a tail sentinel not included in .ann_count. Instead use a linked list, a data structure made for efficient insertion. Acked-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
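The difference between the two data structures can be sketched in C. This is a minimal illustration, not the Mesa code: `struct inst_group`, `inst_group_new`, `inst_group_split`, and `inst_group_insert_error` are hypothetical names, and the list is a hand-rolled singly linked list rather than whatever list helper the real patch uses. The point is only that splitting a group is a constant-time pointer update with no memmove() and no tail sentinel to keep in sync.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical "instruction group": a range of the program starting at
 * `offset`, with an optional error message attached. */
struct inst_group {
   int offset;                /* start offset of this group */
   const char *error;         /* error text for this group, or NULL */
   struct inst_group *next;   /* next group in program order */
};

/* Allocate a group that starts at `offset`. */
static struct inst_group *
inst_group_new(int offset)
{
   struct inst_group *g = calloc(1, sizeof(*g));
   assert(g != NULL);
   g->offset = offset;
   return g;
}

/* Split `g` at `offset`. The new group is linked in directly after `g`;
 * nothing else in the list has to move. Returns the new group. */
static struct inst_group *
inst_group_split(struct inst_group *g, int offset)
{
   struct inst_group *tail = inst_group_new(offset);
   tail->next = g->next;
   g->next = tail;
   return tail;
}

/* Attach an error to the part of `g` that ends at `offset`: split the
 * group there and record the message on the first half. */
static void
inst_group_insert_error(struct inst_group *g, int offset, const char *error)
{
   inst_group_split(g, offset);
   g->error = error;
}
```

With an array, the same operation would require shifting every later element by one slot and updating the element count consistently with the sentinel.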
* i965: Simplify annotation_insert_error()Matt Turner2017-11-171-9/+6
| | | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move common code out of #ifdefMatt Turner2017-11-172-9/+4
| | | | | | | | | I'm going to change the call in a later patch and with the difference in indentation level it wasn't immediately obvious that the calls were identical. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv/cmd_buffer: Take bo_offset into account in fast clear state addressesJason Ekstrand2017-11-171-1/+1
| | | | | | | | Otherwise, if the image is not bound to the start of the buffer, we're going to be reading and writing its fast clear state in the wrong spot. Reviewed-by: Lionel Landwerlin <[email protected]> Cc: [email protected]
* anv/cmd_buffer: Advance the address when initializing clear colorsJason Ekstrand2017-11-171-3/+6
| | | | | | | | Found by inspection Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]> Cc: [email protected]
* genxml: Fix PIPELINE_SELECT on G45/Ironlake.Kenneth Graunke2017-11-162-2/+2
| | | | | | | | | Original 965 sets bits 28:27 to 0, while G45 and later set it to 1. Note that the G45 docs are incorrect in this regard - see the DevCTG+ note in the Ironlake PRMs. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: Drop mtypes.h include from brw_compiler.h.Kenneth Graunke2017-11-151-1/+0
| | | | This isn't necessary and causes trouble for a project I'm working on.
* i965: Use nir_lower_atomics_to_ssbos and delete ABO compiler code.Kenneth Graunke2017-11-155-128/+0
| | | | | | | | | | | | We use the same hardware mechanism for both atomic counters and SSBO atomics, so there's really no benefit to maintaining separate code to handle each case. Instead, we can just use Rob's shiny new NIR pass to convert atomic_uints to SSBOs, and delete piles of code. The ssbo_start section of the binding table becomes a combined ABO and SSBO section, with ABOs first, then SSBOs. Reviewed-by: Jason Ekstrand <[email protected]>
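The combined layout described above (ABOs first, then SSBOs, both starting at `ssbo_start`) amounts to simple index arithmetic. The sketch below is an assumption-laden illustration: the helper names and the `num_abos` parameter are hypothetical and do not come from the i965 sources.

```c
#include <assert.h>

/* Binding table slot for atomic buffer `abo_index`.
 * ABOs occupy the first entries of the combined section. */
static unsigned
abo_binding_index(unsigned ssbo_start, unsigned abo_index)
{
   return ssbo_start + abo_index;
}

/* Binding table slot for shader storage buffer `ssbo_index`.
 * SSBOs follow directly after the `num_abos` atomic buffers. */
static unsigned
ssbo_binding_index(unsigned ssbo_start, unsigned num_abos,
                   unsigned ssbo_index)
{
   return ssbo_start + num_abos + ssbo_index;
}
```

For example, with `ssbo_start` equal to 8 and two ABOs, the ABOs land in slots 8 and 9 and the first SSBO in slot 10.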
* anv/gen10: Enable float blend optimizationAnuj Phogat2017-11-141-0/+12
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/genxml: Add Cache Mode SubSlice Register to gen10.xmlAnuj Phogat2017-11-141-0/+12
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* anv/gen10: Implement WaSampleOffsetIZ workaroundAnuj Phogat2017-11-141-0/+61
| | | | | | | | | We already have this workaround in OpenGL driver. See Mesa commit 3cf4fe2219. Signed-off-by: Anuj Phogat <[email protected]> Cc: Nanley Chery <[email protected]> Cc: Rafael Antognolli <[email protected]>
* Revert "intel/fs: Use a pure vertical stride for large register strides"Matt Turner2017-11-141-13/+3
| | | | | | | This reverts commit e8c9e65185de3e821e1e482e77906d1d51efa3ec. With the actual bug fixed (by commit 6ac2d1690192), this is not necessary. I'm doubtful of its correctness in any case.
* i965/fs: Fix extract_i8/u8 to a 64-bit destinationMatt Turner2017-11-141-2/+23
| | | | | | | | | | | | | | | The MOV instruction can extract bytes to words/double words, and words/double words to quadwords, but not byte to quadwords. For unsigned byte to quadword, we can read them as words and AND off the high byte and extract to quadword in one instruction. For signed bytes, we need to first sign extend to word and then sign extend that word to a quadword. Fixes the following test on CHV, BXT, and GLK: KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103628 Reviewed-by: Jason Ekstrand <[email protected]>
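The two conversion paths can be modelled in plain C. This is only a scalar model of the arithmetic, not the generated EU code: the function names are hypothetical, and the unsigned case is assumed to want the low byte of the word that was read.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned byte to quadword: read the containing 16-bit word, AND off
 * the high byte, and widen the result to 64 bits. */
static uint64_t
extract_u8_to_u64(uint16_t word)
{
   return (uint64_t)(word & 0x00ff);
}

/* Signed byte to quadword: two steps, because there is no direct
 * byte-to-quadword move. First sign extend byte -> word, then sign
 * extend word -> quadword. */
static int64_t
extract_i8_to_i64(int8_t byte)
{
   int16_t word = (int16_t)byte;   /* step 1: byte -> word */
   return (int64_t)word;           /* step 2: word -> quadword */
}
```

The unsigned path needs no intermediate value because masking already clears every bit above the byte, so a single widening operation is enough.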
* i965/fs: Split all 32->64-bit MOVs on CHV, BXT, GLKMatt Turner2017-11-141-4/+4
| | | | | | | Fixes the following tests on CHV, BXT, and GLK: KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot dEQP-VK.spirv_assembly.instruction.compute.uconvert.uint32_to_int64 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103115
* intel/blorp: Make the MOCS setting part of blorp_addressJason Ekstrand2017-11-134-18/+18
| | | | | | | | This makes our MOCS settings significantly more flexible. Cc: "17.3" <[email protected]> Tested-by: Lyude Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv/blorp: Add a device parameter to blorp_surf_for_anv_imageJason Ekstrand2017-11-131-22/+34
| | | | | | Cc: "17.3" <[email protected]> Tested-by: Lyude Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Use mocs.tex for depth stencilJason Ekstrand2017-11-131-5/+1
| | | | | | Cc: "17.3" <[email protected]> Tested-by: Lyude Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/tools/error: Decode compute shaders.Kenneth Graunke2017-11-131-7/+42
| | | | | | | | | | | | This is a bit more annoying than your average shader - we need to look at MEDIA_INTERFACE_DESCRIPTOR_LOAD in the batch buffer, then hop over to the dynamic state buffer to read the INTERFACE_DESCRIPTOR_DATA, then hop over to the instruction buffer to decode the program. Now that we store all the buffers before decoding, we can actually do this fairly easily. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/tools/error: Use do-while for field iterator loops.Kenneth Graunke2017-11-131-6/+6
| | | | | | | | | | | | while loops skip the first field of the instruction/structure, which is not what the code intended. It works out because the field we're looking for doesn't happen to be first, but we ought to do it right regardless. Found while writing the next patch, where Kernel Start Pointer is the first field of INTERFACE_DESCRIPTOR_DATA. Reviewed-by: Lionel Landwerlin <[email protected]>
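The off-by-one can be reproduced with a toy iterator. The sketch assumes an iterator that already points at the first field after initialization and whose "next" call advances before reporting success; `struct field_iter` and the helper names are hypothetical and are not the genxml decoder API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy field iterator: after init it already refers to field 0. */
struct field_iter {
   const char *const *names;
   int count;
   int index;
};

static void
field_iter_init(struct field_iter *it, const char *const *names, int count)
{
   it->names = names;
   it->count = count;
   it->index = 0;
}

/* Advance to the next field; returns false once past the last one. */
static bool
field_iter_next(struct field_iter *it)
{
   it->index++;
   return it->index < it->count;
}

/* Buggy search: the first call to next() steps over field 0. */
static int
find_with_while(const char *const *names, int count, const char *wanted)
{
   struct field_iter it;
   field_iter_init(&it, names, count);
   while (field_iter_next(&it)) {
      if (strcmp(it.names[it.index], wanted) == 0)
         return it.index;
   }
   return -1;
}

/* Fixed search: examine the current field, then advance. */
static int
find_with_do_while(const char *const *names, int count, const char *wanted)
{
   struct field_iter it;
   if (count == 0)
      return -1;
   field_iter_init(&it, names, count);
   do {
      if (strcmp(it.names[it.index], wanted) == 0)
         return it.index;
   } while (field_iter_next(&it));
   return -1;
}
```

Searching for a field that is not first gives the same answer with both loops, which is why the bug stayed hidden until a lookup targeted the first field.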
* intel/tools/error: Decode shaders while decoding batch commands.Kenneth Graunke2017-11-131-85/+49
| | | | | | | | | | | | | This makes aubinator_error_decode's shader dumping work like aubinator. Instead of printing them after the fact, it prints them right inside the 3DSTATE_VS/HS/DS/GS/PS packet that references them. This saves you the effort of cross-referencing things and jumping back and forth. It also reduces a bunch of book-keeping, and eliminates the limitation that we could only handle 4096 programs. That code was also broken and failed to print any shaders if there were under 4096 programs. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/tools/error: Save error state sections and decode them later.Kenneth Graunke2017-11-131-37/+58
| | | | | | | | | This lets us complete parsing and storing of each buffer's data before we begin decoding the batchbuffer. This makes it possible to inspect the state buffer and program buffer, so we can properly decode any indirect state or shader programs. Reviewed-by: Chris Wilson <[email protected]>
* intel/tools/error: Fix null termination of ring name string.Kenneth Graunke2017-11-131-0/+1
| | | | | | Ported from intel_error_decode. We don't want to run off the end. Reviewed-by: Chris Wilson <[email protected]>
* intel/tools/error: Drop unused MAX_RINGS #define.Kenneth Graunke2017-11-131-2/+0
| | | | | | Dead code. Reviewed-by: Chris Wilson <[email protected]>
* intel/tools/error: Refactor buffer matching, add more buffers.Kenneth Graunke2017-11-131-62/+30
| | | | | | | | | | Based on a similar patch to intel_error_decode by Chris Wilson. While we're de-duplicating the gtt_offset calculation, we can simplify it to assume two hex digits are there - the kernel has done this since v4.6, and we already require error states from v4.10. Reviewed-by: Chris Wilson <[email protected]>
* intel/tools/error: Only decode a few sections of error states.Kenneth Graunke2017-11-131-1/+3
| | | | | | These three are the only ones we can reasonably decode with genxml. Reviewed-by: Chris Wilson <[email protected]>
* intel/tools/error: Drop unused parameters from decode() helper.Kenneth Graunke2017-11-131-5/+3
| | | | | | | | Also change count from a pointer into a value. We were supposed to be resetting it to 0 (and failed to), but that's gone since we dropped the pre-ascii85 handling. Reviewed-by: Chris Wilson <[email protected]>
* intel/tools/error: Drop support for non-ascii85 encoded error states.Kenneth Graunke2017-11-131-35/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Error state files used to look like: render ring --- gtt_offset = 0x0e8f6000 00000000 : 69040000 00000004 : 79090000 ... 00007ffc : 00000000 --- ringbuffer = 0x00001000 There were thousands of lines between sections. The file format changed with Kernel 4.10, and now has a single ascii85-encoded line following each section heading. This is much easier to parse. There are a bunch of bugs in our handling of the old style format, where we'd decode the wrong data, at the wrong time. Fixing all of these is going to be a giant pain. It's also a lot of extra code complexity. In order to properly decode indirect state, or compute shaders, we'll also need to parse data in advance of decoding, which is going to be a giant pain with this ad-hoc "decode everywhere!" mentality. So, let's just drop support for the older file format. This unfortunately requires an error state generated by Kernel 4.10 or later. That's probably not the end of the world, as we encourage users to upgrade to the latest kernel when encountering GPU hangs anyway. It might be a giant pain for people with LTS kernels, though... Reviewed-by: Chris Wilson <[email protected]>
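To show why the newer format is easier to parse, here is a minimal ascii85 word decoder. It is a sketch under stated assumptions, not the tool's implementation: it assumes the common convention of five characters in the range '!' to 'u' per 32-bit word with 'z' as shorthand for a zero word, it expects the caller to have stripped any leading marker character, and it does not handle compressed data or partial trailing groups. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode ascii85 text into 32-bit words.
 * Returns the number of words written (at most `max_words`).
 * Decoding stops at the end of the string, at the first character
 * outside the ascii85 range, or at an incomplete five-character group. */
static size_t
ascii85_decode_words(const char *in, uint32_t *out, size_t max_words)
{
   size_t n = 0;

   while (n < max_words && *in != '\0') {
      if (*in == 'z') {          /* shorthand for an all-zero word */
         out[n++] = 0;
         in++;
         continue;
      }

      uint32_t v = 0;
      for (int i = 0; i < 5; i++) {
         if (in[i] < '!' || in[i] > 'u')
            return n;            /* also catches the terminating NUL */
         v = v * 85 + (uint32_t)(in[i] - '!');
      }
      out[n++] = v;
      in += 5;
   }

   return n;
}
```

A whole section becomes one pass over one line, instead of thousands of "offset : value" lines that each have to be matched and parsed.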
* intel/tools/error: Do ascii85 decode first.Kenneth Graunke2017-11-131-31/+29
| | | | | | | | | | The dashes "---" may occur within an ascii85 block, but only an ascii85 block starts with ':' or '~'. Ported from Chris Wilson's intel-gpu-tools commit: bceec7e1d8a160226b783c6344eae8cbf4ece144 Reviewed-by: Chris Wilson <[email protected]>
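The ordering issue can be illustrated with a small line classifier. The enum and function names below are hypothetical; the only facts taken from the commit message are that "---" can appear inside ascii85 data and that only an ascii85 block starts with ':' or '~'.

```c
#include <assert.h>
#include <string.h>

enum line_kind {
   LINE_ASCII85,         /* encoded buffer contents */
   LINE_SECTION_HEADER,  /* e.g. "render ring --- gtt_offset = ..." */
   LINE_OTHER
};

/* Classify one line of an error state file. The ascii85 check must come
 * first: '-' is a valid ascii85 character, so encoded data may contain
 * "---" and would otherwise be mistaken for a section header. */
static enum line_kind
classify_line(const char *line)
{
   if (line[0] == ':' || line[0] == '~')
      return LINE_ASCII85;
   if (strstr(line, "---") != NULL)
      return LINE_SECTION_HEADER;
   return LINE_OTHER;
}
```

Swapping the two checks would misclassify any encoded line that happens to contain three consecutive dashes.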
* meson: Don't build intel shared components by defaultDylan Baker2017-11-133-5/+0
| | | | | | It's a neat idea, and still useful in some cases, but the intel common code is used by i965 and anvil only, so this is a little clearer. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* aubinator: Don't skip the first field in each subgroupJason Ekstrand2017-11-131-2/+3
| | | | | | | | | | The previous iteration algorithm would advance the field pointer right after we advance the group. This meant that you would end up with skipping the first field of the group. In the common case, where the only field is a struct (e.g. 3DSTATE_VERTEX_BUFFERS), it would get skipped entirely. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Delete empty groupsJason Ekstrand2017-11-134-8/+0
| | | | | | | | | They serve no purpose other than to just fill empty space in the packet so each dword has something. Just disallowing empty groups is a bit easier on some of the tools. This does not change the generated packing headers in any way. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Don't crash on invalid heap sizes when the PCI ID is overridenJason Ekstrand2017-11-131-0/+12
|
* intel/tools: Fix detection of enabled shader stages.Kenneth Graunke2017-11-121-1/+1
| | | | | | | | | | We renamed "Function Enable" to "Enable", which broke our detection of whether shaders are enabled or not. So, we'd see a bunch of HS/DS packets with program offsets of 0, and think that was a valid TCS/TES. Fixes: c032cae9ff77e (genxml: Rename "Function Enable" to "Enable".) Reviewed-by: Lionel Landwerlin <[email protected]>
* autotools: Set C++ visibility flags on IntelDylan Baker2017-11-101-0/+3
| | | | | | | | | These flags are set for C sources, but not C++. This causes symbol visibility leaks from the C++ parts of the Intel compiler. Fixes: 700bebb958e93f4d ("i965: Move the back-end compiler to src/intel/compiler") Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Matt Turner <[email protected]>