Commit message (Author, Age, Files, Lines)
* tgsi_to_nir: Support FACE and POSITION properly. (Timur Kristóf, 2019-03-05, files: 1, lines: -12/+68)
| | | | | | | | | | | | | | Previously, FACE was hard-coded as a sysval, but TTN emulated it incorrectly. Also, POSITION was not supported when it was a sysval. This patch fixes these by allowing both of them to be sysvals or inputs, based on driver capabilities. It also fixes the TGSI FACE emulation based on the TGSI spec. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: Extract ttn_emulate_tgsi_front_face into its own function. (Timur Kristóf, 2019-03-05, files: 1, lines: -14/+20)
| | | | | | | | | | | We'll need to use the same logic in other places, so it makes sense to have a separate function for this. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: Restructure system value loads. (Timur Kristóf, 2019-03-05, files: 1, lines: -10/+6)
| | | | | | | | | | Minor cleanup to the way system value loads work in tgsi_to_nir. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: Produce optimized NIR for a given pipe_screen. (Timur Kristóf, 2019-03-05, files: 8, lines: -13/+153)
| | | | | | | | | | | | | | | | | | | With this patch, tgsi_to_nir will output NIR that is tailored to the given pipe, by reading its capabilities and adjusting the NIR code to those capabilities similarly to how glsl_to_nir works. It also adds an optimization loop that brings the output NIR in line with what glsl_to_nir outputs. This is necessary for the same reason why glsl_to_nir has its own optimization loop: currently not every driver does these optimizations yet. For uses which cannot pass a pipe_screen we also keep a variant called tgsi_to_nir_noscreen which keeps the old behavior. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Acked-By: Eric Anholt <[email protected]>
* freedreno: Plumb pipe_screen through to irX_tgsi_to_nir. (Timur Kristóf, 2019-03-05, files: 12, lines: -19/+37)
| | | | | | | | | | This patch makes it possible for freedreno to pass a pipe_screen to tgsi_to_nir. This will be needed when tgsi_to_nir supports reading pipe capabilities. Signed-off-by: Timur Kristóf <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir: Add multiplier argument to nir_lower_uniforms_to_ubo. (Timur Kristóf, 2019-03-05, files: 4, lines: -11/+18)
| | | | | | | | | | | | | Note that locations can be set in different units, and the multiplier argument caters to supporting these different units. For example, st_glsl_to_nir uses dwords (4 bytes), so the multiplier should be 4, while tgsi_to_nir uses vec4s (16 bytes), so the multiplier should be 16. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
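The unit conversion described in this message can be sketched in plain C. This is only an illustration, assuming the multiplier is the number of bytes per location unit; `uniform_byte_offset` is a hypothetical helper and is not part of NIR.

```c
#include <stdint.h>

/* Hypothetical helper, not Mesa's API: turn a uniform location given in
 * some unit into a byte offset by scaling it with the multiplier. */
static uint32_t uniform_byte_offset(uint32_t location, uint32_t multiplier)
{
    return location * multiplier;
}
```

Under that assumption, location 3 maps to byte 12 with a multiplier of 4 (dword units) and to byte 48 with a multiplier of 16.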
* nir: Move nir_lower_uniforms_to_ubo to compiler/nir. (Timur Kristóf, 2019-03-05, files: 9, lines: -11/+10)
| | | | | | | | | | | | The nir_lower_uniforms_to_ubo function is useful outside of mesa/state_tracker, and in fact is needed to produce NIR for drivers that have the PIPE_CAP_PACKED_UNIFORMS capability. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: Split to smaller functions. (Timur Kristóf, 2019-03-05, files: 1, lines: -26/+56)
| | | | | | | | | | | | Previously, tgsi_to_nir was a single big function, and this patch intends to make the code easier to understand by splitting it up into multiple smaller pieces. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Acked-By: Tested-by: Rob Clark <[email protected]>
* tgsi_to_nir: Make the TGSI IF translation code more readable. (Timur Kristóf, 2019-03-05, files: 1, lines: -4/+5)
| | | | | | | | | | | This patch is a minor cleanup that only intends to make the TGSI IF translation a bit easier to read. Signed-off-by: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: Fix TGSI LIT translation by using flt. (Timur Kristóf, 2019-03-05, files: 1, lines: -3/+3)
| | | | | | | | | | | | TGSI spec says LIT needs a "greater than" comparison. NIR doesn't have that, so let's use "less than" and swap the arguments. Previously "greater than or equal" was used by tgsi_to_nir which is incorrect. Signed-off-by: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
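The operand swap described in this message can be illustrated in plain C. The functions below are stand-ins named after the comparisons mentioned above; they are not the nir_builder API, and `fgt_via_flt` is a hypothetical name.

```c
#include <stdbool.h>

/* Plain C stand-ins for the two float comparisons (illustration only). */
static bool flt(float a, float b) { return a < b; }
static bool fge(float a, float b) { return a >= b; }

/* "a > b" built from "less than" by swapping the operands. */
static bool fgt_via_flt(float a, float b)
{
    return flt(b, a);
}
```

The two comparisons differ only when the operands are equal: `fge(0.0f, 0.0f)` is true while `fgt_via_flt(0.0f, 0.0f)` is false, which is the case the fix is about.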
* tgsi_to_nir: Fix the TGSI ARR translation by converting the result to int. (Timur Kristóf, 2019-03-05, files: 1, lines: -1/+1)
| | | | | | | | | | | | | According to the TGSI spec, ARR needs to do a rounding and then a float-to-integer conversion which was missing. This patch also makes the rounding a bit more efficient by using nir_fround_even instead of the previous nir_ffloor+nir_fadd trick. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
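The difference between the two rounding approaches named in this message can be sketched in plain C. This is an illustration under assumptions, not the NIR code: both helper names are hypothetical, both return an int only so the results are easy to compare, and the inputs are assumed to fit in an int.

```c
/* Older style of rounding: floor(x + 0.5), which sends ties upward. */
static int floor_add_to_int(float x)
{
    float y = x + 0.5f;
    int t = (int)y;                      /* truncates toward zero */
    return (y < (float)t) ? t - 1 : t;   /* correct truncation to floor */
}

/* Round to nearest with ties going to the even integer, then convert. */
static int round_even_to_int(float x)
{
    int t = (int)x;                      /* truncates toward zero */
    float frac = x - (float)t;           /* lies in (-1, 1) */

    if (frac > 0.5f || (frac == 0.5f && (t & 1)))
        return t + 1;
    if (frac < -0.5f || (frac == -0.5f && (t & 1)))
        return t - 1;
    return t;
}
```

The two only disagree on ties: for 2.5 the first helper gives 3 and the second gives 2. In ordinary C code, `nearbyintf` from `<math.h>` performs the same ties-to-even rounding under the default rounding mode.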
* nir: Add ability for shaders to use window space coordinates. (Timur Kristóf, 2019-03-05, files: 3, lines: -0/+8)
| | | | | | | | | | | | | This patch adds a shader_info field that tells the driver to use window space coordinates for a given vertex shader. It also enables this feature in radeonsi (the only NIR-capable driver that supported it in TGSI), and makes tgsi_to_nir aware of it. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* v3d: Move the stores for fixed function VS output reads into NIR. (Eric Anholt, 2019-03-05, files: 5, lines: -195/+343)
| | | | | | | | | | | | | | | This lets us emit the VPM_WRITEs directly from nir_intrinsic_store_output() (useful once NIR scheduling is in place so that we can reduce register pressure), and lets future NIR scheduling schedule the math to generate them. Even in the meantime, it looks like this lets NIR DCE some more code and make better decisions. total instructions in shared programs: 6429246 -> 6412976 (-0.25%) total threads in shared programs: 153924 -> 153934 (<.01%) total loops in shared programs: 486 -> 483 (-0.62%) total uniforms in shared programs: 2385436 -> 2388195 (0.12%) Acked-by: Ian Romanick <[email protected]> (nir)
* v3d: Translate f2i(fround_even) as FTOIN. (Eric Anholt, 2019-03-05, files: 1, lines: -2/+9)
| | | | | This appears to be just what the opcode does. Needed for equivalence when moving FF VPM stores into NIR.
* nir: Improve printing of load_input/store_output variable names. (Eric Anholt, 2019-03-05, files: 1, lines: -2/+4)
| | | | | | | We were printing only when the channel was exactly the start channel, so scalarized loads/stores would be missing the name on the rest. Reviewed-by: Ian Romanick <[email protected]>
* anv: Implement VK_EXT_inline_uniform_block (Jason Ekstrand, 2019-03-05, files: 6, lines: -16/+163)
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: Use the same types for resource indices as pointers (Jason Ekstrand, 2019-03-05, files: 5, lines: -32/+79)
| | | | | | | | We need more space than just a 32-bit scalar and we have to burn all that space anyway so we may as well expose it to the driver. This also fixes a subtle bug when UBOs and SSBOs have different pointer types. Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: Use the generic dereference function for OpArrayLength (Jason Ekstrand, 2019-03-05, files: 1, lines: -1/+1)
| | | | | | | | With the new deref changes, the old pointer_offset version may not be the right one to call. Just call the generic one and let it sort it out. Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: Pull offset/stride from the pointer for OpArrayLength (Jason Ekstrand, 2019-03-05, files: 1, lines: -2/+10)
| | | | | | | | | We can't pull it from the variable type because it might be an array of blocks and not just the one block. While we're here, throw in some error checking. Reviewed-by: Lionel Landwerlin <[email protected]> Cc: [email protected]
* anv: Add a concept of a descriptor buffer (Jason Ekstrand, 2019-03-05, files: 5, lines: -0/+281)
| | | | | | | This buffer goes alongside the CPU data structure and may contain pointers, bindless handles, or any other descriptor information. Currently, all descriptors are size zero and nothing goes in the buffer but this commit sets up the framework we will need later. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Take references to push descriptor set layouts (Jason Ekstrand, 2019-03-05, files: 1, lines: -6/+16)
| | | | | | | | Technically, descriptor set layouts aren't required to survive past the function they're passed into so we need to reference them. Cc: "19.0" <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Refactor descriptor pushing a bit (Jason Ekstrand, 2019-03-05, files: 1, lines: -28/+22)
| | | | | | | | | Pull the common code out of the two entrypoints into the helper which fetches the push descriptor set for us. Now that it does more than just get a thing, call it anv_cmd_buffer_push_descriptor_set. Cc: "19.0" <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: drop add_var_binding from anv_nir_apply_pipeline_layout.c (Jason Ekstrand, 2019-03-05, files: 1, lines: -7/+2)
| | | | | | It has exactly one caller. Just inline it. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Clean up descriptor set layouts (Jason Ekstrand, 2019-03-05, files: 3, lines: -83/+85)
| | | | | | | | | | | | | | | | | The descriptor set layout code in our driver has undergone many changes over the years. Some of the fields which were once essential are now useless or nearly so. The has_dynamic_offsets field was completely unused except for the code to set and hash it. The per-stage indices were only being used to determine if a particular binding had images, samplers, etc. The fact that it's per-stage also doesn't matter because that binding should never be accessed by a shader of the wrong stage. This commit deletes a pile of cruft and replaces it all with a descriptive bitfield which states what a particular descriptor contains. This merely describes the data available and doesn't necessarily dictate how it will be lowered in anv_nir_apply_pipeline_layout. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Count image param entries rather than images (Jason Ekstrand, 2019-03-05, files: 5, lines: -23/+29)
| | | | | | | This is what we're actually storing in the descriptor set and consuming when we bind surface states. This commit renames image_count to image_param_count in a few places and moves the decision to not count image params on gen9+ into anv_descriptor_set.c when we build the layout. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Stop allocating buffer views for dynamic buffers (Jason Ekstrand, 2019-03-05, files: 3, lines: -24/+22)
| | | | | | | We emit the surface states for those on-the-fly so we don't need the buffer view. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Rework arguments to anv_descriptor_set_write_* (Jason Ekstrand, 2019-03-05, files: 3, lines: -29/+27)
| | | | | | | Make them all take a device followed by a set. This is consistent with how the actual Vulkan entrypoint parameters are laid out. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/descriptor_set: Refactor alloc/free of descriptor sets (Jason Ekstrand, 2019-03-05, files: 1, lines: -59/+84)
| | | | | | | This commit just puts the free list code together as part of the pool instead of having it inlined into the descriptor set create code. Reviewed-by: Lionel Landwerlin <[email protected]>
* v3d: Stop treating exec masking specially. (Eric Anholt, 2019-03-05, files: 3, lines: -14/+3)
| | | | | | | | | | | | | | | In our backend, the successor edges from the blocks only point to where QPU control flow goes, not where the notional control flow goes from a "break" or "continue" modifying the execution mask to resume writing to some channels later. As a result, this attempt at restricting live ranges ended up missing the live range of a value where a conditional break/continue was present in a loop before the later def of a variable. The previous commit ended up fixing the problem that the flag tried to solve. Fixes glsl-vs-loop-continue.shader_test and/or glsl-vs-loop-redundant-condition.shader_test based on register allocation results.
* v3d: Restrict live intervals to the blocks reachable from any def. (Eric Anholt, 2019-03-05, files: 2, lines: -4/+43)
| | | | | | | | | | | | | | | In the backend, we often have condition codes on writes to variables, such that there's no screening def anywhere and the previous live ranges algorithm would conclude that the start of the range extends to the start of the program. However, we do know that the live range can only extend as early as you can reach from all blocks writing to the variable. The motivation was that, while we have a couple of hacks to try to promote conditional writes up to being a def within the block, the exec_mask one was broken and needed a replacement. Based on c3c1aa5aeb92 ("intel/fs: Restrict live intervals to the subset possibly reachable from any definition.").
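The idea of bounding a live range by the blocks reachable from any write can be sketched as a small worklist walk over a control-flow graph. This is a minimal sketch under assumptions, not the v3d implementation: `struct cfg`, `MAX_BLOCKS`, `MAX_SUCCS`, and `reachable_from_defs` are all hypothetical names.

```c
#include <stdbool.h>

#define MAX_BLOCKS 16
#define MAX_SUCCS 2

/* Hypothetical CFG: succ[b][i] is a successor of block b, or -1 if unused. */
struct cfg {
    int num_blocks;
    int succ[MAX_BLOCKS][MAX_SUCCS];
};

/* Mark every block that can be reached from any block writing the variable.
 * A live range restricted this way cannot start in an unmarked block. */
static void reachable_from_defs(const struct cfg *g, const bool *has_def,
                                bool *reachable)
{
    int worklist[MAX_BLOCKS];
    int top = 0;

    /* Seed the walk with the blocks that contain a write. */
    for (int b = 0; b < g->num_blocks; b++) {
        reachable[b] = has_def[b];
        if (has_def[b])
            worklist[top++] = b;
    }

    /* Follow successor edges; each block is pushed at most once. */
    while (top > 0) {
        int b = worklist[--top];
        for (int i = 0; i < MAX_SUCCS; i++) {
            int s = g->succ[b][i];
            if (s >= 0 && !reachable[s]) {
                reachable[s] = true;
                worklist[top++] = s;
            }
        }
    }
}
```

For a graph 0 -> 1, 1 -> {2, 3}, 2 -> 1 with a conditional write only in block 2, the walk marks blocks 1, 2 and 3 but not block 0, so the range no longer extends back to the start of the program.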
* gitlab-ci: install distro's ninja (Andres Gomez, 2019-03-05, files: 2, lines: -11/+3)
| | | | | | | | Ubuntu Bionic is shipping ninja 1.8.2. Therefore, we do not need to download v1.6.0 manually any more. Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* radv: properly align the fence and EOP bug VA on GFX9 (Samuel Pitoiset, 2019-03-05, files: 1, lines: -2/+5)
| | | | | | | | If alignment is 0, offsets returned by radv_cmd_buffer_upload_alloc() are always 0. These two virtual addresses were pointing at the same location. Cc: 18.3 19.0 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
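One way an alignment of 0 can force every offset to 0 is through the common power-of-two round-up formula. The sketch below assumes that formula; `align_up` is a hypothetical helper and is not copied from radv.

```c
#include <stdint.h>

/* Common power-of-two round-up (an assumption for illustration).
 * With alignment == 0 the mask ~(alignment - 1) becomes 0, so the
 * result is 0 no matter what value is passed in. */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}
```

With this formula, `align_up(13, 8)` gives 16, while `align_up(13, 0)` and `align_up(100, 0)` both give 0, so two separate allocations would land on the same offset.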
* radv: allocate enough space in cmdbuf when starting a subpass (Samuel Pitoiset, 2019-03-05, files: 1, lines: -1/+1)
| | | | | | | | | | This fixes some CTS crashes with: dEQP-VK.renderpass2.suballocation.attachment_write_mask.attachment_count_8.start_index_* Ideally, we should check cmd_buffer->cs->max_dw because there is likely enough space (the internal clear draws allocate space), but keep it that way for consistency. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* vulkan: import vk_layer.h from Khronos (Eric Engestrom, 2019-03-05, files: 1, lines: -0/+195)
| | | | | | | Instead of relying on the system having it (and the right version). Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* egl: fix libdrm-less builds (Eric Engestrom, 2019-03-05, files: 2, lines: -15/+0)
| | | | | | | | | | | | This function was never used, and isn't properly guarded by HAVE_LIBDRM, breaking the build on systems that don't have libdrm. Let's just remove it. Fixes: 7552fcb7b9b98392e6a8 "egl: add base EGL_EXT_device_base implementation" Reported-by: Timo Aaltonen <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Emil Velikov <[email protected]>
* vulkan: import missing file from Khronos (Eric Engestrom, 2019-03-05, files: 1, lines: -0/+66)
| | | | | | Fixes: 114c4aa0c84fc6d00407 "vulkan: update headers/registry to 1.1.102" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* util: #define PATH_MAX when undefined (eg. Hurd) (Eric Engestrom, 2019-03-05, files: 1, lines: -0/+4)
| | | | | | | Cc: Timo Aaltonen <[email protected]> Cc: James Clarke <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* radv: use the platform defines in vk.xml instead of hard-coding them (Eric Engestrom, 2019-03-05, files: 1, lines: -4/+7)
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: use the platform defines in vk.xml instead of hard-coding them (Eric Engestrom, 2019-03-05, files: 1, lines: -4/+7)
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: update supported patch version (Lionel Landwerlin, 2019-03-05, files: 1, lines: -1/+1)
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* anv: toggle on support for VK_EXT_ycbcr_image_arrays (Tapani Pälli, 2019-03-05, files: 2, lines: -0/+8)
| | | | | | | | We already propagate coord_components correctly and did not have layer restrictions for ycbcr formats. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* vulkan: update headers/registry to 1.1.102 (Lionel Landwerlin, 2019-03-05, files: 3, lines: -15/+119)
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* anv: retain the is_array state in create_plane_tex_instr_implicit (Tapani Pälli, 2019-03-05, files: 1, lines: -0/+1)
| | | | | | This does not seem to fix anything ATM but is the right thing to do. Signed-off-by: Tapani Pälli <[email protected]> Fixes: f3e91e78a33775 ("anv: add nir lowering pass for ycbcr textures") Reviewed-by: Lionel Landwerlin <[email protected]>
* meson: avoid going back up the tree with include_directories() (Eric Engestrom, 2019-03-05, files: 3, lines: -4/+3)
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* i965: Implement threaded GL support. (Kenneth Graunke, 2019-03-05, files: 3, lines: -0/+51)
| | | | | | | | | | | | | | | | | | | Now i965 supports mesa_glthread=true like Gallium drivers do. According to Markus (degasus), the Citra emulator now runs ~30% faster. Emmanuel (linkmauve) also reported that the Dolphin emulator improved by 2.8x on one game. (Both of those still need to be added to drirc.) An Intel Mesa CI run with mesa_glthread=true appears to be happy. Bioshock Infinite's benchmark mode seems to be around 15-20% faster on my Skylake GT4 at 1920x1080. Tested-by: Markus Wick <[email protected]> Tested-by: Emmanuel Gil Peyrot <[email protected]> Tested-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* anv/pipeline: Drop anv_fill_binding_table (Jason Ekstrand, 2019-03-04, files: 1, lines: -26/+0)
| | | | | | | We zero out the prog data anyway and, now that bias is always zero, this function is accomplishing nothing. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Use an actual binding for gl_NumWorkgroups (Jason Ekstrand, 2019-03-04, files: 3, lines: -31/+33)
| | | | | | | | | | | This commit moves our handling of gl_NumWorkgroups over to work like our handling of other special bindings in the Vulkan driver. We give it a magic descriptor set number and teach emit_binding_tables to handle it. This is better than the bias mechanism we were using because it allows us to do proper accounting through the bind map mechanism. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* intel,nir: Lower TXD with min_lod when the sampler index is not < 16 (Jason Ekstrand, 2019-03-04, files: 3, lines: -1/+30)
| | | | | | | | | | | When we have a larger sampler index, we get into the "high sampler" scenario and need an instruction header. Even in SIMD8, this pushes the instruction over the sampler message size maximum of 11 registers. Instead, we have to lower TXD to TXL. Fixes: cb98e0755f8d "intel/fs: Support min_lod parameters on texture..." Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* spirv: OpImageQueryLod requires a sampler (Jason Ekstrand, 2019-03-04, files: 1, lines: -1/+1)
| | | | | | | | | | No idea how this fell through the cracks besides the fact that the sampler bound at 0 almost always works and the CTS isn't amazing. In any case, this appears to have been broken for almost forever. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Cc: [email protected]
* anv: Count surfaces for non-YCbCr images in GetDescriptorSetLayoutSupport (Jason Ekstrand, 2019-03-04, files: 1, lines: -0/+3)
| | | | | | We were accidentally not counting those surfaces. Fixes: ddc4069122 "anv: Implement VK_KHR_maintenance3" Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>