Commit message    Author    Age    Files    Lines
* tgsi_to_nir: Fix the TGSI ARR translation by converting the result to int.Timur Kristóf2019-03-051-1/+1
    According to the TGSI spec, ARR needs to do a rounding and then a float-to-integer conversion, which was missing. This patch also makes the rounding a bit more efficient by using nir_fround_even instead of the previous nir_ffloor+nir_fadd trick.

    Signed-Off-By: Timur Kristóf <[email protected]>
    Tested-by: Andre Heider <[email protected]>
    Tested-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
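As a rough illustration of the difference described above (not the actual tgsi_to_nir code), the sketch below models the two rounding strategies in Python. It assumes the old trick was floor(x + 0.5) and that the new path is round-to-nearest-even followed by a float-to-int conversion; the function names arr_old and arr_new are made up for this example.

```python
import math

def arr_old(x: float) -> float:
    # Assumed model of the previous ffloor+fadd trick: floor(x + 0.5).
    # The result stays a float, and ties always round upward.
    return float(math.floor(x + 0.5))

def arr_new(x: float) -> int:
    # Assumed model of fround_even followed by a float-to-int conversion.
    # Python's round() with one argument rounds ties to the nearest even integer.
    return int(round(x))

if __name__ == "__main__":
    for value in (1.2, 2.5, 3.5, -0.5):
        print(value, arr_old(value), arr_new(value))
```

The two versions only disagree on exact ties such as 2.5, and the new one returns an integer type, which is what an address register wants.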
* nir: Add ability for shaders to use window space coordinates.Timur Kristóf2019-03-053-0/+8
    This patch adds a shader_info field that tells the driver to use window space coordinates for a given vertex shader. It also enables this feature in radeonsi (the only NIR-capable driver that supported it in TGSI), and makes tgsi_to_nir aware of it.

    Signed-Off-By: Timur Kristóf <[email protected]>
    Tested-by: Andre Heider <[email protected]>
    Tested-by: Rob Clark <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* v3d: Move the stores for fixed function VS output reads into NIR.Eric Anholt2019-03-055-195/+343
    This lets us emit the VPM_WRITEs directly from nir_intrinsic_store_output() (useful once NIR scheduling is in place so that we can reduce register pressure), and lets future NIR scheduling schedule the math to generate them. Even in the meantime, it looks like this lets NIR DCE some more code and make better decisions.

    total instructions in shared programs: 6429246 -> 6412976 (-0.25%)
    total threads in shared programs: 153924 -> 153934 (<.01%)
    total loops in shared programs: 486 -> 483 (-0.62%)
    total uniforms in shared programs: 2385436 -> 2388195 (0.12%)

    Acked-by: Ian Romanick <[email protected]> (nir)
* v3d: Translate f2i(fround_even) as FTOIN.Eric Anholt2019-03-051-2/+9
    This appears to be just what the opcode does. Needed for equivalence when moving FF VPM stores into NIR.
* nir: Improve printing of load_input/store_output variable names.Eric Anholt2019-03-051-2/+4
    We were printing only when the channel was exactly the start channel, so scalarized loads/stores would be missing the name on the rest.

    Reviewed-by: Ian Romanick <[email protected]>
* anv: Implement VK_EXT_inline_uniform_blockJason Ekstrand2019-03-056-16/+163
    Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: Use the same types for resource indices as pointersJason Ekstrand2019-03-055-32/+79
    We need more space than just a 32-bit scalar and we have to burn all that space anyway, so we may as well expose it to the driver. This also fixes a subtle bug when UBOs and SSBOs have different pointer types.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: Use the generic dereference function for OpArrayLengthJason Ekstrand2019-03-051-1/+1
    With the new deref changes, the old pointer_offset version may not be the right one to call. Just call the generic one and let it sort it out.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: Pull offset/stride from the pointer for OpArrayLengthJason Ekstrand2019-03-051-2/+10
    We can't pull it from the variable type because it might be an array of blocks and not just the one block. While we're here, throw in some error checking.

    Reviewed-by: Lionel Landwerlin <[email protected]>
    Cc: [email protected]
* anv: Add a concept of a descriptor bufferJason Ekstrand2019-03-055-0/+281
    This buffer goes alongside the CPU data structure and may contain pointers, bindless handles, or any other descriptor information. Currently, all descriptors are size zero and nothing goes in the buffer, but this commit sets up the framework we will need later.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Take references to push descriptor set layoutsJason Ekstrand2019-03-051-6/+16
    Technically, descriptor set layouts aren't required to survive past the function they're passed into, so we need to reference them.

    Cc: "19.0" <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Refactor descriptor pushing a bitJason Ekstrand2019-03-051-28/+22
    Pull the common code out of the two entrypoints into the helper which fetches the push descriptor set for us. Now that it does more than just get a thing, call it anv_cmd_buffer_push_descriptor_set.

    Cc: "19.0" <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: drop add_var_binding from anv_nir_apply_pipeline_layout.cJason Ekstrand2019-03-051-7/+2
    It has exactly one caller. Just inline it.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Clean up descriptor set layoutsJason Ekstrand2019-03-053-83/+85
    The descriptor set layout code in our driver has undergone many changes over the years. Some of the fields which were once essential are now useless or nearly so. The has_dynamic_offsets field was completely unused except for the code to set and hash it. The per-stage indices were only being used to determine if a particular binding had images, samplers, etc. The fact that it's per-stage also doesn't matter because that binding should never be accessed by a shader of the wrong stage.

    This commit deletes a pile of cruft and replaces it all with a descriptive bitfield which states what a particular descriptor contains. This merely describes the data available and doesn't necessarily dictate how it will be lowered in anv_nir_apply_pipeline_layout.

    Reviewed-by: Lionel Landwerlin <[email protected]>
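To make the "descriptive bitfield" idea concrete, here is a small Python sketch. Every name in it (DescriptorData, the flag names, the descriptor type strings and the mapping table) is hypothetical and chosen only for illustration; it is not taken from the anv source.

```python
from enum import IntFlag

class DescriptorData(IntFlag):
    # Hypothetical flags: each bit says what data a descriptor carries.
    NONE = 0
    SAMPLER_STATE = 1 << 0
    SURFACE_STATE = 1 << 1
    BUFFER_VIEW = 1 << 2
    IMAGE_PARAM = 1 << 3

# Hypothetical mapping from a descriptor type to the data it contains.
_DATA_FOR_TYPE = {
    "sampler": DescriptorData.SAMPLER_STATE,
    "combined_image_sampler": DescriptorData.SAMPLER_STATE | DescriptorData.SURFACE_STATE,
    "storage_image": DescriptorData.SURFACE_STATE | DescriptorData.IMAGE_PARAM,
    "uniform_buffer": DescriptorData.SURFACE_STATE | DescriptorData.BUFFER_VIEW,
}

def descriptor_data_for_type(desc_type: str) -> DescriptorData:
    # Unknown types carry no data in this toy model.
    return _DATA_FOR_TYPE.get(desc_type, DescriptorData.NONE)

def has_samplers(data: DescriptorData) -> bool:
    # A single bit test replaces a per-stage index lookup.
    return bool(data & DescriptorData.SAMPLER_STATE)
```

With a bitfield like this, a question such as "does this binding have a sampler?" becomes one mask test, independent of shader stage.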
* anv: Count image param entries rather than imagesJason Ekstrand2019-03-055-23/+29
    This is what we're actually storing in the descriptor set and consuming when we bind surface states. This commit renames image_count to image_param_count in a few places and moves the decision to not count image params on gen9+ into anv_descriptor_set.c when we build the layout.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Stop allocating buffer views for dynamic buffersJason Ekstrand2019-03-053-24/+22
    We emit the surface states for those on-the-fly so we don't need the buffer view.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Rework arguments to anv_descriptor_set_write_*Jason Ekstrand2019-03-053-29/+27
    Make them all take a device followed by a set. This is consistent with how the actual Vulkan entrypoint parameters are laid out.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/descriptor_set: Refactor alloc/free of descriptor setsJason Ekstrand2019-03-051-59/+84
    This commit just puts the free list code together as part of the pool instead of having it inlined into the descriptor set create code.

    Reviewed-by: Lionel Landwerlin <[email protected]>
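A pool that owns its own free list might look roughly like the toy below. This is only a sketch of the general pattern (bump allocation with a list of freed holes); the class name, fields and allocation policy are assumptions for illustration, not the anv implementation.

```python
class DescriptorPool:
    """Toy pool: bump allocation plus a free list of (offset, size) holes."""

    def __init__(self, size):
        self.size = size
        self.next = 0          # start of the never-used tail
        self.free_list = []    # holes returned by free()

    def alloc(self, size):
        # Prefer the unused tail of the pool.
        if self.next + size <= self.size:
            offset = self.next
            self.next += size
            return offset
        # Otherwise reuse the first hole that is large enough.
        # Any leftover space in that hole is simply lost in this toy model.
        for i, (offset, hole_size) in enumerate(self.free_list):
            if hole_size >= size:
                del self.free_list[i]
                return offset
        return None  # out of pool memory

    def free(self, offset, size):
        # The pool, not the caller, keeps track of the hole.
        self.free_list.append((offset, size))
```

Keeping alloc and free as methods of the pool means the descriptor set creation path only has to call them, rather than walking the free list itself.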
* v3d: Stop treating exec masking specially.Eric Anholt2019-03-053-14/+3
    In our backend, the successor edges from the blocks only point to where QPU control flow goes, not where the notional control flow goes from a "break" or "continue" modifying the execution mask to resume writing to some channels later. As a result, this attempt at restricting live ranges ended up missing the live range of a value where a conditional break/continue was present in a loop before the later def of a variable.

    The previous commit ended up fixing the problem that the flag tried to solve.

    Fixes glsl-vs-loop-continue.shader_test and/or glsl-vs-loop-redundant-condition.shader_test based on register allocation results.
* v3d: Restrict live intervals to the blocks reachable from any def.Eric Anholt2019-03-052-4/+43
    In the backend, we often have condition codes on writes to variables, such that there's no screening def anywhere and the previous live ranges algorithm would conclude that the start of the range extends to the start of the program. However, we do know that the live range can only extend as early as you can reach from all blocks writing to the variable.

    The motivation was that, while we have a couple of hacks to try to promote conditional writes up to being a def within the block, the exec_mask one was broken and needed a replacement.

    Based on c3c1aa5aeb92 ("intel/fs: Restrict live intervals to the subset possibly reachable from any definition.").
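The reachability idea can be sketched on a toy control-flow graph. The code below is an illustrative model only, not the v3d implementation: blocks are integers, successors is an assumed adjacency map, and the "naive" live set stands in for whatever a liveness pass without screening defs would compute.

```python
from collections import deque

def reachable_from_defs(successors, def_blocks):
    # Breadth-first walk over successor edges, starting at every block
    # that writes the variable (conditionally or not).
    seen = set(def_blocks)
    work = deque(def_blocks)
    while work:
        block = work.popleft()
        for succ in successors.get(block, ()):
            if succ not in seen:
                seen.add(succ)
                work.append(succ)
    return seen

def restrict_live_blocks(naive_live_blocks, successors, def_blocks):
    # A variable cannot be live in a block no def can reach.
    return set(naive_live_blocks) & reachable_from_defs(successors, def_blocks)

if __name__ == "__main__":
    # 0 -> 1 -> 2 -> 3, with a loop back edge 2 -> 1.
    cfg = {0: [1], 1: [2], 2: [1, 3], 3: []}
    # The variable is only written (conditionally) in block 2, so a naive
    # analysis would keep it live all the way back to block 0.
    print(restrict_live_blocks({0, 1, 2, 3}, cfg, {2}))
```

In this example block 0 drops out of the live range, because no path from the writing block leads back to it.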
* gitlab-ci: install distro's ninjaAndres Gomez2019-03-052-11/+3
    Ubuntu Bionic is shipping ninja 1.8.2. Therefore, we do not need to download v1.6.0 manually any more.

    Signed-off-by: Andres Gomez <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* radv: properly align the fence and EOP bug VA on GFX9Samuel Pitoiset2019-03-051-2/+5
    If alignment is 0, offsets returned by radv_cmd_buffer_upload_alloc() are always 0. These two virtual addresses were pointing at the same location.

    Cc: 18.3 19.0 <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
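One plausible way an alignment of 0 collapses every offset to 0 is the usual power-of-two align-up idiom, sketched below in Python. This assumes the allocator rounds its current offset with that idiom; the UploadBuffer class and its fields are hypothetical and only model the behaviour described above.

```python
def align_up(value, alignment):
    # Common power-of-two idiom. With alignment == 0 the mask
    # ~(alignment - 1) is ~(-1) == 0, so the result is always 0.
    return (value + alignment - 1) & ~(alignment - 1)

class UploadBuffer:
    """Hypothetical bump allocator returning aligned offsets."""

    def __init__(self):
        self.offset = 0

    def alloc(self, size, alignment):
        start = align_up(self.offset, alignment)
        self.offset = start + size
        return start

if __name__ == "__main__":
    broken = UploadBuffer()
    print(broken.alloc(8, 0), broken.alloc(8, 0))   # both allocations alias
    fixed = UploadBuffer()
    print(fixed.alloc(8, 8), fixed.alloc(8, 8))     # distinct offsets
```

With a nonzero power-of-two alignment, consecutive allocations get distinct offsets, so two allocations can no longer share one location.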
* radv: allocate enough space in cmdbuf when starting a subpassSamuel Pitoiset2019-03-051-1/+1
    This fixes some CTS crashes with: dEQP-VK.renderpass2.suballocation.attachment_write_mask.attachment_count_8.start_index_*

    Ideally, we should check cmd_buffer->cs->max_dw because there is likely enough space (the internal clear draws allocate space), but keep it that way for consistency.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* vulkan: import vk_layer.h from KhronosEric Engestrom2019-03-051-0/+195
    Instead of relying on the system having it (and the right version).

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* egl: fix libdrm-less buildsEric Engestrom2019-03-052-15/+0
    This function was never used, and isn't properly guarded by HAVE_LIBDRM, breaking the build on systems that don't have libdrm. Let's just remove it.

    Fixes: 7552fcb7b9b98392e6a8 "egl: add base EGL_EXT_device_base implementation"
    Reported-by: Timo Aaltonen <[email protected]>
    Signed-off-by: Eric Engestrom <[email protected]>
    Acked-by: Emil Velikov <[email protected]>
* vulkan: import missing file from KhronosEric Engestrom2019-03-051-0/+66
    Fixes: 114c4aa0c84fc6d00407 "vulkan: update headers/registry to 1.1.102"
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* util: #define PATH_MAX when undefined (eg. Hurd)Eric Engestrom2019-03-051-0/+4
    Cc: Timo Aaltonen <[email protected]>
    Cc: James Clarke <[email protected]>
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* radv: use the platform defines in vk.xml instead of hard-coding themEric Engestrom2019-03-051-4/+7
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: use the platform defines in vk.xml instead of hard-coding themEric Engestrom2019-03-051-4/+7
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: update supported patch versionLionel Landwerlin2019-03-051-1/+1
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* anv: toggle on support for VK_EXT_ycbcr_image_arraysTapani Pälli2019-03-052-0/+8
    We already propagate coord_components correctly and did not have layer restrictions for ycbcr formats.

    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* vulkan: update headers/registry to 1.1.102Lionel Landwerlin2019-03-053-15/+119
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* anv: retain the is_array state in create_plane_tex_instr_implicitTapani Pälli2019-03-051-0/+1
    This does not seem to fix anything ATM but is the right thing to do.

    Signed-off-by: Tapani Pälli <[email protected]>
    Fixes: f3e91e78a33775 ("anv: add nir lowering pass for ycbcr textures")
    Reviewed-by: Lionel Landwerlin <[email protected]>
* meson: avoid going back up the tree with include_directories()Eric Engestrom2019-03-053-4/+3
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* i965: Implement threaded GL support.Kenneth Graunke2019-03-053-0/+51
    Now i965 supports mesa_glthread=true like Gallium drivers do.

    According to Markus (degasus), the Citra emulator now runs ~30% faster. Emmanuel (linkmauve) also reported that the Dolphin emulator improved by 2.8x on one game. (Both of those still need to be added to drirc.)

    An Intel Mesa CI run with mesa_glthread=true appears to be happy.

    Bioshock Infinite's benchmark mode seems to be around 15-20% faster on my Skylake GT4 at 1920x1080.

    Tested-by: Markus Wick <[email protected]>
    Tested-by: Emmanuel Gil Peyrot <[email protected]>
    Tested-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* anv/pipeline: Drop anv_fill_binding_tableJason Ekstrand2019-03-041-26/+0
    We zero out the prog data anyway and, now that bias is always zero, this function is accomplishing nothing.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Use an actual binding for gl_NumWorkgroupsJason Ekstrand2019-03-043-31/+33
    This commit moves our handling of gl_NumWorkgroups over to work like our handling of other special bindings in the Vulkan driver. We give it a magic descriptor set number and teach emit_binding_tables to handle it. This is better than the bias mechanism we were using because it allows us to do proper accounting through the bind map mechanism.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* intel,nir: Lower TXD with min_lod when the sampler index is not < 16Jason Ekstrand2019-03-043-1/+30
    When we have a larger sampler index, we get into the "high sampler" scenario and need an instruction header. Even in SIMD8, this pushes the instruction over the sampler message size maximum of 11 registers. Instead, we have to lower TXD to TXL.

    Fixes: cb98e0755f8d "intel/fs: Support min_lod parameters on texture..."
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* spirv: OpImageQueryLod requires a samplerJason Ekstrand2019-03-041-1/+1
    No idea how this fell through the cracks besides the fact that the sampler bound at 0 almost always works and the CTS isn't amazing. In any case, this appears to have been broken for almost forever.

    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Cc: [email protected]
* anv: Count surfaces for non-YCbCr images in GetDescriptorSetLayoutSupportJason Ekstrand2019-03-041-0/+3
    We were accidentally not counting those surfaces.

    Fixes: ddc4069122 "anv: Implement VK_KHR_maintenance3"
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: Allow [i/u]mulExtended to use new nir opcodeSagar Ghuge2019-03-041-6/+10
    Use the new nir opcode nir_[i/u]mul_2x32_64 and extract the lower and higher 32 bits as needed instead of emitting mul and mul_high.

    v2: Surround the switch case with curly braces (Jason Ekstrand)

    Signed-off-by: Sagar Ghuge <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
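The arithmetic behind this can be modelled in a few lines of Python. This is a sketch of the math only, under the assumption that the 2x32_64 opcodes compute a full 32x32->64 product; the helper names mirror the opcode names for readability but are ordinary Python functions written for this example.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x):
    # Reinterpret a 32-bit pattern as a two's-complement signed value.
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def umul_2x32_64(a, b):
    # Unsigned 32x32 -> 64 multiply, returned as a 64-bit pattern.
    return ((a & MASK32) * (b & MASK32)) & MASK64

def imul_2x32_64(a, b):
    # Signed 32x32 -> 64 multiply, returned as a 64-bit pattern.
    return (to_signed32(a) * to_signed32(b)) & MASK64

def mul_extended(product64):
    # Split one 64-bit product into (high 32 bits, low 32 bits),
    # instead of computing mul and mul_high separately.
    return (product64 >> 32) & MASK32, product64 & MASK32

if __name__ == "__main__":
    print([hex(v) for v in mul_extended(umul_2x32_64(0xFFFFFFFF, 0xFFFFFFFF))])
    print([hex(v) for v in mul_extended(imul_2x32_64(0xFFFFFFFF, 2))])
```

Both halves come from a single multiply, which is the saving the commit message describes.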
* nir/algebraic: Optimize low 32 bit extractionSagar Ghuge2019-03-041-0/+2
    Optimize the situation where we only need the lower 32 bits of a 64-bit result.

    Signed-off-by: Sagar Ghuge <[email protected]>
    Suggested-by: Matt Turner <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: [u/i]mulExtended optimization for GLSLSagar Ghuge2019-03-047-4/+126
    Optimize mulExtended to use 32x32->64 multiplication.

    Drivers which are not based on NIR can set the MUL64_TO_MUL_AND_MUL_HIGH lowering flag in order to keep the old behavior.

    v2: Add missing condition check (Jason Ekstrand)

    Signed-off-by: Sagar Ghuge <[email protected]>
    Suggested-by: Matt Turner <[email protected]>
    Suggested-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir/glsl: Add another way of doing lower_imul64 for gen8+Sagar Ghuge2019-03-046-6/+47
    On Gen 8 and 9, the "mul" instruction supports a 64-bit destination type. We can reduce our 64x64 int multiplication from 4 instructions to 3. Also, instead of emitting two mul instructions, we can emit a single mul instruction and extract the low/high 32 bits from the 64-bit result for [i/u]mulExtended.

    v2: 1) Allow lower_mul_high64 to use new opcode (Jason Ekstrand)
        2) Add lower_mul_2x32_64 flag (Matt Turner)
        3) Remove associative property as bit size is different (Connor Abbott)

    v3: Fix indentation and variable naming convention (Jason Ekstrand)

    Signed-off-by: Sagar Ghuge <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
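One way to read the "4 instructions to 3" remark is sketched below: with a 32x32->64 multiply available, the low-by-low product no longer needs a separate mul_high. This is an interpretation written for illustration, not the actual NIR lowering code, and the function name is made up.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def mul64_via_three_muls(a, b):
    # Split both 64-bit operands into 32-bit halves.
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Multiply 1: full 32x32 -> 64 product of the low halves.
    # Without such an opcode this would take a mul plus a mul_high.
    full = a_lo * b_lo

    # Multiplies 2 and 3: cross terms. Only their low 32 bits matter,
    # because they land in the upper half of the 64-bit result.
    cross = (a_lo * b_hi + a_hi * b_lo) & MASK32

    hi = ((full >> 32) + cross) & MASK32
    return (hi << 32) | (full & MASK32)

if __name__ == "__main__":
    x, y = 0x123456789ABCDEF0, 0x0FEDCBA987654321
    print(hex(mul64_via_three_muls(x, y)), hex((x * y) & MASK64))
```

The a_hi * b_hi term is dropped entirely because it only affects bits 64 and above, which a 64-bit result discards.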
* st/nine: Ignore multisample quality level if no msAxel Davy2019-03-041-0/+4
    Apparently, instead of returning an error when a quality level other than 0 is passed for D3DMULTISAMPLE_NONE, we should accept it.

    Fixes: https://github.com/iXit/Mesa-3D/issues/340

    Cc: [email protected]

    Signed-off-by: Axel Davy <[email protected]>
* st/nine: Ignore window size if errorAxel Davy2019-03-041-1/+8
    Check GetWindowInfo and ignore the computed sizes if there is an error.

    Fixes a regression caused by an earlier commit when using old wine gallium nine patches. Should also address a crash at window destruction.

    Related issues:
    https://github.com/iXit/Mesa-3D/issues/331
    https://github.com/iXit/Mesa-3D/issues/332

    Cc: [email protected]

    Fixes: 2318ca68bbe ("st/nine: Handle window resize when a presentation buffer is used")

    Signed-off-by: Axel Davy <[email protected]>
* android: anv: fix libexpat shared dependencyMauro Rossi2019-03-041-1/+1
    Fixes undefined reference building errors for XML_* functions.

    Signed-off-by: Mauro Rossi <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
    Cc: "19.0" <[email protected]>
* android: anv: fix generated files dependencies (v2)Mauro Rossi2019-03-041-15/+25
    Fix anv_entrypoints.{c,h} and anv_extensions.{c,h} missing dependencies.
    Rename the variable labels according to targets and python scripts.
    Align the building rules as per Automake for simplification.

    Fixes building errors during rebuilds due to missing dependencies.

    (v2) Fixed a missing $(VULKAN_API_XML) reference

    Fixes: 9a508b7 ("android: anv/extensions: fix generated sources build")
    Fixes: dd088d4bec7 ("anv/extensions: Generate a header file with extension tables")
    Signed-off-by: Mauro Rossi <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Cc: "19.0" <[email protected]>
* st/wgl: init a variable to silence MinGW warningBrian Paul2019-03-041-1/+1
    MinGW release build says 'value' may be used before being initialized.

    Reviewed-by: Neha Bhende <[email protected]>
* svga: silence array out of bounds warningBrian Paul2019-03-041-1/+1
    MinGW release build complains about a possible out-of-bounds array access. Test i < 4 to silence it.

    Reviewed-by: Neha Bhende <[email protected]>
    Reviewed-by: Mathias Fröhlich <[email protected]>