path: root/src
Commit message | Author | Age | Files | Lines
...
* radv: Emit a BATCH_BREAK when changing pixel shaders or CB_TARGET_MASK. | Bas Nieuwenhuizen | 2020-01-07 | 3 | -18/+65
    Fixes a hang on Raven with Resident Evil 2.

    I did not find anything more restricted to fix it:
    - Setting persistent_states_per_bin to 1 fixes it too, but likely does an internal break on any descriptor set changes too.
    - Only breaking the batch when cb_target_mask changes does not fix it (and looking at AMDVLK comments, I suspect the code in radeonsi should really be doing a FLUSH_DFSM).
    - Always doing a FLUSH_DFSM on shader switch helps, but that is more often than this and I don't think we should be doing that when DFSM is disabled.
    - Also emitting the existing break on framebuffer change when DFSM is disabled does not fix the issue.

    Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2315
    CC: <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* mesa/st/i965: add a ProgramResourceHash for quicker resource lookup | Tapani Pälli | 2020-01-07 | 7 | -5/+95
    Many resource APIs require searching by name; add a hash table to make this faster. Currently we traverse the whole resource list for name-based queries; this change makes all these cases use the hash.

    Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2203
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3254>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3254>
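The idea behind the ProgramResourceHash commit can be sketched in a few lines. This is only an illustration of replacing a linear name search with a hash lookup; the `Resource` class and the helper names are hypothetical and are not Mesa's data structures or API.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    # Hypothetical stand-in for a program resource entry.
    name: str
    kind: str


def find_linear(resources, name):
    """Name lookup by walking the whole list (the behaviour being replaced)."""
    for res in resources:
        if res.name == name:
            return res
    return None


def build_name_index(resources):
    """Build a name -> resource table once so later lookups avoid the scan."""
    index = {}
    for res in resources:
        # Keep the first entry for a name, matching what the linear scan returns.
        index.setdefault(res.name, res)
    return index


resources = [
    Resource("color", "uniform"),
    Resource("mvp", "uniform"),
    Resource("BlockB[1]", "ssbo"),
]
index = build_name_index(resources)
```

Both paths return the same entry; the table simply trades a little memory at link time for constant-time queries afterwards.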
* panfrost: Don't double-flip Z/W for 2D arrays | Alyssa Rosenzweig | 2020-01-07 | 1 | -2/+5
    We need to be mindful that we don't clobber the shadow comparator.

    Fixes dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darrayshadow_*

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Account for z/w flip in texelFetch | Alyssa Rosenzweig | 2020-01-07 | 1 | -0/+9
    Required for proper txf of 2D arrays.

    Fixes dEQP-GLES3.functional.shaders.texture_functions.texelfetch.*2darray*

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Tomeu Vizoso <[email protected]>
* panfrost: Adjust for mismatch between hardware/Gallium in arrays/cube | Alyssa Rosenzweig | 2020-01-07 | 1 | -11/+33
    The hardware separates face selection and array indexing, it looks like, whereas Gallium smushes them together with some modulus fun. Let's fix it so mipmapped 2D arrays work without regressing cubemaps.

    Fixes dEQP-GLES3.functional.texture.filtering.2d_array.* among others.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Tomeu Vizoso <[email protected]>
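The "modulus fun" mentioned above can be illustrated as follows. This is a sketch that assumes the combined layer is `array_index * 6 + face`, the usual convention for cube arrays; the function names are hypothetical, and the actual hardware descriptor fields are not shown in the commit.

```python
CUBE_FACES = 6  # a cube map always has six faces


def combine_layer(array_index, face):
    """Pack (array index, face) into one combined layer number (assumed convention)."""
    return array_index * CUBE_FACES + face


def split_layer(layer):
    """Recover (array index, face) from a combined layer via division and modulus."""
    return divmod(layer, CUBE_FACES)
```

For a plain 2D array there is no face component, so the layer is the array index as-is; applying the cube split there would be wrong, which is the kind of mix-up the commit guards against.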
* panfrost: Respect constant buffer_offset | Alyssa Rosenzweig | 2020-01-07 | 1 | -2/+4
    Fixes dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.* among others

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Tomeu Vizoso <[email protected]>
* glsl: use nir version of check_image_resources() for nir linker | Timothy Arceri | 2020-01-07 | 2 | -1/+2
    Reviewed-by: Alejandro Piñeiro <[email protected]>
* glsl: add check_image_resources() for the nir linker | Timothy Arceri | 2020-01-07 | 1 | -0/+38
    This is adapted from the GLSL IR code but doesn't need to iterate over the IR. I believe this also fixes a potential bug in the GLSL IR code, which may count the same output twice.

    Reviewed-by: Alejandro Piñeiro <[email protected]>
* glsl: use nir linker to link atomics | Timothy Arceri | 2020-01-07 | 3 | -3/+15
    Reviewed-by: Alejandro Piñeiro <[email protected]>
* mesa: add new UseNIRGLSLLinker constant | Timothy Arceri | 2020-01-07 | 1 | -0/+3
    This will be used to disable some GLSL IR passes in following patches.

    Reviewed-by: Alejandro Piñeiro <[email protected]>
* glsl: reorder link_and_validate_uniforms() calls | Timothy Arceri | 2020-01-07 | 1 | -1/+1
    This is required for the following commit.

    Reviewed-by: Alejandro Piñeiro <[email protected]>
* glsl: add new gl_nir_link_glsl() helper | Timothy Arceri | 2020-01-07 | 2 | -0/+14
    This will allow us to do some linking in NIR that was previously done by the GLSL IR linker. To start with, this just has calls for linking atomics.

    Reviewed-by: Alejandro Piñeiro <[email protected]>
* glsl: add gl_nir_link_check_atomic_counter_resources() | Timothy Arceri | 2020-01-07 | 2 | -0/+95
    This is pretty much a copy of link_check_atomic_counter_resources() updated to work with the NIR linker.

    Reviewed-by: Alejandro Piñeiro <[email protected]>
* glsl: rename gl_nir_link() to gl_nir_link_spirv() | Timothy Arceri | 2020-01-07 | 4 | -7/+7
    A NIR-based GLSL linking function will be too different from the SPIR-V version to bother attempting any sharing. So let's change the name to be explicit.

    Reviewed-by: Alejandro Piñeiro <[email protected]>
* st/mesa: Lower vars to ssa and constant prop before gl_nir_lower_buffers | Kristian H. Kristensen | 2020-01-06 | 1 | -6/+9
    The gl_nir_lower_buffers pass relies on recognizing the same literal constants as the GLSL compiler so that constant buffer array indices are constant in nir as well. Without this, get_block_array_index() would see

        vec1 32 ssa_723 = deref_var &const_temp@1 (function_temp int)
        vec1 32 ssa_724 = load_const (0x00000001 /* 0.000000 */)
        ...
        vec1 32 ssa_5 = deref_var &const_temp@1 (function_temp int)
        vec1 32 ssa_6 = intrinsic load_deref (ssa_5) (0) /* access=0 */
        vec1 32 ssa_7 = deref_var &blockB (ssbo BlockB[1])
        vec1 32 ssa_8 = deref_array &(*ssa_7)[ssa_6] (ssbo BlockB) /* &blockB[ssa_6] */

    instead of a literal 1, and ultimately generate the block name BlockB[0]. That used to work, since before the previous commits we'd compact the block binding points and names. Thus, there would always be a BlockB[0]. Now, if an entry in a block array isn't used, we don't generate that block name, which means that if entry 0 isn't used, BlockB[0] isn't present and then get_block_array_index() fails to find the block.

    In most cases we would have dealt with this in the call to st_nir_opts() in st_nir_link_shaders(), but in the num_shaders == 1 case (for example, compute) we would call gl_nir_lower_buffers() before we lowered GLSL constants. Move that corner case up next to where we call st_nir_link_shaders() so we call st_nir_opts() at the same point in the flow for all shaders.

    Fixes: dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.18
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* glsl/nir: do not change an element index to have correct block name | Andrii Simiklit | 2020-01-06 | 1 | -1/+0
    When an SSBO array is used with packed layout, the IR tree and, as a result, the NIR tree will be incorrect. In fact, the SSBO dereference indices won't match the array size in some cases like the following:

        "layout(packed, binding=1) buffer SSBO { vec4 a; } ssbo[3];
        out vec4 color;
        void main() { color = ssbo[2].a; }"

    After linking, the IR and then the NIR will have an SSBO array definition with size 1, but the dereference will still have index 2, and linked_shader->Program->sh.ShaderStorageBlocks will contain just the SSBO named "SSBO[2]".

    So this line should be removed, at least as a workaround for now, to avoid an error like:
        Failed to find the block by name "SSBO[0]"

    Fixes: 810dde2a "glsl/nir: Add a pass to lower UBO and SSBO access"
    Signed-off-by: Andrii Simiklit <[email protected]>
    Reviewed-by: Kristian H. Kristensen <[email protected]>
* glsl: fix a binding points assignment for ssbo/ubo arrays | Andrii Simiklit | 2020-01-06 | 3 | -13/+25
    This is needed to be in agreement with spec requirements:
    https://github.com/KhronosGroup/OpenGL-API/issues/46

    Piers Daniell:
    "We discussed this in the OpenGL/ES working group meeting and agreed that eliminating unused elements from the interface block array is not desirable. There is no statement in the spec that this takes place and it would be highly implementation dependent if it happens. If the application has an "interface" in the shader they need to match up with the API it would be quite confusing to have the binding point get compacted. So the answer is no, the binding points aren't affected by unused elements in the interface block array."

    v2:
    - 'original_dim_size' field moved above to keep the struct packed better on 64-bit
    - added a comment for 'total_num_array_elements' field
    - fixed a binding point calculations for SSBOs array of arrays
      ( Ian Romanick <[email protected]> )
    - fixed binding point calculations for non-packed SSBOs

    v3:
    - rename 'total_num_array_elements' to 'aoa_size'
      ( Jason Ekstrand <[email protected]> )
    - rename 'boffset' to 'binding_stride'
      ( Alejandro Piñeiro <[email protected]> )

    Fixes: 8cf1333b "glsl: link uniform block arrays of arrays"
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109532
    Reported-By: Ilia Mirkin <[email protected]>
    Tested-by: Fritz Koenig <[email protected]>
    Signed-off-by: Andrii Simiklit <[email protected]>
    Reviewed-by: Kristian H. Kristensen <[email protected]>
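A small worked example may help make the binding rule concrete. It assumes a one-dimensional block array declared with an explicit base binding; the helper names are hypothetical, and the "compacted" variant only models the behaviour the quote argues against.

```python
def element_binding(base_binding, element_index):
    """Binding of one block-array element: the base binding plus its declared
    index, regardless of which other elements the shader actually uses."""
    return base_binding + element_index


def compacted_binding(base_binding, used_indices, element_index):
    """Model of compaction: unused elements are squeezed out, so an element's
    binding depends on how many used elements come before it."""
    return base_binding + sorted(used_indices).index(element_index)


# layout(binding=1) buffer SSBO { vec4 a; } ssbo[3]; with only ssbo[2] referenced
uncompacted = element_binding(1, 2)
compacted = compacted_binding(1, [2], 2)
```

With the uncompacted rule the application can bind the buffer for `ssbo[2]` at binding 3 without knowing which elements the compiler eliminated; with compaction it would unexpectedly land at binding 1.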
* glsl: fix an incorrect max_array_access after optimization of ssbo/ubo | Andrii Simiklit | 2020-01-06 | 1 | -0/+1
    This is needed to fix these tests:
    piglit.spec.arb_shader_storage_buffer_object.compiler.unused-array-element_frag
    piglit.spec.arb_shader_storage_buffer_object.compiler.unused-array-element_comp

    Fixes: 8cf1333b "glsl: link uniform block arrays of arrays"
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109532
    Reported-By: Ilia Mirkin <[email protected]>
    Tested-by: Fritz Koenig <[email protected]>
    Signed-off-by: Andrii Simiklit <[email protected]>
    Reviewed-by: Kristian H. Kristensen <[email protected]>
* radeonsi: remove TGSI | Marek Olšák | 2020-01-06 | 16 | -5346/+598
    Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: disable SDMA on gfx8 to fix corruption on RX 580 | Marek Olšák | 2020-01-06 | 1 | -0/+5
    Closes: #1399
    Closes: #1889
    Cc: 19.2 19.3 <[email protected]>
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-By: Timur Kristóf <[email protected]>
* radeonsi: move SI and CIK+ SDMA code into 1 common function for cleanups | Marek Olšák | 2020-01-06 | 12 | -190/+104
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-By: Timur Kristóf <[email protected]>
* radeonsi: rename dma_cs -> sdma_cs | Marek Olšák | 2020-01-06 | 10 | -46/+46
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-By: Timur Kristóf <[email protected]>
* radeonsi: add AMD_DEBUG=nodmacopyimage for debugging | Marek Olšák | 2020-01-06 | 3 | -1/+4
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-By: Timur Kristóf <[email protected]>
* radeonsi: add AMD_DEBUG=nodmaclear for debugging | Marek Olšák | 2020-01-06 | 3 | -1/+4
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-By: Timur Kristóf <[email protected]>
* radeonsi: remove broken and unused SI SDMA image copy code | Marek Olšák | 2020-01-06 | 1 | -181/+2
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-By: Timur Kristóf <[email protected]>
* radeonsi: rename SDMA debug flags | Marek Olšák | 2020-01-06 | 4 | -9/+9
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-By: Timur Kristóf <[email protected]>
* panfrost: Handle PIPE_FORMAT_R10G10B10A2_USCALED | Alyssa Rosenzweig | 2020-01-06 | 1 | -0/+2
    Same format code as UINT... might be different in how it's fed into a shader but we'll deal with that when we get there.

    Fixes dEQP-GLES3.functional.vertex_arrays.single_attribute.output_types.usigned_int2_10_10_10.components4_vec2_quads1

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Tomeu Vizoso <[email protected]>
* panfrost: Report MSAA 4x supported for dEQP | Alyssa Rosenzweig | 2020-01-06 | 1 | -1/+10
    Fixes dEQP-GLES3.functional.state_query.integers.max_samples_getinteger64

    We'll have to actually implement multisampling next, but hey.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Tomeu Vizoso <[email protected]>
* panfrost: Cleanup tiling selection logic | Alyssa Rosenzweig | 2020-01-06 | 1 | -13/+14
    Make it a lot more obvious what we're doing and fix more than a few corner cases in the process.

    Fixes dEQP-GLES3.functional.buffer.map.write.render_as_index_array.pixel*, and likely others.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Tomeu Vizoso <[email protected]>
* panfrost: Implement sRGB blend shaders | Alyssa Rosenzweig | 2020-01-06 | 2 | -8/+16
    We use the lowering in nir_format_convert. There are native ops for this, so this is far from optimal and not remotely efficient, but as with most blend shader things right now, it's hard enough to get it working, so let's focus on that for now. We'll make it fast later (once we have GLES3 stable, we can start optimizing these things).

    Fixes dEQP-GLES3.functional.fragment_ops.blend.fbo_srgb.*

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Tomeu Vizoso <[email protected]>
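For context, the conversion that such a lowering has to express is the standard sRGB transfer function. The sketch below is a plain-Python illustration of that formula for a single channel, not the NIR code from the commit, and the function name is hypothetical.

```python
def linear_to_srgb(c):
    """Encode one linear colour channel in [0, 1] with the sRGB transfer function."""
    c = min(max(c, 0.0), 1.0)  # clamp to the representable range first
    if c <= 0.0031308:
        return 12.92 * c  # linear segment near black
    return 1.055 * c ** (1.0 / 2.4) - 0.055  # power-law segment
```

Doing this with generic arithmetic costs a branch (or select) and a power per channel, which is why the commit message calls the lowered version far from optimal compared with native ops.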
* panfrost: Support rendering to non-zero Z/S layers | Alyssa Rosenzweig | 2020-01-06 | 1 | -5/+5
    Fixes abort in STK's shadow implementation.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Tomeu Vizoso <[email protected]>
* panfrost: Texture from Z32F_S8 as R32F | Alyssa Rosenzweig | 2020-01-06 | 1 | -0/+4
    Z32F_S8 becomes Z32F in texturing, which in turn just becomes R32F.

    Fixes dEQP-GLES3.functional.texture.format.sized.*.depth32f_stencil8*

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Tomeu Vizoso <[email protected]>
* iris/query: Implement PIPE_QUERY_GPU_FINISHED | Danylo Piliaiev | 2020-01-06 | 1 | -0/+17
    Implementation is similar to radeonsi in 5f1cef76

    Signed-off-by: Danylo Piliaiev <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: use uint-samplers for sampling stencil buffers | Erik Faye-Lund | 2020-01-06 | 1 | -4/+6
    Otherwise, we end up mismatching the sampler types when rendering.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* ac/surface: use uint16_t for mipmap level pitches | Samuel Pitoiset | 2020-01-06 | 1 | -1/+1
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* etnaviv: fix incorrectly failing vertex size assert | Jonathan Marek | 2020-01-05 | 1 | -1/+1
    Changes the assert to match the comment above.

    This assert was failing in some cases while running darkplaces.

    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Christian Gmeiner <[email protected]>
* lima: fix PP stream terminator size | Vasily Khoruzhick | 2020-01-05 | 1 | -1/+3
    The PP stream terminator size seems to be 4 words. It worked with a full PP stream because we align the stream beginning to 32 bytes and the BO is initialized with zeroes. But with a partial PP stream it sometimes breaks if, for a new PP stream, we reuse a BO that has a non-zero value at this place.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: don't reload and redraw tiles that were not updated | Vasily Khoruzhick | 2020-01-05 | 3 | -7/+67
    We don't need to reload and redraw some tiles if the framebuffer was not cleared and the scissor test was enabled for some of the draws.

    This simple optimization fixes cursor lag in X11.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
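One way to picture the optimization is as tracking which tiles the scissored draws can touch. The sketch below is an assumption-laden illustration, not the lima implementation: the 16-pixel tile size, the rectangle representation (exclusive maximum coordinates), and the function names are all hypothetical, and an unscissored draw is modelled as a rectangle covering the whole framebuffer.

```python
TILE = 16  # assumed tile edge in pixels


def tiles_in_rect(minx, miny, maxx, maxy, tile=TILE):
    """Set of (tile_x, tile_y) covered by a rectangle with exclusive max coords."""
    return {
        (tx, ty)
        for ty in range(miny // tile, (maxy - 1) // tile + 1)
        for tx in range(minx // tile, (maxx - 1) // tile + 1)
    }


def tiles_to_redraw(fb_width, fb_height, cleared, draw_scissors):
    """Tiles that must be reloaded and redrawn for one job.

    A clear touches every tile; otherwise only tiles under some draw's
    scissor rectangle can change, so the rest can be skipped.
    """
    if cleared:
        return tiles_in_rect(0, 0, fb_width, fb_height)
    dirty = set()
    for rect in draw_scissors:
        dirty |= tiles_in_rect(*rect)
    return dirty
```

A cursor update that scissors to a small rectangle then touches only a handful of tiles instead of the whole framebuffer, which is consistent with the cursor-lag fix described above.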
* lima: postpone PP stream generation | Vasily Khoruzhick | 2020-01-05 | 1 | -11/+17
    This commit postpones PP stream generation until the job is submitted. Doing it this late allows us to skip reloading and redrawing tiles that were not updated.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima/parser: Fix VS cmd stream parser | Andreas Baierl | 2020-01-05 | 1 | -2/+2
    prefetch is int, not bool.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
* lima/parser: Fix rsw parser | Andreas Baierl | 2020-01-05 | 1 | -2/+0
    Drop the assert, as it is not necessary and was used incorrectly anyway.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
* anv: Only enable EWA LOD algorithm when doing anisotropic filtering. | Kenneth Graunke | 2020-01-04 | 1 | -1/+2
    Updated documentation renames "Anisotropic Algorithm" to "LOD Algorithm" and adds a note for Gen9+ saying "The EWA Algorithm should only be enabled for Anisotropic Filtering modes." and indicating that the extra accuracy shouldn't be necessary for other modes, and comes at a cost.

    Reviewed-by: Jason Ekstrand <[email protected]>
* iris: Allow HiZ for copy_region sources | Kenneth Graunke | 2020-01-04 | 3 | -5/+18
    Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Allow HiZ for glCopyImageSubData sources | Jason Ekstrand | 2020-01-04 | 1 | -0/+9
    v2 (Ken): Handle platforms without sampler support for HiZ

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]> [v2 changes]
* anv: Allow HiZ in TRANSFER_SRC_OPTIMAL on Gen8-9 | Jason Ekstrand | 2020-01-04 | 2 | -11/+18
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Use the source format when using blorp_copy with HiZ | Jason Ekstrand | 2020-01-04 | 1 | -1/+9
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blorp: Don't resolve HiZ unless we're reinterpreting | Jason Ekstrand | 2020-01-04 | 1 | -1/+1
    This eliminates 50% of pixels (2M) rendered for a blit in CS:GO. This accounts for 3% of pixels rendered in the game. Total GPU clocks for the first 900 frames of CS:GO improve by 1%.

    Tested-by: Mark Janes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* blorp: Allow reading with HiZ | Jason Ekstrand | 2020-01-04 | 2 | -2/+12
    Reviewed-by: Kenneth Graunke <[email protected]>
* blorp: Stop whacking Z24 depth to BGRA8 | Jason Ekstrand | 2020-01-04 | 1 | -11/+0
    The shader code required to do this is int(sat(x) * UINT24_MAX) which isn't really worth all the effort to avoid. Doing the format conversion, on the other hand, prevents us from sampling with HiZ which is something that we very much want on gen8-9 where we can.

    Reviewed-by: Kenneth Graunke <[email protected]>
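The expression quoted in the message is small enough to write out. This is a plain-Python rendering of `int(sat(x) * UINT24_MAX)` for illustration only; the function names are hypothetical and the real code is shader code generated by blorp.

```python
UINT24_MAX = (1 << 24) - 1  # largest value of a 24-bit unsigned integer


def sat(x):
    """Saturate: clamp a float to the [0, 1] range."""
    return min(max(x, 0.0), 1.0)


def float_depth_to_z24(x):
    """Convert a float depth value to a 24-bit unsigned-normalized integer."""
    return int(sat(x) * UINT24_MAX)  # int() truncates toward zero
```

That is one clamp, one multiply, and one conversion per pixel, which supports the commit's point that avoiding it is not worth giving up HiZ sampling.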
* etnaviv: move descriptor based texture structs | Christian Gmeiner | 2020-01-04 | 2 | -40/+31
    This moves the descriptor based texture structs and their helpers into the only user.

    Signed-off-by: Christian Gmeiner <[email protected]>
    Reviewed-by: Jonathan Marek <[email protected]>