Commit message | Author | Age | Files | Lines
* turnip: implement CmdClearAttachments | Jonathan Marek | 2019-12-04 | 1 | -1/+65
    Passes these deqp tests: dEQP-VK.api.image_clearing.core.*attach*single*
    Signed-off-by: Jonathan Marek
    Reviewed-by: Eric Anholt
* turnip: don't skip unused attachments when setting up tiling config | Jonathan Marek | 2019-12-04 | 1 | -18/+10
    This makes it easier to find the gmem_offset associated with an attachment.
    Signed-off-by: Jonathan Marek
    Reviewed-by: Eric Anholt
* lima: enable tiling | Vasily Khoruzhick | 2019-12-04 | 1 | -11/+30
    Now that the tiled format modifier has been merged into Linux, we can enable tiling. That should improve overall performance and also work around broken mipmapping for linear textures, since we now prefer tiled textures.
    Reviewed-by: Alyssa Rosenzweig
    Reviewed-by: Andreas Baierl
    Signed-off-by: Vasily Khoruzhick
* glsl: additional interface redeclaration check for SSO programs | Tapani Pälli | 2019-12-04 | 1 | -0/+54
    This patch adds an additional linker check for SSO programs to make sure they redeclare built-in blocks as required by the desktop spec.
    This fixes the following Piglit tests: arb_separate_shader_objects/linker/pervertex-*
    Signed-off-by: Tapani Pälli
    Reviewed-by: Timothy Arceri
* gitlab-ci: bump piglit checkout commit | Tapani Pälli | 2019-12-04 | 4 | -8/+25
    This commit also updates the Piglit quick_gl.txt; the list changed due to the following Piglit commits: c248bf201, cacff58ca, 5603e2e60.
    Signed-off-by: Tapani Pälli
    Reviewed-by: Timothy Arceri
* nir/load_store_vectorize: fix combining stores with aliasing loads between | Rhys Perry | 2019-12-04 | 2 | -2/+16
    v2: add test
    Fixes: ce9205c03bd ('nir: add a load/store vectorization pass')
    Signed-off-by: Rhys Perry
    Reviewed-by: Daniel Schürmann (v1)
    Reviewed-by: Connor Abbott (v2)
* aco/wave32: Fix reductions. | Timur Kristóf | 2019-12-04 | 3 | -30/+45
    Signed-off-by: Timur Kristóf
    Reviewed-by: Daniel Schürmann
* aco/wave32: Allow setting the subgroup ballot size to 64-bit. | Timur Kristóf | 2019-12-04 | 2 | -4/+8
    Previously, it would only work when the ballot size was set to the lane mask. This patch makes it possible to set the ballot size to either 32-bit or 64-bit for wave32 mode.
    Signed-off-by: Timur Kristóf
    Reviewed-by: Daniel Schürmann
* aco/wave32: Use wave_size for barrier intrinsic. | Timur Kristóf | 2019-12-04 | 2 | -3/+3
    Signed-off-by: Timur Kristóf
    Reviewed-by: Rhys Perry
    Reviewed-by: Daniel Schürmann
* aco/wave32: Fix load_local_invocation_index to support wave32. | Timur Kristóf | 2019-12-04 | 1 | -3/+15
    Signed-off-by: Timur Kristóf
    Reviewed-by: Daniel Schürmann
* aco/wave32: Use lane mask regclass for exec/vcc. | Timur Kristóf | 2019-12-04 | 12 | -209/+250
    Currently all usages of exec and vcc are hardcoded to use s2 regclass. This commit makes it possible to use s1 in wave32 mode and s2 in wave64 mode.
    Signed-off-by: Timur Kristóf
    Reviewed-by: Daniel Schürmann
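The choice described in this commit can be illustrated with a minimal C sketch. It is not ACO code (ACO is written in C++); the enum and function names below are hypothetical stand-ins, and the only facts taken from the commit message are that s1 is used in wave32 mode and s2 in wave64 mode.

```c
#include <assert.h>

/* Hypothetical stand-ins for the scalar register classes named in the
 * commit message: s1 = one 32-bit SGPR, s2 = a pair of SGPRs (64 bits). */
enum sketch_reg_class { SKETCH_S1 = 1, SKETCH_S2 = 2 };

/* Pick the register class wide enough to hold one bit per lane. */
static enum sketch_reg_class sketch_lane_mask_class(unsigned wave_size)
{
    assert(wave_size == 32 || wave_size == 64);
    return wave_size == 32 ? SKETCH_S1 : SKETCH_S2;
}

/* Number of 32-bit SGPRs that exec or vcc would occupy for this wave size. */
static unsigned sketch_lane_mask_dwords(unsigned wave_size)
{
    return (unsigned)sketch_lane_mask_class(wave_size);
}
```

Deriving the class from the wave size in one place, rather than hardcoding s2 at every use, is the kind of change the diffstat (12 files) suggests.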
* aco/wave32: Add wave size specific opcodes to aco_builder. | Timur Kristóf | 2019-12-04 | 1 | -0/+78
    In several places in ACO we use SOP1 or SOP2 instructions to operate over the exec mask or VCC, and these need to be adapted to the new size in wave32 mode.
    This commit adds a way to deal with this problem in aco_builder: the caller can specify a wave size specific opcode and the builder will translate that to the correct opcode based on the current wave size.
    Signed-off-by: Timur Kristóf
    Reviewed-by: Daniel Schürmann
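A rough C sketch of the translation idea follows. The enums and the function are hypothetical and do not reflect aco_builder's actual interface; the concrete opcode names are modelled on the 32-bit and 64-bit forms of the scalar AND, OR and MOV instructions.

```c
#include <assert.h>

/* Hypothetical wave-size-agnostic opcodes a caller might request. */
enum sketch_wave_op { SKETCH_W_AND, SKETCH_W_OR, SKETCH_W_MOV };

/* Hypothetical concrete opcodes, one per operand width. */
enum sketch_hw_op {
    SKETCH_S_AND_B32, SKETCH_S_AND_B64,
    SKETCH_S_OR_B32,  SKETCH_S_OR_B64,
    SKETCH_S_MOV_B32, SKETCH_S_MOV_B64
};

/* Translate a wave-size-agnostic opcode into the concrete one that
 * matches the current wave size (32 lanes -> b32, 64 lanes -> b64). */
static enum sketch_hw_op sketch_translate(enum sketch_wave_op op,
                                          unsigned wave_size)
{
    static const enum sketch_hw_op table[][2] = {
        [SKETCH_W_AND] = { SKETCH_S_AND_B32, SKETCH_S_AND_B64 },
        [SKETCH_W_OR]  = { SKETCH_S_OR_B32,  SKETCH_S_OR_B64  },
        [SKETCH_W_MOV] = { SKETCH_S_MOV_B32, SKETCH_S_MOV_B64 },
    };
    assert(wave_size == 32 || wave_size == 64);
    return table[op][wave_size == 64];
}
```

With a table like this, call sites name the operation once and never branch on the wave size themselves.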
* aco/wave32: Introduce emit_mbcnt which takes wave size into account. | Timur Kristóf | 2019-12-04 | 1 | -17/+24
    This is relevant because in wave32 mode the v_mbcnt_hi_u32_b32 instruction is superfluous.
    Signed-off-by: Timur Kristóf
    Reviewed-by: Daniel Schürmann
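To see why the high half is superfluous in wave32 mode, here is a small software model in C. It assumes the usual mbcnt semantics (count the set mask bits belonging to lanes below the current lane, plus an accumulator); all function names are hypothetical and this is not the ACO implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Count set bits in a 32-bit word. */
static uint32_t sketch_popcount32(uint32_t x)
{
    uint32_t n = 0;
    for (; x != 0; x &= x - 1)
        n++;
    return n;
}

/* Model of the low-half count: mask bits 0..31 below `lane`, plus `base`. */
static uint32_t sketch_mbcnt_lo(uint32_t mask_lo, uint32_t lane, uint32_t base)
{
    uint32_t below = lane >= 32 ? 0xffffffffu : (1u << lane) - 1u;
    return base + sketch_popcount32(mask_lo & below);
}

/* Model of the high-half count: mask bits 32..63 below `lane`, plus `base`. */
static uint32_t sketch_mbcnt_hi(uint32_t mask_hi, uint32_t lane, uint32_t base)
{
    uint32_t below = lane <= 32 ? 0u : (1u << (lane - 32)) - 1u;
    return base + sketch_popcount32(mask_hi & below);
}

/* Wave-size-aware count: wave32 has no lanes 32..63, so the high-half
 * step would always add zero and is skipped. */
static uint32_t sketch_mbcnt(uint64_t mask, uint32_t lane, unsigned wave_size)
{
    assert(wave_size == 32 || wave_size == 64);
    assert(lane < wave_size);
    uint32_t count = sketch_mbcnt_lo((uint32_t)mask, lane, 0);
    if (wave_size == 64)
        count = sketch_mbcnt_hi((uint32_t)(mask >> 32), lane, count);
    return count;
}
```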
* aco/wave32: Replace hardcoded numbers in spiller with wave size. | Timur Kristóf | 2019-12-04 | 1 | -15/+16
    Signed-off-by: Timur Kristóf
    Reviewed-by: Daniel Schürmann
* aco/wave32: Change uniform bool optimization to work with wave32. | Timur Kristóf | 2019-12-04 | 1 | -1/+2
    Signed-off-by: Timur Kristóf
    Reviewed-by: Rhys Perry
    Reviewed-by: Daniel Schürmann
* aco: Optimize load_subgroup_id to one bit field extract instruction. | Timur Kristóf | 2019-12-04 | 1 | -3/+2
    Signed-off-by: Timur Kristóf
    Reviewed-by: Rhys Perry
    Reviewed-by: Daniel Schürmann
* aco: Remove lower_linear_bool_phi, it is not needed anymore. | Timur Kristóf | 2019-12-04 | 1 | -24/+1
    Signed-off-by: Timur Kristóf
    Reviewed-by: Rhys Perry
    Reviewed-by: Daniel Schürmann
* aco: Remove superfluous argument from emit_boolean_logic. | Timur Kristóf | 2019-12-04 | 1 | -6/+6
    Signed-off-by: Timur Kristóf
    Reviewed-by: Rhys Perry
    Reviewed-by: Daniel Schürmann
* aco: Fix operand of s_bcnt1_i32_b64 in emit_boolean_reduce. | Timur Kristóf | 2019-12-04 | 1 | -1/+1
    Signed-off-by: Timur Kristóf
    Reviewed-by: Rhys Perry
    Reviewed-by: Daniel Schürmann
* gitlab-ci: Run piglit glslparser & quick_shader tests separately | Michel Dänzer | 2019-12-04 | 4 | -6512/+5328
    And only use --process-isolation false for the quick_gl tests.
    This will hopefully avoid variance in the test results that we've been seeing lately. But even if it doesn't, it should at least help narrow down the cause of the variance.
    Tested-by: Vasily Khoruzhick
    Acked-by: Pierre-Eric Pelloux-Prayer
* intel/perf: fix improper pointer access | Lionel Landwerlin | 2019-12-04 | 1 | -1/+1
    This expression was unused by the macro, which is probably why it didn't register in the compilation.
    Signed-off-by: Lionel Landwerlin
    Cc: <[email protected]>
    Reviewed-by: Mark Janes
    Reviewed-by: Kenneth Graunke
* intel/perf: simplify the processing of OA reports | Lionel Landwerlin | 2019-12-04 | 1 | -28/+36
    This is a more accurate description of what happens in processing the OA reports. Previously we only had a somewhat difficult to parse state machine tracking the context ID.
    All we really need to check to decide whether the delta between 2 reports (r0 & r1) should be accumulated in the query result is:
      * whether r0 is tagged with the context ID relevant to us
      * if r0 is not tagged with our context ID and r1 is: does r0 have an invalid context ID? If not then we're in a case where i915 has resubmitted the same context for execution through the execlist submission port
    v2: Update comment (Ken)
    Signed-off-by: Lionel Landwerlin
    Cc: <[email protected]>
    Reviewed-by: Kenneth Graunke
* intel/perf: take into account that reports read can be fairly old | Lionel Landwerlin | 2019-12-04 | 1 | -3/+4
    If we read the OA reports late enough after the query happens, we can get a timestamp in the report that is significantly in the past compared to the start timestamp of the query. The current code must deal with the wraparound of the timestamp value (every ~6 minutes). So consider that if the difference is greater than half that wraparound period, we're probably dealing with an old report, and make the caller aware it should read more reports when they're available.
    Signed-off-by: Lionel Landwerlin
    Cc: <[email protected]>
    Reviewed-by: Mark Janes
    Reviewed-by: Kenneth Graunke
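The half-period heuristic described here can be sketched in C as follows. This is an illustration only, not the intel/perf code: the function name is hypothetical, and the timestamp width is passed as a parameter because the message does not state it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true when `report_ts` is more than half a wraparound period
 * ahead of `start_ts` in modular arithmetic, which is read here as
 * "the report actually predates the query start". `ts_bits` is the
 * assumed width of the wrapping timestamp counter. */
static bool sketch_report_is_stale(uint32_t start_ts, uint32_t report_ts,
                                   unsigned ts_bits)
{
    assert(ts_bits >= 2 && ts_bits <= 32);
    uint32_t mask = ts_bits == 32 ? UINT32_MAX : (1u << ts_bits) - 1u;
    uint32_t half_period = 1u << (ts_bits - 1);

    /* Unsigned subtraction wraps, so this is the forward distance
     * from start_ts to report_ts modulo the wraparound period. */
    uint32_t delta = (report_ts - start_ts) & mask;
    return delta > half_period;
}
```

A caller that gets `true` would treat the report as old and keep reading, as the commit message describes; a report that legitimately wrapped past zero shortly after the start still yields a small forward distance and is accepted.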
* intel/perf: set read buffer len to 0 to identify empty buffer | Lionel Landwerlin | 2019-12-04 | 1 | -2/+3
    We always add an empty buffer in the list when creating the query. Let's set the len appropriately so that we can recognize it when we read OA reports up to the end of a query.
    We were using a 0 timestamp value associated with the empty buffer and incorrectly assuming this was a valid value. In turn that led to not reading enough reports and resulted in deltas added to our counter values which should have been discarded because those would be flagged for a different context.
    Signed-off-by: Lionel Landwerlin
    Cc: <[email protected]>
    Reviewed-by: Mark Janes
    Reviewed-by: Kenneth Graunke
* intel/perf: fix invalid hw_id in query results | Lionel Landwerlin | 2019-12-04 | 1 | -2/+6
    Accumulation happens between 2 reports; it can be between a start/end report from another context. So only consider updating the hw_id of the results when it's not already valid and we have a valid value to put in there.
    Signed-off-by: Lionel Landwerlin
    Fixes: 41b54b5faf ("i965: move OA accumulation code to intel/perf")
    Reviewed-by: Mark Janes
    Reviewed-by: Kenneth Graunke
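The update rule in this message amounts to a guarded assignment. The C sketch below shows one way to express it; the struct, the function name, and the sentinel value used for "invalid" are all assumptions made for illustration, not taken from the intel/perf sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel meaning "no valid context ID recorded". */
#define SKETCH_INVALID_CTX_ID 0xffffffffu

/* Hypothetical, reduced view of a query result. */
struct sketch_query_result {
    uint32_t hw_id;
};

/* Record the context ID from a report only if the result does not
 * already hold a valid one and the report's ID is itself valid. */
static void sketch_update_hw_id(struct sketch_query_result *result,
                                uint32_t report_ctx_id)
{
    assert(result != NULL);
    if (result->hw_id == SKETCH_INVALID_CTX_ID &&
        report_ctx_id != SKETCH_INVALID_CTX_ID)
        result->hw_id = report_ctx_id;
}
```

Once a valid ID is stored, later reports from another context no longer overwrite it.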
* radeonsi: display cs blit count for AMD_DEBUG=testdma | Pierre-Eric Pelloux-Prayer | 2019-12-04 | 1 | -3/+5
    Reviewed-by: Marek Olšák
* radeonsi: implement sdma for GFX9 | Pierre-Eric Pelloux-Prayer | 2019-12-04 | 1 | -6/+191
    Reviewed-by: Marek Olšák
* radv/gfx10: fix the vertex order for triangle strips emitted by a GS | Samuel Pitoiset | 2019-12-04 | 1 | -48/+47
    My fix wasn't totally correct as pointed out by Marek. Ported from RadeonSI.
    Fixes: deafe4cc587 ("radv/gfx10: fix primitive indices orientation for NGG GS")
    Signed-off-by: Samuel Pitoiset
    Reviewed-by: Bas Nieuwenhuizen
* radv: simplify a check in radv_fixup_vertex_input_fetches() | Samuel Pitoiset | 2019-12-04 | 1 | -4/+2
    The number of loaded channels should always be > 0 now.
    Signed-off-by: Samuel Pitoiset
    Reviewed-by: Bas Nieuwenhuizen
* radv: remove dead shader input/output variables | Samuel Pitoiset | 2019-12-04 | 1 | -1/+1
    No pipeline-db changes.
    Signed-off-by: Samuel Pitoiset
    Reviewed-by: Bas Nieuwenhuizen
* iris: Stop setting up fake params | Jason Ekstrand | 2019-12-04 | 2 | -13/+6
    In d1c4e64a69e, we added a parameter to tell the back-end compiler to ignore the param array and just push however many constants you ask it to push. Iris doesn't want to push anything so it gives a bogus number of parameters and trusts the back-end compiler to dead-code all of them. Now that we can tell the back-end compiler to stop re-arranging things, delete the hack and enable the new simpler code path.
    Reviewed-by: Kenneth Graunke
* gallium/scons: fix graw-xlib build on OSX. | Dave Airlie | 2019-12-04 | 1 | -0/+2
    Fixes: 44a6b0107b37 (gallivm: add nir->llvm translation (v2))
    Tested-by: Vinson Lee
* llvmpipe: enable texcoord semantics | Dave Airlie | 2019-12-04 | 2 | -10/+18
    To make NIR transitioning easier, move the driver to using texcoord semantics.
    Reviewed-by: Eric Anholt
* anv: Respect the always_flush_cache driconf option | Jason Ekstrand | 2019-12-03 | 3 | -0/+12
    Reviewed-by: Lionel Landwerlin
* gallium/swr: Fix crash when use GL_TDFX_texture_compression_FXT1 format. | Krzysztof Raszkowski | 2019-12-03 | 1 | -1/+3
    Reject the new formats in swr to prevent crashes because it doesn't know how to handle the new formats.
    Reviewed-by: Jan Zielinski
* gitlab-ci: disable junit results for deqp | Rob Clark | 2019-12-03 | 1 | -3/+0
    They don't seem to be hugely useful, and seem to be bogging down gitlab.
    Signed-off-by: Rob Clark
* anv: Set up SBE_SWIZ properly for gl_Viewport | Jason Ekstrand | 2019-12-03 | 1 | -2/+2
    gl_Viewport is also in the VUE header so we need to whack the read offset to 0 and emit a default (no overrides) SBE_SWIZ entry in that case as well.
    Cc: [email protected]
    Reviewed-by: Lionel Landwerlin
* gitlab-ci: Update to current ci-templates master | Michel Dänzer | 2019-12-03 | 2 | -2/+2
    Fixes skopeo copy failures.
    Reviewed-by: Pierre-Eric Pelloux-Prayer
* ac/llvm: fix atomic var operations if source isn't a deref | Samuel Pitoiset | 2019-12-03 | 1 | -7/+9
    Fixes some CTS regressions.
    Fixes: e61a826f396 ("ac/llvm: fix pointer type for global atomics")
    Signed-off-by: Samuel Pitoiset
    Reviewed-by: Bas Nieuwenhuizen
* Add support for T820 CI Jobs | Neil Armstrong | 2019-12-03 | 5 | -3/+75
    Tomeu:
      - Small rebase fixups
    Signed-off-by: Neil Armstrong
    Signed-off-by: Tomeu Vizoso
* gallivm/llvmpipe: add support for front facing in sysval. | Dave Airlie | 2019-12-03 | 6 | -1/+14
    This wires up the front facing value as a sysval. I'd like to remove the other facing code but I'd need to confirm VMware don't use it first.
    Reviewed-by: Marek Olšák
* llvmpipe/images: handle undefined atomic without crashing | Dave Airlie | 2019-12-03 | 1 | -2/+10
    Just return 0 for unbound atomic operations.
    Reviewed-by: Marek Olšák
* panfrost: Remove blend shader hack | Alyssa Rosenzweig | 2019-12-03 | 2 | -5/+1
    This is no longer used.
    Signed-off-by: Alyssa Rosenzweig
* gitlab-ci: Test Panfrost on T720 GPUs | Tomeu Vizoso | 2019-12-03 | 5 | -4/+63
    Now that the Mali T720 GPU is supported at the same level as the T760, test it on PINE64 H64 boards.
    Signed-off-by: Tomeu Vizoso
    Reviewed-by: Alyssa Rosenzweig
* gitlab-ci: Remove non-default skips from Panfrost | Alyssa Rosenzweig | 2019-12-03 | 4 | -103/+9
    During the past months, Panfrost has matured considerably and several tests stopped being flaky or failing at all.
    Signed-off-by: Alyssa Rosenzweig
    Signed-off-by: Tomeu Vizoso
    Reviewed-by: Alyssa Rosenzweig
* panfrost: White list the Mali T720 | Tomeu Vizoso | 2019-12-03 | 1 | -0/+1
    Support for this GPU is equal now to that of T760, so whitelist it.
    Signed-off-by: Tomeu Vizoso
    Reviewed-by: Alyssa Rosenzweig
* pan/midgard: Splatter on fragment out | Alyssa Rosenzweig | 2019-12-03 | 1 | -1/+20
    Make sure that the fragment is complete when writing it out.
    Signed-off-by: Alyssa Rosenzweig
    Signed-off-by: Tomeu Vizoso
* panfrost: Simplify shader patching | Tomeu Vizoso | 2019-12-03 | 1 | -41/+19
    We need to always upload anyway.
    Signed-off-by: Tomeu Vizoso
    Reviewed-by: Alyssa Rosenzweig
* panfrost: Simplify draw_flags | Alyssa Rosenzweig | 2019-12-03 | 1 | -11/+2
    Fixes dEQP-GLES3.functional.primitive_restart.*. Note the 0x18000 value is somehow accidentally enabling primitive restart. I'm not sure where this value came from but let's not.
    Signed-off-by: Alyssa Rosenzweig
    Reviewed-by: Tomeu Vizoso
* panfrost: Implement pan_tiler for non-hierarchy GPUs | Alyssa Rosenzweig | 2019-12-03 | 6 | -136/+106
    The algorithm is as described. Nothing fancy here, just need to add some new code paths depending on which model we're running on.
    Tomeu:
      - Also disable tiling when !hierarchy and !vertex_count
      - Avoid creating polygon lists smaller than the minimum when vertex_count > 0 but the tile size is smaller than 16 bytes
      - Take into account tile size when calculating polygon list size for !hierarchy
      - Allow 0-sized tiles in a single dimension
    Signed-off-by: Alyssa Rosenzweig
    Signed-off-by: Tomeu Vizoso