path: root/src
Commit message | Author | Age | Files | Lines
* iris: separating out common perf code | Mark Janes | 2019-12-10 | 5 | -92/+157
    The configuration of the gen_perf vtable will be the same for
    INTEL_performance_query and AMD_performance_monitor. Initialize the
    table in a single routine that can be called from both implementations.

    Signed-off-by: Dongwon Kim <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* gallium: enable INTEL_PERFORMANCE_QUERY | Dongwon Kim | 2019-12-10 | 7 | -0/+341
    New state tracker APIs added for INTEL_performance_query.

    This extension is enabled if all vendor-specific functions for it exist.

    v2: add st_cb_perfquery.* to the list of sources in Makefile
    v3: minor code clean-up
    v4: - add driver hooks for the intel-performance-query APIs
        - add PIPE-level performance counter and type enums that match the
          OpenGL enums
        - convert the pipe_perf_counter_type and pipe_perf_counter_data_type
          enums to GL defines in state_tracker

    Signed-off-by: Dongwon Kim <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* meson/broadcom: libbroadcom_cle also needs zlib | Dylan Baker | 2019-12-11 | 1 | -1/+1
    Fixes: 1ae8018a6af81eec4832a57d9d0346aa3dd98d28 ("meson: Add support for the vc4 driver.")
    Reviewed-by: Eric Anholt <[email protected]>
* anv: Enable Gen11 Color/Z write merging optimization | Kenneth Graunke | 2019-12-10 | 1 | -0/+12
    TCCNTLREG contains additional L3 cache write merging optimizations.
    The default value on my system appears to be:

    - URB Partial Write Merging (bit 0)
    - L3 Data Partial Write Merging (bit 2)
    - TC Disable (bit 3)

    Windows drivers appear to set bit 1 as well to enable "Color/Z Partial
    Write Merging". This should solve an issue we were seeing where MRT
    benchmarks were using substantially more bandwidth than they ought.
    However, we have not observed it to cause measurable FPS gains.

    It is unclear whether we should be setting bit 0 or bit 3, so for now
    we leave those at the hardware default value.

    Acked-by: Jason Ekstrand <[email protected]>
* iris: Enable Gen11 Color/Z write merging optimization | Kenneth Graunke | 2019-12-10 | 1 | -0/+8
    TCCNTLREG contains additional L3 cache write merging optimizations.
    The default value on my system appears to be:

    - URB Partial Write Merging (bit 0)
    - L3 Data Partial Write Merging (bit 2)
    - TC Disable (bit 3)

    Windows drivers appear to set bit 1 as well to enable "Color/Z Partial
    Write Merging". This should solve an issue we were seeing where MRT
    benchmarks were using substantially more bandwidth than they ought.
    However, we have not observed it to cause measurable FPS gains.

    It is unclear whether we should be setting bit 0 or bit 3, so for now
    we leave those at the hardware default value.

    Improves performance in Manhattan 3.0 by 6% on ICL 8x8 at a fixed
    frequency, according to Felix Degrood. I didn't see any improvements
    at out-of-the-box power management settings, however.

    Acked-by: Jason Ekstrand <[email protected]>
* intel/genxml: Add a partial TCCNTLREG definition | Kenneth Graunke | 2019-12-10 | 1 | -0/+7
    TCCNTLREG contains additional cache programming settings. In
    particular, there are several write combining controls we'd like to use.

    Acked-by: Jason Ekstrand <[email protected]>
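The TCCNTLREG commits above describe setting bit 1 while leaving every other bit at its hardware default. A minimal sketch of that read-modify-write step follows. The bit positions come from the commit messages; the macro and function names are made up for illustration and are not the genxml field names.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as listed in the commit message; macro names are hypothetical. */
#define TCCNTL_URB_PARTIAL_WRITE_MERGE      (1u << 0)
#define TCCNTL_COLOR_Z_PARTIAL_WRITE_MERGE  (1u << 1)
#define TCCNTL_L3_DATA_PARTIAL_WRITE_MERGE  (1u << 2)
#define TCCNTL_TC_DISABLE                   (1u << 3)

/* Return the register value with bit 1 set and all other bits unchanged. */
static uint32_t tccntl_enable_color_z_merge(uint32_t current)
{
   uint32_t updated = current | TCCNTL_COLOR_Z_PARTIAL_WRITE_MERGE;

   /* Bits 0 and 3 (and everything else) keep their incoming value. */
   assert((updated & ~TCCNTL_COLOR_Z_PARTIAL_WRITE_MERGE) ==
          (current & ~TCCNTL_COLOR_Z_PARTIAL_WRITE_MERGE));
   return updated;
}
```

With the default reported in the commit message (bits 0, 2 and 3 set, i.e. 0xD), the function returns 0xF.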
* util: Detect use-after-destroy in simple_mtx | Kenneth Graunke | 2019-12-10 | 1 | -1/+10
    This makes simple_mtx_destroy set the counter to an invalid canary
    value and then makes lock/unlock assert that the value is legal. That
    way, calling lock/unlock after destroy will assert fail, rather than
    deadlocking or potentially even working.

    This has caught real deadlocks in dEQP multithreaded tests (in st/mesa
    shader variant zombie list handling), which have since been fixed.

    Reviewed-by: Tapani Pälli <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
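The canary technique described above can be sketched with a toy, single-threaded stand-in for a mutex. This is not the simple_mtx implementation: the type, the function names and the canary value are all assumptions, and a real mutex would use atomics and a wait primitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical canary: any value that lock/unlock never store will do. */
#define TOY_MTX_DESTROYED 0xDEADBEEFu

/* Toy state word: 0 = unlocked, 1 = locked. */
struct toy_mtx {
   uint32_t val;
};

static bool toy_mtx_is_legal(const struct toy_mtx *m)
{
   return m->val == 0 || m->val == 1;
}

static void toy_mtx_init(struct toy_mtx *m)
{
   m->val = 0;
}

static void toy_mtx_lock(struct toy_mtx *m)
{
   assert(toy_mtx_is_legal(m));   /* fires on use-after-destroy */
   m->val = 1;                    /* a real mutex would exchange atomically and wait */
}

static void toy_mtx_unlock(struct toy_mtx *m)
{
   assert(toy_mtx_is_legal(m));
   m->val = 0;
}

static void toy_mtx_destroy(struct toy_mtx *m)
{
   m->val = TOY_MTX_DESTROYED;    /* poison so later lock/unlock trips the assert */
}
```

Calling toy_mtx_lock after toy_mtx_destroy aborts in a debug build instead of silently succeeding or hanging, which is the behaviour the commit message is after.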
* freedreno/a6xx: enable LRZ by default | Rob Clark | 2019-12-10 | 3 | -2/+4
    Now that dEQP should be happy, let's flip the switch.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: fix LRZ logic | Rob Clark | 2019-12-10 | 5 | -30/+50
    In particular, we need to invalidate the LRZ state when we cannot be
    confident in what the Z state would be during rendering:

    1) depth test modes not supported by LRZ
    2) stencil test, which would require full rasterization and stencil
       test in the binning pass (whereas LRZ normally just needs to
       determine the min and max z value in an 8x8 quad)

    Signed-off-by: Rob Clark <[email protected]>
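The two invalidation conditions listed above reduce to a small predicate. The sketch below is only an illustration: the struct and its fields are hypothetical, and which depth test modes LRZ supports is left as an input because the commit message does not list them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-draw summary; not the driver's real state structs. */
struct draw_zs_state {
   bool depth_mode_supported_by_lrz;  /* caller decides; not specified in the commit */
   bool stencil_test_enabled;
};

/* True when the LRZ state can no longer be trusted after this draw. */
static bool lrz_must_invalidate(const struct draw_zs_state *s)
{
   assert(s != NULL);
   return !s->depth_mode_supported_by_lrz   /* condition 1 */
          || s->stencil_test_enabled;       /* condition 2 */
}
```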
* freedreno/a6xx: fix LRZ layout | Rob Clark | 2019-12-10 | 1 | -7/+8
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx+a6xx: split LRZ layout to per-gen | Rob Clark | 2019-12-10 | 4 | -45/+70
    Seems to be a bit different for a6xx, so let's split this out.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: disable LRZ when blending | Rob Clark | 2019-12-10 | 3 | -2/+8
    Signed-off-by: Rob Clark <[email protected]>
* radeonsi: don't rely on CLEAR_STATE to set PA_SC_GENERIC_SCISSOR_* | Marek Olšák | 2019-12-10 | 1 | -3/+5
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: simplify the tess_turns_off_ngg condition | Marek Olšák | 2019-12-10 | 1 | -3/+1
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: disable vertex grouping | Marek Olšák | 2019-12-10 | 2 | -6/+3
    Based on PAL.

    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: enable NIR by default and document GL 4.6 support | Marek Olšák | 2019-12-10 | 1 | -1/+1
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* st/dri: assume external consumers of back buffers can write to the buffers | Marek Olšák | 2019-12-10 | 1 | -6/+6
    This was reverted needlessly because it was part of another series.

    Reviewed-by: Kristian H. Kristensen <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* ANV: Stop advertising smoothLines support on gen10+ | Jason Ekstrand | 2019-12-10 | 1 | -1/+9
    Reviewed-by: Ivan Briano <[email protected]>
* meson/broadcom: libbroadcom_cle needs expat headers | Dylan Baker | 2019-12-10 | 1 | -1/+1
    Fixes: 1ae8018a6af81eec4832a57d9d0346aa3dd98d28 ("meson: Add support for the vc4 driver.")
    Reviewed-by: Eric Anholt <[email protected]>
* anv: fix incorrect VMA alignment for CCS main surfaces | Lionel Landwerlin | 2019-12-10 | 1 | -3/+14
    A finer way of dealing with this requirement might be to increase the
    number of pdevice->memory.types[] entries to add a category for special
    alignment cases.

    Meanwhile, this fixes the problem of CCS surface alignment, and it's
    probably not going to cause issues given the size of our address space.

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Fixes: 6af8a4acc4a4 ("anv: Add aux-map translation for gen12+")
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: fix missing gen12 handling | Lionel Landwerlin | 2019-12-10 | 1 | -0/+3
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Fixes: 181be14d4303 ("anv: Build for gen12")
    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* meson: drop `intel_` prefix on imgui_core | Eric Engestrom | 2019-12-10 | 1 | -1/+1
    Again, no real effect, just the name of a temporary build file.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* meson: drop duplicate `lib` prefix on libiris_gen* | Eric Engestrom | 2019-12-10 | 1 | -1/+1
    This has no real effect other than the names of the temporary files in
    the build folder.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* radv: implement VK_KHR_separate_depth_stencil_layouts | Samuel Pitoiset | 2019-12-10 | 6 | -7/+93
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: initialize HTILE for separate depth/stencil aspects | Samuel Pitoiset | 2019-12-10 | 3 | -19/+29
    It either clears the whole HTILE buffer or part of it depending on the
    HTILE mask parameter.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not init HTILE as compressed state when dst layout allows it | Samuel Pitoiset | 2019-12-10 | 1 | -13/+5
    I don't think this makes much difference, and a potential clear
    following the initialization will overwrite HTILE anyway.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: synchronize after performing a separate depth/stencil fast clears | Samuel Pitoiset | 2019-12-10 | 1 | -0/+10
    For depth+stencil images, the driver might use an optimized path if
    only one aspect is cleared. It either clears the depth or the stencil
    part of HTILE. Because the two separate aspects might use the same
    HTILE memory, we have to synchronize.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallivm: add TGSI bit arithmetic opcodes support | Krzysztof Raszkowski | 2019-12-10 | 1 | -0/+138
    Add TGSI_OPCODE_BFI, TGSI_OPCODE_POPC, TGSI_OPCODE_LSB,
    TGSI_OPCODE_IMSB, TGSI_OPCODE_UMSB, TGSI_OPCODE_IBFE, TGSI_OPCODE_UBFE,
    TGSI_OPCODE_BREV support.

    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Jan Zielinski <[email protected]>
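For readers unfamiliar with the opcodes named above, here are scalar reference versions of a few of them. This is a sketch under the assumption that they follow the GLSL bitCount / findLSB / findMSB / bitfieldReverse / bitfieldInsert conventions. The ref_* names are invented and this is not the gallivm code, which emits LLVM IR instead.

```c
#include <assert.h>
#include <stdint.h>

/* POPC: number of set bits. */
static int ref_popc(uint32_t v)
{
   int n = 0;
   while (v) {
      v &= v - 1;   /* clear the lowest set bit */
      n++;
   }
   return n;
}

/* LSB: index of the lowest set bit, or -1 when no bit is set. */
static int ref_lsb(uint32_t v)
{
   if (v == 0)
      return -1;
   int i = 0;
   while (!(v & 1u)) {
      v >>= 1;
      i++;
   }
   return i;
}

/* UMSB: index of the highest set bit, or -1 when no bit is set. */
static int ref_umsb(uint32_t v)
{
   int i = -1;
   while (v) {
      v >>= 1;
      i++;
   }
   return i;
}

/* BREV: reverse the order of all 32 bits. */
static uint32_t ref_brev(uint32_t v)
{
   uint32_t r = 0;
   for (int i = 0; i < 32; i++) {
      r = (r << 1) | (v & 1u);
      v >>= 1;
   }
   return r;
}

/* BFI: replace `bits` bits of `base`, starting at `offset`, with the low bits of `insert`. */
static uint32_t ref_bfi(uint32_t base, uint32_t insert, unsigned offset, unsigned bits)
{
   assert(offset + bits <= 32);
   if (bits == 0)
      return base;
   uint32_t mask = (bits == 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u)) << offset;
   return (base & ~mask) | ((insert << offset) & mask);
}
```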
* radv: fix possibly wrong PA_SC_AA_CONFIG value for conservative rast | Samuel Pitoiset | 2019-12-10 | 2 | -10/+7
    PA_SC_AA_CONFIG might be updated when conservative rasterization is
    enabled. Because the driver only re-emits the multisample state if the
    number of samples is different, that register value might not be
    updated correctly.

    Found by inspection, doesn't fix anything known.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: move emission of two PA_SC_* registers to the pipeline CS | Samuel Pitoiset | 2019-12-10 | 2 | -4/+3
    They don't have to be updated dynamically.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* st/dri: use st->flush callback to flush the backbuffer | Pierre-Eric Pelloux-Prayer | 2019-12-10 | 1 | -39/+63
    Previously the flush was done before the call to st->flush, which could
    lead to problems because FLUSH_VERTICES could push some work that would
    change the backbuffer (or modify it).

    With this commit, all the backbuffer flushing code is executed right
    before the call to st_flush.

    Closes: https://gitlab.freedesktop.org/drm/amd/issues/842
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205049
    Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: add a notify_before_flush callback param to flush | Pierre-Eric Pelloux-Prayer | 2019-12-10 | 11 | -20/+28
    The new callback is called right before the flush is done to allow
    users of st->flush to do some work after all the previous work has
    been flushed.

    This will be used by dri_flush in the next commit.

    Reviewed-by: Marek Olšák <[email protected]>
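The ordering that the notify_before_flush callback provides can be shown with a toy model: queued work is pushed first, then the callback runs, then the flush is submitted. Everything below (toy_ctx, the event letters, the function names) is hypothetical and only mirrors the ordering described in the commit message, not the st/mesa signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Toy context that records the order of events as letters. */
struct toy_ctx {
   char log[16];
   size_t len;
   int pending_draws;
};

static void toy_log(struct toy_ctx *ctx, char event)
{
   assert(ctx->len + 1 < sizeof(ctx->log));
   ctx->log[ctx->len++] = event;
   ctx->log[ctx->len] = '\0';
}

typedef void (*notify_before_flush_fn)(struct toy_ctx *ctx, void *data);

static void toy_flush(struct toy_ctx *ctx, notify_before_flush_fn notify, void *data)
{
   /* 1. Push all previously queued work ('D' per draw). */
   while (ctx->pending_draws > 0) {
      toy_log(ctx, 'D');
      ctx->pending_draws--;
   }
   /* 2. Give the caller its hook, now that earlier work is in. */
   if (notify)
      notify(ctx, data);
   /* 3. Submit ('S'). */
   toy_log(ctx, 'S');
}

/* Example hook standing in for the backbuffer flush ('B'). */
static void toy_flush_backbuffer(struct toy_ctx *ctx, void *data)
{
   (void)data;
   toy_log(ctx, 'B');
}
```

With two pending draws the log reads "DDBS": the backbuffer work lands after the queued draws and before submission, so work pushed in step 1 can no longer slip in after the backbuffer flush.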
* radeonsi: dcc dirty flag | Pierre-Eric Pelloux-Prayer | 2019-12-10 | 6 | -1/+26
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix multi plane buffers creation | Pierre-Eric Pelloux-Prayer | 2019-12-10 | 1 | -2/+4
    When using 3 planes, the sequence produces this chain:
      plane0 -> plane2
    This commit fixes this to produce:
      plane0 -> plane1 -> plane2

    Fixes: 86e60bc2659 ("radeonsi: remove si_vid_join_surfaces and use combined planar allocations")
    Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2193
    Reviewed-by: Marek Olšák <[email protected]>
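The intended chain plane0 -> plane1 -> plane2 can be built by always appending to the most recently linked plane. The sketch below shows that pattern with invented types. The commit message does not say how the driver's loop went wrong, so this illustrates the target result rather than the actual fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plane node; the driver's resource structs look different. */
struct toy_plane {
   int index;
   struct toy_plane *next;
};

/* Link planes[0..count-1] into a single chain in index order. */
static void toy_chain_planes(struct toy_plane *planes, int count)
{
   assert(planes != NULL && count >= 0);
   struct toy_plane *last = NULL;

   for (int i = 0; i < count; i++) {
      planes[i].index = i;
      planes[i].next = NULL;
      if (last)
         last->next = &planes[i];   /* append to the tail, never to the head */
      last = &planes[i];            /* advance the tail so no plane is skipped */
   }
}
```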
* radeonsi: use gfx9.surf_offset to compute texture offset | Pierre-Eric Pelloux-Prayer | 2019-12-10 | 1 | -1/+2
    Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2177
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use compute shader for clear 12-byte buffer | Sonny Jiang | 2019-12-09 | 4 | -10/+108
    Signed-off-by: Sonny Jiang <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: release the draw shader properly to fix driver crashes (iris) | Marek Olšák | 2019-12-09 | 1 | -1/+5
    Reviewed-by: Dave Airlie <[email protected]>
* draw, st/mesa: generate TGSI for ffvp/ARB_vp if draw lacks LLVM | Marek Olšák | 2019-12-09 | 3 | -2/+18
    Reviewed-by: Roland Scheidegger <[email protected]>
* st/mesa: don't generate VS TGSI if NIR is enabled | Marek Olšák | 2019-12-09 | 1 | -22/+14
    It's no longer needed.

    Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: remove struct st_vp_variant in favor of st_common_variant | Marek Olšák | 2019-12-09 | 7 | -43/+24
    Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: remove st_vp_variant::num_inputs | Marek Olšák | 2019-12-09 | 3 | -11/+5
    Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: use a separate VS variant for the draw module | Marek Olšák | 2019-12-09 | 3 | -44/+22
    instead of keeping the IR indefinitely in st_vp_variant.

    This trivially fixes Selection/Feedback/RasterPos for NIR.

    Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: support shader images for Selection/Feedback/RasterPos | Marek Olšák | 2019-12-09 | 1 | -0/+55
    Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: support SSBOs for Selection/Feedback/RasterPos | Marek Olšák | 2019-12-09 | 1 | -0/+40
    Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: support samplers for Selection/Feedback/RasterPos | Marek Olšák | 2019-12-09 | 1 | -0/+107
    Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: save currently bound vertex samplers and sampler views in st_context | Marek Olšák | 2019-12-09 | 4 | -3/+11
    for st_draw_feedback.c

    Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: support UBOs for Selection/Feedback/RasterPos | Marek Olšák | 2019-12-09 | 1 | -2/+37
    Reviewed-by: Dave Airlie <[email protected]>
* gallivm: implement LOAD with CONSTBUF but don't enable it for llvmpipe | Marek Olšák | 2019-12-09 | 1 | -3/+36
    This is already used in st_draw_feedback.c, because it uses shaders
    generated for drivers.

    Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: implement TEX_LZ and TXF_LZ opcodes | Marek Olšák | 2019-12-09 | 2 | -5/+11
    gallivm receives these opcodes anyway because st_draw_feedback.c uses
    shaders that were assembled for drivers, not llvmpipe.

    Reviewed-by: Roland Scheidegger <[email protected]>
* drirc: set allow_higher_compat_version for Faster Than Light | Gurchetan Singh | 2019-12-09 | 1 | -1/+9
    With 781a78 ("mesa: enable ARB_direct_state_access in compat for
    GL3.1+"), it's possible to have DSA with GL3.1+.

    FTL creates a GL3.1 compat context, but fails the
    _mesa_has_geometry_shaders(..) check in frame_buffer_texture.

    Bump the compat version to pass the check.

    Reviewed-by: Marek Olšák <[email protected]>