path: root/src/gallium
Commit message | Author | Age | Files | Lines
* freedreno: add debug option to force emulated indirect | Rob Clark | 2017-12-03 | 3 | -0/+12
  Useful mostly for debugging indirect draw.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: also mark draw-indirect buffer as read | Rob Clark | 2017-12-03 | 1 | -0/+7
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: small cleanups | Rob Clark | 2017-12-03 | 1 | -17/+8
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: avoid unneccessary batch flush | Rob Clark | 2017-12-03 | 1 | -0/+2
  In some cases we can end up trying to add a write dependency on ourself, which shouldn't trigger a flush. Avoids a couple of extra flushes per frame in stk.
  Signed-off-by: Rob Clark <[email protected]>
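The self-dependency case described above can be sketched in C roughly as follows. This is a minimal illustration, not the driver's code: `struct batch`, `struct resource`, `batch_flush` and `batch_resource_write` are hypothetical stand-ins invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the driver's real structures. */
struct batch {
    int flush_count;            /* how many times this batch was flushed */
};

struct resource {
    struct batch *write_batch;  /* batch with pending writes, or NULL */
};

static void batch_flush(struct batch *b)
{
    assert(b != NULL);
    b->flush_count++;
}

/* Record that batch `b` writes `rsc`. A pending writer is flushed first,
 * unless that writer is `b` itself, in which case there is nothing to
 * order against and no flush is needed. */
static void batch_resource_write(struct batch *b, struct resource *rsc)
{
    if (rsc->write_batch == b)
        return;                 /* write dependency on ourselves */

    if (rsc->write_batch)
        batch_flush(rsc->write_batch);

    rsc->write_batch = b;
}
```

Calling `batch_resource_write()` twice with the same batch and resource leaves the flush count at zero; only a different batch writing the same resource forces the earlier writer out.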
* freedreno: avoid mem2gmem for invalidated buffers | Rob Clark | 2017-12-03 | 3 | -2/+17
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: deferred flush support | Rob Clark | 2017-12-03 | 5 | -4/+32
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: rework fence tracking | Rob Clark | 2017-12-03 | 12 | -61/+109
  ctx->last_fence isn't such a terribly clever idea if batches can be flushed out of order. Instead, each batch now holds a fence, which is created before the batch is flushed (useful for next patch), that later gets populated after the batch is actually flushed.
  Signed-off-by: Rob Clark <[email protected]>
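A rough sketch of the per-batch fence idea, assuming nothing about the driver's actual types: `struct fence`, `struct batch`, `batch_init`, `batch_fence` and `batch_flush` are all hypothetical names made up for illustration. The point is only that the fence object exists before the flush and is filled in afterwards, so out-of-order flushes cannot clobber a single shared "last fence".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration only. */
struct fence {
    bool     populated;   /* set once the owning batch has been flushed */
    uint32_t timestamp;   /* submission timestamp, valid when populated */
};

struct batch {
    struct fence fence;   /* exists from batch creation, before any flush */
};

static void batch_init(struct batch *b)
{
    b->fence.populated = false;
    b->fence.timestamp = 0;
}

/* A fence can be handed out before the flush has happened. */
static struct fence *batch_fence(struct batch *b)
{
    return &b->fence;
}

/* Flushing fills in the fence that the batch already owns. */
static void batch_flush(struct batch *b, uint32_t submit_timestamp)
{
    assert(!b->fence.populated);
    b->fence.populated = true;
    b->fence.timestamp = submit_timestamp;
}
```

With two batches, flushing the second one first populates only its own fence; the first batch's fence stays unpopulated until that batch is flushed.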
* freedreno: proper locking for iterating dependent batches | Rob Clark | 2017-12-03 | 2 | -8/+20
  In transfer_map(), when we need to flush batches that read from a resource, we should be holding screen->lock to guard against race conditions. Somehow deferred flush seems to make this existing race more obvious.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: correct max_indicies for indirect draws | Rob Clark | 2017-12-03 | 1 | -1/+2
  Signed-off-by: Rob Clark <[email protected]>
* broadcom/vc4: Use a single-entry cached last_hindex value. | Eric Anholt | 2017-12-01 | 2 | -2/+20
  Since almost all BOs will be in one CL at a time, this cache will almost always hit except for the first usage of the BO in each CL. This didn't show up as statistically significant on the minetest trace (n=340), but if I lop off the throttled lobe of the bimodal distribution, it very clearly does (0.74731% +/- 0.162093%, n=269).
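The single-entry cache can be pictured with the sketch below. Only the name `last_hindex` comes from the commit title; `struct bo`, `struct cl`, `MAX_HANDLES`, `slow_lookups` and `cl_get_hindex` are assumptions for the example and do not claim to match the driver's layout.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_HANDLES 64             /* arbitrary bound for the sketch */

struct bo {
    uint32_t last_hindex;          /* index this BO had in the last list it joined */
};

struct cl {
    struct bo *handles[MAX_HANDLES];
    uint32_t   count;
    uint32_t   slow_lookups;       /* instrumentation for the sketch only */
};

/* Return the index of `bo` in the handle list of `cl`, adding it if needed. */
static uint32_t cl_get_hindex(struct cl *cl, struct bo *bo)
{
    uint32_t i = bo->last_hindex;

    /* Fast path: the cached index is valid for this list. */
    if (i < cl->count && cl->handles[i] == bo)
        return i;

    /* Slow path: linear search, then append on a miss. */
    cl->slow_lookups++;
    for (i = 0; i < cl->count; i++) {
        if (cl->handles[i] == bo) {
            bo->last_hindex = i;
            return i;
        }
    }

    assert(cl->count < MAX_HANDLES);
    cl->handles[cl->count] = bo;
    bo->last_hindex = cl->count;
    return cl->count++;
}
```

Only the first lookup of each BO in a given list takes the slow path; repeated lookups hit the cached index, which matches the "almost always hit except for the first usage" behavior the message describes.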
* broadcom/vc4: Decompose single QUADs to a TRIANGLE_FAN. | Eric Anholt | 2017-12-01 | 1 | -5/+14
  No significant difference in the minetest replay, but it should reduce overhead by not requiring that we write quad indices to index buffers that we repeatedly re-upload (and making the draw packet smaller, as well). Over the course of the series the actual game seems to be up by 1-2 fps.
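Why a lone quad needs no index buffer when drawn as a fan: a four-vertex fan covers the triangles (v0, v1, v2) and (v0, v2, v3), which together make up the quad. The sketch below uses a hypothetical `enum prim` and helper names; it ignores details such as provoking-vertex conventions.

```c
#include <assert.h>

/* Hypothetical primitive enum for the sketch. */
enum prim { PRIM_QUADS, PRIM_TRIANGLES, PRIM_TRIANGLE_FAN };

/* Triangle `i` of a fan starting at vertex `start` uses the fan's first
 * vertex plus two consecutive later vertices. */
static void fan_triangle(unsigned start, unsigned i, unsigned out[3])
{
    out[0] = start;
    out[1] = start + i + 1;
    out[2] = start + i + 2;
}

/* A draw of exactly one quad can be issued as a fan with the same four
 * vertices; anything else keeps its original mode. */
static enum prim lower_single_quad(enum prim mode, unsigned count)
{
    assert(count > 0);
    if (mode == PRIM_QUADS && count == 4)
        return PRIM_TRIANGLE_FAN;
    return mode;
}
```

Draws with more than one quad are left untouched in this sketch, since consecutive quads do not share a common first vertex.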
* broadcom/vc4: Skip emitting redundant VC4_PACKET_GEM_HANDLES. | Eric Anholt | 2017-12-01 | 3 | -3/+12
  Now that there's only one user of it, it's pretty obvious how to avoid emitting redundant ones. This should save a bunch of kernel validation overhead. No statistically significant difference on the minetest trace I was looking at (n=169), but the maximum FPS is up by 0.3%.
* broadcom/vc4: Simplify the relocation handling for index buffers. | Eric Anholt | 2017-12-01 | 2 | -17/+17
  Originally there was CL code for handling various relocations back when I had relocs for the TSDA/TA buffers. Now that the kernel handles those entirely on its own, I can inline that code into the one place using it.
* broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count. | Eric Anholt | 2017-12-01 | 1 | -16/+27
  We failed to take the start into account for how many vertices to draw in this round, so we would end up decrementing count below 0, which as an unsigned number meant we would loop until the CLs soon ran out of space.
  When I wrote the code I was thinking about how to use the previously emitted shader state (no index bias baked into the elements) by emitting up to 65535 and then only re-emitting with bias for the second round, but that doesn't work if the start is over 65535. Instead, just delay emitting shader state until we get into the drawarrays GFXH-515 loop and always bake the bias in when we're doing the workaround.
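The underflow hazard is generic to any chunked loop over an unsigned count. The following sketch shows a loop shape that cannot wrap, assuming a per-draw limit of 65535 (the figure mentioned in the message); `count_draw_rounds` is a hypothetical helper, the emit calls are only comments, and primitive-boundary alignment is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VERTS_PER_DRAW 65535u  /* figure taken from the commit text */

/* Split a draw of `count` vertices beginning at `start` into rounds of at
 * most MAX_VERTS_PER_DRAW vertices and return the number of rounds. */
static uint32_t count_draw_rounds(uint32_t start, uint32_t count)
{
    uint32_t rounds = 0;

    while (count > 0) {
        /* Clamp to what is left, so the subtraction below cannot wrap. */
        uint32_t this_count =
            count < MAX_VERTS_PER_DRAW ? count : MAX_VERTS_PER_DRAW;
        assert(this_count <= count);

        /* A real driver would emit state with the bias `start` baked in
         * here, then a draw of `this_count` vertices from index 0. */

        start += this_count;
        count -= this_count;
        rounds++;
    }

    (void)start;
    return rounds;
}
```

Because each round subtracts no more than the remaining count, the loop ends after a bounded number of rounds regardless of how large `start` is, including values above 65535.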
* broadcom/vc4: Fix the scaling factor for the GFXH-515 workaround. | Eric Anholt | 2017-12-01 | 1 | -1/+1
  For triangle strips, we step by max_verts - 2.
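The reason for stepping by `max_verts - 2` is that consecutive pieces of a triangle strip must share two vertices, otherwise the triangles that straddle the split are lost. A small sketch, with `split_strip_triangles` as a hypothetical helper, counts the triangles produced by such a split so it can be compared with the `count - 2` triangles of the unsplit strip. Winding parity across pieces is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Split a triangle strip of `count` vertices into pieces of at most
 * `max_verts` vertices, advancing by max_verts - 2 so that neighbouring
 * pieces overlap by two vertices. Returns the total triangle count. */
static uint32_t split_strip_triangles(uint32_t count, uint32_t max_verts)
{
    assert(max_verts >= 3);

    uint32_t step = max_verts - 2;
    uint32_t start = 0;
    uint32_t triangles = 0;

    while (count - start >= 3) {
        uint32_t remaining = count - start;
        uint32_t this_count = remaining < max_verts ? remaining : max_verts;

        triangles += this_count - 2;   /* n vertices give n - 2 triangles */

        if (this_count < max_verts)
            break;                     /* final, partial piece */
        start += step;
    }

    return triangles;
}
```

For example, a 10-vertex strip split with `max_verts = 4` yields pieces starting at vertices 0, 2, 4 and 6, for 8 triangles in total, the same as the original strip.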
* meson: use dep_thread instead of dependency('threads') in freedreno | Dylan Baker | 2017-12-01 | 1 | -1/+1
  They are the same thing, but this is more consistent with the rest of the project.
  Signed-off-by: Dylan Baker <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* meson: Add lmsensors support | Dylan Baker | 2017-12-01 | 5 | -4/+7
  v2: - Make -Dlmsensors=false work
      - Simplify auto and true cases
  Signed-off-by: Dylan Baker <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* gallium/hud: use #ifdef to test for macro existence | Eric Engestrom | 2017-12-01 | 6 | -11/+11
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* amd: remove always-true BRAHMA_BUILD define | Eric Engestrom | 2017-12-01 | 1 | -3/+1
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* swr/scons: Fix intermittent build failure | George Kyriazis | 2017-12-01 | 1 | -0/+1
  gen_rasterizer*.cpp depends on gen_ar_eventhandler.hpp. Account for new dependency.
  Reviewed-by: Emil Velikov <[email protected]>
* r600: add ARB_shader_storage_buffer_object support (v3) | Dave Airlie | 2017-12-01 | 6 | -22/+369
  This just builds on the image support.
  Evergreen only has ssbo for fragment and compute, no other stages.
  v2: handle images and ssbo in the same shader properly (Ilia)
  v3: fix RESQ on buffers, fix missing atom emit
      fix first element offset
      use R32 format
      write separate buffer rat store path.
      (from running deqp gles3.1 tests)
  Signed-off-by: Dave Airlie <[email protected]>
* r600/cayman: looks like cmpxchg moved to Z | Dave Airlie | 2017-12-01 | 1 | -2/+5
  On cayman it appears the cmp component is now in Z.
  Fixes: arb_shader_image_load_store-dead-fragments on cayman.
  Signed-off-by: Dave Airlie <[email protected]>
* r600/shader: fix 64->32 conversions | Dave Airlie | 2017-12-01 | 1 | -35/+54
  These didn't handle the TGSI properly at all; this fixes them to use the common path for 64->32, then adds the 32->int conversion at the end.
  Fixes: generated_tests/spec/arb_gpu_shader_fp64/execution/conversion/*
  Signed-off-by: Dave Airlie <[email protected]>
* radeonsi/gfx9: fix importing shared textures with DCC | Marek Olšák | 2017-11-30 | 1 | -1/+1
  VI has 11 dwords at least. GFX9 has 10 dwords.
  Cc: 17.2 17.3 <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: add AllowGLSLCrossStageInterpolationMismatch workaround | Tapani Pälli | 2017-11-30 | 3 | -0/+4
  This fixes issues seen with certain versions of Unreal Engine 4 editor and games built with that using GLSL 4.30.
  v2: add driinfo_gallium change (Emil Velikov)
  Signed-off-by: Tapani Pälli <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97852
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103801
  Acked-by: Andres Gomez <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* etnaviv: GC7000: Factor out state based texture functionality | Wladimir J. van der Laan | 2017-11-30 | 8 | -308/+454
  Prepare for two texture handling paths; the descriptor-based path will be added in a future commit. These are structured so that the texture implementation handles its own state emission.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: Move active_samplers_bits to texture | Wladimir J. van der Laan | 2017-11-30 | 3 | -12/+17
  This needs to be shared between texture_plain and texture_desc.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: Factor out incompatible texture handling logic | Wladimir J. van der Laan | 2017-11-30 | 2 | -16/+31
  This will be shared with the texture descriptor path.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: Track dirty sampler views | Wladimir J. van der Laan | 2017-11-30 | 4 | -2/+10
  Need this to efficiently emit texture descriptor invalidations.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: Make point sprites work on HALTI5 | Wladimir J. van der Laan | 2017-11-30 | 3 | -6/+24
  Track varying component offset of the point size output, as well as provide the offset of the point coord input.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: State changes for HALTI3..5 | Wladimir J. van der Laan | 2017-11-30 | 4 | -73/+218
  Update state objects to add new state, and emit function to emit new state.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: Update screen specs for HALTI5 | Wladimir J. van der Laan | 2017-11-30 | 1 | -4/+15
  - This core must load shaders from memory (AFAIK)
  - Yet another new location for UNIFORMS
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: Update context reset for ..HALTI5 | Wladimir J. van der Laan | 2017-11-30 | 1 | -5/+32
  Update context reset for HALTI3..HALTI5, sorting states for the HALTI version that has them.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: No RS align when using BLT | Wladimir J. van der Laan | 2017-11-30 | 3 | -45/+53
  RS align is not necessary and might even be harmful when using the BLT engine for blitting.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: BLT engine blitting support | Wladimir J. van der Laan | 2017-11-30 | 9 | -3/+684
  Add an implementation of key clear_blit functions using the BLT engine that replaced the RS on GC7000.
  Also set level->size correctly for imported resources. This is important for the BLT resolve-in-place path to work for them.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: Factor out RS blit functionality | Wladimir J. van der Laan | 2017-11-30 | 6 | -638/+677
  Prepare for BLT-based blitting path by moving RS-based blitting to the RS implementation file, making this self-contained.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: Move etna_coalesce to emit header file | Wladimir J. van der Laan | 2017-11-30 | 2 | -83/+83
  Want to be able to emit state from the texture implementation, and the blitter implementation.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: GC7000: Support BLT as recipient for etna_stall | Wladimir J. van der Laan | 2017-11-30 | 1 | -1/+14
  When the BLT is involved as source or target, add an extra BLT enable/disable sequence around the sync sequence.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: Use only DRAW_INSTANCED on GC3000+ | Wladimir J. van der Laan | 2017-11-30 | 2 | -4/+33
  The blob does this, as DRAW_INSTANCED can fully replace all the other draw commands. It is also required to handle integer vertex formats. The other path is only there for compatibility and might go away (or at least rot and become buggy through disuse) in newer hardware.
  As a side effect this changes the behavior for GC3000-, by no longer using the index offset for DRAW_INDEXED but instead adding it to INDEX_ADDR. This should make no difference.
  Preparation for GC7000 support.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Philipp Zabel <[email protected]>
* etnaviv: Emit SCALE for vertex attributes | Wladimir J. van der Laan | 2017-11-30 | 3 | -0/+7
  This is used by HALTI2+ (GC3000+) when drawing with DRAW_INSTANCED. It is also necessary when switching between integer and floating point vertex element formats.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* r600: no need to reinit compute regs | Dave Airlie | 2017-11-30 | 1 | -13/+0
  Compute setup gets emitted into the normal gfx state buffer, so no need to reinit the basics.
  Signed-off-by: Dave Airlie <[email protected]>
* r600: split cb setup code out from evergreen compute path. | Dave Airlie | 2017-11-30 | 1 | -22/+28
  This just makes it easier to bypass for TGSI later.
  Signed-off-by: Dave Airlie <[email protected]>
* r600: add support for compute pkt flags to debug dumping. | Dave Airlie | 2017-11-30 | 1 | -6/+7
  This just lets us see packets marked for compute.
  Signed-off-by: Dave Airlie <[email protected]>
* r600: fix bfe where src/dst are same. | Dave Airlie | 2017-11-30 | 1 | -5/+24
  This fixes overlaps where src/dst are the same.
  Fixes a bunch of the deqp bitfield tests.
  Signed-off-by: Dave Airlie <[email protected]>
* gallium/dri2: Enable {GLX_ARB,EGL_KHR}_context_flush_control | Adam Jackson | 2017-11-29 | 1 | -0/+2
  Reviewed-and-tested-by: Nicolai Hähnle <[email protected]>
  Signed-off-by: Adam Jackson <[email protected]>
* r300,r600,radeonsi: replace RADEON_FLUSH_* with PIPE_FLUSH_* | Marek Olšák | 2017-11-29 | 29 | -57/+55
  and handle PIPE_FLUSH_HINT_FINISH in r300.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove r600_common_screen | Marek Olšák | 2017-11-29 | 40 | -873/+864
  Most files in gallium/radeon now include si_pipe.h.
  chip_class and family are now here:
      sscreen->info.family
      sscreen->info.chip_class
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove r600_pipe_common::barrier_flags::compute_to_L2 | Marek Olšák | 2017-11-29 | 3 | -8/+1
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove query/apply_opaque_metadata callbacks | Marek Olšák | 2017-11-29 | 3 | -114/+102
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move shader debug helpers out of r600_pipe_common.c | Marek Olšák | 2017-11-29 | 7 | -26/+24
  Reviewed-by: Nicolai Hähnle <[email protected]>