summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* nir/radv: remove restrictions on opt_if_loop_last_continue()Timothy Arceri2019-04-093-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When I implemented opt_if_loop_last_continue() I had restricted this pass from moving other if-statements inside the branch opposite the continue. At the time it was causing a bunch of spilling in shader-db for i965. However Samuel Pitoiset noticed that making this pass more aggressive significantly improved the performance of Doom on RADV. Below are the statistics he gathered. 28717 shaders in 14931 tests Totals: SGPRS: 1267317 -> 1267549 (0.02 %) VGPRS: 896876 -> 895920 (-0.11 %) Spilled SGPRs: 24701 -> 26367 (6.74 %) Code Size: 48379452 -> 48507880 (0.27 %) bytes Max Waves: 241159 -> 241190 (0.01 %) Totals from affected shaders: SGPRS: 23584 -> 23816 (0.98 %) VGPRS: 25908 -> 24952 (-3.69 %) Spilled SGPRs: 503 -> 2169 (331.21 %) Code Size: 2471392 -> 2599820 (5.20 %) bytes Max Waves: 586 -> 617 (5.29 %) The codesize increases is related to Wolfenstein II it seems largely due to an increase in phis rather than the existing jumps. This gives +10% FPS with Doom on my Vega56. Rhys Perry also benchmarked Doom on his VEGA64: Before: 72.53 FPS After: 80.77 FPS v2: disable pass on non-AMD drivers Reviewed-by: Ian Romanick <[email protected]> (v1) Acked-by: Samuel Pitoiset <[email protected]>
* softpipe: add support for vertex streams (v2)Dave Airlie2019-04-091-1/+5
| | | | | | | | | This enables the ARB_gpu_shader5 vertex streams on softpipe. v2: only enable when not using llvm. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* draw: add support to tgsi paths for geometry streams. (v2)Dave Airlie2019-04-097-124/+194
| | | | | | | | | | | | | This hooks up the geometry shader processing to the TGSI support added in the previous commits. It doesn't change the llvm interface other than to keep things building. v2: fix some regressions caused by primitiveoffsets Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* softpipe: add support for indexed queries.Dave Airlie2019-04-094-14/+16
| | | | | | | We need indexed queries to retrieve the geom shader info. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi: add support for geometry shader streams.Dave Airlie2019-04-093-18/+62
| | | | | | | | | | | | This adds support to retrieve the primitive counts for each stream, along with the offset for each primitive into the output array. It also adds support for parsing the stream argument to the emit and end instructions. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* draw: add stream member to stats callbackDave Airlie2019-04-094-3/+4
| | | | | | | | This just adds space for the member to the callback, doesn't change anything else. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* intel: add dependency on genxml generated filesLionel Landwerlin2019-04-081-1/+1
| | | | | | | | | | Drivers using genxml will start compilation before generated files are created, so add a dependency to it. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Cc: [email protected]
* radeonsi: fix a crash when unbinding sampler statesMarek Olšák2019-04-081-1/+1
| | | | Acked-by: James Zhu <[email protected]>
* panfrost: Remove "mali_unknown6" nonsenseAlyssa Rosenzweig2019-04-071-8/+0
| | | | | | | | This structure was used maaaany moons ago as a placeholder for the varying meta (now unified with mali_attr_meta and essentially fully decoded). I don't know why it's still in the file. Let's wack it. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Enable lower_find_lsbAlyssa Rosenzweig2019-04-071-0/+1
| | | | | | This is exactly what the blob does. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Add ibitcount8 opAlyssa Rosenzweig2019-04-072-0/+3
| | | | | | | | | | | The mechanics of this opcode are a little opaque, but essentially, it's used in 8-bit mode to do a bit count in parallel of a uint and then doing a ton of clever iadd/imov ops to recombine. v2: Correct opcode. Thank you to jernej on IRC for noticing this awkward typo! Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Add ilzcnt opAlyssa Rosenzweig2019-04-072-0/+3
| | | | | | Used for implementing findLSB/MSB Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Add umin/umax opcodesAlyssa Rosenzweig2019-04-073-0/+8
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add tilebuffer load? branchAlyssa Rosenzweig2019-04-072-1/+37
| | | | | | Also document branches better. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/decode: Add flags for tilebuffer readbackAlyssa Rosenzweig2019-04-072-3/+21
| | | | | | | | These flags are set when reading back the tilebuffer from a fragment shader via various mechanisms (including ARM_shader_framebuffer_fetch and EXT_pixel_local_storage). Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: use nir_src_is_const and nir_src_as_uintKarol Herbst2019-04-071-7/+4
| | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* vc4: Prefer nir_src_comp_as_uint over nir_src_as_const_valueJason Ekstrand2019-04-072-18/+16
| | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* gallium/util: Add const to u_range_intersectKenneth Graunke2019-04-071-1/+2
| | | | | | This doesn't modify the range, so it can accept a const pointer. Reviewed-by: Eric Engestrom <[email protected]>
* gallium/hud: add CPU usage support for FreeBSDGreg V2019-04-071-0/+45
| | | | Reviewed-by: Eric Engestrom <[email protected]>
* iris: Silence unused variable warnings in release modeKenneth Graunke2019-04-061-3/+2
|
* util: clean the 24-bit unused field to avoid an issuesAndrii Simiklit2019-04-051-3/+3
| | | | | | | | | | | | | | | | This is a field of FLOAT_32_UNSIGNED_INT_24_8_REV texture pixel. OpenGL spec "8.4.4.2 Special Interpretations" is saying: "the second word contains a packed 24-bit unused field, followed by an 8-bit index" The spec doesn't require us to clear this unused field however it make sense to do it to avoid some undefined behavior in some apps. Suggested-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110305 Signed-off-by: Andrii Simiklit <[email protected]>
* tegra: fix the build after the set_shader_buffers changeMarek Olšák2019-04-051-1/+1
|
* gallium/auxiliary/vl: Add barrier/unbind after compute shader launch.James Zhu2019-04-051-0/+13
| | | | | | | | Add memory barrier sync for multiple launch cases, and unbind completed resources after launch. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/auxiliary/vl: Fixed blank issue with compute shaderJames Zhu2019-04-051-6/+1
| | | | | | | | | Multiple init buffer within one open instance will cause blank issue. Updating viewport per frame will fix this issue. Signed-off-by: James Zhu <[email protected]> Tested-by: Bruno Milreu <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/auxiliary/vl: Fixed blur issue with weave compute shaderJames Zhu2019-04-051-1/+1
| | | | | | | | Correct wrong interpolatation with top/bottom row which caused blur issue. Signed-off-by: James Zhu <[email protected]> Tested-by: Bruno Milreu <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* softpipe: Use mag texture filter also for clamped lod == 0Gert Wollny2019-04-051-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | Follow the spec when selecting the magnification filter (OpenGL 4.5, section 8.14): If λ(x, y) is less than or equal to the constant c (see section 8.15) the texture is said to be magnified; While we're here also silence a potential warning about implicit float to double conversion. v2: Update commit message to contain a reference to the spec as pointed out by Eric. Fixes a number of dEQP GLES2 and GLES3 test out of: dEQP-GLES2.functional.texture.filtering.* dEQP-GLES2.functional.texture.vertex.2d.filtering.* dEQP-GLES3.functional.texture.vertex.*.filtering.* dEQP-GLES3.functional.texture.filtering.* dEQP-GLES3.functional.texture.shadow.2d.* Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* iris: handle aux properly in iris_resource_get_handleTapani Pälli2019-04-041-0/+8
| | | | | | | | | Disable aux when resource seen the first time and EXPLICIT_FLUSH not being set. This fixes issues seen when launching Xorg and CCS_E getting utilized. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* v3d: Don't try to use the TFU blit path if a scissor is enabled.Eric Anholt2019-04-041-1/+2
| | | | | | | | We'll need to do a render-based blit for scissors, since the TFU (as seen in this conditional) can only update a whole surface. Fixes: 976ea90bdca2 ("v3d: Add support for using the TFU to do some blits.") Fixes piglit fbo-scissor-blit.
* v3d: Bump the maximum texture size to 4k for V3D 4.x.Eric Anholt2019-04-043-2/+29
| | | | | | | 4.1 and 4.2 both have the same 16k limit, but it I'm seeing GPU hangs in the CTS at 8k and 16k. 4k at least lets us get one 4k display working. Cc: [email protected]
* v3d: Add support for handling OOM signals from the simulator.Eric Anholt2019-04-043-14/+78
| | | | | | I have v3d allocating enough initial allocation memory that we've been passing tests without it, but to match kernel behavior more it would be good to actually exercise the OOM path.
* ddebug: add compute functions to help hang detectionDave Airlie2019-04-051-2/+21
| | | | Reviewed-by: Marek Olšák <[email protected]>
* iris: avoid use after free in shader destructionDave Airlie2019-04-051-7/+49
| | | | | | | | | | | While playing with compute shaders, I was getting a random crash, noticed that bind_state was using the old shader info for comparision, but gallium allows the shader to be deleted while bound, so this could lead to a use after free. This can't happen using the cso cache. As it tracks all of this. Reviewed-by: Kenneth Graunke <[email protected]>
* radeonsi: set exact shader buffer read/write usage in CSMarek Olšák2019-04-046-24/+41
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* gallium: add writable_bitmask parameter into set_shader_buffersMarek Olšák2019-04-0417-25/+48
| | | | | | | to indicate write usage per buffer. This is just a hint (it will be used by radeonsi). Reviewed-by: Timothy Arceri <[email protected]>
* iris: Fix assert when using vertex attrib without buffer bindingDanylo Piliaiev2019-04-041-1/+2
| | | | | | | | | | | | | The GL 4.5 spec says: "If any enabled array’s buffer binding is zero when DrawArrays or one of the other drawing commands defined in section 10.4 is called, the result is undefined." The result is undefined but it should not crash. Fixes: gl-3.1-vao-broken-attrib Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* iris: move iris_flush_resource so we can call it from get_handleTapani Pälli2019-04-041-15/+15
| | | | | Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Save/restore MI_PREDICATE_RESULT, not MI_PREDICATE_DATA.Kenneth Graunke2019-04-043-8/+8
| | | | | | | | MI_PREDICATE_DATA is an intermediate storage for the MI_PREDICATE command's calculations - it holds the result of the subtraction when the compare operation is SRCS_EQUAL or DELTAS_EQUAL. But the actual result of the predication is MI_PREDICATE_RESULT, which is what we want to copy from the render context to the compute context.
* simplify LLVM version string printingEric Engestrom2019-04-042-18/+9
| | | | | | | Figure it out once in the build system, then just use that all over the place. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/u_dump: util_dump_sampler_view: Dump u.tex.first_levelGuido Günther2019-04-041-1/+1
| | | | | | | Dump u.tex.first_level instead of dumping u.tex.last_level twice. Signed-off-by: Guido Günther <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: ddebug: Add missing fence related wrappersGuido Günther2019-04-042-0/+37
| | | | | | | | | | | | | | Without that `GALLIUM_DDEBUG=always kmscube -A` would segfault like #0 0x0000000000000000 in () #1 0x0000ffffa72a3c54 in dri2_get_fence_fd (_screen=0xaaaaed4f2090, _fence=0xaaaaed9ef880) at ../src/gallium/state_trackers/dri/dri_helpers.c:140 #2 0x0000ffffa8744824 in dri2_dup_native_fence_fd (drv=0xaaaaed5010c0, disp=0xaaaaed5029a0, sync=0xaaaaed9ef7c0) at ../src/egl/drivers/dri2/egl_dri2.c:3050 #3 0x0000ffffa87339b8 in eglDupNativeFenceFDANDROID (dpy=0xaaaaed5029a0, sync=0xaaaaed9ef7c0) at ../src/egl/main/eglapi.c:2107 #4 0x0000aaaabd29ca90 in () #5 0x0000aaaabd401000 in () Signed-off-by: Guido Günther <[email protected]> Reviewed-by: Lucas Stach <[email protected]>
* gallium/hud: fix rounding error in nic bps computationEric Engestrom2019-04-041-2/+2
| | | | | | | | While at it, fix typo in "rounding error" :P Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/hud: prevent buffer overflowEric Engestrom2019-04-043-6/+6
| | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/hud: fix memory leaksEric Engestrom2019-04-043-2/+7
| | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add support for displayable DCC for multi-RB chipsMarek Olšák2019-04-047-4/+253
| | | | A compute shader is used to reorder DCC data from aligned to unaligned.
* radeonsi: add support for displayable DCC for 1 RB chipsMarek Olšák2019-04-043-4/+84
| | | | This is the simpler codepath - just disable RB and pipe alignment for DCC.
* radeonsi: add ability to bind images as image buffersMarek Olšák2019-04-042-3/+8
| | | | so that we can bind DCC (texture) as an image buffer.
* radeonsi/gfx9: add support for PIPE_ALIGNED=0Marek Olšák2019-04-044-12/+30
| | | | | | Needed by displayable DCC. We need to flush L2 after rendering if PIPE_ALIGNED=0 and DCC is enabled.
* iris: move variable to the scope where it is being usedTapani Pälli2019-04-041-1/+1
| | | | | | | | | iris_upload_border_color is passed a pointer which points to variable that is introduced in a different scope. CID: 1444296 Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* panfrost: Size tiled temp buffers correctlyAlyssa Rosenzweig2019-04-043-8/+13
| | | | | | | | This should lower transient memory usage and improve performance slightly (due to less memory to malloc/free, better cache locality, etc). Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Respect box->width in tiled storesAlyssa Rosenzweig2019-04-043-4/+6
| | | | | | | This fixes a regression uploading partial tiled textures introduced sometime during the cubemap series. Signed-off-by: Alyssa Rosenzweig <[email protected]>