summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* intel/isl: Add an enum for describing auxiliary compression stateJason Ekstrand2017-06-071-0/+169
| | | | | | | | | | | | This enum describes all of the states that a auxiliary compressed surface can have. All of the states as well as normative language for referring to each of the compression operations is provided in the truly colossal comment for the new isl_aux_state enum. There is also a diagram showing how surfaces move between the different states. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Combine render target resolve codeJason Ekstrand2017-06-072-52/+29
| | | | | | | | | We have two different bits of resolve code for render targets: one in brw_draw where it's always been and one in brw_context to deal with sRGB on gen9. Let's pull them together. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Be a bit more conservative about certain resolvesJason Ekstrand2017-06-074-8/+12
| | | | | | | | | There are several places where we were resolving the entire miptree when we really only needed to resolve a single slice. Let's avoid the unneeded resolving. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/blorp: Move MCS allocation earlier for clearsJason Ekstrand2017-06-071-14/+14
| | | | | | | This way it happens before we call get_aux_state. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/blorp: Refactor do_single_blorp_clearJason Ekstrand2017-06-071-17/+23
| | | | | | | | | | | Previously, we had two checks for can_fast_clear and a tiny bit of shared code in between. This commit pulls all of the fast clear code together and duplicates the tiny bit that declares some surface structs and calls blorp_surf_for_miptree. The duplication is no real loss and we're about to change the two in slightly different ways. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/blorp: Take an explicit fast clear op in resolve_colorJason Ekstrand2017-06-073-15/+18
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/miptree: Move color resolve on map to intel_miptree_mapJason Ekstrand2017-06-071-5/+1
| | | | | | | | None of the other methods such as blit work with CCS either so we need to do the resolve for all maps. This change also makes us only resolve the one slice we're mapping and not the entire image. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Inline renderbuffer_att_set_needs_depth_resolveJason Ekstrand2017-06-074-21/+21
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Get rid of intel_renderbuffer_resolve_*Jason Ekstrand2017-06-073-52/+5
| | | | | | | | There is exactly one caller so it's a bit pointless to have all of this plumbing. Just inline it at the one place it's used. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/miptree: Refactor intel_miptree_resolve_colorJason Ekstrand2017-06-076-38/+35
| | | | | | | | | The new version now takes a range of levels as well as a range of layers. It should also be a tiny bit faster because it only walks the resolve_map list once instead of once per layer. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/miptree: Clean up the depth resolve helpers a littleJason Ekstrand2017-06-071-40/+30
| | | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/surface_state: Images can't handle CCS at allJason Ekstrand2017-06-071-6/+6
| | | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Mark depth surfaces as needing a HiZ resolve after blittingJason Ekstrand2017-06-071-0/+2
| | | | | | | Cc: "17.0 17.1" <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* st_glsl_to_tgsi: cleanup variable storage search.Dave Airlie2017-06-081-6/+4
| | | | | | I forgot to put the cleanup in earlier. Signed-off-by: Dave Airlie <[email protected]>
* mesa/main: fix gl_buffer_index enum comparisonRob Herring2017-06-073-7/+8
| | | | | | | | | | | | | | For clang, enums are unsigned by default and gives the following warning: external/mesa3d/src/mesa/main/buffers.c:764:21: warning: comparison of constant -1 with expression of type 'gl_buffer_index' is always false [-Wtautological-constant-out-of-range-compare] if (srcBuffer == -1) { ~~~~~~~~~ ^ ~~ Replace -1 with an enum value to fix this. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Rob Herring <[email protected]>
* glsl: fix bounds check in blob_overwrite_bytesRob Herring2017-06-071-1/+1
| | | | | | | | | | | | | | | | | clang gives a warning in blob_overwrite_bytes because offset type is size_t which is unsigned: src/compiler/glsl/blob.c:110:15: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare] if (offset < 0 || blob->size - offset < to_write) ~~~~~~ ^ ~ Remove the less than 0 check to fix this. Additionally, if offset is greater than blob->size, the 2nd check would be false due to unsigned math. Rewrite the check to avoid subtraction. Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Rob Herring <[email protected]>
* st_glsl_to_tgsi: replace variables tracking list with a hash tableDave Airlie2017-06-081-13/+33
| | | | | | | | This removes the linear search which is fail when number of variables goes up to 30000 or so. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* st_glsl_to_tgsi: rewrite rename registers to use array fully.Dave Airlie2017-06-081-26/+23
| | | | | | | | | | Instead of having to search the whole array, just use the whole thing and store a valid bit in there with the rename. Removes this from the profile on some of the fp64 tests Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* st_glsl_to_tgsi: bump index back up to 32-bitDave Airlie2017-06-081-2/+2
| | | | | | | | | | | | with some of the fp64 emulation, we are seeing shaders coming in with > 32K temps, they go out with 40 or so used, but while doing register renumber we need to store a lot of them. So bump this fields back up to 32-bit. Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* util/u_queue: fix a use-before-initialization race for queue->threadsMarek Olšák2017-06-072-17/+14
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: remove another unused variableGrazvydas Ignotas2017-06-081-1/+0
| | | | | | | Declared by each loop already. Trivial. Signed-off-by: Grazvydas Ignotas <[email protected]>
* radv/meta: remove an unused variableGrazvydas Ignotas2017-06-081-1/+0
| | | | | | | Trivial. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: convert several ifs to a switchGrazvydas Ignotas2017-06-081-9/+11
| | | | | | | | Also solve "outinfo may be used uninitialized" warning by putting in an unreachable(). Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: mark some arguments constGrazvydas Ignotas2017-06-081-30/+31
| | | | | | | | | Most functions are only inspecting nir, so nir related arguments can be marked const. Some more can be done if/when some nir changes are accepted. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: Use libdrm to get chipset nameSamuel Li2017-06-073-1/+20
| | | | | | | | v2: Add a func pointer to radeon_winsys to support radeon later. Change-Id: I614ea71424f9e5c97e4ae68654315d28c89eaa5f Signed-off-by: Samuel Li <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* util: Add extern c to u_dynarray.hThomas Helland2017-06-071-0/+8
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir: Delete nir_array.hThomas Helland2017-06-072-100/+0
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir: Port to u_dynarrayThomas Helland2017-06-072-5/+5
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir: Remove unused includeThomas Helland2017-06-071-1/+0
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* util: Port nir_array functionality to u_dynarrayThomas Helland2017-06-077-18/+45
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* util: Remove unused includes and convert to lower-case memory opsThomas Helland2017-06-071-15/+12
| | | | | | | | Also, prepare for the next commit by correcting some coding style changes. This should be all non-functional changes. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* util: Move u_dynarray to src/utilThomas Helland2017-06-073-1/+1
| | | | | | | This will be used as the basis for unification Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* gallium: Add missing includesThomas Helland2017-06-074-0/+4
| | | | | | | | These will need to be in place to avoid regressions when removing these includes from the u_dynarray Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* radeonsi: update clip_regs on shader state changes only when it's neededMarek Olšák2017-06-071-3/+32
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: precompute some fields for PA_CL_VS_OUT_CNTL in si_shader_selectorMarek Olšák2017-06-074-16/+25
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a new helper si_get_vsMarek Olšák2017-06-072-17/+19
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: isolate real framebuffer changes from the decompression passes (v3)Samuel Pitoiset2017-06-073-2/+28
| | | | | | | | | | | | | | When a stencil buffer is part of the framebuffer state, it is decompressed but because it's bindless, all draw calls set stencil_dirty_level_mask to 1. v2: Marek - set the flags outside the loop - also clear and set framebuffer.do_update_surf_dirtiness there - do it in the DB->CB copy path too v3: Marek - save and restore the do_update_surf_dirtiness flag Signed-off-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: do EarlyCSEMemSSA LLVM passMarek Olšák2017-06-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | so that LLVM IR looks like CSE has been run on it. It's also recommended by the instruction combining pass. This also fixes: - GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 (crash) - piglit/spec/arb_shader_ballot/execution/fs-readFirstInvocation-uint-loop (fail) The code size decrease is positive, the register usage isn't. There is a decrease in VGPR spilling for Tomb Raider, but increase in DiRT Showdown and GRID Autosport. EarlyCSEMemSSA has a -0.01% change in code size compared EarlyCSE. SGPRS: 1935420 -> 1938076 (0.14 %) VGPRS: 1645504 -> 1645988 (0.03 %) Spilled SGPRs: 2493 -> 2651 (6.34 %) Spilled VGPRs: 107 -> 115 (7.48 %) Private memory VGPRs: 1332 -> 1332 (0.00 %) Scratch size: 1512 -> 1516 (0.26 %) dwords per thread Code Size: 61981592 -> 61890012 (-0.15 %) bytes Max Waves: 371847 -> 371798 (-0.01 %) Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove 8 bytes from si_shader_keyMarek Olšák2017-06-073-14/+17
| | | | | | | We can use a union in si_shader_key::mono. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move PSIZE and CLIPDIST unique IO indices after GENERICMarek Olšák2017-06-072-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | Heaven LDS usage for LS+HS is below. The masks are "outputs_written" for LS and HS. Note that 32K is the maximum size. Before: heaven_x64: ls=1f1 tcs=1f1, lds=32K heaven_x64: ls=31 tcs=31, lds=24K heaven_x64: ls=71 tcs=71, lds=28K After: heaven_x64: ls=3f tcs=3f, lds=24K heaven_x64: ls=7 tcs=7, lds=13K heaven_x64: ls=f tcs=f, lds=17K All other apps have a similar decrease in LDS usage, because the "outputs_written" masks are similar. Also, most apps don't write POSITION in these shader stages, so there is room for improvement. (tight per-component input/output packing might help even more) It's unknown whether this improves performance. Tested-by: Edmondo Tommasina <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* svga: Always set the alpha value to 1 when sampling using an XRGB viewThomas Hellstrom2017-06-071-13/+30
| | | | | | | | | | If the XRGB view is sampling from an ARGB svga format, change PIPE_SWIZZLE_W to PIPE_SWIZZLE_1 for all channels. Previously we unconditionally set PIPE_SWIZZLE_1 on the alpha channel which could be both insufficient and incorrect. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: Fix imported surface view creationThomas Hellstrom2017-06-074-11/+33
| | | | | | | | | | When deciding to create a view with or without an alpha channel we need to look at the SVGA3D format and not the PIPE format. This fixes the glx-tfp piglit test for dri3/xa. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: Set alpha to 1 for non-alpha viewsThomas Hellstrom2017-06-071-0/+18
| | | | | | | | | | | Gallium RGB textures may be backed by imported ARGB svga3d surfaces. In those and similar cases we need to set the alpha value to 1 when sampling. Fixes piglit glx::glx-tfp Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: Allow format differences in 16-bit RGBA surface sharingThomas Hellstrom2017-06-071-1/+5
| | | | | | | | | | | | | For the purpose of surface sharing, treat SVGA3D_R5G6B5 and SVGA3D_B5G6R5_UNORM as identical formats. This fixes the following piglit tests with dri3/xa: glx@glx-visuals-depth -pixmap glx@glx-visuals-stencil -pixmap Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Deepak Singh Rawat <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* dri/vmwgfx: Disable a couple of glx extensions also for Ubuntu unity / compizThomas Hellstrom2017-06-071-0/+4
| | | | | | | | | | | It appears like the GLX_EXT_buffer_age extension also prevents Compiz / Ubuntu Unity from performing partial buffer swaps when it otherwise feels like doing so. So try to get them back again. We also disable GLX_OML_sync_control since it appears it had a favourable impact on gnome-shell. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Sinclair Yeh <[email protected]>
* dri: Turn of a couple of glx extensions for gnome-shell on vmwgfx.Thomas Hellstrom2017-06-071-0/+7
| | | | | | | | Increases performance on vmwgfx since we're avoiding full buffer damage and since we can't sync to vertical retrace anyway. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/dri: Allow gallium drivers to turn off two GLX extensionsThomas Hellstrom2017-06-071-0/+2
| | | | | | | | Allow gallium drivers to turn off GLX_EXT_buffer_age and GLX_OML_sync_control if needed, using driconf. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* dri: Optionally turn off a couple of GLX extensions based on driconf optionsThomas Hellstrom2017-06-073-3/+25
| | | | | | | | | | | | With GLX_EXT_buffer_age turned on, gnome-shell will use full-screen damage with GLX, which severely hurts performance with architectures that emulate page-flips with copies. Like vmware. We would like to be able to turn off that extension. Similarly, typically the GLX_OML_sync_control doesn't make much sense on a virtual architecture since we don't really sync to the host's vertical retrace. We'd like to be able to turn it off as well. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/dri: Allow dri users to query also driver optionsThomas Hellstrom2017-06-071-1/+64
| | | | | | | | | There will be situations where we want to control, for example, the GLX behaviour based on applications and drivers. So allow DRI users access to the driver options. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: clean up decompress blend state namesMarek Olšák2017-06-074-10/+10
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>