path: root/src
Commit message | Author | Age | Files | Lines
* st/dri: drop duplicate #define | Eric Engestrom | 2019-02-14 | 1 | -4/+0
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* gbm: drop duplicate #defines | Eric Engestrom | 2019-02-14 | 1 | -8/+0
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* drm-uapi: use local files, not system libdrm | Eric Engestrom | 2019-02-14 | 94 | -108/+103
  There was an issue recently caused by the system header being included by mistake, so let's just get rid of this include path and always explicitly #include "drm-uapi/FOO.h"

  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* radv: fix radv_fixup_vertex_input_fetches() | Samuel Pitoiset | 2019-02-14 | 1 | -1/+1
  We should check that num_channels is 4, otherwise that breaks the world. Sorry for the short breakage.

  Fixes: 4b3549c0846 ("radv: reduce the number of loaded channels for vertex input fetches")
  Signed-off-by: Samuel Pitoiset <[email protected]>
* radv: reduce the number of loaded channels for vertex input fetches | Samuel Pitoiset | 2019-02-14 | 1 | -2/+79
  It's unnecessary to load more channels than the vertex attribute format. The remaining channels are filled with 0 for y and z, and 1 for w.

  29077 shaders in 15096 tests
  Totals:
  SGPRS: 1321605 -> 1318869 (-0.21 %)
  VGPRS: 935236 -> 932252 (-0.32 %)
  Spilled SGPRs: 24860 -> 24776 (-0.34 %)
  Code Size: 49832348 -> 49819464 (-0.03 %) bytes
  Max Waves: 242101 -> 242611 (0.21 %)

  Totals from affected shaders:
  SGPRS: 93675 -> 90939 (-2.92 %)
  VGPRS: 58016 -> 55032 (-5.14 %)
  Spilled SGPRs: 172 -> 88 (-48.84 %)
  Code Size: 2862740 -> 2849856 (-0.45 %) bytes
  Max Waves: 15474 -> 15984 (3.30 %)

  This mostly helps Croteam games (Talos/Sam2017).

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
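  The default fill described in this commit message can be sketched in plain C. This is only an illustration under assumptions: `expand_to_vec4` is a hypothetical helper name, and it works on floats rather than on the LLVM values the driver actually manipulates.

```c
#include <assert.h>

/* Hypothetical helper (not radv code): expand the first num_channels loaded
 * components to a full vec4. Missing y and z become 0, missing w becomes 1. */
static void expand_to_vec4(const float *loaded, unsigned num_channels,
                           float out[4])
{
    static const float defaults[4] = { 0.0f, 0.0f, 0.0f, 1.0f };

    assert(num_channels >= 1 && num_channels <= 4);
    for (unsigned i = 0; i < 4; i++)
        out[i] = i < num_channels ? loaded[i] : defaults[i];
}
```

  With a two-channel attribute, only x and y are fetched and the caller still sees a complete (x, y, 0, 1) vector.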
* radv: store vertex attribute formats as pipeline keys | Samuel Pitoiset | 2019-02-14 | 3 | -3/+21
  The formats will be used for reducing the number of loaded channels.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use MAX_{VBS,VERTEX_ATTRIBS} when defining max vertex input limits | Samuel Pitoiset | 2019-02-14 | 1 | -2/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: make use of ac_build_expand_to_vec4() in visit_image_store() | Samuel Pitoiset | 2019-02-14 | 3 | -8/+6
  And make ac_build_expand() a static function.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* freedreno: Use the NIR lowering for isign. | Eric Anholt | 2019-02-14 | 2 | -14/+1
  I think this will save an instruction and hopefully not increase any other costs (possibly the immediate -1 and 1?), but I haven't actually tested.

  Reviewed-by: Kristian H. Kristensen <[email protected]>
* intel: Use the NIR lowering for isign. | Eric Anholt | 2019-02-14 | 3 | -31/+1
  Drops one instruction from fs-sign-int.shader_test. No change in shader-db due to it having 0 instances of sign(genIType).

  This may hurt isign64 if algebraic runs before int64 lowering, but I wasn't sure how to mark the algebraic opt as "every bit size but 64".

  v2: Update commit message about shader-db.

  Reviewed-by: Ian Romanick <[email protected]> (v1)
* v3d: Use the NIR lowering for isign instead of rolling our own. | Eric Anholt | 2019-02-14 | 1 | -16/+1
  min/max instead of comparisons saves 2 instructions on fs-sign-int.shader_test.
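  The min/max form of integer sign mentioned here is the usual clamp identity: clamping x to [-1, 1] yields sign(x). The sketch below shows it next to a comparison-based version. The function names are made up for illustration, and the exact rule in nir_opt_algebraic is not quoted in this log, so treat this as the general technique rather than the pass itself.

```c
#include <stdint.h>

static int32_t imin(int32_t a, int32_t b) { return a < b ? a : b; }
static int32_t imax(int32_t a, int32_t b) { return a > b ? a : b; }

/* sign(x) via clamping: any x <= -1 becomes -1, any x >= 1 becomes 1,
 * and 0 stays 0. */
static int32_t isign_minmax(int32_t x)
{
    return imin(imax(x, -1), 1);
}

/* Comparison-based reference: two compares and a subtract. */
static int32_t isign_cmp(int32_t x)
{
    return (x > 0) - (x < 0);
}
```

  Both functions agree for every 32-bit input, including INT32_MIN and INT32_MAX, since no intermediate value can overflow.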
* nir: Move panfrost's isign lowering to nir_opt_algebraic. | Eric Anholt | 2019-02-14 | 4 | -1/+6
  I wanted to reuse this from v3d.

  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* nir: turn an ssa check in nir_search into an assert | Timothy Arceri | 2019-02-14 | 1 | -2/+1
  Everything should be in ssa form when we call this. This is a hot path, so replace the check with an assert.

  Reviewed-by: Connor Abbott <[email protected]>
* nir: turn ssa check into an assert | Timothy Arceri | 2019-02-14 | 1 | -3/+11
  Everything should be in ssa form when this is called. Checking for it here is expensive, so turn this into an assert instead.

  Reviewed-by: Connor Abbott <[email protected]>
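  The pattern behind these two commits, trading a run-time check for an assert on an invariant, might look like the sketch below. `struct src` and both functions are hypothetical stand-ins, not NIR types. The point is that `assert` compiles to nothing when NDEBUG is defined, so release builds pay no cost on the hot path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an instruction source. */
struct src {
    bool is_ssa;
    int value;
};

/* Before: the invariant is re-checked on every call. */
static bool match_checked(const struct src *s, int expected)
{
    if (!s->is_ssa)
        return false;
    return s->value == expected;
}

/* After: callers guarantee SSA form, so the check becomes an assert that
 * only exists in debug builds. */
static bool match_asserted(const struct src *s, int expected)
{
    assert(s->is_ssa);
    return s->value == expected;
}
```

  The trade-off is that a violated invariant is only caught in debug builds, which is acceptable when every caller is known to run after SSA conversion.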
* nir: prehash instruction in nir_instr_set_add_or_rewrite() | Timothy Arceri | 2019-02-14 | 1 | -4/+5
  There is no need to hash the instruction twice, especially as we end up adding it in the majority of cases.

  Reviewed-by: Connor Abbott <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
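  A minimal sketch of the prehashing idea follows, using a toy fixed-size set of integers rather than Mesa's set implementation. All names here (`int_set`, `hash_key`, `set_add_or_find`) are invented for the example. The hash is computed once and passed to both the lookup and the insertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SET_SIZE 64 /* power of two; toy fixed capacity */

struct int_set {
    uint32_t keys[SET_SIZE];
    bool used[SET_SIZE];
};

static unsigned hash_calls; /* counts how often the hash is computed */

static uint32_t hash_key(uint32_t key)
{
    hash_calls++;
    return key * 2654435761u; /* multiplicative hash */
}

/* Linear-probing lookup that reuses a hash computed by the caller. */
static bool set_search_pre_hashed(const struct int_set *s, uint32_t hash,
                                  uint32_t key)
{
    for (unsigned i = 0; i < SET_SIZE; i++) {
        unsigned slot = (hash + i) & (SET_SIZE - 1);
        if (!s->used[slot])
            return false;
        if (s->keys[slot] == key)
            return true;
    }
    return false;
}

/* Insertion that reuses the same hash; the toy set must not be full. */
static void set_add_pre_hashed(struct int_set *s, uint32_t hash, uint32_t key)
{
    for (unsigned i = 0; i < SET_SIZE; i++) {
        unsigned slot = (hash + i) & (SET_SIZE - 1);
        if (!s->used[slot]) {
            s->used[slot] = true;
            s->keys[slot] = key;
            return;
        }
    }
    assert(!"toy set is full");
}

/* Returns true if key was already present. Hashes the key exactly once. */
static bool set_add_or_find(struct int_set *s, uint32_t key)
{
    uint32_t hash = hash_key(key);

    if (set_search_pre_hashed(s, hash, key))
        return true;
    set_add_pre_hashed(s, hash, key);
    return false;
}
```

  Without the pre-hashed variants, a miss followed by an add would hash the key twice, which is the common case the commit message points out.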
* meson: Add dependency on genxml to anvil | Dylan Baker | 2019-02-13 | 1 | -2/+5
  Currently the Intel "anvil" driver races with the generation of genxml files, while i965 has an explicit dependency. This patch adds the same dependency to anvil.

  Fixes: d1992255bb29054fa51763376d125183a9f602f ("meson: Add build Intel "anv" vulkan driver")
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* radv: always export gl_SampleMask when the fragment shader uses it | Samuel Pitoiset | 2019-02-13 | 1 | -4/+4
  For some reason, this breaks tree rendering in Project Cars.

  Fixes: 85010585cde ("radv: only enable gl_SampleMask if MSAA is enabled too")
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109401
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/aux: add PIPE_CAP_MAX_VARYINGS to u_screen | Alok Hota | 2019-02-13 | 1 | -0/+3
  Allows drivers using `u_pipe_screen_get_param_defaults` to use a fallback value for the new pipe cap.

  Default value of 8 based on GL 2.1 MAX_VARYING_FLOATS

  Reviewed-by: Eric Anholt <[email protected]>
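  The fallback mechanism can be sketched as a driver query that handles the caps it knows about and defers everything else to a shared defaults function. The enum and function names below are hypothetical and much smaller than the real gallium interface. Only the default of 8 for the varyings cap comes from the commit message.

```c
/* Hypothetical, reduced cap list (not the real PIPE_CAP enum). */
enum cap {
    CAP_MAX_VARYINGS,
    CAP_SOME_DRIVER_FEATURE,
};

/* Shared defaults, in the spirit of u_pipe_screen_get_param_defaults. */
static int get_param_defaults(enum cap c)
{
    switch (c) {
    case CAP_MAX_VARYINGS:
        return 8; /* default named in the commit message */
    default:
        return 0;
    }
}

/* Driver query: answer what the driver knows, defer the rest. The default
 * label also keeps -Wswitch quiet when new caps are added to the enum. */
static int driver_get_param(enum cap c)
{
    switch (c) {
    case CAP_SOME_DRIVER_FEATURE:
        return 1;
    default:
        return get_param_defaults(c);
    }
}
```

  A driver written this way picks up sensible values for newly introduced caps without needing a matching change of its own.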
* radv/winsys: fix BO list creation when RADV_DEBUG=allbos is set | Samuel Pitoiset | 2019-02-13 | 1 | -0/+1
  Fixes: 50fd253bd6e ("radv/winsys: Add priority handling during submit.")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* freedreno/a6xx: Fix point coord | Kristian H. Kristensen | 2019-02-13 | 3 | -8/+4
  Use ir3_next_varying() for iterating through varyings and unset the global point coord invert bit.

  Fixes: dEQP-GLES3.functional.shaders.builtin_variable.pointcoord

  Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Front facing needs UNK3 bit | Kristian H. Kristensen | 2019-02-13 | 1 | -2/+5
  We need to set UNK3 in GRAS_CNTL and RB_RENDER_CONTROL0 for the value to be reliably delivered.

  Fixes: dEQP-GLES3.functional.shaders.builtin_variable.frontfacing

  Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Update headers | Kristian H. Kristensen | 2019-02-13 | 10 | -83/+266
  This pulls in changes for compute shaders and a6xx ssbo/image support. FACENESS bit moved from position 1 to 2 and there's a global invert bit for point coord.

  Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Clean up mixed use of swap and swizzle for texture state | Kristian H. Kristensen | 2019-02-13 | 2 | -39/+28
  Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: small compiler warning fix | Rob Clark | 2019-02-13 | 1 | -0/+2
  Signed-off-by: Rob Clark <[email protected]>
* mapi: work around GCC LTO dropping assembly-defined functions | Konstantin Kharlamov | 2019-02-13 | 4 | -0/+4
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109391
  Signed-off-by: Konstantin Kharlamov <[email protected]>
  Acked-by: Eric Engestrom <[email protected]>
  Reviewed-by: Dylan Baker <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* nir: fix example in opt_peel_loop_initial_if description | Caio Marcelo de Oliveira Filho | 2019-02-12 | 1 | -3/+3
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir/opt_if: don't mark progress if nothing changes | Karol Herbst | 2019-02-13 | 1 | -0/+7
  If we have something like this:

      loop {
          ...
          if x {
              break;
          } else {
              continue;
          }
      }

  opt_if_loop_last_continue returns true, marking progress although nothing changes.

  Fixes: 5921a19d4b0c6 "nir: add if opt opt_if_loop_last_continue()"
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
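  Why a false progress report matters can be seen in a small fixed-point driver: passes are typically re-run for as long as one of them reports progress, so a pass that returns true without changing anything wastes iterations or, in the worst case, never lets the loop end. The pass and driver below are hypothetical and operate on an integer array instead of NIR.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pass: replace every negative entry with 0.
 * Returns true only if at least one entry was actually modified. */
static bool lower_negatives(int *vals, size_t n)
{
    bool progress = false;

    for (size_t i = 0; i < n; i++) {
        if (vals[i] < 0) {
            vals[i] = 0;
            progress = true;
        }
    }
    return progress;
}

/* Re-run the pass until it reports no progress; returns the number of runs. */
static unsigned run_to_fixed_point(int *vals, size_t n)
{
    unsigned runs = 0;
    bool progress;

    do {
        progress = lower_negatives(vals, n);
        runs++;
    } while (progress);

    return runs;
}
```

  If `lower_negatives` returned true unconditionally, `run_to_fixed_point` would loop forever, which is the failure mode an accurate progress flag avoids.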
* radeonsi: Fix guardband computation for large render targets | Oscar Blumberg | 2019-02-12 | 1 | -2/+28
  Stop using 12.12 quantization for viewports that are not contained in the lower 4k corner of the render target as the hardware needs to keep both absolute and relative coordinates representable.

  Signed-off-by: Marek Olšák <[email protected]>
  Cc: 18.3 19.0 <[email protected]>
* egl: fix KHR_partial_update without EXT_buffer_age | Chia-I Wu | 2019-02-12 | 1 | -1/+6
  EGL_BUFFER_AGE_EXT can be queried without EXT_buffer_age.

  Signed-off-by: Chia-I Wu <[email protected]>
  Acked-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* mesa: Advertise EXT_float_blend in ES 3.0+ contexts. | Kenneth Graunke | 2019-02-12 | 1 | -0/+1
  This extension simply drops a draw time restriction:

      "Furthermore, an INVALID_OPERATION error is generated by DrawArrays and the other drawing commands defined in section 2.8.3 (10.5 in ES 3.1) if blending is enabled (see below) and any draw buffer has 32-bit floating-point format components."

  We never correctly enforced this restriction anyway, so we were basically already implementing it. We just need to advertise it for our behavior to be correct.

  The extension requires EXT_color_buffer_float, but we already enable that via dummy_true. So we can dummy_true this one as well.

  Found while debugging WebGL conformance tests. Does not fix any.

  Reviewed-by: Tapani Pälli <[email protected]>
* gallium/swr: Param defaults for unhandled PIPE_CAPs | Alok Hota | 2019-02-12 | 1 | -4/+3
  Without using this function, we fail the -Wswitch flag when compiling the default debugoptimized mode in Meson.

  Reviewed-by: Bruce Cherniak <[email protected]>
* anv/cmd_buffer: check for NULL framebuffer | Juan A. Suarez Romero | 2019-02-12 | 1 | -5/+29
  This can happen when we record a VkCmdDraw in a secondary buffer that was created inheriting from the primary buffer, but with the framebuffer set to NULL in the VkCommandBufferInheritanceInfo.

  Vulkan 1.1.81 spec says that "the application must ensure (using scissor if necessary) that all rendering is contained in the render area [...] [which] must be contained within the framebuffer dimensions".

  While this should be done by the application, commit 465e5a86 added the clamp to the framebuffer size, in case the application does not do it. But this requires knowing the framebuffer dimensions. If we do not have a framebuffer at that moment, the best compromise we can make is to just apply the scissor as it is, and let the application ensure the rendering is contained in the render area.

  v2: do not clamp to framebuffer if there isn't a framebuffer
  v3 (Jason):
  - clamp earlier in the conditional
  - clamp to render area if command buffer is primary
  v4: clamp also x and y to render area (Jason)
  v5: rename used variables (Jason)

  Fixes: 465e5a86 ("anv: Clamp scissors to the framebuffer boundary")
  CC: Jason Ekstrand <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
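  The behaviour described here, clamp when the framebuffer is known and pass the scissor through otherwise, can be sketched as follows. The `rect` and `extent` structs and `clamp_scissor` are simplified stand-ins invented for this example. The real change also clamps to the render area for primary command buffers, which this sketch leaves out.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for VkRect2D and VkExtent2D. */
struct rect   { int32_t x, y; uint32_t width, height; };
struct extent { uint32_t width, height; };

/* Clamp a scissor rectangle to the framebuffer if one is known.
 * With fb == NULL the scissor is returned unchanged, leaving it to the
 * application to keep rendering inside the render area. */
static struct rect clamp_scissor(struct rect s, const struct extent *fb)
{
    if (fb == NULL)
        return s;

    /* Work in 64 bits so x + width cannot overflow. */
    int64_t x0 = s.x < 0 ? 0 : s.x;
    int64_t y0 = s.y < 0 ? 0 : s.y;
    int64_t x1 = (int64_t)s.x + s.width;
    int64_t y1 = (int64_t)s.y + s.height;

    if (x1 > (int64_t)fb->width)
        x1 = fb->width;
    if (y1 > (int64_t)fb->height)
        y1 = fb->height;
    if (x1 < x0)
        x1 = x0; /* fully outside: empty rectangle */
    if (y1 < y0)
        y1 = y0;

    struct rect out = {
        (int32_t)x0, (int32_t)y0,
        (uint32_t)(x1 - x0), (uint32_t)(y1 - y0),
    };
    return out;
}
```

  Passing the framebuffer as a nullable pointer makes the "no framebuffer yet" case explicit at every call site instead of hiding it behind a sentinel size.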
* radeonsi: use MEM instead of MEM_GRBM in COPY_DATA.DST_SEL | Marek Olšák | 2019-02-12 | 1 | -3/+3
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add AMD_DEBUG env var as an alternative to R600_DEBUG | Marek Olšák | 2019-02-12 | 1 | -1/+3
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8 | Samuel Pitoiset | 2019-02-12 | 3 | -3/+10
  This fixes a critical issue.

  Cc: <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109575
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for push constants inlining when possible | Samuel Pitoiset | 2019-02-12 | 5 | -28/+152
  This removes some scalar loads from shaders, but it increases the number of SET_SH_REG packets. This is currently basic but it could be improved if needed. Inlining dynamic offsets might also help.

  Original idea from Dave Airlie.

  29077 shaders in 15096 tests
  Totals:
  SGPRS: 1321325 -> 1357101 (2.71 %)
  VGPRS: 936000 -> 932576 (-0.37 %)
  Spilled SGPRs: 24804 -> 24791 (-0.05 %)
  Code Size: 49827960 -> 49642232 (-0.37 %) bytes
  Max Waves: 242007 -> 242700 (0.29 %)

  Totals from affected shaders:
  SGPRS: 290989 -> 326765 (12.29 %)
  VGPRS: 244680 -> 241256 (-1.40 %)
  Spilled SGPRs: 1442 -> 1429 (-0.90 %)
  Code Size: 8126688 -> 7940960 (-2.29 %) bytes
  Max Waves: 80952 -> 81645 (0.86 %)

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: keep track of the number of remaining user SGPRs | Samuel Pitoiset | 2019-02-12 | 1 | -0/+4
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: gather if shaders load dynamic offsets separately | Samuel Pitoiset | 2019-02-12 | 2 | -0/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: gather more info about push constants | Samuel Pitoiset | 2019-02-12 | 4 | -1/+44
  This is needed in order to inline some push constants when possible. This also adds a new helper for initializing the pass.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix compiler issues with GCC 9 | Samuel Pitoiset | 2019-02-12 | 1 | -42/+48
  "The C standard says that compound literals which occur inside of the body of a function have automatic storage duration associated with the enclosing block. Older GCC releases were putting such compound literals into the scope of the whole function, so their lifetime actually ended at the end of containing function. This has been fixed in GCC 9. Code that relied on this extended lifetime needs to be fixed, move the compound literals to whatever scope they need to accessible in."

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109543
  Cc: <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Gustaw Smolarczyk <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
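  The lifetime rule quoted above is easiest to see in a small example. The sketch below is not taken from radv. `struct range` and `range_hi` are made up, and the problematic pattern is shown only inside a comment so that the block itself stays well defined.

```c
struct range { int lo, hi; };

/*
 * Problematic pattern (shown as a comment on purpose): the compound literal
 * only lives until the end of the inner block, so r dangles afterwards.
 *
 *     const struct range *r = &(struct range){ 0, 10 };
 *     if (wide) {
 *         r = &(struct range){ 0, 100 };
 *     }
 *     return r->hi;   // undefined behaviour when wide is non-zero
 */

/* Fixed pattern: the objects are named variables declared in the scope
 * where the pointer is used, so they outlive every access through r. */
static int range_hi(int wide)
{
    struct range wide_range = { 0, 100 };
    struct range narrow_range = { 0, 10 };
    const struct range *r = wide ? &wide_range : &narrow_range;

    return r->hi;
}
```

  Older GCC releases happened to keep the inner literal alive until the function returned, which is why code of the first kind appeared to work before GCC 9.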
* i965: add P0x formats and propagate required scaling factors | Tapani Pälli | 2019-02-12 | 3 | -0/+17
  Signed-off-by: Tapani Pälli <[email protected]>
  Signed-off-by: Lin Johnson <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/compiler: add scale_factors to sampler_prog_key_data | Tapani Pälli | 2019-02-12 | 3 | -0/+8
  Patch propagates given scale_factors to lowering options.

  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* dri: add P010, P012, P016 for 10bit/12bit/16bit YUV420 formats | Tapani Pälli | 2019-02-12 | 1 | -0/+17
  Signed-off-by: Tapani Pälli <[email protected]>
  Signed-off-by: Lin Johnson <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: add option to use scaling factor when sampling planes YUV lowering | Tapani Pälli | 2019-02-12 | 2 | -21/+35
  Patch adds nir_lower_tex_options as parameter to sample_plane so that we don't need to extend nir_tex_instr for this.

  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Use info->textures_used instead of prog->SamplersUsed. | Kenneth Graunke | 2019-02-11 | 2 | -7/+7
  prog->SamplersUsed is set by the linker when validating resource limits, while info->textures_used is gathered after NIR optimizations, which may have eliminated some unused surfaces.

  This may let us skip some work.

  Reviewed-by: Eric Anholt <[email protected]>
* i965: Drop unnecessary 'and' with prog->SamplerUnits | Kenneth Graunke | 2019-02-11 | 1 | -1/+1
  textures_used_by_txf is a subset of textures_used which is a subset of prog->SamplerUnits. This should do nothing.

  Reviewed-by: Eric Anholt <[email protected]>
* nir: Gather texture bitmasks in gl_nir_lower_samplers_as_deref. | Kenneth Graunke | 2019-02-11 | 7 | -11/+40
  Eric and I would like a bitmask of which samplers are used, similar to prog->SamplersUsed, but available in NIR. The linker uses SamplersUsed for resource limit checking, but later optimizations may eliminate more samplers. So instead of propagating it through, we gather a new one. While there, we also gather the existing textures_used_by_txf bitmask.

  Gathering these bitfields in nir_shader_gather_info is awkward at best. The main reason is that it introduces an ordering dependency between the two passes. If gathering runs before lower_samplers_as_deref, it can't look at var->data.binding. If the driver doesn't use the full lowering to texture_index/texture_array_size (like radeonsi), then the gathering can't use those fields. Gathering might be run early /and/ late, first to get varying info, and later to update it after variant lowering. At this point, should gathering work on pre-lowered or post-lowered code? Pre-lowered is also harder due to the presence of structure types.

  Just doing the gathering when we do the lowering alleviates these ordering problems. This fixes ordering issues in i965 and makes the txf info gathering work for radeonsi (though they don't use it).

  Reviewed-by: Eric Anholt <[email protected]>
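  Gathering a "which textures are used" bitmask while visiting each texture access might look like the sketch below. The structs and function names are hypothetical and stand in for the NIR shader info. Only the two field names textures_used and textures_used_by_txf come from the commit messages above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one texture access found during lowering. */
struct tex_use {
    unsigned index;      /* first texture unit used */
    unsigned array_size; /* number of consecutive units (1 if not an array) */
    int is_txf;          /* non-zero for texel-fetch style accesses */
};

/* Hypothetical subset of the per-shader info. */
struct tex_info {
    uint32_t textures_used;
    uint32_t textures_used_by_txf;
};

/* Mask with count bits set, starting at bit start. */
static uint32_t bit_range(unsigned start, unsigned count)
{
    assert(count >= 1 && start + count <= 32);
    uint32_t mask = count == 32 ? UINT32_MAX : (1u << count) - 1u;
    return mask << start;
}

/* Record one access: every unit it may touch is marked as used. */
static void record_tex_use(struct tex_info *info, const struct tex_use *use)
{
    uint32_t mask = bit_range(use->index, use->array_size);

    info->textures_used |= mask;
    if (use->is_txf)
        info->textures_used_by_txf |= mask;
}
```

  Because the txf mask is only ever set together with the general mask, it is always a subset of it, which is the property the "Drop unnecessary 'and'" commit relies on.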
* nir: Use sampler derefs in drawpixels and bitmap lowering. | Kenneth Graunke | 2019-02-11 | 2 | -13/+34
  Reviewed-by: Eric Anholt <[email protected]>
* program: Make prog_to_nir create texture/sampler derefs. | Kenneth Graunke | 2019-02-11 | 1 | -5/+16
  Until now, prog_to_nir has been setting texture_index and sampler_index directly. This is different than GLSL shaders, which create variable dereferences and rely on lowering passes to reach this final form.

  radeonsi uses variable dereferences for samplers rather than texture_index and sampler_index, so it doesn't even make sense to set them there.

  By moving to derefs, we ensure that both GLSL and ARB programs produce the same final form that the driver desires.

  Reviewed-by: Eric Anholt <[email protected]>
* st/nir: Use sampler derefs in built-in shaders. | Kenneth Graunke | 2019-02-11 | 2 | -8/+24
  Reviewed-by: Eric Anholt <[email protected]>