aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* glsl: Add helperInvocationEXT() builtinCaio Marcelo de Oliveira Filho2019-09-301-0/+1
| | | | | | | | | | | From EXT_demote_to_helper_invocation, implemented with the existing nir_intrinsic_is_helper_invocation. Such builtin is necessary when using `demote` because we can't redefine the value of gl_HelperInvocation (since it is an input variable). Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Add ir_demoteCaio Marcelo de Oliveira Filho2019-09-302-0/+14
| | | | | | | | | | | | | | | To represent the new `demote` keyword when using EXT_demote_to_helper_invocation extension. Most of the changes are to include it in the visitors. Demote is not considered a control flow, so also include an empty visit member function in ir_control_flow_visitor. Only NIR actually supports `demote`, so assert the translations for TGSI and Mesa's gl_program -- since the demote is not expected to appear for those. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Extension boilerplate for EXT_demote_to_helper_invocationCaio Marcelo de Oliveira Filho2019-09-302-0/+2
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* shader_enums: Move MAX_DRAW_BUFFERS to this file.Eric Anholt2019-09-271-6/+1
| | | | | | | | | We include shader_enums.h from freedreno's compiler for both GL and Vulkan, and the main/config.h include resulted in polluting the namespace with things like MAX_VIEWPORTS that other Vulkan drivers use as their driver-specific maximums. Reviewed-by: Kristian H. Kristensen <[email protected]>
* mesa/st: calculate texture size based on EGLImage miplevelTapani Pälli2019-09-261-2/+5
| | | | | | | Fixes issues with 'egl-gl_oes_egl_image' Piglit test. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* intel: Increase Gen11 compute shader scratch IDs to 64.Kenneth Graunke2019-09-231-1/+13
| | | | | | | | | | | | | | | | | | | | | | | From the MEDIA_VFE_STATE docs: "Starting with this configuration, the Maximum Number of Threads must be set to (#EU * 8) for GPGPU dispatches. Although there are only 7 threads per EU in the configuration, the FFTID is calculated as if there are 8 threads per EU, which in turn requires a larger amount of Scratch Space to be allocated by the driver." It's pretty clear that we need to increase this for scratch address calculations, because the FFTID has a certain bit-pattern. The quote above seems to indicate that we should increase the actual thread count programmed in MEDIA_VFE_STATE as well, but we think the intention is to only bump the scratch space. Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8. Fixes: 5ac804bd9ac ("intel: Add a preliminary device for Ice Lake") Reviewed-by: Matt Turner <[email protected]>
* Revert "intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM"Kenneth Graunke2019-09-232-9/+0
| | | | | | | | | | | | | | | This reverts commit 729de1488f49033bc181b8123af5658228a51bf1. It turns out that, although the register is in the logical context, it isn't whitelisted, so we can't actually write it from userspace batch buffers. The write just becomes a noop, which is why we saw no performance changes. I manually whitelisted it, and still observed no performance gains, but it did regress KHR-GL46.texture_cube_map_array.color_depth_attachments on the iris driver. So we might need to fix something before enabling this. To prevent it randomly getting turned on should the kernel ever whitelist this register, we revert the patch for now.
* st/mesa: Bail on incomplete attachments in discard_framebufferKenneth Graunke2019-09-221-1/+1
| | | | | | | | | | | Incomplete attachments don't have an associated pipe_surface, so this would crash. Fixes a WebGL conformance test that uses incomplete attachments: https://www.khronos.org/registry/webgl/sdk/tests/conformance2/renderbuffers/invalidate-framebuffer.html?webglVersion=2&quiet=0&quick=1 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111756 Reviewed-By: Tapani Pälli <[email protected]>
* clover: add support for passing kernels as nir to the driverKarol Herbst2019-09-211-0/+3
| | | | | | | | | | | | | v2: minor formatting fixes v3: call glsl_type_singleton_init_or_ref and glsl_type_singleton_decref v4: capitalize and punctuate comments fix text_executable -> text_intermediate in TODO make glsl_type_singleton wrapper static v5: rewrite how we run the nir passes v6: fix unhandled case switch warning in st/mesa Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> (v4)
* Move blob from compiler/ to util/Jason Ekstrand2019-09-193-3/+3
| | | | | | | | There's nothing whatsoever compiler-specific about it other than that's currently where it's used. Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* util/u_queue: track job size and limit the size of queue growthTimothy Arceri2019-09-191-2/+2
| | | | | | | | | | | | | | | | | | | | | When both UTIL_QUEUE_INIT_RESIZE_IF_FULL and UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY are set, we can get into a situation where the queue never executes and grows to a huge size due to all other threads being busy. This is the case with the shader cache when attempting to compile a huge number of shaders up front. If all threads are busy compiling shaders the cache queues memory use can climb into the many GBs very fast. The use of these two flags with the shader cache is intended to allow shaders compiled at runtime to be compiled as fast as possible. To avoid huge memory use but still allow the queue to perform optimally in the run time compilation case, we now add the ability to track memory consumed by the jobs in the queue and limit it to a hardcoded 256MB which should be more than enough. Reviewed-by: Marek Olšák <[email protected]>
* i965: support AYUV/XYUV for external import onlyHaihao Xiang2019-09-181-0/+2
| | | | | | | | | Fixes: 89785e2d56e7fa ("i965: add support for sampling from AYUV") Fixes: 7cab8d3661f243 ("i965: Add support for sampling from XYUV images") Cc: Vivek Kasireddy <[email protected]> Cc: Lionel Landwerlin <[email protected]> Signed-off-by: Haihao Xiang <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"Christian Gmeiner2019-09-181-2/+2
| | | | | | | | | | There are GPUs that do not support this feature. This reverts commit e871abe452ad40efcccb0bab6b88fc31d0551e29 Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* st/mesa: remove always-true expressionErik Faye-Lund2019-09-171-1/+0
| | | | | | | | | | In case the GLSL version is 130 or higher, we've already enabled ARB_shader_bit_encoding a bit earlier in this same function. So this condition will always be true. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: Increase GL_POINT_SIZE_RANGE minimum to 1.0Kenneth Graunke2019-09-161-4/+1
| | | | | | | | | | | | | | | | | | Table 23.54 of the OpenGL 4.5 spec lists the minimum values for GL_POINT_SIZE_RANGE as [1, 1]. So zero is not allowed (even though arguably this could be useful for MSAA rendering, where a sub-1px point might cover only some samples...) This fixes the WebGL 2.0 conformance suite's state.gl-get-calls test on Chromium on Linux, which uses desktop OpenGL. The test checks that the minimum value of GL_ALIASED_POINT_SIZE_RANGE is 1. Unfortunately, that query doesn't exist in desktop GL, so it checks POINT_SIZE_RANGE, which is the anti-aliased value. There's not really anything better for Chromium to do here, unfortunately. When running Chromium with --api=es3, it maps it to the correct query and the test already works. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* st/mesa: Prefer 5551 formats for GL_UNSIGNED_SHORT_5_5_5_1.Kenneth Graunke2019-09-161-0/+7
| | | | | | | | | | | | Previously, internalformat GL_RGBA and type GL_UNSIGNED_SHORT_5_5_5_1 was promoted to RGBA8888 as the table entry with the 5551 formats is listed below the 8888 entry, and it also doesn't have GL_RGBA as a possible internalformat. Using actual 5551 fixes the following dEQP-EGL test: - dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8 Reviewed-by: Eric Anholt <[email protected]>
* scons: Make scons and meson agree about path to glapi generated headersDylan Baker2019-09-163-1/+3
| | | | | | | | Currently scons puts them in src/mapi/glapi, meosn puts them in src/mapi/glapi/gen. This results in some things being compilable only by one or the other, put them in the same places so that everyone is happy. Reviewed-by: Eric Engestrom <[email protected]>
* mesa/gl: Sync with Khronos registryHeinrich Fink2019-09-161-1/+1
| | | | | | | | | | | | | | | | | | Update GL headers and xml API from upstream Khronos registry (commit 3d0c3eb). Keep `BUILDING_MESA` quirk in glext.h. mesa/extensions: Expose EXT_EGL_sync instead of MESA_EGL_sync to reflect Khronos request of changing this extension's scope from MESA to EXT. EGL_EGL_sync is also the name of the extension that has been merged into the upstream Khronos GL registry. Remove MESA_EGL_sync spec txt from Mesa tree as it is now published as EXT by Khronos. v1: Remove MESA_EGL_sync spec and squash commits (Eric E) Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* driconfig: add a new engine name/version parameterLionel Landwerlin2019-09-156-8/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vulkan applications can register with the following structure : typedef struct VkApplicationInfo { VkStructureType sType; const void* pNext; const char* pApplicationName; uint32_t applicationVersion; const char* pEngineName; uint32_t engineVersion; uint32_t apiVersion; } VkApplicationInfo; This enables the Vulkan implementations to apply workarounds based off matching this description. Here we add a new parameter for matching the driconfig options with the following : <device driver="anv"> <application engine_name_match="MyOwnEngine.*" engine_versions="10:12,40:42"> <option name="blaaah" value="true" /> </application> </device> v2: switch engine name match to use regexps v3: Verify that the regexec returns REG_NOMATCH for match failure (Eric) v4: Add missing bit that went to the following commit (Eric) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: 19.2 <[email protected]>
* mesa: fix texStore for FORMAT_Z32_FLOAT_S8X24_UINTJiadong Zhu2019-09-121-3/+3
| | | | | | | | | | | | _mesa_texstore_z32f_x24s8 calculates source rowStride at a pace of 64-bit, this will make inaccuracy offset if the width of src image is an odd number. Modify src pointer to int_32* as source image format is gl_float which is 32-bit per pixel. Reviewed by Ilia Mirkin Signed-off-by: Jiadong Zhu <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* mesa/st: Fallback to name lookup when the variable have no ParameterCaio Marcelo de Oliveira Filho2019-09-121-2/+46
| | | | | | | | | | | | | | This brings back the fallback previously present in st_nir_lookup_parameter_index(): if there's no parameter associated with the variable, use a parameter from a variable with the same prefix. We'll have to sort out something for SPIR-V, but in the meantime let's fix GLSL. Fixes: b6384e57f5f ("mesa/st: Lookup parameters without using names") Reviewed-by: Eric Anholt <[email protected]> Tested-by: Eric Anholt <[email protected]>
* prog_to_nir: VARYING_SLOT_PSIZ is a scalarIago Toral Quiroga2019-09-121-3/+5
| | | | | | | v2: remove stray change (Erik Faye-Lund) Reviewed-by: Erik Faye-Lund <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* dri: Use DRM_FORMAT_* instead of defining our own copy.Eric Anholt2019-09-112-46/+47
| | | | | | | | | | | | We have only two defines that aren't from DRM_FORMAT_*: SARGB and SABGR. Keep only those as __DRI_IMAGE_FOURCC and garbage collect the rest. While this header is also used from the X server, the X server doesn't use any __DRI_IMAGE enums. Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* st/mesa: Only pause queries if there are any active queries to pause.Kenneth Graunke2019-09-115-4/+17
| | | | | | | | | | | | | Previously, ReadPixels, PBO upload/download, and clears would call cso_save_state with CSO_PAUSE_QUERIES, causing cso_context to call pipe->set_active_query_state() twice for each operation. This can potentially cause driver work to enable/disable statistics counters. But often, there are no queries happening which need to be paused. By keeping a simple tally of active queries, we can skip this work. Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WMAnuj Phogat2019-09-112-0/+9
| | | | | | | Initial benchmarking didn't show any performance benefits. But it might eventually. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* st/nir: fix illegal designated initializer in st_glsl_to_nir.cppBrian Paul2019-09-111-1/+1
| | | | | | | | | IIRC, designated initializers are not legal C++. Fixes the MSVC build. Fixes: 83fd1e58 ("glsl/nir: Add and use a gl_nir_link() function") Reviewed-by: Neha Bhende <[email protected]>
* prog_to_nir, tgsi_to_nir: make sure kill doesn't discard NaNsMarek Olšák2019-09-111-0/+3
| | | | Reviewed-by: Connor Abbott <[email protected]>
* glsl/nir: Add and use a gl_nir_link() functionCaio Marcelo de Oliveira Filho2019-09-102-15/+8
| | | | | | | | | Perform all the NIR linking steps in order. Change iris and i965 to use it. Suggested by Alejandro. v2: Add gl_nir_linker_options struct. Reviewed-by: Alejandro Piñeiro <[email protected]> [v1]
* gallium: Add ARB_gl_spirv supportCaio Marcelo de Oliveira Filho2019-09-101-0/+21
| | | | | | | | | | | | | | The PIPE_CAP_GL_SPIRV capability enables ARB_gl_spirv and ARB_spirv_extensions, and will make sure the corresponding SPIR-V capabilities and extensions lists are initialized. The additional PIPE_CAP_GL_SPIRV_VARIABLE_POINTERS capability enables the support for Variable Pointers in SPIR-V shaders. This depends on the driver and is not mandatory for ARB_gl_spirv support. v2: Add a PIPE_CAP for Variable Pointers. (Marek) Reviewed-by: Alejandro Piñeiro <[email protected]> [v1]
* mesa/spirv: Set a few more extensionsCaio Marcelo de Oliveira Filho2019-09-101-0/+3
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* mesa/st: Don't expect prog->nir to already existCaio Marcelo de Oliveira Filho2019-09-101-6/+5
| | | | | | | | There's no such case, if we load prog->nir from the shader cache, we shouldn't hit this path. Suggested-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* mesa/st: Add support for SPIR-V shadersCaio Marcelo de Oliveira Filho2019-09-102-50/+110
| | | | | | | | | | | | | | | | | | | | The SPIR-V codepath uses NIR linking, so we have to preprocess after the linking steps, which makes things slightly different than GLSL. To make more clear when the preprocess is happening, I've ended up inlining st_nir_get_mesa_program() into its caller. The goal was to make both GLSL and SPIR-V to use the same preprocess function, the exceptions are: - SPIR-V codepath don't support NIR state slots yet; - GLSL lowers shared memory early, so we don't do the deref lowering for those. For now I didn't bother to rename other functions and files (now that many of them apply to both GLSL and SPIR-V), but we should do this in further patches. Reviewed-by: Timothy Arceri <[email protected]>
* mesa/st: Extract preprocessing NIR stepsCaio Marcelo de Oliveira Filho2019-09-101-14/+15
| | | | | | | | | | | Refactor to split the glsl_to_nir conversion from the preprocessing NIR passes into separate functions, so we can use them in SPIR-V. Unlike in GLSL, there we'll need to perform a few passes with the NIR linker before doing the individual preprocess calls. No behavior should change with this patch. Reviewed-by: Timothy Arceri <[email protected]>
* mesa/st: Lookup parameters without using namesCaio Marcelo de Oliveira Filho2019-09-101-42/+9
| | | | | | | | | | | Use the new MainUniformStorageIndex field in Parameter instead. It was added so we could match those in the SPIR-V case, where names are optional. v2: Use MainUniformStorageIndex for all cases. Reviewed-by: Alejandro Piñeiro <[email protected]> [v1] Reviewed-by: Timothy Arceri <[email protected]>
* mesa/program: Associate uniform storage without using namesCaio Marcelo de Oliveira Filho2019-09-101-7/+1
| | | | | | | | | | Use the new UniformStorageIndex field in Parameter instead. This mechanism was added so we could match those in the SPIR-V case, where names are optional. v2: Use UniformStorageIndex for all cases. (Timothy) Reviewed-by: Timothy Arceri <[email protected]>
* mesa: Fill Parameter storage indices even when not using SPIR-VCaio Marcelo de Oliveira Filho2019-09-101-1/+17
| | | | | | | | | When creating Parameters, fill in the associated uniform storage indices, like it is done with the NIR linker used for SPIR-V. This will allow later code to not rely on names (which would never work for SPIR-V where names are optional). Reviewed-by: Timothy Arceri <[email protected]>
* glsl/nir: Fill in the Parameters in NIR linkerCaio Marcelo de Oliveira Filho2019-09-102-1/+12
| | | | | | | | | | | | | | | | | | | | | The parameter lists were not being created nor filled since i965 doesn't use them. In Gallium they are used for uniform handling, so add a way to fill them. The gl_uniform_storage struct got two new fields that let us go - from a Parameter to the matching UniformStorage and, - from the variable to the *first* UniformStorage without relying on names -- since they are optional for ARB_gl_spirv. Later patches will make use of them. v2: Do not fill parameters for i965. (Timothy) Use uint32_t for the new attributes. (Marek) v3: Serialize the new fields. (Timothy) Reviewed-by: Timothy Arceri <[email protected]>
* mesa: Pack gl_program_parameter structCaio Marcelo de Oliveira Filho2019-09-101-7/+9
| | | | | | | | | | | | | | The gl_register_file doesn't need 16 bits, so shorten it and use the extra room for 'Padded' (also mark it as a single bit). This shrinks the struct size from 32 bytes to 24 bytes. See also 4794fbc86e3 ("mesa: reduce the size of gl_program_parameter") that shrinked from 40 to 24 and later 7536af670b7 ("glsl: fix shader cache for packed param list") that added `Padded`. v2: Use just 5 bits for gl_register_file. (Timothy) Reviewed-by: Timothy Arceri <[email protected]>
* mesa/st: Do not rely on name to identify special uniformsCaio Marcelo de Oliveira Filho2019-09-101-5/+3
| | | | | | | | | | | Every uniform that have the "gl_" name also have some state slots. So use the state_slots like we did in 57b61849310 ("i965: account for NIR uniforms without name"). This removes the dependency on names, which are optional when using ARB_gl_spirv. Reviewed-by: Alejandro Piñeiro <[email protected]>
* mesa: Eliminate gl_config::rgbModeAdam Jackson2019-09-096-21/+3
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: Eliminate gl_config::have{Accum,Depth,Stencil}BufferAdam Jackson2019-09-0911-40/+16
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: Remove unused gl_config::indexBitsAdam Jackson2019-09-094-5/+1
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* egl: Add GL_MESA_EGL_sync supportHeinrich Fink2019-09-081-0/+1
| | | | | | | | This commit follow OES_EGL_sync to universially enable use of EGL sync objects with desktop OpenGL contexts. Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* intel: Stop redirecting state cache to command streamer cache sectionKenneth Graunke2019-09-061-5/+0
| | | | | | | | | | | | | | | | | | This bit redirects the state cache from the unified/RO sections of the L3 cache to the "CS command buffer" section of the cache, which would be set up via TCCNTLREG. The documentation says: "Additionaly, this redirection should be enabled only if there is a non-zero allocation for the CS command buffer section." We don't allocate any cache to the CS command buffer section, so enabling this redirection effectively disabled the state cache. The Windows driver only sets up that section when using POSH, which we do not currently use. So, leave it unallocated and disable the redirection to get a functional state cache again. Improves performance in Civilization VI by 18%, Manhattan 3.0 by 6%, and Car Chase by 2%.
* android: mesa: revert "Enable asm unconditionally"Mauro Rossi2019-09-062-0/+4
| | | | | | | | | | | | | | | | | | | | This patch partially reverts 20294dc ("mesa: Enable asm unconditionally, ...") Android makefile build logic needs to disable assembler optimization in 32bit builds to avoid text relocations for libglapi.so shared Fixes the following build error with Android x86 32bit target: [ 0% 4/477] target SharedLib: libglapi (out/target/product/x86/obj/SHARED_LIBRARIES/libglapi_intermediates/LINKED/libglapi.so) FAILED: out/target/product/x86/obj/SHARED_LIBRARIES/libglapi_intermediates/LINKED/libglapi.so ... prebuilts/gcc/linux-x86/x86/x86_64-linux-android-4.9/x86_64-linux-android/bin/ld: warning: shared library text segment is not shareable prebuilts/gcc/linux-x86/x86/x86_64-linux-android-4.9/x86_64-linux-android/bin/ld: error: treating warnings as errors clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation) Fixes: 20294dc ("mesa: Enable asm unconditionally, now that gen_matypes is gone.") Signed-off-by: Mauro Rossi <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* nir: allow specifying filter callback in lower_alu_to_scalarVasily Khoruzhick2019-09-061-2/+2
| | | | | | | | | | | | | Set of opcodes doesn't have enough flexibility in certain cases. E.g. Utgard PP has vector conditional select operation, but condition is always scalar. Lowering all the vector selects to scalar increases instruction number, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* intel/dri: finish proper glthreadSergii Romantsov2019-09-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | KWin was able to get NULL-context in the call intelUnbindContext. But a call _mesa_glthread_finish is not resistent to such case. Case can be catched with steps: 1. Create both glx and egl contexts 2. Make glx as current 3. Make egl as current 4. Reset glx context 5. Make egl as current Solution adds proper finishing of glthread-context (context will be taken from the requested dri-context for unbinding, but not from the saved current context). Piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87 Cc: 19.1 19.2 <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110814 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111271 Fixes: dca36d5516d0 (i965: Implement threaded GL support) Signed-off-by: Sergii Romantsov <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* gallium: Plumb through a way to disable GLSL const loweringConnor Abbott2019-09-053-0/+9
| | | | | | | | | | For radeonsi, we will prefer the NIR pass as it'll generate better code (some index calculation and a single load vs. a load, then index calculation, then another load) and oftentimes NIR optimization can kick in and make all the access indices constant. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* st/nir: Don't lower indirects when linkingConnor Abbott2019-09-051-17/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I believe this was stuck here early because otherwise nir_opt_copy_prop_vars could undo what lower_io_to_temporaries does. However that has since been fixed. Also, we now use scratch for large variables so the comment is stale. On radeonsi these are the shader-db results: Totals: SGPRS: 3955968 -> 3955968 (0.00 %) VGPRS: 2220208 -> 2220220 (0.00 %) Spilled SGPRs: 11387 -> 11387 (0.00 %) Spilled VGPRs: 97 -> 97 (0.00 %) Private memory VGPRs: 2528 -> 2528 (0.00 %) Scratch size: 2656 -> 2656 (0.00 %) dwords per thread Code Size: 76002108 -> 76002204 (0.00 %) bytes LDS: 740 -> 740 (0.00 %) blocks Max Waves: 772779 -> 772776 (-0.00 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 176 -> 176 (0.00 %) VGPRS: 144 -> 156 (8.33 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 12104 -> 12200 (0.79 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 28 -> 25 (-10.71 %) Wait states: 0 -> 0 (0.00 %) The few small regressions are due to nir_opt_large_constants kicking in when indirect lowering happens to result in smaller code after optimization since the array is very simple. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* st/nir: Call nir_remove_unused_variables() in the opt loopConnor Abbott2019-09-051-0/+10
| | | | | | | | | | | | | This prevents regressions when disabling indirect lowering. Sometimes the only use of an input array was copying it to the array created by nir_lower_io_to_temporaries, and without lowering indirects we wouldn't have eliminated the temporary array until after linking, which was too late to remove unused code in the producer. No shader-db changes with radeonsi NIR. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>