path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965: Tidy bogus indentation left by previous commit | Kenneth Graunke | 2019-04-22 | 1 | -26/+24
  I left code indented one level too far in the previous commit to make the diff easier to review. Drop that extra level now.

  Fixes: 6981069fc80 i965: Ignore uniform storage for samplers or images, use binding info
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Ignore uniform storage for samplers or images, use binding info | Kenneth Graunke | 2019-04-22 | 3 | -18/+28
  gl_nir_lower_samplers_as_deref creates new top level sampler and image uniforms which have been split from structure uniforms. i965 assumed that it could walk through gl_uniform_storage slots by starting at var->data.location and walking forward based on a simple slot count. This assumed that structure types were walked in a particular order.

  With samplers and images split out of structures, it becomes impossible to assign meaningful locations. Consider:

    struct S { sampler2D a; sampler2D b; } s[2];

  The gl_uniform_storage locations for these follow this map:

    0 => a[0], 1 => b[0], 2 => a[1], 3 => b[1].

  But the new split variables look like:

    sampler2D lowered_a[2];
    sampler2D lowered_b[2];

  and there is no way to know that there's effectively a stride to get to the location for successive elements of a[] or b[]. So, working with location becomes effectively impossible.

  Ultimately, the point of looking at uniform storage was to pull out the bindings from the opaque index fields. gl_nir_lower_samplers_as_deref can obtain this information while doing the splitting, however, and sets up var->data.binding to have the desired values.

  We move gl_nir_lower_samplers before brw_nir_lower_image_load_store so gl_nir_lower_samplers_as_deref has the opportunity to set proper image bindings. Then, we make the uniform handling code skip sampler(-array) variables, and handle image param setup based on var->data.binding.

  Fixes Piglit tests/spec/glsl-1.10/execution/samplers/uniform-struct, this time without regressing dEQP-GLES2.functional.uniform_api.random.3.

  Fixes: f003859f97c nir: Make gl_nir_lower_samplers use gl_nir_lower_samplers_as_deref
  Reviewed-by: Jason Ekstrand <[email protected]>
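The location mismatch described in this commit can be illustrated with a small sketch. It is not taken from the driver: `struct_array_slot` and `naive_split_slot` are hypothetical helpers that only model the slot arithmetic for `struct S { sampler2D a; sampler2D b; } s[2];`.

```c
#include <assert.h>

/* Hypothetical model of the storage layout: slot of member `member`
 * (0 = a, 1 = b) in element `elem` of an array of structs that each
 * hold `members` opaque members. Members are interleaved per element. */
static unsigned
struct_array_slot(unsigned elem, unsigned member, unsigned members)
{
   assert(member < members);
   return elem * members + member;
}

/* What a simple forward walk over a split variable such as
 * lowered_a[2] would assume: consecutive slots from a base location. */
static unsigned
naive_split_slot(unsigned base, unsigned elem)
{
   return base + elem;
}
```

Under this model, element 1 of `a` lives in slot `struct_array_slot(1, 0, 2)`, which is 2, while a forward walk from base 0 yields `naive_split_slot(0, 1)`, which is 1 and actually belongs to `b[0]`. The missing stride is why the commit switches to `var->data.binding`.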
* i965: implement WaEnableStateCacheRedirectToCS | Lionel Landwerlin | 2019-04-18 | 2 | -0/+6
  This 3d performance workaround was initially put in the kernel but the media driver requires different settings so the register has been whitelisted in i915 [1] and userspace drivers are left initializing it as they wish.

  [1] : https://patchwork.freedesktop.org/series/59494/

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* intel/perf: drop counter size field | Lionel Landwerlin | 2019-04-17 | 2 | -5/+6
  We can deduce the size from another field, let's just save some space.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
* i965: perf: add mdapi pipeline statistics queries on gen10/11 | Lionel Landwerlin | 2019-04-17 | 1 | -1/+9
  The Gen10+ expected format adds an additional counter which we can't disclose yet. We can still make the size of the expected query result match.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
* i965: move mdapi guid into intel/perf | Lionel Landwerlin | 2019-04-17 | 1 | -2/+1
  One more thing we want to share between the different APIs.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
* i965: move mdapi result data format to intel/perf | Lionel Landwerlin | 2019-04-17 | 3 | -96/+10
  We want to reuse this in Anv.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
* i965: move brw_timebase_scale to device info | Lionel Landwerlin | 2019-04-17 | 5 | -19/+15
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
* i965: move OA accumulation code to intel/perf | Lionel Landwerlin | 2019-04-17 | 3 | -167/+45
  We'll want to reuse this in our Vulkan extension.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
* i965: move mdapi data structure to intel/perf | Lionel Landwerlin | 2019-04-17 | 1 | -96/+7
  We'll want to reuse those structures later on.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
* i965: extract performance query metrics | Lionel Landwerlin | 2019-04-17 | 23 | -148117/+206
  We would like to reuse performance query metrics in other APIs. Let's make the query code that processes raw counters into human-readable values API agnostic.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: store device revision in gen_device_info | Lionel Landwerlin | 2019-04-17 | 3 | -6/+4
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move program key debugging to the compiler. | Kenneth Graunke | 2019-04-16 | 9 | -283/+36
  The i965 driver has a bunch of code to compare two sets of program keys and print out the differences. This can be useful for debugging why a shader needed to be recompiled on the fly due to non-orthogonal state dependencies. anv doesn't do recompiles, so we didn't need to share this in the past - but I'd like to use it in iris.

  This moves the bulk of the code to the compiler where it can be reused. To make that possible, we need to decouple it from i965 - we can't get at the brw program cache directly, nor use brw_context to print things. Instead, we use compiler->shader_perf_log(), and simply pass in keys.

  We put all of this debugging code in brw_debug_recompile.c, and only export a single function, for simplicity. I also tidied the code a bit while moving it, now that it all lives in one file.

  Reviewed-by: Jordan Justen <[email protected]>
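The idea of comparing two program keys and reporting the differences through a logging callback, rather than through a driver context, might look roughly like the sketch below. Everything here is hypothetical (`struct example_key`, `log_fn`, `check_field`, `debug_recompile`, `count_log`); the real keys have many more fields and the real code lives in brw_debug_recompile.c.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, heavily reduced program key. */
struct example_key {
   unsigned program_string_id;
   bool alpha_test;
   unsigned num_samples;
};

/* Stand-in for a logging hook supplied by the driver. */
typedef void (*log_fn)(void *data, const char *msg);

/* Report one field if it differs; returns true when a difference is found. */
static bool
check_field(log_fn log, void *data, const char *name, unsigned a, unsigned b)
{
   if (a == b)
      return false;

   char buf[128];
   snprintf(buf, sizeof(buf), "  %s %u->%u", name, a, b);
   log(data, buf);
   return true;
}

/* Single entry point: takes only the two keys and the logger, and
 * returns the number of fields that changed. */
static unsigned
debug_recompile(log_fn log, void *data,
                const struct example_key *old_key,
                const struct example_key *new_key)
{
   unsigned found = 0;
   found += check_field(log, data, "alpha test",
                        old_key->alpha_test, new_key->alpha_test);
   found += check_field(log, data, "number of samples",
                        old_key->num_samples, new_key->num_samples);
   if (found == 0)
      log(data, "  something else");
   return found;
}

/* Example logger that only counts messages. */
static void
count_log(void *data, const char *msg)
{
   (void)msg;
   ++*(unsigned *)data;
}
```

Because the entry point depends only on the keys and a callback, any driver that can supply a logger could reuse it, which is the decoupling the commit describes.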
* nir: optimize gl_SampleMaskIn to gl_HelperInvocation for radeonsi when possible | Marek Olšák | 2019-04-16 | 1 | -0/+1
  Acked-by: Timothy Arceri <[email protected]>
* compiler/glsl: handle case where we have multiple users for types | Tapani Pälli | 2019-04-16 | 1 | -0/+4
  Both Vulkan and OpenGL might be using glsl_types simultaneously, or we can also have multiple concurrent Vulkan instances using glsl_types. This patch adds a one-time init to track the number of users and releases the types only when the last user calls _glsl_type_singleton_decref().

  This change fixes glsl_type memory leaks we have with the anv driver.

  v2: reuse hash_mutex, cleanup, apply fix also to radv driver and rename helper functions (Jason)
  v3: move init, destroy to happen on GL context init and destroy

  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
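A user-counted singleton of the kind this commit describes can be sketched as follows. This is a minimal illustration, assuming a POSIX mutex and a plain heap allocation as the shared state; `singleton_ref`, `singleton_unref`, `users` and `shared_table` are hypothetical names, not Mesa's helpers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t users_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned users = 0;
static int *shared_table = NULL; /* stand-in for the shared type cache */

/* Called by every API instance that starts using the shared state.
 * The first user performs the one-time initialisation. */
static void
singleton_ref(void)
{
   pthread_mutex_lock(&users_mutex);
   if (users++ == 0)
      shared_table = calloc(16, sizeof(*shared_table));
   pthread_mutex_unlock(&users_mutex);
}

/* Called when an instance is done. Only the last user frees the state. */
static void
singleton_unref(void)
{
   pthread_mutex_lock(&users_mutex);
   assert(users > 0);
   if (--users == 0) {
      free(shared_table);
      shared_table = NULL;
   }
   pthread_mutex_unlock(&users_mutex);
}
```

With two concurrent users, the first `singleton_unref()` leaves the table alive and the second one releases it, so neither user can free state the other still needs and nothing is leaked once both are gone.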
* Delete autotools | Dylan Baker | 2019-04-15 | 13 | -955/+0
  Acked-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
  Acked-by: Marek Olšák <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Matt Turner <[email protected]>
* st/mesa: add support for EXT_shader_image_load_formatted | Rhys Perry | 2019-04-15 | 1 | -0/+1
  v3: rebase

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]> (v2)
  Signed-off-by: Marek Olšák <[email protected]>
* mesa, glsl: add support for EXT_shader_image_load_formatted | Rhys Perry | 2019-04-15 | 2 | -0/+2
  v3: rebase

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]> (v2)
  Signed-off-by: Marek Olšák <[email protected]>
* intel: Emit 3DSTATE_VF_STATISTICS dynamically | Kenneth Graunke | 2019-04-14 | 2 | -6/+24
  Pipeline statistics queries should not count BLORP's rectangles.

    (23) How do operations like Clear, TexSubImage, etc. affect the results of the newly introduced queries?

    DISCUSSION: Implementations might require "helper" rendering commands be issued to implement certain operations like Clear, TexSubImage, etc.

    RESOLVED: They don't. Only application submitted rendering commands should have an effect on the results of the queries.

  Piglit's arb_pipeline_statistics_query-vert_adj exposes this bug when the driver is hacked to always perform glBufferData via a GPU staging copy (for debugging purposes).

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: make nir_const_value scalar | Karol Herbst | 2019-04-14 | 1 | -4/+4
  v2: remove & operator in a couple of memsets
      add some memsets
  v3: fixup lima

  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]> (v2)
* mesa: don't overwrite existing shader files with MESA_SHADER_CAPTURE_PATH | Marek Olšák | 2019-04-12 | 1 | -3/+17
  Reviewed-by: Tapani Pälli <[email protected]>
* ac/nir_to_llvm: add image bindless support | Timothy Arceri | 2019-04-12 | 1 | -0/+2
  With this all piglit bindless image tests pass on radeonsi.

  Reviewed-by: Marek Olšák <[email protected]>
* glsl/nir: add support for lowering bindless images_derefs | Karol Herbst | 2019-04-12 | 1 | -1/+1
  v2: handle atomics as well
      make use of nir_rewrite_image_intrinsic
  v3: remove call to nir_remove_dead_derefs
  v4: (Timothy Arceri) don't actually call lowering yet

  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]> (v3)
  Reviewed-by: Marek Olšák <[email protected]>
* nir/i965/freedreno/vc4: add a bindless bool to type size functions | Timothy Arceri | 2019-04-12 | 2 | -9/+13
  This is required to calculate sizes correctly when we have bindless samplers/images.

  Reviewed-by: Marek Olšák <[email protected]>
* nir: move brw_nir_rewrite_image_intrinsic into common code | Karol Herbst | 2019-04-12 | 1 | -1/+1
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* swrast: fix undefined shift of 1 << 31 | Dave Airlie | 2019-04-12 | 1 | -1/+1
  Pointed out by coverity

  Reviewed-by: Eric Engestrom <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
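For context on why `1 << 31` is flagged: with a 32-bit `int`, the literal `1` is signed, and shifting it left by 31 produces a value that `int` cannot represent, which is undefined behaviour in C. The usual remedy is to shift an unsigned value instead. The helper below is a generic sketch (`bit32` is a hypothetical name), not the swrast change itself.

```c
#include <assert.h>
#include <stdint.h>

/* Return a mask with only bit `bit` set, for bit in [0, 31].
 * The shift operates on an unsigned 32-bit value, so setting the
 * top bit is well defined. */
static uint32_t
bit32(unsigned bit)
{
   assert(bit < 32);
   return UINT32_C(1) << bit;
}
```

`bit32(31)` yields `0x80000000` without touching a sign bit.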
* intel/common: move gen_debug to intel/dev | Mark Janes | 2019-04-10 | 4 | -4/+4
  libintel_common depends on libintel_compiler, but it contains debug functionality that is needed by libintel_compiler. Break the circular dependency by moving gen_debug files to libintel_dev.

  Suggested-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* st: Lower uniforms in st in the !PIPE_CAP_PACKED_UNIFORMS case as well. | Eric Anholt | 2019-04-10 | 4 | -0/+19
  PIPE_CAP_PACKED_UNIFORMS conflates several things: Lowering uniforms i/o at the st level instead of the backend, packing uniforms with no padding at all, and lowering to UBOs.

  Requiring backends to lower uniforms i/o for !PIPE_CAP_PACKED_UNIFORMS leads to the driver needing to either link against the type size function in mesa/st, or duplicating it in the backend. Given that all backends want this lower-io as far as I can tell, just move it to mesa/st to resolve the link issue and avoid the driver author needing to understand st's uniforms layout.

  Incidentally, fixes uniform layout failures in nouveau in:

    dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_fragment
    dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex
    dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_fragment
    dEQP-GLES2.functional.shaders.struct.uniform.sampler_array_vertex

  and I think in Lima as well.

  v2: fix indents

  Reviewed-by: Kenneth Graunke <[email protected]>
* virgl: Enable passing arrays as input to fragment shaders | Gert Wollny | 2019-04-10 | 1 | -7/+57
  This is needed to properly handle interpolateAt* when the input to be interpolated is passed as an array in the original GLSL. Currently, the GLSL compiler would lower selecting the correct input so that the interpolant parameter to interpolateAt* is a temporary, and this cannot be used to create a valid shader on the host side, because there the parameter must be a shader input. Passing the array through lets the created TGSI be turned back into proper GLSL.

  This is related to the virglrenderer bug https://gitlab.freedesktop.org/virgl/virglrenderer/issues/74

  v2: Squash the two patches handling these flags into another

  Signed-off-by: Gert Wollny <[email protected]>
  Reviewed-by: Gurchetan Singh <[email protected]>
* gallium: Add PIPE_CAP_COMPUTE_SHADER_DERIVATIVES | Caio Marcelo de Oliveira Filho | 2019-04-08 | 1 | -0/+1
  To enable NV_compute_shader_derivatives, which allows derivatives (and texture lookups with implicit derivatives) in compute shaders.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Advertise NV_compute_shader_derivatives | Caio Marcelo de Oliveira Filho | 2019-04-08 | 1 | -0/+1
  Reviewed-by: Ian Romanick <[email protected]>
* glsl: Parse and propagate derivative_group to shader_info | Caio Marcelo de Oliveira Filho | 2019-04-08 | 1 | -0/+6
  NV_compute_shader_derivatives allows selecting between two possible arrangements (quads and linear) when calculating derivatives and certain subgroup operations in case of Vulkan. So parse and propagate those up to shader_info.h.

  v2: Do not fail when ARB_compute_variable_group_size is being used, since we are still clarifying what is the right thing to do here.

  Reviewed-by: Ian Romanick <[email protected]>
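As a rough illustration of what propagating the derivative group could enable, the sketch below models the two arrangements as an enum and checks a fixed local size against them. It assumes the local-size constraints are those of the NV_compute_shader_derivatives specification (x and y multiples of two for quads, total invocation count a multiple of four for linear); please verify against the specification. The enum and function names are hypothetical, not the ones in shader_info.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the real enum in shader_info.h may differ. */
enum example_derivative_group {
   EXAMPLE_DERIVATIVE_GROUP_NONE,
   EXAMPLE_DERIVATIVE_GROUP_QUADS,
   EXAMPLE_DERIVATIVE_GROUP_LINEAR,
};

/* Check a fixed local workgroup size against the selected arrangement.
 * Assumed rules: quads need 2x2 tiles in x/y, linear needs groups of
 * four consecutive invocations. */
static bool
local_size_valid(enum example_derivative_group group,
                 unsigned x, unsigned y, unsigned z)
{
   assert(x > 0 && y > 0 && z > 0);

   switch (group) {
   case EXAMPLE_DERIVATIVE_GROUP_QUADS:
      return x % 2 == 0 && y % 2 == 0;
   case EXAMPLE_DERIVATIVE_GROUP_LINEAR:
      return (x * y * z) % 4 == 0;
   default:
      return true;
   }
}
```

A check like this only makes sense for fixed local sizes, which fits the v2 note above about not failing when ARB_compute_variable_group_size is in use.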
* mesa: Extension boilerplate for NV_compute_shader_derivatives | Caio Marcelo de Oliveira Filho | 2019-04-08 | 2 | -0/+2
  Reviewed-by: Ian Romanick <[email protected]>
* nir/radv: remove restrictions on opt_if_loop_last_continue() | Timothy Arceri | 2019-04-09 | 1 | -1/+1
  When I implemented opt_if_loop_last_continue() I had restricted this pass from moving other if-statements inside the branch opposite the continue. At the time it was causing a bunch of spilling in shader-db for i965.

  However Samuel Pitoiset noticed that making this pass more aggressive significantly improved the performance of Doom on RADV. Below are the statistics he gathered.

  28717 shaders in 14931 tests

  Totals:
    SGPRS: 1267317 -> 1267549 (0.02 %)
    VGPRS: 896876 -> 895920 (-0.11 %)
    Spilled SGPRs: 24701 -> 26367 (6.74 %)
    Code Size: 48379452 -> 48507880 (0.27 %) bytes
    Max Waves: 241159 -> 241190 (0.01 %)

  Totals from affected shaders:
    SGPRS: 23584 -> 23816 (0.98 %)
    VGPRS: 25908 -> 24952 (-3.69 %)
    Spilled SGPRs: 503 -> 2169 (331.21 %)
    Code Size: 2471392 -> 2599820 (5.20 %) bytes
    Max Waves: 586 -> 617 (5.29 %)

  The code size increase is related to Wolfenstein II; it seems largely due to an increase in phis rather than the existing jumps.

  This gives +10% FPS with Doom on my Vega56.

  Rhys Perry also benchmarked Doom on his VEGA64:

    Before: 72.53 FPS
    After: 80.77 FPS

  v2: disable pass on non-AMD drivers

  Reviewed-by: Ian Romanick <[email protected]> (v1)
  Acked-by: Samuel Pitoiset <[email protected]>
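The percentages in the statistics above are relative changes from the "before" value. A tiny helper makes the arithmetic explicit; `percent_change` is a hypothetical name used only for this illustration.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent. */
static double
percent_change(double before, double after)
{
   assert(before != 0.0);
   return (after - before) / before * 100.0;
}
```

For example, spilled SGPRs going from 24701 to 26367 is an increase of 1666, or about 6.74 %, and 503 to 2169 is the same absolute increase but about 331.21 % because the base is so much smaller. The VEGA64 result of 72.53 to 80.77 FPS works out to roughly 11.4 %.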
* intel: add dependency on genxml generated files | Lionel Landwerlin | 2019-04-08 | 1 | -1/+1
  Drivers using genxml may start compilation before the generated files are created, so add a dependency on them.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
  Reviewed-by: Dylan Baker <[email protected]>
  Cc: [email protected]
* mesa/main: Fix multisample texture initialize | Illia Iorin | 2019-04-05 | 1 | -13/+25
  The sampler of multisample textures wasn't initialized correctly. So when a texture object is created as multisample, its sampler is now initialized as a separate case. We change the initial state of TEXTURE_MIN_FILTER and TEXTURE_MAG_FILTER to NEAREST. These changes are approved by KhronosGroup.

  https://github.com/KhronosGroup/OpenGL-API/issues/45

  Signed-off-by: Sergii Romantsov <[email protected]>
  Signed-off-by: Illia Iorin <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109057
* glsl: remember which SSBOs are not read-only and pass it to gallium | Marek Olšák | 2019-04-04 | 2 | -1/+7
  Reviewed-by: Timothy Arceri <[email protected]>
* gallium: add writable_bitmask parameter into set_shader_buffers | Marek Olšák | 2019-04-04 | 2 | -3/+3
  to indicate write usage per buffer. This is just a hint (it will be used by radeonsi).

  Reviewed-by: Timothy Arceri <[email protected]>
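A per-buffer write-usage hint expressed as a bitmask can be sketched like this. The helpers are hypothetical and only show the encoding idea (bit i set means buffer slot i may be written); they are not the gallium interface itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a mask from per-buffer read-only flags: bit i is set when
 * buffer i is NOT read-only. Supports up to 32 buffers. */
static uint32_t
build_writable_bitmask(const bool *read_only, unsigned count)
{
   assert(count <= 32);

   uint32_t mask = 0;
   for (unsigned i = 0; i < count; i++) {
      if (!read_only[i])
         mask |= UINT32_C(1) << i;
   }
   return mask;
}

/* Query the hint for one slot. */
static bool
buffer_is_writable(uint32_t mask, unsigned slot)
{
   assert(slot < 32);
   return (mask >> slot) & 1u;
}
```

With flags `{true, false, false}` the mask is `0x6`: slot 0 is read-only, slots 1 and 2 may be written. Since the commit calls this a hint, a driver would presumably be free to ignore it.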
* st/mesa: Fix GL_MAP_COLOR with glDrawPixels GL_COLOR_INDEX | Danylo Piliaiev | 2019-04-04 | 1 | -2/+32
  Documentation for glDrawPixels with GL_COLOR_INDEX says:

    "If the GL is in color index mode, and if GL_MAP_COLOR is true, the index is replaced with the value that it references in lookup table GL_PIXEL_MAP_I_TO_I"

  We are always in RGBA mode and there is nothing in documentation about GL_MAP_COLOR in RGBA mode for GL_COLOR_INDEX.

  Scale and bias are also only applicable for RGBA format and not mentioned for GL_COLOR_INDEX.

  Thus the behaviour will be on par with i965.

  Fixes: gl-1.0-drawpixels-color-index

  Signed-off-by: Danylo Piliaiev <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* st/nir: run st_nir_opts after 64bit ops lowering | Tapani Pälli | 2019-04-04 | 1 | -1/+1
  CID: 1444309
  Fixes: 9ab1b1d0227 "st/nir: Move 64-bit lowering later"
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* gallium: implement ARB/KHR_parallel_shader_compile | Marek Olšák | 2019-04-01 | 1 | -1/+58
* mesa: implement ARB/KHR_parallel_shader_compile | Marek Olšák | 2019-04-01 | 8 | -0/+44
  Tested by piglit.
* meson: strip rpath from megadrivers | Eric Engestrom | 2019-04-01 | 1 | -0/+3
  More specifically, use the library file that has been post-processed by Meson when creating the hardlinks.

  Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=108766
  Fixes: 3218056e0eb375eeda47 "meson: Build i965 and dri stack"
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Dylan Baker <[email protected]>
* i965: perf: update render basic configs for big core gen9/gen10 | Lionel Landwerlin | 2019-04-01 | 8 | -23/+24
  This update allows an MI_LRI to trigger an OA report write in the global OA buffer. This isn't really useful for us, we just keep close to the internal public configs.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: add ring busyness metric for cfl gt2 | Lionel Landwerlin | 2019-04-01 | 1 | -1/+165
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: enable Icelake metrics | Lionel Landwerlin | 2019-03-31 | 3 | -3/+11
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: add Icelake metrics | Lionel Landwerlin | 2019-03-31 | 1 | -0/+11899
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: sklgt2: drop programming of an unused NOA register | Lionel Landwerlin | 2019-03-31 | 1 | -11/+6
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: hsw: drop register programming not needed on HSW | Lionel Landwerlin | 2019-03-31 | 1 | -2/+1
  This register is flagged as IVB only in the documentation.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: chv: fixup counters names | Lionel Landwerlin | 2019-03-31 | 1 | -25/+25
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>