summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* glsl/ast: don't crash when func_name is NULLDave Airlie2016-06-061-0/+4
| | | | | | | | | | | | This fixes a crash in GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types If we can't find the func_name in one of these paths, we have emitted an earlier error so just return here. Reviewed-by: Timothy Arceri <[email protected]> Cc: "11.2 12.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* glsl: handle ast_aggregate in has_sequence_subexpression. (v2)Dave Airlie2016-06-061-1/+1
| | | | | | | | | | | | | | | GL43-CTS.compute_shader.work-group-size does uniform uint g_uniform[gl_WorkGroupSize.z + 20] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24 }; The initializer triggers the GLSL 4.30/GLES3 tests for constant sequence subexpressions, so it doesn't happen unless you are using those, so just return false as this path is now reachable. v2: update commit msg with diagnosis Acked-by: Timothy Arceri <[email protected]> Cc: "11.2 12.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa: Try to unbreak the MSVC build.Kenneth Graunke2016-06-052-0/+8
| | | | | | | PATH_MAX is apparently not a thing on Windows. Borrow the hack from pipe_loader.c to try and make this work. Signed-off-by: Kenneth Graunke <[email protected]>
* mesa: Add MESA_SHADER_CAPTURE_PATH for writing .shader_test files.Kenneth Graunke2016-06-053-0/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This writes linked shader programs to .shader_test files to $MESA_SHADER_CAPTURE_PATH in the format used by shader-db (http://cgit.freedesktop.org/mesa/shader-db). It supports both GLSL shaders and ARB programs. All stages that are linked together are written in a single .shader_test file. This eliminates the need for shader-db's split-to-files.py, as Mesa produces the desired format directly. It's much more reliable than parsing stdout/stderr, as those may contain extraneous messages, or simply be closed by the application and unavailable. We have many similar features already, but this is a bit different: - MESA_GLSL=dump writes to stdout, not files. - MESA_GLSL=log writes each stage to separate files (rather than all linked shaders in one file), at draw time (not link time), with uniform data and state flag info. - Tapani's shader replacement mechanism (MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH) also uses separate files per shader stage, but allows reading in files to replace an app's shader code. v2: Dump ARB programs too, not just GLSL. v3: Don't dump bogus 0.shader_test file. v4: Add "GL_ARB_separate_shader_objects" to the [require] block. v5: Print "GLSL 4.00" instead of "GLSL 4.0" in the [require] block. v6: Don't hardcode /tmp/mesa. v7: Fix memoization of getenv(). v8: Also print "SSO ENABLED" (suggested by Timothy). v9: Also handle ES shaders (suggested by Ilia). v10: Guard against MESA_SHADER_CAPTURE_PATH being too long; add _mesa_warning calls on error handling (suggested by Ben). v11: Fix crash when variable is unset introduced in v10. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nv50,nvc0: fix BGR10_A2UI vertex formatIlia Mirkin2016-06-051-1/+1
| | | | | | | | This is mostly academic as this is not reachable from GL, which only has the packed RGB10_A2UI vertex format. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: do not clear surfaces bins in the validate functionSamuel Pitoiset2016-06-052-5/+2
| | | | | | | | | We should not call nouveau_bufctx_reset() inside a validate function. This only affects Fermi where images are aliased between 3D and CP. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: re-validate images after launching a grid on FermiSamuel Pitoiset2016-06-051-0/+3
| | | | | | | | | | | | | | | | | | | | | Images invalidation is a bit weird on Fermi and there is already a hack which forces invalidating all images when launching a computer shader to help in fixing 3D<->CP interaction. However, we need to re-validate images for compute because nvc0_compute_invalidate_surfaces() will destroy the previous binding. This is not really good for performance purposes but this might be improved later. This fixes the following piglits: - spec/arb_compute_shader/execution/basic-uniform-access - spec/arb_compute_shader/execution/mutiple-texture-reading - spec/arb_compute_shader/execution/multiple-workgroups - spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* radeonsi: fix images with level > 0Marek Olšák2016-06-051-1/+1
| | | | | | | | | | This should fix spec@arb_shader_image_load_store@level. Broken by: Commit: 95c5bbae66af3ca1f805d94f6fe8d8e4ba2c9c43 radeonsi: set some image descriptor fields at bind time Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nvc0: reduce overhead from always marking images dirtyIlia Mirkin2016-06-041-9/+36
| | | | | | | | | | We would revalidate images when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the images. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: reduce overhead from always marking buffers dirtyIlia Mirkin2016-06-041-6/+20
| | | | | | | | | | We would revalidate buffers when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the SSBOs. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: fix memory barrier flag handlingIlia Mirkin2016-06-041-9/+16
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: mark bound buffer range validIlia Mirkin2016-06-043-0/+9
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* anv/entrypoints: don't go using wayland/xcb unless they are configuredDave Airlie2016-06-051-6/+9
| | | | | | | | | | | | | | The fix in: anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards breaks things if wayland headers aren't installed. Separate things out properly to avoid that problem. [airlied: fixed up to put in pre-existing sections]. Reported-by: Arjan van de Ven Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/radeon: don't use the DMA ring for pipelined buffer uploadsMarek Olšák2016-06-041-5/+4
| | | | | | | | | | | | | | | | | | | | Submitting a DMA IB flushes the GFX IB and all GPU caches. Vedran Miletić said: "On Tonga 380X, this improves The Talos Principle from 8.3 fps to 28.3 fps (all graphics settings Ultra, 4xAA, 1080p resolution with downsampling from 1200p)." Some anonymous dude said: R9 390 results: Tomb Raider (normal settings): 80 -> 88 FPS Talos Principle (custom settings): 23 -> 56 FPS Metro Last Light Redux (default benchmark settings): 39 -> 40 FPS Reviewed-by: Alex Deucher <[email protected]> Tested-by: Vedran Miletić <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* r600g: don't flush caches when binding shader resourcesMarek Olšák2016-06-044-31/+26
| | | | | | Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* r600g: only do necessary cache flushes in cp_dma_copy_bufferMarek Olšák2016-06-041-14/+1
| | | | | | | | | The main impact is that {upload, draw, upload, draw, ..} doesn't flush framebuffer caches before every upload. Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* r600g: only do necessary cache flushes in cp_dma_clear_bufferMarek Olšák2016-06-042-14/+18
| | | | | | | | The main impact is that fast color clear doesn't flush TC, CONST, DB. Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* r600g: remove a CP DMA workaround that's not needed anymoreMarek Olšák2016-06-041-6/+0
| | | | | | Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* r600g: fix CP DMA hazard with index buffer fetches (v3)Marek Olšák2016-06-047-7/+93
| | | | | | | | | v3: use PFP_SYNC_ME on EG-CM only when supported by the kernel, otherwise use MEM_WRITE + WAIT_REG_MEM to emulate that Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* r600g: properly sync CP with CP DMA on R6xxMarek Olšák2016-06-041-1/+8
| | | | | | | | This will allow removing useless cache & IB flushes. Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* r600g: write WAIT_UNTIL in the correct placeMarek Olšák2016-06-041-8/+11
| | | | | | | | | | This has been wrong all along. Fixing this will allow removing useless cache flushes. Cc: 11.1 11.2 12.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallium/radeon: rename allocator_so_filled_size -> allocator_zeroed_memoryMarek Olšák2016-06-043-6/+6
| | | | | | Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallium/u_suballoc: allow different alignment for each allocationMarek Olšák2016-06-048-21/+20
| | | | | | | | | Just move the alignment parameter from u_suballocator_create to u_suballocator_alloc. Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* anv/blit: Use CLAMP_TO_EDGE for scaled blitsJason Ekstrand2016-06-031-0/+3
| | | | | | | | | | | When upscaling you can end up interpolating between the edge pixel and one past the edge. Using CLAMP_TO_EDGE seems like the most reasonable thing to do in this case. This fixes two of the new Vulkan CTS tests in dEQP-VK.api.copy_and_blit.blit_image.* Jason Ekstrand <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Cc: "12.0" <[email protected]>
* anv/copy: Account for the anv_surface.offset when creating a blit2d_surfJason Ekstrand2016-06-031-17/+17
| | | | | | | | | This was causing problems if the user tried to copy to/from the stencil portion of a combined depth/stencil image. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Nanley Chery <[email protected]> Cc: "12.0" <[email protected]>
* nir/spirv: Make a decoration switch completeJason Ekstrand2016-06-031-3/+1
| | | | | | | | Getting rid of the default case makes the compiler warn if we are missing cases. While we're here, we also add the one missing case. Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* nir/spirv: Make unhandled decorations and capabilities non-fatalJason Ekstrand2016-06-032-18/+36
| | | | | | | | glslang frequently throw bogus decorations into shaders. While we are free to assert-fail, it's a bit nicer to the application to just warn. Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* nir/spirv: Add a way to print non-fatal warningsJason Ekstrand2016-06-032-0/+19
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* nir/spirv: Add string lookup tables for a couple of SPIR-V enumsJason Ekstrand2016-06-033-0/+179
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* nir/spirv: Complete the list of capabilitiesJason Ekstrand2016-06-031-3/+45
| | | | | | | | | Previously we supported a subset of capabilities and just left a default case for the others. It's time to stop being lazy and actually audit the capabilities. This should bring them up-to-date with reality. Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* anv/pipeline: Add support for early depth stencilJason Ekstrand2016-06-032-2/+17
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* mesa: Get rid of _mesa_active_fragment_shader_has_side_effectsJason Ekstrand2016-06-031-18/+0
| | | | | | | It is no longer used. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/ps_state: Use wm_prog_data.has_side_effectsJason Ekstrand2016-06-032-9/+6
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs Add a wm_prog_data bit for has_side_effectsJason Ekstrand2016-06-032-0/+15
| | | | | | | | | | | This is more accurate than calling _mesa_active_fragment_shader_has_side_effects because it looks at whether or not the SSBOs, images, or atomic buffers are actually written rather than just existing in the program. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* nir/info: Get rid of uses_interp_var_at_offsetJason Ekstrand2016-06-033-10/+0
| | | | | | | | | We were using this briefly in the i965 driver to trigger recompiles but we haven't been using it since we switched to the NIR y-transform lowering pass. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv/pipeline: Silently pass tests if depth or stencil is missingJason Ekstrand2016-06-033-3/+35
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* anv/pipeline: Unify gen7/8 emit_ds_stateJason Ekstrand2016-06-033-85/+60
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* genxml/gen6,7,75: s/BackFace/BackfaceJason Ekstrand2016-06-035-8/+8
| | | | | | | | This is more consistent with gen8+ Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* nir/spirv: Handle the WorkgroupSize builtin decorationJason Ekstrand2016-06-031-0/+22
| | | | | | | | | This fixes the 7 dEQP-VK.pipeline.spec_constant.compute.local_size.* tests in the latest dev version of the Vulkan CTS. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* nir/spirv: Use breaks instead of returns in constant handlingJason Ekstrand2016-06-031-3/+4
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* anv/pipeline: Refactor specialization constant handling a bitJason Ekstrand2016-06-031-5/+4
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* nir/lower_indirect_derefs: Use the direct array deref for recursionJason Ekstrand2016-06-031-1/+1
| | | | | | | | This fixes about 100 of the new Vulkan CTS tests. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* anv/clear: Handle ClearImage on 3-D imagesJason Ekstrand2016-06-031-2/+4
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* Revert "i965/fs: Allow scalar source regions on SNB math instructions."Francisco Jerez2016-06-033-7/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit c1107cec44ab030c7fcc97c67baa12df1cc9d7b5. Apparently the hardware spec text I quoted in the commit message was outright lying about scalar source math being supported on SNB, the hardware seems to load 32 contiguous bits of data for each channel regardless of the regioning mode. Fixes regressions in the following CTS tests (which we didn't catch early due to CTS being temporarily disabled in our CI system): es2-cts.gtf.gl.atan.atan_vec3_frag_xvary es2-cts.gtf.gl.cos.cos_vec2_frag_xvary es2-cts.gtf.gl.atan.atan_vec2_frag_xvary es2-cts.gtf.gl.pow.pow_vec2_frag_xvary_yconsthalf es2-cts.gtf.gl.cos.cos_float_frag_xvary es2-cts.gtf.gl.pow.pow_float_frag_xvary_yconsthalf es2-cts.gtf.gl.atan.atan_vec3_frag_xvaryyvary es2-cts.gtf.gl.pow.pow_vec3_frag_xvary_yconsthalf es2-cts.gtf.gl.cos.cos_vec3_frag_xvary es2-cts.gtf.gl.atan.atan_vec2_frag_xvaryyvary Cc: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96346 Reported-by: Mark Janes <[email protected]> Acked-by: Matt Turner <[email protected]>
* i965/vec4: Fix cmod propagation not to propagate non-identity cmod into CMP(N).Francisco Jerez2016-06-031-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | The conditional mod of these instructions determines the semantics of the comparison itself (rather than being evaluated based on the result of the instruction as is usually the case for most other instructions that allow conditional mods), so it's in general not legal to propagate a conditional mod into a CMP instruction. This prevents cmod propagation from (mis)optimizing: cmp.z.f0 tmp, ... mov.z.f0 null, tmp into: cmp.z.f0 tmp, ... which gives the negation of the flag result of the original sequence. I originally noticed this while working on SIMD32 in the scalar back-end, but the same scenario is likely to be possible in vec4 programs so this commit ports the bugfix with the same name from the scalar back-end to the vec4 cmod propagation pass. Cc: [email protected] Reviewed-by: Jason Ekstrand <[email protected]>
* anv: add the X related and Wayland CFLAGS to VULKAN_ENTRYPOINT_CPPFLAGSEmil Velikov2016-06-041-0/+2
| | | | | | | | | | Otherwise we will fail to find the headers in some scenarios. Cc: <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reported-by: Tobias Klausmann <[email protected]> Tested-by: Tobias Klausmann <[email protected]> Reviewed-by: Tobias Klausmann <[email protected]>
* nir: automake: add nir_search_helpers.h to the sources list(s)Emil Velikov2016-06-041-0/+1
| | | | | | Fixes: dfbae7d64f4 ("nir/algebraic: support for power-of-two optimizations") Signed-off-by: Emil Velikov <[email protected]>
* freedreno/ir3: do idiv lowering after main opt loopRob Clark2016-06-031-16/+27
| | | | | | | Give algebraic-opt pass a chance to catch udiv by const power-of-two, before running lower-idiv pass. Signed-off-by: Rob Clark <[email protected]>
* nir/algebraic: support for power-of-two optimizationsRob Clark2016-06-036-5/+128
| | | | | | | | | | | | Some optimizations, like converting integer multiply/divide into left/ right shifts, have additional constraints on the search expression. Like requiring that a variable is a constant power of two. Support these cases by allowing a fxn name to be appended to the search var expression (ie. "a#32(is_power_of_two)"). Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* radeonsi: mark buffer texture range valid for shader imagesNicolai Hähnle2016-06-031-0/+23
| | | | | | | | | | | | | When a shader image view into a buffer texture can be written to, the buffer's valid range must be updated, or subsequent transfers may incorrectly skip synchronization. This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels, reported by Michel Dänzer. Cc: Michel Dänzer <[email protected]> Cc: 12.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]>