path: root/src
Commit message | Author | Age | Files | Lines
* spirv: add support for doubles to OpSpecConstant | Samuel Iglesias Gonsálvez | 2017-01-09 | 5 | -8/+55
  v2 (Jason):
  - Fix indent in radv change
  - Add vtn_u64_literal() helper to take 64 bits (Jason)

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv/nir: add (un)packDouble2x32() translation | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -0/+2
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv/nir: implement DF conversions | Samuel Iglesias Gonsálvez | 2017-01-09 | 3 | -13/+23
  SPIR-V does not have special opcodes for DF conversions. We need to identify them by checking the bit size of the operand and the result.

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
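The bit-size check described in this commit can be illustrated with a minimal sketch. The enum and function names below are hypothetical stand-ins invented for the example, not Mesa's actual opcodes or helpers; the only idea taken from the commit message is that the conversion is chosen from the operand and result bit sizes.

```c
#include <assert.h>

/* Hypothetical stand-ins for conversion opcodes; not Mesa's real nir_op values. */
enum conv_op { CONV_MOV, CONV_F2D, CONV_D2F };

/* Pick a float-to-float conversion purely from the operand and result
 * bit sizes, since the opcode itself does not say which one is meant. */
static enum conv_op
pick_float_conversion(unsigned src_bit_size, unsigned dst_bit_size)
{
   assert(src_bit_size == 32 || src_bit_size == 64);
   assert(dst_bit_size == 32 || dst_bit_size == 64);

   if (src_bit_size == dst_bit_size)
      return CONV_MOV;          /* same width: nothing to convert */

   return dst_bit_size == 64 ? CONV_F2D : CONV_D2F;
}
```

For instance, `pick_float_conversion(32, 64)` selects the float-to-double case in this sketch.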
* nir: add nir_type_conversion_op() | Samuel Iglesias Gonsálvez | 2017-01-09 | 2 | -0/+83
  This function returns the nir_op corresponding to the conversion between the given nir_alu_type arguments.

  This function lacks support for integer-based types with bit_size != 32 and for float16 conversion ops.

  v2:
  - Improve readability of the code and delete cases that don't happen now (Jason)

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add nir_get_nir_type_for_glsl_type() | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -0/+24
  v2 (Jason):
  - Refactor nir_get_nir_type_for_glsl_type() to avoid using unneeded helpers (Jason)

  v3:
  - Use return directly (Jason)

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: add support for doubles on OpComposite{Insert,Extract} | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -0/+1
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Enable double floating points when copying variables in _vtn_variable_copy() | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -0/+1
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: add double support to _vtn_block_load_store() | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -0/+1
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: add double support to _vtn_variable_load_store | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -0/+1
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: add double support to SpvOpCompositeExtract | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -2/+14
  v2 (Jason):
  - Add asserts.

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: fix SpvOpSpecConstantOp with SpvOpVectorShuffle working with double-based vecs | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -12/+40
  We need to pick two 32-bit values per component to perform the right shuffle operation.

  v2 (Jason):
  - Add assert to check matching bit sizes (Jason)
  - Simplify the code to pick components (Jason)

  v3:
  - Switch on bit_size once (Jason)
  - Add comment to explain the constant value for unused components (Erik)

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
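A rough sketch of the "two 32-bit values per component" idea follows. It assumes the constant is stored as a flat array of 32-bit words, so a 64-bit component occupies two consecutive words. The function name, the storage layout, and the filler value are assumptions made for this illustration and are not taken from the Mesa sources; the only external fact used is that SPIR-V marks a shuffle component with no source as 0xFFFFFFFF.

```c
#include <assert.h>
#include <stdint.h>

#define UNUSED_COMPONENT 0xFFFFFFFFu /* SPIR-V index meaning "no source" */
#define PLACEHOLDER_WORD 0u          /* arbitrary filler chosen for this sketch */

/* Shuffle components stored as 32-bit words. A 64-bit component spans
 * two consecutive words, so both words are copied for each index. */
static void
shuffle_words(const uint32_t *src, unsigned bit_size,
              const uint32_t *indices, unsigned num_components,
              uint32_t *dst)
{
   assert(bit_size == 32 || bit_size == 64);
   const unsigned words = bit_size / 32; /* decided once, outside the loop */

   for (unsigned i = 0; i < num_components; i++) {
      for (unsigned w = 0; w < words; w++) {
         dst[i * words + w] = indices[i] == UNUSED_COMPONENT
            ? PLACEHOLDER_WORD
            : src[indices[i] * words + w];
      }
   }
}
```

With a two-component 64-bit source stored as the words {1, 2, 3, 4} and indices {1, 0}, the sketch writes {3, 4, 1, 2}: each pair of words moves together.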
* spirv: add DF support to SpvOp*ConstantComposite | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -3/+11
  v2 (Jason):
  - Add assert.

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: add DF support to vtn_const_ssa_value() | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -3/+5
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: add support for loading DF constants | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -2/+10
  v2 (Jason):
  - Add assert.

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: add definition of double based data types | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -2/+4
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: fix typo in spec_constant_decoration_cb() | Samuel Iglesias Gonsálvez | 2017-01-09 | 1 | -2/+2
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* radv: drop unused fields in physical device. | Dave Airlie | 2017-01-09 | 1 | -6/+0
  Signed-off-by: Dave Airlie <[email protected]>
* i965: call intel_prepare_render always when reading pixels | Tapani Pälli | 2017-01-09 | 1 | -6/+6
  Currently we do this only in the fallback code (when tiled memcpy version failed) but it needs to be done always so that we have correct read and write buffer in place.

  No regressions seen in CI.

  Fixes: dEQP-EGL.functional.buffer_age.*

  Signed-off-by: Tapani Pälli <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98330
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* st/mesa: pass gl_program to st_bind_ubos() | Timothy Arceri | 2017-01-09 | 1 | -18/+18
  We no longer need anything from gl_linked_shader.

  Reviewed-by: Eric Anholt <[email protected]>
* st/mesa: pass gl_program to st_bind_images() | Timothy Arceri | 2017-01-09 | 1 | -24/+22
  We no longer need anything from gl_linked_shader.

  Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: stop passing gl_linked_shader to set_affected_state_flags() | Timothy Arceri | 2017-01-09 | 1 | -7/+6
  We now get everything we need from the gl_program param.

  Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa/glsl: set num_images directly in shader_info | Timothy Arceri | 2017-01-09 | 6 | -20/+13
  This change also removes the now duplicate NumImages field.

  Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa: pass gl_program to st_bind_ssbos() | Timothy Arceri | 2017-01-09 | 1 | -21/+21
  We no longer need to pass gl_shader_program.

  Reviewed-by: Nicolai Hähnle <[email protected]>
* nir: add another comparison simplification | Timothy Arceri | 2017-01-09 | 1 | -0/+2
  On BDW:

  total instructions in shared programs: 13061877 -> 13060965 (-0.01%)
  instructions in affected programs: 133569 -> 132657 (-0.68%)
  helped: 566
  HURT: 0

  total cycles in shared programs: 256611784 -> 256599536 (-0.00%)
  cycles in affected programs: 861016 -> 848768 (-1.42%)
  helped: 379
  HURT: 73

  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Turn bcsel of +/- 1.0 and 0.0 into b2f sequences. | Kenneth Graunke | 2017-01-09 | 1 | -0/+4
  On BDW:

  total instructions in shared programs: 13074882 -> 13068703 (-0.05%)
  instructions in affected programs: 1823116 -> 1816937 (-0.34%)
  helped: 4187
  HURT: 537

  total cycles in shared programs: 256622718 -> 256425382 (-0.08%)
  cycles in affected programs: 123790120 -> 123592784 (-0.16%)
  helped: 3823
  HURT: 2037

  total spills in shared programs: 15276 -> 14929 (-2.27%)
  spills in affected programs: 9446 -> 9099 (-3.67%)
  helped: 352
  HURT: 1

  total fills in shared programs: 20496 -> 20144 (-1.72%)
  fills in affected programs: 13040 -> 12688 (-2.70%)
  helped: 352
  HURT: 1

  LOST: 2
  GAINED: 21

  v2: Rely on 'a' being a well-formed boolean (Connor, Eric).

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
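The rewrite named in this commit's subject can be checked with a small scalar model. The C functions below only mimic the semantics suggested by the operation names (bcsel as a select, b2f as boolean-to-float); they are illustrative stand-ins, not the NIR implementation, and the exact set of patterns in the commit is not reproduced here.

```c
#include <stdbool.h>

/* Scalar model of a boolean select: pick x when a is true, else y. */
static float bcsel(bool a, float x, float y) { return a ? x : y; }

/* Scalar model of boolean-to-float: true -> 1.0, false -> 0.0. */
static float b2f(bool a) { return a ? 1.0f : 0.0f; }

/* Under this model, for either value of a:
 *   bcsel(a,  1.0,  0.0) ==  b2f(a)
 *   bcsel(a,  0.0,  1.0) ==  b2f(!a)
 *   bcsel(a, -1.0, -0.0) == -b2f(a)
 * so the select can be replaced by a conversion, plus a negation where needed. */
```

The benefit, as the shader-db numbers in the commit message suggest, is that the select disappears in favor of a conversion.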
* nir: Convert ineg(b2i(a)) to a if it's a boolean. | Kenneth Graunke | 2017-01-09 | 1 | -0/+2
  On BDW:

  total instructions in shared programs: 13071119 -> 13070371 (-0.01%)
  instructions in affected programs: 83424 -> 82676 (-0.90%)
  helped: 505
  HURT: 45 (all TCS, all hurt by a single instruction)

  total cycles in shared programs: 256601322 -> 256588932 (-0.00%)
  cycles in affected programs: 819410 -> 807020 (-1.51%)
  helped: 450
  HURT: 57

  total loops in shared programs: 2950 -> 2942 (-0.27%)
  loops in affected programs: 8 -> 0
  helped: 7
  HURT: 0

  v2: Drop unnecessary 'a@bool' annotation (Connor, Eric).
      Add a comment explaining the rule (Ian).

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]> [v1]
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
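Why ineg(b2i(a)) collapses to a can be seen in a scalar model, assuming booleans are represented as 32-bit values with all bits set for true and zero for false. That representation is an assumption of this sketch, and the helper names simply mirror the operation names in the commit subject.

```c
#include <stdint.h>

#define BOOL_TRUE  0xFFFFFFFFu /* assumed representation: all bits set */
#define BOOL_FALSE 0u

/* Boolean to integer: any non-zero boolean becomes 1. */
static uint32_t b2i(uint32_t a) { return a ? 1u : 0u; }

/* Two's-complement negation on 32-bit values. */
static uint32_t ineg(uint32_t x) { return 0u - x; }

/* ineg(b2i(BOOL_TRUE))  = -1 = 0xFFFFFFFF = BOOL_TRUE
 * ineg(b2i(BOOL_FALSE)) =  0              = BOOL_FALSE
 * so the pair of operations returns its input when a is a well-formed boolean. */
```

The identity only holds for well-formed booleans; for an arbitrary integer such as 2 the result would be 0xFFFFFFFF rather than 2.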
* i965: Move TES input VUE map calculation out a layer. | Kenneth Graunke | 2017-01-07 | 3 | -9/+11
  In Vulkan, we'll compile the TCS and TES at the same time, so I can just pass the TCS output VUE map to brw_compile_tes as the TES input VUE map. So, we only need to do this in GL. Move it to the GL-specific layer.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Pass NULL for gl_program when compiling TES. | Kenneth Graunke | 2017-01-07 | 1 | -1/+1
  This isn't needed, and Vulkan doesn't have one.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Move TES spacing/domain/topology setup to brw_compile_tes(). | Kenneth Graunke | 2017-01-07 | 2 | -33/+34
  Moving this down a layer lets us share code between Vulkan and GL.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Access TES shader info via NIR. | Kenneth Graunke | 2017-01-07 | 1 | -6/+6
  NIR exists in both GL and Vulkan, but gl_program is GL specific.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: Introduce a compiler enum for tessellation spacing. | Kenneth Graunke | 2017-01-07 | 11 | -47/+54
  It feels weird using GL_* enums in a Vulkan driver.

  v2: Fix the TESS_SPACING -> PIPE_TESS_SPACING conversion.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/pipeline: Call NIR passes using NIR_PASS_V | Jason Ekstrand | 2017-01-07 | 1 | -31/+15
  This lets us get validation without having to do it manually.

  Reviewed-by: Timothy Arceri <[email protected]>
* anv/pipeline: Only call remove_dead_variables once | Jason Ekstrand | 2017-01-07 | 1 | -3/+3
  It can handle multiple modes at a time now so there's no reason to call it repeatedly.

  Reviewed-by: Timothy Arceri <[email protected]>
* Revert recent GLSL slot counting fiasco. | Kenneth Graunke | 2017-01-07 | 5 | -62/+14
  I apparently broke mark_whole_variable in ir_set_program_inouts. It was passing a type that wasn't var->type, so the wrapper didn't work out. It's all broken, revert it and start over.

  Fixes all kinds of things on other drivers.

  Revert "glsl: Make is_fixed_function_array actually check for varyings."
  This reverts commit 42699e12711668a142b7acf11c168cf4301c1295.

  Revert "glsl: Mark whole variable used for ClipDistance and TessLevel*."
  This reverts commit 5c580e64cc206ab160e1767c42e4d6c81f67da4d.

  Revert "glsl: Override the # of varying slots for ClipDistance and TessLevel*."
  This reverts commit 8b5749f65ac434961308ccb579fb8a816e4f29d5.

  Revert "glsl: Create and use a new ir_variable::count_attribute_slots() wrapper."
  This reverts commit 6aa5cb34d03765b7be8611aa516bc201bd337f73.
* glsl: Make is_fixed_function_array actually check for varyings. | Kenneth Graunke | 2017-01-07 | 1 | -0/+4
  We can't check VARYING_SLOT_* locations until we've determined that the variable is actually a varying.

  Fixes assert failures in drivers which actually use this path, such as radeonsi and i915.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99314
  Signed-off-by: Kenneth Graunke <[email protected]>
* drirc: Allow extension midshader for Divinity: Original Sin (EE) | Kai Wasserbäch | 2017-01-07 | 1 | -0/+4
  See also <https://bugs.freedesktop.org/show_bug.cgi?id=93551#c27> where this was first observed as a requirement.

  Signed-off-by: Kai Wasserbäch <[email protected]>
  Signed-off-by: Marek Olšák <[email protected]>
* glsl: fix opt_minmax redundancy checks against baserange | Timothy Arceri | 2017-01-07 | 1 | -2/+2
  Marking operations as redundant if they are equal to the base range is fine when the tree structure is something like this:

         max
        /   \
      max    b
     /   \
    3    max
        /   \
       3     a

  But the opt falls apart with a tree like this:

         max
        /    \
     max      max
    /   \    /   \
   3     a  b     3

  The problem is that both branches are treated the same: descending in the left branch will prune the constant, and then descending the right branch will prune the constant there as well, because limits[0] wasn't updated to take the change on the left branch into account, and so we still get [3,\infty) as baserange.

  In order to fix the bug we just disable the marking of redundant expressions when they match the baserange.

  NIR algebraic opt will clean up the first tree for us anyway; hopefully other backends are smart enough to do this also.

  Cc: "13.0" <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
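The failure described for the second tree can be reproduced with plain integer arithmetic. The sketch below is not the opt_minmax code; it only evaluates the tree before and after the faulty pruning described in the commit message, with hypothetical function names.

```c
/* Plain integer max, standing in for the max nodes in the tree. */
static int imax(int x, int y) { return x > y ? x : y; }

/* The second tree: max(max(3, a), max(b, 3)). */
static int second_tree(int a, int b) { return imax(imax(3, a), imax(b, 3)); }

/* What remains if the constant 3 is pruned from BOTH branches. */
static int over_pruned(int a, int b) { return imax(a, b); }
```

For a = 1 and b = 2 the original tree evaluates to 3, while the over-pruned form yields 2: the lower bound of 3 is lost, which is why at least one copy of the constant has to survive.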
* i965/compiler: Use the new nir_opt_copy_prop_vars pass | Jason Ekstrand | 2017-01-06 | 1 | -0/+1
  We run this after nir_lower_vars_to_ssa so that as many load/store_var intrinsics as possible are eliminated before copy_prop_vars executes. This is because the pass isn't particularly efficient (it does a lot of linear walks of a linked list) so we'd like as much of the work as possible to be done before copy_prop_vars runs.

  Shader DB results on Sky Lake:

  total instructions in shared programs: 12020290 -> 12013627 (-0.06%)
  instructions in affected programs: 26033 -> 19370 (-25.59%)
  helped: 16
  HURT: 13

  total cycles in shared programs: 137772848 -> 137549012 (-0.16%)
  cycles in affected programs: 6955660 -> 6731824 (-3.22%)
  helped: 217
  HURT: 237

  total loops in shared programs: 3208 -> 3208 (0.00%)
  loops in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total spills in shared programs: 4112 -> 4057 (-1.34%)
  spills in affected programs: 483 -> 428 (-11.39%)
  helped: 2
  HURT: 0

  total fills in shared programs: 5519 -> 5102 (-7.56%)
  fills in affected programs: 993 -> 576 (-41.99%)
  helped: 2
  HURT: 0

  LOST: 0
  GAINED: 0

  Broadwell had similar results. On older hardware, the impact isn't as large because they don't advertise GL 4.5. Of the hurt programs, all but one are hurt by a single instruction and the one is hurt by 3 instructions. All of the helped programs, on the other hand, are helped by at least 3 instructions and one kerbal space program shader is helped by 44.59%.

  The real star of the show, however, is the Gl43CSDof synmark2 benchmark which has two shaders which are cut by 28% and 40% and the over-all runtime performance of the benchmark on my Sky Lake laptop is improved by around 25-30% (it's a bit hard to be exact due to thermal throttling).

  Reviewed-by: Timothy Arceri <[email protected]>
* nir: Add a local variable-based copy propagation pass | Jason Ekstrand | 2017-01-06 | 3 | -0/+816
  Reviewed-by: Timothy Arceri <[email protected]>
* nir/builder: Add a helper for getting the most recently added instruction | Jason Ekstrand | 2017-01-06 | 1 | -0/+7
  Reviewed-by: Timothy Arceri <[email protected]>
* nir/builder: Add a load_deref_var helper | Jason Ekstrand | 2017-01-06 | 1 | -0/+16
  Reviewed-by: Timothy Arceri <[email protected]>
* nir/dead_variables: Remove shader-local variables that are only written | Jason Ekstrand | 2017-01-06 | 1 | -9/+60
  Reviewed-by: Timothy Arceri <[email protected]>
* nir/dead_variables: Removed shared variables when requested | Jason Ekstrand | 2017-01-06 | 1 | -0/+3
  Reviewed-by: Timothy Arceri <[email protected]>
* anv/formats: Use the real format for B4G4R4A4_UNORM_PACK16 on gen8 | Jason Ekstrand | 2017-01-06 | 1 | -2/+2
  Because border color is handled pre-swizzle, when we move the alpha channel around in the format, the OPAQUE_BLACK border colors don't work correctly on B4G4R4A4_UNORM_PACK16 with the hack. This fixes the following Vulkan CTS tests on Broadwell:

  dEQP-VK.pipeline.sampler.view_type.2d_array.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
  dEQP-VK.pipeline.sampler.view_type.1d_array.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
  dEQP-VK.pipeline.sampler.view_type.2d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
  dEQP-VK.pipeline.sampler.view_type.1d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black
  dEQP-VK.pipeline.sampler.view_type.3d.format.b4g4r4a4_unorm_pack16.address_modes.all_mode_clamp_to_border_opaque_black

  Reviewed-by: Kenneth Graunke <[email protected]>
  Cc: "13.0" <[email protected]>
* isl: Mark A4B4G4R4_UNORM as supported on gen8 | Jason Ekstrand | 2017-01-06 | 1 | -1/+4
  Reviewed-by: Kenneth Graunke <[email protected]>
  Cc: "13.0" <[email protected]>
* radv: fix depth transitions with layerCount = VK_REMAINING_ARRAY_LAYERS | Pierre-Loup A. Griffais | 2017-01-07 | 1 | -1/+1
  Interpreting layerCount literally would try to create billions of image views in radv_process_depth_image_inplace().

  Signed-off-by: Pierre-Loup A. Griffais <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
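The "billions of image views" problem comes from the sentinel being the largest 32-bit value. A minimal sketch of resolving it before looping is shown below; the function name and parameters are hypothetical and this is not the radv fix itself. The macro is defined locally with the same value as VK_REMAINING_ARRAY_LAYERS so the example needs no Vulkan headers.

```c
#include <stdint.h>

/* Local copy of the sentinel value (~0U), so no Vulkan header is needed. */
#define REMAINING_ARRAY_LAYERS (~0u)

/* Translate the sentinel into a real count: every layer from base_layer
 * up to the end of the image. Any other value is taken literally. */
static uint32_t
resolve_layer_count(uint32_t image_layers, uint32_t base_layer,
                    uint32_t layer_count)
{
   return layer_count == REMAINING_ARRAY_LAYERS
      ? image_layers - base_layer
      : layer_count;
}
```

For an image with 6 layers and a base layer of 2, the sentinel resolves to 4 instead of roughly 4.29 billion loop iterations.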
* i965: Rework gl_TessLevel*[] handling to use NIR compact arrays. | Kenneth Graunke | 2017-01-06 | 10 | -364/+92
  Treating everything as scalar arrays allows us to drop a bunch of special case input/output munging all throughout the backend. Instead, we just need to remap the TessLevel components to the appropriate patch URB header locations in remap_patch_urb_offsets().

  We also switch to treating the TES input versions of these as ordinary shader inputs rather than system values, as remap_patch_urb_offsets() just makes everything work out without special handling.

  This regresses one Piglit test:
  arb_tessellation_shader-large-uniforms/GL_TESS_CONTROL_SHADER-array-at-limit

  The compiler starts promoting the constant arrays assigned to gl_TessLevel* to uniform arrays. Since the shader also has a uniform array that uses the maximum number of uniform components, this puts it over the uniform component limit enforced by the linker. This is arguably a bug in the constant array promotion code (it should avoid pushing us over limits), but is unlikely to penalize any real application.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Inline store_output helper in quads workaround code. | Kenneth Graunke | 2017-01-06 | 1 | -14/+10
  It's only used in one place, it ignores the offset parameter currently, and I want to add more parameters...at which point, passing in a bunch of integers seems less obvious than writing it out.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Make glsl_to_nir compact scalar TessLevel arrays. | Kenneth Graunke | 2017-01-06 | 1 | -1/+12
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>