aboutsummaryrefslogtreecommitdiffstats
path: root/src/compiler
Commit message (Collapse)AuthorAgeFilesLines
* prog/nir: Simplify some load/store operationsJason Ekstrand2018-04-051-0/+6
| | | | Reviewed-by: Eric Anholt <[email protected]>
* compiler/spirv: set is_shadow for depth comparitor sampling opcodesIago Toral Quiroga2018-04-041-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | From the SPIR-V spec, OpTypeImage: "Depth is whether or not this image is a depth image. (Note that whether or not depth comparisons are actually done is a property of the sampling opcode, not of this type declaration.)" The sampling opcodes that specify depth comparisons are OpImageSample{Proj}Dref{Explicit,Implicit}Lod, so we should set is_shadow only for these (we were using the deph property of the image until now). v2: - Do the same for OpImageDrefGather. - Set is_shadow to false if the sampling opcode is not one of these (Jason) - Reuse an existing switch statement instead of adding a new one (Jason) Fixes crashes in: dEQP-VK.spirv_assembly.instruction.graphics.image_sampler.depth_property.* Reviewed-by: Jason Ekstrand <[email protected]> Cc: [email protected]
* nir/lower_vec_to_movs: Only coalesce if the vec had a SSA destinationJason Ekstrand2018-04-031-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise we may end up trying to coalesce in a case such as ssa_1 = fadd r1, r2 r3.x = fneg(r2); r3 = vec4(ssa_1, ssa_1.y, ...) and that would cause us to move the writes to r3 from the vec to the fadd which would re-order them with respect to the write from the fneg. In order to solve this, we just don't coalesce if the destination of the vec is not SSA. We could try to get clever and still coalesce if there are no writes to the destination of the vec between the vec and the ALU source. However, since registers only come from phi webs and indirects, the chances of having a vec with a register destination that is actually coalescable into its source is very slim. Shader-db results on Haswell: total instructions in shared programs: 13657906 -> 13659101 (<.01%) instructions in affected programs: 149291 -> 150486 (0.80%) helped: 0 HURT: 592 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440 Fixes: 2458ea95c56 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible" Reported-by: Vadym Shovkoplias <[email protected]> Tested-by: Vadym Shovkoplias <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: always call do_lower_jumps() after loop unrollingTimothy Arceri2018-04-041-0/+18
| | | | | | | | | | | | | | This fixes a bug in radeonsi where LLVM cannot handle the case where a break exists but its not the last instruction in the block. LLVM would fail with: Terminator found in the middle of a basic block! LLVM ERROR: Broken function found, compilation aborted! Fixes: 96fe8834f539 "glsl_to_tgsi: do fewer optimizations with GLSLOptimizeConservatively" Reviewed-by: Matt Turner <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105317
* nir+drivers: add helpers to get # of src/dest componentsRob Clark2018-04-034-14/+26
| | | | | | | | | Add helpers to get the number of src/dest components for an intrinsic, and update spots that were open-coding this logic to use the helpers instead. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* st/glsl_to_nir: gather next_stage in shader_infoTimothy Arceri2018-04-021-0/+5
| | | | Reviewed-by: Marek Olšák <[email protected]>
* nir/validator: Validate that all used variables existJason Ekstrand2018-03-301-9/+8
| | | | | | We were validating this for locals but nothing else. Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_indirect_derefs: Support interp_var_at intrinsicsJason Ekstrand2018-03-301-2/+11
| | | | | | | This fixes the fs-interpolateAtCentroid-block-array piglit test on i965. Reviewed-by: Kenneth Graunke <[email protected]> Cc: [email protected]
* nir/vars_to_ssa: Remove copies from the correct setJason Ekstrand2018-03-301-1/+1
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Cc: [email protected]
* nir: Return a cursor from nir_instr_removeJason Ekstrand2018-03-303-19/+18
| | | | | | | | Because nir_instr_remove is an inline wrapper around nir_instr_remove_v, the compiler should be able to tell that the return value is unused and not emit the extra code in most cases. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add src/dest num_components helpersJason Ekstrand2018-03-301-0/+12
| | | | | | We already have these for bit_size Reviewed-by: Kenneth Graunke <[email protected]>
* spirv: s/uint/unsigned/ to fix MSVC buildBrian Paul2018-03-301-1/+1
| | | | Reviewed-by: Neil Roberts <[email protected]>
* nir/spirv: s/uint32_t/SpvOp/ in various functionsBrian Paul2018-03-303-7/+7
| | | | | | | | | | The MSVC compiler warns when the function parameter types don't exactly match with respect to enum vs. uint32_t. Use SpvOp everywhere. Alternately, uint32_t could be used everywhere. There doesn't seem to be an advantage to one over the other. Reviewed-by: Neil Roberts <[email protected]>
* nir/spirv: fix MSVC syntax error in vtn_handle_texture()Brian Paul2018-03-301-1/+2
| | | | Reviewed-by: Neil Roberts <[email protected]>
* nir/spirv: move NORETURN annotation on _vtn_fail() prototypeBrian Paul2018-03-301-2/+4
| | | | | | | This needs to before the function, not after, to compile with MSVC. This works with gcc too. Reviewed-by: Neil Roberts <[email protected]>
* nir/spirv: fix MSVC warning in vtn_align_u32()Brian Paul2018-03-301-1/+1
| | | | | | | Fixes warning that "negation of an unsigned value results in an unsigned value". Reviewed-by: Neil Roberts <[email protected]>
* spirv: Fix building with SConsNeil Roberts2018-03-303-1/+57
| | | | | | | | | | | | | The SCons build broke with commit ba975140d3c9 because a SPIR-V function is called from Mesa main. This adds a convenience library for SPIR-V and adds it to everything that was including nir. It also adds both nir and spirv to drivers/x11/SConscript. Also add nir/spirv modules to osmesa and libgl-gdi targets. (Brian Paul) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105817 Reviewed-by: Brian Paul <[email protected]> Tested-by: Brian Paul <[email protected]>
* nir: s/uint/unsigned/ to fix MSVC/MinGW buildBrian Paul2018-03-302-2/+2
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
* nir/spirv: add gl_spirv_validation methodAlejandro Piñeiro2018-03-306-14/+305
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ARB_gl_spirv adds the ability to use SPIR-V binaries, and a new method, glSpecializeShader. Here we add a new function to do the validation for this function: From OpenGL 4.6 spec, section 7.2.1" "Shader Specialization", error table: INVALID_VALUE is generated if <pEntryPoint> does not name a valid entry point for <shader>. INVALID_VALUE is generated if any element of <pConstantIndex> refers to a specialization constant that does not exist in the shader module contained in <shader>."" v2: rebase update (spirv_to_nir options added, changes on the warning logging, and others) v3: include passing options on common initialization, doesn't call setjmp on common_initialization v4: (after Jason comments): * Rename common_initialization to vtn_builder_create * Move validation method and their helpers to own source file. * Create own handle_constant_decoration_cb instead of reuse existing one v5: put vtn_build_create refactoring to their own patch (Jason) v6: update after vtn_builder_create method renamed, add explanatory comment, tweak existing comment and commit message (Timothy)
* spirv: add vtn_create_builderAlejandro Piñeiro2018-03-302-17/+38
| | | | | | | | | Refactored from spirv_to_nir, in order to be reused later. Reviewed-by: Timothy Arceri <[email protected]> v2: renamed method (from vtn_builder_create), add explanatory comment (Timothy)
* spirv: Move SPIR-V building to Makefile.spirv.am and spirv/meson.buildIan Romanick2018-03-296-49/+100
| | | | | | | | | | | | | Future changes will add generated files used only from src/compiler/glsl. These can't be built from Makefile.nir.am, and we can't move all the rules from Makefile.nir.am to Makefile.spirv.am (and it would be silly anyway). v2: Do it for meson too. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> (the meson bits) Reviewed-by: Alejandro Piñeiro <[email protected]> (the automake bits)
* compiler: All leaf Makefile.am should use +=Ian Romanick2018-03-292-1/+2
| | | | | | | | This slightly simplifies later changes that add more Makefile.*.am files. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* util: Include bitscan.h directlyIan Romanick2018-03-293-0/+3
| | | | | | | | | | | | | | | Previously bitset.h would include u_math.h to get bitscan.h. u_math.h lives in src/gallium/auxiliary/util while both bitset.h and bitscan.h live in src/util. Having the one file directly include another file that lives in the same directory makes much more sense. As a side-effect, several files need to directly include standard header files that were previously indirectly included. v2: Fix build break in src/amd/common/ac_nir_to_llvm.c. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* util: Add and use util_is_power_of_two_nonzeroIan Romanick2018-03-292-10/+5
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* spirv: add support for SPV_AMD_shader_trinary_minmaxDave Airlie2018-03-294-0/+58
| | | | | | | Co-authored-by: Daniel Schürmann <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nir: add support for min/max/median of 3 srcsDave Airlie2018-03-291-0/+14
| | | | | | | | | | | These are needed for SPV_AMD_shader_trinary_minmax, the AMD HW supports these. Co-authored-by: Daniel Schürmann <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* android: Use new nir intrinsics python scriptsStefan Schake2018-03-281-0/+9
| | | | | | | Fixes: 76dfed8ae2d5 ("nir: mako all the intrinsics") Signed-off-by: Stefan Schake <[email protected]> Acked-by: Rob Clark <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* nir: add bindless to nir dataTimothy Arceri2018-03-282-0/+7
| | | | | | Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nir/intrinsics: Don't report negative dest_componentsJason Ekstrand2018-03-271-1/+1
| | | | | | | | I have no idea why but having dest_components == -1 was causing a memory leak somewhere. Without this, you can't get through a full shader-db run without running out of memory. Reviewed-by: Rob Clark <[email protected]>
* nir: fix crash in loop unroll corner caseTimothy Arceri2018-03-281-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an if nesting inside anouther if is optimised away we can end up with a loop terminator and following block that looks like this: if ssa_596 { block block_5: /* preds: block_4 */ vec1 32 ssa_601 = load_const (0xffffffff /* -nan */) break /* succs: block_8 */ } else { block block_6: /* preds: block_4 */ /* succs: block_7 */ } block block_7: /* preds: block_6 */ vec1 32 ssa_602 = phi block_6: ssa_552 vec1 32 ssa_603 = phi block_6: ssa_553 vec1 32 ssa_604 = iadd ssa_551, ssa_66 The problem is the phis. Loop unrolling expects the last block in the loop to be empty once we splice the instructions in the last block into the continue branch. The problem is we cant move phis so here we lower the phis to regs when preparing the loop for unrolling. As it could be possible to have multiple additional blocks/ifs following the terminator we just convert all phis at the top level of the loop body for simplicity. We also add some comments to loop_prepare_for_unroll() while we are here. Fixes: 51daccb289eb "nir: add a loop unrolling pass" Reviewed-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
* nir: fix generated nir_intrinsics.c for MSVCRob Clark2018-03-271-0/+4
| | | | | | | | | | Apparently it is not happy about things like: .foo = {} So skip over initializers for empty lists. Fixes: 76dfed8ae2d5c6c509eb2661389be3c6a25077df Reported-by: Roland Scheidegger <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* nir: mako all the intrinsicsRob Clark2018-03-2711-619/+727
| | | | | | | | | | | | | | | | | | | | | | | I threatened to do this a long time ago.. I probably *should* have done it a long time ago when there where many fewer intrinsics. But the system of macro/#include magic for dealing with intrinsics is a bit annoying, and python has the nice property of optional fxn params, making it possible to define new intrinsics while ignoring parameters that are not applicable (and naming optional params). And not having to specify various array lengths explicitly is nice too. I think the end result makes it easier to add new intrinsics. v2: couple small fixes found with a test program to compare the old and new tables v3: misc comments, don't rely on capture=true for meson.build, get rid of system_values table to avoid return value of intrinsic() and *mostly* remove side-effects, add autotools build support v4: scons build Signed-off-by: Rob Clark <[email protected]> Acked-by: Dylan Baker <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* nir: fix per_vertex_output intrinsicRob Clark2018-03-271-1/+1
| | | | | | | | | This is supposed to have both BASE and COMPONENT but num_indices was inadvertantly set to 1. Cc: <[email protected]> Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* glsl_types: fix build break with intel/msvc compilerRob Clark2018-03-271-83/+24
| | | | | | | | | | | | | | | | | | | | The VECN() macro was taking advantage of a GCC specific feature that is not available on lesser compilers, mostly for the purposes of avoiding a macro that encoded a return statement. But as suggested by Ian, we could just have the macro produce the entire method body and avoid the need for this. So let's do that instead. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105740 Fixes: f407edf3407396379e16b0be74b8d3b85d2ad7f0 Cc: Emil Velikov <[email protected]> Cc: Timothy Arceri <[email protected]> Cc: Roland Scheidegger <[email protected]> Cc: Ian Romanick <[email protected]> Signed-off-by: Rob Clark <[email protected]> Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: fix infinite loop caused by bug in loop unrolling passTimothy Arceri2018-03-271-1/+1
| | | | | | | | | | | | | | | | | | Just checking for 2 jumps is not enough to be sure we can do a complex loop unroll. We need to make sure we also have also found 2 loop terminators. Without this we were attempting to unroll a loop where the second jump was nested inside multiple ifs which loop analysis is unable to detect as a terminator. We ended up splicing out the first terminator but failed to actually unroll the loop, this resulted in the creation of a possible infinite loop. Fixes: 646621c66da9 "glsl: make loop unrolling more like the nir unrolling path" Tested-by: Gert Wollny <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
* nir: Don't condition 'a-b < 0' -> 'a < b' on is_not_used_by_conditionalIan Romanick2018-03-262-18/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that i965 recognizes that a-b generates the same conditions as 'a < b', there is no reason to condition this transformation on 'is not used by conditional.' Since this was the only user of the is_not_used_by_conditional function, delete it. All Gen6+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14400775 -> 14400595 (<.01%) instructions in affected programs: 36712 -> 36532 (-0.49%) helped: 182 HURT: 26 helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1 helped stats (rel) min: 0.15% max: 1.82% x̄: 0.70% x̃: 0.62% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.24% max: 1.02% x̄: 0.82% x̃: 0.90% 95% mean confidence interval for instructions value: -0.97 -0.76 95% mean confidence interval for instructions %-change: -0.59% -0.43% Instructions are helped. total cycles in shared programs: 532929592 -> 532926345 (<.01%) cycles in affected programs: 478660 -> 475413 (-0.68%) helped: 187 HURT: 22 helped stats (abs) min: 2 max: 200 x̄: 20.99 x̃: 18 helped stats (rel) min: 0.23% max: 24.10% x̄: 1.48% x̃: 1.03% HURT stats (abs) min: 1 max: 214 x̄: 30.86 x̃: 11 HURT stats (rel) min: 0.01% max: 23.06% x̄: 3.12% x̃: 0.86% 95% mean confidence interval for cycles value: -19.50 -11.57 95% mean confidence interval for cycles %-change: -1.42% -0.58% Cycles are helped. GM45 and Iron Lake had similar results. (Iron Lake shown) total cycles in shared programs: 177851578 -> 177851810 (<.01%) cycles in affected programs: 24408 -> 24640 (0.95%) helped: 2 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.42% max: 0.47% x̄: 0.44% x̃: 0.44% HURT stats (abs) min: 24 max: 108 x̄: 60.00 x̃: 54 HURT stats (rel) min: 0.52% max: 1.62% x̄: 1.04% x̃: 1.02% 95% mean confidence interval for cycles value: -7.75 85.08 95% mean confidence interval for cycles %-change: -0.39% 1.49% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl_types: vec8/vec16 supportRob Clark2018-03-256-10/+23
| | | | | | | Not used in GL but 8 and 16 component vectors exist in OpenCL. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl_types: refactor/prep for vec8/vec16Rob Clark2018-03-253-149/+49
| | | | | | | | | | | | | Refactor things so there isn't so much typing involved to add new things. Also drops a pointless conditional (out of bounds rows or columns already returns error_type in all paths.. might as well drop it rather than make the check more convoluted in the next patch by adding the vec8/vec16 case). Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* compiler: glsl: silence valgrind warning on write cacheLionel Landwerlin2018-03-231-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I don't think it actually fixes anything, but that's nice not to have valgrind warnings. It manifests itself when running the piglit test : glsl-fs-raytrace-bug27060 ==2058== Uninitialised byte(s) found during client check request ==2058== at 0xC5BB040: blob_write_bytes (blob.c:152) ==2058== by 0xC595359: write_variable (nir_serialize.c:144) ==2058== by 0xC59560C: write_var_list (nir_serialize.c:192) ==2058== by 0xC5982E4: nir_serialize (nir_serialize.c:1124) ==2058== by 0xC0B729D: brw_program_serialize_nir (brw_program.c:835) ==2058== by 0xC0AB2D6: brw_link_shader (brw_link.cpp:358) ==2058== by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169) ==2058== by 0xC36C7ED: create_new_program(gl_context*, state_key*) (ff_fragment_shader.cpp:1127) ==2058== by 0xC36C8A6: _mesa_get_fixed_func_fragment_program (ff_fragment_shader.cpp:1157) ==2058== by 0xC1B50AF: update_program (state.c:134) ==2058== by 0xC1B56DF: _mesa_update_state_locked (state.c:352) ==2058== by 0xC1B579A: _mesa_update_state (state.c:386) ==2058== Address 0xf1eab8a is 58 bytes inside a block of size 96 alloc'd ==2058== at 0x4C2CB8F: malloc (vg_replace_malloc.c:299) ==2058== by 0xC0FD306: ralloc_size (ralloc.c:121) ==2058== by 0xC0FD5B1: ralloc_array_size (ralloc.c:208) ==2058== by 0xC452B3B: (anonymous namespace)::nir_visitor::visit(ir_variable*) (glsl_to_nir.cpp:448) ==2058== by 0xC45CE8B: ir_variable::accept(ir_visitor*) (ir.h:428) ==2058== by 0xC46D0B5: visit_exec_list(exec_list*, ir_visitor*) (ir.cpp:1898) ==2058== by 0xC451D2F: glsl_to_nir (glsl_to_nir.cpp:162) ==2058== by 0xC0B5223: brw_create_nir (brw_program.c:79) ==2058== by 0xC0AAB67: brw_link_shader (brw_link.cpp:257) ==2058== by 0xC32FE3F: _mesa_glsl_link_shader (ir_to_mesa.cpp:3169) ==2058== by 0xC36C7ED: create_new_program(gl_context*, state_key*) (ff_fragment_shader.cpp:1127) ==2058== by 0xC36C8A6: _mesa_get_fixed_func_fragment_program (ff_fragment_shader.cpp:1157) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* nir: Rename image intrinsics to image_varJason Ekstrand2018-03-234-52/+52
| | | | | | | | | | | Generated with git grep -l nir_intrinsic_image | xargs \ sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g' and some manual fixing in nir_intrinsics.h Reviewed-by: Timothy Arceri <[email protected]>
* nir: autotools, meson: add GLSL.ext.AMD.h in the files listJuan A. Suarez Romero2018-03-222-0/+2
| | | | Reviewed-by: Emil Velikov <[email protected]>
* nir: add frexp_exp and frexp_sig opcodesTimothy Arceri2018-03-222-0/+5
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv: Add a 64-bit implementation of FrexpNeil Roberts2018-03-211-4/+56
| | | | | | | | | | The implementation is inspired by lower_instructions_visitor::dfrexp_sig_to_arith. This has been tested against the arb_gpu_shader_fp64/fs-frexp-dvec4 test using the ARB_gl_spirv branch. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Migrate nir_dce to instr worklistThomas Helland2018-03-211-35/+18
| | | | | | | | | Shader-db runtime change avarage of five runs: Before 125,77 seconds (+/- 0,09%) After 124,48 seconds (+/- 0,07%) Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de> Reviewed-by: Eric Anholt <eric at anholt.net>
* nir: Initial implementation of a nir_instr_worklistThomas Helland2018-03-211-0/+67
| | | | | | | | | | Make a simple worklist by basically just wrapping u_vector. This is intended used in nir_opt_dce to reduce the number of calls to ralloc, as we are currenlty spamming ralloc quite bad. It should also give better cache locality and much lower memory usage. Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de> Reviewed-by: Eric Anholt <eric at anholt.net>
* nir/dead_cf: also remove useless ifsCaio Marcelo de Oliveira Filho2018-03-211-14/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | Generalize the code for remove dead loops to also remove dead if nodes. The conditions are the same in both cases, if the node (and it's children) don't have side-effects AND the nodes after it don't use the values produced by the node. The only difference is when evaluating side effects: loops consider only return jumps as a side-effect -- they can stop execution of nodes after it; 'if' nodes outside loops should consider all kinds of jumps (return, break, continue) since all of them can cause execution of nodes after it to be skipped. After this patch, empty ifs (those which both then and else blocks are empty) will be removed by nir_opt_dead_cf. It caused no change to shader-db, in part because the removal of empty ifs is currently covered by nir_opt_peephole_select. v2: Improve the identification of cases where break/continue can cause side-effects. (Jason) v3: Move code comment changes to a different patch. (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* nir/dead_cf: rephrase definition of a dead loop nodeCaio Marcelo de Oliveira Filho2018-03-211-8/+7
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* st/nir: fix atomic lowering for gallium driversTimothy Arceri2018-03-202-6/+12
| | | | | | | | | | | | | | | i965 and gallium handle the atomic buffer index differently. It was just by luck that the single piglit test for this was passing. For gallium we use the atomic binding so that we match the handling in st_bind_atomics(). On radeonsi this fixes the CTS test: KHR-GL43.shader_storage_buffer_object.advanced-write-fragment It also fixes tressfx hair rendering in Tomb Raider. Reviewed-by: Marek Olšák <[email protected]>
* st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state trackerTimothy Arceri2018-03-204-100/+0
| | | | | | | | | | | | | This will only ever be used by gallium drivers so it probably doesn't belong in the nir toolkit. Also we want to pass it some non NIR things in the following patch. To avoid regressions we wrap the lowering calls that have been moved to st_glsl_to_nir with a quick hack so that they are only called for radeonsi, we will replace the hack with a check for uniform packing in a following patch. Reviewed-by: Marek Olšák <[email protected]>
* mesa: rework ParameterList to allow packingTimothy Arceri2018-03-201-3/+11
| | | | | | | | | | | | | | Currently everything is padded to 4 components. Making the list more flexible will allow us to do uniform packing. V2 (suggestions from Nicolai): - always pass existing calls to _mesa_add_parameter() true for padd_and_align - fix bindless param value offsets - remove left over wip logic from pad and align code - zero out param value padding - whitespace fix Reviewed-by: Marek Olšák <[email protected]>