summaryrefslogtreecommitdiffstats
path: root/src/compiler
Commit message (Collapse)AuthorAgeFilesLines
* Revert MR 369 (Fix extract_i8 and extract_u8 for 64-bit integers)Kenneth Graunke2019-03-091-24/+10
| | | | | | | This broke piles of image load store tests (179 failures on CI, mesa_master build #15546, previous build right before this landed was green). I'd rather not leave the tree on fire over the weekend, so let's revert for now, and we can figure out what happened next week.
* nir/algebraic: Add missing 16-bit extract_[iu]8 patternsIan Romanick2019-03-081-0/+3
| | | | | | | | | | No shader-db changes on any Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. Reviewed-by: Matt Turner <[email protected]> [v1] Reviewed-by: Dylan Baker <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* nir/algebraic: Add missing 64-bit extract_[iu]8 patternsIan Romanick2019-03-081-0/+3
| | | | | | | | | | No shader-db changes on any Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. Reviewed-by: Matt Turner <[email protected]> [v1] Reviewed-by: Dylan Baker <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* nir/algebraic: Remove redundant extract_[iu]8 patternsIan Romanick2019-03-081-14/+4
| | | | | | | | No shader-db changes on any Intel platform. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* nir/algebraic: Fix up extract_[iu]8 after loop unrollingIan Romanick2019-03-081-2/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | Skylake, Broadwell, and Haswell had similar results. (Skylake shown) total instructions in shared programs: 15256840 -> 15256837 (<.01%) instructions in affected programs: 4713 -> 4710 (-0.06%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.06% max: 0.08% x̄: 0.06% x̃: 0.06% total cycles in shared programs: 372286583 -> 372286583 (0.00%) cycles in affected programs: 198516 -> 198516 (0.00%) helped: 1 HURT: 1 helped stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 helped stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01% HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10 HURT stats (rel) min: 0.01% max: 0.01% x̄: 0.01% x̃: 0.01% No changes on any other Intel platform. v2: Use a loop to generate patterns. Suggested by Jason. Reviewed-by: Matt Turner <[email protected]> [v1] Reviewed-by: Dylan Baker <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* nir/linker: fix ARRAY_SIZE query with xfb varyingsAlejandro Piñeiro2019-03-081-1/+2
| | | | | | For a non-array varying, it is expecting ARRAY_SIZE as 1, instead of 0. Reviewed-by: Timothy Arceri <[email protected]>
* nir/linker: Fix TRANSFORM_FEEDBACK_BUFFER_INDEXAntia Puentes2019-03-081-1/+11
| | | | | | | | | | | | | | | | | | | | | | From the ARB_enhanced_layouts specification: "For the property TRANSFORM_FEEDBACK_BUFFER_INDEX, a single integer identifying the index of the active transform feedback buffer associated with an active variable is written to <params>. For variables corresponding to the special names "gl_NextBuffer", "gl_SkipComponents1", "gl_SkipComponents2", "gl_SkipComponents3", and "gl_SkipComponents4", -1 is written to <params>." We were storing the xfb_buffer value, instead of the value corresponding to GL_TRANSFORM_FEEDBACK_BUFFER_INDEX. Note that the implementation assumes that varyings would be sorted by offset and buffer. Signed-off-by: Antia Puentes <[email protected]> Signed-off-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir/linker: use nir_gather_xfb_infoAlejandro Piñeiro2019-03-081-186/+54
| | | | | | | | | | | | | Instead of a custom ARB_gl_spirv xfb gather info pass. In fact, this is not only about reusing code, but the current custom code was not handling properly how many varyings are enumerated from some complex types. So this change is also about fixing some corner cases. v2: Use util_bitcount, simplify current stage check (Kenneth) Reviewed-by: Timothy Arceri <[email protected]>
* nir/xfb: handle arrays and AoA of basic typesAlejandro Piñeiro2019-03-081-10/+32
| | | | | | | | | | | | | | | | | | On OpenGL, a array of a simple type adds just one varying. So gl_transform_feedback_varying_info struct defined at mtypes.h includes the parameters Type (base_type) and Size (number of elements). This commit checks this when the recursive add_var_xfb_outputs call handles arrays, to ensure that just one is addded. We also need to take into account AoA here v2: use glsl_type_is_leaf from nir_types (Timothy Arceri) v3: simplified aoa check, without the need ot using glsl_type_is_leaf, using glsl_types_is_struct (Timothy Arceri) Reviewed-by: Timothy Arceri <[email protected]>
* nir_types: add glsl_type_is_struct helperAlejandro Piñeiro2019-03-082-0/+7
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* nir/xfb: sort varyings tooAlejandro Piñeiro2019-03-081-2/+17
| | | | | | | | | | Right now we are only re-sorting outputs. But it is better to sort too varyings, as linker expect them to be sorted out (as it was done on GLSL). For varyings, and to make easier to compute buffer_index, we sort also by buffer. We could do the same for outputs, but we lack a reason for that, so we left it as it is (just offset). Reviewed-by: Timothy Arceri <[email protected]>
* nir/xfb: adding varyings on nir_xfb_info and gather_infoAlejandro Piñeiro2019-03-082-7/+44
| | | | | | | | | | | | | | | | | | | | | | | | | In order to be used for OpenGL (right now for ARB_gl_spirv). This commit adds two new structures: * nir_xfb_varying_info: that identifies each individual varying. For each one, we need to know the type, buffer and xfb_offset * nir_xfb_buffer_info: as now for each buffer, in addition to the stride, we need to know how many varyings are assigned to it. For this patch, the only case where num_outputs != num_varyings is with the case of doubles, that for dvec3/4 could require more than one output. There are more cases though (like aoa), that will be handled on following patches. v2: updated after new nir general XFB support introduced for "anv: Add support for VK_EXT_transform_feedback" v3: compute num_varyings beforehand for allocating, instead of relying on num_outputs as approximate value (Timothy Arceri) Reviewed-by: Timothy Arceri <[email protected]>
* nir_types: add glsl_varying_count helperAlejandro Piñeiro2019-03-082-0/+7
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* nir/xfb: add component_offset at nir_xfb_infoAlejandro Piñeiro2019-03-082-0/+4
| | | | | | | | | | | | | | | Where component_offset here is the offset when accessing components of a packed variable. Or in other words, location_frac on nir.h. Different places of mesa use different names for it. Technically nir_xfb_info consumer can get the same from the component_mask, it seems somewhat forced to make it to compute it, instead of providing it. v2: rename local location_frac for comp_offset, more similar to the intended use (Timothy Arceri) Reviewed-by: Timothy Arceri <[email protected]>
* nir/builder: Add a build_deref_array_imm helperJason Ekstrand2019-03-077-17/+25
| | | | | | | | Unlike most of the cases in which we do this by hand, the new helper properly handles non-32-bit pointers. Reviewed-by: Karol Herbst <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir/builder: Cast array indices in build_deref_followerJason Ekstrand2019-03-071-1/+7
| | | | | | | | | | | | There's no guarantee when build_deref_follower is called that the two derefs have the same bit size destination. Insert a cast on the array index in case we have differing bit sizes. While we're here, insert some asserts in build_deref_array and build_deref_ptr_as_array. The validator will catch violations here but they're easier to debug if we catch them while building. Reviewed-by: Karol Herbst <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir/builder: Emit better code for iadd/imul_immJason Ekstrand2019-03-072-5/+24
| | | | | | | | | Because we already know the immediate right-hand parameter, we can potentially save the optimizer a bit of work. Reviewed-by: Karol Herbst <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: free dead_ctx in case of no progressTapani Pälli2019-03-071-1/+3
| | | | | | | | | | | | | | | Fixes a leak: ==7576== 320 (48 direct, 272 indirect) bytes in 1 blocks are definitely lost in loss record 26 of 26 ==7576== at 0x4C2EE3B: malloc (vg_replace_malloc.c:309) ==7576== by 0x53EF0E4: ralloc_size (ralloc.c:119) ==7576== by 0x53EF0C2: ralloc_context (ralloc.c:113) ==7576== by 0x5471F64: nir_split_per_member_structs (nir_split_per_member_structs.c:176) ==7576== by 0x51288CF: anv_shader_compile_to_nir (anv_pipeline.c:216) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* glsl: use NIR function inlining for drivers that use glsl_to_nir()Timothy Arceri2019-03-062-2/+83
| | | | | | | | glsl_to_nir() is still missing support for converting certain functions to NIR, so for those we use the GLSL IR optimisations to remove the functions. Reviewed-by: Eric Anholt <[email protected]>
* glsl/freedreno/panfrost: pass gl_context to the standalone compilerTimothy Arceri2019-03-063-5/+7
| | | | | | | This allows us to use the ctx with glsl_to_nir() in a following patch. Reviewed-by: Eric Anholt <[email protected]>
* nir/lower_doubles: Inline functions directly in lower_doublesJason Ekstrand2019-03-062-17/+35
| | | | | | | | | | | | Instead of trusting the caller to already have created a softfp64 function shader and added all its functions to our shader, we simply take the softfp64 shader as an argument and do the function inlining ouselves. This means that there's no more nasty functions lying around that the caller needs to worry about cleaning up. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/deref: Expose nir_opt_deref_implJason Ekstrand2019-03-062-1/+2
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/inline_functions: Break inlining into a builder helperJason Ekstrand2019-03-063-40/+60
| | | | | | | | | | | | | This pulls the guts of function inlining into a builder helper so that it can be used elsewhere. The rest of the infrastructure is still needed for most inlining cases to ensure that everything gets inlined and only ever once. However, there are use-cases where you just want to inline one little thing. This new helper also has a neat trick where it can seamlessly inline a function from one nir_shader into another. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/nir: Inline functions in float64_funcs_to_nirJason Ekstrand2019-03-061-0/+5
| | | | | | | | | | This doesn't really change anything as the functions will all get inlined anyway. However it does let us do a bit of the work earlier and in a common place. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/nir: Add a shared helper for building float64 shadersJason Ekstrand2019-03-064-0/+65
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Teach loop unrolling about 64-bit instruction loweringJason Ekstrand2019-03-063-13/+79
| | | | | | | | | | | | | | | | The lowering we do for 64-bit instructions can cause a single NIR ALU instruction to blow up into hundreds or thousands of instructions potentially with control flow. If loop unrolling isn't aware of this, it can unroll a loop 20 times which contains a nir_op_fsqrt which we then lower to a full software implementation based on integer math. Those 20 invocations suddenly get a lot more expensive than NIR loop unrolling currently expects. By giving it an approximate estimate function, we can prevent loop unrolling from going to town when it shouldn't. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Expose double and int64 op_to_options_mask helpersJason Ekstrand2019-03-063-51/+23
| | | | | | | | | We already have one internally for int64 but we don't have a similar one for doubles so we'll have to make one. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* compiler/nir: add an is_conversion field to nir_op_infoIago Toral Quiroga2019-03-063-33/+47
| | | | | | | | | This is set to True only for numeric conversion opcodes. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc()Timothy Arceri2019-03-0621-39/+39
| | | | | | | | | | Replace done using: find ./src -type f -exec sed -i -- \ 's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \; Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename record_types -> struct_typesTimothy Arceri2019-03-062-10/+10
| | | | | | Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename record_location_offset() -> struct_location_offset()Timothy Arceri2019-03-065-7/+7
| | | | | | | | | | Replace done using: find ./src -type f -exec sed -i -- \ 's/record_location_offset(/struct_location_offset(/g' {} \; Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename get_record_instance() -> get_struct_instance()Timothy Arceri2019-03-065-7/+7
| | | | | | | | | | Replace done using: find ./src -type f -exec sed -i -- \ 's/get_record_instance(/get_struct_instance(/g' {} \; Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename is_record() -> is_struct()Timothy Arceri2019-03-0618-80/+80
| | | | | | | | | | Replace was done using: find ./src -type f -exec sed -i -- \ 's/is_record(/is_struct(/g' {} \; Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* nir/spirv: initial handling of OpenCL.std extension opcodesKarol Herbst2019-03-0512-3/+602
| | | | | | | | | | | | | | | | | | Not complete, mostly just adding things as I encounter them in CTS. But not getting far enough yet to hit most of the OpenCL.std instructions. Anyway, this is better than nothing and covers the most common builtins. v2: add hadd proof from Jason move some of the lowering into opt_algebraic and create new nir opcodes simplify nextafter lowering fix normalize lowering for inf rework upsample to use nir_pack_bits add missing files to build systems v3: split lines of iadd/sub_sat expressions Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/vtn: add support for SpvBuiltInGlobalLinearIdKarol Herbst2019-03-055-12/+45
| | | | | | | | v2: use formula with fewer operations Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: add support for address bit sized system valuesKarol Herbst2019-03-052-18/+29
| | | | | | | | | | | v2: add assert in else clause make local group intrinsics 32 bit wide v3: always use 32 bit constant for local_size v4: add comment by Jason Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/spirv: improve parsing of the memory modelKarol Herbst2019-03-053-7/+45
| | | | | | | v2: add some vtn_fail_ifs Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: replace magic numbers with M_PIKarol Herbst2019-03-051-2/+2
| | | | | | | we define it inside 'include/c99_math.h' so it is safe to use. Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add multiplier argument to nir_lower_uniforms_to_ubo.Timur Kristóf2019-03-052-9/+16
| | | | | | | | | | | | | Note that locations can be set in different units, and the multiplier argument caters to supporting these different units. For example, st_glsl_to_nir uses dwords (4 bytes) so the multiplier should be 4, while tgsi_to_nir uses bytes, so the multiplier should be 16. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir: Move nir_lower_uniforms_to_ubo to compiler/nir.Timur Kristóf2019-03-054-0/+102
| | | | | | | | | | | | The nir_lower_uniforms_to_ubo function is useful outside of mesa/state_tracker, and in fact is needed to produce NIR for drivers that have the PIPE_CAP_PACKED_UNIFORMS capability. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir: Add ability for shaders to use window space coordinates.Timur Kristóf2019-03-051-0/+3
| | | | | | | | | | | | | This patch adds a shader_info field that tells the driver to use window space coordinates for a given vertex shader. It also enables this feature in radeonsi (the only NIR-capable driver that supported it in TGSI), and makes tgsi_to_nir aware of it. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* v3d: Move the stores for fixed function VS output reads into NIR.Eric Anholt2019-03-051-0/+9
| | | | | | | | | | | | | | | This lets us emit the VPM_WRITEs directly from nir_intrinsic_store_output() (useful once NIR scheduling is in place so that we can reduce register pressure), and lets future NIR scheduling schedule the math to generate them. Even in the meantime, it looks like this lets NIR DCE some more code and make better decisions. total instructions in shared programs: 6429246 -> 6412976 (-0.25%) total threads in shared programs: 153924 -> 153934 (<.01%) total loops in shared programs: 486 -> 483 (-0.62%) total uniforms in shared programs: 2385436 -> 2388195 (0.12%) Acked-by: Ian Romanick <[email protected]> (nir)
* nir: Improve printing of load_input/store_output variable names.Eric Anholt2019-03-051-2/+4
| | | | | | | We were printing only when the channel was exactly the start channel, so scalarized loads/stores would be missing the name on the rest. Reviewed-by: Ian Romanick <[email protected]>
* spirv: Use the same types for resource indices as pointersJason Ekstrand2019-03-052-26/+42
| | | | | | | | We need more space than just a 32-bit scalar and we have to burn all that space anyway so we may as well expose it to the driver. This also fixes a subtle bug when UBOs and SSBOs have different pointer types. Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: Use the generic dereference function for OpArrayLengthJason Ekstrand2019-03-051-1/+1
| | | | | | | | With the new deref changes, the old pointer_offset version may not be the right one to call. Just call the generic one and let it sort it out. Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: Pull offset/stride from the pointer for OpArrayLengthJason Ekstrand2019-03-051-2/+10
| | | | | | | | | We can't pull it from the variable type because it might be an array of blocks and not just the one block. While we're here, throw in some error checking. Reviewed-by: Lionel Landwerlin <[email protected]> Cc: [email protected]
* intel,nir: Lower TXD with min_lod when the sampler index is not < 16Jason Ekstrand2019-03-042-0/+27
| | | | | | | | | | | When we have a larger sampler index, we get into the "high sampler" scenario and need an instruction header. Even in SIMD8, this pushes the instruction over the sampler message size maximum of 11 registers. Instead, we have to lower TXD to TXL. Fixes: cb98e0755f8d "intel/fs: Support min_lod parameters on texture..." Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* spirv: OpImageQueryLod requires a samplerJason Ekstrand2019-03-041-1/+1
| | | | | | | | | | No idea how this fell through the cracks besides the fact that the sampler bound at 0 almost always works and the CTS isn't amazing. In any case, this appears to have been broken for almost forever. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Cc: [email protected]
* spirv: Allow [i/u]mulExtended to use new nir opcodeSagar Ghuge2019-03-041-6/+10
| | | | | | | | | | Use new nir opcode nir_[i/u]mul_2x32_64 and extract lower and higher 32 bits as needed instead of emitting mul and mul_high. v2: Surround the switch case with curly braces (Jason Ekstrand) Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/algebraic: Optimize low 32 bit extractionSagar Ghuge2019-03-041-0/+2
| | | | | | | | | Optimize a situation where we only need lower 32 bits from 64 bit result. Signed-off-by: Sagar Ghuge <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>