summaryrefslogtreecommitdiffstats
path: root/src/compiler/nir
Commit message (Collapse)AuthorAgeFilesLines
* nir: put compact into bitfields in nir_variable_dataDave Airlie2017-09-071-1/+1
| | | | | | | | | | This being declared bool means it won't get merged with the previous bitfields, this seems like an oversight rather than deliberate. Noticed when running pahole. Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nir: Remove series of unnecessary conversionsMatt Turner2017-08-291-1/+1
| | | | | | | | | | | | | | | | | | | | Clang warns: warning: absolute value function 'fabsf' given an argument of type 'const float64_t' (aka 'const double') but has parameter of type 'float' which may cause truncation of value [-Wabsolute-value] float64_t dst = bit_size == 64 ? fabs(src0) : fabsf(src0); The type of the ternary expression will be the common type of fabs() and fabsf(): double. So fabsf(src0) will be implicitly converted to double. We may as well just convert src0 to double before a call to fabs() and remove the needless complexity, à la float64_t dst = fabs(src0); Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir: Fix system_value_from_intrinsic for subgroupsJason Ekstrand2017-08-281-4/+4
| | | | | | | A couple of the cases were backwards Reviewed-by: Matt Turner <[email protected]> Cc: [email protected]
* nir: Fix some whatespaceJason Ekstrand2017-08-281-5/+5
| | | | | | Somehow tabs got in there... Reviewed-by: Matt Turner <[email protected]>
* nir: fix algebraic optimizationsConnor Abbott2017-08-011-2/+2
| | | | | | | | The optimizations are only valid for 32-bit integers. They were mistakenly firing for 64-bit integers as well. Cc: [email protected] Reviewed-by: Matt Turner <[email protected]>
* nir: add nir_lower_uniforms_to_ubo passNicolai Hähnle2017-07-312-0/+98
| | | | | | | | | | This is a further lowering of default-block uniform loads that transforms load_uniform intrinsics into load_ubo intrinsics. This simplifies the rest of the backend. v2: transform from load_uniform instead of straight from variables Reviewed-by: Eric Anholt <[email protected]>
* nir: add nir_lower_samplers_as_deref passNicolai Hähnle2017-07-312-0/+245
| | | | | | This pass is a replacement for the nir_lower_samplers pass, which has the advantage of keeping sampler references as derefs. This allows a unified treatment of texture instructions and image intrinsics in the backend.
* nir: add load_frag_coord system value intrinsicNicolai Hähnle2017-07-313-0/+6
| | | | | | | Some drivers prefer to treat gl_FragCoord as a system value rather than a fragment shader input, see Const.GLSLFragCoordIsSysVal. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: fix nir_lower_wpos_ytransform when gl_FragCoord is a system valueNicolai Hähnle2017-07-311-2/+4
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add nir_instr_rewrite_derefNicolai Hähnle2017-07-312-0/+15
| | | | | | Allows modifying a texture instruction's texture and sampler derefs. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Optimize find_lsb/imsb/umsb error checksMatt Turner2017-07-201-0/+11
| | | | | | | | Two of the ARB_shader_ballot piglit tests hit the find_lsb case, removing some of the noise allowed me to better debug the test when it was failing. Reviewed-by: Connor Abbott <[email protected]>
* nir: Reduce destination size of ballot intrinsic when possibleMatt Turner2017-07-202-0/+20
| | | | | | | | | Some hardware, like i965, doesn't support group sizes greater than 32. In that case, we can reduce the destination size of the ballot intrinsic, which will simplify our code generation. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add pass to scalarize read_invocation/read_first_invocationMatt Turner2017-07-202-1/+113
| | | | | | | i965 will want these to be scalar operations. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add system values from ARB_shader_ballotMatt Turner2017-07-205-1/+81
| | | | | | | | | | | | | We already had a channel_num system value, which I'm renaming to subgroup_invocation to match the rest of the new system values. Note that while ballotARB(true) will return zeros in the high 32-bits on systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB variables do not consider whether channels are enabled. See issue (1) of ARB_shader_ballot. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add intrinsics from ARB_shader_ballotMatt Turner2017-07-201-0/+13
| | | | | Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Support lowering vote intrinsicsMatt Turner2017-07-202-2/+4
| | | | | | | ... trivially (as allowed by the spec!) by reusing the existing nir_opt_intrinsics code. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add pass to optimize intrinsicsMatt Turner2017-07-202-0/+97
| | | | | | | Specifically, constant fold intrinsics from ARB_shader_group_vote, but I suspect it'll be useful for other things in the future. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add intrinsics from ARB_shader_group_voteMatt Turner2017-07-201-0/+5
| | | | | | | | These are intrinsics rather than opcodes, because they operate across channels. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Use nir_src_copy instead of direct assignments.Kenneth Graunke2017-07-183-9/+9
| | | | | | | | | | | | | | | If the source is an indirect register, there is ralloc'd data. Copying with a direct assignment will copy the pointer, but the data will still belong to the old instruction's memory context. Since we're lowering and throwing away instructions, that could free the data by mistake. Instead, use nir_src_copy, which properly handles this. This is admittedly not a common case, so I think the bug is real, but unlikely to be hit. Cc: [email protected] Reviewed-by: Matt Turner <[email protected]>
* nir: fix nir_opt_copy_prop_vars() for arrays of arraysTimothy Arceri2017-07-191-6/+6
| | | | | | | | | | Previously we only incremented the guide for a single dimension/wildcard. V2: rework logic to avoid code duplication Reviewed-by: Jason Ekstrand <[email protected]> Cc: [email protected]
* nir/vars_to_ssa: Handle missing struct members in foreach_deref_nodeJason Ekstrand2017-07-191-2/+6
| | | | | | | | | This can happen if, for instance, you have an array of structs and there are both direct and wildcard references to the same struct and some members only have direct or only have indirect. Reviewed-by: Timothy Arceri <[email protected]> Cc: [email protected]
* nir/lower_io_to_temporaries: don't set compact on shadow varsConnor Abbott2017-07-131-0/+1
| | | | | | | | | The compact flag doesn't make sense on local variables, since the packing on them is up to the driver. This fixes nir_validate assertions in some cases, particularly when lower_io_to_temporaries is used on per-vertex inputs/outputs. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: don't segfault when printing variables with no nameConnor Abbott2017-07-131-1/+1
| | | | | | | | | | | | While normally we give variables whose name field is NULL a temporary name when called from nir_print_shader(), when we were calling from nir_print_instr() we never bothered, meaning that we just segfaulted when trying to print out instructions with such a variable. Since nir_print_instr() is meant to be called while debugging, we don't need to bother too much about giving a consistent name, but we don't want to crash in the middle of debugging. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: copy front interpolation when creating fake back color inputIlia Mirkin2017-07-081-2/+6
| | | | | | | | Fixes a bunch of gl_BackColor interpolation tests that had explicit interpolation specified on the fragment shader gl_Color. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir: add NIR_PRINT environment variableNicolai Hähnle2017-07-051-0/+19
| | | | Reviewed-by: Rob Clark <[email protected]>
* nir: Add a lowering pass for UYVY texturesJohnson Lin2017-06-302-0/+19
| | | | | | Similar with support for YUYV but with byte order difference in sampler Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir: sge operation is defined for floating-point typesJuan A. Suarez Romero2017-06-271-1/+1
| | | | | | | According to GLSL.std.450 spec, the operand for step() function must be a floating-point. It does not restrict the value to 32-bit floats. Reviewed by: Elie Tournier <[email protected]>
* nir: make various getters take const pointersGrazvydas Ignotas2017-06-102-13/+14
| | | | | | | | This will allow to constify other things. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir: Delete nir_array.hThomas Helland2017-06-071-99/+0
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir: Remove unused includeThomas Helland2017-06-071-1/+0
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* tree-wide: remove trailing backslashEric Engestrom2017-06-071-1/+1
| | | | | | | | | Simple search for a backslash followed by two newlines. If one of the newlines were to be removed, this would cause issues, so let's just remove these trailing backslashes. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir/lower-atomics-to-ssbo: remove atomic_uint arrays tooRob Clark2017-05-231-1/+9
| | | | | | | | Maybe there is a better way to do this. But by the time we get to assigning uniform locs, we want the atomic_uint's to all be gone, otherwise we assert in st_glsl_attrib_type_size(). Signed-off-by: Rob Clark <[email protected]>
* nir/lower-atomics-to-ssbo: fix num_componentsRob Clark2017-05-231-0/+5
| | | | | | Fixes some piglits like arb_shader_atomic_counters-active-counters Signed-off-by: Rob Clark <[email protected]>
* nir: Embed the shader_info in the nir_shader againJason Ekstrand2017-05-0911-59/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b changed the shader_info from being embedded into being just a pointer. The idea was that sharing the shader_info between NIR and GLSL would be easier if it were a pointer pointing to the same shader_info struct. This, however, has caused a few problems: 1) There are many things which generate NIR without GLSL. This means we have to support both NIR shaders which come from GLSL and ones that don't and need to have an info elsewhere. 2) The solution to (1) raises all sorts of ownership issues which have to be resolved with ralloc_parent checks. 3) Ever since 00620782c92100d77c660f9783504c6d80fa1d58, we've been using nir_gather_info to fill out the final shader_info. Thanks to cloning and the above ownership issues, the nir_shader::info may not point back to the gl_shader anymore and so we have to do a copy of the shader_info from NIR back to GLSL anyway. All of these issues go away if we just embed the shader_info in the nir_shader. There's a little downside of having to copy it back after calling nir_gather_info but, as explained above, we have to do that anyway. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: fix (hopefully) windows buildRob Clark2017-05-082-4/+4
| | | | | Fixes: 53aa109b ("nir: add pass to lower atomic counters to SSBO") Signed-off-by: Rob Clark <[email protected]>
* nir: Fix missing snprintf symbol on Windows.Jose Fonseca2017-05-071-0/+4
| | | | | | | | | | Copy nir_print.c's snprintf definition for now, to unbreak Windows builds. We can and should cleanup all snprintf definitions in a follow up change, but I rather not leave Windows build broken any further. Trivial.
* nir: add pass to lower atomic counters to SSBORob Clark2017-05-042-0/+219
| | | | | | | | This is equivalent to what mesa/st does in glsl_to_tgsi. For most hw there isn't a particularly good reason to treat these differently. Signed-off-by: Rob Clark <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* nir/lower_tex: Fix minor error in YUV color conversion matrixJohnson Lin2017-05-031-3/+3
| | | | | | | | | | | | | | | The matrix used for YCbCr to RGB is listed in: https://en.wikipedia.org/wiki/YCbCr There was an error in converting the offsets from integers to unorm values: 0.0625=16/256 should be 16.0/255,and 0.5=128.0/256 should be 128.0/255. With this fix, the CSC result is bit aligned with wikipedia's conversion result and FFMPeg's result. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100854 Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* compiler: Add a system value and varying for ViewIndexJason Ekstrand2017-05-032-0/+5
| | | | | Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Pick just the channels we want for bitmap and drawpixels lowering.Eric Anholt2017-05-022-2/+6
| | | | | | | | | | | | NIR now validates that SSA references use the same number of channels as are in the SSA value. v2: Reword commit message, since the commit didn't land before the validation change did. Fixes: 370d68babcbb ("nir/validate: Validate that bit sizes and components always match") Reviewed-by: Jason Ekstrand <[email protected]> (v1) Cc: <[email protected]>
* nir/i965: add before ffma algebraic optsTimothy Arceri2017-04-242-0/+24
| | | | | | | | | | | | | | | | | | | | | | | This shuffles constants down in the reverse of what the previous patch does and applies some simpilifications that may be made possible from doing so. Shader-db results BDW: total instructions in shared programs: 12980814 -> 12977822 (-0.02%) instructions in affected programs: 281889 -> 278897 (-1.06%) helped: 1231 HURT: 128 total cycles in shared programs: 246562852 -> 246567288 (0.00%) cycles in affected programs: 11271524 -> 11275960 (0.04%) helped: 1630 HURT: 1378 V2: mark float opts as inexact Reviewed-by: Elie Tournier <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: shuffle constants to the topTimothy Arceri2017-04-242-1/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | V2: mark float opts as inexact If one of the inputs to an mul/add is the result of another mul/add there is a chance that we can reuse the result of that mul/add in other calls if we do the multiplication in the right order. Also by attempting to move all constants to the top we increase the chance of constant folding. For example it is a fairly common pattern for shaders to do something similar to this: const float a = 0.5; in vec4 b; in float c; ... b.x = b.x * c; b.y = b.y * c; ... b.x = b.x * a + a; b.y = b.y * a + a; So by simply detecting that constant a is part of the multiplication in ffma and switching it with previous fmul that updates b we end up with: ... c = a * c; ... b.x = b.x * c + a; b.y = b.y * c + a; Shader-db results BDW: total instructions in shared programs: 13011050 -> 12967888 (-0.33%) instructions in affected programs: 4118366 -> 4075204 (-1.05%) helped: 17739 HURT: 1343 total cycles in shared programs: 246717952 -> 246410716 (-0.12%) cycles in affected programs: 166870802 -> 166563566 (-0.18%) helped: 18493 HURT: 7965 total spills in shared programs: 14937 -> 14560 (-2.52%) spills in affected programs: 9331 -> 8954 (-4.04%) helped: 284 HURT: 33 total fills in shared programs: 20211 -> 19671 (-2.67%) fills in affected programs: 12586 -> 12046 (-4.29%) helped: 286 HURT: 33 LOST: 39 GAINED: 33 Some of the hurt will go away when we shuffle things back down to the bottom in the following patch. It's also noteworthy that almost all of the spill changes are in Deus Ex both hurt and helped. Reviewed-by: Elie Tournier <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: add flt comparision simplificationTimothy Arceri2017-04-242-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | Didn't turn out as useful as I'd hoped, but it will help alot more on i965 by reducing regressions when we drop brw_do_channel_expressions() and brw_do_vector_splitting(). I'm not sure how much sense 'is_not_used_by_conditional' makes on platforms other than i965 but since this is a new opt it at least won't do any harm. shader-db BDW: total instructions in shared programs: 13029581 -> 13029415 (-0.00%) instructions in affected programs: 15268 -> 15102 (-1.09%) helped: 86 HURT: 0 total cycles in shared programs: 247038346 -> 247036198 (-0.00%) cycles in affected programs: 692634 -> 690486 (-0.31%) helped: 183 HURT: 27 Reviewed-by: Elie Tournier <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add GLSL_TYPE_[U]INT64 to some switch statementsJason Ekstrand2017-04-162-0/+4
| | | | | Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Destination component count of shader_clock intrinsic is 2Boyan Ding2017-04-141-1/+1
| | | | | | | | | | | This fixes the following error when using ARB_shader_clock on i965: vec1 32 ssa_0 = intrinsic shader_clock () () () intrinsic store_var (ssa_0) (clock_retval) (3) /* wrmask=xy */ error: src->ssa->num_components == num_components (nir/nir_validate.c:204) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: [email protected]
* nir/print: add compute shader infoRob Clark2017-04-141-0/+13
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir: Add support for 8 and 16-bit typesJason Ekstrand2017-03-303-2/+24
| | | | | Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir/constant_expressions: Don't switch on bit size when not neededJason Ekstrand2017-03-301-14/+23
| | | | | | | | | | | | For opcodes such as the nir_op_pack_64_2x32 for which all sources and destinations have explicit sizes, the bit_size parameter to the evaluate function is pointless and *should* do nothing. Previously, we were always switching on the bit_size and asserting if it isn't one of the sizes in the list. This generates way more code than needed and is a bit cruel because it doesn't let us have a bit_size of zero on an ALU op which shouldn't need a bit_size. Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir/constant_expressions: Pull the guts out into a helper blockJason Ekstrand2017-03-301-98/+101
| | | | Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir/lower_wpos_center: support adding sample position to fragment coordinateIago Toral Quiroga2017-03-242-8/+25
| | | | | | | | | | | | | According to section 14.6 of the Vulkan specification: "When sample shading is enabled, the x and y components of FragCoord reflect the location of the sample corresponding to the shader invocation." So add a boolean parameter to the lowering pass to select this behavior when we need it. Reviewed-by: Jason Ekstrand <[email protected]>