aboutsummaryrefslogtreecommitdiffstats
path: root/src/compiler/nir
Commit message (Collapse)AuthorAgeFilesLines
* nir/lower-atomics-to-ssbo: remove atomic_uint arrays tooRob Clark2017-05-231-1/+9
| | | | | | | | Maybe there is a better way to do this. But by the time we get to assigning uniform locs, we want the atomic_uint's to all be gone, otherwise we assert in st_glsl_attrib_type_size(). Signed-off-by: Rob Clark <[email protected]>
* nir/lower-atomics-to-ssbo: fix num_componentsRob Clark2017-05-231-0/+5
| | | | | | Fixes some piglits like arb_shader_atomic_counters-active-counters Signed-off-by: Rob Clark <[email protected]>
* nir: Embed the shader_info in the nir_shader againJason Ekstrand2017-05-0911-59/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b changed the shader_info from being embedded into being just a pointer. The idea was that sharing the shader_info between NIR and GLSL would be easier if it were a pointer pointing to the same shader_info struct. This, however, has caused a few problems: 1) There are many things which generate NIR without GLSL. This means we have to support both NIR shaders which come from GLSL and ones that don't and need to have an info elsewhere. 2) The solution to (1) raises all sorts of ownership issues which have to be resolved with ralloc_parent checks. 3) Ever since 00620782c92100d77c660f9783504c6d80fa1d58, we've been using nir_gather_info to fill out the final shader_info. Thanks to cloning and the above ownership issues, the nir_shader::info may not point back to the gl_shader anymore and so we have to do a copy of the shader_info from NIR back to GLSL anyway. All of these issues go away if we just embed the shader_info in the nir_shader. There's a little downside of having to copy it back after calling nir_gather_info but, as explained above, we have to do that anyway. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: fix (hopefully) windows buildRob Clark2017-05-082-4/+4
| | | | | Fixes: 53aa109b ("nir: add pass to lower atomic counters to SSBO") Signed-off-by: Rob Clark <[email protected]>
* nir: Fix missing snprintf symbol on Windows.Jose Fonseca2017-05-071-0/+4
| | | | | | | | | | Copy nir_print.c's snprintf definition for now, to unbreak Windows builds. We can and should cleanup all snprintf definitions in a follow up change, but I rather not leave Windows build broken any further. Trivial.
* nir: add pass to lower atomic counters to SSBORob Clark2017-05-042-0/+219
| | | | | | | | This is equivalent to what mesa/st does in glsl_to_tgsi. For most hw there isn't a particularly good reason to treat these differently. Signed-off-by: Rob Clark <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* nir/lower_tex: Fix minor error in YUV color conversion matrixJohnson Lin2017-05-031-3/+3
| | | | | | | | | | | | | | | The matrix used for YCbCr to RGB is listed in: https://en.wikipedia.org/wiki/YCbCr There was an error in converting the offsets from integers to unorm values: 0.0625=16/256 should be 16.0/255,and 0.5=128.0/256 should be 128.0/255. With this fix, the CSC result is bit aligned with wikipedia's conversion result and FFMPeg's result. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100854 Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* compiler: Add a system value and varying for ViewIndexJason Ekstrand2017-05-032-0/+5
| | | | | Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Pick just the channels we want for bitmap and drawpixels lowering.Eric Anholt2017-05-022-2/+6
| | | | | | | | | | | | NIR now validates that SSA references use the same number of channels as are in the SSA value. v2: Reword commit message, since the commit didn't land before the validation change did. Fixes: 370d68babcbb ("nir/validate: Validate that bit sizes and components always match") Reviewed-by: Jason Ekstrand <[email protected]> (v1) Cc: <[email protected]>
* nir/i965: add before ffma algebraic optsTimothy Arceri2017-04-242-0/+24
| | | | | | | | | | | | | | | | | | | | | | | This shuffles constants down in the reverse of what the previous patch does and applies some simpilifications that may be made possible from doing so. Shader-db results BDW: total instructions in shared programs: 12980814 -> 12977822 (-0.02%) instructions in affected programs: 281889 -> 278897 (-1.06%) helped: 1231 HURT: 128 total cycles in shared programs: 246562852 -> 246567288 (0.00%) cycles in affected programs: 11271524 -> 11275960 (0.04%) helped: 1630 HURT: 1378 V2: mark float opts as inexact Reviewed-by: Elie Tournier <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: shuffle constants to the topTimothy Arceri2017-04-242-1/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | V2: mark float opts as inexact If one of the inputs to an mul/add is the result of another mul/add there is a chance that we can reuse the result of that mul/add in other calls if we do the multiplication in the right order. Also by attempting to move all constants to the top we increase the chance of constant folding. For example it is a fairly common pattern for shaders to do something similar to this: const float a = 0.5; in vec4 b; in float c; ... b.x = b.x * c; b.y = b.y * c; ... b.x = b.x * a + a; b.y = b.y * a + a; So by simply detecting that constant a is part of the multiplication in ffma and switching it with previous fmul that updates b we end up with: ... c = a * c; ... b.x = b.x * c + a; b.y = b.y * c + a; Shader-db results BDW: total instructions in shared programs: 13011050 -> 12967888 (-0.33%) instructions in affected programs: 4118366 -> 4075204 (-1.05%) helped: 17739 HURT: 1343 total cycles in shared programs: 246717952 -> 246410716 (-0.12%) cycles in affected programs: 166870802 -> 166563566 (-0.18%) helped: 18493 HURT: 7965 total spills in shared programs: 14937 -> 14560 (-2.52%) spills in affected programs: 9331 -> 8954 (-4.04%) helped: 284 HURT: 33 total fills in shared programs: 20211 -> 19671 (-2.67%) fills in affected programs: 12586 -> 12046 (-4.29%) helped: 286 HURT: 33 LOST: 39 GAINED: 33 Some of the hurt will go away when we shuffle things back down to the bottom in the following patch. It's also noteworthy that almost all of the spill changes are in Deus Ex both hurt and helped. Reviewed-by: Elie Tournier <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: add flt comparision simplificationTimothy Arceri2017-04-242-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | Didn't turn out as useful as I'd hoped, but it will help alot more on i965 by reducing regressions when we drop brw_do_channel_expressions() and brw_do_vector_splitting(). I'm not sure how much sense 'is_not_used_by_conditional' makes on platforms other than i965 but since this is a new opt it at least won't do any harm. shader-db BDW: total instructions in shared programs: 13029581 -> 13029415 (-0.00%) instructions in affected programs: 15268 -> 15102 (-1.09%) helped: 86 HURT: 0 total cycles in shared programs: 247038346 -> 247036198 (-0.00%) cycles in affected programs: 692634 -> 690486 (-0.31%) helped: 183 HURT: 27 Reviewed-by: Elie Tournier <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add GLSL_TYPE_[U]INT64 to some switch statementsJason Ekstrand2017-04-162-0/+4
| | | | | Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Destination component count of shader_clock intrinsic is 2Boyan Ding2017-04-141-1/+1
| | | | | | | | | | | This fixes the following error when using ARB_shader_clock on i965: vec1 32 ssa_0 = intrinsic shader_clock () () () intrinsic store_var (ssa_0) (clock_retval) (3) /* wrmask=xy */ error: src->ssa->num_components == num_components (nir/nir_validate.c:204) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: [email protected]
* nir/print: add compute shader infoRob Clark2017-04-141-0/+13
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir: Add support for 8 and 16-bit typesJason Ekstrand2017-03-303-2/+24
| | | | | Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir/constant_expressions: Don't switch on bit size when not neededJason Ekstrand2017-03-301-14/+23
| | | | | | | | | | | | For opcodes such as the nir_op_pack_64_2x32 for which all sources and destinations have explicit sizes, the bit_size parameter to the evaluate function is pointless and *should* do nothing. Previously, we were always switching on the bit_size and asserting if it isn't one of the sizes in the list. This generates way more code than needed and is a bit cruel because it doesn't let us have a bit_size of zero on an ALU op which shouldn't need a bit_size. Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir/constant_expressions: Pull the guts out into a helper blockJason Ekstrand2017-03-301-98/+101
| | | | Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir/lower_wpos_center: support adding sample position to fragment coordinateIago Toral Quiroga2017-03-242-8/+25
| | | | | | | | | | | | | According to section 14.6 of the Vulkan specification: "When sample shading is enabled, the x and y components of FragCoord reflect the location of the sample corresponding to the shader invocation." So add a boolean parameter to the lowering pass to select this behavior when we need it. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_convert_from_ssa().Matt Turner2017-03-232-8/+15
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_io().Matt Turner2017-03-232-6/+15
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_regs_to_ssa().Matt Turner2017-03-232-6/+10
| | | | | | And from nir_lower_regs_to_ssa_impl() as well. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_samplers().Matt Turner2017-03-232-12/+19
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_atomics().Matt Turner2017-03-232-7/+13
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_clamp_color_outputs().Matt Turner2017-03-232-10/+22
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_clip_fs().Matt Turner2017-03-232-3/+7
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_clip_vs().Matt Turner2017-03-232-5/+7
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_move_vec_src_uses_to_dest().Matt Turner2017-03-232-6/+18
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_to_source_mods().Matt Turner2017-03-232-6/+29
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_clip_cull_distance_arrays().Matt Turner2017-03-232-5/+21
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_var_copies().Matt Turner2017-03-232-4/+16
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_load_const_to_scalar().Matt Turner2017-03-232-7/+21
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_64bit_pack().Matt Turner2017-03-232-4/+14
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_doubles().Matt Turner2017-03-232-22/+42
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Return progress from nir_lower_vars_to_ssa().Matt Turner2017-03-232-3/+7
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Fix syntax.Matt Turner2017-03-232-6/+6
| | | | | | | et is not an abbreviation. Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Fix misspellings.Matt Turner2017-03-234-7/+7
| | | | | Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Stop using apostrophes to pluralize.Matt Turner2017-03-2321-43/+43
| | | | | Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: consistently use ifndef guards over pragma onceEmil Velikov2017-03-2210-11/+38
| | | | | | | Signed-off-by: Emil Velikov <[email protected]> Acked-by: Vedran Miletić <[email protected]> Acked-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* nir: Add positional argument specifiers.Vinson Lee2017-03-212-2/+2
| | | | | | | | | | | | | Fix build with Python < 2.7. File "src/compiler/nir/nir_builder_opcodes_h.py", line 46, in <module> from nir_opcodes import opcodes File "src/compiler/nir/nir_opcodes.py", line 178, in <module> unop_convert("{}2{}{}".format(src_t[0], dst_t[0], bit_size), ValueError: zero length field name in format Fixes: 762a6333f21f ("nir: Rework conversion opcodes") Signed-off-by: Vinson Lee <[email protected]>
* nir/constant_expressions: Refactor helper functionsJason Ekstrand2017-03-141-24/+27
| | | | | | | | Apart from avoiding some unneeded size cases, this shouldn't have any actual functional impact. Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Rework conversion opcodesJason Ekstrand2017-03-148-189/+121
| | | | | | | | | | | | | | | | | | | | | | | | The NIR story on conversion opcodes is a mess. We've had way too many of them, naming is inconsistent, and which ones have explicit sizes was sort-of random. This commit re-organizes things and makes them all consistent: - All non-bool conversion opcodes now have the explicit size in the destination and are named <src_type>2<dst_type><size>. - Integer <-> integer conversion opcodes now only come in i2i and u2u forms (i2u and u2i have been removed) since the only difference between the different integer conversions is whether or not they sign-extend when up-converting. - Boolean conversion opcodes all have the explicit size on the bool and are named <src_type>2<dst_type>. Making things consistent also allows nir_type_conversion_op to be moved to nir_opcodes.c and auto-generated using mako. This will make adding int8, int16, and float16 versions much easier when the time comes. Reviewed-by: Eric Anholt <[email protected]>
* nir: Rewrite nir_type_conversion_opJason Ekstrand2017-03-141-63/+92
| | | | | | | | | The original version was very convoluted and tried way too hard to not just have the nested switch statement that it needs. Let's just write the obvious code and then we know it's correct. This fixes a bunch of missing cases particularly with int64. Reviewed-by: Plamena Manolova <[email protected]>
* nir: Add a get_nir_type_for_glsl_base_type helperJason Ekstrand2017-03-141-2/+8
| | | | Reviewed-by: Eric Anholt <[email protected]>
* nir/validate: Rework ALU bit-size rule validationJason Ekstrand2017-03-141-32/+33
| | | | | | | | | | | The original bit-size validation wasn't capable of properly dealing with instructions with variable bit sizes. An attempt was made to handle it by looking at source and destinations but, because the validation was done in validate_alu_(src|dest), it didn't really have the needed information. The new validation code is much more straightforward and should be more correct. Reviewed-by: Eric Anholt <[email protected]>
* nir/validate: Validate that bit sizes and components always matchJason Ekstrand2017-03-141-38/+63
| | | | | | | | | | | | | We've always required bit sizes to match but the rules for number of components have been a bit loose. You've never been allowed to source from something with less components than you consume, but more has always been fine. This changes the validator to require that they match exactly. The fact that they don't always match has been a source of confusion in NIR for quite some time and it's time we got rid of it. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir: Make image_size a variable-width intrinsicJason Ekstrand2017-03-141-1/+1
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_tex: Use tex_instr_dest_size for txs destinationsJason Ekstrand2017-03-141-1/+2
| | | | | | | | | Using coord_components of the source texture is correct for everything except cube maps where it's off by one. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/copy_prop: Respect the source's number of componentsJason Ekstrand2017-03-141-33/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the near future we are going to require that the num_components in a src dereference match the num_components of the SSA value being dereferenced. To do that, we need copy_prop to not remove our MOVs from a larger SSA value into an instruction that uses fewer channels. Because we suddenly have to know how many components each source has, this makes the pass a bit more complicated. Fortunately, copy propagation is the only pass that cares about the number of components are read by any given source so it's fairly contained. Shader-db results on Sky Lake: total instructions in shared programs: 13318947 -> 13320265 (0.01%) instructions in affected programs: 260633 -> 261951 (0.51%) helped: 324 HURT: 1027 Looking through the hurt programs, about a dozen are hurt by 3 instructions and the rest are all hurt by 2 instructions. From a spot-check of the shaders, the story is always the same: They get a vec4 from somewhere (frequently an input) and use the first two or three components as a texture coordinate. Because of the vector component mismatch, we have a mov or, more likely, a vecN sitting between the texture instruction and the input. This means that the back-end inserts a bunch of MOVs and split_virtual_grfs() goes to town. Because the texture coordinate is also used by some other calculation, register coalesce can't combine them back together and we end up with an extra 2 MOV instructions in our shader. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/intrinsics: Make load_barycentric_input take a 2-component coorJason Ekstrand2017-03-141-1/+3
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Cc: "17.0 13.0" <[email protected]>