path: root/src
Commit message | Author | Age | Files | Lines
* nir/lower_doubles: Inline functions directly in lower_doubles | Jason Ekstrand | 2019-03-06 | 9 | -63/+53
  Instead of trusting the caller to already have created a softfp64 function shader and added all its functions to our shader, we simply take the softfp64 shader as an argument and do the function inlining ourselves. This means that there are no more nasty functions lying around that the caller needs to worry about cleaning up.
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/deref: Expose nir_opt_deref_impl | Jason Ekstrand | 2019-03-06 | 2 | -1/+2
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/inline_functions: Break inlining into a builder helper | Jason Ekstrand | 2019-03-06 | 3 | -40/+60
  This pulls the guts of function inlining into a builder helper so that it can be used elsewhere. The rest of the infrastructure is still needed for most inlining cases to ensure that everything gets inlined and only ever once. However, there are use-cases where you just want to inline one little thing.
  This new helper also has a neat trick where it can seamlessly inline a function from one nir_shader into another.
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/nir: Inline functions in float64_funcs_to_nir | Jason Ekstrand | 2019-03-06 | 1 | -0/+5
  This doesn't really change anything as the functions will all get inlined anyway. However, it does let us do a bit of the work earlier and in a common place.
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/nir: Add a shared helper for building float64 shaders | Jason Ekstrand | 2019-03-06 | 7 | -99/+70
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/nir: Drop an unneeded lower_constant_initializers call | Jason Ekstrand | 2019-03-06 | 1 | -2/+0
  Even though this is technically a step in the function inlining process as laid out in nir_inline_functions.c, it's not really needed. We already have constant initializers lowered here and no new ones are added by appending the softfp64 functions.
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/debug: Add a debug flag to force software fp64 | Jason Ekstrand | 2019-03-06 | 3 | -2/+4
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Compile the fp64 program based on nir options | Jason Ekstrand | 2019-03-06 | 1 | -1/+2
  Instead of looking at the devinfo directly, look at the lowering options we provided to NIR. This is more accurate as it's now checking for "do we need full software lowering" rather than a hardware bit.
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Teach loop unrolling about 64-bit instruction lowering | Jason Ekstrand | 2019-03-06 | 3 | -13/+79
  The lowering we do for 64-bit instructions can cause a single NIR ALU instruction to blow up into hundreds or thousands of instructions, potentially with control flow. If loop unrolling isn't aware of this, it can unroll, 20 times, a loop containing a nir_op_fsqrt which we then lower to a full software implementation based on integer math. Those 20 invocations suddenly get a lot more expensive than NIR loop unrolling currently expects. By giving it an approximate estimate function, we can prevent loop unrolling from going to town when it shouldn't.
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
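The cost-aware decision described above can be sketched roughly as follows. This is a minimal model, not the NIR implementation: the cost table, the `budget` value, and the function names `instr_cost` and `should_unroll` are all made up for illustration.

```python
# Hypothetical per-op costs for 64-bit float ops that get lowered to
# integer math. The numbers are illustrative only, not taken from NIR.
LOWERED_FP64_COST = {"fsqrt": 400, "fdiv": 300, "fmul": 100, "fadd": 80}


def instr_cost(op, bit_size, lowers_fp64):
    """Estimate how many instructions one ALU op becomes after lowering."""
    if bit_size == 64 and lowers_fp64 and op in LOWERED_FP64_COST:
        return LOWERED_FP64_COST[op]
    return 1


def should_unroll(body, trip_count, lowers_fp64, budget=256):
    """body is a list of (op, bit_size) pairs.

    Unroll only if the estimated size of the fully unrolled loop stays
    within the (hypothetical) instruction budget.
    """
    body_cost = sum(instr_cost(op, bits, lowers_fp64) for op, bits in body)
    return trip_count * body_cost <= budget


loop_body = [("fsqrt", 64), ("iadd", 32)]
print(should_unroll(loop_body, 20, lowers_fp64=False))  # hardware fp64
print(should_unroll(loop_body, 20, lowers_fp64=True))   # software fp64
```

With native fp64 the body counts as two instructions and 20 iterations fit the budget; with software lowering the same body is estimated at 401 instructions and unrolling is rejected.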
* nir: Expose double and int64 op_to_options_mask helpers | Jason Ekstrand | 2019-03-06 | 3 | -51/+23
  We already have one internally for int64 but we don't have a similar one for doubles so we'll have to make one.
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* compiler/nir: add an is_conversion field to nir_op_info | Iago Toral Quiroga | 2019-03-06 | 3 | -33/+47
  This is set to True only for numeric conversion opcodes.
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/fs: Fix extract_u8 of an odd byte from a 64-bit integer | Ian Romanick | 2019-03-06 | 1 | -0/+7
  In the old code, we would generate the exact same instruction for extract_u8(some_u64, 0) and extract_u8(some_u64, 1). The mask-a-word trick only works for even numbered bytes.
  This fixes the (new) piglit test tests/spec/arb_gpu_shader_int64/execution/fs-ushr-and-mask.shader_test.
  v2: Use a SHR instead of an AND. This saves an instruction compared to using two moves. Suggested by Jason.
  Fixes: 6ac2d169019 ("i965/fs: Fix extract_i8/u8 to a 64-bit destination")
  Reviewed-by: Jason Ekstrand <[email protected]>
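To illustrate why masking a word only works for even bytes, here is a small model of the arithmetic. It assumes that "mask-a-word" means selecting the 16-bit word that contains the byte and ANDing it with 0xff; the function names are hypothetical and this is not the backend code itself.

```python
def extract_u8_reference(value, byte):
    """Byte `byte` (0 = least significant) of a 64-bit integer."""
    return (value >> (8 * byte)) & 0xFF


def extract_u8_word_mask_only(value, byte):
    """Model of the old behaviour: pick the word, then always mask.

    Bytes 2n and 2n+1 live in the same word, so both return the low byte.
    """
    word = (value >> (16 * (byte // 2))) & 0xFFFF
    return word & 0xFF


def extract_u8_fixed(value, byte):
    """Mask for even bytes, shift right by 8 for odd bytes."""
    word = (value >> (16 * (byte // 2))) & 0xFFFF
    if byte % 2 == 1:
        return word >> 8
    return word & 0xFF


x = 0x0807060504030201
print(hex(extract_u8_word_mask_only(x, 0)), hex(extract_u8_word_mask_only(x, 1)))
print([hex(extract_u8_fixed(x, b)) for b in range(8)])
```

The shift needs no extra mask because the selected word is already limited to 16 bits, so its upper byte is all that remains after shifting.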
* intel/fs: nir_op_extract_i8 extracts a byte, not a word | Ian Romanick | 2019-03-06 | 1 | -2/+4
  Fixes: 6ac2d169019 ("i965/fs: Fix extract_i8/u8 to a 64-bit destination")
  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: Silence unused parameter warning in brw_interpolation_map.c | Ian Romanick | 2019-03-06 | 3 | -7/+4
  The parameter is never used, and it's not part of a common interface idiom. Remove it.
  src/intel/compiler/brw_interpolation_map.c: In function ‘brw_setup_vue_interpolation’:
  src/intel/compiler/brw_interpolation_map.c:62:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
    const struct gen_device_info *devinfo)
                                  ^~~~~~~
  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: Silence many unused parameter warnings in brw_eu.h | Ian Romanick | 2019-03-06 | 1 | -9/+9
  In file included from src/intel/compiler/brw_eu_util.c:34:0:
  src/intel/compiler/brw_eu.h: In function ‘brw_message_desc_header_present’:
  src/intel/compiler/brw_eu.h:288:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
    brw_message_desc_header_present(const struct gen_device_info *devinfo,
                                                                  ^~~~~~~
  src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc’:
  src/intel/compiler/brw_eu.h:296:51: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
    brw_message_ex_desc(const struct gen_device_info *devinfo,
                                                      ^~~~~~~
  src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc_ex_mlen’:
  src/intel/compiler/brw_eu.h:303:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
    brw_message_ex_desc_ex_mlen(const struct gen_device_info *devinfo,
                                                              ^~~~~~~
  src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_binding_table_index’:
  src/intel/compiler/brw_eu.h:337:68: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
    brw_sampler_desc_binding_table_index(const struct gen_device_info *devinfo,
                                                                       ^~~~~~~
  src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_sampler’:
  src/intel/compiler/brw_eu.h:344:56: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
    brw_sampler_desc_sampler(const struct gen_device_info *devinfo, uint32_t desc)
                                                           ^~~~~~~
  src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_return_format’:
  src/intel/compiler/brw_eu.h:371:62: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
    brw_sampler_desc_return_format(const struct gen_device_info *devinfo,
                                                                 ^~~~~~~
  src/intel/compiler/brw_eu.h: In function ‘brw_dp_desc_binding_table_index’:
  src/intel/compiler/brw_eu.h:405:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
    brw_dp_desc_binding_table_index(const struct gen_device_info *devinfo,
                                                                  ^~~~~~~
  src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_desc’:
  src/intel/compiler/brw_eu.h:754:41: warning: unused parameter ‘exec_size’ [-Wunused-parameter]
    unsigned exec_size, /**< 0 for SIMD4x2 */
             ^~~~~~~~~
  src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_float_desc’:
  src/intel/compiler/brw_eu.h:775:47: warning: unused parameter ‘exec_size’ [-Wunused-parameter]
    unsigned exec_size,
             ^~~~~~~~~
  Reviewed-by: Jason Ekstrand <[email protected]>
* meson: remove unused include_directories(vulkan) | Eric Engestrom | 2019-03-06 | 1 | -1/+1
  The correct include path is "vulkan/…".
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Dylan Baker <[email protected]>
* radv: set num_components on vulkan_resource_index intrinsic | Lionel Landwerlin | 2019-03-06 | 3 | -10/+20
  In 61e009d2c4e4df we changed the number of components in the vulkan_resource_index intrinsic and forgot to update Radv's code for it.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 61e009d2c4e4df ("spirv: Use the same types for resource indices as pointers")
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc() | Timothy Arceri | 2019-03-06 | 25 | -45/+45
  Replace done using:
    find ./src -type f -exec sed -i -- \
      's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \;
  Acked-by: Karol Herbst <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename record_types -> struct_types | Timothy Arceri | 2019-03-06 | 2 | -10/+10
  Acked-by: Karol Herbst <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename record_location_offset() -> struct_location_offset() | Timothy Arceri | 2019-03-06 | 8 | -11/+11
  Replace done using:
    find ./src -type f -exec sed -i -- \
      's/record_location_offset(/struct_location_offset(/g' {} \;
  Acked-by: Karol Herbst <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename get_record_instance() -> get_struct_instance() | Timothy Arceri | 2019-03-06 | 5 | -7/+7
  Replace done using:
    find ./src -type f -exec sed -i -- \
      's/get_record_instance(/get_struct_instance(/g' {} \;
  Acked-by: Karol Herbst <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename is_record() -> is_struct() | Timothy Arceri | 2019-03-06 | 22 | -90/+90
  Replace was done using:
    find ./src -type f -exec sed -i -- \
      's/is_record(/is_struct(/g' {} \;
  Acked-by: Karol Herbst <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* nir/spirv: initial handling of OpenCL.std extension opcodes | Karol Herbst | 2019-03-05 | 12 | -3/+602
  Not complete, mostly just adding things as I encounter them in CTS. But not getting far enough yet to hit most of the OpenCL.std instructions.
  Anyway, this is better than nothing and covers the most common builtins.
  v2: add hadd proof from Jason
      move some of the lowering into opt_algebraic and create new nir opcodes
      simplify nextafter lowering
      fix normalize lowering for inf
      rework upsample to use nir_pack_bits
      add missing files to build systems
  v3: split lines of iadd/sub_sat expressions
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
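The commit mentions a proof for hadd without quoting it. As background, one well-known way to compute the halving add `(a + b) >> 1` without needing a wider intermediate is `(a & b) + ((a ^ b) >> 1)`, which follows from `a + b == 2 * (a & b) + (a ^ b)`. Whether this is the exact form used in the commit is an assumption; the sketch below only checks the identity exhaustively for 8-bit unsigned values.

```python
def uhadd8_reference(a, b):
    """Halving add using Python's unbounded integers (no overflow)."""
    return (a + b) >> 1


def uhadd8_no_overflow(a, b):
    """Halving add whose intermediates all fit in 8 bits.

    a & b keeps the bits set in both inputs (each counts twice in a + b),
    and a ^ b keeps the bits set in exactly one input (each counts once),
    so halving the sum is (a & b) + ((a ^ b) >> 1).
    """
    result = (a & b) + ((a ^ b) >> 1)
    assert result <= 0xFF  # never exceeds the 8-bit range
    return result


mismatches = [
    (a, b)
    for a in range(256)
    for b in range(256)
    if uhadd8_reference(a, b) != uhadd8_no_overflow(a, b)
]
print(len(mismatches))
```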
* nir/vtn: add support for SpvBuiltInGlobalLinearId | Karol Herbst | 2019-03-05 | 5 | -12/+45
  v2: use formula with fewer operations
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
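The commit does not spell out either formula. As an illustration of how a linear ID can be computed with fewer operations, the sketch below compares the expanded sum for a three-dimensional global ID with a nested (Horner-style) form. Global offsets are ignored, and the function names are hypothetical; treat this as an assumption about the kind of simplification meant, not as the code in the commit.

```python
def linear_id_expanded(gid, size):
    """z * sy * sx + y * sx + x: three multiplies, two adds."""
    x, y, z = gid
    sx, sy, _ = size
    return z * sy * sx + y * sx + x


def linear_id_nested(gid, size):
    """(z * sy + y) * sx + x: two multiplies, two adds."""
    x, y, z = gid
    sx, sy, _ = size
    return (z * sy + y) * sx + x


size = (3, 4, 5)
ids = [
    linear_id_nested((x, y, z), size)
    for z in range(size[2])
    for y in range(size[1])
    for x in range(size[0])
]
print(ids == list(range(3 * 4 * 5)))
```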
* nir: add support for address bit sized system values | Karol Herbst | 2019-03-05 | 2 | -18/+29
  v2: add assert in else clause
      make local group intrinsics 32 bit wide
  v3: always use 32 bit constant for local_size
  v4: add comment by Jason
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir/spirv: improve parsing of the memory model | Karol Herbst | 2019-03-05 | 3 | -7/+45
  v2: add some vtn_fail_ifs
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: replace magic numbers with M_PI | Karol Herbst | 2019-03-05 | 1 | -2/+2
  We define it inside 'include/c99_math.h' so it is safe to use.
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Implement VK_EXT_external_memory_host | Caio Marcelo de Oliveira Filho | 2019-03-05 | 4 | -1/+133
  v2: Ignore the import if handleType == 0. (Jason)
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* v3d: Drop the V3D 3.x vpm read dead code elimination. | Eric Anholt | 2019-03-05 | 1 | -33/+2
  We now have NIR dead code eliminating our VPM reads, so this shouldn't be necessary.
* v3d: Eliminate the TLB and TLBU files. | Eric Anholt | 2019-03-05 | 4 | -41/+20
  We can just use the magic register file like we do for other magic waddrs.
* v3d: Use ldunif instructions for uniforms. | Eric Anholt | 2019-03-05 | 11 | -270/+27
  The idea is that for repeated use of the same uniform, we could avoid loading it on each consumer. The results look pretty good.
  total instructions in shared programs: 6413571 -> 6521464 (1.68%)
  total threads in shared programs: 154214 -> 154000 (-0.14%)
  total uniforms in shared programs: 2393604 -> 2119629 (-11.45%)
  total spills in shared programs: 4960 -> 4984 (0.48%)
  total fills in shared programs: 6350 -> 6418 (1.07%)
  Once we do scheduling at the NIR level, the register pressure (and thus also instructions) issues we see here will drop back down.
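The percentages in the statistics above are relative changes from the "before" count. The short sketch below recomputes them from the counts quoted in the commit; the helper name `percent_change` is made up for this example.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs copied from the commit message above.
stats = {
    "instructions": (6413571, 6521464),
    "threads": (154214, 154000),
    "uniforms": (2393604, 2119629),
    "spills": (4960, 4984),
    "fills": (6350, 6418),
}

for name, (before, after) in stats.items():
    change = percent_change(before, after)
    print(f"total {name}: {before} -> {after} ({change:.2f}%)")
```

Read this way, the change trades about 1.7% more instructions for about 11.4% fewer uniform loads.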
* v3d: Add support for register-allocating a ldunif to a QFILE_TEMP. | Eric Anholt | 2019-03-05 | 2 | -14/+77
  On V3D 4.x, we can use ldunifrf to load uniforms to any register, and this will let us schedule the ldunif wherever we want in the program.
* v3d: Drop the old class bits splitting up the accumulators. | Eric Anholt | 2019-03-05 | 1 | -7/+3
  This seems to be left over from vc4, and I don't use them any more.
* v3d: Add support for vir-to-qpu of ldunif instructions to a temp. | Eric Anholt | 2019-03-05 | 1 | -2/+15
  We can load a uniform to any register, so add support for non-ALU instructions with sig.ldunif to a temp.
* v3d: Switch implicit uniforms over to being any qinst->uniform != ~0. | Eric Anholt | 2019-03-05 | 10 | -123/+77
  I'm not sure why I didn't do this before -- it's clearly much simpler to add dumping of the extra thing than to have it as another implicit source.
* v3d: Do uniform rematerialization spilling before dropping threadcount | Eric Anholt | 2019-03-05 | 1 | -8/+10
  This feels like the right tradeoff for threads vs uniforms, particularly given that we often have very short thread segments right now:
  total instructions in shared programs: 6411504 -> 6413571 (0.03%)
  total threads in shared programs: 153946 -> 154214 (0.17%)
  total uniforms in shared programs: 2387665 -> 2393604 (0.25%)
* v3d: Fix temporary leaks of temp_registers and when spilling. | Eric Anholt | 2019-03-05 | 1 | -5/+4
  On each iteration of successfully spilling a reg, we'd allocate another copy of temp_registers, and when decrementing thread count we'd allocate another copy of the graph. These all got cleaned up on freeing the compile.
* tgsi_to_nir: Set correct location for uniforms. | Timur Kristóf | 2019-03-05 | 1 | -0/+1
  Previously, only the driver_location was set for all variables, but constants need to use the location field instead. This change is necessary because the nine state tracker can produce non-packed constants whose location needs to be explicitly set.
  Signed-Off-By: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* tgsi_to_nir: Improve interpolation modes. | Timur Kristóf | 2019-03-05 | 1 | -15/+21
  This patch extracts the interpolation mode translation into a separate function called ttn_translate_interp_mode, adds support for TGSI_INTERPOLATE_COLOR which was missing, and also sets the proper interpolation mode on output variables, where it was not set previously.
  Signed-Off-By: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: use sampler variables and derefs | Kenneth Graunke | 2019-03-05 | 1 | -10/+79
  v2: fix is_shadow, is_array and txq
  Some drivers (e.g. iris) need the presence of sampler variables and derefs so that they can count them to determine the number of samplers used. This change also makes the output NIR closer to what glsl_to_nir outputs.
  Signed-Off-By: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: Support FACE and POSITION properly. | Timur Kristóf | 2019-03-05 | 1 | -12/+68
  Previously, FACE was hard-coded as a sysval, but TTN emulated it incorrectly. Also, POSITION was not supported when it was a sysval. This patch fixes these by allowing both of them to be sysvals or inputs, based on driver capabilities. It also fixes the TGSI FACE emulation based on the TGSI spec.
  Signed-Off-By: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: Extract ttn_emulate_tgsi_front_face into its own function. | Timur Kristóf | 2019-03-05 | 1 | -14/+20
  We'll need to use the same logic in other places, so it makes sense to have a separate function for this.
  Signed-Off-By: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: Restructure system value loads. | Timur Kristóf | 2019-03-05 | 1 | -10/+6
  Minor cleanup to the way system value loads work in tgsi_to_nir.
  Signed-Off-By: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: Produce optimized NIR for a given pipe_screen. | Timur Kristóf | 2019-03-05 | 8 | -13/+153
  With this patch, tgsi_to_nir will output NIR that is tailored to the given pipe, by reading its capabilities and adjusting the NIR code to those capabilities, similarly to how glsl_to_nir works.
  It also adds an optimization loop that brings the output NIR in line with what glsl_to_nir outputs. This is necessary for the same reason glsl_to_nir has its own optimization loop: not every driver does these optimizations yet.
  For uses which cannot pass a pipe_screen we also keep a variant called tgsi_to_nir_noscreen which keeps the old behavior.
  Signed-Off-By: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Acked-By: Eric Anholt <[email protected]>
* freedreno: Plumb pipe_screen through to irX_tgsi_to_nir. | Timur Kristóf | 2019-03-05 | 12 | -19/+37
  This patch makes it possible for freedreno to pass a pipe_screen to tgsi_to_nir. This will be needed when tgsi_to_nir supports reading pipe capabilities.
  Signed-off-by: Timur Kristóf <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* nir: Add multiplier argument to nir_lower_uniforms_to_ubo. | Timur Kristóf | 2019-03-05 | 4 | -11/+18
  Note that locations can be set in different units, and the multiplier argument caters to supporting these different units. For example, st_glsl_to_nir uses dwords (4 bytes) so the multiplier should be 4, while tgsi_to_nir uses 16-byte vec4 slots, so the multiplier should be 16.
  Signed-Off-By: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
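The role of the multiplier can be pictured as a unit conversion from a uniform location to a byte offset in the UBO. The sketch below is a simplified model under that assumption; `uniform_byte_offset` and its parameters are hypothetical names, not the signature of the NIR pass.

```python
def uniform_byte_offset(base, index, multiplier):
    """Convert a uniform location to a byte offset.

    base and index are counted in the front end's own unit; multiplier is
    the size of that unit in bytes (4 for dwords, 16 for vec4-sized slots).
    """
    return multiplier * (base + index)


# The same byte offset expressed in two different location units.
print(uniform_byte_offset(base=8, index=0, multiplier=4))   # dword 8
print(uniform_byte_offset(base=2, index=0, multiplier=16))  # 16-byte slot 2
```

Both calls land on byte 32, which is why the caller has to tell the pass which unit its locations use.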
* nir: Move nir_lower_uniforms_to_ubo to compiler/nir. | Timur Kristóf | 2019-03-05 | 9 | -11/+10
  The nir_lower_uniforms_to_ubo function is useful outside of mesa/state_tracker, and in fact is needed to produce NIR for drivers that have the PIPE_CAP_PACKED_UNIFORMS capability.
  Signed-Off-By: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: Split to smaller functions. | Timur Kristóf | 2019-03-05 | 1 | -26/+56
  Previously, tgsi_to_nir was a single big function, and this patch intends to make the code easier to understand by splitting it up into multiple smaller pieces.
  Signed-Off-By: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Acked-By: Tested-by: Rob Clark <[email protected]>
* tgsi_to_nir: Make the TGSI IF translation code more readable. | Timur Kristóf | 2019-03-05 | 1 | -4/+5
  This patch is a minor cleanup that only intends to make the TGSI IF translation a bit easier to read.
  Signed-off-by: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* tgsi_to_nir: Fix TGSI LIT translation by using flt. | Timur Kristóf | 2019-03-05 | 1 | -3/+3
  The TGSI spec says LIT needs a "greater than" comparison. NIR doesn't have that, so let's use "less than" and swap the arguments. Previously tgsi_to_nir used "greater than or equal", which is incorrect.
  Signed-off-by: Timur Kristóf <[email protected]>
  Tested-by: Andre Heider <[email protected]>
  Tested-by: Rob Clark <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
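The argument swap described above can be shown with plain comparisons. In this sketch `flt` and `fge` are Python stand-ins for the NIR opcodes of the same name, and `greater_than` is a hypothetical helper; the point is only that `a > b` equals `b < a`, while `a >= b` gives a different answer when the operands are equal.

```python
def flt(a, b):
    """Stand-in for a "less than" comparison."""
    return a < b


def fge(a, b):
    """Stand-in for a "greater than or equal" comparison."""
    return a >= b


def greater_than(a, b):
    """a > b expressed with flt by swapping the arguments."""
    return flt(b, a)


for a, b in [(1.0, 0.0), (0.0, 0.0), (-1.0, 0.0)]:
    print(a, b, greater_than(a, b), fge(a, b))
```

The two comparisons disagree only for the equal pair, which is exactly the case where using "greater than or equal" gives the wrong result.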